CN114416193A - Method for accurately and quickly determining configuration parameter value field of big data analysis system - Google Patents

Publication number
CN114416193A
Authority
CN
China
Prior art keywords
value
configuration parameter
step length
current value
configuration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111539596.9A
Other languages
Chinese (zh)
Inventor
辛锦瀚
喻之斌
陈超
黄世鑫
王峥
杨永魁
郭伟钰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS
Priority to CN202111539596.9A
Publication of CN114416193A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to a method for accurately and quickly determining the value range of a configuration parameter of a big data analysis system. First, an approximate range of values for the configuration parameters of a big data analysis program is determined according to the configuration of the hardware system on which the program depends. Second, starting from the default value of a parameter, the value is advanced in relatively large steps until the program fails, and the last parameter value tried before the failure is recorded. Third, starting from the recorded value, the value is advanced with a reduced step size until the program fails again, and the last value tried before that failure is recorded. Fourth, the step size is reduced again and the previous operation is repeated, until the step size is smaller than a set threshold. The last parameter value tried before the final failure is taken as a boundary of the configuration parameter's value range.

Description

Method for accurately and quickly determining configuration parameter value field of big data analysis system
Technical Field
The disclosure relates to the field of big data processing, in particular to a method for accurately and quickly determining a configuration parameter value field of a big data analysis system.
Background
A big data analysis system has many numerical configuration parameters. Some parameters configure the amount of memory a program can use, some configure the number of CPU cores a task can use, and others control further aspects. For example, a configuration parameter of the in-memory big data analysis engine Apache Spark (spark.executor.memory in Spark's documentation) specifies the amount of memory that each executor can use.
Widely used big data analysis engines such as Spark and Flink currently provide default values for their configuration parameters (for example, the default value of the parameter above is 1024 MB), but they do not provide value ranges for the parameters, that is, the upper and lower limits within which a parameter value can be configured. The absence of value ranges makes it difficult to set the parameters reasonably, hinders configuration optimization, and can even cause big data analysis programs to fail at run time.
Disclosure of Invention
The present invention aims to solve the following problems: the big data analysis engine provides only default values for numeric configuration parameters and no value ranges. The invention provides a method for accurately and quickly determining a configuration parameter value field of a big data analysis system, which can quickly search the boundary of the configuration parameter. The method comprises the following steps:
S100, acquiring a configuration parameter whose value range is to be determined and the default value of the configuration parameter, and taking the default value as the current value of the configuration parameter;
S200, taking the current value of the configuration parameter as a starting point, judging whether to stop searching for the boundary of the value range; if the search is not to be stopped, executing step S300; otherwise, executing step S600;
S300, repeatedly updating the current value of the configuration parameter by the current value of the step size and running the system under each updated value, until a configuration parameter value that causes the system to fail is found;
S400, recording the configuration parameter value immediately preceding the one that caused the system error as the current value of the configuration parameter;
S500, multiplying the current value of the step size by a number greater than 0 and less than 1, taking the result as the new current value of the step size, and returning to step S200;
S600, taking the current value of the configuration parameter as a boundary of the value range.
Preferably, in the method, step S300 comprises the following steps:
S301, adding the current value of the step size to the current value of the configuration parameter, and taking the result as the new current value of the configuration parameter;
S302, writing the current value of the configuration parameter into the system and running the system;
S303, judging whether the system runs normally; if so, returning to step S301; otherwise, executing step S400.
Preferably, in the method, step S200 judges whether to stop searching for the boundary of the value range as follows:
if the current value of the step size is smaller than the set minimum step size, the search for the boundary of the value range is stopped.
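Steps S100 to S600, together with sub-steps S301 to S303 and the stopping rule of S200, can be sketched in Python as follows. This is a minimal illustration and not the applicant's implementation: the function name find_boundary, the callback runs_ok (which stands for "write the value into the system, run it, and report whether it ran normally"), the max_runs guard and the use of the absolute value of the step in the stopping test are all assumptions introduced for the example.

```python
from typing import Callable


def find_boundary(
    default: float,
    initial_step: float,
    min_step: float,
    runs_ok: Callable[[float], bool],
    shrink: float = 1 / 3,
    max_runs: int = 10_000,
) -> float:
    """Search one boundary of a parameter's value range (steps S100 to S600)."""
    if not 0 < shrink < 1:
        raise ValueError("shrink must be strictly between 0 and 1")
    current = default                     # S100: start from the default value
    step = initial_step
    runs = 0
    while abs(step) >= min_step:          # S200: stop once the step is too small
        while True:                       # S300: advance until the system fails
            candidate = current + step    # S301: add the step to the current value
            runs += 1
            if runs > max_runs:           # assumed safety guard, not in the source
                raise RuntimeError("no failing value found within max_runs")
            if not runs_ok(candidate):    # S302/S303: run the system and check it
                break
            current = candidate
        # S400: `current` now holds the last value tried before the failure
        step *= shrink                    # S500: reduce the step size
    return current                        # S600: boundary of the value range


if __name__ == "__main__":
    # Hypothetical system that runs normally for values up to 5000.
    print(find_boundary(1024, 1000, 1, lambda v: v <= 5000))
```

The sketch assumes that a failing value exists in the search direction; the max_runs guard only prevents an endless loop when that assumption does not hold.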
Preferably, the method further comprises the following step:
S700, judging whether the boundary search of the value range is finished; if not, assigning the negative of the initial step size to the step size, taking the default value as the current value of the configuration parameter, and returning to step S200.
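Step S700 amounts to running the same search a second time with the sign of the initial step reversed. The sketch below illustrates this under stated assumptions: the names find_value_range and _search, the callback runs_ok and the comparison of the absolute step value against the minimum step size are choices made for the example, and a failing value is assumed to exist in both directions.

```python
from typing import Callable, Tuple


def _search(start: float, step: float, min_step: float,
            runs_ok: Callable[[float], bool], shrink: float) -> float:
    """One-directional boundary search; the sign of `step` sets the direction."""
    current = start
    while abs(step) >= min_step:
        while runs_ok(current + step):   # keep advancing while the system runs
            current += step
        step *= shrink                   # shrink the step, keep its sign
    return current


def find_value_range(default: float, initial_step: float, min_step: float,
                     runs_ok: Callable[[float], bool],
                     shrink: float = 1 / 3) -> Tuple[float, float]:
    """Return (lower, upper) boundaries found around the default value."""
    upper = _search(default, initial_step, min_step, runs_ok, shrink)
    # S700: restart from the default with the negated initial step size.
    lower = _search(default, -initial_step, min_step, runs_ok, shrink)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical system that runs normally only for values in [512, 8192].
    print(find_value_range(1024, 1024, 1, lambda v: 512 <= v <= 8192))
```

Because the negated step is negative, a literal "step smaller than the minimum" test would stop the downward search immediately, which is why the example compares the absolute value.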
Preferably, in the method, the initial step size is determined as follows:
S101, determining an approximate range of the configuration parameter values of the big data analysis system according to the configuration of the hardware system;
S102, setting the initial step size according to the approximate range.
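The source does not state how the initial step size is derived from the approximate range. One simple possibility, shown purely as an assumption, is to divide the approximate range into a fixed number of coarse steps; the function name initial_step_from_range and the parameter coarse_steps are hypothetical.

```python
def initial_step_from_range(approx_low: float, approx_high: float,
                            coarse_steps: int = 8) -> float:
    """Derive an initial step size from an approximate value range (S101, S102).

    Assumption: the approximate range comes from the hardware configuration,
    for example 0 MB up to the physical memory of a node for a memory parameter.
    """
    if approx_high <= approx_low:
        raise ValueError("approx_high must be greater than approx_low")
    if coarse_steps < 1:
        raise ValueError("coarse_steps must be at least 1")
    return (approx_high - approx_low) / coarse_steps


if __name__ == "__main__":
    # Hypothetical node with 65536 MB of memory: the first round uses 8192 MB steps.
    print(initial_step_from_range(0, 65536))
```

A larger coarse_steps gives a smaller first step and therefore more runs in the first round; the value 8 has no basis in the source.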
Compared with the prior art:
The method disclosed by the invention takes the default value as the center. Once the approximate range of the configuration parameter's values has been determined, a larger step size is first used to roughly locate a failure point. When searching again, the value immediately preceding the failure point is used as the starting point and the step size is reduced, so that the failure point is approached gradually. These steps are repeated so that the supremum and infimum of the configuration parameter's value range are found accurately and quickly.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of a method in an embodiment of the invention;
FIG. 2 is a schematic diagram of a search in an embodiment of the invention;
FIG. 3 is a schematic diagram of another search in an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "comprising" and "having," and any variations thereof, in the description and claims of this application are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or apparatus is not necessarily limited to those steps or apparatus explicitly listed, but may include other steps or apparatus not explicitly listed or inherent to such process, method, article, or apparatus.
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention are described in detail below with specific embodiments. Several of the following embodiments may be combined with each other and some details of the same or similar concepts or processes may not be repeated in some embodiments.
In one embodiment, the big data analysis engine provides only default values for numeric configuration parameters and does not provide value ranges. To determine the value range of a configuration parameter quickly, the method shown in the flowchart of FIG. 1 is adopted, with the following specific steps:
S100, acquiring a configuration parameter whose value range is to be determined and the default value of the configuration parameter, and taking the default value as the current value of the configuration parameter;
S200, taking the current value of the configuration parameter as a starting point, judging whether to stop searching for the boundary of the value range; if the search is not to be stopped, executing step S300; otherwise, executing step S600;
S300, repeatedly updating the current value of the configuration parameter by the current value of the step size and running the system under each updated value, until a configuration parameter value that causes the system to fail is found;
S400, recording the configuration parameter value immediately preceding the one that caused the system error as the current value of the configuration parameter;
S500, multiplying the current value of the step size by a number greater than 0 and less than 1, taking the result as the new current value of the step size, and returning to step S200;
S600, taking the current value of the configuration parameter as a boundary of the value range.
The above method treats the search for a configuration parameter's value range as the problem of quickly finding a boundary from a given position, with the default value used as the given position. First, the approximate range of values of the program's configuration parameter is determined according to the configuration of the hardware system on which the big data analysis program depends, and the initial step size is determined from this approximate range. Then, starting from the default value of the configuration parameter and with the initial step size as the current step size, the current step size is added to the current value of the configuration parameter, the result is taken as the new current value, the new value is written into the system, and the big data analysis program is run under it. This is repeated until the program fails, and the last configuration parameter value before the failure is recorded. Next, starting from the recorded value, the step size is reduced and the previous operation is repeated until the program fails again, and the last value before that failure is recorded. These steps are repeated until the step size is smaller than the set minimum step size, at which point the search stops and the last recorded value, that is, the value immediately preceding the final failure, is taken as a boundary of the configuration parameter.
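Writing a value into the system and checking whether the program runs normally could, for a Spark job, look like the sketch below. It is an illustration under assumptions, not part of the disclosure: it passes the value through spark-submit's --conf option with a megabyte suffix, treats a non-zero exit code, a timeout or a missing spark-submit executable as "the program made an error", and uses hypothetical names (build_command, runs_ok, the application path).

```python
import subprocess
from typing import List


def build_command(param: str, value_mb: int, app: str) -> List[str]:
    """Build a spark-submit command that sets one memory-style parameter."""
    return ["spark-submit", "--conf", f"{param}={value_mb}m", app]


def runs_ok(param: str, value_mb: int, app: str,
            timeout_s: float = 3600.0) -> bool:
    """Run the job under the given value and report whether it succeeded.

    Assumption: a zero exit code means the program ran normally.
    """
    try:
        result = subprocess.run(
            build_command(param, value_mb, app),
            capture_output=True,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # "job.py" is a placeholder application path.
    print(build_command("spark.executor.memory", 2048, "job.py"))
```

Each call to runs_ok is one full program run, so the number of calls is the dominant cost of the search.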
To find the other boundary of the value range, the default value is again taken as the current value of the configuration parameter, the negative of the initial step size is assigned to the step size, and the search process is executed again. If a configuration parameter value determined as a boundary is larger than the default value, it is taken as the supremum of the configuration parameter's value range; otherwise, it is taken as the infimum.
In the above process, the step size is reduced by multiplying it by a number greater than 0 and less than 1. This number is preferably one third, although other values can also be used. In this way, the search space is covered at the lowest possible cost: the search time is shortened, the search cost is reduced, and the search becomes more accurate.
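The cost of this reduction schedule can be estimated with a short derivation. It is a sketch under assumptions not stated in the source: an upward search is considered (the downward search is symmetric), $s_0$ denotes the initial step size, $s_{\min}$ the set minimum step size with $s_0 \ge s_{\min}$, $r$ the reduction factor, $b$ the true boundary and $x_k$ the value recorded after round $k$; the system is assumed to run normally for every value between the default value and $b$ and to fail beyond it.

```latex
% Step size used in round k (k = 0, 1, 2, ...), with 0 < r < 1:
s_k = s_0 \, r^{k}

% The search stops at the first k with s_k < s_min, so the number of rounds is
K = \left\lfloor \frac{\ln (s_0 / s_{\min})}{\ln (1 / r)} \right\rfloor + 1

% After round k, x_k runs normally and x_k + s_k fails, hence
0 \le b - x_k \le s_k

% Round k >= 1 starts within s_{k-1} = s_k / r of the boundary, so it needs at most
\left\lceil 1 / r \right\rceil \ \text{runs} \quad (\text{3 runs for } r = 1/3)

% The value recorded in the last round, k = K - 1, satisfies
b - x_{K-1} \le s_{K-1} < s_{\min} / r
```

For example, with $s_0 = 1000$, $s_{\min} = 1$ and $r = 1/3$, the search takes $K = 7$ rounds and the final recorded value lies within 3 units of the boundary.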
FIGS. 2 and 3 show two cases of searching the value range of a configuration parameter.
In the schematic diagram of FIG. 2, the first round of the iterative search encounters a configuration parameter value that causes a system fault or failure, and this value happens to lie exactly at the boundary of the failing values; that is, any value smaller than it allows the system to run normally. In the second round, the search starts from the value immediately preceding the failing value, with a step size of one third of the initial step size, and the system is run step by step until the configuration parameter value again reaches the failing value. The operations of the second round are then repeated with further reduced step sizes. As the diagram shows, the recorded configuration parameter values gradually approach the failing value until the search stops, and the last recorded value is taken as a boundary of the configuration parameter.
In the schematic diagram of FIG. 3, the first round of the iterative search encounters a configuration parameter value that causes a system fault or failure, but this value is only an upper bound on the boundary at which the system starts to fail. In the second round, the search starts from the value immediately preceding the failing value, with a step size of one third of the initial step size, and the system is run step by step until it fails again, which yields a new failing value. The operations of the second round are then repeated with further reduced step sizes. As the diagram shows, the recorded configuration parameter values gradually approach the boundary, each iteration brings the failing value closer to the boundary, and the distance between the recorded value and the failing value becomes shorter and shorter.
Although the two cases differ, the accuracy with which the boundary of the value range is determined is not affected. In addition, because each new round starts from the value immediately preceding the failure and uses a shorter step size, the search time and cost can be reduced significantly compared with the prior art.
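The two cases can be reproduced numerically with a small trace. The values below are invented for illustration and do not come from FIG. 2 or FIG. 3; the function name trace_search and the callback runs_ok are assumptions, and exact fractions are used so that repeated multiplication by one third introduces no rounding error.

```python
from fractions import Fraction
from typing import Callable, List, Tuple


def trace_search(default: int, initial_step: int, min_step: int,
                 runs_ok: Callable[[Fraction], bool],
                 shrink: Fraction = Fraction(1, 3)
                 ) -> List[Tuple[Fraction, Fraction]]:
    """Return one (recorded value, failing value) pair per search round."""
    current = Fraction(default)
    step = Fraction(initial_step)
    rounds = []
    while abs(step) >= min_step:
        while runs_ok(current + step):            # advance while the system runs
            current += step
        rounds.append((current, current + step))  # last good value, failing value
        step *= shrink
    return rounds


if __name__ == "__main__":
    # First case: the first failing value (3700) is itself the smallest failing value.
    for recorded, failing in trace_search(1000, 900, 100, lambda v: v < 3700):
        print("case 1:", recorded, failing)
    # Second case: the first failing value (3700) overshoots the boundary (3250).
    for recorded, failing in trace_search(1000, 900, 100, lambda v: v <= 3250):
        print("case 2:", recorded, failing)
```

In the first case the failing value stays at 3700 while the recorded values 2800, 3400 and 3600 approach it; in the second case the failing values 3700, 3400 and 3300 and the recorded values 2800, 3100 and 3200 close in on the boundary from both sides.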
The method can be applied not only to big data analysis systems but also to systems for big data storage, big data resource scheduling and the like, to determine the value ranges of their configuration parameters.
From the above description of the embodiments, those skilled in the art will clearly understand that the method of the present disclosure may be implemented by software plus the necessary general-purpose hardware, or by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components and the like. In general, functions performed by computer programs can readily be implemented by corresponding hardware, and the specific hardware structures implementing the same function may vary, for example analog circuits, digital circuits, or dedicated circuits. In most cases, however, a software implementation of the present disclosure is preferred.
Although the embodiments of the present invention have been described above with reference to the accompanying drawings, the present invention is not limited to the above-described embodiments and application fields, and the above-described embodiments are illustrative, instructive, and not restrictive. Those skilled in the art, having the benefit of this disclosure, may effect numerous modifications thereto without departing from the scope of the invention as defined by the appended claims.

Claims (5)

1. A method for accurately and quickly determining the value range of a configuration parameter of a big data analysis system, characterized by comprising the following steps:
S100, acquiring a configuration parameter whose value range is to be determined and the default value of the configuration parameter, and taking the default value as the current value of the configuration parameter;
S200, taking the current value of the configuration parameter as a starting point, judging whether to stop searching for the boundary of the value range; if the search is not to be stopped, executing step S300; otherwise, executing step S600;
S300, repeatedly updating the current value of the configuration parameter by the current value of the step size and running the system under each updated value, until a configuration parameter value that causes the system to fail is found;
S400, recording the configuration parameter value immediately preceding the one that caused the system error as the current value of the configuration parameter;
S500, multiplying the current value of the step size by a number greater than 0 and less than 1, taking the result as the new current value of the step size, and returning to step S200;
S600, taking the current value of the configuration parameter as a boundary of the value range.
2. The method of claim 1, wherein step S300 comprises the following steps:
S301, adding the current value of the step size to the current value of the configuration parameter, and taking the result as the new current value of the configuration parameter;
S302, writing the current value of the configuration parameter into the system and running the system;
S303, judging whether the system runs normally; if so, returning to step S301; otherwise, executing step S400.
3. The method of claim 1, wherein step S200 judges whether to stop searching for the boundary of the value range as follows:
if the current value of the step size is smaller than the set minimum step size, the search for the boundary of the value range is stopped.
4. The method of claim 1, further comprising the following step:
S700, judging whether the boundary search of the value range is finished; if not, assigning the negative of the initial step size to the step size, taking the default value as the current value of the configuration parameter, and executing step S200.
5. The method of claim 1, wherein the initial step size is determined as follows:
S101, determining an approximate range of the configuration parameter values of the big data analysis system according to the configuration of the hardware system;
S102, setting the initial step size according to the approximate range.
CN202111539596.9A 2021-12-15 2021-12-15 Method for accurately and quickly determining configuration parameter value field of big data analysis system Pending CN114416193A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111539596.9A CN114416193A (en) 2021-12-15 2021-12-15 Method for accurately and quickly determining configuration parameter value field of big data analysis system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111539596.9A CN114416193A (en) 2021-12-15 2021-12-15 Method for accurately and quickly determining configuration parameter value field of big data analysis system

Publications (1)

Publication Number Publication Date
CN114416193A true CN114416193A (en) 2022-04-29

Family

ID=81268364

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111539596.9A Pending CN114416193A (en) 2021-12-15 2021-12-15 Method for accurately and quickly determining configuration parameter value field of big data analysis system

Country Status (1)

Country Link
CN (1) CN114416193A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101355404A (en) * 2008-09-04 2009-01-28 中兴通讯股份有限公司 Apparatus and method for regulating transmitter parameter with optimization
CN102819651A (en) * 2012-08-20 2012-12-12 西北工业大学 Simulation-based parameter optimizing method for precise casting process of single crystal turbine blade
CN106650028A (en) * 2016-11-28 2017-05-10 中国人民解放军国防科学技术大学 Optimization method and system based on agile satellite design parameters
CN106648654A (en) * 2016-12-20 2017-05-10 深圳先进技术研究院 Data sensing-based Spark configuration parameter automatic optimization method
US20210173670A1 (en) * 2019-12-10 2021-06-10 Salesforce.Com, Inc. Automated hierarchical tuning of configuration parameters for a multi-layer service

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
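Luo Ni (罗妮): "Research on Key Technologies of In-Memory Computing Optimization Based on Machine Learning" (基于机器学习的内存计算优化关键技术研究), Information Science and Technology Series (《信息科技辑》), no. 06, pages 137 - 58 *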

Similar Documents

Publication Publication Date Title
CN106933733B (en) Method and device for determining memory leak position
CN114048701B (en) Netlist ECO method, device, equipment and readable storage medium
CN107818051B (en) Test case jump analysis method and device and server
CN113064674B (en) Method and device for expanding state machine logic, storage medium and electronic device
CN114416193A (en) Method for accurately and quickly determining configuration parameter value field of big data analysis system
CN111581101A (en) Software model testing method, device, equipment and medium
CN115587545A (en) Parameter optimization method, device and equipment for photoresist and storage medium
US10055341B2 (en) To-be-stubbed target determining apparatus, to-be-stubbed target determining method and non-transitory recording medium storing to-be-stubbed target determining program
CN112732342B (en) Method and device for initializing USID and electronic equipment
CN111684374A (en) Numerical control machining method, numerical control machine tool, and computer storage medium
CN111177014B (en) Software automatic test method, system and storage medium
CN112860267B (en) Kernel cutting method and computing device
WO2023108486A1 (en) Method for accurately and quickly determining configuration parameter value domain of big data analysis system
CN111352852B (en) Regression test case selection method and device
CN113297069A (en) Software testing method and device based on target drive
CN113342698A (en) Test environment scheduling method, computing device and storage medium
JP2008090699A (en) Method, apparatus and program of trace logging
CN114201331B (en) Method, device and equipment for detecting instruction conflict of solid state disk and storage medium
CN114625572B (en) Reverse debugging memory backup method, electronic device and medium
CN116245894B (en) Map segmentation method and device, electronic equipment and medium
CN112329127A (en) Method and device for processing grid around hole and storage medium
CN114911467B (en) Code detection method, device, electronic equipment and storage medium
JP2007249495A (en) Software verification method, information processor and program
CN115964537A (en) Behavior data processing method, device and equipment and readable storage medium
CN116450500A (en) Shared data analysis method for interrupt driven embedded software

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination