CN114416193A - Method for accurately and quickly determining configuration parameter value field of big data analysis system - Google Patents
Method for accurately and quickly determining configuration parameter value field of big data analysis system
- Publication number
- CN114416193A
- Authority
- CN
- China
- Prior art keywords
- value
- configuration parameter
- step length
- current value
- configuration
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44505—Configuring for program initiating, e.g. using registry, configuration files
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention relates to a method for accurately and quickly determining the value range of a configuration parameter of a big data analysis system. First, an approximate range of values for a configuration parameter of a big data analysis program is determined from the configuration of the hardware system on which the program depends. Second, starting from the default value of the parameter, the parameter value is stepped with a relatively large step length until the program fails, and the last value tried before the failure is recorded. Third, starting from the recorded value, the parameter is stepped with a reduced step length until the program fails again, and the last value tried before that failure is recorded. Fourth, the step length is reduced again and the previous operation is repeated, until the step length is smaller than a set threshold. The last value recorded before a failure is then taken as a boundary of the configuration parameter.
Description
Technical Field
The disclosure relates to the field of big data processing, and in particular to a method for accurately and quickly determining the value range of a configuration parameter of a big data analysis system.
Background
A big data analysis system has many numerical configuration parameters. Some parameters configure the amount of memory a program can use, some configure the number of CPU cores a task can use, and others cover further aspects. For example, a configuration parameter of the in-memory big data analysis engine Apache Spark specifies the amount of memory that each executor can use.
Widely used big data analysis engines such as Spark and Flink currently provide default values for their configuration parameters (for example, a default value of 1024 MB for the parameter mentioned above), but they do not provide value ranges, i.e. the upper and lower limits within which a parameter can be configured. The absence of value ranges makes it difficult to set the parameters reasonably, hinders configuration optimization, and can even cause big data analysis programs to fail at run time.
Disclosure of Invention
The present invention aims to solve the following problem: big data analysis engines provide only default values for numerical configuration parameters and no value ranges. The invention provides a method for accurately and quickly determining the value range of a configuration parameter of a big data analysis system, which can quickly search for the boundaries of the configuration parameter. The method comprises the following steps:
S100, acquiring the configuration parameter whose value range is to be determined and its default value, and taking the default value as the current value of the configuration parameter;
S200, taking the current value of the configuration parameter as the starting point, judging whether to stop searching for the value range boundary; if the search is not to be stopped, executing step S300; otherwise, executing step S600;
S300, repeatedly updating the current value of the configuration parameter by the current value of the step length and running the system under each updated value, until a configuration parameter value that causes the system to fail is found;
S400, recording the configuration parameter value tried immediately before the one that caused the system to fail as the current value of the configuration parameter;
S500, multiplying the current value of the step length by a number greater than 0 and less than 1, taking the result as the new current value of the step length, and returning to step S200;
S600, taking the current value of the configuration parameter as a boundary of the value range.
Preferably, in the method, step S300 comprises the following steps:
S301, adding the current value of the step length to the current value of the configuration parameter, and taking the result as the new current value of the configuration parameter;
S302, writing the current value of the configuration parameter into the system and running the system;
S303, judging whether the system runs normally; if so, returning to step S301; otherwise, executing step S400.
Preferably, in the method, step S200 judges whether to stop searching for the value range boundary as follows:
if the current value of the step length is smaller than the set minimum step length, the search for the value range boundary is stopped.
Preferably, the method further comprises the following step:
S700, judging whether the boundary search of the value range is finished; if not, assigning the negation of the initial step length to the step length parameter, taking the default value as the current value of the configuration parameter, and returning to step S200.
Preferably, in the method, the initial step length is determined as follows:
S101, determining an approximate range of the configuration parameter values of the big data analysis system according to the configuration of the hardware system;
S102, setting the initial step length according to the approximate range.
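Steps S101 and S102 do not prescribe a formula for the initial step length. The following is only an illustrative sketch under an assumed rule: the approximate range is divided into a fixed number of coarse steps. The function name `initial_step_from_range` and the parameter `coarse_steps` are hypothetical and do not appear in the disclosure.

```python
def initial_step_from_range(approx_low: float, approx_high: float,
                            coarse_steps: int = 10) -> float:
    """Return an initial step length derived from an approximate value range.

    Assumption (not from the disclosure): the coarse first round should be
    able to cross the whole approximate range in `coarse_steps` trial runs.
    """
    if approx_high <= approx_low:
        raise ValueError("approx_high must be greater than approx_low")
    if coarse_steps < 1:
        raise ValueError("coarse_steps must be at least 1")
    return (approx_high - approx_low) / coarse_steps


# Hypothetical example: a node with 16384 MB of memory bounds the approximate
# range of a per-executor memory parameter to 0..16384 MB.
step_mb = initial_step_from_range(0, 16384, coarse_steps=8)
```

A larger `coarse_steps` gives a finer but slower first round; the later step reductions of S500 refine the result in either case.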
Compared with the prior art:
The disclosed method takes the default value as the center. Once the approximate range of the configuration parameter is known, it first uses a larger step length to roughly locate a failing point. The value tried immediately before that failing point then serves as the starting point for the next search, in which the step length is reduced so that the failing point is approached gradually. Repeating these steps finds the supremum and infimum of the configuration parameter value range accurately and quickly.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of a method in an embodiment of the invention;
FIG. 2 is a schematic diagram of a search in an embodiment of the invention;
FIG. 3 is a schematic diagram of another search in an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "comprising" and "having," and any variations thereof, in the description and claims of this application are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or apparatus is not necessarily limited to those steps or apparatus explicitly listed, but may include other steps or apparatus not explicitly listed or inherent to such process, method, article, or apparatus.
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention are described in detail below with specific embodiments. Several of the following embodiments may be combined with each other and some details of the same or similar concepts or processes may not be repeated in some embodiments.
In one embodiment, the big data analysis engine provides only default values for numerical configuration parameters and does not provide value ranges. In order to quickly determine the value range of a configuration parameter, the method shown in the flow chart of FIG. 1 is adopted, with the following implementation steps:
S100, acquiring the configuration parameter whose value range is to be determined and its default value, and taking the default value as the current value of the configuration parameter;
S200, taking the current value of the configuration parameter as the starting point, judging whether to stop searching for the value range boundary; if the search is not to be stopped, executing step S300; otherwise, executing step S600;
S300, repeatedly updating the current value of the configuration parameter by the current value of the step length and running the system under each updated value, until a configuration parameter value that causes the system to fail is found;
S400, recording the configuration parameter value tried immediately before the one that caused the system to fail as the current value of the configuration parameter;
S500, multiplying the current value of the step length by a number greater than 0 and less than 1, taking the result as the new current value of the step length, and returning to step S200;
S600, taking the current value of the configuration parameter as a boundary of the value range.
The above method treats the search for a configuration parameter's value range as the problem of quickly finding a boundary from a given position, with the default value serving as that position. First, the approximate range of the parameter is determined from the configuration of the hardware system on which the big data analysis program depends, and the initial step length is chosen according to that range. Then, starting from the default value and with the initial step length as the current step length, the current step length is added to the current parameter value, the result is written into the system as the new current value, and the big data analysis program is run under it. This is repeated until the program fails, and the last value tried before the failure is recorded. Next, starting from the recorded value, the step length is reduced and the same operation is repeated until the program fails again, and the last value tried before that failure is recorded. This is repeated until the step length is smaller than the set minimum step length, at which point the search stops and the last recorded configuration parameter value, i.e. the one immediately preceding the failure, is taken as a boundary of the configuration parameter.
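The one-directional search described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the callable `runs_ok` is a hypothetical stand-in for "write the value into the system, run the big data analysis program, and report whether it ran normally", and `max_trials` is an added safety guard that the disclosure does not mention.

```python
from typing import Callable, List, Tuple


def find_boundary(
    runs_ok: Callable[[float], bool],
    default: float,
    initial_step: float,
    min_step: float,
    shrink: float = 1 / 3,
    max_trials: int = 10_000,
) -> Tuple[float, List[float]]:
    """Search one boundary from `default`; return (boundary, recorded values)."""
    if not 0 < shrink < 1:
        raise ValueError("shrink must be greater than 0 and less than 1")
    current = default                      # S100: start from the default value
    step = initial_step
    recorded: List[float] = []
    trials = 0
    while abs(step) >= min_step:           # S200: stop once the step is too small
        while True:                        # S300: walk until the program fails
            candidate = current + step     # S301
            trials += 1
            if trials > max_trials:
                raise RuntimeError("no failing value found within max_trials")
            if not runs_ok(candidate):     # S302/S303: run and check
                break
            current = candidate
        recorded.append(current)           # S400: last value before the failure
        step *= shrink                     # S500: reduce the step length
    return current, recorded               # S600: boundary of the value range


# Hypothetical system that fails for any value above 4000.
boundary, recorded = find_boundary(lambda v: v <= 4000, default=1024,
                                   initial_step=900, min_step=1)
```

In this hypothetical run the returned value never exceeds the failure threshold and lies within one final step length of it. Passing a negative `initial_step` makes the same loop walk downward.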
For the other boundary of the value range, the default value is again taken as the current value of the configuration parameter, the negation of the initial step length is assigned to the step length parameter, and the search process is executed again. A configuration parameter value determined as a boundary is taken as the supremum of the value range if it is greater than the default value; otherwise, it is taken as the infimum.
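A sketch of searching both boundaries, assuming the same hypothetical `runs_ok` probe (a stand-in for running the system under a given value) and a compact version of the one-directional search. The names `search_one_side` and `find_value_range`, and the `max_trials` guard, are illustrative assumptions rather than part of the disclosure.

```python
from typing import Callable, Tuple


def search_one_side(runs_ok: Callable[[float], bool], start: float,
                    step: float, min_step: float, shrink: float = 1 / 3,
                    max_trials: int = 10_000) -> float:
    """Walk from `start` in the direction of `step`, shrinking after each failure."""
    current, trials = start, 0
    while abs(step) >= min_step:
        while True:
            trials += 1
            if trials > max_trials:
                raise RuntimeError("no failing value found within max_trials")
            if not runs_ok(current + step):
                break
            current += step
        step *= shrink
    return current


def find_value_range(runs_ok: Callable[[float], bool], default: float,
                     initial_step: float, min_step: float) -> Tuple[float, float]:
    """Return (infimum, supremum) found by searching in both directions."""
    first = search_one_side(runs_ok, default, initial_step, min_step)
    # Second pass: restart from the default value with the negated initial step.
    second = search_one_side(runs_ok, default, -initial_step, min_step)
    # A boundary above the default is the supremum and one below it is the
    # infimum, so ordering the two results is equivalent.
    return min(first, second), max(first, second)


# Hypothetical system that runs normally only for values in [512, 4096].
lower, upper = find_value_range(lambda v: 512 <= v <= 4096,
                                default=1024, initial_step=512, min_step=1)
```

Ordering the two results with `min` and `max` avoids a special case when one search cannot move away from the default value at all.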
In the above process, the step length is reduced by multiplying it by a number greater than 0 and less than 1. The number is preferably one third, although other values may be used. In this way, the search space is covered at low cost, which shortens the search time, reduces the search cost, and yields a more accurate result.
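The cost of the step reduction can be estimated with a short derivation. It is a sketch under assumptions not stated in the disclosure: the system fails for every value beyond a single threshold, and the symbols $s_0$ (initial step length), $s_{\min}$ (minimum step length, with $s_0 \ge s_{\min}$) and $r$ (reduction factor, $0 < r < 1$) are introduced here for illustration.

```latex
% Step length used in round j (j = 0, 1, 2, ...):
s_j = s_0 \, r^{j}

% Rounds are executed while s_j \ge s_{\min}, so the number of rounds is
k = \left\lfloor \log_{1/r} \frac{s_0}{s_{\min}} \right\rfloor + 1

% After round j-1 the recorded value v runs normally and v + s_{j-1} fails,
% so round j fails after at most
\left\lceil \frac{s_{j-1}}{s_j} \right\rceil = \left\lceil \frac{1}{r} \right\rceil
% trial runs.

% The last round uses s_{k-1} < s_{\min} / r, so the recorded boundary lies
% within s_{\min} / r of the true threshold.
```

For the preferred factor $r = 1/3$, each refinement round therefore needs at most three trial runs under these assumptions, and the total number of rounds grows only logarithmically with $s_0 / s_{\min}$.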
FIG. 2 and FIG. 3 show two cases of searching the configuration parameter value range.
In the case shown in FIG. 2, the first round of iterative search encounters a configuration parameter value that causes the system to fail, and this value happens to be exactly the boundary of failure, i.e. any smaller value lets the system run normally. In the second round, the search starts from the value tried immediately before that failing value and steps with one third of the initial step length, running the system at each step, until the failing value is reached again. The operation of the second round is then repeated with successively smaller step lengths. As the schematic diagram shows, the recorded values gradually approach the failing value until the search stops, and the last recorded value is taken as a boundary of the configuration parameter.
In the case shown in FIG. 3, the first round of iterative search encounters a configuration parameter value that causes the system to fail, but this value is only an upper bound on the point at which the system begins to fail. In the second round, the search starts from the value tried immediately before that failing value and steps with one third of the initial step length, running the system at each step, until a failing value is reached. This yields a new failing value. The operation of the second round is then repeated. As the schematic diagram shows, the recorded values gradually approach the boundary of failure, each iteration also brings the failing value closer to that boundary, and the distance between the two becomes shorter and shorter.
Although the two cases differ, the accuracy with which the boundary of the configuration parameter value range is determined is not affected. In addition, because each search starts from the value tried immediately before the previous failure and uses a shorter step length, the search time and cost can be significantly reduced compared with the prior art.
The method can be applied to a big data analysis system, and also to systems such as big data storage and big data resource scheduling, to determine the value range of a configuration parameter.
From the above description of the embodiments, those skilled in the art will clearly understand that the method of the present disclosure may be implemented by software plus the necessary general-purpose hardware, and may also be implemented by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components and the like. In general, functions performed by computer programs can easily be implemented by corresponding hardware, and the specific hardware structures implementing the same function may vary, for example analog circuits, digital circuits, or dedicated circuits. In most cases, however, a software implementation is the preferred implementation of the present disclosure.
Although the embodiments of the present invention have been described above with reference to the accompanying drawings, the present invention is not limited to the above-described embodiments and application fields, and the above-described embodiments are illustrative, instructive, and not restrictive. Those skilled in the art, having the benefit of this disclosure, may effect numerous modifications thereto without departing from the scope of the invention as defined by the appended claims.
Claims (5)
1. A method for accurately and quickly determining the value range of a configuration parameter of a big data analysis system, characterized by comprising the following steps:
S100, acquiring the configuration parameter whose value range is to be determined and its default value, and taking the default value as the current value of the configuration parameter;
S200, taking the current value of the configuration parameter as the starting point, judging whether to stop searching for the value range boundary; if the search is not to be stopped, executing step S300; otherwise, executing step S600;
S300, repeatedly updating the current value of the configuration parameter by the current value of the step length and running the system under each updated value, until a configuration parameter value that causes the system to fail is found;
S400, recording the configuration parameter value tried immediately before the one that caused the system to fail as the current value of the configuration parameter;
S500, multiplying the current value of the step length by a number greater than 0 and less than 1, taking the result as the new current value of the step length, and returning to step S200;
S600, taking the current value of the configuration parameter as a boundary of the value range.
2. The method of claim 1, wherein step S300 comprises the following steps:
S301, adding the current value of the step length to the current value of the configuration parameter, and taking the result as the new current value of the configuration parameter;
S302, writing the current value of the configuration parameter into the system and running the system;
S303, judging whether the system runs normally; if so, returning to step S301; otherwise, executing step S400.
3. The method of claim 1, wherein step S200 judges whether to stop searching for the value range boundary as follows:
if the current value of the step length is smaller than the set minimum step length, the search for the value range boundary is stopped.
4. The method of claim 1, further comprising the following step:
S700, judging whether the boundary search of the value range is finished; if not, assigning the negation of the initial step length to the step length parameter, taking the default value as the current value of the configuration parameter, and returning to step S200.
5. The method of claim 1, wherein the initial step length is determined as follows:
S101, determining an approximate range of the configuration parameter values of the big data analysis system according to the configuration of the hardware system;
S102, setting the initial step length according to the approximate range.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111539596.9A CN114416193A (en) | 2021-12-15 | 2021-12-15 | Method for accurately and quickly determining configuration parameter value field of big data analysis system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111539596.9A CN114416193A (en) | 2021-12-15 | 2021-12-15 | Method for accurately and quickly determining configuration parameter value field of big data analysis system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114416193A true CN114416193A (en) | 2022-04-29 |
Family
ID=81268364
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111539596.9A Pending CN114416193A (en) | 2021-12-15 | 2021-12-15 | Method for accurately and quickly determining configuration parameter value field of big data analysis system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114416193A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101355404A (en) * | 2008-09-04 | 2009-01-28 | 中兴通讯股份有限公司 | Apparatus and method for regulating transmitter parameter with optimization |
CN102819651A (en) * | 2012-08-20 | 2012-12-12 | 西北工业大学 | Simulation-based parameter optimizing method for precise casting process of single crystal turbine blade |
CN106648654A (en) * | 2016-12-20 | 2017-05-10 | 深圳先进技术研究院 | Data sensing-based Spark configuration parameter automatic optimization method |
CN106650028A (en) * | 2016-11-28 | 2017-05-10 | 中国人民解放军国防科学技术大学 | Optimization method and system based on agile satellite design parameters |
US20210173670A1 (en) * | 2019-12-10 | 2021-06-10 | Salesforce.Com, Inc. | Automated hierarchical tuning of configuration parameters for a multi-layer service |
Non-Patent Citations (1)
Title |
---|
罗妮: "基于机器学习的内存计算优化关键技术研究", 《信息科技辑》, no. 06, pages 137 - 58 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||