CN106845708A - Multi-objective optimization method of a data stream processing system based on uncertainty - Google Patents

Multi-objective optimization method of a data stream processing system based on uncertainty

Info

Publication number
CN106845708A
Authority
CN
China
Prior art keywords
response delay
current
bound
plan
detection result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710044897.1A
Other languages
Chinese (zh)
Other versions
CN106845708B (en)
Inventor
曹朝
盛伟
曲大成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN201710044897.1A
Publication of CN106845708A
Application granted
Publication of CN106845708B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a multi-objective optimization method of a data stream processing system based on uncertainty. It relates to multi-objective optimization for data stream processing systems and belongs to the fields of computer application technology and real-time big data analysis. According to the upper and lower bounds of response delay and the upper and lower bounds of throughput rate specified by the user, the method defines the area of an uncertain region. With the goal of shrinking that area, it obtains a representative set of Pareto optimal solutions by recursive bisection probing, giving the user a range of choices between response delay and throughput rate. The method applies to multi-objective optimization scenarios in different real-time big data analysis systems; it has a wide range of application, is practical, and is easy to popularize. In addition, the method operates only on the data itself and is not restricted by the source of the data, so it is suitable for processing data in all engineering applications.

Description

Multi-objective optimization method of data stream processing system based on uncertainty
Technical Field
The invention relates to a multi-objective optimization method of a data stream processing system based on uncertainty, in particular to a multi-objective optimization method for the data stream processing system, and belongs to the field of computer application technology and real-time big data analysis.
Background
In recent years, a large number of real-time big data analysis applications have emerged, such as dynamic social network analysis, intelligent traffic data analysis, large-scale data center monitoring, and genetic data analysis. These applications involve large data volumes and generate or update data continuously and rapidly, and they require the data analysis system to return or update analysis results continuously and in real time; this is called real-time big data (big & fast data) analysis. Such applications urgently need real-time big data analysis systems that give quantitative guarantees on response delay and throughput rate.
At present, in real-time big data analysis applications, user requirements on response delay and throughput rate rest on historical experience. IT personnel manually configure an execution plan for each analysis job in the data stream processing system, and there is no quantitative guarantee on the response delay and throughput rate of the analysis. Even experienced IT personnel cannot be sure of configuring a good execution plan, so analysis jobs run inefficiently and the real-time requirements of upper-layer applications cannot be met.
The present method is a multi-objective optimization method built on two key metrics of real-time big data analysis, namely response delay and throughput rate. A multi-objective optimization model is constructed from given response delay and throughput rate models, which theoretically ensures that the optimal execution plan is selected. Multi-objective optimization of real-time big data analysis systems is important for providing real-time big data analysis cloud services with quality-of-service guarantees, and for providing a real-time big data analysis platform and optimization framework for key national industries and important monitoring applications.
The existing multi-objective optimization method based on weighted summation can solve the Pareto optimization problem of response delay and throughput rate for a convex objective function under certain constraints, but it cannot solve the Pareto optimization problem when the objective function is concave. In addition, the weighted-sum method returns a set of solutions that is unevenly distributed, hard to interpret, and not representative, whereas the user actually needs a representative set of solutions on the Pareto curve. The weighted-sum method therefore cannot support multi-objective optimization in scenarios where IT personnel interact with the system.
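The limitation of weighted summation can be illustrated with a small sketch. The five (delay, throughput) points below are invented for illustration, and each of them is Pareto optimal; the scalarization w * T - (1 - w) * L is one common form of weighted sum and is an assumption here, not a formula taken from the patent. Sweeping the weight only ever selects the two endpoints, because the interior points lie on a non-convex part of the front.

```python
# Toy Pareto front (hypothetical numbers): (response delay, throughput) pairs.
# Throughput grows slowly at first and then sharply, so the front is non-convex.
front = [(1.0, 1.0), (2.0, 1.2), (3.0, 1.5), (4.0, 2.2), (5.0, 4.0)]


def weighted_sum_pick(points, w):
    """Return the point maximizing w * throughput - (1 - w) * delay."""
    return max(points, key=lambda p: w * p[1] - (1.0 - w) * p[0])


# Sweep 101 weights between 0 and 1 and collect every point that is ever selected.
picked = {weighted_sum_pick(front, i / 100.0) for i in range(101)}
print(sorted(picked))
```

The three interior points are never returned for any weight, which is the kind of gap the bisection probing on response delay is meant to avoid.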
Disclosure of Invention
The existing multi-objective optimization method based on weighted summation does not consider that users trade off response delay against throughput rate when deploying and using the system, so the Pareto optimal solutions it returns are arbitrary. To address this defect, the invention discloses a multi-objective optimization method of a data stream processing system based on uncertainty. The technical problem to be solved is: for the multi-objective optimization problem of a data stream processing system, avoid the arbitrariness of the Pareto optimal solutions, obtain a representative set of Pareto optimal solutions, and give the user a range of choices between response delay and throughput rate.
The purpose of the invention is realized by the following technical scheme:
The invention discloses a multi-objective optimization method of a data stream processing system based on uncertainty, which defines the area of an uncertain region according to the upper and lower bounds of response delay and the upper and lower bounds of throughput rate specified by the user. With the goal of shrinking the area of the uncertain region, a representative set of Pareto optimal solutions is obtained by recursive bisection probing, giving the user a range of choices between response delay and throughput rate.
The invention discloses a multi-objective optimization method of a data stream processing system based on uncertainty, which comprises the following steps:
Step 1: Input the upper bound of the current response delay, denoted $L_{upper}$; input the lower bound of the current response delay, denoted $L_{lower}$; and input the threshold on the area of the uncertain region, denoted UA.
Step 2: From the upper bound $L_{upper}$ and the lower bound $L_{lower}$ of the current response delay, calculate the upper bound and the lower bound of the current throughput rate, respectively.
Step 2.1: From the upper bound $L_{upper}$ of the current response delay, calculate the upper bound of the current throughput rate, denoted $T_{upper}$ and calculated by formula (1),
where s.t. denotes a constraint; c denotes a specific system configuration; λ denotes the real-time input data rate; the throughput function of (c, λ) gives the throughput rate under system parameter configuration c and real-time input data rate λ; $\psi(c, \lambda)$ gives the response delay under system parameter configuration c and real-time input data rate λ. Formula (1) states that, for a given input data rate λ, among the system parameter configurations c whose response delay $\psi(c, \lambda)$ is less than the upper bound $L_{upper}$ of the response delay, the configuration c that maximizes the throughput rate is sought, and the maximum throughput rate is recorded as $T_{upper}$.
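Formula (1) itself does not appear in this text. From the definitions above it can plausibly be reconstructed as below. The throughput function is written $\tau(c, \lambda)$, a placeholder symbol assumed here because the original symbol did not survive, and the constraint is written as non-strict, which is also an assumption.

```latex
T_{upper} = \max_{c} \; \tau(c, \lambda)
\quad \text{s.t.} \quad \psi(c, \lambda) \le L_{upper}
\tag{1}
```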
Step 2.2: From the lower bound $L_{lower}$ of the current response delay, calculate the lower bound of the current throughput rate, denoted $T_{lower}$ and calculated by formula (2).
Formula (2) states that, for a given input data rate λ, among the system parameter configurations c whose response delay $\psi(c, \lambda)$ is less than the lower bound $L_{lower}$ of the response delay, the configuration c that maximizes the throughput rate is sought, and the maximum throughput rate is recorded as $T_{lower}$.
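Formula (2) is likewise missing from this text. A plausible reconstruction from the description in Step 2.2 is given below, again writing the throughput function as the assumed placeholder $\tau(c, \lambda)$ and using a non-strict constraint as an assumption.

```latex
T_{lower} = \max_{c} \; \tau(c, \lambda)
\quad \text{s.t.} \quad \psi(c, \lambda) \le L_{lower}
\tag{2}
```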
Step 3: From the upper bound $L_{upper}$ and the lower bound $L_{lower}$ of the current response delay, calculate by bisection the current probe response delay, the maximum probe throughput rate, and the specific system configuration that achieves the maximum probe throughput rate.
Step 3.1: From the upper bound $L_{upper}$ and the lower bound $L_{lower}$ of the current response delay, calculate the current probe response delay, denoted $L_{middle}$. The calculation formula is as follows:
$L_{middle} = (L_{lower} + L_{upper})/2$; (3)
Step 3.2: From the current probe response delay $L_{middle}$, calculate the current maximum probe throughput rate and the specific system configuration that achieves it, denoted $T_{middle}$ and $c_{middle}$ respectively and calculated by formula (4).
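Formula (4) is not present in this text. By analogy with the descriptions of Steps 2.1 and 2.2, it can plausibly be reconstructed as below, where $\tau(c, \lambda)$ is an assumed placeholder for the throughput function and the non-strict constraint is an assumption.

```latex
c_{middle} = \arg\max_{c} \; \tau(c, \lambda)
\quad \text{s.t.} \quad \psi(c, \lambda) \le L_{middle},
\qquad
T_{middle} = \tau(c_{middle}, \lambda)
\tag{4}
```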
Step 4: From the upper bound $L_{upper}$ and the lower bound $L_{lower}$ of the current response delay, the upper bound $T_{upper}$ and the lower bound $T_{lower}$ of the throughput rate, the probe response delay $L_{middle}$, and the maximum probe throughput rate $T_{middle}$, calculate the uncertain-region areas of the current left half and the current right half, respectively.
Step 4.1: From the lower bound $L_{lower}$ of the current response delay, the probe response delay $L_{middle}$, the lower bound $T_{lower}$ of the current throughput rate, and the maximum probe throughput rate $T_{middle}$, calculate the uncertain-region area of the current left half, denoted $ua_{left}$. The calculation formula is as follows:
$ua_{left} = (L_{middle} - L_{lower}) \times (T_{middle} - T_{lower})$; (5)
Step 4.2: From the upper bound $L_{upper}$ of the current response delay, the probe response delay $L_{middle}$, the upper bound $T_{upper}$ of the current throughput rate, and the maximum probe throughput rate $T_{middle}$, calculate the uncertain-region area of the current right half, denoted $ua_{right}$. The calculation formula is as follows:
$ua_{right} = (L_{upper} - L_{middle}) \times (T_{upper} - T_{middle})$; (6)
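Equations (5) and (6) each measure a rectangle in the delay-throughput plane: its width is half of the current delay interval and its height is the throughput gap that the probe has not yet resolved. A minimal sketch follows; the function name and the numbers are illustrative assumptions.

```python
def uncertain_areas(l_lower, l_middle, l_upper, t_lower, t_middle, t_upper):
    """Return (ua_left, ua_right) following equations (5) and (6)."""
    ua_left = (l_middle - l_lower) * (t_middle - t_lower)
    ua_right = (l_upper - l_middle) * (t_upper - t_middle)
    return ua_left, ua_right


# Hypothetical bounds: delay interval [1, 5] probed at 3, throughput 100 / 160 / 180.
ua_left, ua_right = uncertain_areas(1.0, 3.0, 5.0, 100.0, 160.0, 180.0)
print(ua_left, ua_right)
```

A half whose throughput barely changes across it has a small area, so it is the half least likely to hide another distinct Pareto point.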
Step 5: Judge whether the uncertain-region areas of the current left half and right half are less than or equal to the threshold UA, to determine whether to perform recursive iterative probing.
Step 5.1: Judge whether the uncertain-region area $ua_{left}$ of the left half is less than or equal to the threshold UA, to determine whether to perform recursive iterative probing. The specific process is as follows:
Step 5.1.1: If the uncertain-region area $ua_{left}$ of the left half is less than or equal to the threshold UA, set the left-half probe result set to the empty set and go to Step 5.2; otherwise, go to Step 5.1.2;
where the left-half probe result set is denoted $plan_{left}$ and is calculated as follows:
$plan_{left} = \emptyset$; (7)
Step 5.1.2: Take the current probe response delay $L_{middle}$ as the upper bound $L_{upper}$ of the next probe response delay and the lower bound $L_{lower}$ of the current response delay as the lower bound $L_{lower}$ of the next probe response delay, and recursively probe the left half; finally, record the left-half probe result set $plan_{left}$,
where the left-half probe result set $plan_{left}$ is calculated as follows:
$plan_{left} = prob(L_{lower}, L_{middle})$; (8)
where $prob(L_{lower}, L_{upper})$ denotes recursive iterative probing;
Step 5.2: Judge whether the uncertain-region area $ua_{right}$ of the right half is less than or equal to the threshold UA, to determine whether to perform recursive iterative probing. The specific process is as follows:
Step 5.2.1: If the uncertain-region area $ua_{right}$ of the right half is less than or equal to the threshold UA, set the right-half probe result set to the empty set and go to Step 6; otherwise, go to Step 5.2.2;
where the right-half probe result set is denoted $plan_{right}$ and is calculated as follows:
$plan_{right} = \emptyset$; (9)
Step 5.2.2: Take the current probe response delay $L_{middle}$ as the lower bound $L_{lower}$ of the next probe response delay and the upper bound $L_{upper}$ of the current response delay as the upper bound $L_{upper}$ of the next probe response delay, and recursively probe the right half; finally, record the right-half probe result set $plan_{right}$,
where the right-half probe result set $plan_{right}$ is calculated as follows:
$plan_{right} = prob(L_{middle}, L_{upper})$; (10)
Step 6: Calculate the current probe result set, merge it with the left-half probe result set $plan_{left}$ and the right-half probe result set $plan_{right}$, and return the final probe result set, which is a representative set of Pareto optimal solutions for multi-objective optimization of the data stream processing system.
Step 6.1: From the current probe response delay $L_{middle}$, the maximum probe throughput rate $T_{middle}$, and the specific system configuration $c_{middle}$ that achieves the maximum probe throughput rate, calculate the current probe result set, denoted $plan_{middle}$. The calculation formula is as follows:
$plan_{middle} = \{(L_{middle}, T_{middle}, c_{middle})\}$; (11)
Step 6.2: Merge the current probe result set $plan_{middle}$ with the left-half probe result set $plan_{left}$ and the right-half probe result set $plan_{right}$, and return the final probe result set, denoted plan. The calculation formula is as follows:
$plan = plan_{left} \cup plan_{middle} \cup plan_{right}$; (12)
The returned final probe result set plan is the representative set of Pareto optimal solutions for multi-objective optimization of the data stream processing system.
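Steps 1 to 6 can be sketched end to end as a short recursive function. This is a minimal sketch under stated assumptions, not the patented implementation: the constrained maximization of Steps 2 and 3.2 is replaced by a brute-force search over a small hypothetical table of configurations, the constraint is taken as non-strict, the throughput bounds are recomputed at every recursion level, and the names `make_table_solver`, `toy_table`, and the numeric values are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

# One plan entry: (probe response delay, maximum throughput, configuration name).
PlanEntry = Tuple[float, float, str]
# A solver maps a delay bound to (maximum throughput, configuration achieving it).
Solver = Callable[[float], Tuple[float, str]]


def make_table_solver(table: Dict[str, Tuple[float, float]]) -> Solver:
    """Build a brute-force solver over a finite table {config: (delay, throughput)}."""
    def solve(delay_bound: float) -> Tuple[float, str]:
        feasible = [(t, c) for c, (d, t) in table.items() if d <= delay_bound]
        if not feasible:
            raise ValueError(f"no configuration meets delay bound {delay_bound}")
        return max(feasible)  # highest throughput among feasible configurations
    return solve


def prob(l_lower: float, l_upper: float, ua: float, solve: Solver) -> List[PlanEntry]:
    """Recursive bisection probing following Steps 2 to 6."""
    if ua <= 0:
        raise ValueError("the area threshold UA must be positive so the recursion terminates")
    t_upper, _ = solve(l_upper)                              # Step 2.1
    t_lower, _ = solve(l_lower)                              # Step 2.2
    l_middle = (l_lower + l_upper) / 2                       # Step 3.1, equation (3)
    t_middle, c_middle = solve(l_middle)                     # Step 3.2
    ua_left = (l_middle - l_lower) * (t_middle - t_lower)    # equation (5)
    ua_right = (l_upper - l_middle) * (t_upper - t_middle)   # equation (6)
    # Step 5: recurse only into halves whose uncertain area exceeds the threshold.
    plan_left = [] if ua_left <= ua else prob(l_lower, l_middle, ua, solve)
    plan_right = [] if ua_right <= ua else prob(l_middle, l_upper, ua, solve)
    # Step 6: merge the three result sets, equations (11) and (12).
    return plan_left + [(l_middle, t_middle, c_middle)] + plan_right


# Hypothetical configurations with (response delay, throughput) values.
toy_table = {"c1": (0.5, 10.0), "c2": (2.0, 40.0), "c3": (4.0, 70.0), "c4": (8.0, 100.0)}
plan = prob(0.5, 10.0, ua=50.0, solve=make_table_solver(toy_table))
for entry in plan:
    print(entry)
```

Because each recursion level halves the delay interval while the throughput gap stays bounded, a smaller UA yields a deeper recursion and a denser set of returned plans.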
Advantageous effects:
1. The multi-objective optimization method of a data stream processing system based on uncertainty disclosed by the invention starts from the upper and lower bounds of response delay, relies on the relation between Pareto optimal points and constrained optimization solutions, and uses the area of the uncertain region as the measure of uncertainty, thereby providing a quantitative criterion for the probing depth.
2. With the goal of shrinking the area of the uncertain region, the method improves, through bisection probing, the efficiency of finding representative Pareto optimal solutions.
3. The method can return a series of meaningful and representative Pareto optimal solutions within the response delay or throughput rate range specified by the user, ensuring that the user can find an acceptable optimal solution within that range.
4. The method is suitable for multi-objective optimization scenarios in different real-time big data analysis systems; it has a wide range of application, is practical, and is easy to popularize.
5. The method operates only on the data itself, obtains a representative set of Pareto optimal solutions without being restricted by the source of the data, and is suitable for processing data in all engineering applications.
Drawings
FIG. 1 is a schematic flowchart of the multi-objective optimization method of a data stream processing system based on uncertainty according to the invention and of Embodiment 1;
FIG. 2 is a schematic flowchart of the recursive iterative probing in Embodiment 2 of the method;
FIG. 3 is a graph comparing the experimental results of the present method and the weighted-sum method in Embodiment 1.
Detailed Description
The present invention will be described in detail below with reference to the drawings and examples, but the present invention is not limited to these examples.
Example 1:
This embodiment describes how the multi-objective optimization method of a data stream processing system based on uncertainty is applied in a specific real-time big data analysis system, Apache Spark Streaming.
FIG. 1 is a flowchart of the algorithm of the method and of this embodiment. As can be seen from the figure, the method comprises the following steps:
Step 1: The upper bound $L_{upper}$ of the current response delay is initialized to 10.0, the lower bound $L_{lower}$ of the current response delay is initialized to 0.5, and the threshold UA on the area of the uncertain region is initialized to 10000.
Step 2: From the upper bound 10.0 and the lower bound 0.5 of the current response delay, calculate the upper bound and the lower bound of the current throughput rate, respectively.
Step 2.1: From the upper bound 10.0 of the current response delay, the upper bound $T_{upper}$ of the current throughput rate is calculated by formula (1) as 1677367.1230139078,
where s.t. denotes a constraint; c denotes a specific system configuration; λ denotes the real-time input data rate; the throughput function of (c, λ) gives the throughput rate under system parameter configuration c and real-time input data rate λ; $\psi(c, \lambda)$ gives the response delay under system parameter configuration c and real-time input data rate λ. Formula (1) states that, for a given input data rate λ, among the system parameter configurations c whose response delay $\psi(c, \lambda)$ is less than the upper bound $L_{upper}$ of the response delay, the configuration c that maximizes the throughput rate is sought, and the maximum throughput rate is recorded as $T_{upper}$.
Step 2.2: From the lower bound 0.5 of the current response delay, the lower bound $T_{lower}$ of the current throughput rate is calculated by formula (2) as 1288034.188034188.
Formula (2) states that, for a given input data rate λ, among the system parameter configurations c whose response delay $\psi(c, \lambda)$ is less than the lower bound $L_{lower}$ of the response delay, the configuration c that maximizes the throughput rate is sought, and the maximum throughput rate is recorded as $T_{lower}$.
Step 3: From the upper bound 10.0 and the lower bound 0.5 of the current response delay, calculate by bisection the current probe response delay, the maximum probe throughput rate, and the specific system configuration that achieves the maximum probe throughput rate.
Step 3.1: From the upper bound 10.0 and the lower bound 0.5 of the current response delay, the current probe response delay $L_{middle}$ is calculated as 5.25. The calculation is as follows:
$L_{middle} = (L_{lower} + L_{upper})/2 = (0.5 + 10.0)/2 = 5.25$; (3)
Step 3.2: From the current probe response delay 5.25, the current maximum probe throughput rate $T_{middle}$ and the specific system configuration $c_{middle}$ that achieves it are calculated by formula (4) as 1674561.0772396186 and $c_1$, respectively.
Step 4: From the upper bound 10.0 and the lower bound 0.5 of the current response delay, the upper bound 1677367.1230139078 and the lower bound 1288034.188034188 of the throughput rate, the probe response delay 5.25, and the maximum probe throughput rate 1674561.0772396186, calculate the uncertain-region areas of the current left half and the current right half, respectively.
Step 4.1: From the lower bound 0.5 of the current response delay, the probe response delay 5.25, the lower bound 1288034.188034188 of the current throughput rate, and the maximum probe throughput rate 1674561.0772396186, the uncertain-region area $ua_{left}$ of the current left half is calculated as 1836002.723725793. The calculation is as follows:
$ua_{left} = (5.25 - 0.5) \times (1674561.0772396186 - 1288034.188034188) = 1836002.723725793$; (5)
Step 4.2: From the upper bound 10.0 of the current response delay, the probe response delay 5.25, the upper bound 1677367.1230139078 of the current throughput rate, and the maximum probe throughput rate 1674561.0772396186, the uncertain-region area $ua_{right}$ of the current right half is calculated as 13328.71742787275. The calculation is as follows:
$ua_{right} = (10.0 - 5.25) \times (1677367.1230139078 - 1674561.0772396186) = 13328.71742787275$; (6)
Step 5: Judge whether the uncertain-region areas of the current left half and right half are less than or equal to the threshold 10000, to determine whether to perform recursive iterative probing.
Step 6: Calculate the current probe result set, merge it with the left-half probe result set $plan_{left}$ and the right-half probe result set $plan_{right}$, and return the final probe result set, which is a representative set of Pareto optimal solutions for multi-objective optimization of the data stream processing system.
Step 6.1: From the current probe response delay 5.25, the maximum probe throughput rate 1674561.0772396186, and the specific system configuration $c_1$ that achieves the maximum probe throughput rate, the current probe result set $plan_{middle}$ is calculated as follows:
$plan_{middle} = \{(5.25, 1674561.0772396186, c_1)\}$; (11)
Step 6.2: Merge the current probe result set $plan_{middle}$ with the left-half probe result set $plan_{left}$ and the right-half probe result set $plan_{right}$, and return the final probe result set plan, calculated as follows:
$plan = plan_{left} \cup \{(5.25, 1674561.0772396186, c_1)\} \cup plan_{right}$; (12)
The returned final probe result set plan is the representative set of Pareto optimal solutions for multi-objective optimization of the data stream processing system.
The experimental comparison between the present method and the weighted-sum method is shown in FIG. 3, where the abscissa is the response delay (sec), the ordinate is the throughput rate (million records/sec), and each point is the maximum throughput rate achievable at a given response delay, i.e., a Pareto optimal solution. The left plot shows the weighted-sum method and the right plot shows the present method. As FIG. 3 shows, the Pareto optimal solutions of the weighted-sum method are concentrated in a small part of the region; they do not reflect the distribution of response delay and throughput rate over the whole space and cannot give the user a representative set of Pareto optimal solutions. The solutions of the present method are distributed evenly over the whole space, so they offer the user a variety of optimal trade-offs between response delay and throughput rate and form a representative set of Pareto optimal solutions.
Example 2:
This embodiment illustrates in detail the recursive iterative probing described in Step 5 of the invention, using Step 5 of Embodiment 1; the algorithm flow is shown in FIG. 2. As can be seen from FIG. 2, the specific steps of recursive iterative probing are:
Step 5: Judge whether the uncertain-region areas of the current left half and right half are less than or equal to the threshold 10000, to determine whether to perform recursive iterative probing.
Step 5.1: Judge whether the uncertain-region area 1836002.723725793 of the left half is less than or equal to the threshold 10000, to determine whether to perform recursive iterative probing. The specific process is as follows:
Step 5.1.1: If the uncertain-region area 1836002.723725793 of the left half is less than or equal to the threshold 10000, set the left-half probe result set to the empty set and go to Step 5.2; otherwise, go to Step 5.1.2;
where the left-half probe result set $plan_{left}$ is calculated as follows:
$plan_{left} = \emptyset$; (7)
Step 5.1.2: Take the current probe response delay 5.25 as the upper bound $L_{upper}$ of the next probe response delay and the lower bound 0.5 of the current response delay as the lower bound $L_{lower}$ of the next probe response delay, and recursively probe the left half; finally, record the left-half probe result set $plan_{left}$,
where the left-half probe result set $plan_{left}$ is calculated as follows:
$plan_{left} = prob(0.5, 5.25)$; (8)
where $prob(L_{lower}, L_{upper})$ denotes recursive iterative probing;
Step 5.2: Judge whether the uncertain-region area 13328.71742787275 of the right half is less than or equal to the threshold 10000, to determine whether to perform recursive iterative probing. The specific process is as follows:
Step 5.2.1: If the uncertain-region area 13328.71742787275 of the right half is less than or equal to the threshold 10000, set the right-half probe result set to the empty set and go to Step 6; otherwise, go to Step 5.2.2;
where the right-half probe result set $plan_{right}$ is calculated as follows:
$plan_{right} = \emptyset$; (9)
Step 5.2.2: Take the current probe response delay 5.25 as the lower bound $L_{lower}$ of the next probe response delay and the upper bound 10.0 of the current response delay as the upper bound $L_{upper}$ of the next probe response delay, and recursively probe the right half; finally, record the right-half probe result set $plan_{right}$,
where the right-half probe result set $plan_{right}$ is calculated as follows:
$plan_{right} = prob(5.25, 10.0)$; (10)
This completes, through Steps 5.1 and 5.2, the recursive iterative probing of Step 5 in Embodiment 1.
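The first-level decision of this embodiment can be checked with a few lines. The sketch below recomputes both areas from equations (5) and (6) using the bounds and throughput values quoted in Embodiment 1, then lists the delay intervals that would be probed next; the variable names are illustrative assumptions.

```python
UA = 10000.0  # threshold on the uncertain-region area, from Embodiment 1

# Delay bounds and probe point from Embodiment 1.
l_lower, l_middle, l_upper = 0.5, 5.25, 10.0
# Throughput bounds and probe throughput from Embodiment 1.
t_lower, t_middle, t_upper = 1288034.188034188, 1674561.0772396186, 1677367.1230139078

ua_left = (l_middle - l_lower) * (t_middle - t_lower)    # equation (5)
ua_right = (l_upper - l_middle) * (t_upper - t_middle)   # equation (6)

# A half is probed again only if its area exceeds the threshold (Steps 5.1 and 5.2).
next_intervals = []
if ua_left > UA:
    next_intervals.append((l_lower, l_middle))
if ua_right > UA:
    next_intervals.append((l_middle, l_upper))
print(next_intervals)
```

With these inputs both areas exceed the threshold, so both Step 5.1.2 and Step 5.2.2 apply and the recursion continues on each half.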
Example 3:
The specific real-time big data analysis system Apache Spark Streaming of Embodiment 1 can be replaced with another real-time big data analysis system, such as Apache Storm or Google Dataflow; that is, the multi-objective optimization method provided by the invention is not restricted by the source of the data and is suitable for processing data in all engineering applications.
The technical contents not described in the above embodiments can be implemented by taking or referring to the existing technologies.
While the foregoing is directed to the preferred embodiment of the present invention, it is not intended that the invention be limited to the embodiment and the drawings disclosed herein. Equivalents and modifications may be made without departing from the spirit of the disclosure, which is to be considered as within the scope of the invention.

Claims (5)

1. A multi-objective optimization method of a data stream processing system based on uncertainty is characterized in that: comprises the following steps of (a) carrying out,
step 1: input an upper bound on the current response delay, denoted Lupper(ii) a Inputting a lower bound of the current response delay, denoted Llower(ii) a Inputting a threshold value of the area of the uncertain region, and marking as UA;
step 2: upper bound L based on current response delayupperAnd a lower bound LlowerRespectively calculating an upper bound and a lower bound of the current throughput rate;
and step 3:upper bound L based on current response delayupperAnd a lower bound LlowerCalculating the current detection response delay, the maximum detection throughput rate and the specific system configuration of the maximum detection throughput rate by a bisection method;
step 3.1: upper bound L based on current response delayupperAnd a lower bound LlowerCalculating the current probe response delay, denoted as LmiddleThe calculation formula is as follows:
Lmiddle=(Llower+Lupper)/2; (3)
step 3.2: according to the current probe response delay LmiddleCalculating the current maximum probing throughput and the specific system configuration of the maximum probing throughput, which are respectively denoted as Tmiddle、cmiddleThe calculation formula is as follows:
step 4: based on the upper bound L_upper and the lower bound L_lower of the current response delay, the upper bound T_upper and the lower bound T_lower of the throughput rate, the probe response delay L_middle, and the maximum probe throughput rate T_middle, calculating the areas of the uncertain regions of the current left half and the current right half, respectively;
step 4.1: based on the lower bound L_lower of the current response delay, the probe response delay L_middle, the lower bound T_lower of the current throughput rate, and the maximum probe throughput rate T_middle, calculating the area of the uncertain region of the current left half, denoted ua_left, as follows:
ua_left = (L_middle - L_lower) × (T_middle - T_lower); (5)
step 4.2: based on the upper bound L_upper of the current response delay, the probe response delay L_middle, the upper bound T_upper of the current throughput rate, and the maximum probe throughput rate T_middle, calculating the area of the uncertain region of the current right half, denoted ua_right, as follows:
ua_right = (L_upper - L_middle) × (T_upper - T_middle); (6)
step 5: judging whether the areas of the uncertain regions of the current left half and the current right half are less than or equal to the uncertain-region area threshold UA, so as to determine whether to perform recursive iterative probing;
step 6: calculating the current probe result set, merging it with the left-half probe result set plan_left and the right-half probe result set plan_right, and returning the final probe result set, thereby obtaining a set of representative Pareto-optimal solutions for the multi-objective optimization of the data stream processing system.
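As an illustration of steps 3.1 through 4.2 (not part of the claims), the following minimal Python sketch evaluates formulas (3), (5) and (6) for one bisection step. The function name `split_uncertain_region`, the `probe` callback that stands in for step 3.2, and the numeric table are all hypothetical.

```python
def split_uncertain_region(l_lower, l_upper, t_lower, t_upper, probe):
    """Bisect the delay interval and measure both uncertain regions.

    probe(l) is a hypothetical callback returning the maximum throughput
    achievable when the response delay must stay below l (step 3.2).
    """
    l_middle = (l_lower + l_upper) / 2                       # formula (3)
    t_middle = probe(l_middle)                               # step 3.2
    ua_left = (l_middle - l_lower) * (t_middle - t_lower)    # formula (5)
    ua_right = (l_upper - l_middle) * (t_upper - t_middle)   # formula (6)
    return l_middle, t_middle, ua_left, ua_right


# Made-up measurements: delay bound -> maximum throughput.
toy_table = {1.0: 100.0, 3.0: 400.0, 5.0: 500.0}

result = split_uncertain_region(
    l_lower=1.0, l_upper=5.0,
    t_lower=toy_table[1.0], t_upper=toy_table[5.0],
    probe=toy_table.__getitem__,
)
print(result)  # (3.0, 400.0, 600.0, 200.0)
```

In this toy case the left rectangle (area 600) is larger than the right one (area 200), so with a threshold UA between the two values only the left half would be probed further.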
2. The uncertainty-based multi-objective optimization method for a data stream processing system according to claim 1, wherein step 2 is implemented as follows:
step 2.1: based on the upper bound L_upper of the current response delay, calculating the upper bound of the current throughput rate, denoted T_upper, as follows:
T_upper = max_c throughput(c, λ), s.t. ψ(c, λ) < L_upper; (1)
where s.t. denotes the constraint; c denotes a specific system configuration; λ denotes the real-time input data rate; throughput(c, λ) denotes the throughput rate under the specific system configuration c and the real-time input data rate λ; ψ(c, λ) denotes the response delay under the specific system configuration c and the real-time input data rate λ; formula (1) states that, given the input data rate λ, among the specific system configurations c whose response delay ψ(c, λ) is less than the response-delay upper bound L_upper, the configuration that maximizes the throughput rate is sought, and the maximum throughput rate is recorded as T_upper;
step 2.2: based on the lower bound L_lower of the current response delay, calculating the lower bound of the current throughput rate, denoted T_lower, as follows:
T_lower = max_c throughput(c, λ), s.t. ψ(c, λ) < L_lower; (2)
formula (2) states that, given the input data rate λ, among the specific system configurations c whose response delay ψ(c, λ) is less than the response-delay lower bound L_lower, the configuration that maximizes the throughput rate is sought, and the maximum throughput rate is recorded as T_lower.
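The constrained maximization in steps 2.1 and 2.2 can be sketched as a search over a finite list of candidate configurations. This is only an illustration under stated assumptions: the claims do not say how the configuration space is enumerated, and the function `max_throughput_under_delay`, the batch-size configurations, and the two toy models below are hypothetical.

```python
def max_throughput_under_delay(configs, throughput, delay, rate, delay_bound):
    """Return (best throughput, best config) among configs whose delay is
    below delay_bound, or None if no candidate satisfies the constraint."""
    best = None
    for c in configs:
        if delay(c, rate) < delay_bound:          # constraint: psi(c, lambda) < bound
            t = throughput(c, rate)
            if best is None or t > best[0]:       # keep the maximizing configuration
                best = (t, c)
    return best


# Hypothetical toy models: a configuration is just a batch size.
def toy_delay(c, rate):
    return 5.0 * c                                # larger batches wait longer

def toy_throughput(c, rate):
    return min(rate, 250.0 * c)                   # capped by the input data rate

configs = [1, 2, 4, 8]
rate = 1500.0

t_upper = max_throughput_under_delay(configs, toy_throughput, toy_delay, rate, 30.0)
t_lower = max_throughput_under_delay(configs, toy_throughput, toy_delay, rate, 12.0)
print(t_upper, t_lower)  # (1000.0, 4) (500.0, 2)
```

The same routine, called with L_middle as the bound, would also serve as the probe of step 3.2, since it returns both the maximum throughput and the configuration that attains it.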
3. The uncertainty-based multi-objective optimization method for a data stream processing system according to claim 1 or 2, wherein step 6 is implemented as follows:
step 6.1: according to the current probe response delay L_middle, the maximum probe throughput rate T_middle, and the specific system configuration c_middle that achieves the maximum probe throughput rate, calculating the current probe result set, denoted plan_middle, as follows:
plan_middle = {(L_middle, T_middle, c_middle)}; (11)
step 6.2: merging the current probe result set plan_middle with the left-half probe result set plan_left and the right-half probe result set plan_right, and returning the final probe result set, denoted plan, as follows:
plan = plan_left ∪ plan_middle ∪ plan_right; (12)
the returned final probe result set plan is the set of representative Pareto-optimal solutions for the multi-objective optimization of the data stream processing system.
4. The uncertainty-based multi-objective optimization method for a data stream processing system according to claim 3, wherein step 5 is implemented as follows:
step 5.1: judging whether the area ua_left of the uncertain region of the left half is less than or equal to the uncertain-region area threshold UA, so as to determine whether to perform recursive iterative probing, specifically:
step 5.1.1: if the area ua_left of the uncertain region of the left half is less than or equal to the threshold UA, setting the left-half probe result set to the empty set and going to step 5.2; otherwise, going to step 5.1.2;
where the left-half probe result set is denoted plan_left and is calculated as follows:
plan_left = ∅; (7)
step 5.1.2: taking the current probe response delay L_middle as the response-delay upper bound L_upper of the next probe and the current response-delay lower bound L_lower as the response-delay lower bound L_lower of the next probe, recursively iterating on the left half, and finally recording the left-half probe result set plan_left;
where the left-half probe result set plan_left is calculated as follows:
plan_left = prob(L_lower, L_middle); (8)
where prob(L_lower, L_upper) denotes recursive iterative probing;
step 5.2: judging whether the area ua_right of the uncertain region of the right half is less than or equal to the uncertain-region area threshold UA, so as to determine whether to perform recursive iterative probing, specifically:
step 5.2.1: if the area ua_right of the uncertain region of the right half is less than or equal to the threshold UA, setting the right-half probe result set to the empty set and going to step 6; otherwise, going to step 5.2.2;
where the right-half probe result set is denoted plan_right and is calculated as follows:
plan_right = ∅; (9)
step 5.2.2: taking the current probe response delay L_middle as the response-delay lower bound L_lower of the next probe and the current response-delay upper bound L_upper as the response-delay upper bound L_upper of the next probe, recursively iterating on the right half, and finally recording the right-half probe result set plan_right;
where the right-half probe result set plan_right is calculated as follows:
plan_right = prob(L_middle, L_upper); (10)
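The recursion described in steps 5 and 6 can be sketched as follows. This is a minimal illustration rather than the patented implementation: `probe` is a hypothetical callback returning the maximum throughput and its configuration for a given delay bound, the linear `toy_probe` is made up, and the sketch assumes that throughput does not decrease as the delay bound loosens and that UA is positive, so that the recursion stops.

```python
def prob(l_lower, l_upper, probe, ua_threshold):
    """Recursive bisection probing over the delay interval [l_lower, l_upper].

    probe(l) -> (max throughput, configuration) for delay bound l (hypothetical).
    Returns a list of (delay, throughput, configuration) tuples ordered by delay.
    """
    t_lower, _ = probe(l_lower)                              # step 2.2
    t_upper, _ = probe(l_upper)                              # step 2.1
    l_middle = (l_lower + l_upper) / 2                       # formula (3)
    t_middle, c_middle = probe(l_middle)                     # step 3.2
    ua_left = (l_middle - l_lower) * (t_middle - t_lower)    # formula (5)
    ua_right = (l_upper - l_middle) * (t_upper - t_middle)   # formula (6)

    # Steps 5.1 and 5.2: recurse only into halves that are still too uncertain.
    plan_left = [] if ua_left <= ua_threshold else prob(
        l_lower, l_middle, probe, ua_threshold)              # formulas (7), (8)
    plan_right = [] if ua_right <= ua_threshold else prob(
        l_middle, l_upper, probe, ua_threshold)              # formulas (9), (10)

    plan_middle = [(l_middle, t_middle, c_middle)]           # formula (11)
    # The three parts cover disjoint delay ranges, so the union in
    # formula (12) is realized here as list concatenation.
    return plan_left + plan_middle + plan_right


def toy_probe(l):
    # Made-up model: throughput grows linearly with the allowed delay.
    return 10.0 * l, {"delay_budget": l}


plan = prob(0.0, 8.0, toy_probe, ua_threshold=50.0)
for delay, throughput, config in plan:
    print(delay, throughput, config)
```

With these toy numbers the first split yields two regions of area 160, both above the threshold, and each second-level split yields regions of area 40, so the recursion stops after returning three probe points.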
5. An uncertainty-based multi-objective optimization method for a data stream processing system, characterized in that: the area of the uncertain region is given according to the user-specified upper and lower bounds of the response delay and the upper and lower bounds of the throughput rate; with the goal of reducing the area of the uncertain region, a set of representative Pareto-optimal solutions is obtained by recursive bisection probing, providing the user with a space of choices between response delay and throughput.
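To make the term "Pareto-optimal" concrete for the two objectives named here (lower response delay, higher throughput), the following sketch filters a list of (delay, throughput) points down to those that no other point dominates. It illustrates the concept only and is not a step of the claimed method; the function name and the sample points are hypothetical.

```python
def pareto_front(points):
    """Keep the (delay, throughput) points that no other point dominates."""
    def dominates(a, b):
        # a dominates b if it is no slower and no less productive, and differs from b.
        return a[0] <= b[0] and a[1] >= b[1] and a != b

    return sorted(p for p in points if not any(dominates(q, p) for q in points))


# Made-up candidates; (5.0, 30.0) is dominated by (4.0, 40.0).
candidates = [(2.0, 20.0), (4.0, 40.0), (6.0, 60.0), (5.0, 30.0)]
front = pareto_front(candidates)
print(front)  # [(2.0, 20.0), (4.0, 40.0), (6.0, 60.0)]
```

Each remaining point is a distinct trade-off: moving along the front buys throughput only at the cost of a longer response delay, which is the selection space the claim refers to.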
CN201710044897.1A 2017-01-20 2017-01-20 multi-objective optimization method of data stream processing system based on uncertainty Active CN106845708B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710044897.1A CN106845708B (en) 2017-01-20 2017-01-20 multi-objective optimization method of data stream processing system based on uncertainty

Publications (2)

Publication Number Publication Date
CN106845708A (en) 2017-06-13
CN106845708B (en) 2019-12-06

Family

ID=59120037

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710044897.1A Active CN106845708B (en) 2017-01-20 2017-01-20 multi-objective optimization method of data stream processing system based on uncertainty

Country Status (1)

Country Link
CN (1) CN106845708B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101013386A (en) * 2007-02-06 2007-08-08 华中科技大学 Grid task scheduling method based on feedback mechanism
CN103049559A (en) * 2012-12-29 2013-04-17 深圳先进技术研究院 Automatic mass data placement method and device
CN104765870A (en) * 2015-04-26 2015-07-08 成都创行信息科技有限公司 Delay scheduling method related to network data
CN105787064A (en) * 2016-03-01 2016-07-20 广州铭诚计算机科技有限公司 Mining platform establishment method based on big data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
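JONATHAN E et al.: "Multi-objective optimisation in the presence of uncertainty", CONGRESS ON EVOLUTION COMPUTATION *
Qiu Xingxing (邱兴兴) et al.: "Hybrid decomposition and strength Pareto multi-objective evolutionary algorithm", Journal of Computer Applications (《计算机应用》) *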

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108073697A (en) * 2017-12-11 2018-05-25 浙江大学 System for reducing and showing streaming big data uncertainty
CN108073697B (en) * 2017-12-11 2020-10-23 浙江大学 System for reducing and showing streaming big data uncertainty

Also Published As

Publication number Publication date
CN106845708B (en) 2019-12-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant