WO2018150550A1 - Learning data management device and method - Google Patents

Learning data management device and method (Dispositif et procédé de gestion de données d'apprentissage)

Info

Publication number
WO2018150550A1
Authority
WO
WIPO (PCT)
Prior art keywords
learning data
monitoring
prediction model
learning
data management
Prior art date
Application number
PCT/JP2017/005976
Other languages
English (en)
Japanese (ja)
Inventor
悠 藤田
Original Assignee
株式会社日立製作所 (Hitachi, Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立製作所 (Hitachi, Ltd.)
Priority to PCT/JP2017/005976 (published as WO2018150550A1)
Priority to JP2019500139A (granted as JP6695490B2)
Publication of WO2018150550A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/34: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00: Subject matter not provided for in other groups of this subclass
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Definitions

  • The present invention relates to a learning data management apparatus and a learning data management method, and more particularly to the management of learning data when machine learning is used for a service developed with "DevOps", a development method in which developers (Development) and operations managers (Operations) cooperate with each other.
  • In conventional approaches, baseline information is generated by distinguishing failure periods of the monitored system from other periods, but it is not assumed that the behavior of the monitored system, that is, the distribution of metric values, changes frequently.
  • With a development method such as DevOps, however, updates are performed frequently and the behavior of the monitoring target system changes accordingly.
  • Updating the monitored system may change its internal logic. If the behavior of the target system changes due to an update, a gap arises between the learning results created from pre-update metric values and the actual operation, and the prediction accuracy of baseline information and the like may deteriorate.
  • The present invention has been made in view of the above points, and proposes a learning data management apparatus and a learning data management method capable of generating a prediction model with high prediction accuracy for the future behavior of the monitored system.
  • To this end, the learning data management apparatus includes: a monitoring data acquisition unit that acquires monitoring data from a monitoring target system and divides the acquired monitoring data according to the behavior of the monitoring target system;
  • a feature extraction unit that extracts features from the divided monitoring data;
  • a learning data selection unit that compares the extracted features with the features of the monitoring data of the monitoring target system operating at the time of processing execution and selects, as learning data to be used for learning, the data whose features are close to each other; and
  • a prediction model generation unit that generates a prediction model using the selected learning data.
  • The invention also provides a learning data management method in a learning data management device that generates a prediction model using learning data.
  • The method includes: a monitoring data acquisition step in which the learning data management device acquires monitoring data from a monitoring target system and divides the acquired monitoring data according to the behavior of the monitoring target system; a feature extraction step in which the learning data management device extracts features from the divided monitoring data; a learning data selection step in which the learning data management device compares the extracted features with the features of the monitoring data of the monitoring target system operating at the time of processing execution and selects, as learning data to be used for learning, the data whose features are close to each other; and a prediction model generation step in which the learning data management device generates a prediction model using the selected learning data. A minimal sketch of this overall flow is shown below.
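  • The following Python sketch mirrors that flow under simplifying assumptions: the class name LearningDataManager, the use of the mean metric vector as the "feature", and the scikit-learn regressor are illustrative choices, not taken from the patent.

```python
# Illustrative sketch only: names and the choice of feature/model are assumptions.
from dataclasses import dataclass, field
import numpy as np
from sklearn.linear_model import LinearRegression

@dataclass
class LearningDataManager:
    # behavior key (e.g. application version) -> list of rows [metric_1, ..., metric_n, target]
    store: dict = field(default_factory=dict)

    def acquire_and_divide(self, monitoring_data):
        """Monitoring data acquisition: divide samples by the behavior of the target system."""
        for sample in monitoring_data:
            self.store.setdefault(sample["version"], []).append(sample["row"])

    def extract_feature(self, rows):
        """Feature extraction: here the feature is simply the mean of the metric columns."""
        return np.asarray(rows)[:, :-1].mean(axis=0)

    def select_learning_data(self, current_key, threshold):
        """Learning data selection: keep groups whose features are close to the running system's."""
        current = self.extract_feature(self.store[current_key])
        selected = []
        for key, rows in self.store.items():
            if key == current_key or np.linalg.norm(self.extract_feature(rows) - current) <= threshold:
                selected.extend(rows)
        return np.asarray(selected)

    def generate_model(self, selected):
        """Prediction model generation: train any regressor on the selected learning data."""
        X, y = selected[:, :-1], selected[:, -1]
        return LinearRegression().fit(X, y)
```

  • In this sketch, a caller would feed acquire_and_divide with per-behavior monitoring rows, then call select_learning_data and generate_model in that order.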
  • FIG. 3 is a block diagram illustrating, in more detail, a configuration example of the virtual machine illustrated in FIG. 1.
  • FIG. 4 is a block diagram illustrating, in more detail, a configuration example of the service monitoring server illustrated in FIG. 1.
  • FIG. 5 is a block diagram illustrating, in more detail, a configuration example of the management server illustrated in FIG. 1.
  • FIG. 6 is a table configuration diagram illustrating a configuration example of the monitoring metrics value table illustrated in FIG. 4. FIG. 7 is a table configuration diagram showing a configuration example of the version-specific learning data table shown in FIG. 5.
  • FIG. 11 is a table configuration diagram showing a configuration example of the program setting table shown in FIG. 5.
  • FIG. 13 is a flowchart showing the learning data storage process shown in FIG. 12 in more detail. FIG. 14 is a flowchart showing the learning data selection process shown in FIG. 12 in more detail. FIG. 15 is a diagram showing an example of cluster centroid position calculation and cluster centroid position comparison.
  • FIG. 16 is a flowchart showing an example of the prediction model generation process.
  • FIG. 1 shows a configuration example of a learning data management apparatus according to the first embodiment.
  • In the first embodiment, the monitoring target system is an EC service, which is one type of Web application.
  • The monitoring target system is not limited to a Web application; the approach can also be applied to server behavior, for example to prediction of storage response performance.
  • In the first embodiment, the version of the monitored system is used as the element by which the behavior of the monitored system is divided.
  • FIG. 1 shows a configuration example of a computer system serving as the learning data management apparatus according to the first embodiment.
  • the computer system according to the present embodiment includes a monitoring target system 100, a deployment server 101, a service monitoring server 102, a management server 103, a management terminal 105, and a development terminal 104. These are connected to the network 106 by their respective communication interfaces, and are connected to each other via the network 106.
  • As the monitoring target system 100, a Web application, specifically an EC (Electronic Commerce) service, is exemplified.
  • the management terminal 105 includes a communication interface 134, a processor 133, a storage device 135, and a memory 136, which are connected by an internal bus 145.
  • An input device 137 and an output device 138 are connected to the internal bus 145.
  • the operation manager 139 performs operations on the EC service 100, the deployment server 101, the service monitoring server 102, and the management server 103 via the input device 137 and the output device 138.
  • the development terminal 104 includes a communication interface 129, a processor 130, a storage device 131, and a memory 132, which are connected by an internal bus 144. An input device 147 and an output device 148 are connected to the internal bus 144.
  • the developer 140 develops an application using the development terminal 104.
  • the developed application source code is stored in the storage device 119 of the deployment server 101 via the network 106.
  • the EC service 100 includes a virtual machine 112 provided by virtualization software 111 operating on the physical server 110.
  • The physical server 110 includes a communication interface 113, a processor 114, a storage device 115, and a memory 116. A part of the processor 114, the storage device 115, and the memory 116 of the physical server 110 is allocated to the virtual machine 112. Operations on the virtual machine 112 are performed via the communication interface 113 of the physical server 110.
  • the communication interface 113, the processor 114, the storage device 115, and the memory 116 are connected by an internal bus 146.
  • The deployment server 101 includes a communication interface 117, a processor 118, a storage device 119, and a memory 120.
  • the communication interface 117, the processor 118, the storage device 119, and the memory 120 are connected by an internal bus 141.
  • the service monitoring server 102 includes a communication interface 121, a processor 122, a storage device 123, and a memory 124.
  • the communication interface 121, the processor 122, the storage device 123, and the memory 124 are connected by an internal bus 142.
  • the management server 103 includes a communication interface 125, a processor 126, a storage device 127, and a memory 128.
  • the communication interface 125, the processor 126, the storage device 127, and the memory 128 are connected by an internal bus 143.
  • FIG. 2 is a block diagram showing a configuration example of the deployment server 101.
  • the deployment server 101 has a function of building the source code stored in the source code repository 201 and updating the application program 300 running on the virtual machine 112.
  • the deployment program 200 is stored in the memory 120.
  • the storage device 119 stores a source code repository 201.
  • the source code repository 201 stores the source code of the application program 300 provided on the virtual machine 112. This source code is developed by the developer 140 using the development terminal 104 and is stored in the source code repository 201 via the network 106.
  • Upon receiving a deployment instruction via the input device 137 of the management terminal 105, the deployment program 200 builds the source code stored in the source code repository 201 and replaces the application program 300 running on the virtual machine 112 with the executable file generated by the build, thereby updating the application.
  • FIG. 3 shows a configuration diagram of the virtual machine 112 operating in the EC service 100.
  • An application program 300 that provides the EC service 100 is running on the memory 116 assigned to the virtual machine 112.
  • the storage device 115 stores a product data DB 303.
  • the product data DB (database) 303 stores product information including product names, product prices, and the number of products in stock.
  • the application program 300 acquires product information stored in the product data DB 303 and provides a service based on the product information.
  • the application program 300 is open to the network 106.
  • The monitoring agent program 301 acquires operation information of the application program 300 and transmits the monitoring metric values obtained by monitoring to the service monitoring manager program 400 (FIG. 4) of the service monitoring server 102 via the network 106.
  • FIG. 4 shows a configuration diagram of the service monitoring server 102.
  • The service monitoring server 102 receives and saves the monitoring results from the monitoring agent program 301 running on the virtual machine 112.
  • The service monitoring manager program 400 is stored in the memory 124.
  • The monitoring metrics value table 401 is stored in the storage device 123.
  • The service monitoring manager program 400 receives the monitoring metric values acquired by the monitoring agent program 301 running on the virtual machine 112 and stores them in the monitoring metrics value table 401 in the storage device 123. Details of the monitoring metrics value table 401 will be described later.
  • FIG. 5 shows a configuration diagram of the management server 103.
  • the management server 103 has a function of learning the monitoring metric value acquired by the service monitoring server 102 and creating a sales number prediction model in the application program 300.
  • the processor 126 predicts the number of sales in the EC service 100 using this sales number prediction model.
  • the memory 128 stores a learning data storage program 500, a learning data selection program 501, a prediction model generation program 502, and an inventory management program 503.
  • the storage device 127 stores a version-specific learning data table 504, a cluster centroid position table 505, a prediction model table 506, and a program setting table 507.
  • The learning data storage program 500 reads values from the monitoring metrics value table 401 of the service monitoring server 102, processes them, and stores the result in the version-specific learning data table 504.
  • the learning data selection program 501 calculates the cluster centroid position based on the learning data in the version-specific learning data table 504 and stores it in the cluster centroid position table 505.
  • the learning data selection program 501 selects a table used for learning based on the cluster centroid position stored in this way.
  • Although such records would normally be stored in a single table, in this embodiment it is assumed, for convenience of explanation, that the version-specific learning data table 504 consists of a separate table per version, each corresponding to a group of records.
  • the prediction model generation program 502 generates a prediction model based on the learning data in the table selected by the learning data selection program, and stores the prediction model in the prediction model table 506. Details of these processes will be described later.
  • the inventory management program 503 acquires the latest prediction model from the prediction model table 506, and holds the acquired prediction model as a sales number prediction model 508.
  • the inventory management program 503 predicts the number of sales in the EC service 100 based on the sales number prediction model 508.
  • the operation manager 139 adjusts the order quantity from this prediction information.
  • The settings in the program setting table 507 are used by the learning data storage program 500 and the learning data selection program 501.
  • FIG. 6 is a diagram illustrating an example of the monitoring metric value table 401 stored in the storage device 123 of the service monitoring server 102.
  • The monitoring metrics value table 401 manages version 601, date and time 602, number of accesses 603, number of users 604, transition rate 605, and purchase rate 606.
  • A metric value is a numerical value, such as 5000 for the number of accesses 603, whereas a metric refers to the item itself, such as the number of accesses.
  • Monitoring data refers to a collection of metrics values for each metric at a certain date and time.
  • The metric values of the application program 300 are sent to the service monitoring server 102 by the monitoring agent program 301 running on the virtual machine 112 and stored by the service monitoring manager program 400.
  • the monitoring metrics value table 401 stores version 601, date and time 602, number of accesses 603, number of users 604, transition rate 605 and purchase rate 606.
  • Version 601 indicates version information of the application program 300 running on the virtual machine 112.
  • the date and time 602 indicates the date and time when the monitoring metric value is acquired
  • the number of accesses 603 indicates the number of times that an introduction page of a product sold by the EC service 100 is accessed within a unit time.
  • the number of users 604 indicates the number of users registered in the application program 300 when the metrics value is acquired.
  • the transition rate 605 indicates the rate of transition from the product introduction page to the purchase page in the number of accesses 603.
  • the purchase rate 606 indicates the rate of purchase of a product among the number of accesses.
  • the number of sales indicates the number of products purchased.
  • FIG. 7 is a diagram illustrating an example of the version-specific learning data table 504 stored in the storage device 127 of the management server 103.
  • The learning data of version 2.03, the learning data of version 2.04, and the learning data of version 2.05 are stored in separate learning data tables 701, 702, and 703, respectively.
  • Learning data refers to data stored in the version-specific learning data table 504 by the learning data storage program 500.
  • the contents of the version-specific learning data table 504 are normalized by the learning data storage program 500.
  • This version-specific learning data table 504 stores values normalized by selecting only the metrics used for learning from the metrics in the monitoring metrics value table 401.
  • In items 705, 706, and 707, the values obtained by normalizing the number of accesses, the transition rate, and the purchase rate are stored. Since item 704 serves as the ID of the learning data, it is stored as-is without normalization. In this embodiment, the number of users 604 is not used for learning and is therefore not stored in the version-specific learning data table 504.
  • FIG. 8 is a diagram illustrating an example of the cluster centroid position table 505 stored in the storage device 127 of the management server 103.
  • The cluster centroid position table 505 stores, for each version 801, the calculation result 802 of the cluster centroid position.
  • The cluster centroid position refers to the average of the coordinates obtained by mapping the learning data of each version of the version-specific learning data table 504 into a coordinate space.
  • the cluster centroid position table 505 is used and updated when the learning data selection program 501 is executed.
  • FIG. 9 is a diagram illustrating an example of the prediction model table 506 stored in the storage device 127 of the management server 103.
  • the prediction model table 506 stores the date and time 900 when the prediction model was generated, the version 901 used when generating the prediction model, and the prediction model information 902 generated thereby.
  • The versions used are listed in version 901, for example as [2.01] and [2.02].
  • the prediction model information stores information on the prediction model itself. For example, when the prediction model is created using a neural network as shown in FIG. 10, the weight of each node is stored in the prediction model information.
  • the prediction model table is updated when the prediction model generation program 502 is executed.
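  • For illustration only, one entry of such a prediction model table could be represented as follows; the field names and the use of pickle for the model information are assumptions, not taken from the patent.

```python
import datetime
import pickle

# Illustrative representation of one prediction model table entry (FIG. 9).
prediction_model_table = []

def register_model(model, versions):
    """Append an entry holding the creation date/time, the versions used, and the model itself."""
    prediction_model_table.append({
        "created_at": datetime.datetime.now(),   # item 900: date and time of generation
        "versions": versions,                    # item 901: versions used when generating the model
        "model_info": pickle.dumps(model),       # item 902: serialized model (e.g. node weights)
    })

register_model({"w1_0": 0.3, "w2_0": -0.1}, ["2.01", "2.02"])
print(prediction_model_table[-1]["versions"])    # ['2.01', '2.02']
```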
  • FIG. 10 shows a configuration example of the neural network of the sales number prediction model 508 created by the prediction model generation program, which is divided into, for example, an input layer, a hidden layer, and an output layer.
  • The inputs are the number of accesses, the transition rate, and the purchase rate, and the output is the number of sales.
  • Node 1 has one input, which is weighted “w1_0”.
  • weights “wN_0”, “wN_1”, “wN_2”, “wN_3”, and “wN_4” are applied to each input.
  • the weight value is stored in the prediction model information 903 and 904 shown in FIG.
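  • The following is a minimal sketch of such a three-input, one-output network; the toy training values, the single hidden layer size, and the use of scikit-learn's MLPRegressor are assumptions made only for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Inputs: normalized [number of accesses, transition rate, purchase rate]; output: number of sales.
X_train = np.array([[0.52, 0.10, 0.03],
                    [0.61, 0.12, 0.04],
                    [0.58, 0.11, 0.05]])          # toy values, not taken from the patent
y_train = np.array([150.0, 180.0, 210.0])

# One hidden layer, matching the input/hidden/output structure of FIG. 10.
model = MLPRegressor(hidden_layer_sizes=(5,), max_iter=5000, random_state=0)
model.fit(X_train, y_train)

# The learned weights (model.coefs_) play the role of the "prediction model information"
# that the patent stores in the prediction model table.
print([w.shape for w in model.coefs_])            # e.g. [(3, 5), (5, 1)]
print(model.predict(np.array([[0.60, 0.11, 0.04]])))
```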
  • FIG. 11 is a diagram illustrating an example of the program setting table 1000 stored in the storage device 127 of the management server 103.
  • The program setting table 1000 stores the settings used by the learning data storage program 500 and the learning data selection program 501.
  • The process execution time interval setting 1001 stores the execution interval of the learning process S1100 (FIG. 12) executed by the management server 103.
  • a learning metric selection setting 1002 is used when a metric used for learning is selected by the learning data selection program 501.
  • the data number threshold setting 1003 is used when determining whether or not to execute processing in the learning data selection program 501.
  • the cluster centroid position threshold setting 1004 is used when the learning data selection program 501 selects a learning data table used for learning.
  • The settings are stored in the program setting table 1000 by the operation manager 139 via the network 106, using the input device 137 of the management terminal 105, before the learning process S1100 is executed.
  • the stored metrics are the access count 603, the user count 604, the transition rate 605, and the purchase rate 606.
  • the present embodiment exemplifies an EC service as the monitoring target system 100, but is not limited to this, and can be applied to, for example, prediction of storage response performance.
  • the monitoring metric value table 401 stores the processor usage rate, the cache usage rate, the cache size, and the like.
  • FIG. 12 is a flowchart illustrating an example of a learning process S1100 for generating a prediction model. This flowchart is executed by the management server 103.
  • the learning data storage process S1101 corresponds to the learning data storage program 500
  • the learning data selection process S1102 corresponds to the learning data selection program 501
  • The prediction model generation process S1103 corresponds to the prediction model generation program 502.
  • These programs 500, 501, and 502 are loaded into the memory 128 of the management server 103, and the processing included in each program 500, 501, and 502 is executed by the processor 126.
  • Learning process S1100 is executed at regular time intervals based on a process execution time interval setting 509 predetermined by the operation manager 139 (step S1105).
  • In the process execution time interval setting 1001 of the program setting table 1000 shown in FIG. 11, only the process execution interval is described. If the process execution time interval setting 1001 specifies 1 hour, the process is executed every hour.
  • The date and time at which the process was last executed is recorded and used in the next learning data storage process S1101.
  • The processor 126 executes the learning data storage process (step S1101): based on the previous execution date and time, it reads from the monitoring metrics value table 401 the data added since the previous execution of the process S1100 and saves it in the version-specific learning data table 504.
  • the processor 126 executes a learning data selection process (step S1102), and selects a learning data table used for prediction model generation from the version-specific learning data table 504.
  • the processor 126 outputs a learning data table of the version-specific learning data table 504 used for learning (step S1102), and then the processor 126 executes prediction model generation processing (step S1103).
  • a new prediction model is generated using the learning data table of the version-specific learning data table 504 passed by the learning data selection processing S1102 and stored in the prediction model table 506.
  • The processor 126 acquires the prediction model generated in the prediction model generation process (step S1103) from the prediction model table 506 and updates the sales number prediction model 508 of the inventory management program 503 with the newly generated prediction model (step S1104). A minimal sketch of this overall loop is shown below.
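  • The following sketch of the periodic learning loop is illustrative only; the callable parameters are assumptions standing in for the storage, selection, generation, and update processing described above.

```python
import time
from typing import Callable, Sequence

def learning_process(interval_hours: float,
                     store_learning_data: Callable[[], None],
                     select_learning_data: Callable[[], Sequence],
                     generate_prediction_model: Callable[[Sequence], object],
                     update_sales_prediction_model: Callable[[object], None]) -> None:
    """Illustrative version of learning process S1100; step comments follow FIG. 12."""
    while True:
        store_learning_data()                         # S1101: save newly added monitoring data per version
        selected = select_learning_data()             # S1102: choose the learning data tables to use
        model = generate_prediction_model(selected)   # S1103: generate (or incrementally update) the model
        update_sales_prediction_model(model)          # S1104: replace the sales number prediction model
        time.sleep(interval_hours * 3600)             # S1105: wait for the configured execution interval
```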
  • FIG. 13 is a flowchart showing details of the learning data storage process shown in FIG.
  • the processor 126 acquires the monitoring data for the amount increased from the previous execution from the monitoring metrics value table 401 of the service monitoring server 102 via the network 106 (step S1201).
  • Using the date and time of the previous execution as described above, the processor 126 determines which monitoring data has been added and acquires the monitoring data whose date and time are later than the previous execution.
  • the processor 126 refers to the learning metric selection setting 1002 set by the operation manager 139 in advance, and selects a metric used for learning from the monitoring data in the monitoring metric value table 401 (step S1202).
  • The learning metric selection setting 1002 lists the metrics used for learning; for example, three metrics are listed: the number of accesses, the transition rate, and the purchase rate.
  • the processor 126 normalizes the metric value of the metric selected as described above (step S1203).
  • Normalization here means converting each metric value into a number between 0 and 1 that indicates where the value lies between the minimum and maximum values that the metric can take.
  • In step S1204, the normalized metric values are stored in the version-specific learning data table 504, separated by version.
  • For example, when the monitoring data of 2016/10/10 13:00 is acquired in step S1201 as the newly added monitoring data from the monitoring metrics value table 401, the processor 126 selects the number of accesses, the transition rate, and the purchase rate as the metrics (step S1202).
  • The processor 126 then normalizes the metric values of the selected metrics (step S1203) and stores the normalized metric values in the table 703 of the version-specific learning data table 504 (step S1204). A minimal sketch of this normalization and storage step is shown below.
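  • As a concrete but hypothetical illustration of the min-max normalization and per-version storage: the metric names, value ranges, and sample row below are assumptions.

```python
# Min-max normalization of the selected metrics and storage by version (steps S1202-S1204).
def normalize(value: float, minimum: float, maximum: float) -> float:
    """Map a metric value to 0..1 according to where it lies between its minimum and maximum."""
    if maximum == minimum:
        return 0.0
    return (value - minimum) / (maximum - minimum)

row = {"date": "2016/10/10 13:00", "accesses": 5000, "transition": 0.22, "purchase": 0.05}
ranges = {"accesses": (0, 10000), "transition": (0.0, 1.0), "purchase": (0.0, 1.0)}

normalized = {"date": row["date"]}                    # the date acts as the ID and is stored as-is
for metric in ("accesses", "transition", "purchase"): # metrics chosen by the learning metric selection setting
    low, high = ranges[metric]
    normalized[metric] = normalize(row[metric], low, high)

version_tables = {}                                   # version -> list of normalized rows (S1204)
version_tables.setdefault("2.05", []).append(normalized)
print(normalized)                                     # e.g. {'date': ..., 'accesses': 0.5, ...}
```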
  • FIG. 14 is a flowchart showing details of the learning data selection process shown in FIG.
  • The processor 126 refers to the version-specific learning data table 504 and checks whether there is a sufficient number of learning data items for the version of the application program 300 running on the virtual machine 112 (step S1301). Whether the number of data items is sufficient is determined according to the data number threshold setting 1003 set in advance by the operation manager 139.
  • The data number threshold setting 1003 stores only a value indicating how many data items are considered sufficient. For example, when "300" is set in advance by the operation manager 139, the number of data items is judged sufficient if there are 300 or more pieces of learning data in the table 703 of the version-specific learning data table 504. If the number of data items is sufficient, the processor 126 calculates the cluster centroid position for each version using the version-specific learning data table 504 (step S1303).
  • In step S1302, the processor 126 compares the current number of data items with the number of data items used in the previous cluster centroid position calculation, and calculates the cluster centroid position only when the number of data items has increased. The result is stored in the cluster centroid position table 505.
  • For this purpose, the learning data selection program 501 holds the number of data items used in the cluster centroid position calculation.
  • the processor 126 selects a learning data table used for learning (step S1304).
  • Following the cluster centroid position threshold setting 1004 preset by the operation manager 139, the processor 126 selects, from the version-specific learning data table 504, the learning data tables of the versions whose distance from the cluster centroid position of the version of the application program 300 running on the virtual machine 112 falls within the threshold.
  • The distance from the cluster centroid position refers to the difference between the values of two cluster centroid positions.
  • the cluster centroid position threshold setting 1004 stores only the threshold value.
  • The cluster centroid position of each version is acquired from the cluster centroid position table 505. If the number of learning data items is not sufficient in step S1301, the processor 126 does not execute the cluster centroid position calculation process.
  • In that case, the processor 126 selects the versions that were selected in the previous execution of the learning data selection process, and also selects the learning data table of the version running on the virtual machine 112 (step S1305).
  • the selected learning data table is held by the learning data selection program 501.
  • the processor 126 selects the learning data table and executes the following prediction model generation process (S1103).
  • FIGS. 15A to 15C are conceptual diagrams showing an example of calculating the cluster centroid position in step S1303, and FIG. 15D is a conceptual diagram showing an example of selecting learning data in step S1304.
  • As shown in FIGS. 15A to 15C, the processor 126 maps the learning data in the learning data table into the coordinate space for each version and calculates the centroid of the mapped learning data (step S1400).
  • Here, the learning data of versions [2.03], [2.04], and [2.05] are mapped, and the centroid position of each is obtained by calculation.
  • In FIG. 15D, only the cluster centroid position of each version is mapped, and the distances from version [2.05] are compared (step S1401).
  • The threshold value in the figure is the cluster centroid position threshold set by the operation manager 139.
  • The cluster centroid position of version [2.05] shown in FIG. 15C is, for example, "0.61". The cluster centroid position "0.56" of version [2.03] shown in FIG. 15A is within the threshold of that value, whereas the cluster centroid position "0.72" of version [2.04] shown in FIG. 15B is not. Accordingly, the learning data tables of version [2.03] and the currently operating version [2.05] are selected. A minimal sketch of this centroid calculation and selection is shown below.
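  • The sketch below illustrates the centroid calculation (S1303) and the threshold-based selection (S1304) with illustrative two-dimensional learning data and an assumed threshold; the numbers are toy values.

```python
import numpy as np

# Per-version normalized learning data (toy values) and an assumed distance threshold.
learning_data = {
    "2.03": np.array([[0.50, 0.60], [0.62, 0.55]]),
    "2.04": np.array([[0.80, 0.70], [0.75, 0.62]]),
    "2.05": np.array([[0.58, 0.66], [0.64, 0.58]]),
}
running_version = "2.05"
threshold = 0.10

# S1303: the cluster centroid is the mean of the learning data mapped into the coordinate space.
centroids = {version: data.mean(axis=0) for version, data in learning_data.items()}

# S1304: keep versions whose centroid lies within the threshold of the running version's centroid.
current = centroids[running_version]
selected = [version for version, centroid in centroids.items()
            if version == running_version or np.linalg.norm(centroid - current) <= threshold]
print(selected)   # the learning data of these versions is passed to prediction model generation
```

  • With these toy values the selection reproduces the example above: versions [2.03] and [2.05] are kept and [2.04] is excluded.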
  • FIG. 16 is a flowchart showing details of the prediction model generation process (step S1103).
  • the processor 126 refers to the prediction model table 506 and selects a past prediction model corresponding to the selected learning data table. An example of selecting a prediction model is shown below.
  • From the prediction model table 506, the processor 126 first searches for a prediction model generated from the learning data tables of both versions [2.03] and [2.05].
  • If such a prediction model does not exist, the processor 126 searches for a prediction model generated from only one of the selected learning data tables.
  • In the case where versions [2.03] and [2.05] are selected as described above, this corresponds to either the prediction model generated from version [2.03] or the prediction model generated from version [2.05].
  • In step S1102, the processor 126 has selected the learning data table 701 of version [2.03] and the learning data table 703 of version [2.05] from the version-specific learning data table 504.
  • In step S1501, the processor 126 selects, from the prediction model table 506, the prediction model created from version [2.03].
  • In step S1502, the processor 126 determines whether a corresponding prediction model was selected in step S1501.
  • If so, the processor 126 additionally trains the past prediction model selected in step S1501 on the difference learning data to generate a new prediction model (step S1503), and registers it in the prediction model table 506 (see FIG. 9).
  • Here, the difference refers to the learning data dated later than the date and time in item 900 (the date and time when the prediction model was created) of the prediction model table 506.
  • By additionally training the past prediction model on this difference learning data, a new prediction model is generated, and the new prediction model is added to the prediction model table 506.
  • When no corresponding prediction model exists and no past prediction model can be used, the processor 126 generates a prediction model using all the learning data contained in the tables selected by the learning data selection process S1102 (step S1504), and adds the prediction model to the prediction model table 506. A minimal sketch of this choice between incremental and full training is shown below.
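  • The following sketch illustrates the branch between additional (incremental) training and full retraining; the use of scikit-learn's MLPRegressor and the argument layout are assumptions, not the patent's own interface.

```python
from sklearn.neural_network import MLPRegressor

def generate_prediction_model(past_model, diff_X, diff_y, all_X, all_y):
    """Illustrative version of steps S1501-S1504 (arguments are assumptions).

    If a usable past prediction model exists, continue training it on only the difference
    learning data (records newer than the model's creation date and time); otherwise
    train a new model on all of the selected learning data.
    """
    if past_model is not None:
        past_model.partial_fit(diff_X, diff_y)     # S1503: additional learning on the difference data
        return past_model
    model = MLPRegressor(hidden_layer_sizes=(5,), max_iter=5000, random_state=0)
    model.fit(all_X, all_y)                        # S1504: learn from scratch on all selected data
    return model
```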
  • FIG. 17 is a diagram illustrating a state in which the number of sales indicated by the inventory management program 503 is predicted.
  • the vertical axis represents the number of sales
  • the horizontal axis represents time. Below the horizontal axis, which version is running in the application program 300 is shown.
  • the change in the number of sales indicated by the solid line is an actual measurement value
  • the change in the number of sales indicated by the dotted line is a prediction value based on the prediction model
  • The change in the number of sales indicated by the dashed line is a predicted value based on the old prediction model.
  • When the prediction model is updated in step S1104 described above, the prediction switches to the predicted value produced by the newly generated prediction model (corresponding to the dotted line in the figure).
  • When the version of the application program 300 is updated from [2.04] to [2.05], the predicted value based on the old model deviates considerably from the measured value indicated by the solid line. By updating to the new model, predictions closer to the measured values become possible.
  • As described above, the processor 126 acquires monitoring data from the monitoring target system 100 and divides the acquired monitoring data according to the behavior of the monitoring target system 100.
  • The processor 126 compares the features extracted from the divided monitoring data with the features of the monitoring data of the monitoring target system operating at the time of processing execution, selects the data whose features are close to each other as the learning data to be used for learning, and generates a prediction model using the selected learning data.
  • Because the learning data is managed per behavior of the monitoring target system and only the learning data close to the behavior of the currently operating monitoring target system 100 is selected for learning, the prediction accuracy of the prediction model generated by learning can be improved.
  • As a result, a prediction model with high prediction accuracy for the future behavior of the monitored system can be generated, and the prediction accuracy for the future behavior can be increased.
  • In the second embodiment, in the process S1204 (see FIG. 13) of the learning data storage process S1101, the learning data storage program 500 divides the learning data by date and time, to handle cases in which the behavior differs depending on the time period, such as weekdays and holidays, and stores it in the date-specific learning data table 1700.
  • The second embodiment also differs from the first embodiment in that, as described later, the processor 126 uses the learning data in the date-specific learning data table 1700 for generating the prediction model. A more specific description is given below.
  • FIG. 18 is a block diagram illustrating a configuration example of the management server 103A according to the second embodiment.
  • The management server 103A has substantially the same configuration as the management server 103 according to the first embodiment, but differs in that it holds, instead of the version-specific learning data table 504, a date-specific learning data table 1700 that stores learning data by date and time.
  • FIGS. 19A to 19C show table configuration examples of the date-specific learning data table 1700 shown in FIG. 18.
  • the date-specific learning data table 1700 is stored in the memory 128 of the management server 103.
  • Each learning data table is given a table name consisting of the date and time, and represents the date and time at which the learning data is stored.
  • FIG. 19A illustrates a learning data table 1701 as of 9:00 on October 8, 2016, and FIG. 19B illustrates a learning data table 1702 as of 9:00 on October 9, 2016.
  • FIG. 19C illustrates a learning data table 1703 as of October 10, 2016 at 9:00.
  • Each learning data table manages, for example, date 704, number of accesses 705, transition rate 706, and purchase rate 707.
  • FIG. 20 is a flowchart of the learning data storage process S1101A according to the second embodiment. FIG. 20 in the second embodiment corresponds to FIG. 13 in the first embodiment, and steps S1201, S1202, and S1203 in the second embodiment are the same as steps S1201, S1202, and S1203 in the first embodiment.
  • This learning data storage process S1101A is executed by the learning data storage program 500 in place of the learning data storage process S1101 shown in FIGS. 12 and 13 according to the first embodiment.
  • the learning data storage program 500 is expanded in the memory 128 and executed by the processor 126.
  • steps S1201 to S1203 according to the second embodiment are the same as those of the first embodiment, and thus description thereof is omitted.
  • the processor 126 separates the learning data by the date and time instead of the version. At this time, the processor 126 reads the table division setting 1805 set in advance by the operation manager 139 and separates the learning data based on the table division setting 1805.
  • the table division setting 1805 is stored in the program setting table 507 of the management server 103, and the operation manager 139 performs setting via the network 106 using the input device 137 of the management terminal 105.
  • The table division setting 1805 describes the date and time at which the learning table is divided. The learning tables may therefore all be divided at the same time of day, or at different times.
  • For example, the tables 1701, 1702, and 1703 are all divided at 9:00, but only the table 1702 could instead be divided at 12:00 on October 9, 2016.
  • The data normalized in step S1203 is then stored in the date-specific learning data table 1700 (step S1204A). A minimal sketch of this date-based division is shown below.
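  • The following sketch shows one way to divide records by date and time, assuming a single daily division time of 9:00 as the table division setting; the record layout is also an assumption.

```python
from datetime import datetime, time, timedelta

# Illustrative date-based table division (step S1204A); the 9:00 division time is an assumption.
division_time = time(hour=9)

def table_key(timestamp: datetime) -> str:
    """Return the name of the date-specific learning data table that a record belongs to."""
    day = timestamp.date()
    if timestamp.time() < division_time:          # records before 9:00 belong to the previous day's table
        day = day - timedelta(days=1)
    return f"{day.isoformat()} {division_time.strftime('%H:%M')}"

date_tables = {}                                  # table name -> list of normalized records
record = {"date": datetime(2016, 10, 10, 13, 0), "accesses": 0.52, "transition": 0.22, "purchase": 0.05}
date_tables.setdefault(table_key(record["date"]), []).append(record)
print(list(date_tables))                          # ['2016-10-10 09:00']
```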
  • After step S1101A, the processor 126 executes the learning data selection process shown in FIG. 12, as in the first embodiment (step S1102).
  • The processor 126 performs substantially the same operation as in the first embodiment, except that the learning data table to be processed is the date-specific learning data table 1700 rather than the version-specific learning data table 504.
  • In step S1306, the processor 126 performs substantially the same processing using the date-specific learning data table 1700 instead of the version-specific learning data table 504 (see FIG. 14).
  • the table is selected in substantially the same manner as in the first embodiment (steps S1304 and S1305).
  • The processor 126 then selects a learning data table from the date-specific learning data table 1700 and generates a prediction model using that learning data table as the input (step S1103).
  • The present invention relates to a learning data management method used when machine learning is applied to a service developed with "DevOps", a development method in which developers and operations managers cooperate with each other, and it can be widely applied to learning data management devices that use machine learning.
  • Reference signs: 103 ... management server, 500 ... learning data storage program, 501 ... learning data selection program, 502 ... prediction model generation program, 504 ... version-specific learning data table, 505 ... cluster centroid position table, 506 ... prediction model table, S1100 ... learning process, S1101 ... learning data storage process, S1102 ... learning data selection process, S1103 ... prediction model generation process, S1201 ... monitoring metrics value acquisition process, S1202 ... learning metrics selection process, S1203 ... metric value normalization process, S1204 ... version-specific learning data storage process, S1303 ... cluster centroid position calculation process, S1305 ... learning data table selection process, S1503 ... prediction model generation process.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Mathematical Physics (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Software Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Computing Systems (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Hardware Design (AREA)
  • Debugging And Monitoring (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The objective of the invention is to ensure that the prediction accuracy of baseline information and the like does not deteriorate even when a discrepancy arises between the actual operation and a learning result created using metric values from before an update, in the case where the behavior of the target system changes because of the update. To this end, the invention provides a learning data management device that compares an extracted feature with a feature of the monitoring data of the monitored system operating at the time the process is executed, selects the data in which the two features are close as the learning data to be used for learning, and then generates a prediction model using the selected learning data.
PCT/JP2017/005976 2017-02-17 2017-02-17 Learning data management device and method WO2018150550A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2017/005976 WO2018150550A1 (fr) 2017-02-17 2017-02-17 Learning data management device and method
JP2019500139A JP6695490B2 (ja) 2017-02-17 2017-02-17 Learning data management device and learning data management method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2017/005976 WO2018150550A1 (fr) 2017-02-17 2017-02-17 Learning data management device and method

Publications (1)

Publication Number Publication Date
WO2018150550A1 (fr) 2018-08-23

Family

ID=63169197

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/005976 WO2018150550A1 (fr) 2017-02-17 2017-02-17 Learning data management device and method

Country Status (2)

Country Link
JP (1) JP6695490B2 (fr)
WO (1) WO2018150550A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102247182B1 (ko) * 2020-12-18 2021-05-03 주식회사 이글루시큐리티 Method, apparatus, and program for generating new data using a clustering technique

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011192097A (ja) * 2010-03-16 2011-09-29 Hitachi Ltd Anomaly detection method and information processing system using the same
US20120284213A1 (en) * 2011-05-04 2012-11-08 Google Inc. Predictive Analytical Modeling Data Selection
WO2013030984A1 (fr) * 2011-08-31 2013-03-07 株式会社日立エンジニアリング・アンド・サービス Facility state monitoring method and device therefor
JP2015082259A (ja) * 2013-10-23 2015-04-27 本田技研工業株式会社 Time-series data prediction device, time-series data prediction method, and program

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020066697A1 (fr) * 2018-09-27 2020-04-02 ソニー株式会社 Information processing device, information processing method, and program
JP2022512233A (ja) * 2018-12-10 2022-02-02 インタラクティブ-エーアイ,エルエルシー Neural adjustment codes for multilingual, style-dependent spoken language processing
KR20200084441A (ko) * 2018-12-26 2020-07-13 단국대학교 산학협력단 Apparatus and method for automated application builds for generating machine learning training datasets
KR102167767B1 (ko) * 2018-12-26 2020-10-19 단국대학교 산학협력단 Apparatus and method for automated application builds for generating machine learning training datasets
KR20210132500A (ko) * 2020-04-27 2021-11-04 한국전자기술연구원 Federated learning system and method
KR102544531B1 (ko) * 2020-04-27 2023-06-16 한국전자기술연구원 Federated learning system and method
WO2024023917A1 (fr) * 2022-07-26 2024-02-01 日本電信電話株式会社 Training device, training method, and program

Also Published As

Publication number Publication date
JPWO2018150550A1 (ja) 2019-07-25
JP6695490B2 (ja) 2020-05-20

Similar Documents

Publication Publication Date Title
WO2018150550A1 (fr) Learning data management device and method
US11288002B2 (en) System and method for providing high availability data
US20080097802A1 (en) Time-Series Forecasting
US20200090085A1 (en) Digital twin graph
JP6460095B2 (ja) 学習モデル選択システム、学習モデル選択方法及びプログラム
US20170160880A1 (en) System and Method for Integrating Microservices
US12032533B2 (en) Code generator platform for data transformation
US20180114136A1 (en) Trend identification using multiple data sources and machine learning techniques
US10685319B2 (en) Big data sourcing simulator
CN105074724A (zh) 使用列式数据库中的直方图进行有效查询处理
US10614101B2 (en) Virtual agent for improving item identification using natural language processing and machine learning techniques
US20220045847A1 (en) Determining a change to product information or user information via hashing
US11366821B2 (en) Epsilon-closure for frequent pattern analysis
US9268814B2 (en) Enablement of quasi time dependency in organizational hierarchies
CN111309712A (zh) 基于数据仓库的优化任务调度方法、装置、设备及介质
CN112202617A (zh) 资源管理系统监控方法、装置、计算机设备和存储介质
CA3044689A1 (fr) Plateforme de partage d'attribut destinee a des systemes de traitement de donnees
US20180039901A1 (en) Predictor management system, predictor management method, and predictor management program
US10885468B2 (en) Dynamic search system for real-time dynamic search and reporting
CN110659155B (zh) 用于备份拓扑图的系统和方法
CN111932338A (zh) 一种商品推荐方法、装置、设备及存储介质
US10417228B2 (en) Apparatus and method for analytical optimization through computational pushdown
WO2010082885A1 (fr) Method for preventing customer loss
US20200349629A1 (en) Data-driven hardware configuration recommendation system based on user satisfaction rating
US11609935B2 (en) Managing configuration datasets from multiple, distinct computing systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17896476

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019500139

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17896476

Country of ref document: EP

Kind code of ref document: A1