CN111897707A - Method and device for optimizing business system, computer system and storage medium - Google Patents

Info

Publication number
CN111897707A
Authority
CN
China
Prior art keywords
job
performance index
target operation
target
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010689223.9A
Other languages
Chinese (zh)
Other versions
CN111897707B (en)
Inventor
梁杰
金童
高童
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202010689223.9A
Publication of CN111897707A
Application granted
Publication of CN111897707B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06F ELECTRIC DIGITAL DATA PROCESSING
                • G06F11/00 Error detection; Error correction; Monitoring
                    • G06F11/30 Monitoring
                        • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
                            • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
                            • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
                        • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
                            • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
                                • G06F11/3419 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment by assessing time
                            • G06F11/3447 Performance evaluation by modeling
                            • G06F11/3466 Performance evaluation by tracing or monitoring
                                • G06F11/3476 Data logging
                                • G06F11/3495 Performance evaluation by tracing or monitoring for systems
                • G06F8/00 Arrangements for software engineering
                    • G06F8/70 Software maintenance or management
                        • G06F8/72 Code refactoring
                • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
                    • G06F2201/80 Database-specific techniques
                    • G06F2201/875 Monitoring of systems including the internet

Abstract

The present disclosure provides a method for optimizing a business system, including: acquiring performance indexes of multiple dimensions of a target job according to the business logic of the target job in the business system, the script code for implementing the business logic, and the information on the cluster that must be interacted with while implementing the business logic, wherein the performance index of each dimension comprises a plurality of characteristic variables; inputting the performance indexes of the multiple dimensions of the target job into a pre-constructed model, so that the model outputs a performance index corresponding to the target job and a contribution degree ranking of the plurality of characteristic variables; and adjusting the target job according to the performance index corresponding to the target job and the contribution degree ranking of the characteristic variables, so as to optimize the business system. The present disclosure also provides an apparatus for optimizing a business system, a computer system, and a storage medium.

Description

Method and device for optimizing business system, computer system and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for optimizing a business system, a computer system, and a storage medium.
Background
With the rapid development of computers and the internet, the business scenarios of various industries are becoming increasingly complex, and the amount of data handled by the business systems that implement these scenarios keeps growing. To ensure that a business system runs reliably, the system can be modified in response to performance problems that appear while it is running, thereby improving its reliability.
At present, an independent performance early warning platform is typically set up and used to test the business data in a business system, so as to discover potential performance hazards that may appear while the system is running.
In implementing the disclosed concept, the inventors found that such a separate performance early warning platform is usually used to test systems that are already running. In the stages before the business system is deployed to the production environment, particularly the development stage, performance testing cannot be carried out, so potential performance hazards cannot be discovered as early as possible, which reduces the reliability of the system.
Disclosure of Invention
In view of the above, the present disclosure provides a method and an apparatus for optimizing a business system, a computer system, and a storage medium.
One aspect of the present disclosure provides a method for optimizing a business system, including: acquiring performance indexes of multiple dimensions of a target job according to the business logic of the target job in the business system, the script code for implementing the business logic, and the information on the cluster that must be interacted with while implementing the business logic, wherein the performance index of each dimension comprises a plurality of characteristic variables; inputting the performance indexes of the multiple dimensions of the target job into a pre-constructed model, so that the model outputs a performance index corresponding to the target job and a contribution degree ranking of the plurality of characteristic variables; and adjusting the target job according to the performance index corresponding to the target job and the contribution degree ranking of the characteristic variables, so as to optimize the business system.
According to an embodiment of the present disclosure, acquiring performance indexes of multiple dimensions of a target job according to the business logic of the target job in the business system, the script code for implementing the business logic, and the information on the cluster that must be interacted with while implementing the business logic includes: acquiring a first-dimension performance index of the target job according to the business logic of the target job, wherein the first-dimension performance index comprises the logic layer to which the target job belongs, the algorithm type with which the target job loads data, the number of jobs running concurrently with the target job, and the number of the target job group to which the target job belongs; acquiring a second-dimension performance index of the target job according to the script code for implementing the business logic, wherein the second-dimension performance index comprises the number of occurrences of keywords in the script code and the number of temporary tables generated, and the keywords comprise a script for implementing a selection operation, a script for representing a selection condition, a script for implementing a grouping operation, a script for implementing a statistical information collection operation, and a script for implementing a table association operation; and acquiring a third-dimension performance index of the target job according to the information on the cluster that must be interacted with while implementing the business logic, wherein the third-dimension performance index comprises the size of the source table that provides a data source for the business system, whether the landing table generated by the target job is an index table, whether the landing table is a log table, the number of primary keys in the landing table, whether the landing table includes a trigger, whether the landing table allows migration, the number of nodes in the cluster, the communication bandwidth among the nodes in the cluster, the hash mode in which the landing table is stored in a distributed manner in the cluster, and the number of files corresponding to the same landing table that are stored in the cluster.
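As an illustration of the second-dimension performance index, the following minimal sketch counts keyword occurrences and temporary tables in a job script. It assumes the script is SQL-like; the source does not name the script language or the concrete keywords, so the mapping to SELECT, WHERE, GROUP BY, COLLECT STATISTICS and JOIN, the temporary-table pattern, and all function and field names are hypothetical.

```python
import re

# Hypothetical mapping from the keyword categories named in the text to
# SQL-like patterns. The source does not specify the concrete keywords.
KEYWORD_PATTERNS = {
    "select_count": r"\bSELECT\b",                           # selection operation
    "where_count": r"\bWHERE\b",                             # selection condition
    "group_by_count": r"\bGROUP\s+BY\b",                     # grouping operation
    "collect_stats_count": r"\bCOLLECT\s+STATISTICS\b",      # statistics collection
    "join_count": r"\bJOIN\b",                               # table association
}
# Assumed pattern for statements that create a temporary table.
TEMP_TABLE_PATTERN = r"\bCREATE\s+(?:GLOBAL\s+|LOCAL\s+)?TEMPORARY\s+TABLE\b"


def second_dimension_features(script: str) -> dict:
    """Count keyword occurrences and temporary tables in a job script."""
    features = {
        name: len(re.findall(pattern, script, flags=re.IGNORECASE))
        for name, pattern in KEYWORD_PATTERNS.items()
    }
    features["temp_table_count"] = len(
        re.findall(TEMP_TABLE_PATTERN, script, flags=re.IGNORECASE)
    )
    return features


# Made-up script used only to exercise the function.
example_script = """
CREATE TEMPORARY TABLE tmp_assets AS
SELECT c.customer_id, SUM(d.balance) AS total
FROM customer c JOIN deposit d ON c.customer_id = d.customer_id
WHERE d.balance > 0
GROUP BY c.customer_id;
COLLECT STATISTICS ON tmp_assets;
SELECT * FROM tmp_assets WHERE total > 1000;
"""
features = second_dimension_features(example_script)
```

A plain pattern match like this also counts keywords that appear inside comments or string literals; a real implementation would likely need a proper parser for the script language in use.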
According to an embodiment of the present disclosure, inputting the performance indexes of the multiple dimensions of the target job into a pre-constructed model, so that the model outputs a performance index corresponding to the target job and a contribution degree ranking of the plurality of characteristic variables, comprises: performing data cleaning on the plurality of characteristic variables included in the performance index of each dimension to obtain normalized characteristic variables; encoding the normalized characteristic variables to obtain encoded characteristic variables; and inputting the encoded characteristic variables into the model, so that the model outputs the performance index corresponding to the target job and the contribution degree ranking of the plurality of characteristic variables.
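The cleaning and encoding steps can be sketched as follows. The source does not say which cleaning or encoding techniques are used, so this example assumes mean imputation plus min-max scaling for numeric variables and one-hot encoding for categorical variables; the function names, field names, and sample values are all hypothetical.

```python
def clean(records, numeric_fields):
    """Fill missing numeric values with the field mean, then min-max scale to [0, 1]."""
    cleaned = [dict(r) for r in records]
    for field in numeric_fields:
        present = [r[field] for r in records if r.get(field) is not None]
        mean = sum(present) / len(present) if present else 0.0
        low, high = (min(present), max(present)) if present else (0.0, 0.0)
        span = high - low
        for r in cleaned:
            value = r.get(field)
            if value is None:
                value = mean
            # A constant field carries no information, so it maps to 0.0.
            r[field] = (value - low) / span if span else 0.0
    return cleaned


def one_hot_encode(records, categorical_fields):
    """Replace each categorical field with one 0/1 column per observed category."""
    categories = {f: sorted({r[f] for r in records}) for f in categorical_fields}
    encoded = []
    for r in records:
        row = {k: v for k, v in r.items() if k not in categorical_fields}
        for f, values in categories.items():
            for v in values:
                row[f"{f}={v}"] = 1.0 if r[f] == v else 0.0
        encoded.append(row)
    return encoded


# Made-up characteristic variables for three jobs.
records = [
    {"join_count": 2, "source_table_gb": 10.0, "hash_mode": "by_key"},
    {"join_count": 6, "source_table_gb": None, "hash_mode": "random"},
    {"join_count": 4, "source_table_gb": 30.0, "hash_mode": "by_key"},
]
encoded = one_hot_encode(
    clean(records, ["join_count", "source_table_gb"]), ["hash_mode"]
)
```

Scaling every numeric variable to the same range is one way to make the learned model parameters comparable across variables, which matters if their values are later read as contribution degrees.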
According to an embodiment of the present disclosure, the performance index includes a running time, and adjusting the target job according to the performance index corresponding to the target job and the contribution degree ranking of the plurality of characteristic variables to optimize the business system comprises: determining whether the running time corresponding to the target job is greater than a preset threshold; and, when the running time corresponding to the target job is determined to be greater than the preset threshold, adjusting the target job according to the contribution degree ranking of the characteristic variables corresponding to the target job, so as to optimize the business system.
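The threshold check described above can be sketched as a small gate in front of the adjustment step. The function name, the `top_k` parameter, the threshold, and all numeric values below are hypothetical and chosen only for illustration; the source does not say how many characteristic variables are reviewed or how the threshold is set.

```python
def plan_adjustment(predicted_runtime_s, contribution_ranking, threshold_s, top_k=3):
    """Return the characteristic variables to review first, or [] if no adjustment is needed.

    contribution_ranking is a list of (variable_name, contribution) pairs,
    already sorted from highest to lowest contribution.
    """
    if predicted_runtime_s <= threshold_s:
        return []  # running time within the threshold: leave the job as it is
    return [name for name, _ in contribution_ranking[:top_k]]


# Made-up model output for one target job.
ranking = [
    ("join_count", 0.42),
    ("source_table_gb", 0.31),
    ("temp_table_count", 0.15),
    ("cluster_nodes", 0.05),
]
to_review = plan_adjustment(5400.0, ranking, threshold_s=3600.0)
no_action = plan_adjustment(1200.0, ranking, threshold_s=3600.0)
```

In this sketch the ranking only tells a developer where to look first, for example reducing the number of table associations in the script; the adjustment itself remains a manual or separately automated step.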
According to an embodiment of the present disclosure, the method further comprises training the model in advance, wherein the training process comprises: determining a plurality of jobs that have already run; for each such job, acquiring the performance indexes of the multiple dimensions of the job and the historical running time of the job within a preset time period; and training an initial model according to the performance indexes of the multiple dimensions of the job and the historical running time of the job within the preset time period to obtain the model.
According to an embodiment of the present disclosure, training an initial model according to performance indexes of multiple dimensions of the job and a historical running time of the job in a preset time period includes: setting an initial value for a model parameter associated with each of the characteristic variables, wherein the value of the model parameter is used for characterizing the contribution degree of the characteristic variable; and training the initial model by taking the characteristic variables as samples and the historical running time of each job in a preset time period as a label to obtain the model.
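The training procedure can be illustrated with a deliberately simple model. The source does not name the model type, so this sketch assumes a linear model fitted by batch gradient descent, in which each characteristic variable has one model parameter (weight) whose magnitude is read as its contribution degree. The function names, learning rate, epoch count, and the toy data are hypothetical.

```python
def train_linear_model(samples, labels, feature_names, lr=0.1, epochs=2000):
    """Fit runtime ~ bias + sum(w_i * x_i) by batch gradient descent on squared error."""
    weights = {name: 0.0 for name in feature_names}  # initial model parameter values
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = {name: 0.0 for name in feature_names}
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = bias + sum(weights[k] * x[k] for k in feature_names) - y
            grad_b += error
            for k in feature_names:
                grad_w[k] += error * x[k]
        bias -= lr * grad_b / n
        for k in feature_names:
            weights[k] -= lr * grad_w[k] / n
    return weights, bias


def contribution_ranking(weights):
    """Rank characteristic variables by the magnitude of their model parameter."""
    return sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Toy data: encoded characteristic variables as samples, historical running
# time (seconds) as the label. The values are made up and exactly linear.
names = ["join_count", "temp_table_count"]
samples = [
    {"join_count": 0.0, "temp_table_count": 0.0},
    {"join_count": 1.0, "temp_table_count": 0.0},
    {"join_count": 0.0, "temp_table_count": 1.0},
    {"join_count": 1.0, "temp_table_count": 1.0},
]
labels = [10.0, 110.0, 30.0, 130.0]

weights, bias = train_linear_model(samples, labels, names)
ranking = contribution_ranking(weights)
```

On this toy data the fit recovers weights close to 100 for `join_count` and 20 for `temp_table_count`, so `join_count` ranks first. If a job has several historical runs within the preset time period, the label could be, for example, their mean; that choice is also an assumption, not something the source states.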
Another aspect of the present disclosure provides an apparatus for optimizing a business system, including: a first obtaining module, configured to acquire performance indexes of multiple dimensions of a target job according to the business logic of the target job in the business system, the script code for implementing the business logic, and the information on the cluster that must be interacted with while implementing the business logic, wherein the performance index of each dimension comprises a plurality of characteristic variables; an output module, configured to input the performance indexes of the multiple dimensions of the target job into a pre-constructed model, so that the model outputs a performance index corresponding to the target job and a contribution degree ranking of the plurality of characteristic variables; and an optimization module, configured to adjust the target job according to the performance index corresponding to the target job and the contribution degree ranking of the characteristic variables, so as to optimize the business system.
According to an embodiment of the present disclosure, the first obtaining module includes: a first obtaining unit, configured to acquire a first-dimension performance index of the target job according to the business logic of the target job, wherein the first-dimension performance index includes the logic layer to which the target job belongs, the algorithm type with which the target job loads data, the number of jobs running concurrently with the target job, and the number of the target job group to which the target job belongs; a second obtaining unit, configured to acquire a second-dimension performance index of the target job according to the script code for implementing the business logic, wherein the second-dimension performance index includes the number of occurrences of keywords in the script code and the number of temporary tables generated, and the keywords include a script for implementing a selection operation, a script for representing a selection condition, a script for implementing a grouping operation, a script for implementing a statistical information collection operation, and a script for implementing a table association operation; and a third obtaining unit, configured to acquire a third-dimension performance index of the target job according to the information on the cluster that must be interacted with while implementing the business logic, wherein the third-dimension performance index includes the size of the source table that provides a data source for the business system, whether the landing table generated by the target job is an index table, whether the landing table is a log table, the number of primary keys in the landing table, whether the landing table includes a trigger, whether the landing table allows migration, the number of nodes in the cluster, the communication bandwidth among the nodes in the cluster, the hash mode in which the landing table is stored in a distributed manner in the cluster, and the number of files corresponding to the same landing table that are stored in the cluster.
Another aspect of the present disclosure provides a computer-readable storage medium storing computer-executable instructions for implementing the method as described above when executed.
Another aspect of the disclosure provides a computer program comprising computer executable instructions for implementing the method as described above when executed.
Another aspect of the present disclosure provides a computer system comprising: one or more processors; storage means for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described above.
According to the embodiments of the present disclosure, performance indexes of multiple dimensions of a target job are acquired according to the business logic of the target job in the business system, the script code for implementing the business logic, and the information on the cluster that must be interacted with while implementing the business logic; the feature variables in the performance index of each dimension are input into the model; the model outputs the performance index corresponding to the target job and the contribution degree ranking of the feature variables; and the job is adjusted according to that performance index and that ranking. Because these indexes can be acquired in the development stage and input into the model, a performance early warning for the job can be issued during development on the basis of the performance index output by the model. This at least partially solves the technical problem in the related art that performance early warning is only performed on systems that are already running, so that performance hazards cannot be found early and system reliability is low, and thereby achieves the technical effect of improving the reliability of the system.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an exemplary system architecture to which the optimization method and apparatus of a business system of the embodiments of the present disclosure may be applied;
FIG. 2 schematically illustrates a flow chart of a method of optimization of a business system according to an embodiment of the disclosure;
FIG. 3 schematically illustrates a flow chart of a method of adjusting a target job according to a performance index corresponding to the target job and a contribution ranking of a plurality of feature variables, according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow chart of a method of obtaining performance indicators for multiple dimensions of a target job according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a flow diagram of a method of outputting, by a model, a performance index corresponding to a target job and a contribution ranking for a plurality of feature variables, in accordance with an embodiment of the present disclosure;
FIG. 6 schematically illustrates a flow chart of a method of pre-training a derived model according to an embodiment of the present disclosure;
FIG. 7 schematically illustrates a flow chart of a method of training an initial model according to an embodiment of the present disclosure;
FIG. 8 schematically illustrates a block diagram of an optimization apparatus of a business system, in accordance with an embodiment of the present disclosure; and
FIG. 9 schematically shows a block diagram of a computer system according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.). Where a convention analogous to "at least one of A, B or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.).
An embodiment of the present disclosure provides a method for optimizing a business system, including: acquiring performance indexes of multiple dimensions of a target job according to the business logic of the target job in the business system, the script code for implementing the business logic, and the information on the cluster that must be interacted with while implementing the business logic, wherein the performance index of each dimension comprises a plurality of characteristic variables; inputting the performance indexes of the multiple dimensions of the target job into a pre-constructed model, so that the model outputs a performance index corresponding to the target job and a contribution degree ranking of the plurality of characteristic variables; and adjusting the target job according to the performance index corresponding to the target job and the contribution degree ranking of the characteristic variables, so as to optimize the business system.
Fig. 1 schematically illustrates an exemplary system architecture 100 to which the optimization method and apparatus of the business system of the embodiments of the present disclosure may be applied. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in FIG. 1, a system architecture 100 according to this embodiment may include an electronic device 101, a server 102, and a database cluster 103.
The electronic device 101 may run a plurality of business systems, which may be, for example, a personal customer relationship business system, an equity business system, a financial accounting business system, and so on. Each business system may include a plurality of job groups, and each job group may include a plurality of jobs. For example, if the business system is a personal customer relationship system, it may include a customer profile job group, a personal star-level management job group, and the like. The customer profile job group may include, for example, a job for loading personal customer information, a job for counting personal customer assets, and the like. The personal star-level management job group may include jobs such as updating the personal customer star level and downloading the personal customer star level. Each job may contain one complete piece of business logic.
Each job, when run, may interact with the server 102 and the database cluster 103. The database cluster 103 may be a big data cluster that carries business data and may be configured to store business data for various complex business scenarios. In a big data application scenario, mass data is generally moved and stored between different business systems by means of Extract-Transform-Load (ETL) jobs.
The server 102 may be used to schedule the data resources needed by the business system. For example, if running a job in the business system requires information from a source table, the server 102 may acquire the source table from the database cluster 103, obtain the business-related information from the source table through operations such as data extraction and information collection, and then transmit the obtained information to the electronic device 101. The electronic device 101 may process the business data using the information sent by the server 102. During this processing, the electronic device 101 may also generate a landing table corresponding to the running job; the landing table stores data related to the job and may be an entity table, a log table, or the like. The electronic device 101 may further send the generated landing table to the server 102, and the server 102 may send the landing table to a plurality of nodes in the database cluster 103, so that the business data of the job is stored in a distributed manner across the cluster nodes.
It should be noted that the optimization method of the business system provided by the embodiment of the present disclosure can be generally executed by the server 102. Accordingly, the optimization device of the business system provided by the embodiment of the present disclosure may be generally disposed in the server 102. The optimization method of the business system provided by the embodiment of the present disclosure may also be performed by a server or a server cluster that is different from the server 102 and is capable of communicating with the electronic device 101 and/or the server 102. Correspondingly, the optimization device of the business system provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster that is different from the server 102 and is capable of communicating with the electronic device 101 and/or the server 102. Alternatively, the method for optimizing the business system provided by the embodiment of the present disclosure may also be executed by the electronic device 101, or may also be executed by another electronic device different from the electronic device 101. Accordingly, the optimization apparatus of the business system provided by the embodiment of the present disclosure may also be disposed in the electronic device 101, or disposed in another electronic device different from the electronic device 101.
It should be understood that the number of electronic devices, servers, database clusters, and nodes in a database cluster in FIG. 1 are merely illustrative. There may be any number of electronic devices, servers, database clusters, and cluster nodes, as desired for implementation.
Fig. 2 schematically shows a flow chart of an optimization method of a business system according to an embodiment of the present disclosure.
As shown in FIG. 2, the method includes operations S201 to S203.
In operation S201, performance indexes of multiple dimensions of a target job are acquired according to the business logic of the target job in the business system, the script code for implementing the business logic, and the information on the cluster that must be interacted with while implementing the business logic, where the performance index of each dimension includes multiple feature variables.
According to an embodiment of the present disclosure, each job of the business system may contain a piece of business logic. For example, the job of loading personal customer information needs to import a customer information table, and the job of counting personal customer assets needs to split the imported customer information table by deposit subject, extract the information in the customer deposit field, and so on. Each piece of business logic may be implemented in code, and executing that code to implement the business logic requires interaction with the database cluster.
According to an embodiment of the present disclosure, the performance indexes of a job can be acquired from three dimensions: the business logic of the job, the code that implements the business logic, and the information on the database cluster that is interacted with while the code runs. The performance index of each dimension can in turn comprise a plurality of feature variables.
For example, the performance index acquired from the dimension of the business logic of the job may include the number of job groups to which the job belongs, the order of processing between the job groups, and the like. The performance indicators obtained from the dimension of the code implementing the business logic may include the number of times selection operations occur in the code, the number of times association operations occur, and the like. The performance index obtained from the cluster dimension may include the number of nodes in the cluster, the communication bandwidth between the nodes in the cluster, and the like.
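One way to hold the three dimensions together is a small record per dimension that is flattened into a single set of feature variables before being passed to the model. The sketch below uses only the example variables named in the preceding paragraph; the class names, field names, the bandwidth unit, and the sample values are hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass
class BusinessLogicIndex:
    """First dimension: derived from the business logic of the job."""
    job_group_count: int
    group_processing_order: int


@dataclass
class ScriptIndex:
    """Second dimension: derived from the code implementing the business logic."""
    select_count: int
    join_count: int


@dataclass
class ClusterIndex:
    """Third dimension: derived from the cluster the job interacts with."""
    node_count: int
    bandwidth_gbps: float  # unit is an assumption; the source gives none


def flatten(logic, script, cluster):
    """Merge the three dimensions into one flat mapping of feature variables."""
    features = {}
    for prefix, index in (("logic", logic), ("script", script), ("cluster", cluster)):
        for name, value in asdict(index).items():
            features[f"{prefix}.{name}"] = value
    return features


# Made-up values for one job.
job_features = flatten(
    BusinessLogicIndex(job_group_count=2, group_processing_order=1),
    ScriptIndex(select_count=5, join_count=3),
    ClusterIndex(node_count=8, bandwidth_gbps=10.0),
)
```

Prefixing each variable with its dimension keeps the names unique and makes it easy to see, in a later contribution ranking, which dimension a high-ranking variable came from.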
According to the embodiment of the disclosure, the performance indexes can be obtained in a service development stage and also can be obtained in a test stage, so that the embodiment of the disclosure can be applied to performance early warning in a development stage, and can discover possible performance risks of a service system as soon as possible.
In operation S202, performance indexes of a plurality of dimensions of a target job are input into a pre-constructed model to output a performance index corresponding to the target job and a contribution ranking of a plurality of feature variables through the model.
According to the embodiment of the disclosure, when the running time of a job exceeds a certain threshold, the job may have a performance problem, and thus, the performance index may be the running time.
According to an embodiment of the present disclosure, the pre-constructed model may be a model trained using historical operation data of already-operated jobs. The input of the model can be performance indicators of each dimension of the already running job, and the output can be the running time length of the job.
According to the embodiment of the disclosure, the performance index of each dimension may include a plurality of characteristic variables. When the plurality of characteristic variables of a job are used as input, different weights may be set for different characteristic variables, and the weight of each characteristic variable may be used to characterize the contribution degree of that characteristic variable. Therefore, the contribution degree of each feature variable, or the contribution degree ranking of a plurality of feature variables, can be output by using the trained model.
According to the embodiment of the disclosure, for a job in a currently newly developed business system or a job needing to be tested, feature variables included in performance indexes of multiple dimensions of the job can be input into a trained model, so that the model outputs the running time of the job and the contribution ranking of the feature variables of the job.
In operation S203, the target job is adjusted according to the performance index corresponding to the target job and the contribution ranking of the plurality of characteristic variables to optimize the business system.
Fig. 3 schematically illustrates a flowchart of a method of adjusting a target job according to a performance index corresponding to the target job and a contribution ranking of a plurality of feature variables according to an embodiment of the present disclosure.
As shown in fig. 3, operation S203 includes operations S301 to S302.
In operation S301, it is determined whether an operation time period corresponding to the target job is greater than a preset threshold.
In operation S302, in a case that it is determined that the running time corresponding to the target job is greater than the preset threshold, the target job is adjusted according to the contribution ranking of the plurality of feature variables corresponding to the target job, so as to optimize the business system.
According to the embodiment of the disclosure, the running duration threshold of a job can be determined according to actual production conditions. For example, the running duration threshold for the job of counting personal customer assets may be set to 2.5 hours; if the running duration output by the model is 3 hours, the job may have a performance problem. Developers can then address the job as early as possible, thereby avoiding the business risk that would arise after the job is put into production and improving the reliability of the system.
According to the embodiment of the disclosure, the job can be adjusted according to the contribution degree ranking of its feature variables. For example, if the number of association (JOIN) operations among the code-related indexes ranks first in the contribution degree ranking, the developer can be advised to pay attention to the number of association operations in the script code, so that the developer can modify the script code in a more targeted way and the optimization efficiency of the service system is improved.
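As a minimal, non-limiting sketch, operations S301 and S302 can be illustrated as follows. The function name `review_job`, the parameter `top_n`, and the returned dictionary layout are assumptions made for illustration only; the 2.5-hour threshold and 3-hour prediction are the example values given above.

```python
def review_job(predicted_hours, threshold_hours, ranking, top_n=3):
    """Apply operations S301 and S302 to one job.

    ranking lists feature variable names, highest contribution first.
    """
    # S301: compare the running time output by the model with the threshold.
    if predicted_hours <= threshold_hours:
        return {"at_risk": False, "focus": []}
    # S302: point the developer at the highest-ranked feature variables.
    return {"at_risk": True, "focus": list(ranking[:top_n])}


# Hypothetical ranking for illustration; a real ranking comes from the model.
result = review_job(3.0, 2.5, ["JOIN", "SELECT", "SOURCE_TABLE_SIZE", "WHERE"])
print(result)
```

In this sketch, a job within the threshold yields an empty `focus` list, so no modification is suggested.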
According to the embodiment of the disclosure, performance indexes of multiple dimensions of a target job are obtained according to the service logic of the target job in a service system, the script code for implementing the service logic, and the cluster information that needs to be interacted with in the process of implementing the service logic. The multiple feature variables in the performance index of each dimension are input into a model, the model outputs a performance index corresponding to the target job and a contribution ranking of the feature variables, and the job is adjusted according to the performance index and the contribution ranking of the feature variables, so that the service system is optimized. Because the indexes can be obtained in the development stage and the performance index of the job is output by the model, performance early warning for the job can be realized in the development stage, which reduces the risk of putting the business into production. In addition, modification suggestions can be given according to the contribution ranking of the feature variables, so that developers can make targeted modifications and the optimization efficiency of the service system is improved.
Fig. 4 schematically shows a flowchart of a method of obtaining performance indicators for multiple dimensions of a target job according to an embodiment of the present disclosure.
As shown in fig. 4, operation S201 includes operations S401 to S403.
In operation S401, a performance index of a first dimension of a target job is obtained according to a service logic of the target job.
The performance index of the first dimension comprises the logic layer to which the target job belongs, the algorithm type used by the target job to load data, the number of jobs running concurrently with the target job, the number of the target job group to which the target job belongs, and the processing order between the target job group and other job groups.
According to the embodiment of the disclosure, the performance index of the first dimension of the target job can be obtained from the logic layer where the job is located. A big data ETL scenario may include multiple logical layers, for example, a file import layer, a data extraction layer, a record summary layer, a file export layer, a mart job layer, and the like. The file import layer may implement loading and importing of a data table. The data extraction layer can split the data table according to protocol subjects, deposit subjects and the like to extract the data of each subject. The record summary layer can collect information related to the business logic to form a data table related to the job's business. The file export layer may export the data table, download it to the user device, and so on. The mart job layer can comprise at least two of the file import layer, the data extraction layer, the record summary layer and the file export layer according to the service logic.
According to the disclosed embodiments, jobs at different logical layers may carry different performance risks. For example, large files may cause performance problems at the file import layer, and complex logic may cause performance problems at the mart job layer. The logical layer where the job is located may be used as a feature variable in the first-dimension performance index, and the feature variable may be denoted as ETL_LAYER.
According to an embodiment of the present disclosure, the algorithm types for job data loading may include an append algorithm, an overwrite algorithm, an update algorithm, and a history zipper algorithm. The append algorithm may append data to a data table, the overwrite algorithm may overwrite an original data table with a new data table, the update algorithm may update the data table, and the history zipper algorithm may store history state information of the data table, for example, the history of deposit changes. Different loading algorithms may also carry performance risks. For example, the append algorithm is often used to store detail-type business data, where very large tables are likely to appear, which is a potential performance hazard. The algorithm type of the job's data loading is used as a characteristic variable of the first-dimension performance index and can be denoted as ALGORITHM.
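The four loading algorithms can be illustrated with a simplified in-memory sketch, in which a data table is a list of dictionaries. The function names, the `start_date`/`end_date` columns, and the `OPEN_END` sentinel are assumptions made for illustration; the disclosure does not specify how the algorithms are implemented. The sketch also shows why the append algorithm tends to produce large tables: it never removes or replaces rows.

```python
OPEN_END = "9999-12-31"  # assumed sentinel marking the currently valid version


def load_append(target, incoming):
    """Append algorithm: keep every existing row and add the new rows."""
    return target + incoming


def load_overwrite(target, incoming):
    """Overwrite algorithm: the new table replaces the original table."""
    return list(incoming)


def load_update(target, incoming, key):
    """Update algorithm: replace rows with a matching key, insert the rest."""
    merged = {row[key]: row for row in target}
    merged.update({row[key]: row for row in incoming})
    return list(merged.values())


def load_history_zipper(target, incoming, key, load_date):
    """History zipper algorithm: close the old version of a changed row and
    open a new version, so that every historical state stays queryable."""
    result = [dict(row) for row in target]  # copy so the input is not mutated
    open_rows = {r[key]: r for r in result if r["end_date"] == OPEN_END}
    for row in incoming:
        current = open_rows.get(row[key])
        if current is not None:
            if all(current.get(k) == v for k, v in row.items()):
                continue  # unchanged, keep the open version as it is
            current["end_date"] = load_date  # close the previous version
        result.append({**row, "start_date": load_date, "end_date": OPEN_END})
    return result
```

For a deposit balance that changes from 100 to 150, the zipper variant keeps both versions, each with its own validity interval, which matches the "history of deposit changes" example above.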
According to the embodiment of the disclosure, when the business system architecture is designed, the number of jobs running concurrently can affect the performance of each job. The number of jobs that run concurrently with the target job is taken as a characteristic variable of the first-dimension performance index and may be denoted as CTRL_VALUE.
According to the embodiment of the disclosure, the service system may include a plurality of job groups, a different number may be assigned to each job group, and the job group numbers may be used to represent a dependency relationship or a processing order between different job groups. The number of the target job group to which the target job belongs is taken as a characteristic variable of the first-dimension performance index and may be denoted as JOBSSIONID.
According to the embodiment of the present disclosure, the characteristic variables included in the performance index of the first dimension of the target job may be represented by table 1 below.
TABLE 1

| ETL_LAYER | ALGORITHM | CTRL_VALUE | JOBSSIONID |
| --- | --- | --- | --- |
| File import layer | Append algorithm | 1 | 222 |
| Data extraction layer | Overwrite algorithm | 2 | 333 |
| Record summary layer | Update algorithm | 3 | 111 |
| Mart job layer | History zipper algorithm | 4 | 444 |
| File export layer | | 5 | |
In operation S402, a performance index of a second dimension of the target job is acquired according to the script code for implementing the business logic.
The performance index of the second dimension comprises the number of times of occurrence of keywords in the script codes and the number of generated temporary tables, and the keywords comprise scripts for realizing selection operation, scripts for representing selection conditions, scripts for realizing grouping operation, scripts for realizing statistical information collection operation and scripts for realizing data table association operation.
According to the embodiment of the disclosure, the script for implementing the selection operation may be, for example, SELECT, and the number of SELECT statements in the code may affect the job performance. The script for characterizing the selection conditions may be, for example, WHERE, and the number of WHERE clauses may affect task execution in the cluster. The script for implementing the grouping operation may be, for example, GROUP BY, which performs grouped statistics and may also affect task execution in the cluster. The script for implementing the statistical information collection operation may be, for example, ANALYZE, which collects statistics; a good habit of running ANALYZE can reduce the performance risk. The script for implementing the data table association operation may be, for example, JOIN, and the number of JOINs may affect the number of generated temporary tables TMP_TABLE. For example, each JOIN operation may generate a temporary table, which may reduce the performance risk in the case of complex logic but may increase the performance risk in the case of a large data volume.
According to an embodiment of the present disclosure, the performance index of the second dimension may include: the number of SELECT occurrences, the number of WHERE occurrences, the number of GROUP BY occurrences, the number of ANALYZE occurrences, the number of JOIN occurrences, and the number of temporary tables (TMP_TABLE).
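One possible way to collect the second-dimension characteristic variables is to count keyword occurrences in the script text, sketched below. The function name `count_keywords` is hypothetical, and two simplifications are assumed: temporary tables are counted by matching `CREATE TEMP TABLE` or `CREATE TEMPORARY TABLE` statements, and comments and string literals are not stripped before counting.

```python
import re

# One case-insensitive pattern per keyword named in the second dimension.
KEYWORD_PATTERNS = {
    "SELECT": r"\bSELECT\b",
    "WHERE": r"\bWHERE\b",
    "GROUP BY": r"\bGROUP\s+BY\b",
    "ANALYZE": r"\bANALYZE\b",
    "JOIN": r"\bJOIN\b",
}
# Assumption: a temporary table is created by an explicit CREATE statement.
TMP_TABLE_PATTERN = r"\bCREATE\s+(?:TEMP|TEMPORARY)\s+TABLE\b"


def count_keywords(script):
    """Return the second-dimension feature variables for one script."""
    counts = {
        name: len(re.findall(pattern, script, flags=re.IGNORECASE))
        for name, pattern in KEYWORD_PATTERNS.items()
    }
    counts["TMP_TABLE"] = len(
        re.findall(TMP_TABLE_PATTERN, script, flags=re.IGNORECASE)
    )
    return counts


# Illustrative script; table and column names are made up.
example_script = """
CREATE TEMP TABLE tmp_assets AS
SELECT c.cust_id, SUM(d.balance) AS total
FROM customer c
JOIN deposit d ON c.cust_id = d.cust_id
WHERE d.subject = 'deposit'
GROUP BY c.cust_id;
ANALYZE tmp_assets;
select * from tmp_assets where total > 0;
"""
print(count_keywords(example_script))
```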
In operation S403, a performance index of a third dimension of the target job is obtained according to the cluster information that needs to be interacted in the process of implementing the service logic.
The performance index of the third dimension includes the size of a source table used for providing a data source for the service system, whether a landing table generated by the target job is an index table, whether the landing table is a log table, the number of primary keys in the landing table, whether the landing table includes a trigger, whether the landing table allows migration, the number of nodes in the cluster, the communication bandwidth among the nodes in the cluster, the hash mode of distributed storage of the landing table in the cluster, and the number of copies of the file corresponding to the same landing table stored in the cluster.
According to an embodiment of the present disclosure, the size of the source table may pose a performance risk, and this characteristic variable may be represented by SOURCE_TABLE_SIZE. An index table can reduce the performance risk, and the characteristic variable of whether the landing table is an index table can be represented by Relhasindex. A log table is more likely to cause performance risk than a business table, and the characteristic variable of whether the landing table is a log table can be represented by RelPercisence. The number of primary keys of the landing table can influence the service performance, and the characteristic variable of the number of primary keys of the landing table can be represented by Relhaspkey. The presence of a trigger in the landing table may increase the trigger overhead, and the characteristic variable of whether the landing table includes a trigger may be denoted by relhastiggers. Whether the landing table allows migration may also affect business performance, and the characteristic variable of whether the landing table allows migration may be represented by Relrowmovement. The number of nodes in the cluster may be represented by NODE_NUM. The communication bandwidth between nodes in the cluster may be denoted as BAND_WITH. The hash mode of the distributed storage may be represented by HASH_TYPE. The number of copies of the same file stored in the distributed system can be represented by STORE_BUM.
According to the embodiment of the disclosure, all characteristic variables which may have potential performance hazards can be fully covered by introducing the performance indexes of the three dimensions.
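As an illustrative sketch, the characteristic variables of the three dimensions can be gathered into a single flat record before being passed on for cleaning and encoding. The function name `merge_dimensions` and all example values are hypothetical; only the variable names are taken from the description above.

```python
def merge_dimensions(business, code, cluster):
    """Combine the feature variables of the three dimensions into one record.

    A name that appears in more than one dimension is treated as an error,
    since it would silently overwrite another feature variable.
    """
    record = {}
    for dimension in (business, code, cluster):
        for name, value in dimension.items():
            if name in record:
                raise ValueError(f"duplicate feature variable: {name}")
            record[name] = value
    return record


# Example values are made up for illustration.
business_dim = {"ETL_LAYER": "File import layer", "ALGORITHM": "Append algorithm",
                "CTRL_VALUE": 1, "JOBSSIONID": 222}
code_dim = {"SELECT": 2, "WHERE": 2, "GROUP BY": 1, "ANALYZE": 1,
            "JOIN": 1, "TMP_TABLE": 1}
cluster_dim = {"SOURCE_TABLE_SIZE": 1024, "NODE_NUM": 8, "HASH_TYPE": "hash"}

job_record = merge_dimensions(business_dim, code_dim, cluster_dim)
```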
Fig. 5 schematically illustrates a flowchart of a method of outputting, by a model, a performance index corresponding to a target job and a contribution ranking of a plurality of feature variables, according to an embodiment of the present disclosure.
As shown in fig. 5, operation S202 includes operations S501 to S503.
In operation S501, data cleaning is performed on a plurality of characteristic variables included in the performance index of each dimension, so as to obtain normalized characteristic variables.
In operation S502, the normalized feature variables are encoded to obtain encoded feature variables.
According to the embodiment of the disclosure, data cleaning can be completed by performing null-value filling, data format conversion, outlier removal, normalization and the like on the data. Continuous characteristic variables can be converted into discrete characteristic variables by dividing their values into different intervals through a binning algorithm, and each characteristic variable can then be one-hot encoded, so that the model input data is obtained.
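The cleaning and encoding steps of operations S501 and S502 can be sketched in plain Python as follows. The function names, the fixed bin edges, and the default fill value are assumptions made for illustration; the disclosure does not specify which binning algorithm or fill strategy is used, and outlier removal and normalization are omitted here for brevity.

```python
def fill_missing(values, default):
    """Null-value filling: replace None with a default value."""
    return [default if v is None else v for v in values]


def bin_index(value, edges):
    """Binning: return the index of the interval that contains value.

    edges must be sorted ascending; len(edges) edges define len(edges) + 1
    intervals, and a value equal to an edge falls into the upper interval.
    """
    return sum(value >= edge for edge in edges)


def one_hot(index, size):
    """One-hot encoding: a vector of zeros with a single 1 at position index."""
    return [1 if i == index else 0 for i in range(size)]


def encode_continuous(values, edges, default=0):
    """Fill nulls, bin each value, then one-hot encode the bin index."""
    cleaned = fill_missing(values, default)
    return [one_hot(bin_index(v, edges), len(edges) + 1) for v in cleaned]


def encode_category(value, categories):
    """One-hot encode a discrete variable such as ETL_LAYER or ALGORITHM."""
    return one_hot(categories.index(value), len(categories))


# JOIN counts of four hypothetical jobs, one of them missing.
encoded = encode_continuous([0, None, 3, 12], edges=[1, 5, 10])
```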
In operation S503, the encoded feature variables are input into the model to output a performance index corresponding to the target job and a contribution degree ranking of the plurality of feature variables through the model.
According to the embodiment of the disclosure, the model input data can be input into the pre-trained model, and the model can output the running time corresponding to the job and the contribution degree ranking of the plurality of characteristic variables of the job.
FIG. 6 schematically illustrates a flow chart of a method of obtaining the model by pre-training according to an embodiment of the disclosure.
As shown in fig. 6, operations S601 to S603 are included.
In operation S601, a plurality of already running jobs are determined.
In operation S602, for each job that has been run, performance indexes of a plurality of dimensions of the job and a history run time length of the job within a preset time period are acquired.
In operation S603, an initial model is trained according to the performance indexes of the multiple dimensions of the job and the historical operating duration of the job in a preset time period, so as to obtain a final model.
According to the embodiment of the disclosure, because this is a regression scenario, candidate models based on the boosting approach can be selected and compared. For example, AdaBoost, GBDT, and XGBoost models may be trained, and the optimal model, which may be the XGBoost model, may be selected based on the model evaluation indices ACC, Recall, and F1-score.
According to the embodiment of the disclosure, model training can be completed using five-fold cross validation, a training set to validation set ratio of 7:3, and tuning of the corresponding parameters (tree depth, number of leaves, gradient descent rate, and the like).
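The disclosure does not state how the classification-style indices ACC, Recall and F1-score are applied to a predicted running time. One possible reading, assumed here purely for illustration, is that each job is labelled by whether its running time exceeds the threshold, and the candidate models are compared on those labels. The function names and candidate names below are hypothetical.

```python
def classification_metrics(actual_hours, predicted_hours, threshold):
    """Compute ACC, Recall and F1 on 'running time exceeds threshold' labels."""
    actual = [a > threshold for a in actual_hours]
    predicted = [p > threshold for p in predicted_hours]
    pairs = list(zip(actual, predicted))
    tp = sum(a and p for a, p in pairs)
    tn = sum((not a) and (not p) for a, p in pairs)
    fp = sum((not a) and p for a, p in pairs)
    fn = sum(a and (not p) for a, p in pairs)
    acc = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"ACC": acc, "Recall": recall, "F1": f1}


def select_best(candidates, actual_hours, threshold, metric="F1"):
    """Return the name of the candidate with the highest chosen metric."""
    scores = {
        name: classification_metrics(actual_hours, preds, threshold)
        for name, preds in candidates.items()
    }
    best = max(scores, key=lambda name: scores[name][metric])
    return best, scores
```

A call such as `select_best({"model_a": preds_a, "model_b": preds_b}, actual, 2.5)` would return the better-scoring name together with all scores, where the prediction lists come from the trained candidate models.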
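The 7:3 split and the five-fold cross validation mentioned above can be sketched with the standard library alone. The function names, the fixed random seed, and the interleaved fold assignment are assumptions made for illustration; in practice a machine learning library would typically provide equivalent utilities.

```python
import random


def train_validation_split(samples, train_ratio=0.7, seed=0):
    """Shuffle the samples and split them into training and validation sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


def k_fold_indices(n_samples, k=5):
    """Yield (training_indices, held_out_indices) for each of the k folds."""
    # Interleaved assignment: sample i belongs to fold i % k.
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, held_out in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, held_out


train_set, validation_set = train_validation_split(range(10))
fold_list = list(k_fold_indices(10, k=5))
```

Each candidate parameter setting (tree depth, number of leaves, and so on) would be scored on the five held-out folds, and the setting with the best average score retained.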
According to the embodiment of the disclosure, model training can be performed by using historical operation data of an already-operated job. For each running job, the performance index and the historical running time of each dimension of the job can be obtained for model training.
FIG. 7 schematically shows a flow chart of a method of training an initial model according to an embodiment of the disclosure.
As shown in fig. 7, operation S603 includes operations S701 to S702.
In operation S701, an initial value is set for a model parameter associated with each of the characteristic variables, wherein the value of the model parameter is used to characterize the contribution degree of the characteristic variable.
In operation S702, an initial model is trained by using a plurality of characteristic variables as samples and using a historical operating time of each job in a preset time period as a label, so as to obtain a final model.
According to the embodiment of the present disclosure, when a plurality of feature variables of a job are input, an initial value may be set to a model parameter associated with each feature variable, and the value of the model parameter associated with each feature variable may be used to characterize the degree of contribution of the feature variable. Therefore, the contribution degree of each feature variable, or the contribution degree ranking of a plurality of feature variables, can be output by using the trained model.
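Operations S701 and S702 can be illustrated with a deliberately simple stand-in model. The sketch below assumes a linear model trained by gradient descent, with one weight per characteristic variable, and ranks the variables by the absolute value of the learned weight. This is not the boosted-tree model named above; it only shows the idea of a per-variable parameter that starts from an initial value and, after training, characterizes the contribution degree. All names and data are hypothetical.

```python
def train_linear(samples, labels, names, learning_rate=0.1, epochs=2000):
    """Fit running_time = bias + sum(weight[name] * sample[name])."""
    weights = {name: 0.0 for name in names}  # S701: initial value per variable
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):  # S702: samples as input, running time as label
        grad_w = {name: 0.0 for name in names}
        grad_b = 0.0
        for sample, label in zip(samples, labels):
            error = bias + sum(weights[k] * sample[k] for k in names) - label
            for k in names:
                grad_w[k] += error * sample[k]
            grad_b += error
        for k in names:
            weights[k] -= learning_rate * grad_w[k] / n
        bias -= learning_rate * grad_b / n
    return weights, bias


def rank_contributions(weights):
    """Order feature variables by the magnitude of their learned parameter."""
    return sorted(weights, key=lambda name: abs(weights[name]), reverse=True)


# Made-up normalized features; labels follow 0.1 + 3.0 * JOIN + 0.5 * SELECT.
feature_names = ["SELECT", "JOIN"]
history = [{"JOIN": 0.0, "SELECT": 0.0}, {"JOIN": 1.0, "SELECT": 0.0},
           {"JOIN": 0.0, "SELECT": 1.0}, {"JOIN": 1.0, "SELECT": 1.0},
           {"JOIN": 0.5, "SELECT": 0.5}]
run_hours = [0.1 + 3.0 * s["JOIN"] + 0.5 * s["SELECT"] for s in history]

learned, learned_bias = train_linear(history, run_hours, feature_names)
ranking = rank_contributions(learned)
```

Comparing weights in this way is only meaningful when the characteristic variables are on a common scale, which is one reason the normalization step precedes training.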
According to the embodiment of the disclosure, the model can be deployed in a development environment or a test environment, for example, in a development environment operation performance detection platform, and the model can be packaged as an interface for performance detection.
According to the embodiment of the disclosure, after the development of a job is completed, developers and testers can call the interface of the embodiment of the disclosure on the performance detection platform and directly input the job name. The corresponding characteristic variables are obtained by calling a feature engineering script, the performance check is completed after the characteristic variables are input into the model, and the contribution ranking of the characteristic variables is given accordingly. If the performance index exceeds the production threshold, modification of the job is suggested, and the business logic of the relevant dimensions that cause the performance problem can be modified and optimized in a targeted manner according to the feature contribution ranking. Performance risks can thus be eliminated from the initial stage of the job lifecycle.
Fig. 8 schematically shows a block diagram of an optimization apparatus of a business system according to an embodiment of the present disclosure.
As shown in fig. 8, the optimization apparatus 800 of the business system includes a first obtaining module 801, an output module 802, and an optimization module 803.
The first obtaining module 801 is configured to obtain performance indexes of multiple dimensions of a target job according to a service logic of the target job in a service system, a script code for implementing the service logic, and cluster information that needs to be interacted in a process of implementing the service logic, where the performance index of each dimension includes multiple feature variables.
The output module 802 is configured to input the performance indexes of the multiple dimensions of the target job into a pre-constructed model, so as to output a performance index corresponding to the target job and a contribution ranking of the multiple characteristic variables through the model.
The optimizing module 803 is configured to adjust the target job according to the performance index corresponding to the target job and the contribution ranking of the plurality of characteristic variables, so as to optimize the business system.
According to the embodiment of the present disclosure, the first obtaining module 801 includes a first obtaining unit, a second obtaining unit, and a third obtaining unit.
The first obtaining unit is used for obtaining a performance index of a first dimension of the target job according to the service logic of the target job, wherein the performance index of the first dimension comprises the logic layer to which the target job belongs, the algorithm type used by the target job to load data, the number of jobs running concurrently with the target job, and the number of the target job group to which the target job belongs.
The second obtaining unit is used for obtaining a performance index of a second dimension of the target job according to the script code for implementing the business logic, wherein the performance index of the second dimension comprises the number of occurrences of keywords in the script code and the number of generated temporary tables, and the keywords comprise a script for implementing a selection operation, a script for characterizing selection conditions, a script for implementing a grouping operation, a script for implementing a statistical information collection operation, and a script for implementing a data table association operation.
The third obtaining unit is configured to obtain a performance index of a third dimension of the target job according to cluster information that needs to be interacted with in the process of implementing the service logic, where the performance index of the third dimension includes the size of a source table used for providing a data source for the service system, whether a landing table generated by the target job is an index table, whether the landing table is a log table, the number of primary keys in the landing table, whether the landing table includes a trigger, whether the landing table allows migration, the number of nodes in the cluster, the communication bandwidth between the nodes in the cluster, the hash mode in which the landing table is stored in the cluster in a distributed manner, and the number of copies of the file corresponding to the same landing table stored in the cluster.
According to an embodiment of the present disclosure, the output module 802 includes a first processing unit, a second processing unit, and an output unit.
The first processing unit is used for carrying out data cleaning on a plurality of characteristic variables included in the performance index of each dimension to obtain normalized characteristic variables.
The second processing unit is used for coding the normalized characteristic variables to obtain coded characteristic variables.
The output unit is used for inputting the coded characteristic variables into the model so as to output, through the model, the performance index corresponding to the target job and the contribution degree ranking of the characteristic variables.
According to an embodiment of the present disclosure, the performance index includes an operation duration.
According to an embodiment of the present disclosure, the optimization module 803 includes a determination unit and an optimization unit.
The determining unit is used for determining whether the running time corresponding to the target job is greater than a preset threshold.
The optimization unit is used for adjusting the target job according to the contribution degree ranking of the characteristic variables corresponding to the target job when it is determined that the running time corresponding to the target job is greater than the preset threshold, so that the business system is optimized.
According to the embodiment of the present disclosure, the optimization apparatus 800 of the business system further includes a determining module, a second obtaining module, and a training module.
The determination module is used for determining a plurality of running jobs.
The second acquisition module is used for acquiring performance indexes of multiple dimensions of the job and historical running time of the job in a preset time period aiming at each running job.
The training module is used for training the initial model according to the performance indexes of multiple dimensions of the operation and the historical running time of the operation in a preset time period so as to obtain a final model.
According to an embodiment of the present disclosure, a training module includes a setup unit and a training unit.
The setting unit is used for setting an initial value for the model parameter associated with each characteristic variable, wherein the value of the model parameter is used for representing the contribution degree of the characteristic variable.
The training unit is used for training an initial model by taking a plurality of characteristic variables as samples and the historical running time of each job in a preset time period as a label to obtain a final model.
Any number of modules, sub-modules, units, sub-units, or at least part of the functionality of any number thereof according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in any other reasonable manner of hardware or firmware by integrating or packaging a circuit, or in any one of or a suitable combination of software, hardware, and firmware implementations. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the disclosure may be at least partially implemented as a computer program module, which when executed may perform the corresponding functions.
For example, any plurality of the first obtaining module 801, the output module 802 and the optimizing module 803 may be combined and implemented in one module/unit/sub-unit, or any one of the modules/units/sub-units may be split into a plurality of modules/units/sub-units. Alternatively, at least part of the functionality of one or more of these modules/units/sub-units may be combined with at least part of the functionality of other modules/units/sub-units and implemented in one module/unit/sub-unit. According to an embodiment of the present disclosure, at least one of the first obtaining module 801, the output module 802 and the optimizing module 803 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware by any other reasonable manner of integrating or packaging a circuit, or may be implemented in any one of or a suitable combination of software, hardware and firmware. Alternatively, at least one of the first obtaining module 801, the output module 802 and the optimization module 803 may be at least partially implemented as a computer program module, which when executed, may perform a corresponding function.
It should be noted that, the optimization device portion of the service system in the embodiment of the present disclosure corresponds to the optimization method portion of the service system in the embodiment of the present disclosure, and the description of the optimization device portion of the service system specifically refers to the optimization method portion of the service system, which is not described herein again.
FIG. 9 schematically shows a block diagram of a computer system suitable for implementing the above described method according to an embodiment of the present disclosure. The computer system illustrated in FIG. 9 is only one example and should not impose any limitations on the scope of use or functionality of embodiments of the disclosure.
As shown in fig. 9, a computer system 900 according to an embodiment of the present disclosure includes a processor 901 which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)902 or a program loaded from a storage section 908 into a Random Access Memory (RAM) 903. Processor 901 may comprise, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 901 may also include on-board memory for caching purposes. The processor 901 may comprise a single processing unit or a plurality of processing units for performing the different actions of the method flows according to embodiments of the present disclosure.
In the RAM 903, various programs and data necessary for the operation of the system 900 are stored. The processor 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904. The processor 901 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 902 and/or the RAM 903. Note that the programs may also be stored in one or more memories other than the ROM 902 and the RAM 903. The processor 901 may also perform various operations of the method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, system 900 may also include an input/output (I/O) interface 905, which is also connected to bus 904. The system 900 may also include one or more of the following components connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output section 907 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. A drive 910 is also connected to the I/O interface 905 as necessary. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 910 as necessary, so that a computer program read therefrom is installed into the storage section 908 as necessary.
According to embodiments of the present disclosure, the method flows may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable storage medium, the computer program containing program code for performing the method illustrated by the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 909, and/or installed from the removable medium 911. The computer program, when executed by the processor 901, performs the above-described functions defined in the system of the embodiments of the present disclosure. According to embodiments of the present disclosure, the systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to an embodiment of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium. Examples may include, but are not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 902 and/or the RAM 903 described above and/or one or more memories other than the ROM 902 and the RAM 903.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or by combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or incorporated in various ways, even if such combinations or incorporations are not expressly recited in the present disclosure. In particular, such combinations and/or incorporations may be made without departing from the spirit or teaching of the present disclosure. All such combinations and/or incorporations fall within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (10)

1. A method for optimizing a business system, comprising:
acquiring performance indicators of multiple dimensions of a target job according to business logic of the target job in the business system, script code for implementing the business logic, and cluster information that needs to be interacted with in the process of implementing the business logic, wherein the performance indicator of each dimension comprises a plurality of feature variables;
inputting the performance indicators of the multiple dimensions of the target job into a pre-constructed model, so as to output, through the model, a performance index corresponding to the target job and a contribution ranking of the plurality of feature variables; and
adjusting the target job according to the performance index corresponding to the target job and the contribution ranking of the plurality of feature variables, so as to optimize the business system.
2. The method of claim 1, wherein acquiring the performance indicators of multiple dimensions of the target job according to the business logic of the target job in the business system, the script code for implementing the business logic, and the cluster information that needs to be interacted with in the process of implementing the business logic comprises:
acquiring a first-dimension performance indicator of the target job according to the business logic of the target job, wherein the first-dimension performance indicator comprises information on the logic layer to which the target job belongs;
acquiring a second-dimension performance indicator of the target job according to the script code for implementing the business logic, wherein the second-dimension performance indicator comprises the number of occurrences of keywords in the script code and the number of temporary tables generated; and
acquiring a third-dimension performance indicator of the target job according to the cluster information that needs to be interacted with in the process of implementing the business logic.
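As an illustration of the second-dimension indicator recited in claim 2, the following Python sketch counts keyword occurrences and temporary-table creations in a job script. The keyword list `KEYWORDS`, the `CREATE TEMPORARY TABLE` pattern, the SQL-like script language, and the function name `script_features` are all assumptions made for this example; the claim does not specify which keywords are counted or how temporary tables are detected.

```python
import re

# Hypothetical keyword list; the claim does not say which keywords are counted.
KEYWORDS = ("JOIN", "GROUP BY", "ORDER BY", "DISTINCT", "UNION")

# Assumed pattern for temporary-table creation in a SQL-like script.
TEMP_TABLE_RE = re.compile(
    r"\bCREATE\s+(?:TEMPORARY|TEMP)\s+TABLE\b", re.IGNORECASE
)


def script_features(script: str) -> dict:
    """Count keyword occurrences and temporary tables in a job script."""
    features = {}
    for kw in KEYWORDS:
        # Allow any run of whitespace between the words of a multi-word keyword.
        pattern = r"\b" + r"\s+".join(map(re.escape, kw.split())) + r"\b"
        name = "kw_" + kw.lower().replace(" ", "_")
        features[name] = len(re.findall(pattern, script, re.IGNORECASE))
    features["temp_tables"] = len(TEMP_TABLE_RE.findall(script))
    return features


# Made-up script used only to exercise the function.
example = """
CREATE TEMPORARY TABLE t1 AS SELECT DISTINCT a FROM x JOIN y ON x.id = y.id;
CREATE TEMP TABLE t2 AS SELECT a, COUNT(*) FROM t1 GROUP  BY a;
SELECT * FROM t2 LEFT JOIN t1 ON t2.a = t1.a ORDER BY a;
"""
print(script_features(example))
```

A plain regular-expression count is used here for brevity; it would also match keywords inside comments or string literals, so a real extractor might need a proper parser.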
3. The method of claim 1 or 2, wherein inputting the performance indicators of the multiple dimensions of the target job into the pre-constructed model, so as to output, through the model, the performance index corresponding to the target job and the contribution ranking of the plurality of feature variables comprises:
performing data cleaning on the plurality of feature variables included in the performance indicator of each dimension to obtain normalized feature variables;
encoding the normalized feature variables to obtain encoded feature variables; and
inputting the encoded feature variables into the model, so as to output, through the model, the performance index corresponding to the target job and the contribution ranking of the plurality of feature variables.
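The cleaning and encoding steps of claim 3 could look roughly like the Python sketch below. It assumes, purely for illustration, that cleaning means filling missing numeric values with the column median and min-max scaling, and that encoding means one-hot encoding of categorical variables such as the logic layer; the claim itself does not name specific techniques. The function names and the layer labels are hypothetical.

```python
from statistics import median


def clean_and_normalize(values):
    """Fill missing entries with the median, then min-max scale to [0, 1]."""
    if not values:
        return []
    present = [v for v in values if v is not None]
    fill = float(median(present)) if present else 0.0
    filled = [fill if v is None else float(v) for v in values]
    lo, hi = min(filled), max(filled)
    if hi == lo:
        return [0.0 for _ in filled]  # a constant column carries no signal
    return [(v - lo) / (hi - lo) for v in filled]


def one_hot(categories):
    """Encode a categorical column as one-hot vectors in sorted label order."""
    labels = sorted(set(categories))
    index = {label: i for i, label in enumerate(labels)}
    rows = [
        [1.0 if index[c] == i else 0.0 for i in range(len(labels))]
        for c in categories
    ]
    return labels, rows


# Made-up columns: temporary-table counts with one missing value,
# and hypothetical logic-layer names.
temp_tables = [0, 4, None, 2]
layers = ["ods", "dw", "dm", "dw"]

scaled = clean_and_normalize(temp_tables)
layer_labels, layer_rows = one_hot(layers)
print(scaled)
print(layer_labels, layer_rows)
```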
4. The method of claim 1 or 2, wherein the performance index comprises a running time; and
adjusting the target job according to the performance index corresponding to the target job and the contribution ranking of the plurality of feature variables, so as to optimize the business system, comprises:
determining whether the running time corresponding to the target job is greater than a preset threshold; and
in a case where the running time corresponding to the target job is determined to be greater than the preset threshold, adjusting the target job according to the contribution ranking of the plurality of feature variables corresponding to the target job, so as to optimize the business system.
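The threshold check of claim 4 can be sketched as follows. The function name `tuning_candidates`, the `top_k` parameter, and the representation of the ranking as a list of `(name, score)` pairs sorted from highest to lowest contribution are assumptions for this example. The claim does not state how the job is adjusted, so the sketch only returns the feature variables that would be reviewed first.

```python
def tuning_candidates(predicted_runtime_s, threshold_s, contribution_ranking, top_k=3):
    """Return the top-k feature variables to review if the job is predicted to be slow.

    contribution_ranking is assumed to be sorted from highest to lowest contribution.
    """
    if predicted_runtime_s <= threshold_s:
        return []  # within the threshold: leave the job unchanged
    return [name for name, _ in contribution_ranking[:top_k]]


# Made-up ranking and run times, for illustration only.
ranking = [("join_count", 0.52), ("temp_tables", 0.31), ("cluster_load", 0.17)]

slow = tuning_candidates(5400, 3600, ranking, top_k=2)
fast = tuning_candidates(1200, 3600, ranking)
print(slow, fast)
```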
5. The method of claim 1 or 2, further comprising:
pre-training to obtain the model, wherein the training process comprises:
determining a plurality of jobs that have already been run;
for each such job, acquiring performance indicators of multiple dimensions of the job and a historical running time of the job within a preset time period; and
training an initial model according to the performance indicators of the multiple dimensions of the job and the historical running time of the job within the preset time period to obtain the model.
6. The method of claim 5, wherein training the initial model according to the performance indicators of the multiple dimensions of the job and the historical running time of the job within the preset time period comprises:
setting an initial value for a model parameter associated with each of the feature variables, wherein the value of the model parameter characterizes the contribution of the feature variable; and
training the initial model by taking the feature variables as samples and the historical running time of each job within the preset time period as a label, to obtain the model.
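A minimal sketch of the training step in claim 6 is given below, assuming a linear model fitted by batch gradient descent. This is only one possible model type; the claim does not specify one. Each feature variable gets one parameter initialized to zero, the historical running time serves as the label, and the absolute value of each trained parameter is used as a stand-in for the contribution of its feature variable. The feature names, the learning rate, and the training data are synthetic and hypothetical.

```python
def train_linear(samples, labels, lr=0.1, epochs=2000):
    """Fit runtime ~ w.x + b by batch gradient descent on mean squared error."""
    n, d = len(samples), len(samples[0])
    w = [0.0] * d  # initial value of the parameter tied to each feature variable
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for j in range(d):
                grad_w[j] += 2 * err * x[j] / n
            grad_b += 2 * err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rank_contributions(names, weights):
    """Rank feature variables by absolute trained weight, largest first."""
    scored = zip(names, (abs(wt) for wt in weights))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Synthetic data: features already scaled to [0, 1]; labels follow a known
# linear rule so that the recovered ranking can be checked.
names = ["join_count", "temp_tables", "cluster_load"]
samples = [[a, b, c] for a in (0.0, 1.0) for b in (0.0, 1.0) for c in (0.0, 1.0)]
labels = [10 + 60 * x[0] + 5 * x[1] + 20 * x[2] for x in samples]

weights, bias = train_linear(samples, labels)
ranking = rank_contributions(names, weights)
print([name for name, _ in ranking])
```

Using weight magnitude as contribution is only meaningful when the feature variables share a common scale, which is one reason normalization would precede training.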
7. An apparatus for optimizing a business system, comprising:
a first acquisition module, configured to acquire performance indicators of multiple dimensions of a target job according to business logic of the target job in the business system, script code for implementing the business logic, and cluster information that needs to be interacted with in the process of implementing the business logic, wherein the performance indicator of each dimension comprises a plurality of feature variables;
an output module, configured to input the performance indicators of the multiple dimensions of the target job into a pre-constructed model, so as to output, through the model, a performance index corresponding to the target job and a contribution ranking of the plurality of feature variables; and
an optimization module, configured to adjust the target job according to the performance index corresponding to the target job and the contribution ranking of the plurality of feature variables, so as to optimize the business system.
8. The apparatus of claim 7, wherein the first acquisition module comprises:
a first acquisition unit, configured to acquire a first-dimension performance indicator of the target job according to the business logic of the target job, wherein the first-dimension performance indicator comprises information on the logic layer to which the target job belongs;
a second acquisition unit, configured to acquire a second-dimension performance indicator of the target job according to the script code for implementing the business logic, wherein the second-dimension performance indicator comprises the number of occurrences of keywords in the script code and the number of temporary tables generated; and
a third acquisition unit, configured to acquire a third-dimension performance indicator of the target job according to the cluster information that needs to be interacted with in the process of implementing the business logic.
9. A computer system, comprising:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-6.
10. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to carry out the method of any one of claims 1 to 6.
CN202010689223.9A 2020-07-16 2020-07-16 Optimization method and device for business system, computer system and storage medium Active CN111897707B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010689223.9A CN111897707B (en) 2020-07-16 2020-07-16 Optimization method and device for business system, computer system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010689223.9A CN111897707B (en) 2020-07-16 2020-07-16 Optimization method and device for business system, computer system and storage medium

Publications (2)

Publication Number Publication Date
CN111897707A (en) 2020-11-06
CN111897707B (en) 2024-01-05

Family

ID=73191025

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010689223.9A Active CN111897707B (en) 2020-07-16 2020-07-16 Optimization method and device for business system, computer system and storage medium

Country Status (1)

Country Link
CN (1) CN111897707B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722198A (en) * 2021-09-02 2021-11-30 中国建设银行股份有限公司 Script job submission control method and device, storage medium and electronic equipment
CN114546525A (en) * 2022-02-17 2022-05-27 阳光保险集团股份有限公司 System, method, device and storage medium for analyzing data

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080015916A1 (en) * 2002-05-22 2008-01-17 International Business Machines Corporation Using configurable programmatic rules for automatically changing a trust status of candidates contained in a private business registry
CN103973496A (en) * 2014-05-21 2014-08-06 华为技术有限公司 Fault diagnosis method and device
CN110033123A (en) * 2019-03-12 2019-07-19 阿里巴巴集团控股有限公司 Method and apparatus for business assessment
CN110178123A (en) * 2017-07-12 2019-08-27 华为技术有限公司 Performance indicator appraisal procedure and device
KR20190130212A (en) * 2018-04-24 2019-11-22 주식회사 피도텍 Engineering big data-driven design expert system and design method thereof
CN110944048A (en) * 2019-11-29 2020-03-31 腾讯科技(深圳)有限公司 Service logic configuration method and device
CN111401788A (en) * 2020-04-10 2020-07-10 支付宝(杭州)信息技术有限公司 Attribution method and device of service timing sequence index

Also Published As

Publication number Publication date
CN111897707B (en) 2024-01-05

Similar Documents

Publication Publication Date Title
US20220391763A1 (en) Machine learning service
US10698742B2 (en) Operation efficiency management with respect to application compile-time
US8996452B2 (en) Generating a predictive model from multiple data sources
US11182691B1 (en) Category-based sampling of machine learning data
US8806628B2 (en) Tuning of data loss prevention signature effectiveness
Eismann et al. Predicting the costs of serverless workflows
US20150379423A1 (en) Feature processing recipes for machine learning
US10606450B2 (en) Method and system for visual requirements and component reuse driven rapid application composition
CN106980623A (en) A kind of determination method and device of data model
US9703822B2 (en) System for transform generation
US11720825B2 (en) Framework for multi-tenant data science experiments at-scale
US11620683B2 (en) Utilizing machine-learning models to create target audiences with customized auto-tunable reach and accuracy
US20200117530A1 (en) Application performance management system with collective learning
CN114490375B (en) Performance test method, device, equipment and storage medium of application program
CN111897707B (en) Optimization method and device for business system, computer system and storage medium
US11507434B2 (en) Recommendation and deployment engine and method for machine learning based processes in hybrid cloud environments
US20220269505A1 (en) Client-side enrichment and transformation via dynamic logic for analytics
US10025838B2 (en) Extract transform load input suggestion
US20220245492A1 (en) Constructing a statistical model and evaluating model performance
US11074508B2 (en) Constraint tracking and inference generation
CN113344369A (en) Method and device for attributing image data, electronic equipment and storage medium
US10248411B2 (en) Dynamic data ingestion
US20230010147A1 (en) Automated determination of accurate data schema
US20230229735A1 (en) Training and implementing machine-learning models utilizing model container workflows
US20230325871A1 (en) Subgroup analysis in a/b testing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant