CN115495251B - Intelligent control method and system for computing resources in data integration operation - Google Patents


Info

Publication number
CN115495251B
Authority
CN
China
Prior art keywords
node
evaluation
expert
task
level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211440650.9A
Other languages
Chinese (zh)
Other versions
CN115495251A (en
Inventor
曹源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Deepexi Technology Co Ltd
Original Assignee
Beijing Deepexi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Deepexi Technology Co Ltd filed Critical Beijing Deepexi Technology Co Ltd
Priority to CN202211440650.9A priority Critical patent/CN115495251B/en
Publication of CN115495251A publication Critical patent/CN115495251A/en
Application granted granted Critical
Publication of CN115495251B publication Critical patent/CN115495251B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method and a system for intelligently controlling computing resources in data integration operations. The method comprises: while a data integration task is executing, monitoring the system log and the resident memory usage of the DEC; determining a computing resource allocation strategy based on the system log and the resident memory usage; and executing the computing resource allocation strategy. The method and system can effectively reduce abnormal task interruptions caused by out-of-memory (OOM) events in data integration tasks and, by dynamically increasing memory resources, can ensure stable operation of high-priority tasks and reduce the probability that a task interruption affects downstream services. In addition, intelligent allocation within a single server node effectively relieves resource allocation pressure on the whole cluster and reduces enterprise operation and maintenance costs.

Description

Intelligent control method and system for computing resources in data integration operation
Technical Field
The invention relates to the field of computer software, in particular to a method and a system for intelligently controlling computing resources in data integration operation.
Background
At present, in the field of big data, data-driven business scenarios are increasingly common and the computing demands around data are increasingly varied. Across the whole data processing pipeline, data integration (ETL or ELT tasks) is the first step, and a very important one, for any enterprise extracting value from its data. Data integration jobs either extract data in real time to keep real-time services running, or run as offline computing tasks that upstream and downstream tasks depend on, so their stability requirements are high. Within a single compute node, memory is a key factor in whether a process can run stably, and whether memory is allocated appropriately is critical to whether a data integration job runs stably. Existing data integration products, such as DataX and Canal, generally focus only on extracting from data sources and writing to target databases. Their resource control stops at assigning a task to a particular node, which is relatively coarse-grained. With Docker container technology, resource allocation is relatively fixed and inflexible: once it is completed, the resource quota does not change for the entire task run. If the preset resources are too high, resources are wasted; if they are too low, the task often fails because of OOM (out-of-memory) errors, which affects downstream services.
Therefore, a solution is needed.
Disclosure of Invention
One of the objectives of the present invention is to provide an intelligent control method for computing resources in data integration operations. The method can effectively reduce abnormal task interruptions caused by OOM events in data integration tasks and, by dynamically increasing memory resources, can ensure stable operation of high-priority tasks and reduce the probability that a task interruption affects downstream services. In addition, intelligent allocation within a single server node effectively relieves resource allocation pressure on the whole cluster and reduces enterprise operation and maintenance costs.
The embodiment of the invention provides an intelligent control method for computing resources in data integration operation, which comprises the following steps:
when the data integration task is executed, monitoring the system log and the resident memory use condition of the DEC;
determining a computing resource allocation strategy based on the system log and the use condition of the resident memory;
executing the computing resource allocation strategy.
Preferably, the determining a computing resource allocation policy based on the system log and the resident memory usage includes:
determining whether an OOM event occurs based on the system log;
if yes, determining that the computing resource allocation strategy is as follows: if DEC is P1 level, restarting the data integration task and increasing the memory resource allocation of DEC; if DEC is P2 level, restarting the data integration task;
if DEC is P0 level, determining whether RES occupation of DEC is close to cgroup.limits;
if yes, determining that the computing resource allocation strategy is as follows: actively increasing the memory resource allocation of DEC;
if DEC is P2 level or P3 level, determining whether RES occupation of DEC is continuously lower than a preset percentage based on the use condition of the resident memory;
if yes, determining that the computing resource allocation strategy is as follows: the memory resource allocation of DEC is actively reduced.
Preferably, the P0 level is an active guarantee level;
the P1 level is a failure retry guarantee level;
the P2 level is a conventional level;
the P3 level is a low priority level.
Preferably, the method for intelligently controlling computing resources in data integration operation further comprises:
when a user inputs a manual remote computing resource control request, task information of a data integration task is obtained;
performing feature extraction on the task information to obtain a plurality of task features;
constructing a first task description vector based on the plurality of task features;
acquiring a preset computing resource control expert database, wherein the computing resource control expert database comprises: a plurality of groups of expert nodes and second task description vectors which correspond one to one;
acquiring node information of the expert node, wherein the node information comprises: a plurality of customer evaluation records obtained for the manual remote computing resource control performed by the expert within the most recent preset first time;
based on the node information, selecting a better expert node from the expert nodes;
calculating a first vector similarity between the first task description vector and a second task description vector corresponding to any better expert node;
taking the better expert node corresponding to the maximum first vector similarity as a suitable expert node;
continuously delivering the system log and the resident memory usage to the suitable expert node;
acquiring the computing resource control strategy returned by the suitable expert node;
executing the computing resource control strategy.
Preferably, based on the node information, selecting a better expert node from the expert nodes, including:
preprocessing the node information;
extracting the characteristics of the preprocessing result to obtain a plurality of node information characteristics;
and selecting a better expert node from the expert nodes based on the information characteristics of the plurality of nodes.
Preferably, the node information is preprocessed, including:
extracting, from the customer evaluation records, the sight line movement track within a preset second time before and after the customer selects any evaluation option when filling in an evaluation questionnaire;
traversing the evaluation options in reverse order of their order in the evaluation questionnaire;
during each traversal, extracting the content structure of the traversed evaluation option;
acquiring a preset first trajectory description vector corresponding to the content structure;
performing feature extraction on the sight line movement track within the preset second time before and after the customer selects the traversed evaluation option to obtain a plurality of trajectory features;
constructing a second trajectory description vector based on the plurality of trajectory features;
calculating a second vector similarity between the second trajectory description vector and the first trajectory description vector;
if the second vector similarity is less than or equal to a preset vector similarity threshold, rejecting the corresponding customer evaluation record;
finishing preprocessing after all customer evaluation records that need to be rejected from the node information have been rejected;
and/or,
extracting the stay time when the customer selects any evaluation option when filling in the evaluation questionnaire from the customer evaluation record;
traversing the evaluation options in a reverse order according to the option sequence of the evaluation options in the evaluation questionnaire;
during each traversal, acquiring a preset stay time threshold corresponding to the traversed evaluation option;
if the stay time when the customer selects the traversed evaluation option while filling in the evaluation questionnaire is less than or equal to the stay time threshold, rejecting the corresponding customer evaluation record;
and finishing preprocessing after all the client evaluation records needing to be removed in the node information are removed.
Preferably, based on the information characteristics of the plurality of nodes, the method for selecting the better expert node from the expert nodes comprises the following steps:
constructing a first node description vector based on a plurality of node information characteristics;
acquiring a preset node evaluation library, wherein the node evaluation library comprises: a plurality of groups of one-to-one corresponding second node description vectors and first evaluation values;
matching the first node description vector with any second node description vector;
and if a match is found and the first evaluation value corresponding to the matched second node description vector is greater than or equal to a preset first evaluation threshold, taking the corresponding expert node as a better expert node.
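The library lookup described above can be sketched as follows. This is a minimal illustration and not the patented implementation: the source does not say how vectors are "matched", so a Euclidean distance within a small tolerance is assumed here, and the function name `is_better_expert`, the tolerance, and the sample values are all hypothetical.

```python
import math
from typing import Sequence, Tuple

# One library entry: (second node description vector, first evaluation value).
LibraryEntry = Tuple[Sequence[float], float]


def is_better_expert(
    node_vector: Sequence[float],
    library: Sequence[LibraryEntry],
    evaluation_threshold: float,
    match_tolerance: float = 1e-6,  # assumption: "match" means near-identical vectors
) -> bool:
    """Return True if the node vector matches a library entry whose
    evaluation value reaches the preset first evaluation threshold."""
    for library_vector, evaluation_value in library:
        same_length = len(library_vector) == len(node_vector)
        if same_length and math.dist(node_vector, library_vector) <= match_tolerance:
            return evaluation_value >= evaluation_threshold
    return False  # no matching entry in the library


# Hypothetical features: counts of positive, neutral, and negative reviews.
library = [((10.0, 2.0, 1.0), 0.9), ((3.0, 4.0, 8.0), 0.2)]
print(is_better_expert((10.0, 2.0, 1.0), library, evaluation_threshold=0.6))
```

A real node evaluation library would likely need a looser notion of matching, such as a nearest-neighbour lookup; the exact-match tolerance is only the simplest choice that makes the control flow visible.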
The embodiment of the invention provides an intelligent control system for computing resources in data integration operation, which comprises:
the monitoring module is used for monitoring the system log and the resident memory usage of the DEC when the data integration task is executed;
the determining module is used for determining a computing resource allocation strategy based on the system log and the resident memory usage;
and the execution module is used for executing the computing resource allocation strategy.
Preferably, the determining module determines the computing resource allocation policy based on the system log and the usage of the resident memory, and includes:
determining whether an OOM event occurs based on the system log;
if yes, determining that the computing resource allocation strategy is as follows: if DEC is P1 level, restarting the data integration task and increasing the memory resource allocation of DEC; if DEC is P2 level, restarting the data integration task;
if DEC is P0 level, determining whether RES occupation of DEC is close to cgroup.limits;
if yes, determining that the computing resource allocation strategy is as follows: actively increasing the memory resource allocation of DEC;
if DEC is P2 level or P3 level, determining whether RES occupation of DEC is continuously lower than a preset percentage based on the use condition of the resident memory;
if yes, determining that the computing resource allocation strategy is as follows: the memory resource allocation of DEC is actively reduced.
Preferably, the P0 level is an active guarantee level;
the P1 level is a failure retry guarantee level;
the P2 level is a conventional level;
the P3 level is a low priority level.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a diagram illustrating a method for intelligently controlling computing resources in a data integration operation according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating an exemplary application of a method for intelligent control of computing resources in data integration operations according to an embodiment of the present invention;
FIG. 3 is a diagram of an intelligent control system for computing resources in data integration operations according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it should be understood that they are presented herein only to illustrate and explain the present invention and not to limit the present invention.
An embodiment of the present invention provides an intelligent control method for computing resources in data integration operations, as shown in FIG. 1, including:
step 1: when the data integration task is executed, monitoring the system log and the resident memory usage of the DEC;
step 2: determining a computing resource allocation strategy based on the system log and the resident memory usage;
step 3: executing the computing resource allocation strategy.
Determining a computing resource allocation strategy based on the system log and the resident memory usage, comprising:
determining whether an OOM event occurs based on the system log;
if yes, determining that the computing resource allocation strategy is as follows: if DEC is P1 level, restarting the data integration task and increasing the memory resource allocation of DEC; if DEC is P2 level, restarting the data integration task;
if DEC is P0 level, determining whether RES occupation of DEC is close to cgroup.limits;
if yes, determining that the computing resource allocation strategy is as follows: actively increasing the memory resource allocation of DEC;
if DEC is P2 level or P3 level, determining whether RES occupation of DEC is continuously lower than a preset percentage based on the use condition of the resident memory;
if yes, determining that the computing resource allocation strategy is as follows: the memory resource allocation of DEC is actively reduced.
The P0 level is an active guarantee level;
the P1 level is a failure retry guarantee level;
the P2 level is a conventional level;
the P3 level is a low priority level.
The working principle and the beneficial effects of the technical scheme are as follows:
The minimum unit controlled by the control method is the DEC (Deepexi Data extract Container), a program suite independently developed by Deepexi for the data integration field. It supports data integration among multiple heterogeneous data sources and is delivered as a specialized, out-of-the-box container image. A DEC is essentially a set of packages that can be run directly from a configuration file, and at runtime it is a single independent process.
Task queues of four levels are defined, each corresponding to a different resource control policy, to handle data integration tasks in different scenarios. For example, a data integration task used for testing is generally assigned the low-priority level P3. The DEC levels are shown in Table 1 below:
[Table 1: DEC levels (image in the original)]
The resource control of the DEC is based on the cgroup mechanism, with memory as the key resource in the control method. Control is triggered either actively or passively, mainly by combining system log monitoring with RES (resident memory) monitoring of each DEC. The structure and the control strategy are shown in FIG. 2.
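As a rough illustration of RES monitoring against a cgroup limit, the sketch below parses the `VmRSS` line of a Linux `/proc/<pid>/status` file and a cgroup memory limit value, then computes the occupancy ratio. The source does not say how the Controller reads these values, so this approach, the function names, and the sample text are assumptions; on a real host the limit would typically come from the cgroup v2 `memory.max` file or the cgroup v1 `memory.limit_in_bytes` file.

```python
import re
from typing import Optional


def parse_vmrss_bytes(status_text: str) -> Optional[int]:
    """Extract the resident set size (RES) in bytes from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) * 1024 if match else None


def parse_cgroup_limit(limit_text: str) -> Optional[int]:
    """Parse a cgroup memory limit; cgroup v2 writes 'max' when unlimited."""
    value = limit_text.strip()
    return None if value == "max" else int(value)


def res_ratio(rss_bytes: int, limit_bytes: Optional[int]) -> Optional[float]:
    """Fraction of the cgroup memory limit currently occupied by RES."""
    if not limit_bytes:
        return None  # unlimited or unknown limit: no ratio can be computed
    return rss_bytes / limit_bytes


# Sample text standing in for the real files (hypothetical values).
sample_status = "Name:\tdec\nVmPeak:\t  900000 kB\nVmRSS:\t  524288 kB\n"
rss = parse_vmrss_bytes(sample_status)
limit = parse_cgroup_limit("1073741824\n")
ratio = res_ratio(rss, limit)
print(ratio)  # 0.5
```

Working on text rather than opening the files directly keeps the parsing logic testable without a running DEC process.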
DEC-C is the physical implementation of the control method. It is an independently running process responsible for receiving the basic information reported when a DEC starts and for controlling the resources of the actual controlled units associated with that DEC according to the DEC's control level. DEC-C is hereinafter referred to as the Controller.
Computing resource allocation policies are divided into active policies and passive policies. The active policy is based on monitoring the resident memory usage of each DEC and is triggered according to the DEC's level: when the RES occupation of a P0-level DEC is detected to be close to cgroup.limits, an active increase of that DEC's resource allocation is triggered; when the RES occupation of a P2- or P3-level DEC is detected to remain below a preset percentage, an active reduction of that DEC's resource allocation is triggered. The passive policy is based on monitoring the system logs and is determined according to the DEC's level when an OOM (out-of-memory) event is found: at the P1 level, the task is restarted and the DEC's memory resource allocation is increased; at the P2 level, the task is restarted but the memory is not increased; at the P3 level, the task is not restarted. The amount by which a DEC's resource allocation is increased or decreased may be configured in advance by an operator.
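The active and passive policies described above can be summarized as a small decision function. The following is a hedged sketch rather than the Controller's actual code: the names `Level`, `Action`, `passive_policy`, and `active_policy` are hypothetical, the two ratio thresholds are illustrative stand-ins for "close to the limit" and "a preset percentage", and "continuously lower" is interpreted as every sample in a monitoring window falling below the threshold.

```python
from enum import Enum
from typing import Sequence


class Level(Enum):
    P0 = "active guarantee"
    P1 = "failure retry guarantee"
    P2 = "conventional"
    P3 = "low priority"


class Action(Enum):
    NONE = "none"
    INCREASE_MEMORY = "increase_memory"
    DECREASE_MEMORY = "decrease_memory"
    RESTART = "restart"
    RESTART_AND_INCREASE = "restart_and_increase"


NEAR_LIMIT_RATIO = 0.9  # assumption: RES at 90% of the cgroup limit counts as "close"
LOW_USAGE_RATIO = 0.3   # assumption: stands in for the preset percentage


def passive_policy(level: Level) -> Action:
    """Action taken when an OOM event for the DEC is found in the system log."""
    if level is Level.P1:
        return Action.RESTART_AND_INCREASE
    if level is Level.P2:
        return Action.RESTART
    # P3 is not restarted; the text does not describe OOM handling for P0.
    return Action.NONE


def active_policy(level: Level, res_ratios: Sequence[float]) -> Action:
    """Action derived from a window of RES/limit ratios, oldest sample first."""
    if not res_ratios:
        return Action.NONE
    if level is Level.P0 and res_ratios[-1] >= NEAR_LIMIT_RATIO:
        return Action.INCREASE_MEMORY
    if level in (Level.P2, Level.P3) and all(r < LOW_USAGE_RATIO for r in res_ratios):
        return Action.DECREASE_MEMORY
    return Action.NONE


print(passive_policy(Level.P1).value)               # restart_and_increase
print(active_policy(Level.P0, [0.5, 0.95]).value)   # increase_memory
print(active_policy(Level.P3, [0.1, 0.2]).value)    # decrease_memory
```

Keeping the decision separate from execution mirrors the determine-then-execute split of steps 2 and 3; the size of each increase or decrease would come from operator configuration, as the text notes.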
The key parameters of the DEC are shown in Table 2 below:
[Table 2: key parameters of the DEC (image in the original)]
The key parameters of the Controller are shown in the following table:
[Table: key parameters of the Controller (image in the original)]
When a task's computing resources are found to be insufficient or about to be exhausted, they are increased automatically, avoiding failures caused by insufficient allocation. When a task's resource utilization is found to be low and its computing resources over-allocated, the preset resources are reclaimed automatically. When a task is found to have restarted after failing for lack of resources, its resource allocation can be raised automatically as it is restarted. Within a single compute node, this avoids task delays and failures caused by mismatched preset resource allocation, achieves effective resource allocation, and improves the node's computing efficiency. Because each node improves its resource utilization, the total resources required by the whole cluster can be reduced accordingly, saving hardware cost. After initialization, resource allocation within a node is fully automatic and intelligent, which reduces manual operation and maintenance cost. With this control method, data integration tasks can be managed effectively in terms of memory, reducing problems caused by resource allocation mismatches arising from data fluctuations. The method effectively reduces abnormal task interruptions caused by OOM in data integration tasks and, by dynamically increasing memory resources, ensures stable operation of high-priority tasks and reduces the probability that a task interruption affects downstream services. In addition, intelligent allocation within a single server node effectively relieves resource allocation pressure on the whole cluster and reduces enterprise operation and maintenance costs.
In one embodiment, the method for intelligently controlling computing resources in data integration operation further comprises the following steps:
when a user inputs a manual remote computing resource control request, task information of a data integration task is obtained;
performing feature extraction on the task information to obtain a plurality of task features;
constructing a first task description vector based on a plurality of task features;
acquiring a preset computing resource control expert database, wherein the computing resource control expert database comprises: a plurality of groups of expert nodes and second task description vectors which correspond one to one;
acquiring node information of the expert node, wherein the node information comprises: a plurality of customer evaluation records obtained for the manual remote computing resource control performed by the expert within the most recent preset first time;
based on the node information, selecting a better expert node from the expert nodes;
calculating a first vector similarity between the first task description vector and a second task description vector corresponding to any better expert node;
taking the better expert node corresponding to the maximum first vector similarity as a suitable expert node;
continuously delivering the system log and the resident memory usage to the suitable expert node;
acquiring the computing resource control strategy returned by the suitable expert node;
executing the computing resource control strategy.
The working principle and the beneficial effects of the technical scheme are as follows:
Some data integration customers require manual remote computing resource control, which is more user-friendly because the engineer can communicate with the customer in real time during the control process. This raises the problem of how to assign personnel for manual remote computing resource control.
When the user makes the request, task information of the data integration task is acquired; the task information may include, for example, the task type and the data volume. Task features such as the task type and the data volume are extracted from the task information, and a first task description vector is constructed from them. The one-to-one corresponding expert nodes and second task description vectors are as follows: an expert node is a network node that connects to the operation terminal of a back-end engineer who can perform manual remote computing resource control for customers, and the corresponding second task description vector is built from features extracted from the task information of the data integration tasks for which that engineer is skilled at computing resource control. Better expert nodes, that is, those with better evaluations, are screened out based on the node information of the expert nodes. The first vector similarity between the first task description vector and the second task description vector of each better expert node is then calculated. The greater the first vector similarity, the more skilled the corresponding engineer is at controlling computing resources for this data integration task, so the better expert node with the maximum first vector similarity is taken as the suitable expert node. The system log and the resident memory usage are continuously delivered to the suitable expert node so that the engineer can view them on the operation terminal and reply with a computing resource control strategy based on them, which is finally executed.
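The selection of the suitable expert node can be sketched as an argmax over vector similarities. The source does not name the similarity measure, so cosine similarity is assumed here; the function names, the expert identifiers, and the feature values are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_suitable_expert(
    first_task_vector: Sequence[float],
    better_experts: Dict[str, Sequence[float]],  # node id -> second task description vector
) -> Optional[str]:
    """Return the better expert node with the maximum first vector similarity."""
    if not better_experts:
        return None
    return max(
        better_experts,
        key=lambda node: cosine_similarity(first_task_vector, better_experts[node]),
    )


# Hypothetical encoded task features (for example task type and data volume).
task_vector = [1.0, 0.0, 0.5]
experts = {"expert-a": [0.9, 0.1, 0.4], "expert-b": [0.0, 1.0, 0.0]}
print(pick_suitable_expert(task_vector, experts))  # expert-a
```

The dictionary is expected to contain only the nodes that already passed the evaluation-based screening, which keeps the two selection stages independent.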
When assigning manual remote computing resource control, the well-evaluated engineer who is most skilled at controlling computing resources for the customer's data integration task is selected, which improves both the suitability and the efficiency of the assignment.
In one embodiment, the selecting of the preferred expert nodes from the expert nodes based on the node information includes:
preprocessing the node information;
extracting the characteristics of the preprocessing result to obtain a plurality of node information characteristics;
and selecting a better expert node from the expert nodes based on the information characteristics of the plurality of nodes.
The working principle and the beneficial effects of the technical scheme are as follows:
The purpose of the preprocessing is to remove customer evaluations of low authenticity from the node information. Node information features are then extracted from the preprocessing result, for example the numbers of positive, neutral, and negative reviews. A better expert node is selected from the expert nodes based on the plurality of node information features, which makes the selection of engineers more reasonable.
In one embodiment, the node information is preprocessed, including:
extracting, from the customer evaluation records, the sight line movement track within a preset second time before and after the customer selects any evaluation option when filling in an evaluation questionnaire;
traversing the evaluation options in reverse order of their order in the evaluation questionnaire;
during each traversal, extracting the content structure of the traversed evaluation option;
acquiring a preset first trajectory description vector corresponding to the content structure;
performing feature extraction on the sight line movement track within the preset second time before and after the customer selects the traversed evaluation option to obtain a plurality of trajectory features;
constructing a second trajectory description vector based on the plurality of trajectory features;
calculating a second vector similarity between the second trajectory description vector and the first trajectory description vector;
if the second vector similarity is less than or equal to a preset vector similarity threshold, rejecting the corresponding customer evaluation record;
finishing preprocessing after all customer evaluation records that need to be rejected from the node information have been rejected;
and/or,
extracting the stay time when the customer selects any evaluation option when filling in the evaluation questionnaire from the customer evaluation record;
traversing the evaluation options in a reverse order according to the option sequence of the evaluation options in the evaluation questionnaire;
during each traversal, acquiring a preset stay time threshold corresponding to the traversed evaluation option;
if the stay time length when the client selects the traversed evaluation option when filling the evaluation questionnaire is less than or equal to the stay time length threshold, rejecting the corresponding client evaluation record;
and finishing preprocessing after all the client evaluation records needing to be removed in the node information are removed.
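The two rejection rules listed above can be sketched as one filter over evaluation records. This is only an illustration under stated assumptions: the `EvaluationRecord` type and its fields are hypothetical, the second vector similarity for each option is assumed to be precomputed, and both rules are applied together although the text allows either one alone ("and/or").

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class EvaluationRecord:
    record_id: str
    dwell_seconds: Dict[str, float]    # option id -> stay time when selecting it
    gaze_similarity: Dict[str, float]  # option id -> second vector similarity


def preprocess(
    records: Sequence[EvaluationRecord],
    option_order: Sequence[str],
    dwell_thresholds: Dict[str, float],
    similarity_threshold: float,
) -> List[EvaluationRecord]:
    """Keep only the records that pass both checks for every evaluation option."""
    kept = []
    for record in records:
        rejected = False
        # Traverse the options in reverse questionnaire order, as in the text.
        for option in reversed(option_order):
            too_fast = record.dwell_seconds[option] <= dwell_thresholds[option]
            not_read = record.gaze_similarity[option] <= similarity_threshold
            if too_fast or not_read:
                rejected = True
                break
        if not rejected:
            kept.append(record)
    return kept


records = [
    EvaluationRecord("r1", {"q1": 6.0, "q2": 5.0}, {"q1": 0.9, "q2": 0.8}),
    EvaluationRecord("r2", {"q1": 0.5, "q2": 5.0}, {"q1": 0.9, "q2": 0.8}),
    EvaluationRecord("r3", {"q1": 6.0, "q2": 5.0}, {"q1": 0.9, "q2": 0.2}),
]
kept = preprocess(records, ["q1", "q2"], {"q1": 2.0, "q2": 2.0}, 0.5)
print([r.record_id for r in kept])  # ['r1']
```

Record r2 is dropped because the customer spent too little time on q1, and r3 because the gaze trajectory on q2 does not resemble careful reading.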
The working principle and the beneficial effects of the technical scheme are as follows:
Typically, a customer fills in an evaluation questionnaire when giving an evaluation. The questionnaire contains a plurality of evaluation options, for example: "How much did the engineer's remote assistance contribute to the stability of your data integration?" with the choices "1. excellent; 2. average; 3. poor". To remove customer evaluations of low authenticity, each record must be judged by how the customer filled in each evaluation option. There are two ways:
First, the sight-line movement trajectory within the preset second time before and after the customer selects any evaluation option while filling in the evaluation questionnaire is extracted from the customer evaluation record. The preset second time may be, for example, 12 seconds; acquiring a sight-line movement trajectory belongs to the prior art and is not described in detail. The content structure of the evaluation option is extracted, and the preset first trajectory description vector corresponding to that content structure is retrieved. For example, if the content structure runs in a top-to-bottom text direction, the sight line of a customer who reads the content carefully moves from top to bottom with certain stopping points; the first trajectory description vector is the vector constructed from the trajectory features of the sight-line trajectory that such careful reading would produce. Trajectory features are then extracted from the recorded sight-line movement trajectory and used to construct the second trajectory description vector. The second vector similarity between the second trajectory description vector and the first trajectory description vector is calculated: the greater the second vector similarity, the more carefully the customer read the content of the evaluation option and the more authentic the selected evaluation. If the second vector similarity is less than or equal to the preset vector similarity threshold, the customer did not read the content of the evaluation option carefully, and the record should be rejected.
Second, the stay time when the customer selects any evaluation option while filling in the evaluation questionnaire is extracted from the customer evaluation record; the stay time can be determined from how long the page showing the evaluation option is continuously displayed. The preset stay time threshold corresponding to the evaluation option is retrieved; this threshold is the minimum stay time a customer needs to read the content of the evaluation option carefully. If the stay time is less than or equal to the stay time threshold, the customer did not read the content carefully, and the record should be rejected.
The present application introduces these two ways of preprocessing customer evaluations and rejecting evaluations of low authenticity, which indirectly improves the accuracy of engineer selection and also ensures fairness.
In addition, a customer generally fills in the earlier evaluation options carefully but, owing to a lack of patience or similar reasons, may fill in the later ones carelessly. The evaluation options are therefore traversed in reverse order of their option sequence in the evaluation questionnaire, which improves rejection efficiency.
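As a non-limiting illustration, the two preprocessing modes and the reverse-order traversal may be sketched in Python as follows. The data layout (`OptionResponse`, `OptionSpec`, `EvaluationRecord`), the use of cosine similarity as the second vector similarity, and the threshold value 0.8 are assumptions made for this sketch only; the embodiment does not fix any of them.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity (an assumed measure); 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class OptionResponse:
    """What was recorded for one evaluation option (hypothetical layout)."""
    stay_seconds: float
    # Second trajectory description vector built from the recorded sight line.
    trajectory_vector: Optional[List[float]] = None


@dataclass
class OptionSpec:
    """Preset data for one evaluation option (hypothetical layout)."""
    stay_threshold_seconds: float
    # First trajectory description vector for this option's content structure.
    reference_vector: Optional[List[float]] = None


@dataclass
class EvaluationRecord:
    record_id: str
    responses: List[OptionResponse]  # in questionnaire order


def is_credible(record: EvaluationRecord, specs: Sequence[OptionSpec],
                similarity_threshold: float = 0.8) -> bool:
    """Return False as soon as one option looks carelessly answered."""
    # Reverse order: later options are the likelier ones to fail, so a
    # careless record is usually rejected after inspecting few options.
    for response, spec in zip(reversed(record.responses), reversed(specs)):
        if response.stay_seconds <= spec.stay_threshold_seconds:
            return False
        if response.trajectory_vector is not None and spec.reference_vector is not None:
            similarity = cosine_similarity(response.trajectory_vector,
                                           spec.reference_vector)
            if similarity <= similarity_threshold:
                return False
    return True


def preprocess(records: Sequence[EvaluationRecord],
               specs: Sequence[OptionSpec]) -> List[EvaluationRecord]:
    """Keep only the customer evaluation records that pass both checks."""
    return [record for record in records if is_credible(record, specs)]
```

In this sketch both checks are applied together, which corresponds to the "and" branch of the "and/or" above; either check could be dropped to obtain one of the two modes on its own.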
In one embodiment, selecting the better expert node from the expert nodes based on the plurality of node information characteristics comprises:
constructing a first node description vector based on a plurality of node information characteristics;
acquiring a preset node evaluation library, wherein the node evaluation library comprises: a plurality of groups of one-to-one corresponding second node description vectors and first evaluation values;
matching the first node description vector with any one of the second node description vectors;
and if the match succeeds and the first evaluation value corresponding to the matched second node description vector is greater than or equal to a preset first evaluation threshold, taking the corresponding expert node as a better expert node.
The working principle and the beneficial effects of the technical scheme are as follows:
A first node description vector is constructed based on the plurality of node information characteristics. The second node description vectors and the first evaluation values correspond one to one: each second node description vector is constructed in advance from a different set of node information characteristics, and each first evaluation value is assigned in advance according to how good an expert evaluation that node information reflects; the larger the first evaluation value, the better the evaluation. For example, node information characterized by 20 bad ratings yields a second node description vector whose first evaluation value is 0. The first node description vector is matched against the second node description vectors. A successful match indicates that the situation reflected by the node information corresponds to the situation of the matched second node description vector, and the corresponding first evaluation value is output. If that value is greater than or equal to the preset first evaluation threshold, the node is well evaluated and is taken as a better expert node. Because it relies on vector construction and vector matching, this approach determines a node's evaluation quickly and improves the efficiency of evaluation determination.
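As a non-limiting illustration, the lookup in the node evaluation library may be sketched in Python as follows. The embodiment does not define what counts as a match; this sketch assumes element-wise equality within a small tolerance. The function names and the two-element vector layout used in the usage note are likewise hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Node evaluation library: pairs of (second node description vector,
# first evaluation value), in one-to-one correspondence.
NodeEvaluationLibrary = List[Tuple[Sequence[float], float]]


def vectors_match(a: Sequence[float], b: Sequence[float],
                  tolerance: float = 1e-9) -> bool:
    """Assumed matching rule: same length and element-wise equal within tolerance."""
    return len(a) == len(b) and all(abs(x - y) <= tolerance for x, y in zip(a, b))


def lookup_evaluation(first_vector: Sequence[float],
                      library: NodeEvaluationLibrary) -> Optional[float]:
    """Return the first evaluation value of the matched entry, or None if no match."""
    for second_vector, first_evaluation_value in library:
        if vectors_match(first_vector, second_vector):
            return first_evaluation_value
    return None


def is_better_expert_node(first_vector: Sequence[float],
                          library: NodeEvaluationLibrary,
                          first_evaluation_threshold: float) -> bool:
    """A node qualifies only if it matches an entry whose value meets the threshold."""
    value = lookup_evaluation(first_vector, library)
    return value is not None and value >= first_evaluation_threshold
```

For instance, with a hypothetical layout of `[bad_ratings, good_ratings]`, a library of `[([20.0, 0.0], 0.0), ([0.0, 20.0], 95.0)]` and a threshold of 60, a node described by `[0.0, 20.0]` qualifies, whereas a node described by `[20.0, 0.0]` or by an unmatched vector does not.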
An embodiment of the present invention provides an intelligent control system for computing resources in data integration operations which, as shown in fig. 3, includes:
the monitoring module 1, used for monitoring the system log and the resident memory usage of the DEC when a data integration task is executed;
the determining module 2, used for determining a computing resource allocation strategy based on the system log and the resident memory usage;
and the execution module 3, used for executing the computing resource allocation strategy.
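As a non-limiting illustration, the cooperation of the three modules may be sketched in Python as one control step. The `Observation` layout, the callable signatures, and the representation of a strategy as a string are assumptions made for this sketch only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Observation:
    """What the monitoring module hands over (hypothetical layout)."""
    log_lines: List[str]
    resident_memory_bytes: int


# Hypothetical signatures for the three modules.
Monitor = Callable[[], Observation]
Determine = Callable[[Observation], Optional[str]]
Execute = Callable[[str], None]


def control_step(monitor: Monitor, determine: Determine,
                 execute: Execute) -> Optional[str]:
    """Run monitoring, strategy determination and execution once."""
    observation = monitor()
    strategy = determine(observation)
    if strategy is not None:  # nothing is executed when no strategy applies
        execute(strategy)
    return strategy
```

Keeping the modules behind plain callables lets the determining logic be replaced or tested without touching how logs are collected or how a strategy is applied.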
The determining module 2 determines the computing resource allocation strategy based on the system log and the resident memory usage as follows:
determining whether an OOM event occurs based on the system log;
if yes, determining that the computing resource allocation strategy is as follows: if the DEC is at the P1 level, restarting the data integration task and increasing the memory resource allocation of the DEC; if the DEC is at the P2 level, restarting the data integration task;
if the DEC is at the P0 level, determining whether the RES occupation of the DEC is close to cgroup.
If yes, determining that the computing resource allocation strategy is as follows: actively increasing the memory resource allocation of the DEC;
if the DEC is at the P2 level or the P3 level, determining, based on the resident memory usage, whether the RES occupation of the DEC is continuously lower than a preset percentage;
if yes, determining that the computing resource allocation strategy is as follows: actively reducing the memory resource allocation of the DEC.
The P0 level is an active guarantee level;
the P1 level is a failure retry guarantee level;
the P2 level is a conventional level;
the P3 level is a low priority level.
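As a non-limiting illustration, the level-dependent strategy determination may be sketched in Python as follows. Several points are assumptions of this sketch rather than statements of the embodiment: the bound that the RES occupation is compared against is taken to be a memory limit in bytes, "close to" is taken as at least 90% of that limit, "continuously lower than a preset percentage" is taken as every recent sample being below 30% of it, and the usage-based checks are applied whenever no OOM-driven strategy was selected.

```python
from enum import Enum
from typing import List, Sequence


class Level(Enum):
    P0 = "active guarantee"
    P1 = "failure retry guarantee"
    P2 = "conventional"
    P3 = "low priority"


class Action(Enum):
    RESTART_TASK = "restart the data integration task"
    INCREASE_MEMORY = "increase the memory resource allocation"
    DECREASE_MEMORY = "reduce the memory resource allocation"


def determine_strategy(level: Level, oom_occurred: bool,
                       res_samples_bytes: Sequence[int],
                       memory_limit_bytes: int,
                       near_limit_ratio: float = 0.9,    # assumed value
                       low_usage_ratio: float = 0.3      # assumed value
                       ) -> List[Action]:
    """Return the actions making up the computing resource allocation strategy."""
    # Reactive branch: an OOM event was found in the system log.
    if oom_occurred:
        if level is Level.P1:
            return [Action.RESTART_TASK, Action.INCREASE_MEMORY]
        if level is Level.P2:
            return [Action.RESTART_TASK]

    # Usage-based branch needs at least one sample and a positive limit.
    if not res_samples_bytes or memory_limit_bytes <= 0:
        return []
    ratios = [sample / memory_limit_bytes for sample in res_samples_bytes]

    # P0: scale up before an OOM can happen, based on the latest sample.
    if level is Level.P0 and ratios[-1] >= near_limit_ratio:
        return [Action.INCREASE_MEMORY]

    # P2 / P3: reclaim memory only if usage stayed low across all samples.
    if level in (Level.P2, Level.P3) and all(r < low_usage_ratio for r in ratios):
        return [Action.DECREASE_MEMORY]

    return []
```

For example, `determine_strategy(Level.P0, False, [500, 950], 1000)` yields only the increase action, while a P3 task whose samples all stay below 30% of the limit yields only the reduce action.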
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (9)

1. An intelligent control method for computing resources in data integration operation is characterized by comprising the following steps:
when a data integration task is executed, monitoring the system log and the resident memory usage of the DEC;
determining a computing resource allocation strategy based on the system log and the resident memory usage;
executing the computing resource allocation strategy;
further comprising:
when a user inputs a manual remote computing resource control request, acquiring task information of the data integration task;
performing feature extraction on the task information to obtain a plurality of task features;
constructing a first task description vector based on the plurality of task features;
acquiring a preset computing resource control expert database, wherein the computing resource control expert database comprises: a plurality of groups of expert nodes and second task description vectors which correspond one to one;
acquiring node information of the expert nodes, wherein the node information comprises: a plurality of customer evaluation records obtained from manual remote computing resource control performed by the expert within the most recent preset first time;
selecting better expert nodes from the expert nodes based on the node information;
calculating a first vector similarity between the first task description vector and the second task description vector corresponding to any one of the better expert nodes;
taking the better expert node corresponding to the maximum first vector similarity as a suitable expert node;
continuously delivering the system log and the resident memory usage to the suitable expert node;
acquiring a computing resource control strategy replied by the suitable expert node;
and executing the computing resource control strategy.
2. The method as claimed in claim 1, wherein determining a computing resource allocation strategy based on the system log and the resident memory usage comprises:
determining whether an OOM event occurs based on the system log;
if yes, determining that the computing resource allocation strategy is as follows: if the DEC is in a P1 level, restarting the data integration task, and increasing the memory resource allocation of the DEC; if the DEC is in a P2 level, restarting the data integration task;
if the DEC is in the P0 level, determining whether the RES occupation of the DEC is close to cgroup.
If yes, determining that the computing resource allocation strategy is as follows: actively increasing the memory resource allocation of the DEC;
if the DEC is in a P2 level or a P3 level, determining whether the RES occupation of the DEC is continuously lower than a preset percentage or not based on the use condition of the resident memory;
if yes, determining that the computing resource allocation strategy is as follows: actively reducing memory resource allocation of the DEC.
3. The method of claim 2, wherein the P0 level is an active guarantee level;
the P1 level is a failure retry guarantee level;
the P2 level is a conventional level;
the P3 level is a low priority level.
4. The method as claimed in claim 1, wherein the step of selecting a better expert node from the expert nodes based on the node information comprises:
preprocessing the node information;
performing feature extraction on the preprocessing result to obtain a plurality of node information characteristics;
and selecting a better expert node from the expert nodes based on the plurality of node information characteristics.
5. The method as claimed in claim 4, wherein the preprocessing of the node information comprises:
extracting, from the customer evaluation records, the sight-line movement trajectory within a preset second time before and after the customer selects any evaluation option while filling in an evaluation questionnaire;
traversing the evaluation options in reverse order of their option sequence in the evaluation questionnaire;
during each traversal, extracting the content structure of the traversed evaluation option;
acquiring a preset first trajectory description vector corresponding to the content structure;
performing feature extraction on the sight-line movement trajectory within the preset second time before and after the customer selects the traversed evaluation option, to obtain a plurality of trajectory features;
constructing a second trajectory description vector based on the plurality of trajectory features;
calculating a second vector similarity between the second trajectory description vector and the first trajectory description vector;
if the second vector similarity is less than or equal to a preset vector similarity threshold, rejecting the corresponding customer evaluation record;
finishing the preprocessing after all customer evaluation records that need to be rejected have been removed from the node information;
and/or,
extracting, from the customer evaluation record, the stay time when the customer selects any evaluation option while filling in the evaluation questionnaire;
traversing the evaluation options in reverse order of their option sequence in the evaluation questionnaire;
during each traversal, acquiring a preset stay time threshold corresponding to the traversed evaluation option;
if the stay time when the customer selects the traversed evaluation option while filling in the evaluation questionnaire is less than or equal to the stay time threshold, rejecting the corresponding customer evaluation record;
and finishing the preprocessing after all customer evaluation records that need to be rejected have been removed from the node information.
6. The method as claimed in claim 4, wherein the step of selecting a better expert node from the expert nodes based on the plurality of node information characteristics comprises:
constructing a first node description vector based on the plurality of node information characteristics;
acquiring a preset node evaluation library, wherein the node evaluation library comprises: a plurality of groups of one-to-one corresponding second node description vectors and first evaluation values;
matching the first node description vector with any of the second node description vectors;
and if the match succeeds and the first evaluation value corresponding to the matched second node description vector is greater than or equal to a preset first evaluation threshold, taking the corresponding expert node as a better expert node.
7. An intelligent control system for computing resources in data integration operation is characterized by comprising:
the monitoring module is used for monitoring the system log and the resident memory usage of the DEC when a data integration task is executed;
the determining module is used for determining a computing resource allocation strategy based on the system log and the resident memory usage;
the execution module is used for executing the computing resource allocation strategy;
the execution module is further used for:
when a user inputs a manual remote computing resource control request, acquiring task information of the data integration task;
performing feature extraction on the task information to obtain a plurality of task features;
constructing a first task description vector based on the plurality of task features;
acquiring a preset computing resource control expert database, wherein the computing resource control expert database comprises: a plurality of groups of expert nodes and second task description vectors which correspond one to one;
acquiring node information of the expert nodes, wherein the node information comprises: a plurality of customer evaluation records obtained from manual remote computing resource control performed by the expert within the most recent preset first time;
selecting better expert nodes from the expert nodes based on the node information;
calculating a first vector similarity between the first task description vector and the second task description vector corresponding to any one of the better expert nodes;
taking the better expert node corresponding to the maximum first vector similarity as a suitable expert node;
continuously delivering the system log and the resident memory usage to the suitable expert node;
acquiring a computing resource control strategy replied by the suitable expert node;
and executing the computing resource control strategy.
8. The system of claim 7, wherein the determining module determining a computing resource allocation strategy based on the system log and the resident memory usage comprises:
determining whether an OOM event occurs based on the system log;
if yes, determining that the computing resource allocation strategy is as follows: if the DEC is in the P1 level, restarting the data integration task and increasing the memory resource allocation of the DEC; if the DEC is in a P2 level, restarting the data integration task;
if the DEC is in the P0 level, determining whether the RES occupation of the DEC is close to cgroup.
If yes, determining that the computing resource allocation strategy is as follows: actively increasing the memory resource allocation of the DEC;
if the DEC is in a P2 level or a P3 level, determining whether the RES occupation of the DEC is continuously lower than a preset percentage or not based on the use condition of the resident memory;
if yes, determining that the computing resource allocation strategy is as follows: actively reducing memory resource allocation of the DEC.
9. The system as claimed in claim 8, wherein the P0 level is an active guarantee level;
the P1 level is a failure retry guarantee level;
the P2 level is a conventional level;
the P3 level is a low priority level.
CN202211440650.9A 2022-11-17 2022-11-17 Intelligent control method and system for computing resources in data integration operation Active CN115495251B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211440650.9A CN115495251B (en) 2022-11-17 2022-11-17 Intelligent control method and system for computing resources in data integration operation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211440650.9A CN115495251B (en) 2022-11-17 2022-11-17 Intelligent control method and system for computing resources in data integration operation

Publications (2)

Publication Number Publication Date
CN115495251A CN115495251A (en) 2022-12-20
CN115495251B true CN115495251B (en) 2023-02-07

Family

ID=85116084

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211440650.9A Active CN115495251B (en) 2022-11-17 2022-11-17 Intelligent control method and system for computing resources in data integration operation

Country Status (1)

Country Link
CN (1) CN115495251B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9495211B1 (en) * 2014-03-04 2016-11-15 Google Inc. Allocating computing resources based on user intent
WO2017133351A1 (en) * 2016-02-05 2017-08-10 华为技术有限公司 Resource allocation method and resource manager

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108196970A (en) * 2017-12-29 2018-06-22 东软集团股份有限公司 The dynamic memory management method and device of Spark platforms
CN115168026A (en) * 2022-05-31 2022-10-11 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Important process anti-false killing method and system under limited memory resource condition

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
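Challenges in Heterogeneous Die-Stacked and Off-Chip Memory Systems; Gabriel H Loh et al.; "3rd Workshop on SoCs, Heterogeneous Architectures and Workloads"; 20120229; full text *
Research on the Construction of a Traditional Chinese Medicine Big Data Resource Warehouse and Its Application to Prescription Analysis; 吴纪龙; "China Master's Theses Full-text Database (Electronic Journal), Medicine and Health Sciences"; 20220215; full text *
Research on Index and Query Processing Techniques in Cloud Computing Systems; 王金宝; "China Doctoral Dissertations Full-text Database (Electronic Journal), Information Science and Technology"; 20140115; full text *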

Also Published As

Publication number Publication date
CN115495251A (en) 2022-12-20

Similar Documents

Publication Publication Date Title
CN109104336B (en) Service request processing method and device, computer equipment and storage medium
US6912533B1 (en) Data mining agents for efficient hardware utilization
CN109783237B (en) Resource allocation method and device
US20050096949A1 (en) Method and system for automatic continuous monitoring and on-demand optimization of business IT infrastructure according to business objectives
CN112488706B (en) Cloud service management method and system based on block chain
CN112799817A (en) Micro-service resource scheduling system and method
CN112306656A (en) Cloud computing task tracking processing method, cloud computing system and server
CN111445244A (en) Intelligent contract management system for block chain
CN113835874A (en) Deep learning service scheduling method, system, terminal and storage medium
US20170090986A1 (en) System and method for managing workload performance on billed computer systems
US20050089063A1 (en) Computer system and control method thereof
CN115495251B (en) Intelligent control method and system for computing resources in data integration operation
CN114090376A (en) Service processing method and device based on alliance chain system
CN111262783B (en) Dynamic routing method and device
CN111475251A (en) Cluster container scheduling method, system, terminal and storage medium
CN116360994A (en) Scheduling method, device, server and storage medium of distributed heterogeneous resource pool
EP3866010A1 (en) Method and system for processing transactions in a block-chain network
CN113938429A (en) Flow control method, flow control device and computer readable storage medium
CN111784359B (en) Multi-mode wind control grading disaster recovery method and device
CN114237910A (en) Client load balancing implementation method and device
CN111143386A (en) Method and device for processing bond line data
CN112559142B (en) Container control method, device, edge computing system, medium and equipment
CN112306666B (en) Cloud resource management system and method, and non-transitory computer readable recording medium
CN109241053B (en) Identification code allocation method, device and server
WO2024055139A1 (en) Autonomous quota management for shared resources

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant