CN116841737A - Batch task distribution method, device, equipment and storage medium - Google Patents


Publication number
CN116841737A
Authority
CN
China
Prior art keywords
task
sample data
feature
abnormal
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310785763.0A
Other languages
Chinese (zh)
Inventor
张盛荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202310785763.0A
Publication of CN116841737A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Abstract

The invention relates to an artificial intelligence technology, and discloses a batch processing task distribution method, which comprises the following steps: performing missing value filling, abnormal value replacement and feature screening on each piece of obtained historical batch processing task data to obtain corresponding target sample data; constructing a decision tree model by taking all the target sample data as a training set to obtain a target node distribution model by taking task features in the target sample data as decision nodes and taking all the distribution server nodes as leaf nodes; and when task data of the batch processing task to be distributed is received, distributing the batch processing task to be distributed to a proper target distribution server node for executing the task based on the task data and the target node distribution model. The invention also provides a batch task distribution device, equipment and medium, which can be used in the financial field, and improves the accuracy of batch task distribution such as insurance application data verification and the like.

Description

Batch task distribution method, device, equipment and storage medium
Technical Field
The present invention relates to the field of artificial intelligence technology and financial technology, and in particular, to a method and apparatus for distributing batch processing tasks, an electronic device, and a storage medium.
Background
In the field of financial insurance, distributed systems are used for batch task processing to meet the performance requirements of various data batch processing tasks, and selecting where each batch task is distributed is an important part of such a system. To improve the efficiency and availability of the system, each batch task needs to be distributed to an appropriate server node for execution (for example, a batch task for insurance application data verification needs to be routed to a server node suited to data comparison).
However, existing batch task distribution screens server nodes using only a single dimension of the task features of a batch task. Because the screening dimension is single, the accuracy of distributing batch tasks such as insurance application data verification is low.
Disclosure of Invention
The invention provides a batch task distribution method, a batch task distribution device, electronic equipment and a storage medium, and mainly aims to improve the accuracy of distributing batch tasks such as insurance application data verification.
To achieve this aim, the batch task distribution method provided by the invention comprises:
acquiring a plurality of historical batch processing task data and a distribution server node corresponding to each historical batch processing task data;
performing missing value filling and outlier replacement on the historical batch processing task data to obtain initial sample data;
feature screening is carried out on the initial sample data to obtain target sample data;
constructing a decision tree model by taking all the target sample data as a training set to obtain a target node distribution model by taking task features in the target sample data as decision nodes and taking all the distribution server nodes as leaf nodes;
when task data of a batch task to be distributed is received, screening all the distribution server nodes based on the task data and the target node distribution model to obtain target distribution server nodes;
and distributing the batch processing task to be distributed to the target distribution server node for task execution.
Optionally, the performing missing value filling and outlier replacement on the historical batch processing task data to obtain initial sample data includes:
confirming task features with empty task feature values in the historical batch processing task data as missing task features;
calculating all task feature values which are not empty and correspond to the missing task features in all the historical batch processing task data to obtain missing filling values corresponding to the missing task features;
Filling a missing filling value corresponding to the missing task feature as a task feature value of the missing task feature into the historical batch processing task data to obtain filling sample data;
and carrying out outlier replacement on the filling sample data to obtain the initial sample data.
Optionally, the performing outlier replacement on the filling sample data to obtain the initial sample data includes:
performing abnormality detection on the task feature value of each task feature in the filling sample data to determine the task feature corresponding to the detected abnormal task feature value as an abnormal task feature;
calculating or screening based on all task feature values corresponding to the abnormal task features in all the filling sample data to obtain abnormal replacement feature values corresponding to the abnormal task features;
and replacing the task characteristic value of the abnormal task characteristic in the filling sample data with a corresponding abnormal replacement characteristic value to obtain initial sample data.
Optionally, the calculating or filtering based on all task feature values corresponding to the abnormal task feature in all the filling sample data to obtain an abnormal replacement feature value corresponding to the abnormal task feature includes:
And calculating the average value of all task feature values corresponding to the abnormal task features in all the filling sample data to obtain an abnormal replacement feature value corresponding to the abnormal task features.
Optionally, the calculating or filtering based on all task feature values corresponding to the abnormal task feature in all the filling sample data to obtain an abnormal replacement feature value corresponding to the abnormal task feature includes:
and screening statistics values of preset types in all task feature values corresponding to the abnormal task features in all the filling sample data to obtain abnormal replacement feature values corresponding to the abnormal task features.
Optionally, the calculating or filtering based on all task feature values corresponding to the abnormal task feature in all the filling sample data to obtain an abnormal replacement feature value corresponding to the abnormal task feature includes:
calculating standard deviations of all task feature values corresponding to the abnormal task features in all the filling sample data to obtain initial replacement feature values corresponding to the abnormal task features;
and determining a preset multiple of the initial replacement characteristic value corresponding to the abnormal task characteristic as an abnormal replacement characteristic value corresponding to the abnormal task characteristic.
Optionally, the feature screening of the initial sample data to obtain target sample data includes:
summarizing all the initial sample data to obtain an initial sample data set;
calculating an overall entropy of the initial sample dataset based on the distribution server node;
calculating the conditional entropy of each task feature in the initial sample data set on the initial sample data set, and calculating based on the overall entropy to obtain a feature importance coefficient of each task feature in the initial sample data set;
screening all kinds of task features in the initial sample data set based on the feature importance coefficients to obtain target task features;
and reserving all target task characteristics and corresponding task characteristic values in the initial sample data to obtain the target sample data.
In order to solve the above problems, the present invention also provides a batch task distribution apparatus, the apparatus comprising:
the data processing module is used for acquiring a plurality of historical batch processing task data and a distribution server node corresponding to each historical batch processing task data; filling the missing value and replacing the abnormal value of the historical batch processing task data to obtain initial sample data; feature screening is carried out on the initial sample data to obtain target sample data;
The model construction module is used for constructing a decision tree model by taking all the target sample data as a training set to obtain a target node distribution model by taking task characteristics in the target sample data as decision nodes and taking all the distribution server nodes as leaf nodes;
the task distribution module is used for screening all the distribution server nodes based on the task data and the target node distribution model to obtain target distribution server nodes when receiving the task data of the batch processing task to be distributed; and distributing the batch processing task to be distributed to the target distribution server node for task execution.
In order to solve the above-mentioned problems, the present invention also provides an electronic apparatus including:
a memory storing at least one computer program; and
a processor that executes the computer program stored in the memory to implement the above batch task distribution method.
In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium having stored therein at least one computer program that is executed by a processor in an electronic device to implement the above-mentioned batch task distribution method.
In the embodiment of the invention, a decision tree model is constructed with all the target sample data as the training set, yielding a target node distribution model in which task features of the target sample data are decision nodes and all the distribution server nodes are leaf nodes. When task data of a batch task to be distributed is received, all the distribution server nodes are screened based on the task data and the target node distribution model to obtain a target distribution server node, and the batch task to be distributed is distributed to that node for execution. Because the server nodes are screened by a decision tree model built from multidimensional task features, a server node suited to the task can be selected according to the multiple task features of the batch task to be distributed, which improves the accuracy of batch task distribution.
Drawings
FIG. 1 is a flow chart of a method for distributing batch tasks according to an embodiment of the present application;
FIG. 2 is a schematic block diagram of a batch task distribution device according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an internal structure of an electronic device for implementing a batch task distribution method according to an embodiment of the present application;
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The embodiment of the application provides a batch task distribution method. The execution body of the batch task distribution method includes, but is not limited to, at least one of a server, a terminal, and the like, which can be configured to execute the method provided by the embodiment of the application. In other words, the batch task distribution method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server side includes, but is not limited to, an independent server or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms.
Referring to FIG. 1, which is a schematic flow chart of a batch task distribution method according to an embodiment of the present invention, the batch task distribution method includes the following steps:
S1, acquiring a plurality of historical batch processing task data and a distribution server node corresponding to each historical batch processing task data;
In the embodiment of the invention, the historical batch task data is task data of batch tasks (such as insurance investigation, insurance application data verification and the like) processed in a distributed manner in the financial insurance field. It comprises different task features and the task feature value of each task feature, where the task features include but are not limited to: the type of task, the requirements of the task, and the time required for task execution. The distribution server node is the server node to which the batch task corresponding to the historical batch task data was distributed; it is a node server, capable of executing batch tasks, of a preset distributed system used in the financial insurance field.
S2, filling missing values and replacing abnormal values of the historical batch processing task data to obtain initial sample data;
In the embodiment of the invention, to ensure that the historical batch task data is usable, the data must be preprocessed: missing value filling and outlier replacement are performed on the historical batch task data to obtain initial sample data.
In detail, in the embodiment of the present invention, performing missing value filling and outlier replacement on the historical batch task data to obtain initial sample data includes:
confirming task features with empty task feature values in the historical batch processing task data as missing task features;
calculating all task feature values which are not empty and correspond to the missing task features in all the historical batch processing task data to obtain missing filling values corresponding to the missing task features;
filling a missing filling value corresponding to the missing task feature as a task feature value of the missing task feature into the historical batch processing task data to obtain filling sample data;
and carrying out outlier replacement on the filling sample data to obtain the initial sample data.
Further, in the embodiment of the present invention, calculating or filtering based on all task feature values corresponding to the missing task features in all the historical batch task data that are not empty to obtain missing fill values corresponding to the missing task features includes:
And calculating the average value of all task feature values which are not empty and correspond to the missing task features, and obtaining the missing filling value which corresponds to the missing task features.
In an embodiment of the present invention, calculating or filtering based on all task feature values corresponding to the missing task features in all the historical batch task data that are not empty to obtain missing filling values corresponding to the missing task features includes:
and screening the median of all the task feature values which are not empty and correspond to the missing task features to obtain missing filling values which correspond to the missing task features.
In an embodiment of the present invention, calculating or filtering based on all task feature values corresponding to the missing task features in all the historical batch task data that are not empty to obtain missing filling values corresponding to the missing task features includes:
and screening the modes of all the task feature values which are not null and correspond to the missing task features to obtain missing filling values which correspond to the missing task features.
In the embodiment of the present invention, the step of calculating over all non-empty task feature values corresponding to the missing task feature in all the historical batch task data to obtain the missing fill value may instead be replaced by determining a preset task feature threshold as the missing fill value corresponding to the missing task feature.
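As an illustration of the missing value filling described above, the following Python sketch fills empty values of one task feature with the mean, median, or mode of the non-empty values, or with a preset value. The record layout, the field names `duration_s` and `node`, and the function name `fill_missing` are hypothetical and are not taken from the embodiment.

```python
from statistics import mean, median, mode

def fill_missing(records, feature, strategy="mean", preset=None):
    """Return copies of the records with empty (None) values of `feature` filled."""
    # Only non-empty values of the feature contribute to the fill value.
    present = [r[feature] for r in records if r.get(feature) is not None]
    if strategy == "preset":
        fill = preset  # preset threshold used directly as the fill value
    else:
        fill = {"mean": mean, "median": median, "mode": mode}[strategy](present)
    return [
        dict(r, **{feature: fill}) if r.get(feature) is None else dict(r)
        for r in records
    ]

# Hypothetical historical batch task data; "node" is the distribution server node.
history = [
    {"duration_s": 30, "node": "A"},
    {"duration_s": None, "node": "B"},
    {"duration_s": 50, "node": "A"},
    {"duration_s": 40, "node": "B"},
]
filled = fill_missing(history, "duration_s", strategy="mean")
```

The distribution server node of each record is left unchanged, so every filled record keeps its original label.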
Further, in the embodiment of the present invention, performing outlier replacement on the filling sample data to obtain the initial sample data includes:
performing abnormality detection on the task feature value of each task feature in the filling sample data to determine the task feature corresponding to the detected abnormal task feature value as an abnormal task feature;
calculating or screening based on all task feature values corresponding to the abnormal task features in all the filling sample data to obtain abnormal replacement feature values corresponding to the abnormal task features;
and replacing the task characteristic value of the abnormal task characteristic in the filling sample data with a corresponding abnormal replacement characteristic value to obtain initial sample data.
Specifically, in the embodiment of the present invention, the performing anomaly detection on the task feature value of each task feature in the filling sample data to determine the task feature corresponding to the detected abnormal task feature value as an abnormal task feature includes:
calculating the average value of all task feature values corresponding to each task feature in all the filling sample data to obtain the average value of each task feature;
Calculating standard deviations of all task feature values corresponding to all task features in the filling sample data to obtain the standard deviation of each task feature;
performing anomaly detection on each task feature value of the task feature in all the filling sample data according to the average value and standard deviation of the task feature, to detect the abnormal task feature values of the task feature, and determining the task feature corresponding to a detected abnormal task feature value as an abnormal task feature.
Specifically, in the embodiment of the present invention, performing anomaly detection on each task feature value of the task feature in all the filling sample data according to the average value and standard deviation of the task feature includes:
calculating the absolute value of the difference between the task feature value of the task feature and the average value of the task feature, and judging the task feature value to be an abnormal task feature value when the calculated absolute value is greater than 3 times the standard deviation of the task feature.
For example: the task feature value of the task feature a to be identified abnormally is 10, the average value of the task feature a is 50, the standard deviation of the task feature a is 10, the absolute value of the difference between the task feature value 10 and the average value of the task feature is |10-50|=40, and 40 is greater than 3 times of standard deviation (3×10=30), and then the task feature value 10 of the task feature a is the abnormal task feature value.
In another embodiment of the present invention, according to the average value and standard deviation of the task features, performing anomaly identification on each task feature value of the task feature in all the filling sample data to identify an abnormal task feature value corresponding to the task feature, where the anomaly identification includes:
calculating the difference between the task feature value of the task feature and the average value of the task feature, and judging the task feature value to be an abnormal task feature value when the ratio of the calculated difference to the standard deviation of the task feature is greater than 3.
For example: the task feature value of the task feature A to be checked is 90, the average value of the task feature A is 50, and the standard deviation of the task feature A is 10. The difference between the task feature value 90 and the average value of the task feature A is 90-50=40, and the ratio of 40 to the standard deviation is 40/10=4, which is greater than 3, so the task feature value 90 of the task feature A is an abnormal task feature value.
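The two detection variants above can be sketched in Python as follows. The function names are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the embodiment does not say which estimator is used.

```python
from statistics import mean, pstdev

def is_abnormal(value, mu, sigma, k=3.0):
    """First variant: |value - mean| is greater than k standard deviations."""
    return abs(value - mu) > k * sigma

def is_abnormal_ratio(value, mu, sigma, k=3.0):
    """Second variant: the signed ratio (value - mean) / std is greater than k."""
    return sigma > 0 and (value - mu) / sigma > k

def abnormal_indices(values, k=3.0):
    """Indices of the values of one task feature flagged by the first variant."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [i for i, v in enumerate(values) if is_abnormal(v, mu, sigma, k)]

# The two worked examples from the text: mean 50, standard deviation 10.
first_example = is_abnormal(10, 50, 10)         # |10 - 50| = 40 vs 3 * 10 = 30
second_example = is_abnormal_ratio(90, 50, 10)  # (90 - 50) / 10 = 4 vs 3
flagged = abnormal_indices([50] * 20 + [500])
```

Note that the second variant uses the signed difference, so as written it flags only values far above the mean; the first variant flags deviations in both directions.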
In detail, in the embodiment of the present invention, calculating or filtering based on all task feature values corresponding to the abnormal task feature in all the filling sample data to obtain an abnormal replacement feature value corresponding to the abnormal task feature includes:
and calculating the average value of all task feature values corresponding to the abnormal task features in all the filling sample data to obtain an abnormal replacement feature value corresponding to the abnormal task features.
In an embodiment of the present invention, the calculating or filtering based on all task feature values corresponding to the abnormal task features in all the filling sample data includes:
and screening statistics values of preset types in all task feature values corresponding to the abnormal task features in all the filling sample data to obtain abnormal replacement feature values corresponding to the abnormal task features.
Specifically, in the embodiment of the present invention, the statistics of the preset type include: mode, median, etc.
In an embodiment of the present invention, the calculating or filtering based on all task feature values corresponding to the abnormal task features in all the filling sample data includes:
calculating standard deviations of all task feature values corresponding to the abnormal task features in all the filling sample data to obtain initial replacement feature values corresponding to the abnormal task features;
and determining a preset multiple of the initial replacement characteristic value corresponding to the abnormal task characteristic as an abnormal replacement characteristic value corresponding to the abnormal task characteristic.
Preferably, in the embodiment of the present invention, the preset multiple is 3, although the embodiment of the present invention does not limit the preset multiple.
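The three ways of obtaining the abnormal replacement feature value (mean, a preset statistic such as the median or mode, or a preset multiple of the standard deviation) can be sketched as below. This is a minimal illustration: the function names and strategy labels are hypothetical, and the population standard deviation is an assumption.

```python
from statistics import mean, median, mode, pstdev

def replacement_value(values, strategy="mean", multiple=3.0):
    """Compute the replacement value over all values of the abnormal feature."""
    if strategy == "mean":
        return mean(values)
    if strategy == "median":
        return median(values)
    if strategy == "mode":
        return mode(values)
    if strategy == "std_multiple":
        # Preset multiple (3 by default) of the standard deviation.
        return multiple * pstdev(values)
    raise ValueError(f"unknown strategy: {strategy}")

def replace_outliers(values, abnormal_idx, strategy="median", multiple=3.0):
    """Return a copy of `values` with the flagged positions replaced."""
    rep = replacement_value(values, strategy, multiple)
    bad = set(abnormal_idx)
    return [rep if i in bad else v for i, v in enumerate(values)]

cleaned = replace_outliers([10, 12, 11, 13, 500], abnormal_idx=[4], strategy="median")
```

Following the text, the statistic is computed over all values of the feature, including the abnormal one, which is why a robust statistic such as the median may be preferable to the mean in practice.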
In an embodiment of the present invention, performing outlier replacement on the filling sample data to obtain the initial sample data includes:
performing abnormality detection on the task feature value of each task feature in the filling sample data to determine the task feature corresponding to the detected abnormal task feature value as an abnormal task feature;
and replacing the task characteristic value of the abnormal task characteristic in the filling sample data with the logarithm of the corresponding classification task characteristic value to obtain initial sample data.
In the embodiment of the invention, missing value filling and outlier replacement on the historical batch task data do not affect the distribution server node corresponding to the data, so each piece of initial sample data still has a corresponding distribution server node. For example, if historical batch task data A corresponds to distribution server node A, then initial sample data A, obtained from it by missing value filling and outlier replacement, also corresponds to distribution server node A.
S3, performing feature screening on the initial sample data to obtain target sample data;
In the embodiment of the invention, to identify which task features have a strong influence on which server node a task is distributed to, feature screening is performed on the initial sample data to obtain target sample data.
In detail, in the embodiment of the present invention, the screening of the initial sample data to obtain the target sample data includes:
summarizing all the initial sample data to obtain an initial sample data set;
calculating an overall entropy of the initial sample dataset based on the distribution server node;
calculating the conditional entropy of each task feature in the initial sample data set on the initial sample data set, and calculating based on the overall entropy to obtain a feature importance coefficient of each task feature in the initial sample data set;
screening all kinds of task features in the initial sample data set based on the feature importance coefficients to obtain target task features;
and reserving all target task characteristics and corresponding task characteristic values in the initial sample data to obtain the target sample data.
In detail, in the embodiment of the present invention, the overall entropy of the initial sample data set is calculated based on the distribution server nodes using the following formula:
S = -\sum_{i=1}^{n} p_i \log p_i
where S is the overall entropy, n is the total number of distribution server nodes, i denotes distribution server node i, and p_i is the proportion of the initial sample data set made up of initial sample data whose corresponding distribution server node is distribution server node i.
Specifically, in the embodiment of the present invention, the following formula is used to calculate the conditional entropy of each task feature in the initial sample data set to the initial sample data set, including:
where t denotes task feature t in the initial sample data set; m is the number of categories of task feature values of task feature t in the initial sample data set; j is the category number of a task feature value of task feature t in the initial sample data set; p_j is the proportion of the initial sample data set made up of initial sample data whose task feature value for task feature t has category number j; and p_ji is the proportion of the initial sample data set made up of initial sample data whose task feature value for task feature t has category number j and whose corresponding distribution server node is distribution server node i.
Further, in the embodiment of the present invention, the following formula is used to calculate the feature importance coefficient of each task feature in the initial sample data set, including:
Y_t = S - X_t
where Y_t is the feature importance coefficient of task feature t in the initial sample data set, S is the overall entropy, and X_t is the conditional entropy of task feature t.
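The entropy-based importance coefficient can be illustrated with the Python sketch below. It assumes base-2 Shannon entropy for the overall entropy S and reads the conditional entropy X_t as the weighted average, over the value categories of feature t, of the entropy of the server-node labels within each category. This is the standard information gain computation and is an assumption about the embodiment's formulas, not a reproduction of them. The field names and sample data are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2

def entropy(labels):
    """Shannon entropy (base 2, an assumption) of a list of server-node labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def conditional_entropy(rows, feature, label="node"):
    """X_t: entropy of the labels within each value category, weighted by p_j."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row[label])
    n = len(rows)
    return sum(len(g) / n * entropy(g) for g in groups.values())

def importance(rows, feature, label="node"):
    """Y_t = S - X_t for one task feature."""
    s = entropy([row[label] for row in rows])
    return s - conditional_entropy(rows, feature, label)

# Hypothetical initial sample data: "type" determines the node, "size" does not.
samples = [
    {"type": "verify", "size": "small", "node": "A"},
    {"type": "verify", "size": "large", "node": "A"},
    {"type": "survey", "size": "small", "node": "B"},
    {"type": "survey", "size": "large", "node": "B"},
]
y_type = importance(samples, "type")
y_size = importance(samples, "size")
```

In this toy data set the feature `type` removes all uncertainty about the server node, while `size` removes none, so `type` receives the larger coefficient.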
In detail, in the embodiment of the present invention, filtering task features of all categories in the initial sample dataset based on the feature importance coefficients to obtain target task features includes:
sorting the task features of all categories in the initial sample data set in descending order of feature importance coefficient to obtain a task feature sequence;
and determining the task features ranked ahead of a preset rank in the task feature sequence as the target task features.
In an embodiment of the present invention, filtering task features of all categories in the initial sample data set based on the feature importance coefficients to obtain target task features includes:
determining a feature importance coefficient larger than a preset feature importance coefficient threshold as a target feature importance coefficient;
and determining the task characteristics corresponding to the target characteristic importance coefficients as the target task characteristics.
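Both screening variants (keeping the top-ranked features, or keeping the features whose coefficient exceeds a threshold) can be sketched as follows. The function names, feature names, and coefficient values are hypothetical, and "ranked ahead of a preset rank" is read here as keeping the first k features of the sorted sequence.

```python
def select_top_k(importances, k):
    """Sort features by importance coefficient, descending, and keep the first k."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:k]

def select_above(importances, threshold):
    """Keep features whose importance coefficient is greater than the threshold."""
    return [f for f, y in importances.items() if y > threshold]

def keep_features(sample, features, label="node"):
    """Retain only the target task features (and the server-node label) of a sample."""
    return {k: v for k, v in sample.items() if k in features or k == label}

# Hypothetical feature importance coefficients Y_t.
coeffs = {"type": 0.9, "duration": 0.4, "priority": 0.1}
top2 = select_top_k(coeffs, 2)
above = select_above(coeffs, 0.3)
target_sample = keep_features(
    {"type": "verify", "duration": 30, "priority": 1, "node": "A"}, top2
)
```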
S4, constructing a decision tree model with all the target sample data as the training set, to obtain a target node distribution model in which the task features in the target sample data serve as decision nodes and all the distribution server nodes serve as leaf nodes;
In the embodiment of the invention, in order to evaluate which distribution server node each batch processing task is suited to, a target node distribution model is constructed to evaluate the node distribution of the batch processing tasks to be distributed.
In detail, in the embodiment of the invention, a decision tree model is constructed with all the target sample data as the training set, yielding a target node distribution model in which the task features in the target sample data serve as decision nodes and all the distribution server nodes serve as leaf nodes.
The embodiment of the invention does not limit the construction type of the decision tree.
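Since the construction type is not limited, the following sketch picks one possibility, an ID3-style tree over categorical features, purely for illustration. The choice of ID3, the dictionary representation of the tree, and the sample data and node names are all assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())


def split(samples, feature):
    """Group (features, node) samples by the value of one feature."""
    groups = {}
    for feats, node in samples:
        groups.setdefault(feats[feature], []).append((feats, node))
    return groups


def build_tree(samples, features):
    """Return a leaf (server node name) or {"feature": f, "branches": {value: subtree}}."""
    nodes = [node for _, node in samples]
    if len(set(nodes)) == 1 or not features:
        # Pure subset, or no features left: the leaf is the most common server node.
        return Counter(nodes).most_common(1)[0][0]

    def cond_entropy(feature):
        return sum(len(group) / len(samples) * entropy([n for _, n in group])
                   for group in split(samples, feature).values())

    # Lowest conditional entropy is equivalent to highest information gain.
    best = min(features, key=cond_entropy)
    rest = [f for f in features if f != best]
    return {"feature": best,
            "branches": {value: build_tree(group, rest)
                         for value, group in split(samples, best).items()}}


# Hypothetical target sample data: (task features, distribution server node).
training_set = [
    ({"task_type": "verify", "volume": "large"}, "node-1"),
    ({"task_type": "verify", "volume": "small"}, "node-2"),
    ({"task_type": "report", "volume": "large"}, "node-3"),
    ({"task_type": "report", "volume": "small"}, "node-3"),
]
tree = build_tree(training_set, ["task_type", "volume"])
print(tree)
```

Each internal dictionary is a decision node keyed by a task feature, and each string is a leaf naming a distribution server node, which mirrors the structure of the target node distribution model.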
S5, when task data of a batch processing task to be distributed is received, screening all the distribution server nodes based on the task data and the target node distribution model to obtain target distribution server nodes;
In the embodiment of the invention, the batch processing task to be distributed is a batch processing task awaiting distribution and execution, and the task data is of the same type as the historical batch processing task data. Optionally, in the embodiment of the present invention, the batch processing task to be distributed is a batch processing task for verifying unverified insurance application data.
Further, in order to determine which distribution server node is better suited to executing the batch processing task to be distributed, the embodiment of the invention screens all the distribution server nodes based on the task data and the target node distribution model to obtain the target distribution server node.
In detail, in the embodiment of the present invention, screening all the distribution server nodes based on the task data and the target node distribution model to obtain a target distribution server node includes:
acquiring the feature value of each target feature in the task data to obtain a target feature value of each target feature;
and, starting from the root node of the target node distribution model, selecting the corresponding branch based on the target feature value of each target feature and descending level by level to the next node until a leaf node is reached, and determining the distribution server node corresponding to the reached leaf node as the target distribution server node.
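The root-to-leaf traversal described above can be sketched as follows. The nested-dictionary tree, the feature and node names, and the fallback for a feature value that has no branch are assumptions made for illustration; the embodiment does not state how unseen values are handled.

```python
def route(tree, task_features, default=None):
    """Walk from the root to a leaf and return the server node stored at that leaf."""
    node = tree
    while isinstance(node, dict):            # a dict is a decision node
        value = task_features.get(node["feature"])
        if value not in node["branches"]:
            return default                   # unseen value: fallback is an assumption
        node = node["branches"][value]
    return node                              # a non-dict is a leaf (server node name)


# Hypothetical target node distribution model.
tree = {
    "feature": "task_type",
    "branches": {
        "verify": {"feature": "volume",
                   "branches": {"large": "node-1", "small": "node-2"}},
        "report": "node-3",
    },
}

print(route(tree, {"task_type": "verify", "volume": "small"}))
print(route(tree, {"task_type": "report", "volume": "large"}))
```

Only the features that lie on the chosen path are consulted, so a task whose first feature already leads to a leaf is routed without looking at the remaining features.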
Further, in the embodiment of the present invention, extracting a feature value of each target feature in the task data to obtain a target feature value of each target feature includes:
performing missing value filling and outlier replacement on the task data to obtain initial task data;
and extracting the characteristic value of each target characteristic in the initial task data to obtain the target characteristic value of each target characteristic.
In detail, in the embodiment of the present invention, performing missing value filling and outlier replacement on the task data to obtain initial task data includes:
performing missing value filling on the task data to obtain preprocessed task data;
and performing outlier replacement on the preprocessed task data to obtain the initial task data.
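The two preprocessing steps can be sketched as follows for numeric features. Several choices here are assumptions rather than details given at this point in the text: the fill value and the replacement value are both the mean of historical values, an outlier is a value more than k standard deviations from that mean, and all names and data are hypothetical.

```python
import statistics


def preprocess(task, history, k=3.0):
    """Fill missing numeric features, then replace outliers, using historical statistics."""
    cleaned = dict(task)

    # Step 1: missing value filling (mean of historical values, an assumption).
    for feature, values in history.items():
        if cleaned.get(feature) is None:
            cleaned[feature] = statistics.fmean(values)

    # Step 2: outlier replacement (k-sigma rule, an assumption).
    for feature, values in history.items():
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)
        if std > 0 and abs(cleaned[feature] - mean) > k * std:
            cleaned[feature] = mean

    return cleaned


# Hypothetical historical feature values and an incoming task.
history = {"record_count": [100, 120, 110, 90, 80], "file_mb": [10, 12, 11, 9, 8]}
task = {"record_count": None, "file_mb": 500}

print(preprocess(task, history))
```

Filling before replacing matches the order given above, and it means the outlier check always operates on a complete record.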
And S6, distributing the batch processing task to be distributed to the target distribution server node for task execution.
In the embodiment of the invention, the batch task to be distributed is distributed to the target distribution server node for task execution, thereby realizing automatic identification and distribution of the batch task to be processed.
FIG. 2 is a functional block diagram of a batch task distribution apparatus according to the present invention.
The batch task distribution apparatus 100 of the present invention may be installed in an electronic device. Depending on the functions implemented, the batch task distribution apparatus may include a data processing module 101, a model construction module 102, and a task distribution module 103. A module in the present invention, which may also be referred to as a unit, refers to a series of computer program segments that are stored in the memory of the electronic device, can be executed by the processor of the electronic device, and perform a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
the data processing module 101 is configured to acquire a plurality of historical batch processing task data and the distribution server node corresponding to each historical batch processing task data; perform missing value filling and outlier replacement on the historical batch processing task data to obtain initial sample data; and perform feature screening on the initial sample data to obtain target sample data;
the model construction module 102 is configured to construct a decision tree model with all the target sample data as the training set, to obtain a target node distribution model in which the task features in the target sample data serve as decision nodes and all the distribution server nodes serve as leaf nodes;
the task distribution module 103 is configured to, when receiving task data of a batch processing task to be distributed, screen all the distribution server nodes based on the task data and the target node distribution model to obtain a target distribution server node; and distribute the batch processing task to be distributed to the target distribution server node for task execution.
In detail, each module in the batch task distribution apparatus 100 in the embodiment of the present invention adopts the same technical means as the batch task distribution method described in fig. 1 and can produce the same technical effects when in use, which will not be described again here.
Fig. 3 is a schematic structural diagram of an electronic device for implementing the batch task distribution method according to the present invention.
The electronic device may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program, such as a batch task distribution program, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, including flash memory, a removable hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, such as a hard disk of the electronic device. The memory 11 may in other embodiments also be an external storage device of the electronic device, such as a plug-in removable hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only for storing application software installed in the electronic device and various types of data, such as the code of a batch task distribution program, but also for temporarily storing data that has been output or is to be output.
The processor 10 may be comprised of integrated circuits in some embodiments, for example, a single packaged integrated circuit, or may be comprised of multiple integrated circuits packaged with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the entire electronic device using various interfaces and lines, and executes various functions of the electronic device and processes data by running or executing programs or modules (e.g., batch task distribution programs, etc.) stored in the memory 11, and calling data stored in the memory 11.
The communication bus 12 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, etc. The communication bus 12 is arranged to enable connection and communication between the memory 11, the at least one processor 10, and other components. For ease of illustration, only one bold line is shown in the figure, but this does not mean that there is only one bus or only one type of bus.
Fig. 3 shows only an electronic device with components, and it will be understood by those skilled in the art that the structure shown in fig. 3 is not limiting of the electronic device and may include fewer or more components than shown, or may combine certain components, or a different arrangement of components.
For example, although not shown, the electronic device may further include a power source (such as a battery) for supplying power to the respective components. Preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device. The power supply may also include one or more of any of a direct current or alternating current power supply, recharging device, power failure classification circuit, power converter or inverter, power status indicator, etc. The electronic device may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which are not described herein.
Optionally, the communication interface 13 may comprise a wired interface and/or a wireless interface (e.g., a Wi-Fi interface, a Bluetooth interface, etc.), typically used to establish a communication connection between the electronic device and other electronic devices.
Optionally, the communication interface 13 may further comprise a user interface, which may be a display or an input unit such as a keyboard, or may be a standard wired or wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, as appropriate, and is used for displaying information processed in the electronic device and for displaying a visual user interface.
It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited by this configuration.
The batch task distribution program stored in the memory 11 in the electronic device is a combination of a plurality of computer programs, which when run in the processor 10, can implement:
Acquiring a plurality of historical batch processing task data and a distribution server node corresponding to each historical batch processing task data;
performing missing value filling and outlier replacement on the historical batch processing task data to obtain initial sample data;
performing feature screening on the initial sample data to obtain target sample data;
constructing a decision tree model with all the target sample data as the training set, to obtain a target node distribution model in which the task features in the target sample data serve as decision nodes and all the distribution server nodes serve as leaf nodes;
when task data of a batch task to be distributed is received, screening all the distribution server nodes based on the task data and the target node distribution model to obtain target distribution server nodes;
and distributing the batch processing task to be distributed to the target distribution server node for task execution.
In particular, the specific implementation method of the processor 10 on the computer program may refer to the description of the relevant steps in the corresponding embodiment of fig. 1, which is not repeated herein.
Further, the modules/units integrated in the electronic device, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. The computer readable medium may be non-volatile or volatile. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
Embodiments of the present invention may also provide a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, may implement:
acquiring a plurality of historical batch processing task data and a distribution server node corresponding to each historical batch processing task data;
performing missing value filling and outlier replacement on the historical batch processing task data to obtain initial sample data;
performing feature screening on the initial sample data to obtain target sample data;
constructing a decision tree model with all the target sample data as the training set, to obtain a target node distribution model in which the task features in the target sample data serve as decision nodes and all the distribution server nodes serve as leaf nodes;
when task data of a batch task to be distributed is received, screening all the distribution server nodes based on the task data and the target node distribution model to obtain target distribution server nodes;
and distributing the batch processing task to be distributed to the target distribution server node for task execution.
Further, the computer-usable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The embodiment of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.
In addition, each functional module in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another by cryptographic methods, each data block containing information on a batch of network transactions, which is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the system claims can also be implemented by one unit or means through software or hardware. Terms such as first and second are used to denote names, not any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (10)

1. A method for distributing batch processing tasks, the method comprising:
acquiring a plurality of historical batch processing task data and a distribution server node corresponding to each historical batch processing task data;
performing missing value filling and outlier replacement on the historical batch processing task data to obtain initial sample data;
performing feature screening on the initial sample data to obtain target sample data;
constructing a decision tree model with all the target sample data as the training set, to obtain a target node distribution model in which the task features in the target sample data serve as decision nodes and all the distribution server nodes serve as leaf nodes;
when task data of a batch task to be distributed is received, screening all the distribution server nodes based on the task data and the target node distribution model to obtain target distribution server nodes;
and distributing the batch processing task to be distributed to the target distribution server node for task execution.
2. The batch task distribution method according to claim 1, wherein the performing missing value filling and outlier replacement on the historical batch task data to obtain initial sample data includes:
determining task features whose task feature values are empty in the historical batch processing task data as missing task features;
performing a calculation on all non-empty task feature values corresponding to the missing task features in all the historical batch processing task data to obtain missing filling values corresponding to the missing task features;
filling the missing filling value corresponding to the missing task feature into the historical batch processing task data as the task feature value of the missing task feature to obtain filling sample data;
and carrying out outlier replacement on the filling sample data to obtain the initial sample data.
3. The batch task distribution method of claim 2, wherein said performing outlier replacement on said filling sample data to obtain said initial sample data comprises:
performing abnormality detection on the task feature value of each task feature in the filling sample data, and determining the task feature corresponding to each detected abnormal task feature value as an abnormal task feature;
calculating or screening based on all task feature values corresponding to the abnormal task features in all the filling sample data to obtain abnormal replacement feature values corresponding to the abnormal task features;
and replacing the task feature value of the abnormal task feature in the filling sample data with the corresponding abnormal replacement feature value to obtain the initial sample data.
4. The batch task distribution method according to claim 3, wherein the calculating or filtering based on all task feature values corresponding to the abnormal task feature in all the filling sample data to obtain an abnormal replacement feature value corresponding to the abnormal task feature includes:
and calculating the average value of all task feature values corresponding to the abnormal task features in all the filling sample data to obtain an abnormal replacement feature value corresponding to the abnormal task features.
5. The batch task distribution method according to claim 2, wherein the calculating or filtering based on all task feature values corresponding to the abnormal task feature in all the filling sample data to obtain an abnormal replacement feature value corresponding to the abnormal task feature includes:
and screening out a statistical value of a preset type from all task feature values corresponding to the abnormal task features in all the filling sample data to obtain abnormal replacement feature values corresponding to the abnormal task features.
6. The batch task distribution method according to claim 1, wherein the calculating or filtering based on all task feature values corresponding to the abnormal task feature in all the filling sample data to obtain an abnormal replacement feature value corresponding to the abnormal task feature includes:
calculating standard deviations of all task feature values corresponding to the abnormal task features in all the filling sample data to obtain initial replacement feature values corresponding to the abnormal task features;
and determining a preset multiple of the initial replacement characteristic value corresponding to the abnormal task characteristic as an abnormal replacement characteristic value corresponding to the abnormal task characteristic.
7. The batch task distribution method according to any one of claims 1 to 6, wherein the feature screening of the initial sample data to obtain target sample data includes:
summarizing all the initial sample data to obtain an initial sample data set;
calculating an overall entropy of the initial sample dataset based on the distribution server node;
calculating the conditional entropy of each task feature in the initial sample data set with respect to the initial sample data set, and calculating, based on the overall entropy, a feature importance coefficient of each task feature in the initial sample data set;
screening the task features of all categories in the initial sample data set based on the feature importance coefficients to obtain target task features;
and retaining all the target task features and the corresponding task feature values in the initial sample data to obtain the target sample data.
8. A batch task distribution apparatus, comprising:
the data processing module is used for acquiring a plurality of historical batch processing task data and a distribution server node corresponding to each historical batch processing task data; performing missing value filling and outlier replacement on the historical batch processing task data to obtain initial sample data; and performing feature screening on the initial sample data to obtain target sample data;
the model construction module is used for constructing a decision tree model with all the target sample data as the training set, to obtain a target node distribution model in which the task features in the target sample data serve as decision nodes and all the distribution server nodes serve as leaf nodes;
the task distribution module is used for screening all the distribution server nodes based on the task data and the target node distribution model to obtain target distribution server nodes when receiving the task data of the batch processing task to be distributed; and distributing the batch processing task to be distributed to the target distribution server node for task execution.
9. An electronic device, the electronic device comprising:
at least one processor; and,
a memory communicatively coupled to the at least one processor;
wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the batch task distribution method of any one of claims 1 to 7.
10. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the batch task distribution method of any one of claims 1 to 7.
CN202310785763.0A 2023-06-29 2023-06-29 Batch task distribution method, device, equipment and storage medium Pending CN116841737A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310785763.0A CN116841737A (en) 2023-06-29 2023-06-29 Batch task distribution method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310785763.0A CN116841737A (en) 2023-06-29 2023-06-29 Batch task distribution method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116841737A true CN116841737A (en) 2023-10-03

Family

ID=88171948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310785763.0A Pending CN116841737A (en) 2023-06-29 2023-06-29 Batch task distribution method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116841737A (en)

Similar Documents

Publication Publication Date Title
CN113327136B (en) Attribution analysis method, attribution analysis device, electronic equipment and storage medium
CN114612194A (en) Product recommendation method and device, electronic equipment and storage medium
CN113688923A (en) Intelligent order abnormity detection method and device, electronic equipment and storage medium
CN113516417A (en) Service evaluation method and device based on intelligent modeling, electronic equipment and medium
CN111694844A (en) Enterprise operation data analysis method and device based on configuration algorithm and electronic equipment
CN114491047A (en) Multi-label text classification method and device, electronic equipment and storage medium
CN113868529A (en) Knowledge recommendation method and device, electronic equipment and readable storage medium
CN114881616A (en) Business process execution method and device, electronic equipment and storage medium
CN114781832A (en) Course recommendation method and device, electronic equipment and storage medium
CN112733531A (en) Virtual resource allocation method and device, electronic equipment and computer storage medium
CN113627160B (en) Text error correction method and device, electronic equipment and storage medium
CN117193975A (en) Task scheduling method, device, equipment and storage medium
CN116841737A (en) Batch task distribution method, device, equipment and storage medium
CN115225489B (en) Dynamic control method for queue service flow threshold, electronic equipment and storage medium
CN113657546B (en) Information classification method, device, electronic equipment and readable storage medium
CN114723488B (en) Course recommendation method and device, electronic equipment and storage medium
CN117235480B (en) Screening method and system based on big data under data processing
CN116484296A (en) Financial fund collection risk analysis method, device, equipment and storage medium
CN114202434A (en) Claims settlement scheme generation method, device, equipment and medium based on priority algorithm
CN115659026A (en) Client recommendation method and device, electronic equipment and storage medium
CN115841279A (en) Supply chain data evaluation method, device, equipment and storage medium
CN115878773A (en) Questionnaire generation method, device and equipment based on information entropy weight and storage medium
CN117391864A (en) Risk identification method and device based on data flow direction, electronic equipment and medium
CN116483974A (en) Dialogue reply screening method, device, equipment and storage medium
CN114742471A (en) Task scheduling method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination