CN115130679A - Data management and control method, related device and medium program product - Google Patents

Data management and control method, related device and medium program product

Info

Publication number
CN115130679A
CN115130679A (application CN202210683587.5A)
Authority
CN
China
Prior art keywords
training set
learning
training
task
working condition
Prior art date
Legal status
Pending
Application number
CN202210683587.5A
Other languages
Chinese (zh)
Inventor
Guo Chuanliang (郭传亮)
Current Assignee
Hope Zhizhou Technology Shenzhen Co ltd
Original Assignee
Hope Zhizhou Technology Shenzhen Co ltd
Priority date
Filing date
Publication date
Application filed by Hope Zhizhou Technology Shenzhen Co ltd filed Critical Hope Zhizhou Technology Shenzhen Co ltd
Priority to CN202210683587.5A
Publication of CN115130679A

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 — Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • General Factory Administration (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An embodiment of the present application provides a data management and control method, a related apparatus, and a medium program product. The method comprises the following steps: determining a target learning task and a first training set corresponding to current working condition information according to the current working condition information and reference working condition information prestored in a learning training set database; removing problematic first training data from the first training set to obtain a second training set; training according to the second training set and the target learning task to obtain a target learning result; and storing the target learning result in the learning training set database to obtain an updated learning training set database. In this way, both the efficiency and the accuracy of machine learning can be improved.

Description

Data management and control method, related device and medium program product
Technical Field
The application belongs to the technical field of general data processing of the Internet industry, and particularly relates to a data management and control method, a related device and a medium program product.
Background
At present, traditional machine learning creates a new training set for every training run and does not make full use of existing historical training data, so machine learning efficiency is low. In addition, training data that caused quality problems in production batches are not eliminated, so machine learning accuracy is poor.
Disclosure of Invention
The application provides a data management and control method, a related device and a medium program product, which aim to improve the efficiency and the precision of machine learning.
In a first aspect, an embodiment of the present application provides a data management and control method, where the method is applied to a server, and the method includes:
determining a target learning task and a first training set corresponding to the current working condition information according to the current working condition information and reference working condition information prestored in a learning training set database, wherein the target learning task is used for indicating that production parameters under the current working condition are learned through a preset model, and the first training set is used for indicating a training data set used for learning in the target learning task;
removing the first training data set with problems from the first training set to obtain a second training set;
training according to the second training set and the target learning task to obtain a target learning result;
and storing the target learning result into the learning training set database to obtain an updated learning training set database.
In a second aspect, an embodiment of the present application provides a data management and control apparatus, where the apparatus is applied to a server, and the apparatus includes:
the system comprises a determining unit, a learning training set database and a processing unit, wherein the determining unit is used for determining a target learning task and a first training set corresponding to current working condition information according to the current working condition information and reference working condition information prestored in the learning training set database, the target learning task is used for indicating that production parameters under the current working condition are learned through a preset model, and the first training set is used for indicating a training data set used for learning in the target learning task; a clearing unit, configured to clear a first training data set with a problem from the first training set to obtain a second training set; the training unit is used for training according to the second training set and the target learning task to obtain a target learning result; and the storage unit is used for storing the target learning result into the learning training set database to obtain an updated learning training set database.
In a third aspect, embodiments of the present application provide a server comprising a processor, a memory, a communication interface, and one or more programs stored in the memory and configured to be executed by the processor, the programs including instructions for performing the steps in the first aspect of embodiments of the present application.
In a fourth aspect, an embodiment of the present application provides a computer storage medium for storing a computer program for electronic data exchange, where the computer program causes a computer to perform part or all of the steps described in the first aspect of the embodiments of the present application.
In a fifth aspect, embodiments of the present application provide a computer program product, where the computer program product includes a non-transitory computer-readable storage medium storing a computer program, where the computer program is operable to cause a computer to perform some or all of the steps as described in the first aspect of the embodiments of the present application. The computer program product may be a software installation package.
It can be seen that, in the embodiment of the present application, a target learning task and a first training set corresponding to current working condition information are determined according to the current working condition information and reference working condition information prestored in a learning training set database. Problematic first training data are removed from the first training set to obtain a second training set, training is performed according to the second training set and the target learning task to obtain a target learning result, and the target learning result is finally stored in the learning training set database to obtain an updated learning training set database. Because all training data are stored in the database for later reuse in machine learning, machine learning efficiency can be effectively improved; and because abnormal training data are eliminated, machine learning accuracy is guaranteed.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a system architecture diagram according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a data management and control method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a first client interaction page provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a second client interaction page provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of a third client interaction page provided in an embodiment of the present application;
FIG. 6 is a schematic diagram of a fourth client interaction page provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of abnormal training data clearing provided by an embodiment of the present application;
fig. 8a is a block diagram illustrating functional units of a data management apparatus according to an embodiment of the present disclosure;
fig. 8b is a block diagram illustrating functional units of another data management and control apparatus according to an embodiment of the present disclosure;
fig. 9 is a block diagram of a server according to an embodiment of the present disclosure.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The following description will first be made with respect to terms related to the present application.
Production batch: a single production run in which raw materials are processed by production equipment to obtain production results. Each production batch has a production batch number in one-to-one correspondence with it, used to distinguish different production batches.
Working condition: a combination of variation intervals of the characteristic values of input elements in a production process. In actual production, the combination of elements is partitioned and coded according to the characteristic intervals of the elements, forming different working condition codes. Different working condition combinations have an obvious influence on the control parameters of the production process.
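To illustrate the coding just described, the following Python sketch partitions two hypothetical input-element features into intervals and concatenates the bucket indices into a working condition code. The feature names, interval boundaries, and code format are all invented for illustration and are not taken from the application.

```python
import bisect

# Hypothetical feature intervals; each element's value range is partitioned,
# and the bucket index of each value becomes part of the condition code.
INTERVALS = {
    "temperature": [100, 200, 300],  # buckets: <100, 100-200, 200-300, >=300
    "humidity": [30, 60],            # buckets: <30, 30-60, >=60
}

def condition_code(features):
    """Map a dict of feature values to a working condition code string."""
    parts = []
    for name in sorted(INTERVALS):
        bucket = bisect.bisect_right(INTERVALS[name], features[name])
        parts.append(f"{name[0].upper()}{bucket}")
    return "-".join(parts)
```

Two production runs whose feature values fall into the same intervals thus share one working condition code, which is what allows their training records to be matched and reused.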
Machine learning: machine learning evolved in the second half of the 20th century as a branch of artificial intelligence that makes predictions by acquiring knowledge from data through self-learning algorithms. Machine learning does not require large amounts of data to be analyzed manually in advance before rules are extracted and models are built; instead, it provides a more efficient way to capture the knowledge in data and gradually improve the performance of predictive models, so as to complete data-driven decisions.
At present, when a traditional machine learning training set is managed and controlled, a new training set is created every time and existing historical training data are not fully utilized, so machine learning efficiency is low; moreover, training data that caused quality problems in production batches are not eliminated, so machine learning accuracy is poor.
In order to solve the above problem, an embodiment of the present application provides a data management and control method, which may be applied to the field of manufacturing business. By the method, when an enterprise needs to perform machine learning in different working conditions, a learning task and a training set are determined based on reference working condition information stored in the database, abnormal training data in the training set are eliminated, and machine learning accuracy is guaranteed while machine learning efficiency is improved. The application can be applied to various application scenarios requiring machine learning training set management, including but not limited to the above-mentioned application scenarios.
The following describes a system architecture according to an embodiment of the present application.
Referring to fig. 1, fig. 1 is a schematic diagram of a system architecture according to an embodiment of the present disclosure. As shown in fig. 1, the system architecture 10 includes a server 11 and a client 12, where the server 11 is in communication connection with the client 12, and when determining a target learning task and a first training set corresponding to current working condition information according to the current working condition information and reference working condition information pre-stored in a learning training set database, the server 11 may send corresponding messages to the client 12 according to different detection results to direct the client 12 to send different instructions, such as a task creation instruction, a task merging instruction, or a training data use confirmation instruction, so as to obtain the target learning task and the first training set according to the instructions. The server 11 may be a server, a server cluster composed of a plurality of servers, or a cloud computing service center, and the client 12 may be a mobile phone terminal, a tablet computer, a notebook computer, a vehicle-mounted terminal, or the like.
A data management and control method provided in the embodiments of the present application is described below.
Referring to fig. 2, fig. 2 is a schematic flowchart of a data management and control method provided in an embodiment of the present application, where the method is applied to a server, and as shown in fig. 2, the data management and control method includes:
step 201, determining a target learning task and a first training set corresponding to current working condition information according to the current working condition information and reference working condition information prestored in a learning training set database.
The target learning task is used for indicating that the production parameters under the current working condition are learned through a preset model, and the first training set is used for indicating the training data set used for learning in the target learning task. The current working condition information refers to a new working condition requiring machine learning that appears in the current production batch; the information of this new working condition is the current working condition information and may include, for example, a working condition type and a working condition number. If the current working condition is an old working condition, the current production mode of that old working condition can be further judged and different production strategies applied according to different production modes, without creating or merging learning tasks; therefore, the current working condition information in the present application refers to a new working condition requiring machine learning. The learning training set database stores training records of historical working conditions. Each historical working condition has a unique working condition number used to distinguish different working conditions, and each working condition may have one, multiple, or no corresponding learning tasks and corresponding historical training data.
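Because a working condition may have one, several, or no historical learning tasks and historical training data, step 201 must first decide which situation applies. A minimal sketch of that decision follows; the record layout (a dict of `tasks` and `data` lists keyed by working condition number) is an assumption made for illustration.

```python
def classify_condition(condition_no, database):
    """Decide which branch of step 201 applies for a new working condition.

    Returns one of:
      'new_task'    - no historical task and no historical data,
      'merge_tasks' - one or more historical learning tasks exist,
      'reuse_data'  - no historical task, but historical data exist.
    """
    record = database.get(condition_no)
    if record is None or (not record["tasks"] and not record["data"]):
        return "new_task"
    if record["tasks"]:
        return "merge_tasks"
    return "reuse_data"
```

These three return values correspond to the three possible examples elaborated later in the description.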
Step 202, removing the first training data set with problems from the first training set to obtain a second training set.
The quality problems include: abnormal production equipment parameters (i.e., process parameters) of a production batch, or abnormal quality inspection of production materials; such quality problems cause the quality of the final product to be unqualified. The current working condition information is working condition information appearing under the target production batch, so when the learning task is deployed for this working condition information, training data that may cause quality problems of the target production batch must be removed from the training set to obtain a training set without abnormal data. Machine learning is then performed with this training set, which ensures machine learning accuracy.
And 203, training according to the second training set and the target learning task to obtain a target learning result.
The target learning result is used for representing that machine learning of the current working condition information of the target production batch is completed. The target learning result includes production parameters corresponding to the current working condition information; the production parameters may be production equipment parameters or component parameters, and the component parameters may include raw material parameters and intermediate product parameters. Meanwhile, the current working condition information is associated with the working condition code and the second training set, and the second training set is marked by the server: the training data in the second training set are recorded as historical training data of the current working condition, the target learning task is recorded as a historical learning task of the current working condition, and the target learning result includes the recorded results.
And 204, storing the target learning result into the learning training set database to obtain an updated learning training set database.
After the target learning result is stored in the learning training set database, the training data and the learning task can be used for machine learning tasks under new working conditions in later production, avoiding the reduction in machine learning efficiency caused by having to create a new training set every time machine learning is performed.
It can be seen that, in this example, a target learning task and a first training set corresponding to current working condition information are determined according to the current working condition information and reference working condition information prestored in a learning training set database; first training data causing quality problems in the target production batch corresponding to the current working condition information are removed from the first training set to obtain a second training set; training is performed according to the second training set and the target learning task to obtain a target learning result; and the target learning result is stored in the learning training set database to obtain an updated learning training set database. Because all training data are stored in the database for later reuse in machine learning, machine learning efficiency can be effectively improved; and because abnormal training data are eliminated, machine learning accuracy is guaranteed.
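Steps 201 through 204 can be sketched end to end as a small pipeline. This is an illustrative sketch only; all function names and the dictionary-based database layout are hypothetical and not taken from the application.

```python
def determine_task_and_set(condition_no, database):
    """Step 201: match the working condition number against prestored records."""
    record = database.get(condition_no, {"tasks": [], "data": []})
    target_task = {"condition_no": condition_no,
                   "merged_from": list(record["tasks"])}
    first_training_set = list(record["data"])
    return target_task, first_training_set

def remove_problem_data(training_set):
    """Step 202: drop training data flagged as causing quality problems."""
    return [d for d in training_set if not d.get("problematic", False)]

def train(training_set, task):
    """Step 203: stand-in for actual model training; returns a 'result'."""
    return {"task": task, "n_samples": len(training_set)}

def store_result(result, database):
    """Step 204: record the result so later tasks can reuse it."""
    no = result["task"]["condition_no"]
    record = database.setdefault(no, {"tasks": [], "data": []})
    record["tasks"].append(result["task"])
    return database
```

In use, the four functions are called in order for each new working condition, and the updated database is carried forward to the next production batch.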
In one possible example, the determining, according to the current working condition information and reference working condition information prestored in a learning training set database, a target learning task and a first training set corresponding to the current working condition information includes: detecting that no historical learning task corresponding to the current working condition information exists in the reference working condition information, and no historical training data corresponding to the current working condition information exists; acquiring an original training set with X training data; receiving a task creating instruction from a client; creating the target learning task according to the task creating instruction; and determining the first training set according to the task creating instruction and the original training set with X training data, wherein the value of X is greater than a target preset value.
When the server detects that neither a corresponding historical learning task nor corresponding historical training data exists in the learning training set database, no usable historical training data are available, so the server needs to create a new learning task and acquire an original training set for machine learning. Referring to fig. 3, fig. 3 is a schematic diagram of a first client interaction page provided in an embodiment of the present application: the server sends a message to the client, the interaction page shown in fig. 3 pops up on the client interface, and the user clicks the "create task" control so that the client sends a task creation instruction to the server. Because the database has no usable historical training data, the acquired original training set is the first training set. To ensure the learning effect, the quantity of training data in the first training set must be greater than a target preset value, whose size is determined flexibly according to different working conditions.
It can be seen that, in this example, when the server detects that no usable historical data exist in the database, it creates a new learning task and acquires the original training set as the first training set. Setting a minimum quantity of training data in the first training set ensures the machine learning effect and avoids deviations in the learning effect caused by insufficient training samples.
In one possible example, the determining, according to the current working condition information and reference working condition information prestored in a learning training set database, a target learning task and a first training set corresponding to the current working condition information includes: detecting that a plurality of historical learning tasks corresponding to the current working condition information and a historical training set corresponding to the historical learning tasks exist in the reference working condition information; acquiring an original training set with X training data; receiving a task merging instruction and a task selecting instruction from a client; selecting N historical learning tasks through the task selection instruction; merging the N historical learning tasks through the task merging instruction to obtain the target learning task; and combining the original training set with the X training data and the historical training set with the Y historical training data corresponding to the N historical learning tasks through the task merging instruction to obtain the first training set, wherein the value of X + Y is greater than a target preset value.
The server detects that multiple historical learning tasks corresponding to the current working condition information exist in the learning training set database. To improve machine learning efficiency, these historical learning tasks are merged into the target learning task, so that the historical training sets in them can also be merged. Because the merged historical training set may not reach the required quantity of training data, or part of the training data may be unavailable, the original training set is still acquired before the learning tasks are merged. The X training data in the original training set can then be adjusted according to the quantity Y of historical training data after merging, as long as the final quantity (X + Y) of training data in the first training set is greater than the target preset value. Referring to fig. 4, fig. 4 is a schematic diagram of a second client interaction page provided in an embodiment of the present application: the server sends a message to the client, and the interaction interface shown in fig. 4, which displays all mergeable historical learning tasks, pops up at the client. The user sends a task selection instruction to the server by checking the check boxes in front of the historical learning tasks, and then clicks the "merge task" control so that the client sends the task merging instruction to the server. The server merges the selected historical learning tasks according to the task selection instruction and the task merging instruction to obtain the target learning task, and simultaneously merges the original training set and the historical training set to obtain a first training set with (X + Y) training data.
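The size requirement on the merged set (X + Y greater than the target preset value) can be sketched as follows. The threshold of 50 is an arbitrary placeholder, since the application leaves the target preset value to be chosen flexibly per working condition.

```python
TARGET_MIN = 50  # hypothetical target preset value, chosen per working condition

def merge_training_sets(original, historical):
    """Combine the original set (X items) with the merged historical
    sets (Y items) and enforce X + Y > target preset value."""
    combined = list(original) + list(historical)
    if len(combined) <= TARGET_MIN:
        raise ValueError(
            f"first training set too small: {len(combined)} <= {TARGET_MIN}")
    return combined
```

The same check applies when historical training data alone are reused (the X + Z case), since only the source of the extra items differs.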
It can be seen that, in this example, when the server detects that usable historical learning tasks and historical training sets exist in the database, it merges them, which streamlines the machine learning process and avoids the reduction in machine learning efficiency caused by repeatedly creating training sets. Setting a minimum quantity of training data in the first training set ensures the machine learning effect and avoids deviations in the learning effect caused by insufficient training samples.
In one possible example, after the merging, by the task merging instruction, the original training set with X training data and the historical training set with Y historical training data corresponding to the N historical learning tasks to obtain the first training set, the method further includes: receiving a task integration instruction from a client; reselecting T historical learning tasks through the task integration instruction and combining the T historical learning tasks to obtain the target learning task; and combining the original training set with the X training data and the historical training set with the S historical training data corresponding to the T historical learning tasks through the task integration instruction to obtain the first training set, wherein the value of X + S is greater than a target preset value.
Referring to fig. 5, fig. 5 is a schematic diagram of a third client interaction page provided in an embodiment of the present application, and as shown in fig. 5, a user may reselect T history learning tasks to be combined and check a check box in front of the history learning tasks, and then click a "confirm" control, so that the client sends the task integration instruction to the server. And the server merges the T reselected historical learning tasks according to the task integration instruction to obtain a target learning task, merges an original training set with X training data and a historical training set with S historical training data corresponding to the T historical learning tasks to obtain a first training set with (X + S) training data.
Therefore, in this example, the user can re-select the merged historical learning task to enable the server to re-integrate the training data in the training set, so that the machine learning process is ensured to be performed under the expectation of the user, and the flexibility of the scheme is increased.
In one possible example, the determining, according to the current working condition information and reference working condition information prestored in a learning training set database, a target learning task and a first training set corresponding to the current working condition information includes: detecting that no historical learning task corresponding to the current working condition information exists in the reference working condition information, but Z pieces of historical training data corresponding to the current working condition information exist; acquiring an original training set with X training data; receiving a task creating instruction and a training data use confirmation instruction from a client; creating the target learning task according to the task creating instruction; and combining the original training set with the X training data and the Z historical training data by using a confirmation instruction according to the training data to obtain the first training set, wherein the value of X + Z is greater than a target preset value.
The server detects that no historical learning task exists in the learning training set database, but historical training data corresponding to the current working condition information exist; in this case, to improve machine learning efficiency, the historical training data can be fully utilized for training. Because there is no mergeable historical learning task, a new learning task still needs to be created to associate the training data with the working condition; and because the quantity of historical training data may not reach the required standard, or part of the historical training data may be unavailable, an original training set is still acquired when the new learning task is created. The X training data in the original training set can then be adjusted according to the quantity Z of historical training data, as long as the final quantity (X + Z) of training data in the first training set is greater than the target preset value. Referring to fig. 6, fig. 6 is a schematic diagram of a fourth client interaction page provided in an embodiment of the present application: the server sends a message to the client, and the interaction interface shown in fig. 6 pops up at the client, displaying a "training data use confirmation" status bar indicating that Z batches of usable training data have been detected and asking the user to confirm their use. The user clicks the "add" control so that the client sends a training data use confirmation instruction to the server, and the server combines the original training set of X training data with the Z historical training data according to the instruction to obtain a first training set with (X + Z) training data. The user then clicks the "create task" control so that the client sends a task creation instruction to the server, and the server creates the target learning task according to the instruction and associates it with the current working condition information.
It can be seen that, in this example, by detecting that no available historical learning task exists in the database but available historical training data does exist, the server combines the original training set with the historical training data, optimizing the machine learning process and avoiding the reduced efficiency caused by repeatedly creating training sets. In addition, by setting a minimum number of training data in the first training set, the effect of machine learning is ensured, avoiding deviations in the learning effect caused by insufficient training samples.
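The merge-and-threshold step described above can be sketched as follows. This is an illustrative sketch only: the function name `build_first_training_set` and the data layout are assumptions, not part of the disclosed method.

```python
def build_first_training_set(original_set, historical_data, target_min):
    # Combine an original training set (X training data) with Z pieces of
    # historical training data; the resulting first training set must
    # contain more than the target preset value of training data.
    combined = list(original_set) + list(historical_data)
    if len(combined) <= target_min:
        # In the disclosed scheme, X can be adjusted until X + Z exceeds
        # the target preset value; here we simply signal the shortfall.
        raise ValueError(
            f"first training set has {len(combined)} items; "
            f"needs more than {target_min}")
    return combined

# X = 8 original training data, Z = 3 historical training data, threshold 10
first_set = build_first_training_set(list(range(8)), ["h1", "h2", "h3"], 10)
# len(first_set) == 11 > 10
```

The check mirrors the requirement that X + Z be greater than the target preset value before the first training set is accepted.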
In one possible example, the removing the problematic first training data from the first training set to obtain a second training set includes: setting the first training set to be in a usable state, wherein the first training set comprises M training data, M is a positive integer and is greater than the target preset value, and the M training data comprise M process parameters and production results corresponding to the M process parameters; obtaining a target production batch corresponding to the current working condition information; determining that the production results corresponding to the M process parameters comprise A abnormal results, wherein an abnormal result indicates that the value of at least one parameter in the production result is not within a preset target value interval; and clearing the A abnormal results in the first training set and the process parameters corresponding to the A abnormal results to obtain the second training set.
An abnormal result means that one or more of the values of the parameters included in a production result do not conform to their corresponding target value intervals; such a production result is an abnormal production result. Referring to fig. 7, fig. 7 is a schematic diagram of clearing abnormal training data according to an embodiment of the present application. The target production lot is a lot of the production process of product A, the production lot number is 000111, the working condition is the parameter setting standard of device C, and the working condition number is 121112111. Multiple trainings are performed on different process parameters under this working condition, for example 9 trainings, so there are 9 training data in the first training set. The server numbers the 9 training data, obtaining nine numbers N1, N2, ..., N9, and determines whether the value of parameter K included in the production result corresponding to each number is within the preset target value interval. If the values of parameter K for N2, N5, and N7 are not within the preset target value interval, the training data represented by N2, N5, and N7 are abnormal data and are removed, and the data set formed by the remaining 6 training data is the second training set. Similarly, if the value of parameter L corresponding to a certain production result is not within its preset target value interval, that production result is also an abnormal result.
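The clearing step in the fig. 7 example can be sketched as follows. This is a minimal illustration; the dictionary layout, the function name `clear_abnormal`, and the interval for parameter K are assumptions chosen to reproduce the N2/N5/N7 scenario.

```python
def clear_abnormal(first_set, target_intervals):
    # Keep only training data whose production result has every parameter
    # value inside its preset target value interval; data with any value
    # outside an interval are abnormal and are removed.
    second = []
    for item in first_set:
        abnormal = any(
            not (lo <= item["result"][name] <= hi)
            for name, (lo, hi) in target_intervals.items()
            if name in item["result"])
        if not abnormal:
            second.append(item)
    return second

# Nine training data N1..N9; assume parameter K must lie in [4.5, 5.5]
data = [{"id": f"N{i}", "result": {"K": k}}
        for i, k in enumerate([5.0, 9.9, 5.1, 5.2, 9.8, 5.0, 9.7, 5.1, 5.0], 1)]
second_set = clear_abnormal(data, {"K": (4.5, 5.5)})
# N2, N5, N7 are removed; the remaining 6 training data form the second set
```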
In a specific implementation, it is determined that the production results corresponding to the M process parameters include A abnormal results, where the value of the parameter K is not within the preset target value interval, and the method further includes: determining the target production equipment corresponding to the target parameter whose value is not within the preset target interval; comparing the process parameters of the target production equipment corresponding to the M-A normal results with the process parameters of the target production equipment corresponding to the A abnormal results, and determining the target process parameter with the abnormality; acquiring, from the M-A normal results, the N normal results whose first process parameters differ from the first process parameters corresponding to the A abnormal results by less than a preset value, where the first process parameters do not include the target process parameter; acquiring the parameter values of the target process parameter corresponding to the N normal results to obtain a parameter interval of the target process parameter, where the parameter values of the target process parameter corresponding to the A abnormal results are not within the parameter interval; acquiring the target production batches corresponding to the interval endpoint values of the parameter interval; determining the target production results corresponding to the target production batches; determining the influence trend of the target process parameter on the parameter values of the target product; determining a reference first process parameter according to the first process parameters; and generating a reference result according to the target production results, the reference target process parameter, the reference first process parameter, and the influence trend, where the value of the reference target process parameter is within the parameter interval.
For example, among the 9 training data, the production result numbered N2 is an abnormal result: the value of parameter K for N2 is not within the preset target value interval, parameter K is a parameter corresponding to device P, and the process parameter corresponding to device P is Z. At this time, the training data whose first process parameters differ little from those of N2 are determined among the other 8 training data, where the first process parameters include all production parameters except the target process parameter corresponding to device P. Suppose the production parameters include two: the process parameter corresponding to device P is parameter Z, the process parameter corresponding to the other equipment is parameter X, and the value of parameter X in the production lot corresponding to N2 is 40. Among the 9 training data, the lots whose value of parameter X lies in 39.8-40.2 are N3 and N4, so a parameter interval can be determined from the values of the target process parameter Z in the productions of these two lots. For example, the value of parameter Z in N3 is 83 and the value of parameter Z in N4 is 78; if the value of parameter Z corresponding to the N2 lot is 80, the parameter interval of parameter Z can be 80-83 or 78-80.
Therefore, the reference first process parameter corresponding to parameter X may be determined in the interval 39.8-40.2, and the reference target process parameter may be determined in the interval 80-83 or 78-80. The reference parameters determined in these intervals are then combined to form a plurality of data sets, and the influence trend can be determined, for example: the larger the value of process parameter Z, the larger the value corresponding to the production result. Corresponding reference production results can then be generated for each data set according to the reference first process parameter, the influence trend, and the production results corresponding to the two lots N3 and N4, thereby obtaining reference training data, which is added to the second training set.
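The interval determination for the target process parameter can be sketched as follows; this is a hedged illustration of the N2/N3/N4 example above, and the function name `parameter_interval` is an assumption.

```python
def parameter_interval(abnormal_value, neighbor_values):
    # For an abnormal lot, form candidate intervals of the target process
    # parameter between its own value and the values observed in the
    # nearest normal lots (those whose other process parameters differ
    # from the abnormal lot's by less than the preset value).
    intervals = []
    for v in neighbor_values:
        lo, hi = sorted((abnormal_value, v))
        intervals.append((lo, hi))
    return intervals

# N2's parameter Z is 80; the nearest normal lots N3 and N4 have Z = 83
# and Z = 78, giving the candidate intervals 80-83 and 78-80 within which
# a reference target process parameter can be chosen.
candidates = parameter_interval(80, [83, 78])
```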
Therefore, in this example, the abnormal data in the first training set are removed in advance, thereby improving the precision of machine learning.
In one possible example, after storing the target learning result in the learning training set database to obtain an updated learning training set database, the method further includes: acquiring a query request from a client; determining that a learning task exists in the current production batch according to the query request; inquiring the learning training set database to obtain a training data set corresponding to the learning task; and sending the training data set to the client, wherein the training data set is used for configuring parameter values of the production equipment for carrying out the current production batch.
After the machine learning process corresponding to the working condition is finished, the learning result is stored into the learning training set database to update its data for reuse in subsequent machine learning. Specifically, during a certain production process, parameter configuration needs to be performed on the production equipment of the current production batch. At this time, whether a learning task exists for the current production batch is first queried from the database; if such a learning task exists, the training data set corresponding to the learning task is obtained, and the parameter configuration of the production equipment is completed according to that training data set. In this way, the original purpose of the machine learning can be achieved: a plurality of training data meeting the standard are obtained through machine learning, and these training data are the parameter values of the various production parameters in the actual production process, so the qualified training data can be applied directly in actual production, improving production efficiency.
Therefore, in this example, during actual production, available learning tasks are first detected for the current production batch, and the training data in those learning tasks are pulled directly for parameter configuration, optimizing the parameter configuration process and improving production efficiency.
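The query flow described above can be sketched as follows. The database layout, key names, and the function `configure_batch` are illustrative assumptions, not an API disclosed by the application.

```python
def configure_batch(batch_id, db, send_to_client):
    # Query whether a learning task exists for the current production batch;
    # if so, pull its training data set and send it to the client, which
    # uses it to configure parameter values of the production equipment.
    task = db.get("tasks", {}).get(batch_id)
    if task is None:
        return None  # no learning task exists for this batch
    training_set = db["training_sets"][task]
    send_to_client(training_set)
    return training_set

db = {"tasks": {"000111": "task-A"},
      "training_sets": {"task-A": [{"K": 5.0, "X": 40.0}]}}
sent = []
result = configure_batch("000111", db, sent.append)
# result is the training data set associated with lot 000111's learning task
```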
Referring to fig. 8a, in accordance with the above-mentioned embodiments, fig. 8a is a block diagram of functional units of a data management and control apparatus provided in an embodiment of the present application. The apparatus is applied to a server, and as shown in fig. 8a, the data management and control apparatus 80 includes: a determining unit 801, configured to determine, according to current working condition information and reference working condition information prestored in a learning training set database, a target learning task and a first training set corresponding to the current working condition information, where the target learning task is used to instruct a preset model to learn production parameters under the current working condition, and the first training set indicates the training data set used for learning in the target learning task; a clearing unit 802, configured to clear the problematic first training data from the first training set to obtain a second training set; a training unit 803, configured to perform training according to the second training set and the target learning task to obtain a target learning result; and a storage unit 804, configured to store the target learning result in the learning training set database to obtain an updated learning training set database.
In one possible example, in the aspect of determining the target learning task and the first training set corresponding to the current operating condition information according to the current operating condition information and reference operating condition information pre-stored in a learning training set database, the determining unit 801 is specifically configured to: detecting that no historical learning task corresponding to the current working condition information exists in the reference working condition information, and no historical training data corresponding to the current working condition information exists; acquiring an original training set with X training data; receiving a task creating instruction from a client; creating the target learning task according to the task creating instruction; and determining the first training set according to the task creating instruction and the original training set with X training data, wherein the value of X is greater than a target preset value.
In one possible example, in the aspect of determining the target learning task and the first training set corresponding to the current operating condition information according to the current operating condition information and reference operating condition information pre-stored in a learning training set database, the determining unit 801 is specifically configured to: detecting that a plurality of historical learning tasks corresponding to the current working condition information and a historical training set corresponding to the historical learning tasks exist in the reference working condition information; acquiring an original training set with X training data; receiving a task merging instruction and a task selecting instruction from a client; selecting N historical learning tasks through the task selection instruction; merging the N historical learning tasks through the task merging instruction to obtain the target learning task; and combining the original training set with the X training data and the historical training set with the Y historical training data corresponding to the N historical learning tasks through the task merging instruction to obtain the first training set, wherein the value of X + Y is greater than a target preset value.
In a possible example, after the combining the original training set with X training data and the historical training set with Y historical training data corresponding to the N historical learning tasks by the task merging instruction to obtain the first training set, the determining unit 801 is further configured to: receiving a task integration instruction from a client; reselecting T historical learning tasks through the task integration instruction and combining the T historical learning tasks to obtain the target learning task; and combining the original training set with the X training data and the historical training set with the S historical training data corresponding to the T historical learning tasks through the task integration instruction to obtain the first training set, wherein the value of X + S is larger than a target preset value.
In one possible example, in the aspect of determining the target learning task and the first training set corresponding to the current operating condition information according to the current operating condition information and reference operating condition information pre-stored in a learning training set database, the determining unit 801 is specifically configured to: detecting that no historical learning task corresponding to the current working condition information exists in the reference working condition information, but Z pieces of historical training data corresponding to the current working condition information exist; acquiring an original training set with X training data; receiving a task creating instruction and a training data use confirmation instruction from a client; creating the target learning task according to the task creating instruction; and combining the original training set with X training data and the Z pieces of historical training data according to the training data use confirmation instruction to obtain the first training set, wherein the value of X + Z is greater than a target preset value.
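The three detection cases handled by the determining unit 801 (historical tasks to merge; no task but Z historical data; neither tasks nor data) can be sketched together as follows. The function name, data layout, and return convention are assumptions for illustration only.

```python
def determine_task_and_first_set(historical_tasks, historical_data,
                                 original_set, target_min):
    # Dispatch over the three detection cases described above:
    #  - historical tasks exist: merge them and combine their training
    #    sets with the original set (X + Y > target preset value);
    #  - no task but historical data exist: create a task and combine the
    #    original set with the Z historical data (X + Z > target preset value);
    #  - neither exists: create a task and use the original set alone
    #    (X > target preset value).
    if historical_tasks:
        first = list(original_set) + [d for t in historical_tasks
                                      for d in t["data"]]
        action = "merge"
    elif historical_data:
        first = list(original_set) + list(historical_data)
        action = "create"
    else:
        first = list(original_set)
        action = "create"
    if len(first) <= target_min:
        raise ValueError("first training set below target preset value")
    return action, first
```

In every branch the size check mirrors the requirement that X, X + Y, or X + Z exceed the target preset value.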
In a possible example, in the aspect of removing the problematic first training data from the first training set to obtain a second training set, the clearing unit 802 is specifically configured to: setting the first training set to be in a usable state, wherein the first training set comprises M training data, M is a positive integer and is greater than the target preset value, and the M training data comprise M process parameters and production results corresponding to the M process parameters; obtaining a target production batch corresponding to the current working condition information; determining that the production results corresponding to the M process parameters comprise A abnormal results, wherein an abnormal result indicates that the value of at least one parameter in the production result is not within a preset target value interval; and clearing the A abnormal results in the first training set and the process parameters corresponding to the A abnormal results to obtain the second training set.
In one possible example, after storing the target learning result into the learning training set database to obtain an updated learning training set database, the data management and control device 80 is further configured to: acquiring a query request from a client; determining that a learning task exists in the current production batch according to the query request; inquiring the learning training set database to obtain a training data set corresponding to the learning task; and sending the training data set to the client, wherein the training data set is used for configuring parameter values of the production equipment for carrying out the current production batch.
It can be understood that, since the method embodiment and the apparatus embodiment are different presentation forms of the same technical concept, the content of the method embodiment portion in the present application should be synchronously adapted to the apparatus embodiment portion, and is not described herein again.
In the case of using an integrated unit, as shown in fig. 8b, fig. 8b is a block diagram of functional units of another data management and control apparatus provided in an embodiment of the present application. In fig. 8b, the data management and control apparatus 81 includes: a processing module 812 and a communication module 811. The processing module 812 is used to control and manage the actions of the data management and control apparatus, e.g., to perform the steps of the determining unit 801, the clearing unit 802, the training unit 803, and the storage unit 804, and/or to perform other processes of the techniques described herein. The communication module 811 is used to support interaction between the data management and control apparatus and other devices. As shown in fig. 8b, the data management and control apparatus may further include a storage module 813, which is used to store program codes and data of the data management and control apparatus. The data management and control apparatus 81 may be the aforementioned data management and control apparatus 80.
The processing module 812 may be a processor or a controller, for example, a Central Processing Unit (CPU), a general-purpose processor, a Digital Signal Processor (DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination implementing computing functions, e.g., a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 811 may be a transceiver, an RF circuit, a communication interface, or the like. The storage module 813 may be a memory.
All relevant contents of each scene related to the method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. The data management apparatus 81 can perform the data management method shown in fig. 2.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. The procedures or functions according to the embodiments of the present application are wholly or partially generated when the computer instructions or the computer program are loaded or executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another computer readable storage medium, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire or wirelessly. The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains one or more collections of available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
Fig. 9 is a block diagram of a server according to an embodiment of the present disclosure. As shown in fig. 9, the server 900 may include one or more of the following components: a processor 901, a memory 902 coupled to the processor 901, wherein the memory 902 may store one or more computer programs that may be configured to implement the methods as described in the embodiments above when executed by the one or more processors 901. The server 900 may be the aforementioned server 11.
The processor 901 may include one or more processing cores. The processor 901 connects various parts within the overall server 900 using various interfaces and lines, and performs various functions of the server 900 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 902 and calling data stored in the memory 902. Optionally, the processor 901 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 901 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like, where the CPU mainly handles the operating system, user interface, application programs, and so on; the GPU is responsible for rendering and drawing display content; and the modem handles wireless communications. It is understood that the modem may also not be integrated into the processor 901 and may instead be implemented by a separate communication chip.
The Memory 902 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 902 may be used to store instructions, programs, code, sets of codes, or sets of instructions. The memory 902 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described above, and the like. The storage data area may also store data created by the server 900 in use, and the like.
It is understood that the server 900 may include more or fewer structural elements than those shown in the above structural block diagram, for example, a power module, physical buttons, a WiFi (Wireless Fidelity) module, a speaker, a Bluetooth module, sensors, and the like, which are not limited herein.
Embodiments of the present application further provide a computer storage medium, where the computer storage medium stores a computer program for electronic data exchange, and the computer program makes a computer execute part or all of the steps of any one of the methods as described in the above method embodiments.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any of the methods as described in the above method embodiments. The computer program product may be a software installation package.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not imply any order of execution, and the order of execution of the processes should be determined by their functions and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed method, apparatus and system may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative; for example, the division of the unit is only a logic function division, and there may be another division manner in actual implementation; for example, various elements or components may be combined or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be physically included alone, or two or more units may be integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute some steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes: a U disk, a removable hard disk, a magnetic disk, an optical disk, a volatile memory, or a non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications can be made by those skilled in the art without departing from the spirit and scope of the present invention, including different functions, combinations of implementation steps, and software and hardware implementations, all of which fall within the scope of the present invention.

Claims (10)

1. A method for data management, the method comprising:
determining a target learning task and a first training set corresponding to the current working condition information according to the current working condition information and reference working condition information prestored in a learning training set database, wherein the target learning task is used for indicating that production parameters under the current working condition are learned through a preset model, and the first training set is used for indicating a training data set used for learning in the target learning task;
removing the problematic first training data from the first training set to obtain a second training set;
training according to the second training set and the target learning task to obtain a target learning result;
and storing the target learning result into the learning training set database to obtain an updated learning training set database.
2. The method of claim 1,
the method for determining the target learning task and the first training set corresponding to the current working condition information according to the current working condition information and reference working condition information prestored in a learning training set database comprises the following steps:
detecting that no historical learning task corresponding to the current working condition information exists in the reference working condition information, and no historical training data corresponding to the current working condition information exists;
acquiring an original training set with X training data;
receiving a task creating instruction from a client;
creating the target learning task according to the task creating instruction;
and determining the first training set according to the task creating instruction and the original training set with X training data, wherein the value of X is greater than a target preset value.
3. The method of claim 1,
the method for determining the target learning task and the first training set corresponding to the current working condition information according to the current working condition information and reference working condition information prestored in a learning training set database comprises the following steps:
detecting that a plurality of historical learning tasks corresponding to the current working condition information and a historical training set corresponding to the historical learning tasks exist in the reference working condition information;
acquiring an original training set with X training data;
receiving a task merging instruction and a task selecting instruction from a client;
selecting N historical learning tasks through the task selection instruction;
merging the N historical learning tasks through the task merging instruction to obtain the target learning task;
and combining the original training set with the X training data and the historical training set with the Y historical training data corresponding to the N historical learning tasks through the task merging instruction to obtain the first training set, wherein the value of X + Y is greater than a target preset value.
4. The method of claim 3, wherein after the combining the original training set with X training data and the historical training set with Y historical training data corresponding to the N historical learning tasks by the task merging instruction to obtain the first training set, the method further comprises:
receiving a task integration instruction from a client;
reselecting T historical learning tasks through the task integration instruction and combining the T historical learning tasks to obtain the target learning task;
and combining, through the task integration instruction, the original training set of X training data with the historical training sets of S historical training data corresponding to the T historical learning tasks to obtain the first training set, wherein X + S is greater than the target preset value.
5. The method of claim 1, wherein the determining a target learning task and a first training set corresponding to the current working condition information according to the current working condition information and reference working condition information prestored in a learning training set database comprises:
detecting that no historical learning task corresponding to the current working condition information exists in the reference working condition information, but Z pieces of historical training data corresponding to the current working condition information exist;
acquiring an original training set with X training data;
receiving a task creating instruction and a training data use confirmation instruction from a client;
creating the target learning task according to the task creating instruction;
and combining, according to the training data use confirmation instruction, the original training set of X training data with the Z pieces of historical training data to obtain the first training set, wherein X + Z is greater than the target preset value.
6. The method according to any one of claims 2-5, wherein the clearing of problematic first training data from the first training set to obtain a second training set comprises:
setting the first training set to be in a usable state, wherein the first training set comprises M training data, M is a positive integer and is greater than the target preset value, and the M training data comprises M process parameters and production results corresponding to the M process parameters;
determining that the production results corresponding to the M process parameters comprise A abnormal results, wherein the abnormal results are used for indicating that the value of at least one parameter in the production results is not within a preset target value interval;
and clearing the A abnormal results and the process parameters corresponding to the A abnormal results from the first training set to obtain the second training set.
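The claim-6 cleaning step can be sketched as a simple filter. The field names and the interval value are hypothetical; the patent only requires that a result is abnormal when at least one of its parameter values falls outside the preset target value interval.

```python
# Hypothetical sketch of claim 6: each training datum pairs process parameters
# with a production result; a result is abnormal when at least one of its
# values lies outside the preset target value interval, and abnormal data are
# cleared from the first training set together with their process parameters.
# Field names and the interval are illustrative assumptions.

TARGET_INTERVAL = (0.0, 1.0)  # assumed preset target value interval

def clear_abnormal_results(first_training_set, interval=TARGET_INTERVAL):
    """Return the second training set with abnormal results removed."""
    lo, hi = interval
    def is_normal(datum):
        return all(lo <= v <= hi for v in datum["result"].values())
    return [d for d in first_training_set if is_normal(d)]
```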
7. The method according to any one of claims 1 to 6, wherein after storing the target learning result in the learning training set database to obtain an updated learning training set database, the method further comprises:
acquiring a query request from a client;
determining, according to the query request, that a learning task exists for the current production batch;
inquiring the learning training set database to obtain a training data set corresponding to the learning task;
and sending the training data set to the client, wherein the training data set is used for configuring parameter values of the production equipment for the current production batch.
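The claim-7 query flow can be sketched as a small lookup handler. The database layout and all names are illustrative assumptions about how the learning training set database might be keyed.

```python
# Hypothetical sketch of claim 7: a client query is answered by checking
# whether the current production batch has a learning task, looking up that
# task's training data set in the learning training set database, and
# returning the set so the client can configure production equipment
# parameters. The database layout and names are illustrative assumptions.

def handle_query(query, learning_db):
    """Return the training data set for the batch named in the query, if any."""
    task = learning_db["task_by_batch"].get(query["batch_id"])
    if task is None:
        return None  # no learning task exists for this production batch
    return learning_db["training_set_by_task"][task]
```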
8. A data management and control apparatus, characterized in that the apparatus comprises:
a determining unit, configured to determine a target learning task and a first training set corresponding to current working condition information according to the current working condition information and reference working condition information prestored in a learning training set database, wherein the target learning task is used for indicating that production parameters under the current working condition are learned through a preset model, and the first training set is used for indicating a training data set used for learning in the target learning task;
a clearing unit, configured to clear problematic first training data from the first training set to obtain a second training set;
a training unit, configured to perform training according to the second training set and the target learning task to obtain a target learning result;
and a storage unit, configured to store the target learning result into the learning training set database to obtain an updated learning training set database.
9. A server, comprising a processor, a memory, and a communication interface, wherein one or more programs are stored in the memory and configured to be executed by the processor, the one or more programs comprising instructions for performing the steps in the method of any one of claims 1-7.
10. A computer-readable storage medium or computer program product, wherein the computer-readable storage medium stores a computer program that causes a computer to perform the method of any one of claims 1-7, or the computer program product causes a computer to perform the method of any one of claims 1-7.
CN202210683587.5A 2022-02-11 2022-02-11 Data management and control method, related device and medium program product Pending CN115130679A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210683587.5A CN115130679A (en) 2022-02-11 2022-02-11 Data management and control method, related device and medium program product

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210126729.8A CN114169536B (en) 2022-02-11 2022-02-11 Data management and control method and related device
CN202210683587.5A CN115130679A (en) 2022-02-11 2022-02-11 Data management and control method, related device and medium program product

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202210126729.8A Division CN114169536B (en) 2022-02-11 2022-02-11 Data management and control method and related device

Publications (1)

Publication Number Publication Date
CN115130679A true CN115130679A (en) 2022-09-30

Family

ID=80489783

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202210683587.5A Pending CN115130679A (en) 2022-02-11 2022-02-11 Data management and control method, related device and medium program product
CN202210126729.8A Active CN114169536B (en) 2022-02-11 2022-02-11 Data management and control method and related device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202210126729.8A Active CN114169536B (en) 2022-02-11 2022-02-11 Data management and control method and related device

Country Status (1)

Country Link
CN (2) CN115130679A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116959696A (en) * 2023-09-20 2023-10-27 武汉光盾科技有限公司 Data processing method and device based on laser therapeutic instrument

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115222551A (en) * 2022-03-16 2022-10-21 希望知舟技术(深圳)有限公司 Production scheduling method, related apparatus, storage medium, and program product
CN114417739B (en) * 2022-03-29 2022-07-26 希望知舟技术(深圳)有限公司 Method and device for recommending process parameters under abnormal working conditions

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107665349B (en) * 2016-07-29 2020-12-04 腾讯科技(深圳)有限公司 Training method and device for multiple targets in classification model
EP3836037A4 (en) * 2018-07-10 2022-09-21 The Fourth Paradigm (Beijing) Tech Co Ltd Method and system for executing machine learning process
CN109447336B (en) * 2018-10-22 2021-11-05 国网四川省电力公司 Optimized control method for water level between upstream reservoir and reverse regulation reservoir dam thereof
CN109472484B (en) * 2018-11-01 2021-08-03 凌云光技术股份有限公司 Production process abnormity recording method based on flow chart
CN112801296B (en) * 2019-11-13 2024-05-31 阿里巴巴集团控股有限公司 Data processing method, device and system
CN111144548B (en) * 2019-12-23 2023-09-01 北京寄云鼎城科技有限公司 Method and device for identifying working condition of oil pumping well
CN112418484A (en) * 2020-10-27 2021-02-26 联想(北京)有限公司 Data processing method and device


Also Published As

Publication number Publication date
CN114169536A (en) 2022-03-11
CN114169536B (en) 2022-05-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination