CN114595211A - Product data cleaning method and system based on deep learning - Google Patents
- Publication number
- CN114595211A (application number CN202210089180.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- product
- cleaning
- training
- deep learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/08—Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
- G06Q10/087—Inventory or stock management, e.g. order filling, procurement or balancing against orders
- G06Q10/0875—Itemisation or classification of parts, supplies or services, e.g. bill of materials
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/04—Manufacturing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention provides a product data cleaning method and system based on deep learning. The method comprises the following steps: establishing a product data set by industry, establishing a data cleaning model based on a deep learning model, and training on the product data set with the data cleaning model to obtain a training data set and a test data set; acquiring the product data to be cleaned and inputting it into the trained data cleaning model to obtain a product cleaning result; and performing iterative cross-validation on the product cleaning result according to the material attributes until no abnormal data remains, then outputting the cleaning result. A deep learning data set based on the product data structure of machining and assembly manufacturing is established in advance, comprising industry standard product data and historical project manufacturing product data; the product data of a project is then cleaned by the data cleaning model trained on that data set.
Description
Technical Field
The invention relates to the technical field of data cleaning, in particular to a product data cleaning method and system based on deep learning.
Background
Data cleaning: the process of re-examining and verifying data in order to delete duplicate information, correct existing errors, and ensure data consistency. Data cleaning after data import is generally performed by a computer rather than by hand.
Material (item): any item used or consumed in the production of a product, including final products, parts, assemblies, composite parts, outsourced parts, raw materials and the like.
Material master file (item data): identifies and describes the attributes and information of each material used in the production process. The material master file mainly comprises:
1) Basic information: material code, material type, material classification and material name.
2) Design management information: such as design drawing number or formula (raw material, ingredient) number, design modification number or version, and the effective and expiry dates of the material.
3) Material management related information: such as unit of measure, material, specification, yield, ABC code, default warehouse sum or yes, category, current inventory, safety inventory, longest storage days, maximum inventory limit, cycle count interval, etc.
Bill of materials (BOM): a description of the composition of a product. It lists all the sub-assemblies, intermediate components, parts and raw materials required to produce the product, and gives the quantity of each child item required to make up its parent. It is sometimes also called a "recipe list", "matching list", "product structure list", "detailed list", "product detail list", etc.
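As an illustration of the parent/child quantity relationship that a BOM records, the following minimal sketch expands a small BOM into total component requirements. The product, component names and quantities are made up for the example, and the dictionary layout is only one possible representation, not a structure taken from the invention.

```python
from collections import defaultdict

# Hypothetical single-level BOM: parent -> list of (child, quantity per parent).
bom = {
    "bike": [("frame", 1), ("wheel", 2)],
    "wheel": [("rim", 1), ("spoke", 32)],
}

def explode(bom, item, qty=1, totals=None):
    """Accumulate the total quantity of every component below `item`."""
    if totals is None:
        totals = defaultdict(int)
    for child, per_parent in bom.get(item, []):
        totals[child] += qty * per_parent
        # Recurse so that grandchildren are scaled by the child's quantity.
        explode(bom, child, qty * per_parent, totals)
    return totals

requirements = dict(explode(bom, "bike"))
```

Here one "bike" needs two wheels, and each wheel needs 32 spokes, so the expansion multiplies quantities down the levels of the structure.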
Product data:
The production scheduling plan in a manufacturing ERP system is based on basic product data, which comprises descriptive data of products and materials, product structure data (BOM) and production process data. Product data mainly comes from the drawings, product detail lists, BOMs, material information, process routes and the like provided by the design and product research and development departments. Product data is imported into the ERP system mainly in the following ways:
1) paper product lists or electronic image files from the design department are entered manually;
2) Excel or CSV product lists provided by the design department are imported directly through a data interface of the ERP system;
3) data is imported directly from the PDM (product data management) system or PLM (product lifecycle management) system used by the design department through a data interface of the ERP system.
Product data needs cleaning for two reasons. First, manual entry involves a huge workload and many errors, and the data must be verified by highly specialized technicians. Second, the data imported from the design department's product lists, PDM and PLM cannot be used directly as basic data for production and manufacturing management, because the product data descriptions provided by the design department are not standardized and do not follow the same rules as the data descriptions of the ERP system.
Disadvantages of the prior art:
1) When maintaining bill of materials data, only basic mathematical-logic checks can be performed, for example: if component B is a lower level of product A, then product A cannot be a lower level of component B. Cleaning is therefore incomplete and deviations are large.
2) When maintaining the material master file, only a standardized check of each single attribute can be performed, for example: whether the material of a steel item conforms to the national standard. Cleaning is therefore incomplete and deviations are large.
3) The rule base must be set up and updated manually, so timeliness is poor and the workload is large.
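The "basic mathematical-logic check" mentioned in item 1) amounts to detecting a cycle in the parent/child structure. A minimal sketch of such a check is shown here with a depth-first search; the function name and the dictionary layout (parent mapped to a list of children) are assumptions made for illustration.

```python
def find_cycle(bom, root):
    """Return a path that loops back on itself, or None if no loop is reachable from root."""
    path = []        # current chain of ancestors, in order
    on_path = set()  # same items, for fast membership tests
    done = set()     # items whose whole subtree is already known to be loop-free

    def visit(item):
        if item in on_path:
            return path[path.index(item):] + [item]
        if item in done:
            return None
        on_path.add(item)
        path.append(item)
        for child in bom.get(item, []):
            cycle = visit(child)
            if cycle:
                return cycle
        on_path.discard(item)
        path.pop()
        done.add(item)
        return None

    return visit(root)

valid_bom = {"A": ["B"], "B": ["C"]}
invalid_bom = {"A": ["B"], "B": ["A"]}  # B lists its own parent A as a child
```

A check of this kind only catches structural contradictions; it says nothing about whether the attributes of A and B are plausible, which is the gap the text goes on to describe.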
Disclosure of Invention
To solve the above technical problems, the invention provides a product data cleaning method and system based on deep learning. A deep learning data set based on the product data structure of machining and assembly manufacturing is established in advance, comprising industry standard product data and historical project manufacturing product data; the product data of a project is then cleaned by the data cleaning model trained on that data set.
To achieve this purpose, the following technical scheme is provided:
A product data cleaning method based on deep learning comprises the following steps:
S1, establishing a product data set by industry, establishing a data cleaning model based on a deep learning model, and training on the product data set with the data cleaning model to obtain a training data set and a test data set;
S2, acquiring the product data to be cleaned and inputting it into the trained data cleaning model to obtain a product cleaning result;
S3, performing iterative cross-validation on the product cleaning result according to the material attributes until no abnormal data remains, and outputting the cleaning result.
Deep learning is an algorithmic process in which a multilayer neural network induces a model from existing data through multiple layers of analysis and calculation, and the model is then used to analyze new data. This scheme applies that process to product data cleaning, with the following advantages: product data is preprocessed using an RNN (recurrent neural network) deep learning algorithm; the cleaning model abstracts the neurons of each dimension from an industry standard database and a historical business database; the training library data is then improved periodically through the model's self-learning capability; and a verification mechanism that compares results from the deep learning test library and training library corrects deviations in the cleaning result, improving cleaning accuracy. The bill of materials and the material master file information are cleaned in multiple dimensions and cross-validated at the same time.
Preferably, S1 comprises the following steps:
A1: establishing a product data set by industry, wherein the product data set comprises industry standard product data and historical project manufacturing product data;
A2: creating labels according to the material attributes;
A3: establishing a classification learner for the product data set according to the classifications and labels;
A4: training on the product data set with the deep learning model to obtain a training result;
establishing a function $M_i = AF\left(\sum_j X_{ij}\, t_k + b_j\right)$, wherein t is the number of product libraries, k is the BOM level of the product, X is the label data set, and AF is the activation function;
A5: the training result is corrected against an expert database and then output as the data set that passed training;
A6: splitting the data set that passed training into training data and test data.
Preferably, the material attributes comprise the material, specification, material type and material category.
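The labelling and splitting steps (A2 and A6) can be sketched as follows. This is only an illustration under stated assumptions: the field names, the sample records, the 80/20 split ratio and the fixed random seed are all hypothetical, since the text does not specify them.

```python
import random

# The four attributes follow the text; the field names themselves are hypothetical.
ATTRIBUTES = ("material", "specification", "material_type", "category")

def make_label(record):
    """Build one label tuple from a record's material attributes (step A2)."""
    return tuple(record.get(name, "") for name in ATTRIBUTES)

def split_dataset(records, test_fraction=0.2, seed=0):
    """Shuffle reproducibly, then split into training and test data (step A6)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Ten made-up records that differ only in specification.
records = [
    {"material": "Q235", "specification": f"{size}a",
     "material_type": "raw material", "category": "I-beam"}
    for size in range(10, 30, 2)
]
labels = [make_label(r) for r in records]
train, test = split_dataset(records)
```

Keeping the label as a tuple of all four attributes makes it easy to compare records attribute by attribute in later validation steps.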
Preferably, S3 comprises the following steps: a K-fold cross-validation method is applied, calling the training data set and the test data simultaneously, specifically as follows:
1) dividing the training data and test data into x parts;
2) in turn and without repetition, taking 1 part as test data and the other x-1 parts as training data for the model, and then calculating the $\mathrm{MSE}_i$ value of each material attribute label of the deep learning model on the test data set;
3) averaging the x calculated $\mathrm{MSE}_i$ values to obtain the MSE value of each material attribute label, wherein x is the configured number of splits;
4) judging whether abnormal data exists in the result of step 3); if not, directly outputting the cleaning result; if so, proceeding to step 5);
5) after calling the expert database for proofreading, adding the abnormal data to a temporary training library;
6) calling the temporary training library and cleaning again;
7) judging whether abnormal data exists in the result of step 6); if not, transferring the temporary training library into the formal training library and directly outputting the cleaning result; if so, returning to step 5).
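Steps 1) to 3) can be sketched in plain Python as below. The sketch is not the invention's model: a trivial mean predictor stands in for the deep learning model, the sample values are made up, and the names `k_fold_mse`, `fit_mean` and `predict_mean` are hypothetical. It only shows how x parts are rotated through the test role and how the per-part MSE values are averaged.

```python
def k_fold_mse(samples, x, fit, predict):
    """samples: list of (features, target). Returns (per-part MSE list, averaged MSE)."""
    if not 2 <= x <= len(samples):
        raise ValueError("x must be between 2 and the number of samples")
    folds = [samples[i::x] for i in range(x)]  # x disjoint parts
    fold_mse = []
    for i, test_part in enumerate(folds):
        # Every part except part i is used for training.
        train_part = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = fit(train_part)
        squared = [(predict(model, features) - target) ** 2
                   for features, target in test_part]
        fold_mse.append(sum(squared) / len(squared))
    return fold_mse, sum(fold_mse) / x

# Stand-in model (an assumption): always predict the mean training target.
def fit_mean(train_part):
    targets = [target for _, target in train_part]
    return sum(targets) / len(targets)

def predict_mean(model, features):
    return model

samples = [(None, 1.0), (None, 2.0), (None, 3.0), (None, 4.0)]
fold_mse, mse = k_fold_mse(samples, 4, fit_mean, predict_mean)
```

To follow the text, a function like this would be run once per material attribute label, giving one averaged MSE per label. How an MSE value is turned into an "abnormal data" decision in step 4) is not specified in the text, so it is left out here.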
Preferably, the activation function AF is a Sigmoid activation function or a Tanh activation function or an ELU activation function.
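For reference, the three candidate activation functions can be written in a few lines of Python. These are the standard textbook definitions; the parameter name `alpha` for the ELU scale and its default of 1.0 are assumptions, since the text does not give a value.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + e^-x), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def tanh(x):
    """Hyperbolic tangent, (e^x - e^-x) / (e^x + e^-x)."""
    return math.tanh(x)

def elu(x, alpha=1.0):
    """Exponential linear unit: x for x > 0, alpha * (e^x - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)
```

Sigmoid maps its input into (0, 1), Tanh into (-1, 1), and ELU is unbounded above while saturating at -alpha for large negative inputs.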
A product data cleaning system based on deep learning, adopting the above product data cleaning method based on deep learning, comprises:
the data storage module, used for storing the data of the historical project product database;
the deep learning module, used for training on the data of the data storage module to obtain a training data set and a test data set;
the data cleaning module, used for performing iterative cross-validation on the product cleaning result to obtain the cleaning result;
the data import module, used for importing the product data to be cleaned into the data cleaning module;
the result display module, used for displaying and analyzing the cleaning result;
and the production database module, used for receiving the cleaning result and performing production scheduling.
The beneficial effects of the invention are as follows: product data is preprocessed using an RNN (recurrent neural network) deep learning algorithm; the cleaning model abstracts the neurons of each dimension from an industry standard database and a historical business database; the training library data is then improved periodically through the model's self-learning capability; and a verification mechanism that compares results from the deep learning test library and training library corrects deviations in the cleaning result, improving cleaning accuracy.
Drawings
FIG. 1 is a process diagram of an embodiment for building a data cleansing model;
FIG. 2 is a detailed flow chart of data cleansing according to an embodiment.
Detailed Description
Embodiment:
This embodiment provides a product data cleaning method based on deep learning, comprising the following steps:
S1, establishing a product data set by industry, establishing a data cleaning model based on a deep learning model, and training on the product data set with the data cleaning model to obtain a training data set and a test data set, as follows with reference to FIG. 1:
A1: establishing a product data set by industry, for example a machining product data set, with the product databases of historical projects as a data source;
A2: creating labels according to the material, specification, material type and material category of each item;
A3: establishing a classification learner for the product data set according to the classifications and labels;
A4: training on the product data set with the deep learning model;
establishing a function $M_i = AF\left(\sum_j X_{ij}\, t_k + b_j\right)$, wherein t is the number of product libraries, k is the BOM level of the product, X is the label data set, and AF is the activation function; the commonly used activation functions are as follows:
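(1) Sigmoid activation function: $f(x) = \dfrac{1}{1 + e^{-x}}$;
(2) Tanh activation function: $f(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$;
(3) ELU activation function: $f(x) = x$ for $x > 0$, and $f(x) = \alpha\,(e^{x} - 1)$ for $x \le 0$;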
A5: the training result is corrected against an expert database and then output as the data set that passed training;
A6: the data set that passed training is split into training data and test data.
S2, acquiring the product data to be cleaned and inputting it into the trained data cleaning model to obtain a product cleaning result, specifically:
B1: obtaining the product data of the current project, such as drawings and Excel tables, by data import or interface synchronization;
B2: inputting the project product data into the established product data cleaning model;
S3, performing iterative cross-validation on the product cleaning result according to the material attributes until no abnormal data remains, and outputting the cleaning result, specifically:
B3: calculating product cleaning results separately from the training data and the test data, comparing them, and removing results that differ greatly:
business-logic cross-validation is performed according to the business relationships among the material attributes or labels, for example:
1. Starting the iterative cross-validation from the product category: when the product category is "I-beam", the checks are that the material conforms to national standard GB/T 700-2006, that the material type is raw material, and that the measurement unit symbol is valid.
2. Starting the iterative cross-validation from the specification: when the specification is "20a", the checks are that the dimensions conform to the dimension table for that specification, that the material conforms to national standard GB/T 700-2006, and that the unit weight conforms to the unit-weight table for that specification;
By analogy, all labels are iterated over; each iteration produces a group of material attributes or a label array, and when the results of all arrays are consistent, a normal result is returned;
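The business-logic cross-validation in the two examples above can be pictured as a rule table: a trigger attribute value implies required values for other attributes. The sketch below is a hand-written stand-in for illustration only; in the described method these relationships are learned and checked by the model, and the field names, rule table and record values here are hypothetical.

```python
# Hypothetical rule table: (trigger attribute, trigger value) -> required attribute values.
RULES = {
    ("category", "I-beam"): {
        "material_standard": "GB/T 700-2006",
        "material_type": "raw material",
    },
    ("specification", "20a"): {
        "material_standard": "GB/T 700-2006",
    },
}

def cross_validate(record):
    """Return (trigger attribute, checked attribute, expected, actual) for each violation."""
    violations = []
    for (attr, value), required in RULES.items():
        if record.get(attr) != value:
            continue  # rule does not apply to this record
        for name, expected in required.items():
            actual = record.get(name)
            if actual != expected:
                violations.append((attr, name, expected, actual))
    return violations

consistent = {"category": "I-beam", "specification": "20a",
              "material_standard": "GB/T 700-2006", "material_type": "raw material"}
inconsistent = dict(consistent, material_type="assembly")
```

A record passes when every applicable rule agrees, which corresponds to the text's condition that the results of all label arrays are consistent.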
the algorithm is implemented with the K-fold cross-validation method, calling several parts of the training data set and the test data set at the same time; the number of parts can be set through a configuration item. For example, if it is set to 8, the cross-validation process is:
1) dividing the training data and test data into 8 parts;
2) in turn and without repetition, taking 1 part as test data and the other 7 parts as training data for the deep learning model, and calculating the $\mathrm{MSE}_i$ value of each material attribute label of the deep learning model on the test data set;
3) averaging the 8 calculated $\mathrm{MSE}_i$ values to obtain a relatively accurate MSE value for each material attribute label, where x is the configured number of splits (8 in this example); the larger x is, the higher the accuracy and the greater the amount of computation;
B7: if no abnormal data exists in B3, the cleaning result is output directly;
B4: if abnormal data exists in B3, the expert library is called for evaluation and the abnormal data is then added to a temporary training library;
B5: the temporary training library is called and cleaning is performed again;
B6: if no abnormal data exists in B5, the temporary training library is transferred into the formal training library and the cleaning result is output directly;
B7: if abnormal data exists in B5, the flow loops back to B4.
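The control flow of steps B3 to B7 can be sketched as a loop that keeps cleaning until no abnormal data is left. Everything concrete in this sketch is an assumption made for illustration: the three callables stand in for the cleaning model, the abnormal-data check and the expert library; the toy data uses made-up spellings of steel grades; and the `max_rounds` guard is added here for safety, whereas the text describes an unbounded loop.

```python
def clean_until_stable(records, clean, find_abnormal, expert_review, max_rounds=10):
    """Repeat cleaning until no abnormal data remains (B3 to B7)."""
    formal_library = []
    temporary_library = []
    result = clean(records, formal_library)
    for _ in range(max_rounds):
        abnormal = find_abnormal(result)
        if not abnormal:
            # B6/B7: promote the temporary library and output the result.
            formal_library.extend(temporary_library)
            return result, formal_library
        # B4: expert review, then add the corrections to the temporary library.
        temporary_library.extend(expert_review(abnormal))
        # B5: clean again using the temporary library.
        result = clean(records, formal_library + temporary_library)
    raise RuntimeError("abnormal data remained after max_rounds")

# Toy stand-ins (hypothetical): the library is a list of (wrong, corrected) pairs.
VALID = {"Q235", "Q345"}
EXPERT = {"q235": "Q235", "Q-345": "Q345"}

def clean(records, library):
    fixes = dict(library)
    return [fixes.get(r, r) for r in records]

def find_abnormal(result):
    return [r for r in result if r not in VALID]

def expert_review(abnormal):
    return [(r, EXPERT[r]) for r in abnormal]

result, formal_library = clean_until_stable(
    ["Q235", "q235", "Q-345"], clean, find_abnormal, expert_review)
```

The point of the two libraries is that corrections only become part of the formal training library after a cleaning round has confirmed that they remove the abnormal data.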
This embodiment further provides a product data cleaning system based on deep learning, which adopts the above product data cleaning method based on deep learning and comprises:
the data storage module, used for storing the data of the historical project product database;
the deep learning module, used for training on the data of the data storage module to obtain a training data set and a test data set;
the data cleaning module, used for performing iterative cross-validation on the product cleaning result to obtain the cleaning result;
the data import module, used for importing the product data to be cleaned into the data cleaning module;
the result display module, used for displaying and analyzing the cleaning result;
and the production database module, used for receiving the cleaning result and performing production scheduling.
The specific using process is as follows:
Step C1: importing into the data storage module the product databases of industries such as machining, assembly manufacturing, medical devices, and textiles and clothing, the national standard libraries of material specifications for steel, aluminum and the like, and the company's historical project product databases;
Step C2: organizing the data storage module by classification labels of the form industry + national standard + material attribute + project;
Step C3: in the deep learning module, selecting the Tanh and ELU activation functions to train on the data of the data storage module, abstracting the neurons of each dimension, and establishing complete training data and test data;
Step C4: configuring, in the data cleaning module, the industry to be cleaned and the K value for K-fold cross-validation;
Step C5: the data import module supports Excel import, CAD import and PDM system integration;
Step C6: after the data from C5 is imported into the data cleaning module configured in C4, the data is cleaned and the result is displayed; the data before and after cleaning is compared and displayed by material number and classification label, and the cleaning quantity and quality are analyzed and displayed for each classification label;
Step C7: after cleaning, the data is synchronized to the production database module through an interface platform for subsequent production scheduling.
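The before/after comparison of step C6 can be sketched as a per-label count of changed records. The record layout (a dictionary keyed by material number), the field names and the sample values are hypothetical; the text does not define how the result display module computes its figures.

```python
from collections import Counter

def cleaning_summary(before, after, label_key="category"):
    """before/after: dict of material number -> record. Returns label -> (changed, total)."""
    changed = Counter()
    total = Counter()
    for number, old in before.items():
        new = after.get(number, old)  # a record missing from `after` counts as unchanged
        label = new.get(label_key, "unknown")
        total[label] += 1
        if new != old:
            changed[label] += 1
    return {label: (changed[label], total[label]) for label in total}

before = {
    "M001": {"category": "I-beam", "material": "q235"},
    "M002": {"category": "I-beam", "material": "Q235"},
    "M003": {"category": "plate", "material": "Q235"},
}
after = dict(before, M001={"category": "I-beam", "material": "Q235"})
summary = cleaning_summary(before, after)
```

Grouping by classification label gives the per-label cleaning quantity; a quality figure would need a reference for what counts as a correct change, which is outside this sketch.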
Claims (6)
1. A product data cleaning method based on deep learning is characterized by comprising the following steps:
S1, establishing a product data set by industry, establishing a data cleaning model based on a deep learning model, and training on the product data set with the data cleaning model to obtain a training data set and a test data set;
S2, acquiring the product data to be cleaned and inputting it into the trained data cleaning model to obtain a product cleaning result;
and S3, performing iterative cross-validation on the product cleaning result according to the material attributes until no abnormal data remains, and outputting the cleaning result.
2. The deep learning-based product data cleansing method according to claim 1, wherein the step S1 comprises the steps of:
A1: establishing a product data set by industry, wherein the product data set comprises industry standard product data and historical project manufacturing product data;
A2: creating labels according to the material attributes;
A3: establishing a classification learner for the product data set according to the classifications and labels;
A4: training on the product data set with the deep learning model to obtain a training result;
establishing a function $M_i = AF\left(\sum_j X_{ij}\, t_k + b_j\right)$, wherein t is the number of product libraries, k is the BOM level of the product, X is the label data set, and AF is the activation function;
A5: the training result is corrected against an expert database and then output as the data set that passed training;
A6: splitting the data set that passed training into training data and test data.
3. The deep learning-based product data cleaning method as claimed in claim 2, wherein the material attributes include the material, specification, material type and material category.
4. The deep learning-based product data cleansing method according to claim 1, wherein the step S3 comprises the steps of: a K-fold Cross Validation method is applied, and a training data set and test data are called simultaneously, specifically as follows:
1) dividing the training data and test data into x parts;
2) in turn and without repetition, taking 1 part as test data and the other x-1 parts as training data for the model, and then calculating the $\mathrm{MSE}_i$ value of each material attribute label of the deep learning model on the test data set;
3) averaging the x calculated $\mathrm{MSE}_i$ values to obtain the MSE value of each material attribute label, wherein x is the configured number of splits;
4) judging whether abnormal data exists in the result of step 3); if not, directly outputting the cleaning result; if so, proceeding to step 5);
5) after calling the expert database for proofreading, adding the abnormal data to a temporary training library;
6) calling the temporary training library and cleaning again;
7) judging whether abnormal data exists in the result of step 6); if not, transferring the temporary training library into the formal training library and directly outputting the cleaning result; if so, returning to step 5).
5. The deep learning-based product data cleaning method as claimed in claim 1, wherein the activation function AF is a Sigmoid activation function, a Tanh activation function or an ELU activation function.
6. A deep learning based product data cleansing system using a deep learning based product data cleansing method according to any one of claims 1 to 5, comprising:
the data storage module, used for storing the data of the historical project product database;
the deep learning module, used for training on the data of the data storage module to obtain a training data set and a test data set;
the data cleaning module, used for performing iterative cross-validation on the product cleaning result to obtain the cleaning result;
the data import module, used for importing the product data to be cleaned into the data cleaning module;
the result display module, used for displaying and analyzing the cleaning result;
and the production database module, used for receiving the cleaning result and performing production scheduling.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210089180.XA CN114595211A (en) | 2022-01-25 | 2022-01-25 | Product data cleaning method and system based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114595211A (en) | 2022-06-07 |
Family
ID=81804977
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210089180.XA (Pending) CN114595211A (en) | 2022-01-25 | 2022-01-25 | Product data cleaning method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114595211A (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210201205A1 (en) * | 2019-12-26 | 2021-07-01 | Wipro Limited | Method and system for determining correctness of predictions performed by deep learning model |
CN113326689A (en) * | 2020-02-28 | 2021-08-31 | 中国科学院声学研究所 | Data cleaning method and device based on deep reinforcement learning model |
CN111709488A (en) * | 2020-06-22 | 2020-09-25 | 电子科技大学 | Dynamic label deep learning algorithm |
CN111860981A (en) * | 2020-07-03 | 2020-10-30 | 航天信息(山东)科技有限公司 | Enterprise national industry category prediction method and system based on LSTM deep learning |
CN113110324A (en) * | 2021-04-12 | 2021-07-13 | 江苏丰尚智能科技有限公司 | Training method and device for dryer parameter optimization model and computer equipment |
CN113674862A (en) * | 2021-07-08 | 2021-11-19 | 中国科学院国家空间科学中心 | Acute renal function injury onset prediction method based on machine learning |
CN113807462A (en) * | 2021-09-28 | 2021-12-17 | 中电福富信息科技有限公司 | AI-based network equipment fault reason positioning method and system |
CN113888278A (en) * | 2021-10-14 | 2022-01-04 | 黑龙江省范式智能技术有限公司 | Data analysis method and device based on enterprise credit line analysis model |
Non-Patent Citations (1)
Title |
---|
Chen Yunji: "Intelligent Computing Systems" (《智能计算系统》), 29 February 2020, China Machine Press (机械工业出版社) * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20220607 |