CN110619363A - Classification method for subclass names corresponding to long description of material data - Google Patents


Info

Publication number
CN110619363A
CN110619363A (application CN201910877234.7A)
Authority
CN
China
Prior art keywords
material data
data
classification
subclasses
description
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910877234.7A
Other languages
Chinese (zh)
Inventor
隋怡
杨浩东
张复生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shaanxi Top 100 Information Technology Co Ltd
Original Assignee
Shaanxi Top 100 Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shaanxi Top 100 Information Technology Co Ltd filed Critical Shaanxi Top 100 Information Technology Co Ltd
Priority to CN201910877234.7A priority Critical patent/CN110619363A/en
Publication of CN110619363A publication Critical patent/CN110619363A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/29Graphical models, e.g. Bayesian networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/08Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/087Inventory or stock management, e.g. order filling, procurement or balancing against orders

Abstract

The invention discloses a classification method for the subclass names corresponding to long descriptions of material data. The method accurately analyzes the problems present in the data, such as inconsistent letter case, full-width/half-width characters, connectors, non-uniform units, and similar pronunciations; applies a reasoned data preprocessing process to normalize and standardize the data; converts the data into feature-vector form; and classifies it using logistic regression with L2 regularization, optimized by L-BFGS.

Description

Classification method for subclass names corresponding to long description of material data
Technical Field
The invention relates to the technical field of material data classification, and in particular to a classification method for the subclass names corresponding to long descriptions of material data.
Background
The material master data contains descriptions of the materials purchased, produced, and stored in inventory by an enterprise. It is a material database relating material data to material information (e.g., inventory levels) in the enterprise. Integrating all material data into a single material database eliminates data redundancy and allows the purchasing department, as well as other departments (e.g., inventory management, material planning and control, invoice verification), to use the same data. Material classification means grouping materials with the same natural attributes according to a certain ordering and combination scheme. The classification process should follow, as far as possible, the basic standard of classifying by natural attributes; existing material classification is inefficient and prone to classification errors.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art. Therefore, the invention aims to provide a classification method for the subclass names corresponding to long descriptions of material data.
The classification method for the subclass names corresponding to long descriptions of material data according to the invention comprises the following steps:
S1: raw material data: reading in the original material data;
S2: data preprocessing: preprocessing the read-in original material data and standardizing it;
S3: class-to-number: encoding the category column of the original material data into numbers;
S4: dividing the sample set: dividing the sample set into a training set and a test set;
S5: feature vectorization: converting the long description of the material into feature-vector form;
S6: classification: learning an objective function that maps each feature set to a predefined class label;
S7: evaluating the classification result: the classification result is evaluated by accuracy, recall, and F1 value.
The S2 includes the following steps:
S21: unifying the units and connectors of the original material data;
S22: removing brackets and slashes;
S23: converting text to pinyin after Chinese word segmentation;
S24: converting upper case to lower case and full-width characters to half-width.
The original material data in S3 includes the long description of the material data and the subclass name.
The sample set in S4 is divided so that the ratio of training-set samples to test-set samples is 7:3.
The feature vectorization method in S5 is the tf-idf algorithm.
The long description of the material in S5 is material text data.
The classification methods applicable in S6 include logistic regression, naive Bayes, decision trees, support vector machines, K-nearest neighbors, random forests, GBDT, XGBoost, neural networks, and the like.
The metrics for evaluating the classification result in S7 include accuracy, recall, and F1 value.
The beneficial effects of the invention are as follows: the classification of the subclass names of material data accurately analyzes the problems present in the data, such as inconsistent letter case, full-width/half-width characters, connectors, non-uniform units, and similar pronunciations; applies a reasoned data preprocessing process to normalize and standardize the data; converts the data into feature-vector form; and classifies it using logistic regression with L2 regularization, optimized by L-BFGS.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of the classification method for the subclass names corresponding to long descriptions of material data according to the present invention;
FIG. 2 is a flow chart of data preprocessing in the classification method for the subclass names corresponding to long descriptions of material data according to the present invention;
FIG. 3 is a flow chart of an example of data preprocessing in the classification method for the subclass names corresponding to long descriptions of material data provided by the present invention.
Detailed Description
The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic views, and merely illustrate the basic structure of the present invention, and therefore, they show only the components related to the present invention.
Referring to figs. 1-2, a classification method for the subclass names corresponding to long descriptions of material data comprises the following steps:
S1: raw material data: reading in the original material data;
S2: data preprocessing: preprocessing the read-in original material data and standardizing it;
S3: class-to-number: encoding the category column of the original material data into numbers;
S4: dividing the sample set: dividing the sample set into a training set and a test set;
S5: feature vectorization: converting the long description of the material into feature-vector form;
S6: classification: learning an objective function that maps each feature set to a predefined class label;
S7: evaluating the classification result: the classification result is evaluated by accuracy, recall, and F1 value.
S2 includes the following steps:
S21: unifying the units and connectors of the original material data;
S22: removing brackets and slashes;
S23: converting text to pinyin after Chinese word segmentation;
S24: converting upper case to lower case and full-width characters to half-width.
The original material data in S3 are the long description of the material data and the subclass name.
The sample set in S4 is divided so that the ratio of training-set samples to test-set samples is 7:3.
The feature vectorization method in S5 is the tf-idf algorithm.
The long description of the material in S5 is material text data.
The classification methods applicable in S6 include logistic regression, naive Bayes, decision trees, support vector machines, K-nearest neighbors, random forests, GBDT, XGBoost, and neural networks.
The classification result in S7 is evaluated by accuracy, recall, and F1 value.
Data preprocessing:
Because the material data suffers from inconsistent letter case, inconsistent full-width/half-width characters, inconsistent multiplication signs, spaces, underscores, and dashes, non-uniform measurement units, non-uniform word order, similar pronunciations, and the like, a preprocessing operation is performed before the data is converted into feature vectors, in order to normalize and standardize the data.
Example 2.1:
The long description of the material data is "radial bearing \ N40/50/20T6540 tilting pad"; the results of the preprocessing process are as follows:
example 2.2:
The long description of the original material data and the subclass name are as follows:
The long descriptions of the preprocessed material data are as follows:
kebian danhuang zhijia df07kfa116 2327n 2747n 9↑q 321002jda zuhe jian
shimian xiangjiaodian pian cl300 dn25 xb350 gaf sh3401
wufeng santong dn50*dn50 sch120 sch120 sh t3408 15crmo gb9948
shourong redianou redianou wrp–131 0–1600s xing l=900
shourong ruhua beng yeya guan 32*5m
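The normalization steps S21-S24 can be sketched in Python as below. This is a minimal illustration under stated assumptions: the function name `normalize_material_text` and the regex patterns are not from the patent, and Chinese word segmentation plus pinyin conversion (S23) would require external tools such as jieba and pypinyin, so that step is only noted in a comment.

```python
import re
import unicodedata


def normalize_material_text(text):
    """Sketch of steps S21-S24: unify connectors, drop brackets and
    slashes, lowercase, and convert full-width characters to half-width.
    S23 (Chinese segmentation + pinyin) is omitted here; it would use
    third-party tools such as jieba and pypinyin."""
    # S24 (part): full-width -> half-width via NFKC normalization
    text = unicodedata.normalize("NFKC", text)
    # S24 (part): upper case -> lower case
    text = text.lower()
    # S21: unify connectors (en dash, em dash, underscore -> hyphen)
    text = re.sub(r"[–—_]", "-", text)
    # S22: remove brackets and slashes
    text = re.sub(r"[()\[\]{}（）/\\]", " ", text)
    # collapse the whitespace introduced above
    text = re.sub(r"\s+", " ", text).strip()
    return text
```

For example, a full-width string such as `"ＤＮ５０／ＳＣＨ１２０ （Ａ）"` normalizes to `"dn50 sch120 a"`.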
Class-to-number:
To facilitate the classification task, the category columns are all encoded into numbers.
Example 3.1:
The subclass names of the raw material data are encoded into numbers:
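The category-to-number encoding can be sketched as a simple first-seen ordinal mapping; the function name and example labels below are hypothetical, introduced only for illustration.

```python
def encode_labels(labels):
    """Map each distinct subclass name to an integer code, in order of
    first appearance, and return (codes, mapping)."""
    mapping = {}
    codes = []
    for label in labels:
        if label not in mapping:
            mapping[label] = len(mapping)  # next unused integer
        codes.append(mapping[label])
    return codes, mapping
```

The inverse mapping (number back to subclass name) is recovered by inverting the returned dictionary when reporting predictions.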
Dividing a sample set:
A test sample set is typically required to evaluate the generalization error of a classifier. Therefore, the sample set is divided into a training set and a test set; after the classifier is trained on the training set, the test error on the test set is used as an approximation of the generalization error. In the invention, the sample set is divided so that training-set samples : test-set samples = 7:3.
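The 7:3 split described above can be sketched as follows; the fixed seed and function name are illustrative choices, not part of the patent.

```python
import random


def split_samples(samples, train_ratio=0.7, seed=42):
    """Shuffle indices deterministically and split the samples into a
    training set and a test set at the given ratio (7:3 by default)."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = int(len(samples) * train_ratio)
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test
```

In practice the same split would be applied jointly to the feature vectors and their encoded subclass labels.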
Vectorizing the characteristics:
Since the independent variable of the classification task must be a continuous real-valued vector, the long description of the material (text data) is converted into feature-vector form. Text vectorization methods mainly include the bag-of-words model and the tf-idf algorithm. Considering the characteristics of material data, the invention adopts the tf-idf algorithm for feature vectorization.
The tf-idf algorithm is a statistical method for assessing the importance of a word to a document in a document set or corpus. Its main idea is: if a word occurs with high frequency (tf) in one article and rarely occurs in other articles, the word is considered to have good class-distinguishing capability and to be suitable for classification. The tf-idf algorithm is widely applied in search engines, keyword extraction, text similarity, text summarization, and so on.
(1) The term frequency (tf) represents the frequency with which a word occurs in a text. The calculation formula is

$$tf_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}}$$

where $n_{i,j}$ is the number of times word $i$ occurs in document $D_j$, and $\sum_k n_{k,j}$ is the total number of occurrences of all words in $D_j$.
(2) The inverse document frequency (idf) is the logarithm of the total number of documents divided by the number of documents containing the given word. The calculation formula is

$$idf_i = \log \frac{|D|}{1 + |\{j : w_i \in D_j\}|}$$

where $|D|$ is the total number of documents in the corpus and $|\{j : w_i \in D_j\}|$ is the number of documents containing the word $w_i$; 1 is added to the denominator to prevent a zero denominator when the word does not appear in the corpus.
The fewer the documents containing the word $w$, the larger the idf value, and the better the word's category-distinguishing capability.
(3) tf-idf=tf×idf
A term that occurs with high frequency in a particular document but has low document frequency across the whole collection receives a high tf-idf weight; tf-idf therefore tends to filter out common words while preserving important ones.
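The tf and idf formulas above can be combined into a small sketch; the dictionary-based sparse representation and the function name are assumptions made for illustration (the patent materializes dense vectors instead).

```python
import math


def tf_idf(docs):
    """docs: list of token lists. Returns one dict per document mapping
    word -> tf-idf weight, with tf = n_ij / |D_j| and
    idf = log(|D| / (1 + df)), matching the formulas above."""
    total_docs = len(docs)
    # document frequency: number of documents containing each word
    df = {}
    for doc in docs:
        for word in set(doc):
            df[word] = df.get(word, 0) + 1
    vectors = []
    for doc in docs:
        vec = {}
        for word in doc:
            tf = doc.count(word) / len(doc)
            idf = math.log(total_docs / (1 + df[word]))
            vec[word] = tf * idf
        vectors.append(vec)
    return vectors
```

With the +1 smoothing in the denominator, a word appearing in two of three documents gets idf = log(3/3) = 0, i.e. zero weight, while rarer words get positive weight.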
Example 5.1:
Preprocessed material data:
kebian danhuang zhijia df07kfa116 2327n 2747n 9↑q 321002jda zuhe jian
shimian xiangjiaodian pian cl300 dn25 xb350 gaf sh3401
wufeng santong dn50*dn50 sch120 sch120 sh t3408 15crmo gb9948
shourong redianou redianou wrp–1310–1600s xing l=900
shourong ruhua beng yeya guan 32*5m
Expressed in the form of a feature vector:
[0 0 0 0 0 0 0 0 0 0 0 0.35355339 0 0 0.35355339 0 0.35355339 0 0 0 0 0 0.35355339 0 0 0 0 0 0 0 0.35355339 0.35355339 0 0 0 0 0.35355339 0.35355339 0 0 0 0]
[0.2811506 0.2811506 0 0.2811506 0 0 0 0 0 0 0.2811506 0 0 0 0 0 0 0 0 0 0 0 0.2811506 0 0 0.5623012 0 0.2811506 0 0 0 0 0 0.22683053 0 0.2811506 0 0 0 0 0.2811506 0 0 0]
[0 0 0 0 0 0 0.38775666 0 0.38775666 0 0 0.38775666 0 0 0 0 0 0 0 0.38775666 0 0 0 0 0 0 0.38775666 0 0 0 0 0 0 0.31283963 0 0 0 0 0 0 0.38775666 0 0]
[0 0 0.26726124 0 0 0 0 0 0 0 0 0 0 0 0 0 0.53452248 0 0.26726124 0 0 0 0 0 0 0 0 0 0.26726124 0.53452248 0.26726124 0 0 0 0.26726124 0 0.26726124 0 0 0 0 0 0]
[0 0 0 0 0.30151134 0.30151134 0 0.30151134 0 0.30151134 0 0 0 0.30151134 0.30151134 0 0 0 0 0 0.30151134 0.30151134 0 0 0.30151134 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.30151134 0.30151134]
Classification:
The classification task is to learn an objective function that maps each feature set $x$ to a predefined class label $y_i$.
Current mainstream classification methods include logistic regression, naive Bayes, decision trees, support vector machines, K-nearest neighbors, random forests, GBDT, XGBoost, neural networks, and the like. After fully considering the characteristics of the material data, the invention adopts logistic regression with an added L2 regularization term, solved iteratively with the L-BFGS algorithm.
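A minimal sketch of binary logistic regression with an L2 penalty is shown below. Note one deliberate substitution: the patent solves the objective with L-BFGS, while this dependency-free sketch minimizes the same regularized negative log-likelihood with plain gradient descent; all names and hyperparameters are illustrative, not the patent's.

```python
import math


def train_logreg_l2(X, y, lam=0.01, lr=0.5, epochs=500):
    """Fit w, b by gradient descent on the L2-regularized logistic
    loss (1/m) * sum(-y_i z_i + log(1 + e^{z_i})) + (lam/2) ||w||^2,
    where z_i = w.x_i + b."""
    n = len(X[0])
    m = len(X)
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        gw = [lam * wi for wi in w]  # gradient of the L2 term
        gb = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability
            err = p - yi                    # gradient of the log-loss
            for j in range(n):
                gw[j] += err * xi[j] / m
            gb += err / m
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b


def predict(w, b, x):
    """Classify by the sign of the linear score, i.e. p >= 0.5."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if z >= 0 else 0
```

On real material data, X would hold the tf-idf vectors and one such binary model (or a multinomial generalization) would be fit per subclass.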
Evaluating the classification result:
the main metrics for evaluating the classification result are: accuracy, recall, and F1 values.
(1) Accuracy
Accuracy is, as the name implies, the proportion of correctly classified samples among the total number of samples:

$$accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$

(2) Recall
Recall represents the proportion of correctly classified positive samples among all positive samples:

$$R = \frac{TP}{TP + FN}$$

where $TP$ is the number of positive examples classified correctly and $FN$ is the number of positive examples classified incorrectly (as negative).
(3) F1 value
The F1 value is the harmonic mean of precision and recall:

$$F1 = \frac{2PR}{P + R}$$

where the precision $P = \frac{TP}{TP + FP}$ and $FP$ is the number of negative samples misclassified as positive.
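The metrics above can be computed directly from TP, FP, and FN counts; the function name and the example labels in the usage note are illustrative.

```python
def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall, and F1 from paired label lists
    (1 = positive, 0 = negative), following the formulas above."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

For multi-class subclass prediction, these per-class values would be averaged (e.g. macro- or micro-averaged) across the subclasses.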
Example 7.1:
To evaluate and compare the classification effect of the classification methods on material data sets, classification was performed on a set of 50000 material data records (1995 subclass categories in total) and a set of 20564 material data records (1213 subclass categories in total), using logistic regression, naive Bayes, decision trees, support vector machines, K-nearest neighbors, random forests, and XGBoost. The classification metrics on the test sets are shown in the tables below.
Method                  Accuracy  Recall  F1 value
Logistic regression     0.88      0.90    0.89
Naive Bayes             0.60      0.65    0.57
Decision tree           0.84      0.82    0.82
Support vector machine  0.06      0.13    0.17
K-nearest neighbors     0.84      0.82    0.82
Random forest           0.89      0.89    0.88
XGBoost                 0.67      0.73    0.69
The table above shows the comparison of the results of different classification methods on 50000 material data sets.
Method                  Accuracy  Recall  F1 value
Logistic regression     0.88      0.90    0.89
Naive Bayes             0.64      0.73    0.65
Decision tree           0.87      0.89    0.87
Support vector machine  0.18      0.22    0.18
K-nearest neighbors     0.82      0.82    0.80
Random forest           0.86      0.84    0.84
XGBoost                 0.69      0.73    0.71
The table above is a comparison of the results of different classification methods on the 20564 material data sets.
As the two tables show, the average classification effect of the logistic regression + L2 regularization + L-BFGS method adopted by the invention is superior to that of the other classification methods.
The logistic regression model classifies using probability estimates. A latent variable y is assumed to represent the likelihood that the event under study occurs; it ranges over all real numbers, and the larger its value, the more likely the event. Logistic regression models are widely applied in economic forecasting, severe-weather prediction, and computer-aided medical diagnosis.
For the material data classification problem, the event under study is that a long description of material data is assigned to a certain subclass. Logistic regression is used to analyze the internal association between the features of the material data (i.e., the words in the long description) and the subclass, so as to predict the subclass to which the material data belongs.
For binary classification, let the independent variable $x$ denote the features of a long description, and let $y_i$ indicate the likelihood that the description belongs to subclass $i$: $y_i = 1$ means it belongs to the class, $y_i = 0$ means it does not.
Assuming the prediction is a linear combination of the features, the linear regression model relates the prediction $z$ to the independent variable $x$ by
$$z = w^T x + b$$
To convert the real-valued $z$ into a 0/1 value, $z$ is assumed to follow a logistic distribution, i.e.
$$y = \frac{1}{1 + e^{-z}}$$
Then the probability that the long description belongs to the category is
$$P(y = 1 \mid x) = \frac{1}{1 + e^{-(w^T x + b)}}$$
The above formula can be rewritten as
$$\ln \frac{y}{1 - y} = w^T x + b$$
Obviously,
$$P(y = 1 \mid x) = \frac{e^{w^T x + b}}{1 + e^{w^T x + b}}, \qquad P(y = 0 \mid x) = \frac{1}{1 + e^{w^T x + b}}$$
The parameters $w$ and $b$ of the model can be estimated by maximum likelihood. The likelihood function of logistic regression is
$$L(w, b) = \prod_{i=1}^{m} P(y_i \mid x_i; w, b)$$
Taking the logarithm of the likelihood function for convenience of calculation,
$$\ell(w, b) = \sum_{i=1}^{m} \ln P(y_i \mid x_i; w, b)$$
Maximizing the likelihood function is equivalent to minimizing
$$\min_{w, b} \; \sum_{i=1}^{m} \left( -y_i (w^T x_i + b) + \ln\!\left(1 + e^{w^T x_i + b}\right) \right)$$
The maximum likelihood estimate overfits easily, so a regularization term can be added to the objective function. Commonly used regularization terms are L1 regularization and L2 regularization. According to the prior characteristics of the material data, an L2 regularization term is added:
$$\min_{w, b} \; \sum_{i=1}^{m} \left( -y_i (w^T x_i + b) + \ln\!\left(1 + e^{w^T x_i + b}\right) \right) + \frac{\lambda}{2} \lVert w \rVert_2^2$$
This is an unconstrained convex optimization problem. According to the convex optimization theory, a Newton-Raphson method is generally adopted for solving. As can be seen from the above formula, all training samples are required to solve the problem, and matrix inversion operation is required for each iteration during optimization of the Newton-Raphson method. In consideration of high dimensionality of text features, in order to reduce the calculation amount, an approximate algorithm, such as an L-BFGS algorithm, Newton-CG and the like, can be adopted for solving. The invention adopts an L-BFGS algorithm to solve.
The L-BFGS algorithm is the most common method for solving the unconstrained nonlinear programming problem, has the advantages of high convergence rate, low memory overhead and the like, and is suitable for large-scale calculation.
Assume the unconstrained problem is defined as $\min f(x),\ x \in \mathbb{R}^n$.
The second-order Taylor expansion of $f(x)$ at $x^{(k)}$ is
$$f(x) \approx f(x^{(k)}) + \nabla f(x^{(k)})^T (x - x^{(k)}) + \frac{1}{2} (x - x^{(k)})^T H(x^{(k)}) (x - x^{(k)})$$
Since an extreme point of $f(x)$ satisfies $\nabla f(x) = 0$, neglecting the remainder and taking the derivative gives
$$\nabla f(x^{(k)}) + H(x^{(k)}) (x - x^{(k)}) = 0$$
Thus the iterative formula of Newton's method is
$$x^{(k+1)} = x^{(k)} - H(x^{(k)})^{-1} \nabla f(x^{(k)})$$
As the above equation shows, each iteration of Newton's method requires computing the inverse of the Hessian matrix at $x^{(k)}$, and the Hessian is not guaranteed to be positive definite. Quasi-Newton methods therefore approximate the inverse of the Hessian by a matrix containing no second derivatives; different constructions of the approximation matrix give different quasi-Newton methods.
The BFGS algorithm uses a matrix $B_{k+1}$ to approximate the Hessian $H(x^{(k+1)})$. Let $s_k = x^{(k+1)} - x^{(k)}$ and $y_k = \nabla f(x^{(k+1)}) - \nabla f(x^{(k)})$. The BFGS update formula is
$$B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k}$$
Letting $H_k = B_k^{-1}$ and $\rho_k = \frac{1}{y_k^T s_k}$, the corresponding update for the inverse approximation is
$$H_{k+1} = (I - \rho_k s_k y_k^T) H_k (I - \rho_k y_k s_k^T) + \rho_k s_k s_k^T$$
L-BFGS uses only the most recent $m$ pairs $(s_i, y_i)$ to construct the approximation at each iteration, expanding the recursion above over those $m$ pairs instead of storing $H_k$ explicitly.
The pseudo-code for the L-BFGS algorithm is as follows:
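The original pseudo-code does not survive in this text. As a stand-in, here is a compact Python sketch of the standard L-BFGS two-loop recursion using the last m curvature pairs; the unit step size (no line search) and all names are simplifying assumptions, not the patent's implementation.

```python
def dot(a, b):
    """Inner product of two vectors stored as plain lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def lbfgs(grad, x0, m=5, max_iter=100, tol=1e-8):
    """Minimize a smooth function given its gradient, via L-BFGS.
    The two-loop recursion applies the inverse-Hessian approximation
    built from the last m pairs (s_i, y_i) without forming a matrix."""
    x = list(x0)
    g = grad(x)
    s_list, y_list = [], []
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        # --- two-loop recursion: r approximates H_k * g ---
        q = list(g)
        alphas = []
        for s, y in reversed(list(zip(s_list, y_list))):
            rho = 1.0 / dot(y, s)
            a = rho * dot(s, q)
            alphas.append((a, rho))
            q = [qi - a * yi for qi, yi in zip(q, y)]
        if s_list:  # scale the initial Hessian approximation
            gamma = dot(s_list[-1], y_list[-1]) / dot(y_list[-1], y_list[-1])
        else:
            gamma = 1.0
        r = [gamma * qi for qi in q]
        for (s, y), (a, rho) in zip(zip(s_list, y_list), reversed(alphas)):
            beta = rho * dot(y, r)
            r = [ri + (a - beta) * si for ri, si in zip(r, s)]
        # --- unit step along the quasi-Newton direction ---
        x_new = [xi - ri for xi, ri in zip(x, r)]
        g_new = grad(x_new)
        s = [a1 - b1 for a1, b1 in zip(x_new, x)]
        y = [a1 - b1 for a1, b1 in zip(g_new, g)]
        if dot(y, s) > 1e-10:  # keep the pair only if curvature is positive
            s_list.append(s)
            y_list.append(y)
            if len(s_list) > m:  # retain only the newest m pairs
                s_list.pop(0)
                y_list.pop(0)
        x, g = x_new, g_new
    return x
```

On a simple quadratic such as f(x) = (x0-3)^2 + (x1+1)^2, the scaled initial Hessian makes the second iterate land on the exact minimizer (3, -1).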
The classification of the subclass names of material data accurately analyzes the problems present in the data, such as inconsistent letter case, full-width/half-width characters, connectors, non-uniform units, and similar pronunciations; applies a reasoned data preprocessing process to normalize and standardize the data; converts the data into feature-vector form; and classifies it using logistic regression with L2 regularization, optimized by L-BFGS. The above description covers only preferred embodiments of the present invention, but the scope of the invention is not limited thereto; equivalent alternatives or modifications of the technical solutions and inventive concept of the present invention that would readily occur to any person skilled in the art shall fall within the protection scope of the invention.

Claims (7)

1. A method for classifying the subclass names corresponding to long descriptions of material data, comprising the following steps:
S1: raw material data: reading in the original material data;
S2: data preprocessing: preprocessing the read-in original material data and standardizing it;
S3: class-to-number: encoding the category column of the original material data into numbers;
S4: dividing the sample set: dividing the sample set into a training set and a test set;
S5: feature vectorization: converting the long description of the material into feature-vector form;
S6: classification: learning an objective function that maps each feature set to a predefined class label;
S7: evaluating the classification result: the classification result is evaluated by classification metrics.
2. The method for classifying the subclass names corresponding to long descriptions of material data according to claim 1, wherein said S2 comprises the following steps:
S21: unifying the units and connectors of the original material data;
S22: removing brackets and slashes;
S23: converting text to pinyin after Chinese word segmentation;
S24: converting upper case to lower case and full-width characters to half-width.
3. The method for classifying the subclass names corresponding to long descriptions of material data according to claim 1, wherein the original material data in S3 are the long description of the material data and the subclass name.
4. The method for classifying the subclass names corresponding to long descriptions of material data according to claim 1, wherein in S4 the sample set is divided so that the ratio of training-set samples to test-set samples is 7:3.
5. The method for classifying the subclass names corresponding to long descriptions of material data according to claim 1, wherein the long description of the material in S5 is material text data and the feature vectorization method is the tf-idf algorithm.
6. The method for classifying the subclass names corresponding to long descriptions of material data according to claim 1, wherein the classification methods in S6 include logistic regression, naive Bayes, decision trees, support vector machines, K-nearest neighbors, random forests, GBDT, XGBoost, neural networks, and the like.
7. The method for classifying the subclass names corresponding to long descriptions of material data according to claim 1, wherein the metrics for evaluating the classification result in S7 include accuracy, recall, and F1 value.
CN201910877234.7A 2019-09-17 2019-09-17 Classification method for subclass names corresponding to long description of material data Pending CN110619363A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910877234.7A CN110619363A (en) 2019-09-17 2019-09-17 Classification method for subclass names corresponding to long description of material data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910877234.7A CN110619363A (en) 2019-09-17 2019-09-17 Classification method for subclass names corresponding to long description of material data

Publications (1)

Publication Number Publication Date
CN110619363A true CN110619363A (en) 2019-12-27

Family

ID=68923609

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910877234.7A Pending CN110619363A (en) 2019-09-17 2019-09-17 Classification method for subclass names corresponding to long description of material data

Country Status (1)

Country Link
CN (1) CN110619363A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11790033B2 (en) 2020-09-16 2023-10-17 International Business Machines Corporation Accelerated Quasi-Newton methods on analog crossbar hardware

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491390A (en) * 2018-03-28 2018-09-04 江苏满运软件科技有限公司 A kind of main line logistics goods title automatic recognition classification method
CN108777674A (en) * 2018-04-24 2018-11-09 东南大学 A kind of detection method for phishing site based on multi-feature fusion
CN109165294A (en) * 2018-08-21 2019-01-08 安徽讯飞智能科技有限公司 Short text classification method based on Bayesian classification
CN109271517A (en) * 2018-09-29 2019-01-25 东北大学 IG TF-IDF Text eigenvector generates and file classification method
CN109308485A (en) * 2018-08-02 2019-02-05 中国矿业大学 A kind of migration sparse coding image classification method adapted to based on dictionary domain


Similar Documents

Publication Publication Date Title
US10685044B2 (en) Identification and management system for log entries
CN107577785B (en) Hierarchical multi-label classification method suitable for legal identification
CN108320171B (en) Hot-sold commodity prediction method, system and device
WO2020199591A1 (en) Text categorization model training method, apparatus, computer device, and storage medium
US9646262B2 (en) Data intelligence using machine learning
JP2020115346A (en) AI driven transaction management system
Park et al. Explainability of machine learning models for bankruptcy prediction
CN107622326B (en) User classification and available resource prediction method, device and equipment
US20170024662A1 (en) Data driven classification and troubleshooting system and method
CN112527970B (en) Data dictionary standardization processing method, device, equipment and storage medium
US20230334119A1 (en) Systems and techniques to monitor text data quality
Abakarim et al. Towards an efficient real-time approach to loan credit approval using deep learning
CN110619363A (en) Classification method for subclass names corresponding to long description of material data
CN116245107B (en) Electric power audit text entity identification method, device, equipment and storage medium
CN117290404A (en) Method and system for rapidly searching and practical main distribution network fault processing method
Sana et al. Data transformation based optimized customer churn prediction model for the telecommunication industry
Zhang et al. Can sentiment analysis help mimic decision-making process of loan granting? A novel credit risk evaluation approach using GMKL model
GUMUS et al. Stock market prediction by combining stock price information and sentiment analysis
CN114443840A (en) Text classification method, device and equipment
Anastasopoulos et al. Computational text analysis for public management research: An annotated application to county budgets
CN112100370B (en) Picture-trial expert combination recommendation method based on text volume and similarity algorithm
Hepburn et al. Proper losses for learning with example-dependent costs
AU2020104034A4 (en) IML-Cloud Data Performance: Cloud Data Performance Improved using Machine Learning.
CN116932487B (en) Quantized data analysis method and system based on data paragraph division
US11816427B1 (en) Automated data classification error correction through spatial analysis using machine learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20191227)