CN117291708A - Enterprise credit assessment method, system, equipment and medium - Google Patents


Info

Publication number
CN117291708A
Authority
CN
China
Prior art keywords
enterprise
key features
classified
text
frequent pattern
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311237487.0A
Other languages
Chinese (zh)
Inventor
帅勇
郁笑雯
苏芯颐
钟雨言臻
王含含
周粤川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Saibao Industrial Technology Research Institute Co ltd
Original Assignee
Chongqing Saibao Industrial Technology Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Saibao Industrial Technology Research Institute Co ltd filed Critical Chongqing Saibao Industrial Technology Research Institute Co ltd
Priority to CN202311237487.0A priority Critical patent/CN117291708A/en
Publication of CN117291708A publication Critical patent/CN117291708A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03Credit; Loans; Processing thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/01Probabilistic graphical models, e.g. probabilistic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0609Buyer or seller confidence or verification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Technology Law (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides an enterprise credit assessment method, system, equipment and medium, wherein the method comprises the following steps: acquiring a text to be classified corresponding to enterprise information; establishing a frequent pattern tree according to the occurrence frequency of the key features corresponding to the text to be classified, wherein the frequent pattern tree is used for representing the association relation among the key features; performing level quantization on the corresponding key features according to adjectives associated with the key features in the frequent pattern tree in the sample to be classified to obtain a weight set of the key features, wherein the adjectives are word groups representing degrees; and determining a target feature set according to the frequent pattern tree and the weight set, so as to classify the credit based on the target feature set and obtain a credit evaluation result of the corresponding enterprise. The method and the device can effectively improve the efficiency and accuracy of enterprise credit assessment.

Description

Enterprise credit assessment method, system, equipment and medium
Technical Field
The invention relates to the field of big data application, in particular to an enterprise credit assessment method, an enterprise credit assessment system, enterprise credit assessment equipment and an enterprise credit assessment medium.
Background
When industrial enterprises apply for financial loans, the current credit system relies mainly on the enterprises' financial information and the personal credit of their senior executives, and neglects their production capacity and innovation capacity. This causes serious information asymmetry between credit institutions and enterprises and makes it difficult for small and medium-sized enterprises to obtain loans. Lending institutions are reluctant to provide large loans to small and medium-sized enterprises because they are concerned about the repayment capacity of these enterprises and about false data that the enterprises may provide. These conditions hinder the relief of employment pressure and the optimization of the economic structure, and also affect social stability.
Meanwhile, with the wide application of intelligent manufacturing and industrial internet technology in industrial enterprises, a large number of industrial enterprises have deployed information systems such as a Manufacturing Execution System (MES), Enterprise Resource Planning (ERP) and a Customer Relationship Management System (CRMS), and have built industrial internet cloud platforms, digital workshops and intelligent factories, thereby generating a large amount of production, manufacturing and management process data. Compared with traditional financial data, such industrial data not only reflects the actual operating condition of an enterprise, but is also large in volume and difficult to falsify. These industrial data can therefore be used not only to assist the production management and decision-making of the enterprise, but also to truly reflect its credit level. Incorporating these data into the feature categories used to construct the enterprise credit system can effectively prevent financial risks, improve the quality of enterprise loans, and increase the net income of lending institutions.
The current industrial enterprise credit evaluation system covers only enterprise financial and asset data and senior management data. It does not include data such as the production, operation and logistics of enterprises, and is not fully suitable for credit rating of industrial enterprises in the industrial internet era. When some enterprise credit evaluation models use machine learning and artificial intelligence models to evaluate the enterprise credit level, the feature selection does not meet actual needs, so the machine learning models are insensitive to the data and their classification results suffer from low classification precision, poor robustness and poor interpretability.
Disclosure of Invention
In view of the problems existing in the prior art, the invention provides an enterprise credit assessment method, an enterprise credit assessment system, enterprise credit assessment equipment and an enterprise credit assessment medium, which mainly solve the problems of low enterprise credit classification precision and poor interpretability in the prior art.
In order to achieve the above and other objects, the present invention adopts the following technical scheme.
The application provides an enterprise credit assessment method, which comprises the following steps: acquiring a text to be classified corresponding to enterprise information; establishing a frequent pattern tree according to the occurrence frequency of the key features corresponding to the text to be classified, wherein the frequent pattern tree is used for representing the association relation among the key features; performing level quantization on the corresponding key features according to adjectives associated with the key features in the frequent pattern tree in the sample to be classified to obtain a weight set of the key features, wherein the adjectives are word groups representing degrees; and determining a target feature set according to the frequent pattern tree and the weight set, so as to classify the credit based on the target feature set and obtain a credit evaluation result of the corresponding enterprise.
In an embodiment of the present application, obtaining text to be classified corresponding to enterprise information includes: collecting enterprise information, wherein the enterprise information comprises credit investigation data of a target object in an enterprise, enterprise asset data and enterprise online operation data; and converting the enterprise information into the text to be classified, wherein the text to be classified is a plurality of texts.
In an embodiment of the present application, before establishing the frequent pattern tree according to the occurrence frequency of the key feature corresponding to the text to be classified, the method further includes: taking the words matched with the preset standard words in each text to be classified as target words of the corresponding text to be classified; converting the target word into a feature representation to form a vector space model composed of a plurality of feature representations; and extracting features of the vector space model to obtain a feature set so as to determine the key features based on the feature set.
In an embodiment of the present application, determining the key feature based on the feature set further includes: establishing a word frequency matrix according to the feature set; and performing dimension reduction on the word frequency matrix through singular value decomposition to obtain a target vector set, wherein the target vector set consists of a plurality of key features.
In an embodiment of the present application, building a frequent pattern tree according to occurrence frequencies of key features corresponding to the text to be classified includes: sorting the key features according to their occurrence frequency in the corresponding text to be classified to generate an item header table; and reordering all transaction items in each text to be classified according to the order of the item header table, removing the transaction items whose support is smaller than a preset threshold value, and obtaining a new data set so as to construct a frequent pattern tree based on the new data set.
In an embodiment of the present application, performing level quantization on the corresponding key feature according to the adjective associated with each key feature in the frequent pattern tree in the sample to be classified includes: converting the occurrence frequency of key features in the frequent pattern tree into weights; when the key features have associated adjectives representing degrees, matching preset weights according to the corresponding adjectives so as to correct the weights; and determining a weight set according to the weights of the key features.
In an embodiment of the present application, determining a target feature set according to the frequent pattern tree and the weight set includes: the key features are used as attributes, and fuzzy processing is carried out on the weight set based on a Bayesian network; and determining the association probability among the key features according to the weight set after the blurring processing so as to determine the target feature set based on the association probability.
The application also provides an enterprise credit assessment system, comprising: a data acquisition module, used for acquiring texts to be classified corresponding to the enterprise information; an association module, used for establishing a frequent pattern tree according to the occurrence frequency of the key features corresponding to the text to be classified, wherein the frequent pattern tree is used for representing association relations among the key features; a weight quantization module, used for performing level quantization on the corresponding key features according to adjectives associated with the key features in the frequent pattern tree in the sample to be classified to obtain a weight set of the key features, wherein the adjectives are word groups representing degrees; and a credit classification module, used for determining a target feature set according to the frequent pattern tree and the weight set, so as to classify the credit based on the target feature set and obtain a credit evaluation result of the corresponding enterprise.
The present application also provides an apparatus comprising: one or more processors; and one or more machine readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform the enterprise credit assessment method.
The present application also provides one or more machine-readable media having instructions stored thereon that, when executed by one or more processors, cause an apparatus to perform the enterprise credit assessment method.
As described above, the enterprise credit assessment method, system, equipment and medium provided by the invention have the following beneficial effects.
According to the method and the device, the characteristics of the text to be classified are screened through the frequent pattern tree and the weight set, so that the dimension of massive high-dimension data can be reduced, the complexity of classification is further reduced, the cost of extracting unnecessary characteristics is saved, the influence of noise on classification results is reduced, and the accuracy of classification is improved.
Drawings
FIG. 1 is a flow chart of an enterprise credit evaluation method according to an embodiment of the present application.
FIG. 2 is a schematic diagram of the construction process of a frequent pattern tree in an embodiment of the present application.
FIG. 3 is a block diagram of an enterprise credit assessment system in an embodiment of the present application.
Fig. 4 is a schematic structural diagram of an apparatus according to an embodiment of the present application.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes the embodiments of the present invention with reference to specific examples. The invention may also be practiced or applied in other, different embodiments, and the details in this description may be modified or varied without departing from the spirit and scope of the present invention. It should be noted that the following embodiments and the features in the embodiments may be combined with each other without conflict.
It should be noted that the illustrations provided in the following embodiments merely illustrate the basic concept of the present invention by way of illustration, and only the components related to the present invention are shown in the drawings and are not drawn according to the number, shape and size of the components in actual implementation, and the form, number and proportion of the components in actual implementation may be arbitrarily changed, and the layout of the components may be more complicated.
Referring to fig. 1, fig. 1 is a flow chart of an enterprise credit evaluation method according to an embodiment of the disclosure. The enterprise credit assessment method provided by the embodiment of the application comprises the following steps:
Step S100, obtaining a text to be classified corresponding to the enterprise information.
In an embodiment, obtaining text to be classified corresponding to enterprise information includes: collecting enterprise information, wherein the enterprise information comprises credit investigation data of a target object in an enterprise, enterprise asset data and enterprise online operation data; and converting the enterprise information into the text to be classified, wherein the text to be classified is a plurality of texts.
Specifically, the information management systems of the enterprise can be connected to acquire data such as the production, operation and logistics of the enterprise as the enterprise online operation data. At the same time, credit investigation data of senior management, such as the legal representative, the chairman of the board and the general manager of the enterprise, is retrieved. Together these form the enterprise information used for enterprise credit assessment. The enterprise information can be dynamically updated as required, and is converted into texts to be classified and stored in a preset database for later retrieval.
Step S110, a frequent pattern tree is established according to the occurrence frequency of the key features corresponding to the text to be classified, wherein the frequent pattern tree is used for representing the association relation among the key features.
In an embodiment, before the frequent pattern tree is built according to the occurrence frequency of the key feature corresponding to the text to be classified, the method further includes: taking the words matched with the preset standard words in each text to be classified as target words of the corresponding text to be classified; converting the target word into a feature representation to form a vector space model composed of a plurality of feature representations; and extracting features of the vector space model to obtain a feature set so as to determine the key features based on the feature set.
Specifically, after the plurality of texts to be classified are obtained, each text to be classified may be preprocessed by using a reverse maximum matching method (Reverse Maximum Method, RM method). The dictionary may be constructed in advance from the standard words required for credit evaluation. The method takes a string of 6 to 8 Chinese characters as the maximum string and matches it against the entries in the dictionary. If no entry matches, one Chinese character is cut off and matching continues until a corresponding word is found in the dictionary; the matching direction is from left to right. In this way the target words of each text to be classified are obtained. Each text to be classified is then regarded as a document, and the text features corresponding to its target words are represented as metadata of the document, yielding the vector space models corresponding to the plurality of texts to be classified. Text feature representation refers to the metadata of a text, which is divided into descriptive features (such as the name, date, size and type of the text) and semantic features (such as the author, organization, title and content of the text). A feature representation uses feature items to represent a document, so that only these feature items need to be processed during text mining, which makes it possible to process unstructured text. The vector space model (Vector Space Model, VSM) is one of the more effective representation methods. In this model, the document space is treated as a vector space spanned by a set of orthogonal term vectors, and each document d is represented as a normalized feature vector:
$\gamma(d) = (t_1, w_1(d); \ldots; t_i, w_i(d); \ldots; t_n, w_n(d))$
where $t_i$ is a term and $w_i(d)$ is the weight of $t_i$ in $d$. All words occurring in $d$ can be taken as the $t_i$, or the $t_i$ may be required to be all the phrases that appear in $d$, which improves the accuracy with which the content features are represented.
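The segmentation and vector representation described above can be illustrated with a minimal sketch. It assumes the common textbook form of reverse maximum matching, which scans from the end of the text and drops the leftmost character on a miss, and it uses relative term frequency as the weight $w_i(d)$. The function names, the toy dictionary and the choice of weight are hypothetical and are not taken from the application.

```python
from collections import Counter

def reverse_maximum_match(text, dictionary, max_len=8):
    """Segment text by scanning from its end, trying the longest window first."""
    words = []
    end = len(text)
    while end > 0:
        start = max(0, end - max_len)
        # Drop the leftmost character until a dictionary word
        # (or a single character) remains in the window.
        while start < end - 1 and text[start:end] not in dictionary:
            start += 1
        words.append(text[start:end])
        end = start
    words.reverse()
    return words

def term_vector(words, vocabulary):
    """Relative frequency of each vocabulary term (assumed weighting)."""
    counts = Counter(words)
    total = sum(counts[t] for t in vocabulary) or 1
    return [counts[t] / total for t in vocabulary]

# Hypothetical dictionary of standard words for credit evaluation.
DICTIONARY = {"企业", "信用", "评估", "信用评估"}
tokens = reverse_maximum_match("企业信用评估", DICTIONARY)
target_words = [w for w in tokens if w in DICTIONARY]
vector = term_vector(target_words, ["企业", "信用评估"])
```

Because the longest window is tried first, the four-character entry is preferred over the two shorter entries it contains.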
The feature vectors obtained from the vector space model often have hundreds of thousands of dimensions. Not all of these features are important or useful for the classification task, and such high dimensionality greatly prolongs machine learning time; reducing it is the job of feature extraction. A feature extraction algorithm scores each feature with an evaluation function, ranks the features by score, and selects a predetermined number of the highest-scoring features. Evaluation functions commonly used in text processing are information gain, expected cross entropy, mutual information, text evidence weight and word frequency. Feature extraction may be performed using the text evidence weight, which measures the difference between the probability of a class and the conditional probability of the class given the feature, and which performed better than expected cross entropy in experiments. The text evidence weight evaluation function is as follows:
wherein $C_i$ denotes the feature representation of the target word and $W$ denotes the weight of the corresponding target word.
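As an illustration of this kind of evaluation function, the sketch below scores a term with the commonly cited weight-of-evidence-for-text definition, $P(w)\sum_i P(c_i)\,\left|\log\frac{\mathrm{odds}(c_i \mid w)}{\mathrm{odds}(c_i)}\right|$, where $w$ is a term and $c_i$ a class. This definition, the probability clipping and the toy documents are assumptions, and the symbols here may not correspond to those of the application's own formula.

```python
import math

def odds(p, eps=1e-6):
    """Odds p / (1 - p), with p clipped away from 0 and 1 (assumed smoothing)."""
    p = min(max(p, eps), 1 - eps)
    return p / (1 - p)

def weight_of_evidence(docs, word):
    """docs is a list of (set_of_words, class_label) pairs."""
    n = len(docs)
    labels_with_word = [label for words, label in docs if word in words]
    if not labels_with_word:
        return 0.0
    p_word = len(labels_with_word) / n
    score = 0.0
    for c in {label for _, label in docs}:
        p_c = sum(1 for _, label in docs if label == c) / n
        p_c_given_word = labels_with_word.count(c) / len(labels_with_word)
        score += p_c * abs(math.log(odds(p_c_given_word) / odds(p_c)))
    return p_word * score

# Hypothetical labelled documents.
docs = [
    ({"overdue", "loan"}, "bad"),
    ({"overdue"}, "bad"),
    ({"loan", "profit"}, "good"),
    ({"profit"}, "good"),
]
scores = {w: weight_of_evidence(docs, w) for w in ("overdue", "loan")}
```

A term spread evenly over the classes, such as "loan" here, scores zero, while a term confined to one class scores higher and would be retained.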
In an embodiment, determining the key feature based on the feature set further comprises: establishing a word frequency matrix according to the feature set; and performing dimension reduction on the word frequency matrix through singular value decomposition to obtain a target vector set, wherein the target vector set consists of a plurality of key features.
Specifically, the dimension of the feature set obtained in the previous step can be reduced. Through the latent semantic indexing method, the singular value decomposition (Singular Value Decomposition, SVD) technique from matrix theory is used to convert the word frequency matrix into a k×k singular value matrix. The basic steps are as follows:
S111, establishing a word frequency matrix.
S112, performing singular value decomposition on the word frequency matrix, decomposing it into 3 matrices U, S and V. U and V are orthogonal matrices ($U^{T}U = V^{T}V = I$), and S is a diagonal matrix of singular values (k×k).
S113, for each document d, replacing the original vector with a new vector that excludes the words eliminated in the SVD.
S114, storing all the vector sets, and creating indexes for them using an advanced multidimensional indexing technique.
S115, performing similarity calculation using the converted document vectors.
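Steps S111 to S115 can be sketched with NumPy as follows. The toy term-by-document matrix, the choice k = 2 and the helper names are assumptions made for illustration; the indexing step S114 is omitted.

```python
import numpy as np

def lsi_reduce(term_doc, k):
    """Truncated SVD of a term-by-document frequency matrix."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k, S_k, Vt_k = U[:, :k], np.diag(s[:k]), Vt[:k, :]
    # Each row is one document expressed in the k-dimensional latent space.
    doc_vectors = (S_k @ Vt_k).T
    return U_k, S_k, doc_vectors

def cosine(a, b):
    """Cosine similarity between two reduced document vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rows are terms, columns are documents (hypothetical counts).
term_doc = np.array([
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
])
U_k, S_k, doc_vectors = lsi_reduce(term_doc, k=2)
sim_01 = cosine(doc_vectors[0], doc_vectors[1])
sim_02 = cosine(doc_vectors[0], doc_vectors[2])
```

In this toy matrix the first two documents share their terms and collapse onto the same latent direction, while the third remains orthogonal to them.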
In an embodiment, building a frequent pattern tree according to the occurrence frequency of the key features corresponding to the text to be classified includes: sorting the key features according to their occurrence frequency in the corresponding text to be classified to generate an item header table; and reordering all transaction items in each text to be classified according to the order of the item header table, removing the transaction items whose support is smaller than a preset threshold value, and obtaining a new data set so as to construct a frequent pattern tree based on the new data set.
Referring to fig. 2, fig. 2 is a schematic diagram illustrating the construction process of a frequent pattern tree in an embodiment of the present application. Specifically, the frequent pattern tree may be constructed using the FP-Growth association rule algorithm. The steps of building the frequent pattern tree are as follows:
(1) Set a minimum support threshold, find the frequent items in the processed text data, and sort them by support to form a list L, where $L = \{(x_i, m), (x_j, p), \ldots, (x_k, q)\}$, i, j, k, m, p and q are arbitrary positive integers, and $m \ge p \ge q$.
(2) Starting from the item with the lowest priority, read all transaction data containing that item and construct a conditional frequent tree for the item.
(3) Prune the conditional frequent tree according to the minimum support, deleting the nodes whose support is smaller than the minimum support.
(4) Extract the frequent sets from the pruned conditional frequent tree to obtain all the frequent sets containing the item.
(5) Select the next item in reverse order of priority and repeat (2) to (4) to find all the frequent sets containing that item.
(6) The algorithm ends once the frequent sets containing the item with the highest priority have been found.
The construction process of the frequent pattern tree is shown in fig. 2.
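The tree-building part of the procedure (item header table, reordering of transactions, removal of infrequent items, prefix sharing) can be sketched as follows. The class and function names, the tie-breaking rule and the toy transactions are assumptions, and the mining of conditional trees in steps (2) to (6) is not shown.

```python
from collections import Counter

class FPNode:
    def __init__(self, item, parent=None):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}

def build_fp_tree(transactions, min_support):
    # Pass 1: count the support of each item and keep only frequent items.
    support = Counter(item for t in transactions for item in set(t))
    header = {i: c for i, c in support.items() if c >= min_support}
    # Item header table: descending support, ties broken by name (assumed).
    order = sorted(header, key=lambda i: (-header[i], i))
    rank = {item: r for r, item in enumerate(order)}
    # Pass 2: insert each filtered, reordered transaction into the tree.
    root = FPNode(None)
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank), key=rank.get):
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root, order

# Hypothetical key features extracted from four texts to be classified.
transactions = [
    ["revenue", "output", "patent"],
    ["revenue", "output"],
    ["revenue", "debt"],
    ["output", "patent"],
]
root, order = build_fp_tree(transactions, min_support=2)
```

Transactions that share a prefix in header-table order share a path from the root, so the node counts record how often key features occur together.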
Step S120, performing level quantization on the corresponding key features according to adjectives associated with each key feature in the frequent pattern tree in the sample to be classified, so as to obtain a weight set of each key feature, where the adjectives are phrases representing the degree.
In an embodiment, the step of performing rank quantization on the corresponding key feature according to the adjective associated with each key feature in the frequent pattern tree in the sample to be classified includes: converting the occurrence frequency of key features in the frequent pattern tree into weights; when the key features have associated adjectives representing degrees, matching preset weights according to the corresponding adjectives so as to correct the weights; and determining a weight set according to the weights of the key features.
Specifically, as shown in table 1, the key factors in the frequent set are converted into weights according to how often their keywords occur. For the corresponding adjectives, the idea of avoiding "false rejection" and "false acceptance" in sampling distributions is adopted: adjectives such as "high" and "strong" appearing in the text are grouped into one level, for which the probability that the attribute is close to 1 is considered very high, with a value range of (0.9, 1); adjectives such as "low" and "weak" are grouped into another level, for which the probability that the attribute is close to 1 is considered very low, with a value range of (0, 0.1). For ease of calculation, the midpoint of each interval is used, namely 0.95 and 0.05. The sample quantization levels for each key factor are shown in table 1.
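A minimal sketch of this level quantization is given below. The two level values 0.95 and 0.05 come from the description; taking relative frequency as the base weight and applying the level as a multiplicative correction are assumptions, as are the names and the sample data.

```python
# Level values from the description; the word lists are illustrative only.
DEGREE_LEVELS = {"high": 0.95, "strong": 0.95, "low": 0.05, "weak": 0.05}

def quantize_weights(frequencies, adjectives):
    """Convert occurrence frequencies to weights and correct them by degree."""
    total = sum(frequencies.values())
    weights = {}
    for feature, freq in frequencies.items():
        weight = freq / total  # assumption: relative frequency as base weight
        adjective = adjectives.get(feature)
        if adjective in DEGREE_LEVELS:
            # Assumption: the preset level scales the base weight.
            weight *= DEGREE_LEVELS[adjective]
        weights[feature] = weight
    return weights

frequencies = {"output": 3, "revenue": 3, "patent": 2}
adjectives = {"output": "high", "patent": "low"}
weight_set = quantize_weights(frequencies, adjectives)
```

Features without an associated degree adjective keep their frequency-based weight unchanged.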
And step S130, determining a target feature set according to the frequent pattern tree and the weight set, so as to classify the credit based on the target feature set, and obtaining a credit evaluation result of the corresponding enterprise.
In an embodiment, determining a target feature set from the frequent pattern tree and the weight set comprises:
the key features are used as attributes, and fuzzy processing is carried out on the weight set based on a Bayesian network;
and determining the association probability among the key features according to the weight set after the blurring processing so as to determine the target feature set based on the association probability.
Specifically, modeling may be performed with a fuzzy Bayesian network. First, the structure of the fuzzy Bayesian network is constructed, with the aforementioned key features in the frequent set serving as attribute parameters. The attribute parameters of the fuzzy Bayesian network need to satisfy the following two assumptions:
(1) Global parameter independence. Given the network structure B, the following holds: [formula omitted in the source text]
(2) Local parameter independence. Given the network structure B, the following holds: [formula omitted in the source text]
An appropriate membership function is then defined for each Bayesian network node to fuzzify it, yielding the fuzzified weights and values of all the attributes.
Let the Bayesian network have n nodes $X_1, X_2, \ldots, X_n$, and let $x_1, x_2, \ldots, x_n$ denote the values of $X_1, X_2, \ldots, X_n$ respectively. Since the attribute values cannot be described accurately, an appropriate membership function is defined for each $X_i$, where $1 \le i \le n$, to fuzzify it, satisfying the following condition:
$A: X \to \{0, 1\}$
Then, for random attributes, fuzzification can be based on the mapping
$x \mapsto \mu_A(x)$
In this "fuzzification" of the attribute, the membership of the variable x in A can only be 0 or 1. In the manner described above, the n attribute variables are fuzzified into n discrete fuzzy variables, each of which takes a corresponding fuzzy value [symbols omitted in the source text].
To calculate the fuzzy conditional mutual information, several probability terms need to be learned from the samples [formulas omitted in the source text]; the remaining terms are calculated as in FNBC.
Further, the attribute parameters can be fuzzified using the structure learning approach of FTANC. FTANC is a tree-augmented naive Bayesian network model in which each node has at most two parent nodes. Its structure learning algorithm has the following steps:
Step 1: Fuzzify the attribute variables. If the values of an attribute cannot be described accurately, or if the values are continuous, define an appropriate membership function for the attribute and fuzzify it; attributes with random values are fuzzified as well.
Step 2: Calculate the conditional mutual information between all pairs of attribute variables. Thus, there are: [formula omitted in the source text]
Step 3: Taking the conditional mutual information as the weights of the arcs, construct a maximum-weight undirected tree and then a directed tree over the attribute variables of the key factors.
(1) Sort the edges by weight from high to low.
(2) Select edges for connection in order of weight from high to low; an edge must not be added if it would form a loop with the edges already connected.
(3) The selected edges constitute the maximum-weight undirected tree.
(4) Take one node as the root node and use it as the starting point from which all edges are directed, thereby converting the undirected tree into a directed tree.
Step 4: Add a class node (which is typically related to the attributes of the key factors), and add an arc between the class node and every key factor attribute node.
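The four structure learning steps can be sketched for already-discretized attributes as follows. The sketch uses ordinary (crisp) counts rather than fuzzy memberships, a Kruskal-style edge selection, and toy data; all names and data are assumptions and not the application's FTANC formulas.

```python
import math
from collections import Counter, deque
from itertools import combinations

def conditional_mutual_information(rows, labels, i, j):
    """Estimate I(X_i; X_j | C) from counts of discrete samples."""
    n = len(rows)
    n_c = Counter(labels)
    n_ic = Counter((r[i], c) for r, c in zip(rows, labels))
    n_jc = Counter((r[j], c) for r, c in zip(rows, labels))
    n_ijc = Counter((r[i], r[j], c) for r, c in zip(rows, labels))
    total = 0.0
    for (xi, xj, c), count in n_ijc.items():
        # P(xi, xj | c) / (P(xi | c) * P(xj | c)) written with raw counts.
        ratio = count * n_c[c] / (n_ic[(xi, c)] * n_jc[(xj, c)])
        total += count / n * math.log(ratio)
    return total

def tan_structure(rows, labels, root=0):
    """Return each attribute's attribute parent in the directed tree."""
    n_attrs = len(rows[0])
    edges = sorted(
        ((conditional_mutual_information(rows, labels, i, j), i, j)
         for i, j in combinations(range(n_attrs), 2)),
        reverse=True,
    )
    component = list(range(n_attrs))

    def find(a):
        while component[a] != a:
            a = component[a]
        return a

    # Add edges from heaviest to lightest, skipping any that would close a loop.
    neighbours = {a: [] for a in range(n_attrs)}
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            component[ri] = rj
            neighbours[i].append(j)
            neighbours[j].append(i)
    # Direct every edge away from the chosen root node.
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return parent

# Hypothetical samples: attributes 0 and 1 always agree, attribute 2 is independent.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 2
labels = [0, 0, 0, 0, 1, 1, 1, 1]
cmi_01 = conditional_mutual_information(rows, labels, 0, 1)
parents = tan_structure(rows, labels)
```

In the final network the class node is an additional parent of every attribute, so each attribute has at most two parents.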
(2) Fuzzy Bayesian network parameter establishment
(1) Fuzzy prior probability estimation
Only $P(x_i)$ needs to be learned from the samples; the fuzzy prior probability can then be obtained from its definition.
(2) Fuzzy conditional probability estimation
When an attribute node has only one parent node, the corresponding fuzzy conditional probability must be learned.
When the definition and operations of the membership functions are known, this calculation only requires learning $P(x_i x_j)$ from the samples.
Similarly, parameters need to be normalized before being used for reasoning.
(3) FTANC parameter learning algorithm
The FTANC parameter learning algorithm process is as follows:
Each node of the TAN network has at most two parent nodes, so the fuzzy conditional probability is calculated under two cases.
(a) If the node has one parent node, C, learn the fuzzy conditional probability of the node given C.
(b) If the node has two parent nodes, namely another attribute node and C, learn the fuzzy conditional probability of the node given both parents.
The conditional probability table (CPT) of the fuzzy Bayesian network is then obtained according to the calculation formulas given in the FTANC learning stage. The CPT is the target feature set.
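As a non-limiting illustration, the single-parent case (a) can be sketched in Python using the standard definition of the probability of a fuzzy event, P(A) = sum over x of mu_A(x) * P(x), so that only the crisp probabilities are learned from the samples. The function names, the membership functions passed in, and the per-class normalization are assumptions of this sketch; the application's own formulas are not reproduced here.

```python
from collections import Counter, defaultdict


def crisp_distribution(values):
    """Relative frequency P(x) of each crisp value in a list of samples."""
    counts = Counter(values)
    total = len(values)
    return {v: n / total for v, n in counts.items()}


def fuzzy_probability(p_x, membership):
    """Probability of a fuzzy event: sum over x of mu(x) * P(x)."""
    return sum(membership(x) * p for x, p in p_x.items())


def fuzzy_cpt(values, labels, memberships):
    """Return {class: {fuzzy value: P(fuzzy value | class)}}, normalized per class.

    `memberships` maps a fuzzy value name to its membership function
    (hypothetical names supplied by the caller).
    """
    by_class = defaultdict(list)
    for v, c in zip(values, labels):
        by_class[c].append(v)
    cpt = {}
    for c, vs in by_class.items():
        p_x = crisp_distribution(vs)         # learned from the samples of class c
        raw = {name: fuzzy_probability(p_x, mu) for name, mu in memberships.items()}
        z = sum(raw.values())                # assumed non-zero: some membership fires
        cpt[c] = {name: r / z for name, r in raw.items()}
    return cpt
```

The two-parent case (b) would condition the crisp distribution on both the class and the fuzzy value of the parent attribute in the same way.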
Inference: based on the fuzzy knowledge, X should be assigned to the class with the largest posterior probability.
(7) The generated data are compared with the actual text data to judge or verify the feasibility and effectiveness of the established association relationships.
After the target feature set is obtained, a classification model can be trained based on a support vector machine to classify the target feature set and obtain a credit evaluation result. Because the classification capability of a support vector machine is affected by its kernel function and model parameters, the RBF kernel can be used as the kernel function, and the parameters of the support vector machine can be updated by a Bayesian method to obtain the classification model. The classification model classifies the credit rating of an enterprise to yield the credit evaluation result of that enterprise. The specific model training process can be selected and adjusted according to actual application requirements and is not described in detail here.
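As a non-limiting illustration, the RBF kernel mentioned above has the standard form K(x, y) = exp(-gamma * ||x - y||^2). The Python sketch below only evaluates the kernel and the resulting kernel matrix; training the support vector machine and the Bayesian parameter update are not shown, and the function names and the value of gamma are assumptions of this sketch.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel value for two equally long feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gram_matrix(samples, gamma):
    """Kernel matrix whose entry [i][j] is the RBF kernel of samples i and j."""
    return [[rbf_kernel(a, b, gamma) for b in samples] for a in samples]
```

A larger gamma makes the kernel value fall off faster with distance, which is why gamma is one of the parameters that typically has to be tuned together with the penalty parameter of the support vector machine.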
Based on the technical solution of the embodiments of the present application, high-dimensional data are selected and processed by means of the frequent pattern tree and the feature weights, which effectively reduces the dimensionality of the processed data, lowers modeling complexity, avoids the cost of extracting unnecessary features, and removes noise from the data set; compared with similar models, the method is more robust and interpretable. By introducing enterprise production and operation data into the industrial enterprise credit evaluation system, performing credit evaluation with the screened feature set, and classifying with an improved support vector machine model, a sound enterprise credit rating system can be established, which improves the objectivity and accuracy of the enterprise credit system, helps enterprises obtain more financing, and reduces the financial risk of lending institutions.
In one embodiment, as shown in FIG. 3, an enterprise credit assessment system is provided, comprising: the data acquisition module 30 is configured to acquire text to be classified corresponding to the enterprise information; the association module 31 is configured to establish a frequent pattern tree according to the occurrence frequency of the key features corresponding to the text to be classified, where the frequent pattern tree is used to characterize association relationships between the key features; the weight quantization module 32 is configured to perform level quantization on the corresponding key features according to adjectives associated with each key feature in the frequent pattern tree in the sample to be classified, so as to obtain a weight set of each key feature, where the adjectives are phrases representing degrees; and the credit classification module 33 is configured to determine a target feature set according to the frequent pattern tree and the weight set, so as to classify the credits based on the target feature set, and obtain a credit evaluation result of the corresponding enterprise.
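As a non-limiting illustration, the operation of the weight quantization module can be sketched in Python as follows. The function name, the adjective table, and the choice to correct a weight by multiplying it with a preset factor are assumptions of this sketch; the application only states that preset weights are matched according to the degree adjectives.

```python
def quantize_weights(frequencies, adjectives, preset):
    """Turn key-feature frequencies into weights corrected by degree adjectives.

    frequencies: {feature: occurrence count in the frequent pattern tree}
    adjectives:  {feature: [degree adjectives associated with the feature]}
    preset:      {adjective: correction factor} (hypothetical table)
    """
    total = sum(frequencies.values())
    weights = {}
    for feature, freq in frequencies.items():
        weight = freq / total                      # frequency converted into a base weight
        for adjective in adjectives.get(feature, []):
            weight *= preset.get(adjective, 1.0)   # assumed correction: multiply by factor
        weights[feature] = weight
    return weights
```

A feature without any associated degree adjective keeps its frequency-based weight unchanged.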
The enterprise credit assessment system described above may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 4. A computer device, comprising: memory, a processor, and a computer program stored on the memory and executable on the processor.
The various modules in the enterprise credit assessment system described above may be implemented in whole or in part in software, hardware, and combinations thereof. The above modules can be embedded in the memory of the terminal in a hardware form or independent of the terminal, and can also be stored in the memory of the terminal in a software form, so that the processor can call and execute the operations corresponding to the above modules. The processor may be a Central Processing Unit (CPU), microprocessor, single-chip microcomputer, etc.
As shown in fig. 4, a schematic diagram of the internal structure of the computer device in one embodiment is shown. There is provided a computer device comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of: acquiring a text to be classified corresponding to enterprise information; establishing a frequent pattern tree according to the occurrence frequency of the key features corresponding to the text to be classified, wherein the frequent pattern tree is used for representing the association relation among the key features; performing level quantization on the corresponding key features according to adjectives associated with the key features in the frequent pattern tree in the sample to be classified to obtain a weight set of the key features, wherein the adjectives are word groups representing degrees; and determining a target feature set according to the frequent pattern tree and the weight set, so as to classify the credit based on the target feature set and obtain a credit evaluation result of the corresponding enterprise.
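As a non-limiting illustration, the step of establishing the frequent pattern tree can be sketched in Python as follows, treating the key features of each text to be classified as one transaction. The class and function names, the tie-breaking by feature name, and the use of an absolute count as the support threshold are assumptions of this sketch.

```python
from collections import Counter


class FPNode:
    """One node of the frequent pattern tree."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Return (root, header), where header lists frequent items by descending support."""
    support = Counter(item for t in transactions for item in set(t))
    # Item header table: keep items meeting the threshold, most frequent first.
    header = sorted((i for i, n in support.items() if n >= min_support),
                    key=lambda i: (-support[i], i))
    rank = {item: r for r, item in enumerate(header)}
    root = FPNode(None)
    for t in transactions:
        # Drop infrequent items and reorder the rest by the header table order.
        ordered = sorted((i for i in set(t) if i in rank), key=rank.get)
        node = root
        for item in ordered:
            node = node.children.setdefault(item, FPNode(item, node))
            node.count += 1                        # shared prefixes accumulate counts
    return root, header
```

Texts that share their most frequent key features share a path from the root, so the counts along a path reflect how often those features occur together.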
In one embodiment, the computer device may be used as a server, including but not limited to a stand-alone physical server, or a server cluster formed by a plurality of physical servers, and may also be used as a terminal, including but not limited to a mobile phone, a tablet computer, a personal digital assistant, a smart device, or the like. As shown in FIG. 4, the computer device includes a processor, a non-volatile storage medium, an internal memory, a display screen, and a network interface connected by a system bus.
Wherein the processor of the computer device is configured to provide computing and control capabilities to support the operation of the entire computer device. The non-volatile storage medium of the computer device stores an operating system and a computer program. The computer program is executable by a processor for implementing an enterprise credit assessment method provided by the above embodiments. Internal memory in a computer device provides a cached operating environment for an operating system and computer programs in a non-volatile storage medium. The display interface can display data through the display screen. The display screen may be a touch screen, such as a capacitive screen or an electronic screen, and the corresponding instruction may be generated by receiving a click operation on a control displayed on the touch screen.
It will be appreciated by those skilled in the art that the architecture of the computer device illustrated in fig. 4 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than those illustrated, or may combine some components, or have a different arrangement of components.
In one embodiment, a computer readable storage medium is provided having stored thereon a computer program which when executed by a processor performs the steps of: acquiring a text to be classified corresponding to enterprise information; establishing a frequent pattern tree according to the occurrence frequency of the key features corresponding to the text to be classified, wherein the frequent pattern tree is used for representing the association relation among the key features; performing level quantization on the corresponding key features according to adjectives associated with the key features in the frequent pattern tree in the sample to be classified to obtain a weight set of the key features, wherein the adjectives are word groups representing degrees; and determining a target feature set according to the frequent pattern tree and the weight set, so as to classify the credit based on the target feature set and obtain a credit evaluation result of the corresponding enterprise.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program for instructing relevant hardware, where the program may be stored in a non-volatile computer readable storage medium, and where the program, when executed, may include processes in the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), or the like.
The above embodiments merely illustrate the principles of the present invention and its effects, and are not intended to limit the invention. Those skilled in the art may modify or vary the above embodiments without departing from the spirit and scope of the invention. Accordingly, all equivalent modifications and variations made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall be covered by the claims of the present invention.

Claims (10)

1. A method for evaluating credit of an enterprise, comprising:
acquiring a text to be classified corresponding to enterprise information;
establishing a frequent pattern tree according to the occurrence frequency of the key features corresponding to the text to be classified, wherein the frequent pattern tree is used for representing the association relation among the key features;
performing level quantization on the corresponding key features according to adjectives associated with the key features in the frequent pattern tree in the sample to be classified to obtain a weight set of the key features, wherein the adjectives are word groups representing degrees;
and determining a target feature set according to the frequent pattern tree and the weight set, so as to classify the credit based on the target feature set and obtain a credit evaluation result of the corresponding enterprise.
2. The method for evaluating credit of an enterprise according to claim 1, wherein obtaining text to be classified corresponding to enterprise information comprises:
collecting enterprise information, wherein the enterprise information comprises credit investigation data of a target object in an enterprise, enterprise asset data and enterprise online operation data;
and converting the enterprise information into the text to be classified, wherein the text to be classified is a plurality of texts.
3. The method for evaluating credit of enterprises according to claim 2, wherein before establishing the frequent pattern tree according to the occurrence frequency of the key features corresponding to the text to be classified, further comprises:
taking the words matched with the preset standard words in each text to be classified as target words of the corresponding text to be classified;
converting the target word into a feature representation to form a vector space model composed of a plurality of feature representations;
and extracting features of the vector space model to obtain a feature set so as to determine the key features based on the feature set.
4. The enterprise credit assessment method of claim 3, wherein determining the key features based on the feature set further comprises:
establishing a word frequency matrix according to the feature set;
and performing dimension reduction on the word frequency matrix through singular value decomposition to obtain a target vector set, wherein the target vector set consists of a plurality of key features.
5. The enterprise credit assessment method according to claim 1, wherein building a frequent pattern tree according to the occurrence frequency of key features corresponding to the text to be classified comprises:
sorting the key features according to their occurrence frequency in the corresponding text to be classified to generate an item header table;
and reordering all transaction items in each text to be classified according to the sequence of the item header table, removing the transaction items with the support degree smaller than a preset threshold value, and obtaining a new data set so as to construct a frequent pattern tree based on the new data set.
6. The enterprise credit assessment method of claim 1, wherein performing level quantization on the corresponding key features according to adjectives associated with each key feature in the frequent pattern tree in the sample to be classified comprises:
converting the occurrence frequency of key features in the frequent pattern tree into weights;
when the key features have associated adjectives representing degrees, matching preset weights according to the corresponding adjectives so as to correct the weights;
and determining a weight set according to the weights of the key features.
7. The enterprise credit assessment method of claim 6, wherein determining a target feature set from the frequent pattern tree and the weight set comprises:
the key features are used as attributes, and fuzzy processing is carried out on the weight set based on a Bayesian network;
and determining the association probability among the key features according to the weight set after the blurring processing so as to determine the target feature set based on the association probability.
8. An enterprise credit assessment system, comprising:
the data acquisition module is used for acquiring texts to be classified corresponding to the enterprise information;
the association module is used for establishing a frequent pattern tree according to the occurrence frequency of the key features corresponding to the text to be classified, wherein the frequent pattern tree is used for representing association relations among the key features;
the weight quantization module is used for carrying out level quantization on the corresponding key features according to adjectives associated with the key features in the frequent pattern tree in the sample to be classified to obtain a weight set of the key features, wherein the adjectives are word groups representing the degree;
and the credit classification module is used for determining a target feature set according to the frequent pattern tree and the weight set so as to classify the credit based on the target feature set and obtain a credit evaluation result of a corresponding enterprise.
9. An apparatus, comprising:
one or more processors; and
one or more machine readable media having instructions stored thereon, which when executed by the one or more processors, cause the apparatus to perform the enterprise credit assessment method of any of claims 1-7.
10. One or more machine readable media having instructions stored thereon that, when executed by one or more processors, cause an apparatus to perform the enterprise credit assessment method of any of claims 1-7.
CN202311237487.0A 2023-09-22 2023-09-22 Enterprise credit assessment method, system, equipment and medium Pending CN117291708A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311237487.0A CN117291708A (en) 2023-09-22 2023-09-22 Enterprise credit assessment method, system, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311237487.0A CN117291708A (en) 2023-09-22 2023-09-22 Enterprise credit assessment method, system, equipment and medium

Publications (1)

Publication Number Publication Date
CN117291708A true CN117291708A (en) 2023-12-26

Family

ID=89240294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311237487.0A Pending CN117291708A (en) 2023-09-22 2023-09-22 Enterprise credit assessment method, system, equipment and medium

Country Status (1)

Country Link
CN (1) CN117291708A (en)

Similar Documents

Publication Publication Date Title
CN110069709B (en) Intention recognition method, device, computer readable medium and electronic equipment
CN109948149B (en) Text classification method and device
US11556716B2 (en) Intent prediction by machine learning with word and sentence features for routing user requests
CN110674850A (en) Image description generation method based on attention mechanism
US20230075341A1 (en) Semantic map generation employing lattice path decoding
CN113220886A (en) Text classification method, text classification model training method and related equipment
CN111126067B (en) Entity relationship extraction method and device
Estevez-Velarde et al. AutoML strategy based on grammatical evolution: A case study about knowledge discovery from text
CN110634060A (en) User credit risk assessment method, system, device and storage medium
CN112749737A (en) Image classification method and device, electronic equipment and storage medium
CN111709225A (en) Event cause and effect relationship judging method and device and computer readable storage medium
CN115730597A (en) Multi-level semantic intention recognition method and related equipment thereof
CN115270988A (en) Fine adjustment method, device and application of knowledge representation decoupling classification model
CN113011689B (en) Evaluation method and device for software development workload and computing equipment
CN114119191A (en) Wind control method, overdue prediction method, model training method and related equipment
CN115983982A (en) Credit risk identification method, credit risk identification device, credit risk identification equipment and computer readable storage medium
CN117291708A (en) Enterprise credit assessment method, system, equipment and medium
CN114897607A (en) Data processing method and device for product resources, electronic equipment and storage medium
CN113837307A (en) Data similarity calculation method and device, readable medium and electronic equipment
CN111400413A (en) Method and system for determining category of knowledge points in knowledge base
CN110909777A (en) Multi-dimensional feature map embedding method, device, equipment and medium
Pchelin et al. Analysis of machine learning models by solving the text data classification problem
CN113836244B (en) Sample acquisition method, model training method, relation prediction method and device
Kampfer Performance and Interpretability of Machine Learning Algorithms for Credit Risk Modelling
Duan et al. Stock Investors' Preferences on Stock Forum Topics Based on FNS-LDA2vec

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination