CN117422312A - Assessment method, medium and device for enterprise management risk - Google Patents

Assessment method, medium and device for enterprise management risk

Info

Publication number
CN117422312A
Authority
CN
China
Prior art keywords
enterprise
data
risk
environment
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311738929.XA
Other languages
Chinese (zh)
Other versions
CN117422312B (en)
Inventor
朱向东
兰一杰
张凯
黄文敏
梁丽莉
张标金
李炳鸿
曾怀勋
张冰
蔡裕燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Shida Group Co ltd
Original Assignee
Fujian Shida Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Shida Group Co ltd
Priority to CN202311738929.XA
Publication of CN117422312A
Application granted
Publication of CN117422312B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/0985 Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Abstract

The invention provides an enterprise management risk assessment method, medium and device. The method comprises: acquiring large environment text content related to large environment factors, and obtaining, based on a large language model, large environment measurement data comprising large environment entity information and a large environment influence grading; acquiring enterprise basic information data, enterprise business data and enterprise risk and abnormality judgment data to obtain enterprise measurement data; associating the large environment measurement data with the enterprise measurement data, using industry classification and region coding as association fields, to obtain a data set to be trained, and training a neural network model on that data set to obtain a trained enterprise management risk assessment model; and assessing the management risk of an enterprise with the enterprise management risk assessment model. The invention avoids the impact on enterprise operation caused by large environment factors, removes the limitation of evaluating enterprise operation abnormality and risk only within the boundary of business data, and ensures the accuracy and stability of enterprise operation risk prediction.

Description

Assessment method, medium and device for enterprise management risk
Technical Field
The present invention relates to the field of enterprise data processing technologies, and in particular, to a method, medium, and apparatus for evaluating enterprise management risk.
Background
In the course of an enterprise's development, its management risks and abnormal conditions need to be anticipated so that remedial measures can be taken.
In the prior art, for example, patent publication CN115131039B, an enterprise risk assessment method, computer device and storage medium based on nonlinear dimension reduction, quantifies the business health of an enterprise by combining the analysis experience of professionals and provides a multi-factor, interpretable enterprise risk assessment result. As another example, patent publication CN110135689A, an enterprise management risk early-warning method, device and computer-readable storage medium, obtains an enterprise management risk early-warning model by training a BP neural network model. In practical application, however, the techniques disclosed in these patents predict enterprise management risk with low accuracy and high volatility.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an enterprise operation risk assessment method, medium and device, which can ensure the accuracy and stability of enterprise operation risk prediction.
In order to achieve the above purpose, the invention adopts the following technical scheme:
In a first aspect, the present invention provides a method for evaluating enterprise management risk, including:
Step S1, acquiring large environment text content related to large environment factors, performing named entity recognition on the large environment text content based on a large language model to obtain large environment entity information, and performing semantic emotion analysis to obtain a large environment influence grading, so as to form large environment measurement data comprising the large environment entity information and the large environment influence grading;
Step S2, acquiring enterprise basic information data, enterprise business data and enterprise risk and abnormality judgment data to obtain enterprise measurement data;
Step S3, using industry classification and region coding as association fields to associate the large environment measurement data with the enterprise measurement data to obtain a data set to be trained, wherein each sample in the data set to be trained comprises enterprise basic information data, enterprise business data and large environment measurement data as input data, and enterprise risk and abnormality judgment data as output data;
Step S4, training a neural network model on the data set to be trained to obtain a trained enterprise management risk assessment model;
Step S5, inputting real-time large environment measurement data, real-time basic information data and real-time business data of an enterprise into the enterprise management risk assessment model to obtain the management risk of the enterprise;
The step S1 includes:
acquiring large environment text content related to large environment factors, wherein the large environment factors comprise government policies and risk events;
inputting first prompt content and the large environment text content into a large language model, so that the large language model performs named entity recognition on the large environment text content to obtain large environment entity information, wherein the first prompt content comprises an instruction to perform named entity recognition and the entity objects to be generated, and the entity objects comprise the large environment factor, influence industry, influence area and influence duration;
inputting second prompt content and the large environment text content into the large language model, so that the large language model performs semantic emotion analysis on the large environment text content to obtain a large environment influence grading, wherein the second prompt content comprises an instruction to perform semantic emotion analysis and the emotion content to be generated;
and summarizing the large environment entity information and the large environment influence grading to obtain the large environment measurement data.
The beneficial effects of the invention are as follows. Large environment factors are converted into large environment measurement data, the large environment measurement data is associated with the enterprise measurement data through industry classification and region coding, and a neural network model is trained on the associated data set. This avoids the impact on enterprise operation caused by large environment factors, removes the limitation of evaluating enterprise operation abnormality and risk only within the boundary of business data, allows the underlying causes to be examined and analyzed from a higher-level perspective, and ensures the accuracy and stability of enterprise operation risk prediction. In addition, when large environment factors are considered, named entity recognition is performed on the large environment text content by the large language model and the large environment influence grading is obtained by semantic emotion analysis, which avoids the bias introduced by a user's subjective judgment and makes the large environment measurement data more objective, reasonable and reliable.
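The association in step S3 can be pictured as a join on the pair (industry classification, region coding). The following is a minimal sketch in plain Python, not the patented implementation: the field names (`industry_code`, `region_code`, `impact_grade`, `impact_duration_months`, `risk_label`) and all values are hypothetical placeholders, and the choice of an inner join, which drops enterprises with no matching large environment record, is an assumption.

```python
from collections import defaultdict

def associate(env_rows, enterprise_rows):
    """Join large environment rows to enterprise rows on (industry_code, region_code)."""
    # Index the large environment measurement data by the two association fields.
    index = defaultdict(list)
    for env in env_rows:
        index[(env["industry_code"], env["region_code"])].append(env)

    samples = []
    for ent in enterprise_rows:
        key = (ent["industry_code"], ent["region_code"])
        for env in index.get(key, []):
            # Input: enterprise data plus large environment measurement data.
            features = {k: v for k, v in ent.items() if k != "risk_label"}
            features["impact_grade"] = env["impact_grade"]
            features["impact_duration_months"] = env["impact_duration_months"]
            # Output: the enterprise risk and abnormality judgment (hypothetical label).
            samples.append({"input": features, "output": ent["risk_label"]})
    return samples

# Placeholder rows; codes and figures are illustrative only.
env_rows = [
    {"industry_code": "C34", "region_code": "350100",
     "impact_grade": 2, "impact_duration_months": 12},
]
enterprise_rows = [
    {"enterprise_id": "E1", "industry_code": "C34", "region_code": "350100",
     "annual_income": 5.0e6, "risk_label": 0},
    {"enterprise_id": "E2", "industry_code": "I65", "region_code": "350200",
     "annual_income": 2.0e6, "risk_label": 1},
]
samples = associate(env_rows, enterprise_rows)
```

A left join that fills a neutral grade of 0 for unmatched enterprises would be an equally plausible reading; the text does not say which is used.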
Optionally, the step S2 further includes:
acquiring the enterprise unique identifiers in the enterprise basic information data, converting and updating, in batches, the old-version unified credit codes among the enterprise unique identifiers into new-version unified credit codes according to a code conversion relation, and converting and updating, in batches, the business registration numbers among the enterprise unique identifiers into new-version unified credit codes according to a code mapping relation;
and cleaning and converting duplicate data and abnormal data in the enterprise unique identifiers, so that the primary key of each enterprise is unique and all primary keys are of the same type.
This unifies the enterprise primary key.
Optionally, the step S2 further includes:
when the enterprise risk and abnormality judgment data is associated with the enterprise unique identifier, using the classification level in a risk abnormality classification table as the selection priority, and evaluating all enterprise risk and abnormality judgment data through a COALESCE function to obtain the final association relation.
This eliminates conflicting or duplicate enterprise risk and abnormality judgment data.
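One way to read the COALESCE step is as a SQL expression that returns the judgment of the highest available level for each enterprise. The sketch below runs it in an in-memory SQLite database through Python's standard `sqlite3` module. The table layout, the three levels, the result strings and the fallback value `'none'` are all assumptions made for illustration, and the query assumes at most one judgment per enterprise per level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enterprise (credit_code TEXT PRIMARY KEY);
CREATE TABLE judgment (credit_code TEXT, level INTEGER, result TEXT);
INSERT INTO enterprise VALUES ('CODE_A'), ('CODE_B'), ('CODE_C');
INSERT INTO judgment VALUES
    ('CODE_A', 1, 'abnormal'),
    ('CODE_A', 3, 'serious violation'),
    ('CODE_B', 2, 'abnormal');
""")

# COALESCE returns its first non-NULL argument, so listing the joins from the
# highest level to the lowest makes the classification level the priority.
rows = conn.execute("""
SELECT e.credit_code,
       COALESCE(j3.result, j2.result, j1.result, 'none') AS final_result
FROM enterprise e
LEFT JOIN judgment j3 ON j3.credit_code = e.credit_code AND j3.level = 3
LEFT JOIN judgment j2 ON j2.credit_code = e.credit_code AND j2.level = 2
LEFT JOIN judgment j1 ON j1.credit_code = e.credit_code AND j1.level = 1
ORDER BY e.credit_code
""").fetchall()
final = dict(rows)
```

Here the two conflicting judgments for `CODE_A` collapse to the level-3 result, and an enterprise with no judgment at all falls through to the default.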
Optionally, the step S4 includes:
acquiring a pre-designed multi-layer perceptron;
training the pre-designed multi-layer perceptron with initial training hyperparameters and the data set to be trained, and adjusting the training hyperparameters according to the prediction effect during training, to finally obtain an enterprise operation risk assessment model that meets the expected effect, wherein the initial training hyperparameters comprise 100 to 200 training epochs, a batch size of 32 to 128, an initial learning rate of 0.01 and a dropout probability of 0.2.
These initial training hyperparameters improve the training effect of the multi-layer perceptron and help prevent the model from overfitting.
Optionally, the multi-layer perceptron uses a cross-entropy loss function and an Adam optimizer, and includes:
an input layer, whose number of neurons equals the number of sample features of the input data;
two hidden layers, with 8 neurons and 4 neurons respectively, each using a ReLU activation function;
and an output layer that uses a Softmax activation function.
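To make the layer shapes concrete, here is a minimal sketch of the forward pass and cross-entropy loss of such a perceptron, written in plain Python so that it runs without any framework. It is an illustration under assumptions, not the patented model: the number of features (6) and risk classes (3) is made up, the weight initialization is arbitrary, and training with the Adam optimizer and dropout is omitted. The hyperparameter values are the lower ends of the ranges given in the text.

```python
import math
import random

def relu(v):
    return [max(0.0, x) for x in v]

def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def dense(x, weights, biases):
    # weights holds one row of input weights per output neuron
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    bound = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def build_mlp(n_features, n_classes, seed=0):
    rng = random.Random(seed)
    sizes = [n_features, 8, 4, n_classes]  # hidden layers of 8 and 4 neurons
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def forward(mlp, x):
    h = x
    for weights, biases in mlp[:-1]:
        h = relu(dense(h, weights, biases))      # hidden layers use ReLU
    weights, biases = mlp[-1]
    return softmax(dense(h, weights, biases))    # output layer uses Softmax

def cross_entropy(probs, label):
    return -math.log(max(probs[label], 1e-12))

# Initial hyperparameters from the text (lower ends of the stated ranges).
HYPERPARAMS = {"epochs": 100, "batch_size": 32,
               "learning_rate": 0.01, "dropout": 0.2}

mlp = build_mlp(n_features=6, n_classes=3)  # 6 and 3 are assumed sizes
probs = forward(mlp, [0.5, -1.0, 2.0, 0.0, 1.5, -0.3])
loss = cross_entropy(probs, label=1)
```

In practice a deep learning framework would supply the layers, dropout and the Adam optimizer; the sketch only shows how the stated architecture fits together.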
Optionally, the step S4 further includes:
Step S6, loading the enterprise operation risk assessment model in evaluation mode and storing it in a variable model, parsing the weight matrices of all layers of the enterprise operation risk assessment model, storing the weight corresponding to each feature in a Python dictionary feature_weights, and forming a visual analysis graph from feature_weights, the names corresponding to the features of the input layer of the enterprise operation risk assessment model and the column names of the feature columns in the input data, wherein the visual analysis graph visually displays the influence weight and the combination mode of each feature of the input data in the enterprise operation risk assessment model;
wherein step S5 and step S6 are performed in parallel.
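The text does not state how a layer's weight matrix is reduced to one weight per feature. As one plausible reading, the sketch below takes the mean absolute first-layer weight attached to each input column and stores it in a dictionary named `feature_weights`, as in the text. The helper name `first_layer_importance`, the weight values and the column names are hypothetical.

```python
def first_layer_importance(first_layer_weights, column_names):
    """Mean absolute first-layer weight per input feature (an assumed reduction)."""
    n_hidden = len(first_layer_weights)  # one row per hidden neuron
    feature_weights = {}
    for j, name in enumerate(column_names):
        total = sum(abs(row[j]) for row in first_layer_weights)
        feature_weights[name] = total / n_hidden
    return feature_weights

# Illustrative 2 x 3 weight matrix: 2 hidden neurons, 3 input features.
first_layer_weights = [
    [0.5, -0.25, 0.0],
    [-1.5, 0.75, 0.5],
]
column_names = ["impact_grade", "registered_capital", "employee_count"]

feature_weights = first_layer_importance(first_layer_weights, column_names)
# Features ordered from most to least influential, ready for plotting.
ranked = sorted(feature_weights, key=feature_weights.get, reverse=True)
```

The ranked names and their weights could then be passed to any charting library to draw the visual analysis graph.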
Optionally, the step S6 further includes:
Step S7, generating and outputting, based on the visual analysis graph, a conclusion document that analyzes the combination of internal and external causes of enterprise risk and their degree of influence within a preset time period.
In a second aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed, implements a method of assessing risk of an enterprise business as in the first aspect.
In a third aspect, the present invention provides an enterprise management risk assessment apparatus, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements a method for assessing an enterprise management risk according to the first aspect when the processor executes the computer program.
The technical effects corresponding to the computer readable storage medium provided in the second aspect and the enterprise operation risk assessment device provided in the third aspect refer to the related description of the enterprise operation risk assessment method provided in the first aspect.
Drawings
FIG. 1 is a schematic flow chart of an enterprise business risk assessment method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a framework of an enterprise management risk assessment apparatus according to an embodiment of the present invention.
[Description of reference numerals]
1: an enterprise management risk assessment device;
2: a processor;
3: a memory.
Detailed Description
In order that the above-described aspects may be better understood, exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Example 1
This embodiment is suitable for application scenarios in which the management risk of an enterprise needs to be evaluated. The prior art generally does not consider the influence of large environmental factors on the health of enterprises. Under that influence, enterprise operation can fluctuate in phases, and for small and medium-sized enterprises the effect can be fatal, so enterprise operation risk is predicted with low accuracy and large fluctuation. In this embodiment, large environmental factors are converted into large environment measurement data, the large environment measurement data is associated with the enterprise measurement data through industry classification and region coding, and a neural network model is trained on the associated data set. This avoids the impact on enterprise operation caused by large environmental factors, removes the limitation of evaluating enterprise operation abnormality and risk only within the boundary of business data, allows the underlying causes to be examined and analyzed from a higher-level perspective, and ensures the accuracy and stability of enterprise operation risk prediction.
See in particular the description below.
Referring to fig. 1, a method for evaluating enterprise management risk includes:
Step S1, acquiring large environment text content related to large environment factors, performing named entity recognition on the large environment text content based on a large language model to obtain large environment entity information, and performing semantic emotion analysis to obtain a large environment influence grading, so as to form large environment measurement data comprising the large environment entity information and the large environment influence grading;
wherein step S1 includes:
Step S11, acquiring large environment text content related to large environment factors, wherein the large environment factors comprise government policies and risk events;
government policies refer to related policies issued by administrative units, including economic policies, industry policies, support policies, and regulation policies, among others.
The risk event mainly comprises:
(1) Public health event: public health events such as new infectious diseases, water or air pollution and drug abuse harm citizens' health, which reduces productivity and the public's willingness to consume and in turn reduces enterprise revenue.
(2) Natural disaster event: natural disasters such as extreme drought, severe floods and earthquakes can directly damage enterprise facilities and equipment and paralyze operations. At the same time, traffic disruption affects staff attendance and raw material transport. A shortage of post-disaster reconstruction funds is a further problem for enterprises.
(3) Technological innovation event: technological innovation brings technology replacement risk, a reshaped competitive landscape, investment cost pressure and lagging skill updates, so that the existing products and skills of enterprises may be replaced and enterprises face the risk of falling behind the industry and of rising investment costs. Meanwhile, competitors that adopt the new technology gain a first-mover advantage, and enterprises face increased competitive pressure.
Because government policies and major risk events occur infrequently, the relevant information is collected manually. The information sources include: official websites of government authorities, government news agencies, official media reports, government statistics, third-party research institutions and the like.
Step S12, inputting first prompt content and large environment text content in the large language model, so that the large language model carries out named entity recognition on the large environment text content to obtain large environment entity information, wherein the first prompt content comprises explanation for named entity recognition and entity objects to be generated, and the entity objects comprise large environment factors, influence industries, influence areas and influence duration;
The large environmental factor is a description of the large environment event, the influence industry is the industry affected by the large environmental factor, the influence area is the area affected by the large environmental factor, and the influence duration is how long the influence of the large environmental factor lasts.
In this embodiment, a domestic large language model or an open-source large language model is selected. After the first prompt content for named entity recognition is written, the large environment text content is input, and the large language model performs named entity recognition according to the prompt content to obtain the large environment entity information.
Specifically, the first prompt content and large environment text content input into the domestic iFlytek Spark large model, and the large environment entity information obtained, are as follows:
First prompt content: Perform NER (Named Entity Recognition) on the following content; the entity objects include regulations (issued documents, provisions), persons, matters, places, times, etc.
Large environment text content: In year XX, the AA guidance opinion was issued, recommending an initial pilot in the BB region, with a focus on enterprises in the machinery manufacturing industry, import and export trade enterprises and information technology enterprises. These enterprises are given policy support and financial support.
Large environment entity information: In this text we can identify the following entities:
1. Regulations (issued documents, provisions): AA guidance opinion
2. Time: year XX
3. Location: BB region
4. Matters: machinery manufacturing industry, import and export trade enterprises and information technology enterprises
5. Person: none
The first four entities correspond to the large environmental factor, influence duration, influence area and influence industry, respectively.
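The exchange above can be sketched as two small functions: one that assembles the first prompt content with the text, and one that parses a numbered reply into the four entity objects of step S12. This is a hedged illustration only. The call to the large language model is left out because its API is not described, the prompt wording and the sample reply are paraphrased placeholders, and the label-to-field mapping in `LABEL_TO_FIELD` is an assumption.

```python
import re

FIRST_PROMPT = (
    "Perform NER (Named Entity Recognition) on the following content. "
    "The entity objects include regulations, persons, matters, places and times.\n\n"
)

# Assumed mapping from reply labels to the entity objects named in step S12.
LABEL_TO_FIELD = {
    "regulations": "large_environment_factor",
    "time": "influence_duration",
    "location": "influence_area",
    "matters": "influence_industry",
}

def build_prompt(text):
    """Concatenate the first prompt content and the large environment text."""
    return FIRST_PROMPT + text

def parse_entities(reply):
    """Parse lines such as '3. Location: BB region' into a field dictionary."""
    entities = {}
    for line in reply.splitlines():
        match = re.match(
            r"\s*\d+\.\s*([^:(]+?)\s*(?:\([^)]*\))?\s*:\s*(.+)", line)
        if not match:
            continue  # skip the introductory sentence and blank lines
        field = LABEL_TO_FIELD.get(match.group(1).strip().lower())
        value = match.group(2).strip()
        if field and value.lower() != "none":
            entities[field] = value
    return entities

# Placeholder reply in the numbered format; not real model output.
reply = """In this text we can identify the following entities:
1. Regulations (issued documents, provisions): AA guidance opinion
2. Time: year XX
3. Location: BB region
4. Matters: machinery manufacturing, import and export trade, information technology
5. Person: none"""
entities = parse_entities(reply)
```

Because model replies are free text, a production parser would need to tolerate variations in labels and numbering; the regular expression here only covers the format shown.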
Step S13, inputting second prompt content and large environment text content into the large language model, so that the large language model carries out semantic emotion analysis on the large environment text content to obtain large environment influence grading, wherein the second prompt content comprises explanation needing semantic emotion analysis and emotion content needing to be generated;
In this embodiment, a domestic large language model or an open-source large language model is selected. After the second prompt content for semantic emotion analysis is written, the large environment text content is input, and the large language model performs semantic emotion analysis according to the prompt content to obtain the large environment influence grading.
Specifically, the second prompt content and large environment text content input into the domestic iFlytek Spark large model, and the large environment influence grading obtained, are as follows:
Second prompt content: Perform sentiment analysis on the following content; the output includes:
1. Sentiment rating: the magnitude of the influence is divided into 7 levels (-3, -2, -1, 0, 1, 2, 3), where positive numbers represent positive influence, negative numbers represent negative influence, and 0 represents neutral.
2. Opinion extraction: extract the core opinion of the content, limited to 20 words.
Large environment text content: In year XX, the AA guidance opinion was issued, recommending an initial pilot in the BB region, with a focus on enterprises in the machinery manufacturing industry, import and export trade enterprises and information technology enterprises. These enterprises are given policy support and financial support.
Large environment influence grading:
Sentiment rating: 2
Opinion extraction: The guidance opinion provides policy and financial support for machinery manufacturing, import and export trade and information technology enterprises in the BB region.
In other embodiments, further standards may be added to the second prompt content. For example, the national classification of natural disasters may be added, which comprises level one, level two, level three and level four, indicated by red, orange, yellow and blue respectively, with level one being the highest.
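A reply of this kind has to be turned into a number before it can be used as a feature. The sketch below extracts the rating and the opinion from a reply and checks that the rating lies on the 7-level scale from -3 to 3. The prompt wording, the function name `parse_impact`, the output field names and the sample reply are illustrative assumptions, and the model call itself is not shown.

```python
import re

SECOND_PROMPT = (
    "Perform sentiment analysis on the following content. The output includes:\n"
    "1. Sentiment rating: one of -3, -2, -1, 0, 1, 2, 3 "
    "(positive numbers are positive, negative numbers are negative, 0 is neutral).\n"
    "2. Opinion extraction: the core opinion, within 20 words.\n\n"
)

def parse_impact(reply):
    """Extract the influence grade and core opinion from a model reply."""
    rating = re.search(r"rating\s*:\s*([+-]?\d+)", reply, flags=re.IGNORECASE)
    opinion = re.search(r"opinion extraction\s*:\s*(.+)", reply,
                        flags=re.IGNORECASE)
    if rating is None:
        raise ValueError("no rating found in reply")
    grade = int(rating.group(1))
    if not -3 <= grade <= 3:
        raise ValueError(f"rating {grade} is outside the 7-level scale")
    return {
        "impact_grade": grade,
        "opinion": opinion.group(1).strip() if opinion else "",
    }

# Placeholder reply; not real model output.
reply = (
    "Sentiment rating: 2\n"
    "Opinion extraction: Policy and funding support for three industries "
    "in the BB region."
)
impact = parse_impact(reply)
```

Rejecting out-of-range ratings keeps a malformed reply from silently entering the data set to be trained.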
Step S14, summarizing the large environment entity information and the large environment influence grading to obtain the large environment measurement data.
In this embodiment, when large environment factors are considered, named entity recognition is performed on the large environment text content by the large language model and the large environment influence grading is obtained by semantic emotion analysis, which avoids the bias introduced by a user's subjective judgment and makes the large environment measurement data more objective, reasonable and reliable.
On this basis, government and industry specialists make an auxiliary judgment. As an example of the judgment criteria, government policies are assessed against 5 criteria, as shown in Table 1 below.
TABLE 1 government policy standards
In this embodiment, the industry codes used for industry classification follow the standard GB/T 4754-2017 and the region codes follow the standard GB/T 2260, resulting in large environment measurement data as shown in Table 2.
TABLE 2 big environmental metric data
Step S2, acquiring enterprise basic information data, enterprise business data and enterprise risk and abnormality judgment data to obtain enterprise measurement data;
In this embodiment, the enterprise basic information data reflects the basic condition of the enterprise, and analysis at a reasonable granularity can be performed using the enterprise type and registered region information. The data includes: enterprise name, organization form, industry, time of establishment, registered capital and legal representative.
In this embodiment, the enterprise business data reflects the business condition of the enterprise, and the management risk of the enterprise shows up in the business data. The data includes: number of employees, total assets, equity, annual business income, annual business profit, tax, credit, main business income proportion, main product proportion, market share, main customers, main suppliers and business time.
In this embodiment, the enterprise risk and abnormality judgment data serves as the supervision label for risk evaluation and is used to train the model and verify the evaluation effect. Details are shown in Table 3.
TABLE 3 Requirements for enterprise risk and abnormality judgment data
As can be seen from Table 3, in this embodiment the different judging bodies grade their judgment results into levels, for example 1, 2 and 3, where a larger value means a higher level; when the data is associated, duplicate and conflicting results are resolved by level priority. Based on time, judgment results are divided into historical judgment results and current judgment results.
After the enterprise data is acquired, it is ingested into the data lake as follows. A front-end processor is set up to open network access to the data of the municipal supervision bureau; the data source is connected through the JDBC function of the data source module in the data lake system, and data is synchronized incrementally on a configured field. Excel files are connected through the file function of the data source module in the data lake system and are loaded in full, overwriting the existing data.
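The two ingestion modes differ only in which rows are written. The sketch below contrasts them in plain Python, with in-memory lists and dictionaries standing in for the JDBC source, the Excel file and the data lake table. The function names, the `updated_at` watermark field and the sample rows are hypothetical; the text only says that incremental synchronization is driven by a configured field.

```python
def incremental_sync(source_rows, target, key, watermark_field, last_watermark):
    """Upsert only rows newer than last_watermark; return the new watermark."""
    newest = last_watermark
    for row in source_rows:
        if row[watermark_field] > last_watermark:
            target[row[key]] = row
            newest = max(newest, row[watermark_field])
    return newest

def full_overwrite(source_rows, target, key):
    """Replace the whole target table with the source contents."""
    target.clear()
    target.update({row[key]: row for row in source_rows})

lake_table = {}
source = [
    {"id": "E1", "name": "Alpha", "updated_at": "2023-01-05"},
    {"id": "E2", "name": "Beta", "updated_at": "2023-02-10"},
]

# Incremental mode: only E2 is newer than the stored watermark.
# ISO-formatted date strings compare correctly as plain strings.
watermark = incremental_sync(source, lake_table, "id", "updated_at", "2023-01-31")
synced_ids = sorted(lake_table)

# Full mode: every row is loaded and the previous contents are discarded.
full_overwrite(source, lake_table, "id")
```

Incremental mode suits a database that keeps changing, while full overwrite is simpler for files that are re-exported as a whole.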
Both the unified social credit code and the industrial and commercial registration number can serve as the unique identifier of an enterprise. However, the unified social credit code exists in two versions of the standard, and the industrial and commercial registration number uses a different coding system from the unified social credit code, so 3 coding rules may appear as primary keys in the acquired data tables.
Thus, in this embodiment, step S2 further includes:
Step S21, obtaining the enterprise unique identifiers in the enterprise basic information data, converting and updating, in batches, the old-version unified credit codes among the enterprise unique identifiers into new-version unified credit codes according to a code conversion relation, and converting and updating, in batches, the business registration numbers among the enterprise unique identifiers into new-version unified credit codes according to a code mapping relation;
The rule for converting old-version codes into new-version codes is formulated uniformly at the national level, and the conversion must follow the nationally published conversion relation table. The table maps different segment combinations of the old-version code to the new-version code, on the basic principle of preserving as much of the information in the enterprise's original code as possible. In some special cases the old-version code cannot be fully mapped to a new-version code, and the enterprise must be reissued a new-version code.
The relevant government departments of each county check the correspondence between the business registration numbers of the enterprises in their jurisdiction and the unified social credit codes to form a mapping table. The tables are cross-checked between counties to detect registration numbers that are duplicated across regions, and the mapping relation is adjusted accordingly. The registration number to credit code mapping tables of all counties are then reported together to the relevant city-level government departments. The city bureau checks the consolidated city-wide code mapping relation and reports it to the relevant provincial government departments.
Step S22, cleaning and converting repeated data and abnormal data are carried out on enterprise unique identifiers in each enterprise, so that each enterprise primary key is unique and belongs to the same category.
This ensures the consistency and uniqueness of the enterprise primary key, and subsequently added data is entered directly with the new-version unified social credit code.
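Steps S21 and S22 can be sketched as a lookup followed by de-duplication. In the minimal example below, both lookup tables are hypothetical stand-ins for the published conversion relation table and the county mapping tables, and all codes are made-up placeholders rather than valid identifiers. The only format fact relied on is that a unified social credit code has 18 characters, which is used as a simple validity check.

```python
def normalize_ids(records, old_to_new, reg_to_new):
    """Map every identifier to a new-version code and drop duplicates."""
    unified = {}
    unresolved = []
    for rec in records:
        raw = rec["enterprise_id"].strip().upper()
        # Try the old-version conversion first, then the registration-number
        # mapping; otherwise assume the value is already a new-version code.
        new_code = old_to_new.get(raw) or reg_to_new.get(raw) or raw
        if len(new_code) != 18:
            unresolved.append(rec)  # abnormal data, left for manual handling
            continue
        # Keep the first record seen for each unified code.
        unified.setdefault(new_code, {**rec, "enterprise_id": new_code})
    return list(unified.values()), unresolved

# Placeholder lookup tables and records.
old_to_new = {"OLD-0001": "NEWCODE00000000001"}
reg_to_new = {"350100100000001": "NEWCODE00000000001"}
records = [
    {"enterprise_id": "OLD-0001", "name": "Alpha"},
    {"enterprise_id": "350100100000001", "name": "Alpha (duplicate)"},
    {"enterprise_id": " newcode00000000002 ", "name": "Beta"},
    {"enterprise_id": "???", "name": "Gamma"},
]
clean, unresolved = normalize_ids(records, old_to_new, reg_to_new)
```

After this step every remaining record carries the same kind of key, so later joins against the judgment data can rely on a single identifier.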
On this basis, data governance is applied to the enterprise data, covering business rule management such as enterprise registration information management, product supervision information management and metering supervision information management; existing standards serve as reference data, and data consistency is checked through data association. Enterprise registration information management: an enterprise name/address/legal person must not be registered more than once, business registration numbers and unified social credit codes must correspond one to one, the company type must be consistent with the licensed business scope, and changes to registration information must be kept under historical version control. Product supervision information management: product registration information must be complete and accurate, product standards must meet mandatory requirements, product production licenses must be consistent with the actual production situation, and product recall information must be recorded and tracked promptly. Metering supervision information management: metering standards must be formulated appropriately for different industries and regions, the accuracy of metering instruments must be checked regularly, and out-of-tolerance metering errors must be fully recorded and handled.
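As an illustration of one such consistency rule, the sketch below checks the one-to-one correspondence between business registration numbers and unified social credit codes. The function name `find_one_to_one_violations` and the sample pairs are hypothetical.

```python
from collections import defaultdict

def find_one_to_one_violations(pairs):
    """Return registration numbers and credit codes that break one-to-one correspondence."""
    codes_by_reg = defaultdict(set)
    regs_by_code = defaultdict(set)
    for reg_no, credit_code in pairs:
        codes_by_reg[reg_no].add(credit_code)
        regs_by_code[credit_code].add(reg_no)
    # A registration number mapped to several codes, or a code shared by several numbers, is a violation.
    bad_regs = sorted(r for r, codes in codes_by_reg.items() if len(codes) > 1)
    bad_codes = sorted(c for c, regs in regs_by_code.items() if len(regs) > 1)
    return bad_regs, bad_codes

pairs = [("R1", "C1"), ("R2", "C2"), ("R2", "C3"), ("R4", "C1")]
bad_regs, bad_codes = find_one_to_one_violations(pairs)
```

Here `R2` maps to two codes and `C1` is shared by two registration numbers, so both are reported.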
At this point, the enterprise data contains multiple types of enterprise risk and abnormality judgment data, which may conflict with or duplicate one another.
Thus, in this embodiment, step S2 further includes:
step S23, a preset risk abnormality classification table is obtained, when the enterprise risk and abnormality judgment data are associated with the enterprise unique identification, classification levels in the risk abnormality classification table are used as selection priorities, and all the enterprise risk and abnormality judgment data are judged through a COALESCE function, so that a final association relation is obtained.
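One possible reading of step S23 is sketched below with Python's built-in sqlite3 module: the judgments of each enterprise are pivoted by classification level, and COALESCE returns the first non-null label in priority order. The table and column names, the assumption that level 1 has the highest priority, and the fallback label 'no risk' are all illustrative and do not come from the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enterprise (credit_code TEXT PRIMARY KEY);
CREATE TABLE risk_judgment (credit_code TEXT, level INTEGER, label TEXT);
INSERT INTO enterprise VALUES ('E1'), ('E2'), ('E3');
INSERT INTO risk_judgment VALUES
    ('E1', 2, 'abnormal operation'),
    ('E1', 1, 'serious violation'),
    ('E2', 3, 'late annual report');
""")

# COALESCE evaluates its arguments left to right and returns the first non-NULL one,
# so listing the levels in priority order resolves conflicting judgments.
rows = conn.execute("""
SELECT e.credit_code,
       COALESCE(
           MAX(CASE WHEN r.level = 1 THEN r.label END),
           MAX(CASE WHEN r.level = 2 THEN r.label END),
           MAX(CASE WHEN r.level = 3 THEN r.label END),
           'no risk'
       ) AS final_label
FROM enterprise AS e
LEFT JOIN risk_judgment AS r ON r.credit_code = e.credit_code
GROUP BY e.credit_code
ORDER BY e.credit_code
""").fetchall()
conn.close()
```

Enterprise E1 has two conflicting judgments and keeps only the level 1 label, while E3 has none and falls through to the default.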
Finally, the industry classification column provided by the government departments is deduplicated with an SQL DISTINCT clause, and a mapping table is then built by manual checking against the industry classification codes of the GB/T 4754-2017 standard. The business data is then associated with this mapping table to complete industry classification standardization.
Region code standardization is performed on the enterprise location in the same way, using national standard GB/T 2260 as the reference.
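The standardization workflow described above (deduplicate, map by hand, associate) can be sketched with sqlite3 as follows. The table names, column names and raw industry strings are hypothetical, and the single-letter codes are illustrative placeholders that would need to be checked against the standard itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business (credit_code TEXT, industry_raw TEXT);
INSERT INTO business VALUES
    ('E1', 'manufacturing'), ('E2', 'Manufacturing industry'),
    ('E3', 'retail'), ('E4', 'manufacturing');
CREATE TABLE industry_map (industry_raw TEXT PRIMARY KEY, industry_code TEXT);
""")

# Step 1: list the distinct raw values that a reviewer has to map by hand.
distinct_values = [row[0] for row in conn.execute(
    "SELECT DISTINCT industry_raw FROM business ORDER BY industry_raw")]

# Step 2: the manually checked mapping (placeholder codes, to be verified against the standard).
manual_mapping = {
    "Manufacturing industry": "C",
    "manufacturing": "C",
    "retail": "F",
}
conn.executemany("INSERT INTO industry_map VALUES (?, ?)", manual_mapping.items())

# Step 3: associate the business data with the mapping table.
standardized = conn.execute("""
SELECT b.credit_code, m.industry_code
FROM business AS b
JOIN industry_map AS m ON m.industry_raw = b.industry_raw
ORDER BY b.credit_code
""").fetchall()
conn.close()
```

The same pattern would apply to region codes, with a mapping table keyed on the raw location string.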
Step S3, industry classification and region coding are used as associated fields to associate large environment measurement data with enterprise measurement data to obtain a data set to be trained, wherein each sample data in the data set to be trained comprises enterprise basic information data, enterprise service data and large environment measurement data which are used as input data, and enterprise risk and abnormality judgment data which are used as output data;
Specifically, records are associated through SQL on the unified social credit code; then, with industry classification and region code as the association fields, the enterprise measurement data is joined to the large environment measurement data by SQL statements to obtain table A, which is stored in the Hive database.
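A small sqlite3 sketch of this association follows. The table names, column names and values are hypothetical, and the use of a LEFT JOIN (so that enterprises without a matching environment record are kept, with a NULL score) is a design assumption rather than something the source specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enterprise_metrics (credit_code TEXT, industry_code TEXT, region_code TEXT,
                                 revenue REAL, risk_label INTEGER);
CREATE TABLE environment_metrics (industry_code TEXT, region_code TEXT, impact_score REAL);
INSERT INTO enterprise_metrics VALUES
    ('E1', 'C', '350100', 120.0, 0),
    ('E2', 'F', '350200', 45.5, 1);
INSERT INTO environment_metrics VALUES ('C', '350100', -0.6);
""")

# Join on both association fields: industry classification and region code.
table_a = conn.execute("""
SELECT e.credit_code, e.revenue, v.impact_score, e.risk_label
FROM enterprise_metrics AS e
LEFT JOIN environment_metrics AS v
       ON v.industry_code = e.industry_code AND v.region_code = e.region_code
ORDER BY e.credit_code
""").fetchall()
conn.close()
```

Each resulting row combines enterprise features, the environment score and the risk label, which is the shape a training sample needs.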
Table A in the Hive database is then read with PySpark, part of the Hadoop technology stack in the data lake system, extracting the data of the most recent year.
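# PySpark example
from pyspark.sql import SparkSession
spark = SparkSession.builder.enableHiveSupport().getOrCreate()
# Read table A from Hive, keeping only rows from the most recent year
df = spark.sql("SELECT * FROM A WHERE `date` > add_months(current_date(), -12)")
data = df.toPandas()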
Then, the extracted data is divided into a feature data set f and a target data set t.
# Pandas example (data is the pandas DataFrame returned by df.toPandas())
f = data[['basic information column', 'business data column', 'policy/risk event metric value']]  # feature set
t = data[['risk tag column']]  # target set
Finally, the data is split with the sklearn package integrated in the data lake system to obtain training, test and validation data.
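The source performs this split with sklearn, for which `train_test_split` would be the usual tool. To show the idea without external dependencies, the sketch below implements a three-way split with the standard library only; the 70/15/15 ratios, the seed and the function name `split_dataset` are assumptions.

```python
import random

def split_dataset(features, targets, test_ratio=0.15, val_ratio=0.15, seed=42):
    """Shuffle sample indices once, then cut them into train / validation / test parts."""
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)  # seeded shuffle for reproducibility
    n_test = round(len(indices) * test_ratio)
    n_val = round(len(indices) * val_ratio)
    parts = {
        "test": indices[:n_test],
        "val": indices[n_test:n_test + n_val],
        "train": indices[n_test + n_val:],
    }
    return {
        name: ([features[i] for i in idx], [targets[i] for i in idx])
        for name, idx in parts.items()
    }

features = [[float(i), float(i % 3)] for i in range(100)]
targets = [i % 2 for i in range(100)]
splits = split_dataset(features, targets)
```

Because the three index ranges are cut from a single shuffled list, no sample can appear in more than one part.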
Step S4, training the neural network model based on the data set to be trained to obtain a trained enterprise management risk assessment model;
in this embodiment, step S4 includes:
s41, acquiring a predesigned multi-layer perceptron;
in this embodiment, the multi-layer perceptron includes:
an input layer whose number of neurons is identical to the number of sample features of input data;
two hidden layers, with 8 neurons and 4 neurons respectively, using the ReLU activation function;
and an output layer using the Softmax activation function;
the multi-layer perceptron also uses a cross entropy loss function and Adam optimizer.
Specifically:
(1) Designing input layers of a multi-layer perceptron
The number of neurons in the input layer is consistent with the number of features of the sample; the processing in steps S1 and S2 produces 24 columns. The input layer is therefore designed with 24 neurons, and this number is adjusted during implementation as the actual number of feature columns changes.
(2) Design of hidden layers of a multi-layer perceptron
The number of hidden layers determines the fitting capability of the network and is typically 1-3. Two hidden layers are used in this application, with 8 neurons and 4 neurons respectively. Fitting adjustments are added according to actual execution conditions.
(3) Designing an output layer of a multi-layer perceptron
Two neurons are used, representing the two categories "risk" and "no risk"; at prediction time, the category with the higher output probability is selected as the predicted category.
(4) Selecting the activation function, loss function and optimizer
Activation function: the input layer needs no activation function; the hidden layers use the ReLU activation function, which mitigates the vanishing-gradient problem; the output layer uses a Softmax activation function to normalize the 2 outputs into a probability distribution. Loss function: a cross-entropy loss function measures the distance between the output probability distribution and the target probability distribution. Optimizer: the Adam optimizer is used.
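To make the role of Softmax and cross entropy concrete, here is a small worked example in plain Python. The class order in `CLASSES` and the two raw output values are made-up assumptions used only for illustration.

```python
import math

def softmax(logits):
    """Normalize raw output scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    """Cross entropy against a one-hot target: minus the log-probability of the true class."""
    return -math.log(probs[target_index])

CLASSES = ["no risk", "risk"]  # assumed class order
logits = [0.5, 1.5]            # made-up raw outputs of the two output neurons
probs = softmax(logits)
predicted = CLASSES[probs.index(max(probs))]  # pick the category with the higher probability
loss_if_risk = cross_entropy(probs, 1)        # loss when the true label is "risk"
loss_if_no_risk = cross_entropy(probs, 0)     # loss when the true label is "no risk"
```

With these inputs the probabilities are roughly 0.27 and 0.73, so the prediction is "risk"; the loss is small when that matches the true label and larger when it does not.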
Step S42, training the pre-designed multi-layer perceptron according to initial training hyperparameters and the data set to be trained, and adjusting the training hyperparameters according to the prediction performance during training, to finally obtain an enterprise operation risk assessment model that meets the expected performance, wherein the initial training hyperparameters comprise 100-200 training epochs, a batch size of 32-128, an initial learning rate of 0.01, and a dropout probability of 0.2.
The initial training hyperparameters are set as follows: 100-200 epochs is a moderate number of iterations; a batch size of 32-128 is moderate; the initial learning rate is 0.01, which this embodiment treats as a good starting value when the Adam optimizer is used; the dropout probability is 0.2, since a suitable amount of random dropout helps prevent overfitting.
Finally, the multi-layer perceptron designed in the above steps is implemented in code with the PyTorch package integrated in the data lake system, and the model hyperparameters are adjusted according to the prediction performance during training. This yields the multi-layer perceptron MLP and the trained model file trained_model.
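A minimal PyTorch sketch of such a network and training loop is given below. The layer sizes, dropout probability, learning rate and batch size follow the text; the attribute names `fc1`, `fc2` and `out`, the synthetic data and the reduced epoch count are assumptions made for illustration. Note that `nn.CrossEntropyLoss` applies log-softmax internally, so in this sketch the network returns raw scores and Softmax is applied explicitly only at prediction time.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class MLP(nn.Module):
    """24 inputs -> 8 -> 4 -> 2 outputs, ReLU activations, dropout 0.2 (sizes from the text)."""

    def __init__(self, n_features=24, n_classes=2, dropout=0.2):
        super().__init__()
        self.fc1 = nn.Linear(n_features, 8)
        self.fc2 = nn.Linear(8, 4)
        self.out = nn.Linear(4, n_classes)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        x = self.dropout(self.relu(self.fc1(x)))
        x = self.dropout(self.relu(self.fc2(x)))
        return self.out(x)  # raw scores; Softmax is applied by the loss or at prediction time

torch.manual_seed(0)
# Synthetic stand-in data (256 samples, 24 features); real training would use the data set from step S3.
X = torch.randn(256, 24)
y = (X[:, 0] > 0).long()

model = MLP()
criterion = nn.CrossEntropyLoss()                          # log-softmax + negative log-likelihood
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)  # initial learning rate from the text
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

for epoch in range(5):  # the text suggests 100-200 epochs; 5 keeps the sketch fast
    model.train()
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()

model.eval()  # disables dropout for prediction
with torch.no_grad():
    probs = torch.softmax(model(X), dim=1)  # class probabilities for each sample
```

Calling `torch.save(model.state_dict(), "trained_model.pt")` afterwards would persist the trained parameters to a model file.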
Step S5, inputting real-time large environment measurement data, real-time enterprise basic information data and real-time enterprise business data of the enterprise into the enterprise management risk assessment model to obtain the management risk of the enterprise.
After the model is trained, if the enterprise has the need of carrying out management risk assessment, the current related data is input into the enterprise management risk assessment model, and the current management risk of the enterprise can be obtained.
In summary, the method accounts for the impact of large environment factors on enterprise operations, removes the limitation of evaluating enterprise abnormality and risk only within the boundary of business data, examines and analyzes the underlying causes from a higher vantage point, and helps ensure the accuracy and stability of enterprise management risk prediction.
Example two
Referring to fig. 1, on the basis of the first embodiment, a method for evaluating enterprise management risk further includes the following steps after step S4:
Step S6, loading an evaluation mode of an enterprise operation risk evaluation model, storing the evaluation mode in a variable model, analyzing a matrix of weights of all layers in the enterprise operation risk evaluation model, storing the weight corresponding to each feature in a Python dictionary feature_weights, forming a visual analysis chart according to the names corresponding to all features of an input layer in the feature_weights and the column names of each column of features in input data in the enterprise operation risk evaluation model, and visually displaying the influence weight and the combination mode of each feature in the input data in the enterprise operation risk evaluation model in the visual analysis chart;
In this embodiment, once a stable multi-layer perceptron has been trained, analyzing the weights of the model and their combination relations shows how the model makes its predictions, and hence which combinations of factors drive enterprise risk and abnormality and how important each is. The following example code demonstrates extracting the weights and combination relations: first the model is loaded, switched to evaluation mode, and stored in the variable model.
import torch
import torch.nn as nn
model = MLP()  # create an instance of the MLP (multi-layer perceptron) model
model.load_state_dict(torch.load("trained_model.pt"))  # load the parameters (weights, biases, etc.) of the trained model into the model
model.eval()  # switch the model to evaluation mode
Then, a matrix of weights of each layer in the enterprise management risk assessment model is analyzed, and the weights corresponding to each feature are stored in a Python dictionary feature_weights.
# Collect the weights of each layer; the model argument is the variable model loaded in the code above
def analyze_feature_weights(model):
    feature_weights = {}
    # Obtain the weight matrix of each layer
    for name, param in model.named_parameters():  # traverse all parameters of the model
        if 'weight' in name:  # check whether the current parameter is a weight matrix
            layer_name = name.split('.')[0]  # extract the layer name from the parameter name
            # If the layer name is not yet in feature_weights, its weights are seen for the first time, so store them
            if layer_name not in feature_weights:
                feature_weights[layer_name] = param.data
    return feature_weights
Finally, with the matrix of weights of each layer stored in the feature_weights dictionary, a visualization of the weight matrices is built with the Matplotlib package for easier analysis. In addition to the variable feature_weights, a feature_names list is needed that contains the names of the input-layer features; the column names of the feature data set f from step S3 are used, finally yielding the analysis chart P.
The names of the input-layer features are code names inside the network, so the weight features must be mapped through these code names to their real-world names before being displayed visually.
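One simple way to turn the first-layer weight matrix into per-feature scores that can be plotted is sketched below. Using the mean absolute weight as an importance proxy is an assumption of this sketch (the source does not state how the chart is computed), and the function name, weights and feature names are hypothetical.

```python
def rank_input_features(first_layer_weights, feature_names):
    """Rank input features by the mean absolute weight linking them to the first hidden layer.

    first_layer_weights holds one row per hidden neuron and one column per input feature,
    which is the layout of a PyTorch nn.Linear weight matrix converted with .tolist().
    """
    n_features = len(feature_names)
    if any(len(row) != n_features for row in first_layer_weights):
        raise ValueError("each weight row must have one entry per feature name")
    scores = {
        name: sum(abs(row[j]) for row in first_layer_weights) / len(first_layer_weights)
        for j, name in enumerate(feature_names)
    }
    # Highest score first, so the most influential features lead the chart.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

weights = [[0.5, -0.25, 0.0], [-1.5, 0.25, 1.0]]
names = ["registered capital", "policy impact score", "complaint count"]
ranking = rank_input_features(weights, names)
```

The resulting list of (name, score) pairs could then be drawn as a horizontal bar chart, for example with Matplotlib's `barh`. This proxy ignores how later layers recombine the hidden units, so it gives only a rough indication of influence.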
Step S7, generating and outputting, based on the visual analysis chart, a conclusion document that analyzes the combination relations and degrees of influence of the internal and external causes of enterprise risk within a preset time period.
Steps S5 and S6 are parallel rather than sequential, while steps S6 and S7 are sequential; that is, after step S4, either step S5 is performed, or steps S6 and S7 are performed in sequence.
Based on the analysis chart P of the step S6, the internal and external reasons of the enterprise risk are analyzed in the time period of the business data, and a conclusion document is output.
Example three
The present invention provides a computer readable storage medium on which a computer program is stored; when executed, the computer program implements the method for evaluating enterprise management risk according to the first or second embodiment.
Example four
Referring to fig. 2, an enterprise management risk assessment apparatus 1 includes a memory 3, a processor 2, and a computer program stored in the memory 3 and capable of running on the processor 2, wherein the processor 2 implements the steps of the first or second embodiment when executing the computer program.
Since the system/device described in the foregoing embodiments of the present invention is a system/device used for implementing the method of the foregoing embodiments of the present invention, those skilled in the art will be able to understand the specific structure and modification of the system/device based on the method of the foregoing embodiments of the present invention, and thus will not be described in detail herein. All systems/devices used in the methods of the above embodiments of the present invention are within the scope of the present invention.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the terms first, second, third, etc. are for convenience of description only and do not denote any order. These terms may be understood as part of the component name.
Furthermore, it should be noted that in the description of the present specification, the terms "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., refer to a specific feature, structure, material, or characteristic described in connection with the embodiment or example being included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art upon learning the basic inventive concepts. Therefore, the appended claims should be construed to include preferred embodiments and all such variations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, the present invention should also include such modifications and variations provided that they come within the scope of the following claims and their equivalents.

Claims (9)

1. A method for assessing business risk of an enterprise, comprising:
step S1, acquiring macro environment text content related to macro environment factors, carrying out named entity recognition on the macro environment text content based on a large language model to obtain macro environment entity information, and carrying out semantic emotion analysis to obtain macro environment influence grading to form macro environment measurement data comprising the macro environment entity information and the macro environment influence grading;
step S2, acquiring enterprise basic information data, enterprise business data and enterprise risk and abnormality judgment data to obtain enterprise measurement data;
step S3, industry classification and region coding are used as associated fields to associate the large environment measurement data with the enterprise measurement data to obtain a data set to be trained, wherein each sample data in the data set to be trained comprises enterprise basic information data, enterprise business data and large environment measurement data which are used as input data and enterprise risk and abnormality judgment data which are used as output data;
step S4, training the neural network model based on the data set to be trained to obtain a trained enterprise management risk assessment model;
step S5, inputting real-time large environment measurement data, real-time basic information data and real-time business data of an enterprise into the enterprise management risk assessment model to obtain the management risk of the enterprise;
the step S1 includes:
acquiring macro-environmental text content related to macro-environmental factors, wherein the macro-environmental factors comprise government policies and risk events;
inputting first prompt content and the large environment text content in a large language model, so that the large language model carries out named entity recognition on the large environment text content to obtain large environment entity information, wherein the first prompt content comprises a description for named entity recognition and an entity object to be generated, and the entity object comprises large environment factors, influence industries, influence areas and influence duration;
inputting second prompt content and the large environment text content in a large language model, so that the large language model carries out semantic emotion analysis on the large environment text content to obtain large environment influence grading, wherein the second prompt content comprises explanation needing semantic emotion analysis and emotion content needing generation;
and summarizing the macro environment entity information and the macro environment influence grading to obtain macro environment measurement data.
2. The method for evaluating risk of enterprise operations according to claim 1, wherein the step S2 further comprises:
acquiring an enterprise unique identifier in enterprise basic information data, converting and updating old version unified credit codes in the enterprise unique identifier into new version unified credit codes in batches according to a code conversion relation, and converting and updating the business registration numbers in the enterprise unique identifier into the new version unified credit codes in batches according to a code mapping relation;
and cleaning and converting repeated data and abnormal data for the unique enterprise identifier in each enterprise so that each enterprise primary key is unique and belongs to the same category.
3. The method for evaluating risk of enterprise operations according to claim 1, wherein the step S2 further comprises:
when the enterprise risk and abnormality judgment data are associated with the enterprise unique identification, the classification level in the risk abnormality classification table is used as a selection priority, and all enterprise risk and abnormality judgment data are judged through a COALESCE function so as to obtain a final association relation.
4. The method for evaluating risk of enterprise operations according to claim 1, wherein the step S4 comprises:
acquiring a predesigned multi-layer perceptron;
training the pre-designed multi-layer perceptron according to initial training super-parameters and the data set to be trained, and adjusting the training super-parameters according to the prediction effect in the training process to finally obtain the enterprise operation risk assessment model conforming to the expected effect, wherein the initial training super-parameters comprise the iterative rounds of epochs of 100-200, the batch size of 32-128, the initialization learning rate of 0.01 and the drop probability of 0.2.
5. The method of claim 4, wherein the multi-layer perceptron uses a cross entropy loss function and Adam optimizer, comprising:
an input layer having a number of neurons consistent with a number of sample features of the input data;
a hidden layer, set to 2 layers, 8 neurons and 4 neurons, respectively, and using a ReLU activation function;
and an output layer using the Softmax activation function.
6. The method for assessing a risk of an enterprise business as claimed in any one of claims 1 to 4, wherein said step S4 further comprises:
step S6, loading an evaluation mode of the enterprise operation risk evaluation model, storing the evaluation mode in a variable model, analyzing a matrix of weights of all layers in the enterprise operation risk evaluation model, storing the weight corresponding to each feature in a Python dictionary feature_weights, and forming a visual analysis graph according to the features_weights, names corresponding to all features of an input layer in the enterprise operation risk evaluation model and column names of each column of features in the input data, wherein the visual analysis graph visually displays the influence weight and the combination mode of each feature in the input data in the enterprise operation risk evaluation model;
wherein, step S5 and step S6 are in parallel connection.
7. The method for assessing risk of an enterprise business as claimed in claim 6, wherein said step S6 further comprises:
and S7, generating and outputting conclusion documents for analyzing the internal and external cause combination relation and the influence degree of the enterprise risk in a preset time period based on the visual analysis chart.
8. A computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, the computer program when executed implementing a method for assessing risk of an enterprise business as claimed in any one of claims 1 to 7.
9. An enterprise risk assessment apparatus comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements a method of enterprise risk assessment as claimed in any one of claims 1 to 7 when executing the computer program.
CN202311738929.XA 2023-12-18 2023-12-18 Assessment method, medium and device for enterprise management risk Active CN117422312B (en)

Publications (2)

Publication Number Publication Date
CN117422312A true CN117422312A (en) 2024-01-19
CN117422312B CN117422312B (en) 2024-03-12

Family

ID=89531154

Country Status (1)

Country Link
CN (1) CN117422312B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant