CN116187322B - Internal control compliance detection method and system based on momentum distillation - Google Patents

Info

Publication number
CN116187322B
Authority
CN
China
Prior art keywords
internal control
control information
model
compliance
momentum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310248087.3A
Other languages
Chinese (zh)
Other versions
CN116187322A (en
Inventor
胡为民
刘克飞
谢凡
胡珏
唐庆艳
余露
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Dib Enterprise Risk Management Technology Co ltd
Original Assignee
Shenzhen Dib Enterprise Risk Management Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Dib Enterprise Risk Management Technology Co ltd
Priority to CN202310248087.3A
Publication of CN116187322A
Application granted
Publication of CN116187322B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention relates to the field of knowledge distillation and internal control management, and discloses an internal control compliance detection method and system based on momentum distillation. The method comprises the following steps: performing word segmentation and vectorization on the acquired text data to be detected, and inputting the result into an internal control information characterization model trained by momentum knowledge distillation to obtain internal control information features; reducing the dimension of the internal control information features, and inputting them into a compliance service identification model trained with a classification loss function to obtain the confidence probabilities that the text data to be detected belongs to various compliance services; and performing abnormal service detection according to the confidence probabilities of the various compliance services, thereby realizing internal control compliance detection. In the invention, the internal control information features are extracted by the internal control information characterization model, the confidence probabilities of the various compliance services are identified by the compliance service identification model, and internal control compliance detection is then performed by anomaly detection, which improves the efficiency of internal control compliance detection and helps enterprises maintain normal and orderly operation activities.

Description

Internal control compliance detection method and system based on momentum distillation
Technical Field
The invention relates to the technical field of knowledge distillation and internal control management, in particular to an internal control compliance detection method and system based on momentum distillation.
Background
With the expansion of enterprise business scale and the deepening of informatization, a large amount of complex text data is generated in the course of enterprise operation and management, and this text data covers the key business processes and main operating conditions of the enterprise. Enterprise internal control management needs to analyze this text data and extract internal-control-related information from it in order to identify whether the enterprise's business is compliant, and thereby judge whether the enterprise's internal control is compliant.
However, enterprise internal control management currently relies mainly on manual inspection to detect internal control compliance, and the difficulty of this approach grows as the volume of enterprise text data increases. In addition, manual inspection can hardly process the various kinds of text data in a timely, accurate and efficient manner, so enterprise internal control management lags behind, which brings risks to enterprise operation.
Disclosure of Invention
In view of the above problems, the invention aims to solve the technical problems in the prior art that internal control management is inefficient and normal and orderly operation activities are difficult to maintain, and to this end provides an internal control compliance detection method and system based on momentum distillation.
To solve the above problems, an embodiment of the invention provides an internal control compliance detection method based on momentum distillation, which comprises the following steps:
acquiring text data to be detected;
after word segmentation and vectorization are performed on the text data to be detected, inputting the result into an internal control information characterization model trained by momentum knowledge distillation, and obtaining internal control information features;
after the internal control information features are subjected to dimension reduction, inputting them into a compliance service identification model trained with a classification loss function, and obtaining the confidence probabilities that the text data to be detected belongs to various compliance services;
and performing abnormal service detection according to the confidence probabilities of the various compliance services, and obtaining an internal control compliance detection result according to the abnormal service detection result.
Optionally, the internal control compliance detection method based on momentum distillation further comprises:
performing word segmentation and vectorization on the acquired labeled text data to obtain a real vocabulary vector sequence;
processing the real vocabulary vector sequence by masked vocabulary modeling to obtain mask data;
performing self-supervised training on a Transformer network according to the mask data to obtain an initial internal control information characterization model;
constructing a momentum model by an exponential moving average method, and performing knowledge distillation training on the initial internal control information characterization model according to the momentum model to obtain an optimized internal control information characterization model;
and training and optimizing a multi-layer perceptron network according to the internal control information features output by the optimized internal control information characterization model, the real service type labels of the labeled text data, and the classification loss function, to obtain the compliance service identification model.
Optionally, performing self-supervised training on the Transformer network according to the mask data to obtain an initial internal control information characterization model comprises:
inputting the mask data into the Transformer network, and predicting the masked original words through the Transformer network to obtain a predicted vocabulary vector sequence;
determining a cross entropy loss function according to the predicted vocabulary vector sequence and the real vocabulary vector sequence;
and training and optimizing the weight parameters of the Transformer network through the cross entropy loss function to obtain the initial internal control information characterization model.
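The cross entropy step above can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function name, the toy probabilities, and the choice to average the loss over the masked positions only are all assumptions (the source only says the loss is determined from the predicted and real vocabulary vector sequences).

```python
import math

def masked_cross_entropy(predicted_probs, true_ids, masked_positions):
    """Average cross entropy over the masked positions only (an assumption).

    predicted_probs: one probability distribution over the vocabulary per position.
    true_ids: index of the original vocabulary item at each position.
    masked_positions: positions that were replaced by the mask identifier.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for pos in masked_positions:
        total += -math.log(predicted_probs[pos][true_ids[pos]] + eps)
    return total / len(masked_positions)

# Toy example: vocabulary of 3 items, sequence of 2 positions, position 1 masked.
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
loss = masked_cross_entropy(probs, true_ids=[0, 1], masked_positions=[1])
```

The loss shrinks toward zero as the network assigns more probability to the original word at each masked position.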
Optionally, constructing a momentum model by an exponential moving average method, and performing knowledge distillation training on the initial internal control information characterization model according to the momentum model to obtain an optimized internal control information characterization model, comprises:
constructing the momentum model according to the network structure of the internal control information characterization model, and copying the initial weight parameters of the internal control information characterization model to the momentum model;
inputting the real vocabulary vector sequence into both the internal control information characterization model and the momentum model, and calculating the KL divergence between the model outputs;
determining a target loss function according to the KL divergence, and optimizing and updating the weight parameters of the internal control information characterization model according to the target loss function;
after the weight parameters of the internal control information characterization model are updated, updating the weight parameters of the momentum model by an exponential moving average;
and recalculating the KL divergence between the model outputs until the KL divergence is smaller than or equal to a preset loss tolerance, then saving the weight parameters of the internal control information characterization model to obtain the optimized internal control information characterization model.
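The two numerical ingredients of the steps above, the KL divergence between model outputs and the exponential-moving-average update of the momentum model, can be sketched as follows. This is a hedged illustration on plain Python lists: the smoothing coefficient `alpha`, the direction of the KL divergence, and the toy weights are assumptions that the source does not specify.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def ema_update(momentum_weights, model_weights, alpha=0.99):
    """Exponential moving average: m <- alpha * m + (1 - alpha) * w."""
    return [alpha * m + (1.0 - alpha) * w for m, w in zip(momentum_weights, model_weights)]

# The momentum model starts as a copy of the characterization model's weights.
model_w = [0.5, -1.0]
momentum_w = list(model_w)

# Pretend one optimization step changed the characterization model's weights.
model_w = [0.6, -0.8]
momentum_w = ema_update(momentum_w, model_w, alpha=0.9)

# KL divergence between toy output distributions (identical outputs give 0).
kl_same = kl_divergence([0.5, 0.5], [0.5, 0.5])
kl_diff = kl_divergence([0.9, 0.1], [0.5, 0.5])
```

Because the momentum model moves only a fraction `1 - alpha` toward the current weights at each step, it changes more slowly than the characterization model, which is the usual motivation for distilling from it.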
Optionally, training and optimizing the multi-layer perceptron network according to the internal control information features output by the optimized internal control information characterization model, the real service type labels of the labeled text data, and the classification loss function, to obtain the compliance service identification model, comprises:
reducing the dimension of the internal control information features output by the optimized internal control information characterization model, inputting them into the multi-layer perceptron network, and obtaining the confidence probabilities that the labeled text data belongs to various compliance services;
determining the classification loss function according to the real service type labels of the labeled text data and the confidence probabilities of the various compliance services;
and training and optimizing the weight parameters of the multi-layer perceptron network according to the classification loss function to obtain the compliance service identification model.
Optionally, the internal control information characterization model comprises an encoder and a decoder; performing word segmentation and vectorization on the text data to be detected, inputting the result into the internal control information characterization model trained by momentum knowledge distillation, and obtaining internal control information features comprises:
segmenting the text data to be detected with a Chinese word segmenter to obtain a plurality of words;
vectorizing each word by one-hot encoding, and generating a vocabulary vector sequence from the vectorized words;
and performing context learning and feature fusion on the vocabulary vector sequence through the encoder of the internal control information characterization model, and decoding the output of the encoder through the decoder to obtain the internal control information features.
Optionally, the compliance service identification model includes three hidden fully connected layers and one activation function layer; reducing the dimension of the internal control information features, inputting them into the compliance service identification model trained with a classification loss function, and obtaining the confidence probabilities that the text data to be detected belongs to various compliance services comprises:
reducing the dimension of the internal control information features by an independent component analysis algorithm to obtain dimension-reduced internal control information features;
and performing feature extraction on the dimension-reduced internal control information features through the hidden fully connected layers of the compliance service identification model, and processing the feature information output by the hidden fully connected layers through the activation function layer to obtain the confidence probabilities of the various compliance services.
Optionally, performing abnormal service detection according to the confidence probabilities of the various compliance services, and obtaining the internal control compliance detection result according to the abnormal service detection result, comprises:
detecting whether the confidence probabilities of the various compliance services are all smaller than a preset confidence probability threshold;
if so, generating a detection result indicating that an abnormal service exists, and generating a detection result of internal control non-compliance accordingly; otherwise, generating a detection result indicating that no abnormal service exists, and generating a detection result of internal control compliance accordingly.
In addition, an embodiment of the invention further provides an internal control compliance detection system based on momentum distillation, which comprises:
the text data acquisition module, used for acquiring text data to be detected;
the internal control information characterization module, used for performing word segmentation and vectorization on the text data to be detected, inputting the result into an internal control information characterization model trained by momentum knowledge distillation, and obtaining internal control information features;
the service classification and identification module, used for reducing the dimension of the internal control information features, inputting them into a compliance service identification model trained with a classification loss function, and obtaining the confidence probabilities that the text data to be detected belongs to various compliance services;
and the internal control compliance detection module, used for performing abnormal service detection according to the confidence probabilities of the various compliance services and obtaining an internal control compliance detection result according to the abnormal service detection result.
Optionally, the internal control compliance detection system based on momentum distillation further comprises:
the text data processing module, used for performing word segmentation and vectorization on the acquired labeled text data to obtain a real vocabulary vector sequence;
the mask processing module, used for processing the real vocabulary vector sequence by masked vocabulary modeling to obtain mask data;
the self-supervised training module, used for performing self-supervised training on a Transformer network according to the mask data to obtain an initial internal control information characterization model;
the knowledge distillation training module, used for constructing a momentum model by an exponential moving average method, and performing knowledge distillation training on the initial internal control information characterization model according to the momentum model to obtain an optimized internal control information characterization model;
and the identification model training module, used for training and optimizing a multi-layer perceptron network according to the internal control information features output by the optimized internal control information characterization model, the real service type labels of the labeled text data, and the classification loss function, to obtain the compliance service identification model.
According to the internal control compliance detection method and system based on momentum distillation provided by the invention, after word segmentation and vectorization are performed on the acquired text data to be detected, the internal control information characterization model trained by momentum knowledge distillation extracts internal control information features from the vocabulary vector sequence, which suppresses the adverse effect of noise data on feature extraction and improves the accuracy of feature extraction; then, after the dimension of the internal control information features is reduced, the compliance service identification model trained with a classification loss function identifies the confidence probabilities that the text data to be detected belongs to various compliance services, which reduces the computation required for internal control compliance detection while improving the accuracy of service classification; finally, abnormal service detection is performed according to the confidence probabilities of the various compliance services, and the internal control compliance detection result is obtained according to the abnormal service detection result, which improves the efficiency of internal control compliance detection and of internal control management and helps the enterprise maintain normal and orderly operation activities.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of an internal control compliance detection method based on momentum distillation according to an embodiment of the present invention;
FIG. 2 is a flow chart of an internal control compliance detection method based on momentum distillation according to another embodiment of the present invention;
FIG. 3 is a schematic diagram of an internal control compliance detection system based on momentum distillation according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an internal control compliance detection system based on momentum distillation according to another embodiment of the present invention.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of systems and methods that are consistent with aspects of the invention as detailed in the accompanying claims.
As shown in fig. 1, the method for detecting internal control compliance based on momentum distillation according to an embodiment of the present invention specifically includes the following steps:
S10, acquiring text data to be detected.
In step S10, the text data to be detected is any text data related to enterprise operation and management that requires internal control compliance detection, including but not limited to business composition description information, financial audit information, disclosure association information, enterprise rules and regulations, historical financial data, and employee complaints and suggestions. The text data to be detected contains internal control data that reflects the internal control situation, and also contains noise data that is unfavorable to internal control compliance detection.
S20, after word segmentation and vectorization are performed on the text data to be detected, inputting the result into an internal control information characterization model trained by momentum knowledge distillation, and obtaining internal control information features.
In step S20, the internal control information characterization model adopts a Transformer network structure and includes an encoder and a decoder; the encoder includes a plurality of Transformer encoding layers, each of which includes a multi-head attention layer and a feed-forward layer; the decoder includes a plurality of Transformer decoding layers and an output head.
Preferably, the step S20 specifically includes the following steps:
S201, segmenting the text data to be detected with a Chinese word segmenter to obtain a plurality of words;
S202, vectorizing each word by one-hot encoding, and generating a vocabulary vector sequence from the vectorized words;
S203, performing context learning and feature fusion on the vocabulary vector sequence through the encoder of the internal control information characterization model, and decoding the output of the encoder through the decoder to obtain internal control information features.
In this embodiment, after the text data to be detected of an enterprise is acquired, it is segmented with a Chinese word segmenter (for example, the currently popular ikAnalyzer) to obtain a plurality of words, each word is vectorized by one-hot encoding, and the resulting word vectors are concatenated to generate the vocabulary vector sequence of the enterprise. Then, the internal control information characterization model obtained by training in the subsequent steps S60 to S80 takes the vocabulary vector sequence as input, and the encoder output of the internal control information characterization model is taken as the internal control information feature of the enterprise. The internal control information feature is calculated as follows:
$F = M(X)$
where $F$ is the internal control information feature, with a feature dimension of 512; $M$ is the internal control information characterization model; and $X$ is the vocabulary vector sequence.
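The vectorization step can be illustrated with a small one-hot encoding sketch. The word list stands in for the output of a Chinese word segmenter, and both the words and the vocabulary are hypothetical placeholders chosen for this example.

```python
def one_hot_sequence(words, vocabulary):
    """Map each segmented word to a one-hot vector over the vocabulary."""
    index = {word: i for i, word in enumerate(vocabulary)}
    sequence = []
    for word in words:
        vector = [0] * len(vocabulary)
        vector[index[word]] = 1  # exactly one position is set per word
        sequence.append(vector)
    return sequence

# Hypothetical segmenter output and vocabulary (assumptions for illustration).
vocabulary = ["purchase", "contract", "approval", "payment"]
words = ["purchase", "contract", "approval"]
vectors = one_hot_sequence(words, vocabulary)
```

Each vector has the length of the vocabulary, so the resulting sequence has one row per word; this sequence is what the characterization model would receive as input.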
Further, the encoder of the internal control information characterization model takes the vocabulary vector sequence as input and comprises 6 Transformer encoding layers, each consisting of a multi-head attention layer and a feed-forward layer; the multi-head attention layer computes the correlation between the vocabulary vectors in the vocabulary vector sequence and extracts context information; the feed-forward layer fuses the correlation and the context information and increases the representation capability of the internal control information characterization model so as to obtain high-level features. The decoder of the internal control information characterization model takes the high-level features output by the encoder as input and consists of 2 Transformer decoding layers and an output head containing a Softmax activation function; the Transformer decoding layers decode the high-level features to restore the vocabulary vectors; the output head computes the probability of each restored vocabulary vector and outputs the internal control information features, which take the form of a sequence of vocabulary vectors.
S30, after the internal control information features are subjected to dimension reduction, inputting a compliance service identification model obtained through classification loss function training, and obtaining confidence probabilities that the text data to be detected belong to various compliance services.
In step S30, the classification loss function is a cross entropy loss function. The compliance service identification model adopts a multi-layer perceptron network structure and comprises a plurality of hidden full-connection layers and an activation function layer.
Preferably, the compliance service identification model includes three hidden fully connected layers and one activation function layer, and the three hidden fully connected layers have 512, 256 and 128 neurons respectively. In this case, step S30 specifically includes the following steps:
S301, reducing the dimension of the internal control information features by an independent component analysis algorithm to obtain dimension-reduced internal control information features;
S302, performing feature extraction on the dimension-reduced internal control information features through the hidden fully connected layers of the compliance service identification model, and processing the feature information output by the hidden fully connected layers through the activation function layer to obtain the confidence probabilities of the various compliance services.
In this embodiment, an independent component analysis algorithm is used to reduce the internal control information feature from a feature dimension of 512 to a feature dimension of 64. The dimension-reduced internal control information feature is calculated as follows:
$F' = \mathrm{ICA}_{64}(F)$
where $F'$ is the dimension-reduced internal control information feature; $F$ is the internal control information feature; and $\mathrm{ICA}_{64}$ is the independent component analysis operator with a target dimension of 64.
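As a rough illustration of the dimension reduction above, independent component analysis can be viewed as estimating an unmixing matrix and projecting the (centered) feature onto it. The symbols $x$, $W$ and $\hat{s}$ below are assumptions introduced for this sketch and do not come from the source; only the dimensions 512 and 64 do.

```latex
% x: 512-dimensional internal control information feature (dimension from the source)
% W: unmixing matrix estimated by ICA (symbol is an assumption)
% \hat{s}: 64-dimensional reduced feature whose components are
%          estimated to be as statistically independent as possible
\hat{s} = W x, \qquad
x \in \mathbb{R}^{512}, \quad
W \in \mathbb{R}^{64 \times 512}, \quad
\hat{s} \in \mathbb{R}^{64}
```

Under this view, the downstream classifier processes a 64-dimensional input instead of a 512-dimensional one, which is where the reduction in computation comes from.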
After the dimension-reduced internal control information feature $F'$ is obtained, the compliance service identification model trained in step S90 takes $F'$ as input, and the activation function layer of the model outputs the confidence probabilities $P$ of the different types of compliance services. The confidence probability $P$ is calculated as follows:
$P = \mathrm{softmax}(C(F'))$
where $P$ is the confidence probability of the various compliance services of the enterprise; $C$ is the compliance service identification model; and $\mathrm{softmax}$ is the softmax activation function in the activation function layer. Finally, by extracting the compliance service type with the highest confidence probability, the compliance service type to which the text data to be detected belongs can be identified; the compliance service types include but are not limited to the enterprise's daily material purchasing, production technology, equipment management, product sales, cost accounting, fund allocation and other compliance services.
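The final softmax and the selection of the most probable compliance service type can be sketched as follows. The scores and the three business type names are hypothetical inputs; in the described system they would come from the hidden fully connected layers.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores, one per compliance service type (assumptions).
business_types = ["material purchasing", "equipment management", "product sales"]
logits = [2.0, 0.5, 0.1]

confidence = softmax(logits)
best = max(range(len(confidence)), key=confidence.__getitem__)
predicted_type = business_types[best]
```

Subtracting the maximum score before exponentiating does not change the result but avoids overflow for large scores.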
It can be appreciated that the dimension reduction is performed on the internal control information feature in the embodiment, so that the calculation amount of internal control compliance detection can be reduced, and the detection efficiency is improved.
S40, detecting abnormal services according to the confidence probabilities of the various compliance services, and obtaining an internal control compliance detection result according to the abnormal service detection result.
In step S40, based on the confidence probabilities of the compliance services, whether the internal control of the enterprise is compliant is determined by detecting whether the text data to be detected has abnormal services, so as to obtain an internal control compliance detection result.
Preferably, the step S40 specifically includes the steps of:
S401, detecting whether the confidence probabilities of the various compliance services are all smaller than a preset confidence probability threshold;
S402, if so, generating a detection result indicating that an abnormal service exists, and generating a detection result of internal control non-compliance accordingly; otherwise, generating a detection result indicating that no abnormal service exists, and generating a detection result of internal control compliance accordingly.
In this embodiment, after the confidence probabilities $p_i$ that the text data to be detected belongs to the various compliance services are obtained through the optimized compliance service identification model, if all confidence probabilities $p_i$ are smaller than the confidence probability threshold $\tau$, i.e. $\max_i p_i < \tau$, it is determined that abnormal business exists in the text data to be detected, and hence that the internal control of the enterprise is not compliant; if instead $\max_i p_i \ge \tau$, it is determined that no abnormal business exists in the text data to be detected, and hence that the internal control of the enterprise is compliant. Optionally, the confidence probability threshold $\tau$ is a preset value.
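The decision rule just described reduces to a single comparison, sketched below. The function name and the threshold value 0.5 are illustrative assumptions and are not taken from the source.

```python
def detect_internal_control(confidences, threshold):
    """Return (abnormal_business, compliant) from per-type confidence probabilities.

    Abnormal business is reported only when every confidence probability
    falls below the threshold, i.e. no compliance service type is recognized.
    """
    abnormal = all(p < threshold for p in confidences)
    return abnormal, not abnormal

# The threshold 0.5 is purely illustrative.
result_ok = detect_internal_control([0.82, 0.10, 0.08], threshold=0.5)
result_bad = detect_internal_control([0.40, 0.35, 0.25], threshold=0.5)
```

In the first call one type is recognized with high confidence, so no abnormal business is reported; in the second call no type reaches the threshold, so the text is flagged.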
In summary, according to the internal control compliance detection method based on momentum distillation provided by the embodiment of the invention, after word segmentation and vectorization are performed on the acquired text data to be detected, the internal control information characterization model trained by momentum knowledge distillation extracts internal control information features from the vocabulary vector sequence, which suppresses the adverse effect of noise data on feature extraction and improves the accuracy of feature extraction; then, after the dimension of the internal control information features is reduced, the compliance service identification model trained with a classification loss function identifies the confidence probabilities that the text data to be detected belongs to various compliance services, which reduces the computation required for internal control compliance detection while improving the accuracy of service classification; finally, abnormal service detection is performed according to the confidence probabilities of the various compliance services, and the internal control compliance detection result is obtained according to the abnormal service detection result, which improves the efficiency of internal control compliance detection and of internal control management and helps enterprises maintain normal and orderly operation activities.
In another embodiment, as shown in fig. 2, the method for detecting internal control compliance based on momentum distillation provided by the embodiment of the invention further includes the following steps:
S50, performing word segmentation and vectorization on the obtained labeled text data to obtain a real vocabulary vector sequence.
In step S50, the labeled text data is text data of various kinds related to enterprise operation and management whose service types have been annotated manually, that is, the labeled text data carries service type real labels. The labeled text data includes internal control data that reflects the internal control situation and noise data that is unfavorable for internal control compliance detection.
More specifically, text data of the enterprise from past years is obtained and annotated with service types; the labeled text data is then segmented into words, each word is vectorized by one-hot encoding, and the vectorized words are concatenated into the real vocabulary vector sequence $X = \{x_1, x_2, \dots, x_N\}$, where $x_i$ is the $i$-th vocabulary vector in the real vocabulary vector sequence and $N$ is the number of vocabulary vectors, i.e. the length of the real vocabulary vector sequence.
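As an illustration of the vectorization step, the sketch below one-hot encodes an already segmented word list. It is an assumption-laden example: the helper names `build_vocab` and `one_hot_sequence` are hypothetical, and whitespace splitting of an English sentence stands in for the Chinese word segmenter, which the text does not name.

```python
def build_vocab(tokens):
    """Map each distinct token to an index, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def one_hot_sequence(tokens, vocab):
    """Return the vocabulary vector sequence, one one-hot vector per token."""
    size = len(vocab)
    sequence = []
    for tok in tokens:
        vec = [0] * size
        vec[vocab[tok]] = 1
        sequence.append(vec)
    return sequence


# Whitespace splitting is a stand-in for a real word segmenter (assumption).
tokens = "purchase order approved by finance purchase".split()
vocab = build_vocab(tokens)
X = one_hot_sequence(tokens, vocab)
print(len(X), len(X[0]))  # 6 5
```

The sequence length equals the number of words, and each vector has one component per distinct word in the vocabulary.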
S60, processing the real vocabulary vector sequence according to a mask vocabulary modeling mode to obtain mask data.
In step S60, the mask vocabulary modeling mode is specifically: part of the vocabulary vectors in the real vocabulary vector sequence are randomly replaced with the mask identifier to generate the mask data $\tilde{X}$ used for self-supervision training.
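A minimal sketch of this random masking follows. For readability it operates on tokens rather than on vocabulary vectors, and the `[MASK]` string, the 15% mask ratio, and the fixed seed are all assumptions; the text does not state how many positions are masked.

```python
import random

MASK = "[MASK]"  # hypothetical mask identifier


def mask_tokens(tokens, mask_ratio=0.15, seed=0):
    """Randomly replace a share of the tokens with the mask identifier.

    Returns the masked sequence and the sorted list of masked positions.
    mask_ratio is an assumed value, not one taken from the source text.
    """
    if not tokens:
        return [], []
    rng = random.Random(seed)
    n_mask = max(1, round(len(tokens) * mask_ratio))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    for pos in positions:
        masked[pos] = MASK
    return masked, positions


tokens = ["contract", "signed", "without", "legal", "review", "on", "friday"]
masked, positions = mask_tokens(tokens)
print(masked, positions)
```

The masked positions are returned alongside the data so that a training loop can compare predictions with the original words at exactly those positions.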
S70, performing self-supervision training on a Transformer network according to the mask data to obtain an initial internal control information characterization model.
In this embodiment, the self-supervision training process, i.e. step S70, includes the following steps:
S701, inputting the mask data into the Transformer network, and predicting the masked original vocabulary through the Transformer network to obtain a predicted vocabulary vector sequence;
S702, determining a cross entropy loss function according to the predicted vocabulary vector sequence and the real vocabulary vector sequence;
S703, training and optimizing the weight parameters of the Transformer network through the cross entropy loss function to obtain an initial internal control information characterization model.
It can be understood that the mask data serves as the self-supervision signal: the mask data is input into the Transformer network, and the Transformer network predicts the masked original vocabulary vectors by learning context information, so that the vocabulary vector corresponding to each mask identifier is predicted and the predicted vocabulary vector sequence is obtained. The predicted vocabulary vector sequence $\hat{X}$ is calculated as:
$$\hat{X} = f_{\theta}(\tilde{X})$$
where $\hat{X}$ is the predicted vocabulary vector sequence output by the Transformer network $f_{\theta}$ for the mask data $\tilde{X}$.
Then, a cross entropy loss function is determined from the predicted vocabulary vector sequence $\hat{X}$ and the real vocabulary vector sequence $X$. The cross entropy loss function is:
$$\mathcal{L}_{\mathrm{CE}}(\theta) = \mathbb{E}\big[H(x_i, \hat{x}_i)\big]$$
where $\hat{x}_i$ is a predicted vocabulary vector, $\hat{x}_i \in \hat{X}$; $x_i$ is a real vocabulary vector, $x_i \in X$; $\theta$ is the weight parameter of the Transformer network; and $H$ is the cross entropy operator, expressed as:
$$H(x_i, \hat{x}_i) = -\sum_{j} x_{i,j} \log \hat{x}_{i,j}$$
where $x_{i,j}$ and $\hat{x}_{i,j}$ are the vector values of the real vocabulary vector $x_i$ and the predicted vocabulary vector $\hat{x}_i$ respectively, and $\mathbb{E}$ denotes averaging. Finally, the weight parameters of the Transformer network are optimized by a gradient descent back propagation algorithm with the aim of minimizing the cross entropy loss function, and the internal control information characterization model is obtained. It can be appreciated that, by minimizing the cross entropy loss function, this embodiment improves the capability of the internal control information characterization model to extract the context information of the internal control information.
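The cross entropy loss described above can be checked numerically with a small sketch. The function names are hypothetical, and averaging over the masked positions only (rather than over every position) is an assumption; the text says only that an average is taken.

```python
import math


def cross_entropy(true_vec, pred_vec, eps=1e-12):
    """Cross entropy between a one-hot true vector and a predicted distribution."""
    # eps guards against log(0) when a predicted value is exactly zero.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(true_vec, pred_vec))


def masked_lm_loss(true_seq, pred_seq, masked_positions):
    """Average cross entropy over the masked positions (assumed averaging set)."""
    losses = [cross_entropy(true_seq[i], pred_seq[i]) for i in masked_positions]
    return sum(losses) / len(losses)


true_seq = [[1, 0], [0, 1]]
pred_seq = [[0.5, 0.5], [0.25, 0.75]]
loss = masked_lm_loss(true_seq, pred_seq, masked_positions=[0, 1])
print(round(loss, 4))  # 0.4904
```

Because the true vectors are one-hot, each term reduces to the negative log of the probability that the network assigns to the original word.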
S80, constructing a momentum model according to an exponential moving average method, and carrying out knowledge distillation training on the initial internal control information characterization model according to the momentum model to obtain an optimized internal control information characterization model.
In this embodiment, the knowledge distillation training process, i.e. step S80, specifically includes the following steps:
S801, constructing a momentum model according to the network structure of the internal control information characterization model, and copying the initial weight parameters of the internal control information characterization model to the momentum model;
S802, inputting the real vocabulary vector sequence into the internal control information characterization model and the momentum model, and calculating the KL divergence between the model outputs;
S803, determining a target loss function according to the KL divergence, and optimizing and updating the weight parameters of the internal control information characterization model according to the target loss function;
S804, after the weight parameters of the internal control information characterization model are updated, updating the weight parameters of the momentum model by an exponential moving average;
S805, calculating the KL divergence between the model outputs again until the KL divergence is smaller than or equal to a preset loss allowable value, and storing the weight parameters of the internal control information characterization model to obtain the optimized internal control information characterization model.
It can be understood that a large amount of complex text data is generated in the course of enterprise operation and management. This text data contains internal control data that reflects the internal control situation, but also a considerable amount of noise data that is unfavorable for internal control compliance detection. A momentum model is therefore constructed by the exponential moving average method, and its smoothed supervision signal is used to reduce the adverse effect of the noise data. Specifically, the momentum model $f_{\theta_m}$ is constructed with the same network structure as the internal control information characterization model $f_{\theta}$; before the internal control information characterization model is optimized, its weight parameter $\theta$ is copied as the weight parameter of the momentum model. Next, in each iteration period, after the weight parameters of the internal control information characterization model are updated by gradient descent, the weight parameters of the momentum model are updated by the exponential moving average method so that the momentum model is updated smoothly. The update of the momentum model weight parameters is:
$$\theta_m \leftarrow \alpha\,\theta_m + (1 - \alpha)\,\theta$$
where $\theta_m$ is the weight parameter of the momentum model, $\theta$ is the latest weight parameter of the internal control information characterization model, and $\alpha$ is the momentum coefficient; optionally, the momentum coefficient is 0.99.
Then, with the constructed momentum model $f_{\theta_m}$ and the vocabulary vector sequence $X$ of the enterprise as input, the KL divergence between the outputs of the internal control information characterization model and the momentum model is calculated, and, taking the KL divergence as the target loss function, the internal control information characterization model is optimized by a gradient descent back propagation algorithm. The target loss function is:
$$\mathcal{L}_{\mathrm{KD}}(\theta) = \mathrm{KL}\big(f_{\theta_m}(X)\,\|\,f_{\theta}(X)\big)$$
where $\theta$ is the weight parameter of the internal control information characterization model $f_{\theta}$, $\theta_m$ is the weight parameter of the momentum model, and $\mathrm{KL}$ is the KL divergence operator.
It can be appreciated that by minimizing the KL divergence between the output of the two models, the embodiment can reduce adverse effects caused by noise data and improve the capability of the internal control information characterization model to accurately extract enterprise information.
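To make the distillation objective concrete, the sketch below computes the KL divergence between two discrete output distributions. It is illustrative only: the function names are hypothetical, the outputs are assumed to be probability distributions obtained by softmax, and taking the momentum model output as the first argument is an assumed convention.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions; terms with p_j == 0 contribute 0."""
    return sum(
        p_j * math.log(p_j / max(q_j, eps)) for p_j, q_j in zip(p, q) if p_j > 0
    )


momentum_out = [0.5, 0.5]    # assumed output of the momentum model
model_out = [0.25, 0.75]     # assumed output of the characterization model
print(round(kl_divergence(momentum_out, model_out), 4))  # 0.1438
print(kl_divergence(momentum_out, momentum_out))         # 0.0
```

The divergence is zero when the two outputs coincide and grows as they drift apart, so it can serve directly as the stopping criterion against a preset loss allowable value.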
S90, training and optimizing a multi-layer perceptron network according to the internal control information features output by the optimized internal control information characterization model, the service type real labels of the labeled text data, and the classification loss function to obtain a compliance service identification model.
In this embodiment, the multi-layer perceptron network training and optimization process, i.e. step S90, specifically includes the following steps:
S901, after the internal control information features output by the optimized internal control information characterization model are reduced in dimension, inputting them into the multi-layer perceptron network to obtain the confidence probabilities that the labeled text data belongs to the various compliance services;
S902, determining a classification loss function according to the service type real labels of the labeled text data and the confidence probabilities of the various compliance services;
S903, training and optimizing the weight parameters of the multi-layer perceptron network according to the classification loss function to obtain a compliance service identification model.
It can be understood that, based on the internal control information characterization model optimized in step S80, the real vocabulary vector sequence $X$ of the labeled text data is taken as input, and the encoder output of the characterization model is extracted as the enterprise data feature $Z$. Then, the enterprise data feature $Z$ is reduced in dimension by an independent component analysis algorithm and input into the multi-layer perceptron network, which calculates the confidence probabilities $p_k$ of the different compliance services through its activation function.
Finally, a classification loss function is determined from the confidence probabilities $p_k$, output by the multi-layer perceptron network, that the labeled text data belongs to the various compliance services, combined with the service type real labels of the labeled text data, and the weight parameters of the multi-layer perceptron network are optimized by a gradient descent back propagation algorithm with the aim of minimizing the classification loss function to obtain the compliance service identification model. The classification loss function is:
$$\mathcal{L}_{\mathrm{cls}}(\phi) = -\sum_{k=1}^{K} y_k \log p_k$$
where $\phi$ is the weight parameter of the compliance service identification model, $p_k$ is the confidence probability of each compliance service, and $y_k$ is the service type real label. When the labeled text data is correctly classified, the service type real label $y_k$ is 1; otherwise it is 0.
It can be appreciated that the present embodiment can improve the ability of the compliance business identification model to accurately identify the business type of the enterprise by minimizing the classification loss function.
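A small numeric sketch of a classification loss of this kind is given below. It assumes the common cross entropy form between the confidence probabilities and a one-hot service type real label; the text does not spell out the exact formula, so both that form and the name `classification_loss` are assumptions.

```python
import math


def classification_loss(confidences, true_label, eps=1e-12):
    """Cross entropy between confidence probabilities and a one-hot real label.

    confidences: confidence probability per compliance service type.
    true_label:  one-hot list, 1 at the annotated service type and 0 elsewhere.
    """
    # eps guards against log(0) for a confidence of exactly zero.
    return -sum(y * math.log(max(p, eps)) for y, p in zip(true_label, confidences))


confident = classification_loss([0.1, 0.7, 0.2], [0, 1, 0])
unsure = classification_loss([0.4, 0.3, 0.3], [0, 1, 0])
print(round(confident, 4))  # 0.3567
print(confident < unsure)   # True
```

The loss shrinks as the probability assigned to the annotated service type approaches 1, which is why minimizing it pushes the network toward correct classification.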
It should be noted that, the steps S50 to S80 included in the internal control compliance detection method based on momentum distillation provided in the present embodiment need only be executed before the step S10 or the step S20, the step S90 need only be executed after the step S80 and before the step S30, and the flow chart of the internal control compliance detection method based on momentum distillation shown in fig. 2 is only one alternative embodiment.
It can be appreciated that, in the momentum distillation-based internal control compliance detection method provided by this embodiment, the Transformer network is trained by momentum knowledge distillation to obtain the internal control information characterization model, which extracts the internal control information features; the multi-layer perceptron network is trained with the classification loss function to obtain the compliance service identification model, which identifies the confidence probabilities of the different compliance services; and internal control compliance detection is then performed according to these confidence probabilities, so that efficient internal control compliance detection can be achieved.
In addition, as shown in fig. 3, an embodiment of the present invention further provides an internal control compliance detection system based on momentum distillation, including:
a text data acquisition module 110, configured to acquire text data to be detected;
the internal control information characterization module 120 is configured to perform word segmentation and vectorization on the text data to be detected and then input the result into an internal control information characterization model trained by momentum knowledge distillation to obtain internal control information features;
the service classification recognition module 130 is configured to input a compliance service recognition model obtained through classification loss function training after performing dimension reduction on the internal control information feature, and obtain confidence probabilities that the text data to be detected belongs to various compliance services;
The internal control compliance detection module 140 is configured to perform abnormal service detection according to confidence probabilities of the various compliance services, and obtain an internal control compliance detection result according to the abnormal service detection result.
In some alternative embodiments, as shown in fig. 4, the internal control compliance detection system based on momentum distillation further comprises:
the text data processing module 150 is configured to perform word segmentation and vectorization on the obtained labeled text data, so as to obtain a real vocabulary vector sequence;
the mask processing module 160 is configured to process the real vocabulary vector sequence according to a mask vocabulary modeling manner to obtain mask data;
the self-supervision training module 170 is configured to perform self-supervision training on the Transformer network according to the mask data to obtain an initial internal control information characterization model;
the knowledge distillation training module 180 is configured to construct a momentum model according to an exponential moving average method, and perform knowledge distillation training on an initial internal control information characterization model according to the momentum model to obtain an optimized internal control information characterization model;
the recognition model training module 190 is configured to train and optimize the multi-layer perceptron network according to the internal control information characteristics output by the optimized internal control information characterization model, the service type real labels of the labeled text data, and the classification loss function, so as to obtain a compliance service recognition model.
In some alternative embodiments, the internal control information characterization model includes an encoder and a decoder, and the internal control information characterization module 120 includes:
the word segmentation sub-module is used for segmenting the text data to be detected through a Chinese word segmenter to obtain a plurality of words;
the vectorization sub-module is used for vectorizing each word through one-hot encoding and generating a vocabulary vector sequence from the vectorized words;
and the internal control information characterization sub-module is used for carrying out context learning and feature fusion on the vocabulary vector sequence through an encoder of the internal control information characterization model, and decoding the output of the encoder through the decoder to obtain internal control information features.
In some alternative embodiments, the compliance service identification model includes three hidden fully connected layers and one activation function layer; the service classification recognition module 130 includes:
the characteristic dimension reduction sub-module is used for reducing the dimension of the internal control information characteristic through an independent component analysis algorithm to obtain the dimension-reduced internal control information characteristic;
and the service classification and identification sub-module is used for extracting the characteristics of the internal control information subjected to dimension reduction through a hidden full-connection layer of the compliance service identification model, and processing the characteristic information output by the hidden full-connection layer through the activation function layer to obtain the confidence probabilities of various compliance services.
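The structure of such an identification model, three hidden fully connected layers followed by one activation function layer, can be sketched with plain Python lists. Everything numeric here is an assumption: the layer sizes, the random initialization, and the choice of softmax as the final activation are illustrative, and no nonlinearity is placed between the hidden layers because the text does not name one.

```python
import math
import random


def dense(x, weights, bias):
    """One fully connected layer: weights has one row per output unit."""
    return [sum(w * x_i for w, x_i in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(scores):
    """Final activation layer (assumed to be softmax)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Randomly initialized weights and zero bias for one layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(features, layers):
    """Pass reduced features through the hidden layers, then the activation."""
    hidden = features
    for weights, bias in layers:
        hidden = dense(hidden, weights, bias)
    return softmax(hidden)


rng = random.Random(0)
sizes = [8, 16, 16, 4]  # hypothetical: 8 reduced features, 4 service types
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
probs = forward([0.1] * 8, layers)
print(len(probs), round(sum(probs), 6))  # 4 1.0
```

The output is one confidence probability per compliance service type, which is the input expected by the threshold check in the detection module.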
In some alternative embodiments, the internal control compliance detection module 140 includes:
the confidence probability detection sub-module is used for detecting whether the confidence probabilities of various compliance services are smaller than a preset confidence probability threshold value;
the internal control compliance detection sub-module is used for generating a detection result of abnormal business existence if yes, and generating a detection result of internal control non-compliance according to the detection result of abnormal business existence; otherwise, generating a detection result without abnormal service, and generating a detection result of internal control compliance according to the detection result without abnormal service.
In some alternative embodiments, the self-supervising training module 170 includes:
the word vector prediction sub-module is used for inputting the mask data into a Transformer network, and predicting the masked original words through the Transformer network to obtain a predicted vocabulary vector sequence;
the cross entropy loss determination sub-module is used for determining a cross entropy loss function according to the predicted vocabulary vector sequence and the real vocabulary vector sequence;
and the self-supervision training sub-module is used for training and optimizing the weight parameters of the Transformer network through the cross entropy loss function to obtain an initial internal control information characterization model.
In some alternative embodiments, the knowledge distillation training module 180 includes:
the momentum model construction sub-module is used for constructing a momentum model according to the network structure of the internal control information characterization model and copying initial weight parameters of the internal control information characterization model to the momentum model;
the KL divergence calculation sub-module is used for inputting the real vocabulary vector sequence into the internal control information characterization model and the momentum model, and calculating KL divergence between model outputs;
the target loss determination submodule is used for determining a target loss function according to the KL divergence and optimizing and updating weight parameters of the internal control information characterization model according to the target loss function;
the exponential smoothing sub-module is used for updating the weight parameters of the momentum model by an exponential moving average after the weight parameters of the internal control information characterization model are updated;
and the iterative optimization sub-module is used for recalculating the KL divergence between model outputs until the KL divergence is smaller than or equal to a preset loss allowable value, storing the weight parameters of the internal control information characterization model, and obtaining the optimized internal control information characterization model.
In some alternative embodiments, the recognition model training module 190 includes:
The internal control information identification sub-module is used for inputting the internal control information characteristics output by the optimized internal control information characterization model into a multi-layer perceptron network after reducing the dimensions, and obtaining the confidence probabilities that the labeling text data belong to various compliance services;
the classification loss determination submodule is used for determining a classification loss function according to the service type real tag of the marked text data and the confidence probabilities of various compliance services;
and the identification model optimization sub-module is used for training and optimizing the weight parameters of the multi-layer perceptron network according to the classification loss function to obtain a compliance service identification model.
The internal control compliance detection system based on momentum distillation provided by the embodiment of the invention is used for realizing the internal control compliance detection method based on momentum distillation of the embodiment, has the beneficial effects of the corresponding method embodiment and is not repeated herein.
It should be noted that in the description of the present invention, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present invention, unless otherwise indicated, the meaning of "plurality" means at least two.
The present invention is not limited to the above embodiments, but is capable of modification and variation in detail, and other modifications and variations can be made by those skilled in the art without departing from the scope of the present invention.

Claims (5)

1. An internal control compliance detection method based on momentum distillation is characterized by comprising the following steps:
acquiring text data to be detected;
after word segmentation and vectorization are carried out on the text data to be detected, inputting the result into an internal control information characterization model trained by momentum knowledge distillation, and obtaining internal control information features;
after the internal control information features are subjected to dimension reduction, inputting a compliance service identification model obtained through classification loss function training, and obtaining confidence probabilities that the text data to be detected belong to various compliance services;
performing word segmentation and vectorization on the obtained labeled text data to obtain a real vocabulary vector sequence;
processing the real vocabulary vector sequence according to a mask vocabulary modeling mode to obtain mask data;
performing self-supervision training on a Transformer network according to the mask data to obtain an initial internal control information characterization model;
the performing self-supervision training on the Transformer network according to the mask data to obtain an initial internal control information characterization model comprises the following steps:
inputting the mask data into the Transformer network, and predicting the masked original vocabulary through the Transformer network to obtain a predicted vocabulary vector sequence;
determining a cross entropy loss function according to the predicted vocabulary vector sequence and the real vocabulary vector sequence;
training and optimizing the weight parameters of the Transformer network through the cross entropy loss function to obtain an initial internal control information characterization model;
constructing a momentum model according to an exponential moving average method, and carrying out knowledge distillation training on an initial internal control information characterization model according to the momentum model to obtain an optimized internal control information characterization model;
the constructing a momentum model according to an exponential moving average method, and carrying out knowledge distillation training on an initial internal control information characterization model according to the momentum model to obtain an optimized internal control information characterization model comprises the following steps:
constructing a momentum model according to the network structure of the internal control information characterization model, and copying initial weight parameters of the internal control information characterization model to the momentum model;
inputting the real vocabulary vector sequence into the internal control information characterization model and the momentum model, and calculating KL divergence between model outputs;
determining a target loss function according to the KL divergence, and optimizing and updating weight parameters of the internal control information characterization model according to the target loss function;
after the weight parameters of the internal control information representation model are updated, updating the weight parameters of the momentum model by adopting an exponential moving average;
re-calculating the KL divergence between model outputs until the KL divergence is smaller than or equal to a preset loss allowable value, storing the weight parameters of the internal control information characterization model, and obtaining an optimized internal control information characterization model;
training and optimizing a multi-layer perceptron network according to the internal control information characteristics output by the optimized internal control information characterization model, the service type real labels of the marked text data and the classification loss function to obtain a compliance service identification model;
the training and optimizing the multi-layer perceptron network according to the internal control information characteristics output by the optimized internal control information characterization model, the service type real labels of the marked text data and the classification loss function to obtain a compliance service identification model comprises the following steps:
After the dimension of the internal control information characteristics output by the optimized internal control information characterization model is reduced, inputting the internal control information characteristics into a multi-layer perceptron network, and obtaining the confidence probability that the labeling text data belong to various compliance services;
determining a classification loss function according to the service type real tag of the marked text data and the confidence probabilities of various compliance services;
training and optimizing weight parameters of the multi-layer perceptron network according to the classification loss function to obtain a compliance service identification model;
and carrying out abnormal service detection according to the confidence probabilities of the various compliance services, and obtaining an internal control compliance detection result according to the abnormal service detection result.
2. The method for detecting internal control compliance based on momentum distillation according to claim 1, wherein the internal control information characterization model comprises an encoder and a decoder; the step of inputting, after word segmentation and vectorization are carried out on the text data to be detected, the result into an internal control information characterization model trained by momentum knowledge distillation to obtain internal control information features comprises the following steps:
the text data to be detected is segmented through a Chinese word segmenter, so that a plurality of words are obtained;
vectorizing each word through one-hot encoding, and generating a vocabulary vector sequence from the vectorized words;
and performing context learning and feature fusion on the vocabulary vector sequence through the encoder of the internal control information characterization model, and decoding the output of the encoder through the decoder to obtain internal control information features.
3. The method for detecting internal control compliance based on momentum distillation according to claim 1, wherein the compliance service identification model comprises three hidden full connection layers and one activation function layer; after the internal control information features are subjected to dimension reduction, a compliance service identification model obtained through classification loss function training is input, and the confidence probability that the text data to be detected belongs to various compliance services is obtained, wherein the method comprises the following steps:
performing dimension reduction on the internal control information features through an independent component analysis algorithm to obtain dimension-reduced internal control information features;
and extracting the characteristics of the internal control information after dimension reduction through a hidden full-connection layer of the compliance service identification model, and processing the characteristic information output by the hidden full-connection layer through the activation function layer to obtain the confidence probabilities of various compliance services.
4. The method for detecting internal control compliance based on momentum distillation according to claim 1, wherein said detecting abnormal traffic according to confidence probabilities of various kinds of said compliance traffic and obtaining internal control compliance detection results according to abnormal traffic detection results comprises:
Detecting whether the confidence probabilities of various compliance services are smaller than a preset confidence probability threshold value;
if yes, generating a detection result of the abnormal service, and generating a detection result of internal control non-compliance according to the detection result of the abnormal service; otherwise, generating a detection result without abnormal service, and generating a detection result of internal control compliance according to the detection result without abnormal service.
5. An internal control compliance detection system based on momentum distillation, comprising:
the text data acquisition module is used for acquiring text data to be detected;
the internal control information characterization module is used for performing word segmentation and vectorization on the text data to be detected and then inputting the result into an internal control information characterization model trained by momentum knowledge distillation to obtain internal control information features;
the service classification and identification module is used for inputting a compliance service identification model obtained through classification loss function training after the internal control information features are subjected to dimension reduction, and obtaining confidence probabilities that the text data to be detected belong to various compliance services;
the text data processing module is used for carrying out word segmentation and vectorization on the obtained marked text data to obtain a real vocabulary vector sequence;
The mask processing module is used for processing the real vocabulary vector sequence according to a mask vocabulary modeling mode to obtain mask data;
the self-supervision training module is used for carrying out self-supervision training on a Transformer network according to the mask data to obtain an initial internal control information characterization model;
the self-supervision training module comprises:
the word vector prediction sub-module is used for inputting the mask data into the Transformer network, and predicting the masked original words through the Transformer network to obtain a predicted vocabulary vector sequence;
the cross entropy loss determination sub-module is used for determining a cross entropy loss function according to the predicted vocabulary vector sequence and the real vocabulary vector sequence;
the self-supervision training sub-module is used for training and optimizing the weight parameters of the Transformer network through the cross entropy loss function to obtain an initial internal control information characterization model;
the knowledge distillation training module is used for constructing a momentum model according to an exponential moving average method, and carrying out knowledge distillation training on an initial internal control information characterization model according to the momentum model to obtain an optimized internal control information characterization model;
the knowledge distillation training module comprises:
the momentum model construction sub-module is used for constructing a momentum model according to the network structure of the internal control information characterization model and copying initial weight parameters of the internal control information characterization model to the momentum model;
the KL divergence calculation sub-module is used for inputting the real vocabulary vector sequence into both the internal control information characterization model and the momentum model, and calculating the KL divergence between the outputs of the two models;
the target loss determination sub-module is used for determining a target loss function according to the KL divergence, and optimizing and updating the weight parameters of the internal control information characterization model according to the target loss function;
the exponential smoothing sub-module is used for updating the weight parameters of the momentum model by an exponential moving average after the weight parameters of the internal control information characterization model are updated;
the iterative optimization sub-module is used for recalculating the KL divergence between model outputs until the KL divergence is smaller than or equal to a preset loss allowable value, saving the weight parameters of the internal control information characterization model, and obtaining an optimized internal control information characterization model;
the recognition model training module is used for training and optimizing the multi-layer perceptron network according to the internal control information characteristics output by the optimized internal control information characterization model, the service type real labels of the marked text data and the classification loss function to obtain a compliance service recognition model;
the recognition model training module comprises:
the internal control information identification sub-module is used for reducing the dimension of the internal control information features output by the optimized internal control information characterization model and inputting them into the multi-layer perceptron network, to obtain the confidence probabilities that the marked text data belongs to various compliance services;
the classification loss determination sub-module is used for determining a classification loss function according to the service type real label of the marked text data and the confidence probabilities of the various compliance services;
the identification model optimization sub-module is used for training and optimizing the weight parameters of the multi-layer perceptron network according to the classification loss function to obtain a compliance service identification model;
and the internal control compliance detection module is used for carrying out abnormal service detection according to the confidence probabilities of various compliance services and obtaining an internal control compliance detection result according to the abnormal service detection result.
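The momentum distillation loop recited for the knowledge distillation training module (copy the weights, compare the two models' outputs with a KL divergence, update the characterization model, then update the momentum model by an exponential moving average) can be sketched in plain Python. This is a hedged illustration under stated assumptions, not the claimed implementation: the helper names `softmax`, `kl_divergence` and `ema_update`, the decay value 0.995, the direction KL(momentum || model), and the toy weights and logits are all hypothetical, and the gradient step is replaced by a hand-written stand-in.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into a probability distribution."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(teacher: List[float], student: List[float], eps: float = 1e-12) -> float:
    """KL(teacher || student). The direction is an assumption; the claim does not fix it."""
    return sum(t * math.log((t + eps) / (s + eps)) for t, s in zip(teacher, student))


def ema_update(momentum: Dict[str, float], model: Dict[str, float], decay: float = 0.995) -> Dict[str, float]:
    """Exponential moving average: new = decay * momentum + (1 - decay) * model."""
    return {name: decay * momentum[name] + (1.0 - decay) * model[name] for name in momentum}


# Toy walk-through with hypothetical scalar "weights".
model_weights = {"w0": 0.2, "w1": -0.4}
momentum_weights = dict(model_weights)  # copy the initial weights to the momentum model

# Stand-in for one optimizer step on the characterization model (no real gradients here).
model_weights = {"w0": 0.3, "w1": -0.1}

# After the model update, the momentum model follows it slowly.
momentum_weights = ema_update(momentum_weights, model_weights)

# Hypothetical output logits of the two models for the same input.
student_probs = softmax([1.0, 0.5, -0.2])
teacher_probs = softmax([0.9, 0.6, -0.1])
distillation_loss = kl_divergence(teacher_probs, student_probs)
```

In a real training loop the last three steps would repeat until the KL divergence is at or below the preset tolerance; a large decay keeps the momentum model a smoothed, slowly changing copy of the characterization model.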
CN202310248087.3A 2023-03-15 2023-03-15 Internal control compliance detection method and system based on momentum distillation Active CN116187322B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310248087.3A CN116187322B (en) 2023-03-15 2023-03-15 Internal control compliance detection method and system based on momentum distillation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310248087.3A CN116187322B (en) 2023-03-15 2023-03-15 Internal control compliance detection method and system based on momentum distillation

Publications (2)

Publication Number Publication Date
CN116187322A CN116187322A (en) 2023-05-30
CN116187322B CN116187322B (en) 2023-07-25

Family

ID=86442431

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310248087.3A Active CN116187322B (en) 2023-03-15 2023-03-15 Internal control compliance detection method and system based on momentum distillation

Country Status (1)

Country Link
CN (1) CN116187322B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114818902A (en) * 2022-04-21 2022-07-29 浪潮云信息技术股份公司 Text classification method and system based on knowledge distillation

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110472730A (en) * 2019-08-07 2019-11-19 交叉信息核心技术研究院(西安)有限公司 Self-distillation training method and scalable dynamic prediction method for convolutional neural networks
US20220180206A1 (en) * 2020-12-09 2022-06-09 International Business Machines Corporation Knowledge distillation using deep clustering
CN114841244B (en) * 2022-04-05 2024-03-12 西北工业大学 Target detection method based on robust sampling and mixed attention pyramid

Also Published As

Publication number Publication date
CN116187322A (en) 2023-05-30

Similar Documents

Publication Publication Date Title
CN110532542B (en) Invoice false invoice identification method and system based on positive case and unmarked learning
CN109302410B (en) Method and system for detecting abnormal behavior of internal user and computer storage medium
CN111314331A (en) Unknown network attack detection method based on conditional variation self-encoder
KR20200125682A (en) How and system to search video time segment
CN111339260A (en) BERT and QA thought-based fine-grained emotion analysis method
CN112613569B (en) Image recognition method, training method and device for image classification model
CN114818708B (en) Key information extraction method, model training method, related device and electronic equipment
CN116127953B (en) Chinese spelling error correction method, device and medium based on contrast learning
CN111523421A (en) Multi-user behavior detection method and system based on deep learning and fusion of various interaction information
CN116402630B (en) Financial risk prediction method and system based on characterization learning
CN110008699A (en) Neural-network-based software vulnerability detection method and device
CN112883990A (en) Data classification method and device, computer storage medium and electronic equipment
CN113986674A (en) Method and device for detecting abnormity of time sequence data and electronic equipment
CN116402352A (en) Enterprise risk prediction method and device, electronic equipment and medium
CN114936290A (en) Data processing method and device, storage medium and electronic equipment
CN117151223B (en) Multi-modal entity identification and relation extraction method based on learning prompt
CN113268985A (en) Relationship path-based remote supervision relationship extraction method, device and medium
CN116187322B (en) Internal control compliance detection method and system based on momentum distillation
CN111859925A (en) Emotion analysis system and method based on probability emotion dictionary
CN115098681A (en) Open service intention detection method based on supervised contrast learning
CN115357711A (en) Aspect level emotion analysis method and device, electronic equipment and storage medium
CN115269925A (en) Non-biased scene graph generation method based on hierarchical structure
Navarro-Cerdan et al. Batch-adaptive rejection threshold estimation with application to OCR post-processing
CN110599230A (en) Method for constructing pricing model of second-hand vehicle, pricing method and device
CN117151117B (en) Automatic identification method, device and medium for power grid lightweight unstructured document content

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant