CN114936600A - Document abnormity monitoring method, device, equipment and storage medium - Google Patents

Document abnormity monitoring method, device, equipment and storage medium

Info

Publication number
CN114936600A
Authority
CN
China
Prior art keywords
document
sample data
real
classification
service data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210590897.2A
Other languages
Chinese (zh)
Inventor
倪豪
陈少杰
杨龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Bank Co Ltd filed Critical Ping An Bank Co Ltd
Priority to CN202210590897.2A priority Critical patent/CN114936600A/en
Publication of CN114936600A publication Critical patent/CN114936600A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Finance (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Resources & Organizations (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of artificial intelligence and discloses a document abnormity monitoring method, a device, equipment and a storage medium. The method comprises the following steps: acquiring real-time service data of a monitored object, and inputting the real-time service data into a preset target classification model; carrying out classification prediction on the real-time service data through the classification model to obtain the document type of the real-time service data; calculating and integrating the real-time service data according to the document type, and detecting whether the document index obtained by the calculation and integration meets the corresponding preset alarm rule; and when the document index is determined to meet the corresponding preset alarm rule, carrying out alarm reminding for the abnormal real-time service data through a preset monitoring host. The invention determines, through deep learning of the model, the credit batch types of a financial product in different life cycles so as to judge whether the corresponding credit batch is delayed or wrong, thereby improving the data abnormity monitoring efficiency.

Description

Document abnormity monitoring method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a document abnormity monitoring method, a device, equipment and a storage medium.
Background
The existing credit batch monitoring is completely passive: a product manager or operation and maintenance (O&M) staff must configure the relevant credit batch rules for each product in time. A program then calculates, according to the rules configured by the product manager, whether the corresponding credit batch is delayed or wrong, and sends an alarm at regular intervals when a problem exists, reminding the relevant product manager and O&M staff to configure and disclose the information in time so that delayed disclosure is avoided.
Therefore, how to implement credit batch monitoring quickly, without requiring a product manager or O&M staff to configure the relevant credit batch rules for different products in time, while a program still calculates whether the corresponding credit batch is delayed or wrong and sends alarms at regular intervals, is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The invention mainly aims to judge whether the corresponding credit is delayed or wrong through artificial intelligence, thereby improving the data abnormity monitoring efficiency.
The invention provides a document abnormity monitoring method in a first aspect, which comprises the following steps: acquiring real-time service data of a monitored object, and inputting the real-time service data into a preset target classification model; carrying out classification prediction on the real-time service data through the classification model to obtain the document type of the real-time service data; calculating and integrating the real-time service data according to the document type, and detecting whether the document index after precise calculation and integration meets a corresponding preset alarm rule; and when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, carrying out alarm reminding on the monitored object through a preset monitoring host.
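The four steps of the first aspect can be illustrated with a minimal sketch. It is not the claimed implementation: the `Record` fields, the keyword-based `classify` stand-in for the preset target classification model, the `overdue_count` index and the contents of `ALARM_RULES` are all hypothetical names and rules chosen only to make the flow concrete.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List


@dataclass
class Record:
    """One piece of real-time service data (all fields are hypothetical)."""
    product_id: str
    text: str
    due_date: date   # date by which the document should have been disclosed
    disclosed: bool


def classify(record: Record) -> str:
    """Hypothetical stand-in for the preset target classification model."""
    return "periodic_report" if "report" in record.text.lower() else "other"


def overdue_count(records: List[Record], today: date) -> int:
    """Hypothetical document index: number of undisclosed, overdue documents."""
    return sum(1 for r in records if not r.disclosed and r.due_date < today)


# Hypothetical preset alarm rules, keyed by document type.
ALARM_RULES: Dict[str, Callable[[int], bool]] = {
    "periodic_report": lambda overdue: overdue > 0,
}


def monitor(records: List[Record], today: date) -> List[str]:
    """Classify, integrate per document type, check the rule, collect alerts."""
    by_type: Dict[str, List[Record]] = {}
    for r in records:
        by_type.setdefault(classify(r), []).append(r)

    alerts: List[str] = []
    for doc_type, group in by_type.items():
        rule = ALARM_RULES.get(doc_type)
        if rule is None:
            continue  # no alarm rule configured for this document type
        index = overdue_count(group, today)
        if rule(index):
            alerts.append(f"{doc_type}: {index} overdue document(s)")
    return alerts
```

In a real deployment the returned alert strings would be handed to the monitoring host; here they are simply returned so the logic can be inspected.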
Optionally, in a first implementation manner of the first aspect of the present invention, before the acquiring real-time service data of a monitored object and inputting the real-time service data into a preset target classification model, the method further includes: obtaining historical service data of a monitored object and a label corresponding to the historical service data, and dividing the historical service data into first training sample data and first test sample data according to a preset proportion, wherein the historical service data comprises document data and credit batch data; extracting features of the first training sample data based on a feature extraction algorithm to obtain a category vector corresponding to the first training sample data; building a convolutional neural network model, and inputting the category vector corresponding to the first training sample data into a classification network of the convolutional neural network model to obtain a first probability of each category corresponding to the first training sample data; obtaining target characterization weight vectors of all categories, and calculating a target loss function of the convolutional neural network model according to the target characterization weight vectors, the first probability and the labels corresponding to the first training sample data; and iteratively updating the convolutional neural network model based on the target loss function until the convolutional neural network model converges to obtain an initial classification model.
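As an illustration of dividing the historical service data into training and test sample data according to a preset proportion, the following sketch shuffles labelled samples and cuts the list at a ratio. The function name `split_samples`, the default ratio of 0.8 and the fixed seed are assumptions; the text does not state the proportion or whether the data is shuffled.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_samples(samples: Sequence[T], train_ratio: float = 0.8,
                  seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle labelled samples and split them by a preset proportion."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

A seeded shuffle keeps the split reproducible between runs, which makes later accuracy comparisons between model versions meaningful.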
Optionally, in a second implementation manner of the first aspect of the present invention, the performing, by using a feature extraction algorithm, feature extraction on the first training sample data to obtain a class vector corresponding to the first training sample data includes: inputting the first training sample data into a preset bidirectional LSTM model, and extracting a hidden state sequence corresponding to the first training sample data through the bidirectional LSTM model; performing self-attention processing on the hidden state sequence corresponding to the first training sample data through an attention mechanism to obtain a characterization vector corresponding to the first training sample data; and constructing a category vector corresponding to the characterization vector corresponding to the first training sample data.
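The attention step can be sketched in plain Python. The text does not specify the attention variant, so this example assumes a simple attention pooling: each hidden state is scored by its dot product with a context vector (hypothetical, normally learned), the scores are normalised with a softmax, and the characterization vector is the weighted sum of the hidden states. The hidden-state sequence is taken as given rather than produced by an actual bidirectional LSTM.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states: List[Vector], context: Vector) -> Vector:
    """Weighted sum of hidden states; weights come from dot-product scores."""
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden_states))
            for d in range(dim)]
```

With a zero context vector every time step receives the same weight, so the result reduces to the mean of the hidden states; a trained context vector shifts weight towards the time steps most useful for classification.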
Optionally, in a third implementation manner of the first aspect of the present invention, the obtaining target characterization weight vectors of each category includes: inputting the first training sample data into a feature embedding network of a convolutional neural network model to obtain a characterization vector corresponding to the first training sample data; acquiring initial characterization weight vectors of all classes, and determining initial distances between the characterization vectors corresponding to the first training sample data and the initial characterization weight vectors of all classes; determining second probabilities of the classes corresponding to the first training sample data based on the initial distance; constructing a first loss function according to the label corresponding to the first training sample data and the second probability; and training the feature embedded network of the convolutional neural network model through the first loss function, and obtaining target characterization weight vectors corresponding to all classes when the training stopping condition is met.
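One way to turn distances into the second probabilities is a softmax over negative distances, followed by a cross-entropy loss against the label. This is a sketch under assumptions: the text names neither the distance metric nor the mapping from distance to probability, so Euclidean distance, the softmax and the cross-entropy form of the first loss function are all choices made for illustration.

```python
import math
from typing import Dict, List

Vector = List[float]


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def class_probabilities(rep: Vector,
                        class_weights: Dict[str, Vector]) -> Dict[str, float]:
    """Softmax over negative distances: nearer class vectors score higher."""
    neg = {c: -euclidean(rep, w) for c, w in class_weights.items()}
    m = max(neg.values())
    exps = {c: math.exp(v - m) for c, v in neg.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def cross_entropy(probs: Dict[str, float], label: str) -> float:
    """Loss for one sample: negative log-probability of the true class."""
    return -math.log(probs[label])
```

Minimising this loss pulls each characterization vector towards the weight vector of its own class and away from the others, which is how the class weight vectors become usable targets.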
Optionally, in a fourth implementation manner of the first aspect of the present invention, the iteratively updating the convolutional neural network model based on the target loss function until the convolutional neural network model converges, and obtaining an initial classification model includes: training a convolutional neural network model through the target loss function, and adjusting the weight parameters of a feature extraction layer and the weight parameters of a full connection layer in a classification network; stopping training when a preset condition is met, and obtaining the target weight of a feature extraction layer and the target weight of a full connection layer in the classification network; and the target weight of the feature extraction layer and the target weight of the full connection layer are parameters in the trained convolutional neural network model.
Optionally, in a fifth implementation manner of the first aspect of the present invention, after the iteratively updating the convolutional neural network model based on the target loss function until the convolutional neural network model converges to obtain an initial classification model, the method further includes: testing the initial classification model through the test sample data, and determining, according to the test result, the classification accuracy for the test sample data of each label class and the overall classification accuracy of the test sample data; and when the overall classification accuracy does not reach the expected effect, acquiring new training sample data of different label categories from a preset database, and training the classification model through the new training sample data until the overall classification accuracy meets the preset requirement, to obtain a target classification model.
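The test-and-retrain loop can be sketched as follows. The callables `train` and `fetch_more`, the threshold `required` and the cap `max_rounds` are hypothetical; in particular the cap on rounds is an added safeguard that the text does not mention.

```python
from typing import Callable, List, Tuple

Sample = Tuple[str, str]           # (text, label); hypothetical representation
Model = Callable[[str], str]       # a trained model maps text to a label


def accuracy(predict: Model, test_set: List[Sample]) -> float:
    """Overall classification accuracy on the test sample data."""
    correct = sum(1 for text, label in test_set if predict(text) == label)
    return correct / len(test_set) if test_set else 0.0


def train_until_accurate(train: Callable[[List[Sample]], Model],
                         initial_samples: List[Sample],
                         fetch_more: Callable[[], List[Sample]],
                         test_set: List[Sample],
                         required: float = 0.9,
                         max_rounds: int = 5) -> Tuple[Model, float]:
    """Retrain with newly fetched samples until accuracy meets the requirement."""
    samples = list(initial_samples)
    model = train(samples)
    for _ in range(max_rounds):
        if accuracy(model, test_set) >= required:
            break
        extra = fetch_more()       # new training samples from the database
        if not extra:
            break                  # nothing left to add; stop early
        samples.extend(extra)
        model = train(samples)
    return model, accuracy(model, test_set)
```

The function returns both the final model and its measured accuracy, so the caller can tell whether the requirement was actually met or the loop simply ran out of data or rounds.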
Optionally, in a sixth implementation manner of the first aspect of the present invention, the testing the initial classification model through the test sample data, and determining, according to the test result, the classification accuracy for the test sample data of each label class and the overall classification accuracy of the test sample data includes: determining a confusion matrix corresponding to the test samples of each intention class according to the test results of those test samples, wherein the confusion matrix comprises test result parameters corresponding to positive test samples and negative test samples; calculating the classification precision of the test samples of each intention class according to the test result parameters in the confusion matrix; and smoothing the classification precision of the test samples of the different intention classes to obtain the overall classification precision of the test sample set.
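The per-class confusion matrix and the overall precision can be sketched in a few lines. The text does not define the "smoothing" operation; this example assumes it is a plain (macro) average of the per-class precisions, which is one plausible reading and not a statement of the patented method. Function names are hypothetical.

```python
from typing import Dict, List


def per_class_counts(y_true: List[str],
                     y_pred: List[str]) -> Dict[str, Dict[str, int]]:
    """One-vs-rest confusion counts (TP, FP, FN, TN) for every class."""
    classes = sorted(set(y_true) | set(y_pred))
    counts = {c: {"tp": 0, "fp": 0, "fn": 0, "tn": 0} for c in classes}
    for t, p in zip(y_true, y_pred):
        for c in classes:
            if p == c and t == c:
                counts[c]["tp"] += 1
            elif p == c:
                counts[c]["fp"] += 1
            elif t == c:
                counts[c]["fn"] += 1
            else:
                counts[c]["tn"] += 1
    return counts


def precision(c: Dict[str, int]) -> float:
    """TP / (TP + FP), defined as 0.0 when the class was never predicted."""
    predicted_positive = c["tp"] + c["fp"]
    return c["tp"] / predicted_positive if predicted_positive else 0.0


def macro_precision(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of the per-class precisions (assumed 'smoothing')."""
    counts = per_class_counts(y_true, y_pred)
    return sum(precision(c) for c in counts.values()) / len(counts)
```

An unweighted mean gives rare classes the same influence as common ones, which matters when negative samples are scarce and unevenly distributed across classes.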
A second aspect of the present invention provides a document abnormality monitoring apparatus, including: a first acquisition module, used for acquiring real-time service data of a monitored object and inputting the real-time service data into a preset target classification model; a prediction module, used for carrying out classification prediction on the real-time service data through the classification model to obtain the document type of the real-time service data; a detection module, used for calculating and integrating the real-time service data according to the document type and detecting whether the document index obtained by the calculation and integration meets the corresponding preset alarm rule; and a warning module, used for carrying out alarm reminding on the monitored object through a preset monitoring host when the document index is determined to meet the corresponding preset alarm rule.
Optionally, in a first implementation manner of the second aspect of the present invention, the document abnormality monitoring apparatus further includes: a second acquisition module, used for acquiring historical service data of a monitored object and a label corresponding to the historical service data, and dividing the historical service data into first training sample data and first test sample data according to a preset proportion, wherein the historical service data comprises document data and credit batch data; a feature extraction module, used for extracting features of the first training sample data based on a feature extraction algorithm to obtain a category vector corresponding to the first training sample data; a building module, used for building a convolutional neural network model and inputting the category vector corresponding to the first training sample data into a classification network of the convolutional neural network model to obtain a first probability of each category corresponding to the first training sample data; a calculation module, used for acquiring target characterization weight vectors of all categories and calculating a target loss function of the convolutional neural network model according to the target characterization weight vectors, the first probability and the labels corresponding to the first training sample data; and an updating module, used for iteratively updating the convolutional neural network model based on the target loss function until the convolutional neural network model converges to obtain an initial classification model.
Optionally, in a second implementation manner of the second aspect of the present invention, the feature extraction module is specifically configured to: input the first training sample data into a preset bidirectional LSTM model, and extract a hidden state sequence corresponding to the first training sample data through the bidirectional LSTM model; perform self-attention processing on the hidden state sequence corresponding to the first training sample data through an attention mechanism to obtain a characterization vector corresponding to the first training sample data; and construct a category vector corresponding to the characterization vector corresponding to the first training sample data.
Optionally, in a third implementation manner of the second aspect of the present invention, the calculation module is specifically configured to: inputting the first training sample data into a feature embedding network of a convolutional neural network model to obtain a characterization vector corresponding to the first training sample data; acquiring initial characterization weight vectors of all classes, and determining initial distances between the characterization vectors corresponding to the first training sample data and the initial characterization weight vectors of all classes; determining second probabilities of the first training sample data corresponding to the classes based on the initial distance; constructing a first loss function according to the label corresponding to the first training sample data and the second probability; and training the feature embedded network of the convolutional neural network model through the first loss function, and obtaining target characterization weight vectors corresponding to all classes when the training stopping condition is met.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the update module includes: the adjusting unit is used for training a convolutional neural network model through the target loss function and adjusting the weight parameters of the feature extraction layer and the weight parameters of the full connection layer in the classification network; the training unit is used for stopping training when a preset condition is met to obtain the target weight of the feature extraction layer and the target weight of the full connection layer in the classification network; and the target weight of the feature extraction layer and the target weight of the full connection layer are parameters in the trained convolutional neural network model.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the document abnormality monitoring apparatus further includes: a test module, used for testing the initial classification model through the test sample data and determining, according to the test result, the classification accuracy for the test sample data of each label class and the overall classification accuracy of the test sample data; and a training module, used for acquiring new training sample data of different label categories from a preset database when the overall classification accuracy does not reach the expected effect, and training the classification model through the new training sample data until the overall classification accuracy meets the preset requirement, to obtain a target classification model.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the test module is specifically configured to: determining a confusion matrix corresponding to each intention type test sample according to the test result of each intention type test sample, wherein the confusion matrix comprises test result parameters corresponding to a positive test sample and a negative test sample; calculating the classification precision of the intention classification test sample according to the test result parameters in the confusion matrix; and smoothing the classification precision of the test samples with different intention classes to obtain the overall classification precision of the test sample set.
A third aspect of the present invention provides a document abnormality monitoring apparatus, comprising: a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line;
the at least one processor invokes the instructions in the memory to cause the document anomaly monitoring device to perform the steps of the document anomaly monitoring method described above.
A fourth aspect of the present invention provides a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to perform the steps of the above-described document anomaly monitoring method.
In the technical scheme provided by the invention, real-time service data of a monitored object is acquired and input into a preset target classification model; classification prediction is performed on the real-time service data through the classification model to obtain the document type of the real-time service data; the real-time service data is calculated and integrated according to the document type, and whether the document index obtained by the calculation and integration meets the corresponding preset alarm rule is detected; and when the document index is determined to meet the corresponding preset alarm rule, alarm reminding is carried out for the abnormal real-time service data through a preset monitoring host. According to the invention, the document types that financial products need to maintain in different life cycles are determined based on artificial intelligence learning, and credit batch monitoring is rapidly realized according to the document types. A product manager or operation and maintenance staff is not required to configure the relevant credit batch rules for different products in time; a program can calculate whether the corresponding credit batch is delayed or wrong according to the applicable rules and send alarms at regular intervals if problems exist, so that the data abnormity monitoring efficiency is improved.
Drawings
FIG. 1 is a schematic diagram of a document anomaly monitoring method according to a first embodiment of the present invention;
FIG. 2 is a diagram of a document anomaly monitoring method according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of a third embodiment of a document anomaly monitoring method provided by the present invention;
FIG. 4 is a diagram of a fourth embodiment of a document anomaly monitoring method provided by the present invention;
FIG. 5 is a diagram of a fifth embodiment of a document anomaly monitoring method provided by the present invention;
FIG. 6 is a schematic diagram of a document anomaly monitoring device according to a first embodiment of the present invention;
FIG. 7 is a schematic diagram of a document anomaly monitoring device according to a second embodiment of the present invention;
FIG. 8 is a schematic diagram of an embodiment of a document abnormality monitoring device provided by the present invention.
Detailed Description
According to the document abnormity monitoring method, device, equipment and storage medium provided by the embodiments of the invention, real-time service data of a monitored object is first acquired and input into a preset target classification model; classification prediction is performed on the real-time service data through the classification model to obtain the document type of the real-time service data; the real-time service data is calculated and integrated according to the document type, and whether the document index obtained by the calculation and integration meets the corresponding preset alarm rule is detected; and when the document index is determined to meet the corresponding preset alarm rule, alarm reminding is carried out for the abnormal real-time service data through a preset monitoring host. According to the invention, the document types that financial products need to maintain in different life cycles are determined based on artificial intelligence learning, and credit batch monitoring is rapidly realized according to the document types. A product manager or operation and maintenance staff does not need to configure the relevant credit batch rules for different products in time; a program can calculate whether the corresponding credit batch is delayed or wrong according to the applicable rules and send alarms at regular intervals if it is, so that the data abnormity monitoring efficiency is improved.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For convenience of understanding, a specific flow of the embodiment of the present invention is described below, and referring to fig. 1, a first embodiment of a document anomaly monitoring method in the embodiment of the present invention includes:
101. Acquiring real-time service data of a monitored object, and inputting the real-time service data into a preset target classification model;
In this embodiment, in mature industrial production the production yield is extremely high, so the defective products available for learning account for only a small percentage. Therefore, although industrial production can provide a very large training set for smart manufacturing, the proportion of positive samples (generated by good products) is high, the number of negative samples (generated by defective products) available for learning is small, and the negative samples are distributed very unevenly across categories. Training a classification model directly on the whole training set therefore gives poor results. In this embodiment, part of the data is extracted for classification model training, and the model is then adjusted and optimized according to the training result, which can improve the training effect and shorten the training time.
102. Classifying and predicting the real-time service data through a classification model to obtain the document type of the real-time service data;
In this embodiment, considering that the characteristics of each target organization of the trust company are different, a classification prediction model may be established specifically for each target organization. Because the characteristics of the target organizations differ to some extent, using a separate prediction model for each target organization allows the credit batch to be monitored according to the characteristics of that organization. If the credit batch has not been uploaded at the current stage, a corresponding reminder is required and the expiration time and the disclosure time are calculated, thereby avoiding unnecessary complaints and loss of resources.
In order to ensure that the financial products of each target organization of the trust company have reliable data support in different life cycles, the document data and credit batch data related to different financial products in different cycles can be acquired from a preset database in advance. The cycle may be a regular time range, such as one week or one month, and the application is not limited thereto.
In this embodiment, documents refer to the various kinds of documents that may indicate whether a contract, for example a credit contract, has been performed and to what degree.
Specifically, the document data and credit batch data acquired from the database can be used as training samples and divided into a training set and a test set according to a preset proportion to train the convolutional neural network model, so that an initial classification prediction model is obtained.
After the classification prediction model is trained, the initial classification prediction model is tested through the test set; meanwhile, the classification prediction accuracy of the initial classification prediction model (namely, whether the credit batch rule that the deep learning result assigns to a product is accurate) is calculated according to the actual labels (the document type and the credit batch type) of the sample data.
When the accuracy is low, new training sample data is obtained to train the model (that is, when the predictions are inaccurate, the input feature training data set needs to be enlarged so as to improve the accuracy of the algorithm), which improves the accuracy of the model's deep learning.
103. Calculating and integrating the real-time service data according to the document type, and detecting whether the document index after precise calculation and integration meets the corresponding preset alarm rule;
in this embodiment, when a document abnormal investigation request is received, an abnormal document corresponding to the document abnormal investigation request and a first document identifier of the abnormal document are obtained.
When the server receives the document abnormal investigation request, the server obtains the abnormal document to be investigated corresponding to the document abnormal investigation request and a first document identification corresponding to the abnormal document, wherein the first document identification refers to identification information of the abnormal document, for example, a document number of the abnormal document.
In this embodiment, the triggering form of the document abnormal investigation request is not specifically limited. The request may be triggered manually by a user. For example, the user clicks an "abnormal investigation" button on the terminal; the terminal sends the document abnormal investigation request to the server, and on receiving it the server takes each document that did not pass the test (that is, whose test result includes an abnormal tag) as the abnormal document to be investigated. For another example, the user selects an abnormal document on the terminal and issues a voice abnormal investigation instruction; the terminal sends the document abnormal investigation request to the server, and on receiving it the server takes the abnormal document selected by the user as the abnormal document to be investigated. Alternatively, the request may be triggered automatically. For example, a triggering condition is preset in the server: when the server receives a document test output indicating that the test did not pass, the server automatically triggers a document abnormal investigation request.
Further, acquiring editing information in the warehousing documents, taking the editing information and editing position identification corresponding to the editing information as document synthesis information, storing the document synthesis information in a preset synthesis table, and taking the preset warehousing table and the preset synthesis table as a preset document record table.
The server acquires editing information in the warehousing documents, the server takes editing position identification corresponding to the editing information and the editing information as document synthesis information, the server stores the document synthesis information and document identification in a preset synthesis table in an associated mode, the server takes the preset warehousing table and the preset synthesis table as a preset document record table, and abnormal reasons of abnormal documents are analyzed through the document warehousing information and the document synthesis information in the preset document record table, so that abnormal analysis of the documents is more accurate. The server obtains abnormal reasons of the abnormal documents and classifies the abnormal reasons according to preset alarm rules, wherein the preset alarm rules refer to preset abnormal classification rules, for example, document information filling errors, and the server counts the occurrence frequency of various abnormal reasons. Namely, according to the document type, calculating and integrating the real-time service data, and detecting whether the document index after precise calculation and integration meets the corresponding preset alarm rule.
104. And when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, carrying out alarm reminding on the real-time service data through a preset monitoring host.
In this embodiment, after the monitoring host receives the monitoring instruction and acquires the corresponding document index through the data acquisition host according to the monitoring instruction, the monitoring host determines the corresponding monitoring template according to the monitoring instruction, performs calculation and integration on the acquired document index through the monitoring template, and detects whether the document index after calculation and integration meets the corresponding preset alarm rule. And if the document indexes after calculation and integration meet the corresponding preset alarm rules, carrying out alarm reminding to prompt corresponding operation and maintenance personnel to carry out maintenance and inspection in time, or informing corresponding responsible persons of the abnormal conditions to prompt the responsible persons to solve the abnormal conditions. And if the document indexes after calculation and integration do not meet the corresponding preset alarm rules, no alarm reminding is performed.
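As a rough illustration of the calculate, integrate and check flow above, the sketch below aggregates a few document indexes and evaluates them against alarm rules. The index names (`overdue_count`, `max_days_overdue`), the record fields and the two rules are assumptions made for the example; the source does not specify concrete indexes or rule formats.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List


@dataclass
class AlarmRule:
    """A hypothetical preset alarm rule: a name plus a predicate over the indexes."""
    name: str
    triggered: Callable[[Dict[str, float]], bool]


def integrate_indicators(records: List[dict], today: date) -> Dict[str, float]:
    """Calculate and integrate document indexes from raw service records."""
    overdue_days = [(today - r["due_date"]).days
                    for r in records
                    if not r["submitted"] and r["due_date"] < today]
    return {
        "total": float(len(records)),
        "overdue_count": float(len(overdue_days)),
        "max_days_overdue": float(max(overdue_days, default=0)),
    }


def check_alarms(indicators: Dict[str, float], rules: List[AlarmRule]) -> List[str]:
    """Return the names of all rules that the integrated indexes satisfy."""
    return [rule.name for rule in rules if rule.triggered(indicators)]


records = [
    {"due_date": date(2022, 5, 20), "submitted": False},
    {"due_date": date(2022, 5, 25), "submitted": True},
    {"due_date": date(2022, 6, 1), "submitted": False},
]
rules = [
    AlarmRule("overdue", lambda ind: ind["overdue_count"] > 0),
    AlarmRule("long_overdue", lambda ind: ind["max_days_overdue"] > 30),
]

indicators = integrate_indicators(records, today=date(2022, 5, 27))
alarms = check_alarms(indicators, rules)
```

A monitoring host would send a reminder only for the rules returned by `check_alarms`; an empty list means no alarm is raised.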
It should be noted that, when the monitoring host acquires the document indexes, the corresponding monitoring template is determined according to the monitoring instruction, and the corresponding indexes are then selected from the document indexes through the monitoring template for calculation and integration. Of course, in a specific embodiment, the document indexes to be acquired may also be determined according to the monitoring instruction. For example, the input layer of the model established in the present invention takes the product document data configured in operation and maintenance work together with a series of product approval relationship data, such as product establishment date, expiration date and product classification. A product approval type label is defined (because the product approval rules are determined by the classification rule), the input data is divided into a training data set and a testing data set, and the output result is the classification label to which the product belongs. As soon as a subsequent product is established, its data can be input into the trained model to obtain the letter batch classification (document type) of the product immediately, and letter batch monitoring is then rapidly achieved by using the letter batches corresponding to that label, which saves a large amount of labor cost.
In the embodiment of the invention, real-time service data of a monitored object is acquired and input into a preset target classification model; the real-time service data is classified and predicted through the classification model to obtain the document type of the real-time service data; the real-time service data is calculated and integrated according to the document type, and whether the document index after calculation and integration meets the corresponding preset alarm rule is detected; and when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, an alarm reminder is issued for the abnormal real-time service data through a preset monitoring host. According to the invention, the document types of financial products that need to be maintained in different life cycles are determined based on artificial intelligence learning, and credit batch monitoring is rapidly realized according to the document types. A product manager or operation and maintenance staff does not need to configure the related credit batch rules for each product in time; a program can calculate whether the corresponding credit batch is delayed or wrong according to the rules configured by the product manager, and an alarm can be sent at regular intervals if there is a problem, so that the data abnormity monitoring efficiency is improved.
Referring to fig. 2, a second embodiment of the document anomaly monitoring method according to the embodiment of the present invention includes:
201. acquiring historical service data of the real-time service data and a label corresponding to the historical service data, and dividing the historical service data into first training sample data and first test sample data according to a preset proportion;
In this embodiment, historical service data of the real-time service data and the labels corresponding to the historical service data are obtained, and the historical service data is divided into first training sample data and first test sample data according to a preset proportion. Supervised machine learning methods for existing neural network models mostly train and test the model by setting up a training set and a test set. The data in the training set is used to train the model, that is, to determine parameters such as the weights and biases of the model; the test set is used only once, after training is completed, to evaluate the generalization ability of the final model.
A data set is commonly divided into a training set and a test set by directly splitting it into two mutually exclusive sets, one used as the training set and the other as the test set. The samples in the data set are sampled according to a preset proportion, and the data distribution is kept consistent between the training set and the test set, so that the model trained on the training set can perform as well as possible on the test set.
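One common way to keep the label distribution consistent between the two mutually exclusive sets is a stratified split. The sketch below assumes an 80/20 preset proportion and a fixed random seed; both values, and the function name `stratified_split`, are illustrative choices rather than values given in the source.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def stratified_split(samples: Sequence[object],
                     labels: Sequence[Hashable],
                     train_ratio: float = 0.8,
                     seed: int = 0) -> Tuple[List[tuple], List[tuple]]:
    """Split (sample, label) pairs into disjoint train/test sets, label by label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    train, test = [], []
    for label, group in by_label.items():
        group = list(group)          # copy so the caller's data is not reordered
        rng.shuffle(group)
        cut = int(round(len(group) * train_ratio))
        train.extend((s, label) for s in group[:cut])
        test.extend((s, label) for s in group[cut:])
    return train, test


samples = list(range(10))
labels = ["A"] * 5 + ["B"] * 5
train_set, test_set = stratified_split(samples, labels, train_ratio=0.8)
```

Because the cut is applied per label, each class contributes the same share to both sets, which is what keeps the two distributions aligned.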
202. Extracting features of the first training sample data based on a feature extraction algorithm to obtain a class vector corresponding to the first training sample data;
In this embodiment, the features of the samples drawn from the test set and the training set are determined by a pre-trained recognition model, which performs feature extraction on the sample images drawn from the two sets so that the features extracted from similar images are similar.
203. Building a convolutional neural network model, and inputting a class vector corresponding to first training sample data into a classification network of the convolutional neural network model to obtain a first probability of each class corresponding to the first training sample data;
in this embodiment, the terminal inputs the first training sample data into the convolutional neural network model to be trained, and the convolutional neural network model performs feature extraction on the first training sample data through the classification network to be trained to obtain the first probability of each class corresponding to the first training sample data.
In this embodiment, inputting the training sample into the classification network of the convolutional neural network model to obtain the first probability of each class corresponding to the training sample includes: inputting the training sample into the classification network of the convolutional neural network model; performing feature extraction on the training sample through the initial weight of the feature extraction layer in the classification network to obtain an initial feature vector corresponding to the training sample; and performing full-connection processing on the initial feature vector through the initial weight of the full connection layer in the classification network to obtain the first probability of each class corresponding to the training sample.
Specifically, the convolutional neural network model to be trained includes a classification network, and the classification network includes a feature extraction layer and a full connection layer. The terminal inputs the first training sample data into the feature extraction layer of the classification network, obtains the initial weight of the feature extraction layer, and performs feature extraction on the first training sample data based on that initial weight, so that the training sample is converted into a corresponding initial feature vector. The initial feature vector output by the feature extraction layer is then used as the input of the full connection layer. The initial weight of the full connection layer is acquired, and full-connection processing is performed on the initial feature vector based on it to obtain the first probability of the training sample for each category.
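The forward pass just described (feature extraction layer, then full connection layer, then per-class probability) can be sketched in plain Python. The single 1-D convolution kernel with ReLU, the softmax output and all weight values below are assumptions for illustration; the source does not state the layer sizes or the activation functions.

```python
import math
from typing import List


def conv1d_relu(x: List[float], kernel: List[float]) -> List[float]:
    """Feature extraction layer: valid 1-D convolution followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(x[i + j] * kernel[j] for j in range(k)))
            for i in range(len(x) - k + 1)]


def fully_connected(features: List[float],
                    weights: List[List[float]],
                    bias: List[float]) -> List[float]:
    """Full connection layer: one weight row and one bias per class."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax turning class scores into probabilities."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def first_probability(x, kernel, fc_weights, fc_bias) -> List[float]:
    """Initial feature vector from the extraction layer, then class probabilities."""
    features = conv1d_relu(x, kernel)
    return softmax(fully_connected(features, fc_weights, fc_bias))


# Hypothetical initial weights and a toy 4-dimensional input.
probs = first_probability(
    x=[1.0, 2.0, 3.0, 4.0],
    kernel=[-0.5, 0.5],
    fc_weights=[[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]],
    fc_bias=[0.0, 0.0],
)
```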
204. Acquiring target representation weight vectors of all classes, and calculating a target loss function of the convolutional neural network model according to the target representation weight vectors, the first probability and labels corresponding to training sample data;
in this embodiment, target representation weight vectors of each category are obtained, and association probability between each category is determined according to the target representation weight vectors of each category. The target representation weight vector of each category is a target weight for representing key information corresponding to each category, and the association probability represents the association degree between each category.
Specifically, target characterization weight vectors of each category may be obtained, and a distance between each target characterization weight vector is calculated, that is, a distance between each two target characterization weight vectors is calculated. And calculating the association probability between each category according to the distance between each two target characterization weight vectors.
And further, constructing a target loss function according to the label corresponding to the first training sample data, the first probability and the association probability among all the categories.
Specifically, the terminal constructs a first loss function according to the label corresponding to the first training sample data and the first probability of each category corresponding to the first training sample data. And then, the terminal constructs a second loss function according to the first probability of the first training sample data corresponding to each category and the association probability among the categories. Then, the terminal constructs a target loss function according to the first loss function and the second loss function.
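The source does not give the exact form of the two loss terms, so the following is only one plausible reading. It assumes Euclidean distance between target characterization weight vectors, a softmax over negative distances as the association probability, cross entropy for both losses, and a hypothetical mixing weight of 0.5 when the two are combined.

```python
import math
from typing import List


def softmax(values: List[float]) -> List[float]:
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def association_probabilities(class_vectors: List[List[float]]) -> List[List[float]]:
    """Row i holds the assumed association probability of class i with every class."""
    return [softmax([-math.dist(vi, vj) for vj in class_vectors])
            for vi in class_vectors]


def cross_entropy(target: List[float], predicted: List[float],
                  eps: float = 1e-12) -> float:
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


def target_loss(first_prob: List[float], label_index: int,
                assoc: List[List[float]], weight: float = 0.5) -> float:
    """First loss (label vs. prediction) plus weighted second loss (association vs. prediction)."""
    one_hot = [1.0 if i == label_index else 0.0 for i in range(len(first_prob))]
    first_loss = cross_entropy(one_hot, first_prob)
    second_loss = cross_entropy(assoc[label_index], first_prob)
    return first_loss + weight * second_loss


# Hypothetical target characterization weight vectors for three classes.
assoc = association_probabilities([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
loss_good = target_loss([0.80, 0.15, 0.05], label_index=0, assoc=assoc)
loss_bad = target_loss([0.05, 0.15, 0.80], label_index=0, assoc=assoc)
```

Under these assumptions a prediction that puts mass on the labeled class and on classes associated with it receives a lower target loss than one that favors a distant class.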
205. Iteratively updating the convolutional neural network model based on the target loss function until the convolutional neural network model is converged to obtain an initial classification model;
in this embodiment, the initial classification model is trained through the target loss function, and parameters of the initial classification model are adjusted according to the training result. And continuing training based on the initial classification model after the parameters are adjusted until the training is stopped when the preset conditions are met, so as to obtain the trained initial classification model.
In this embodiment, the preset condition may be that a loss value obtained by training the initial classification model through the target loss function is less than or equal to a loss threshold. And when the loss value obtained by the initial classification model trained by the target loss function is less than or equal to the loss threshold, stopping training to obtain the trained initial classification model.
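The stop condition above (train until the loss value is less than or equal to a loss threshold) is illustrated below. To keep the sketch short, a one-feature logistic regression trained by gradient descent stands in for the convolutional neural network; the threshold 0.1, learning rate 0.5 and step cap 1000 are hypothetical values.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_until_converged(xs: List[float], ys: List[int],
                          loss_threshold: float = 0.1,
                          lr: float = 0.5,
                          max_steps: int = 1000) -> Tuple[float, float, float, int]:
    """Update parameters until the loss is <= loss_threshold or max_steps is reached."""
    w, b = 0.0, 0.0
    n = len(xs)
    loss = float("inf")
    for step in range(1, max_steps + 1):
        probs = [min(max(sigmoid(w * x + b), 1e-12), 1 - 1e-12) for x in xs]
        loss = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                    for y, p in zip(ys, probs)) / n
        if loss <= loss_threshold:          # preset condition met: stop training
            return w, b, loss, step
        grad_w = sum((p - y) * x for p, y, x in zip(probs, ys, xs)) / n
        grad_b = sum(p - y for p, y in zip(probs, ys)) / n
        w -= lr * grad_w                    # adjust parameters and keep training
        b -= lr * grad_b
    return w, b, loss, max_steps


w, b, final_loss, steps = train_until_converged([-2.0, -1.0, 1.0, 2.0], [0, 0, 1, 1])
```

The step cap is an added safeguard so the loop terminates even if the threshold is never reached.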
206. Acquiring real-time service data of a monitored object, and inputting the real-time service data into a preset target classification model;
207. classifying and predicting the real-time service data through a classification model to obtain the document type of the real-time service data;
208. calculating and integrating the real-time service data according to the document type, and detecting whether the document indexes after precise calculation and integration meet the corresponding preset alarm rules;
209. and when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, carrying out alarm reminding on the monitored object through a preset monitoring host.
Steps 201 and 204 and 207 in this embodiment are similar to steps 101 and 106 in the first embodiment, and are not described herein again.
In the embodiment of the invention, real-time service data of a monitored object is acquired and input into a preset target classification model; the real-time service data is classified and predicted through the classification model to obtain the document type of the real-time service data; the real-time service data is calculated and integrated according to the document type, and whether the document index after calculation and integration meets the corresponding preset alarm rule is detected; and when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, an alarm reminder is issued for the abnormal real-time service data through a preset monitoring host. According to the invention, the document types of financial products that need to be maintained in different life cycles are determined based on artificial intelligence learning, and credit batch monitoring is rapidly realized according to the document types. A product manager or operation and maintenance staff does not need to configure the related credit batch rules for each product in time; a program can calculate whether the corresponding credit batch is delayed or wrong according to the rules configured by the product manager, and an alarm can be sent at regular intervals if there is a problem, so that the data abnormity monitoring efficiency is improved.
Referring to fig. 3, a third embodiment of the document anomaly monitoring method according to the embodiment of the present invention includes:
301. acquiring historical service data of the real-time service data and a label corresponding to the historical service data, and dividing the historical service data into first training sample data and first test sample data according to a preset proportion;
302. inputting first training sample data into a preset bidirectional LSTM model, and extracting a hidden state sequence corresponding to the first training sample data through the bidirectional LSTM model;
in this embodiment, the bidirectional LSTM model has a bidirectional hidden layer, and therefore, the training samples are processed based on the bidirectional LSTM model to obtain a hidden state sequence corresponding to the training samples. The hidden state sequence is (h1, h2, …, hr).
303. Carrying out self-attention processing on the hidden state sequence corresponding to the first training sample data through an attention mechanism to obtain a characterization vector corresponding to the first training sample data;
In this embodiment, the attention mechanism is used to derive a fixed-length vector, namely the characterization vector, from the hidden state sequence; the characterization vector is denoted by e.
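A minimal sketch of attention pooling over the hidden state sequence (h1, h2, …, hr) follows. The scoring function, a dot product between a hypothetical parameter vector `v` and tanh of each hidden state, is an assumption; the source only states that an attention mechanism yields a fixed-length vector e.

```python
import math
from typing import List, Tuple


def softmax(values: List[float]) -> List[float]:
    m = max(values)
    exps = [math.exp(x - m) for x in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states: List[List[float]],
                   v: List[float]) -> Tuple[List[float], List[float]]:
    """Return the fixed-length vector e and the attention weights over time steps."""
    scores = [sum(vi * math.tanh(hi) for vi, hi in zip(v, h)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    e = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return e, weights


# Three time steps of a 2-dimensional hidden state (toy values).
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
e_uniform, w_uniform = attention_pool(hidden, v=[0.0, 0.0])
e_focus, w_focus = attention_pool(hidden, v=[2.0, 2.0])
```

Whatever the sequence length r, the output e has the dimensionality of a single hidden state, which is what makes it usable as a fixed-length characterization vector.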
304. Constructing a category vector corresponding to a characterization vector corresponding to the first training sample data;
In this embodiment, a random initialization matrix shared by all categories is obtained, and affine transformation is performed on each characterization vector based on the random initialization matrix. The dynamic routing values of the characterization vectors are normalized, and the affine-transformed characterization vectors are weighted and summed based on the dynamic routing values to obtain a characterization vector of each category.
In this embodiment, the normalization process is $d_i = \mathrm{softmax}(b_i)$, where $b_i$ is the initial logit of the dynamic routing and is initialized to 0. The closer a sample vector is to the class vector, the larger its dynamic routing value. The class vector corresponding to the characterization vector of each category is then calculated with the squash function; the class vector is denoted $c_i$, and its length does not exceed 1. Finally, the dynamic routing value is updated as $b_{ij} \leftarrow b_{ij} + e'_{ij} \cdot c_i$, which ensures that the routing value of a sample characterization vector close to the class vector increases.
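The routing loop for a single class can be sketched as below, assuming the normalizing function named above is the squash function of capsule-style dynamic routing and that the routing update adds the dot product between each transformed sample vector and the class vector. The affine transformation is taken as already applied, and the three-iteration default is a hypothetical choice.

```python
import math
from typing import List, Tuple


def softmax(values: List[float]) -> List[float]:
    m = max(values)
    exps = [math.exp(x - m) for x in values]
    total = sum(exps)
    return [e / total for e in exps]


def squash(v: List[float]) -> List[float]:
    """Shrink v so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0.0:
        return [0.0] * len(v)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in v]


def dynamic_routing(sample_vectors: List[List[float]],
                    iterations: int = 3) -> Tuple[List[float], List[float]]:
    """Return the class vector c and the final normalized routing values d."""
    b = [0.0] * len(sample_vectors)                 # routing logits start at 0
    dim = len(sample_vectors[0])
    c = [0.0] * dim
    for _ in range(iterations):
        d = softmax(b)                              # normalize routing values
        c_hat = [sum(dj * e[k] for dj, e in zip(d, sample_vectors))
                 for k in range(dim)]               # weighted sum of samples
        c = squash(c_hat)                           # class vector, length < 1
        b = [bj + sum(ek * ck for ek, ck in zip(e, c))
             for bj, e in zip(b, sample_vectors)]   # raise logits of agreeing samples
    return c, softmax(b)


# Two similar samples and one outlier (toy values).
class_vector, routing = dynamic_routing([[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]])
```

In this toy run the outlier pointing away from the other two samples ends up with the smallest routing value.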
305. Building a convolutional neural network model, and inputting a class vector corresponding to first training sample data into a classification network of the convolutional neural network model to obtain a first probability of each class corresponding to the first training sample data;
306. acquiring target representation weight vectors of all classes, and calculating a target loss function of the convolutional neural network model according to the target representation weight vectors, the first probability and labels corresponding to training sample data;
307. training a convolutional neural network model through a target loss function, and adjusting the weight parameters of a feature extraction layer and the weight parameters of a full connection layer in a classification network;
In this embodiment, the convolutional neural network model to be trained includes a classification network, and the classification network includes a feature extraction layer and a full connection layer. In the untrained classification network, the weight parameters of the feature extraction layer and of the full connection layer are initial weights, which are adjusted to first weights after training. The terminal inputs the first training sample data into the feature extraction layer of the untrained classification network, obtains the initial weight of the feature extraction layer, and performs feature extraction on the first training sample data based on that initial weight to obtain the initial feature vector corresponding to the first training sample data. Then, the terminal acquires the initial weight of the full connection layer and processes the initial feature vector based on that initial weight to obtain the first probability of each category corresponding to the first training sample data. Next, the terminal constructs a first loss function according to the first probability and the label corresponding to the first training sample data, and trains the classification network of the convolutional neural network model based on the first loss function. The weight parameters of the feature extraction layer and the full connection layer are adjusted according to each training result, and training is repeated until the training stop condition is met, so as to obtain the first weight of the feature extraction layer and the first weight of the full connection layer.
In this embodiment, a first loss function is constructed according to a label and a first probability corresponding to first training sample data, a classification network of a convolutional neural network model is trained based on the first loss function, the training is stopped when a training stop condition is met, and first weights corresponding to a feature extraction layer are obtained, where the first weights corresponding to the feature extraction layer are weight parameters in the trained classification network. And carrying out preliminary training on the classification network, obtaining a feature vector corresponding to first training sample data based on the trained classification network, training the feature embedded network on the basis, and accurately obtaining a target characterization weight vector corresponding to each category.
308. Stopping training when a preset condition is met, obtaining target weights of a feature extraction layer and a full connection layer in a classification network, and obtaining an initial classification model based on the target weights;
in the embodiment, a convolutional neural network model is trained through the target loss function, and the weight parameters of a feature extraction layer and the weight parameters of a full connection layer in a classification network are adjusted; stopping training when a preset condition is met, and obtaining the target weight of a feature extraction layer and the target weight of a full connection layer in the classification network; the target weight of the feature extraction layer and the target weight of the full connection layer are parameters in a trained convolutional neural network model.
The preset condition may be that a loss value obtained by training the convolutional neural network model through the target loss function is less than or equal to a loss threshold.
Specifically, the terminal trains the convolutional neural network model through the target loss function, and adjusts the weight parameters of the feature extraction layer and the weight parameters of the full connection layer in the classification network according to each training result. Further, the terminal trains the convolutional neural network model through the target loss function to calculate a loss value, obtains a loss threshold value, and compares the calculated loss value with the loss threshold value. And when the calculated loss value is larger than the loss threshold value, adjusting the weight parameters of the feature extraction layer and the full connection layer in the classification network, and continuing training based on the convolutional neural network model after the weight parameters are adjusted. And stopping training until the loss value obtained by training the convolutional neural network model through the target loss function is less than or equal to the loss threshold value to obtain a trained convolutional neural network model, and obtaining the target weight of a feature extraction layer of the classification network and the target weight of a full connection layer in the trained convolutional neural network model.
In this embodiment, the convolutional neural network model is trained through the target loss function, the weight parameters of the feature extraction layer and the weight parameters of the full connection layer in the classification network are adjusted, and the training is stopped when a preset condition is met, so that the target weights of the feature extraction layer and the full connection layer in the classification network are obtained, and the target weights of the feature extraction layer and the full connection layer are parameters in the trained convolutional neural network model, so that the convolutional neural network model learns the relevance among the classes in the training process, the trained convolutional neural network model can classify the images based on the relevance among the classes, and the classification is more accurate.
309. Acquiring real-time service data of a monitored object, and inputting the real-time service data into a preset target classification model;
310. classifying and predicting the real-time service data through a classification model to obtain the document type of the real-time service data;
311. calculating and integrating the real-time service data according to the document type, and detecting whether the document index after precise calculation and integration meets the corresponding preset alarm rule;
312. and when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, carrying out alarm reminding on the monitored object through a preset monitoring host.
The steps 301-.
In the embodiment of the invention, real-time service data of a monitored object is acquired and input into a preset target classification model; the real-time service data is classified and predicted through the classification model to obtain the document type of the real-time service data; the real-time service data is calculated and integrated according to the document type, and whether the document index after calculation and integration meets the corresponding preset alarm rule is detected; and when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, an alarm reminder is issued for the abnormal real-time service data through a preset monitoring host. According to the invention, the document types of financial products that need to be maintained in different life cycles are determined based on artificial intelligence learning, and credit batch monitoring is rapidly realized according to the document types. A product manager or operation and maintenance staff does not need to configure the related credit batch rules for each product in time; a program can calculate whether the corresponding credit batch is delayed or wrong according to the rules configured by the product manager, and an alarm can be sent at regular intervals if there is a problem, so that the data abnormity monitoring efficiency is improved.
Referring to fig. 4, a fourth embodiment of the document anomaly monitoring method according to the embodiment of the present invention includes:
401. acquiring historical service data of the real-time service data and a label corresponding to the historical service data, and dividing the historical service data into first training sample data and first test sample data according to a preset proportion;
402. performing feature extraction on the first training sample data based on a feature extraction algorithm to obtain a class vector corresponding to the first training sample data;
403. building a convolutional neural network model, and inputting a class vector corresponding to first training sample data into a classification network of the convolutional neural network model to obtain a first probability of each class corresponding to the first training sample data;
404. inputting first training sample data into a feature embedding network of a convolutional neural network model to obtain a characterization vector corresponding to the first training sample data;
in this embodiment, the characterization vector corresponding to the first training sample data is a vector obtained by processing the feature vector of the first training sample data by using an untrained feature embedding network.
Specifically, the initial classification model in this embodiment includes a trained classification network and an untrained feature embedding network. After the terminal trains the classification network in the initial classification model, the first training sample data is input into the trained classification network to obtain a first feature vector corresponding to the first training sample data. Then, the terminal inputs the first feature vector into the untrained feature embedding network and obtains the initial weight value of the feature embedding network. Features of the first feature vector are further extracted based on the initial weight value of the feature embedding network to obtain the characterization vector corresponding to the first training sample data.
405. Acquiring initial characterization weight vectors of all classes, and determining an initial distance between a characterization vector corresponding to the first training sample data and the initial characterization weight vectors of all classes;
In this embodiment, the initial characterization weight vector of each category is a preliminary weight that represents the key information corresponding to that category and characterizes the initial association relationship among the categories. The initial distance is the distance between the characterization vector and the initial characterization weight vector of each category.
Specifically, the terminal presets the initial characterization weight vectors corresponding to the categories, which express the association relationship among the categories. The terminal may then calculate the initial distance between the characterization vector corresponding to the first training sample data and the initial characterization weight vector corresponding to each category.
In this embodiment, the terminal may calculate, according to the distance metric function, a distance between the characterization vector corresponding to the first training sample data and each initial characterization weight vector, to obtain each initial distance.
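The distance calculation described above can be sketched as follows. This is only an illustrative sketch: the embodiment does not name the distance metric function, so Euclidean distance is assumed here, and the function names `euclidean_distance` and `initial_distances` are hypothetical.

```python
import math
from typing import Dict, Sequence


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def initial_distances(
    characterization: Sequence[float],
    class_weights: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Distance from one characterization vector to each category's
    initial characterization weight vector."""
    return {
        label: euclidean_distance(characterization, weight)
        for label, weight in class_weights.items()
    }
```

Any other distance metric function could be substituted for `euclidean_distance` without changing the rest of the flow.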
406. Determining second probabilities of the first training sample data corresponding to the classes based on the initial distance;
in this embodiment, the second probability refers to a probability that the first training sample data output by the untrained feature embedding network belongs to each class.
Specifically, the terminal normalizes the initial distances between the characterization vector and the initial characterization weight vectors corresponding to the categories to obtain the second probability corresponding to the first training sample data. By calculating the distance between the characterization vector and the initial characterization weight vector corresponding to each category, the degree of similarity between the features of the first training sample data and the features of each category can be determined, which yields the second probability that the first training sample data belongs to each category.
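One possible way to normalize the initial distances into second probabilities is sketched below. The embodiment does not specify the normalization, so a softmax over the negated distances is assumed (a smaller distance gives a larger probability); the function name `second_probabilities` is hypothetical.

```python
import math
from typing import Dict


def second_probabilities(distances: Dict[str, float]) -> Dict[str, float]:
    """Turn per-category distances into probabilities.

    Assumption: softmax over negated distances, so that the closest
    category receives the highest probability.
    """
    scores = {label: -d for label, d in distances.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}
```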
407. Constructing a first loss function according to the label corresponding to the first training sample data and the second probability;
In this embodiment, the terminal obtains the label corresponding to the first training sample data and constructs a cross-entropy loss function between the second probability that the first training sample data belongs to each category and the labeled category of the first training sample data; this cross-entropy loss function is the first loss function.
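For a single training sample, the cross-entropy between the second probability and the label can be sketched as below. The sketch assumes a single hard label per sample; the function name `cross_entropy` and the clipping constant `eps` are illustrative choices, not part of the embodiment.

```python
import math
from typing import Dict


def cross_entropy(probabilities: Dict[str, float], label: str, eps: float = 1e-12) -> float:
    """Cross-entropy loss of one sample: the negative log of the
    probability assigned to the labeled category.

    eps (an assumption) guards against log(0).
    """
    return -math.log(max(probabilities[label], eps))
```

The loss is zero when the labeled category receives probability 1 and grows as that probability shrinks.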
408. Training a feature embedding network of a convolutional neural network model through a first loss function, and obtaining target representation weight vectors corresponding to all categories when a training stopping condition is met;
In this embodiment, the feature embedding network is trained based on the constructed loss function: its parameters are adjusted and training is repeated until the training stop condition is satisfied, giving the trained feature embedding network and, with it, the target weights of the feature embedding network and the target characterization weight vectors corresponding to the categories.
In this embodiment, the feature-embedded network is trained by a first loss function, and a loss value for each training is calculated based on the first loss function. And when the loss value output by the feature embedded network is smaller than a preset loss threshold value, satisfying a training stopping condition to obtain the trained feature embedded network, thereby obtaining the target weight corresponding to the feature embedded network and the target representation weight vector corresponding to each category.
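The stopping condition described above can be sketched as a loop that repeats a training step until the loss falls below a preset threshold. The names `train_until_threshold`, `train_step` and `make_toy_step` are hypothetical, and the `max_steps` guard is an added assumption so that the loop always terminates; the toy step merely halves a stored loss and stands in for a real parameter update.

```python
from typing import Callable, Tuple


def train_until_threshold(
    train_step: Callable[[], float],
    loss_threshold: float,
    max_steps: int = 1000,
) -> Tuple[int, float]:
    """Call train_step repeatedly; stop once its returned loss is below
    loss_threshold, or after max_steps (an assumed safety limit)."""
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    loss = float("inf")
    for step in range(1, max_steps + 1):
        loss = train_step()  # one parameter update, returning the loss value
        if loss < loss_threshold:
            return step, loss
    return max_steps, loss


def make_toy_step(initial_loss: float = 1.0) -> Callable[[], float]:
    """Toy stand-in for a real training step: halves the loss on every call."""
    state = {"loss": initial_loss}

    def step() -> float:
        state["loss"] *= 0.5
        return state["loss"]

    return step
```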
In this embodiment, the first training sample data is input into the feature embedding network of the initial classification model to obtain the characterization vector corresponding to the first training sample data, and the initial characterization weight vector corresponding to each category is obtained. By determining the initial distance between the characterization vector and the initial characterization weight vector of each category, the degree of similarity between the features of the first training sample data and the features of each category can be determined, which yields the second probability that the first training sample data belongs to each category. A first loss function is constructed according to the label corresponding to the first training sample data and the second probability, the feature embedding network of the initial classification model is trained through the first loss function, and the target characterization weight vector corresponding to each category is obtained when the training stop condition is met. In this way the initial classification model learns the degree of association between the categories during training, and thereby learns feature information whose association with the categories is uncertain, so that the data can be accurately identified and classified based on the degree of association between the categories and the classification performance of the initial classification model is improved.
409. Iteratively updating the convolutional neural network model based on the target loss function until the convolutional neural network model is converged to obtain an initial classification model;
410. acquiring real-time service data of a monitored object, and inputting the real-time service data into a preset target classification model;
411. classifying and predicting the real-time service data through a classification model to obtain the document type of the real-time service data;
412. calculating and integrating the real-time service data according to the document type, and detecting whether the document indexes after precise calculation and integration meet the corresponding preset alarm rules;
413. and when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, carrying out alarm reminding on the monitored object through a preset monitoring host.
Steps 410-413 in the present embodiment are similar to steps 101-104 in the first embodiment, and are not described herein again.
In the embodiment of the invention, real-time service data of a monitored object is acquired and input into a preset target classification model; the real-time service data is classified and predicted through the classification model to obtain the document type of the real-time service data; the real-time service data is calculated and integrated according to the document type, and whether the document index after precise calculation and integration meets the corresponding preset alarm rule is detected; and when the document index after calculation and integration meets the corresponding preset alarm rule, an alarm reminder is issued for the abnormal real-time service data through a preset monitoring host. According to the invention, the document types of the financial products to be maintained in different life cycles are determined based on artificial intelligence learning, and credit batch monitoring is rapidly realized according to the document types, so that a product manager or operation and maintenance staff does not need to configure related credit batch rules for different products in time; a program can calculate, according to the rules configured by the product manager, whether the corresponding credit batch is delayed or wrong and, if so, send an alarm at regular intervals, which improves the efficiency of data anomaly monitoring.
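The monitoring flow of steps 410-413 can be sketched as follows. This is a simplified illustration under stated assumptions: the classifier is passed in as a plain callable, the "calculation and integration" is assumed to be a sum of a hypothetical `amount` field per document type, each preset alarm rule is assumed to be a predicate on that sum, and the alarm is represented by a returned message rather than a real monitoring host.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List

Record = Dict[str, Any]


def monitor(
    records: Iterable[Record],
    classify: Callable[[Record], str],
    alarm_rules: Dict[str, Callable[[float], bool]],
) -> List[str]:
    """Classify each record, aggregate an index per document type, and
    return an alert message for every type whose rule is met."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        doc_type = classify(record)  # stands in for the target classification model
        totals[doc_type] += float(record["amount"])  # assumed aggregation: sum
    alerts: List[str] = []
    for doc_type, index in totals.items():
        rule = alarm_rules.get(doc_type)
        if rule is not None and rule(index):
            alerts.append(f"ALERT {doc_type}: index={index}")
    return alerts
```

In a real deployment the returned messages would be forwarded to the preset monitoring host; that step is outside the scope of this sketch.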
Referring to fig. 5, a fifth embodiment of the document anomaly monitoring method according to the embodiment of the present invention includes:
501. obtaining historical service data of a monitored object and a label corresponding to the historical service data, and dividing the historical service data into first training sample data and first test sample data according to a preset proportion;
502. extracting features of the first training sample data based on a feature extraction algorithm to obtain a class vector corresponding to the first training sample data;
503. building a convolutional neural network model, and inputting a class vector corresponding to first training sample data into a classification network of the convolutional neural network model to obtain a first probability of each class corresponding to the first training sample data;
504. acquiring target representation weight vectors of all classes, and calculating a target loss function of the convolutional neural network model according to the target representation weight vectors, the first probability and labels corresponding to training sample data;
505. iteratively updating the convolutional neural network model based on the target loss function until the convolutional neural network model is converged to obtain an initial classification model;
506. determining a confusion matrix corresponding to the test sample of each label type according to the test result of the test sample of each label type, wherein the confusion matrix comprises test result parameters corresponding to a positive test sample and a negative test sample;
in this embodiment, the confusion matrix includes test result parameters corresponding to the positive test sample and the negative test sample, and as described above, for each class label, the current classification model outputs a two-dimensional vector.
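A one-vs-rest confusion matrix for a single label category can be sketched as below. The sketch assumes that a positive test sample is one whose true label equals the category and a negative test sample is any other; the function name `confusion_for_label` and the keys `tp`, `fp`, `fn`, `tn` are illustrative.

```python
from typing import Dict, Sequence


def confusion_for_label(
    label: str,
    true_labels: Sequence[str],
    predicted_labels: Sequence[str],
) -> Dict[str, int]:
    """Count true/false positives and negatives for one label category."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == label and predicted == label:
            counts["tp"] += 1  # positive sample, classified correctly
        elif truth != label and predicted == label:
            counts["fp"] += 1  # negative sample, wrongly assigned to the label
        elif truth == label and predicted != label:
            counts["fn"] += 1  # positive sample, missed
        else:
            counts["tn"] += 1  # negative sample, correctly rejected
    return counts
```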
507. Calculating the class classification accuracy of the label class test sample according to the test result parameters in the confusion matrix;
In this embodiment, the class classification accuracies of the test samples of the different label categories are smoothed to obtain the overall classification accuracy of the first test sample data; here, smoothing means averaging the class classification accuracies over the label categories.
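The averaging step can be sketched as follows. The embodiment does not give the formula for the class classification accuracy, so the common definition (correct decisions divided by all decisions) is assumed; the function names and the `tp`/`fp`/`fn`/`tn` keys are hypothetical.

```python
from typing import Dict


def class_accuracy(counts: Dict[str, int]) -> float:
    """Accuracy for one label category from its confusion-matrix counts
    (assumed definition: (tp + tn) / total)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (counts["tp"] + counts["tn"]) / total


def overall_accuracy(per_class_counts: Dict[str, Dict[str, int]]) -> float:
    """Overall accuracy as the plain average of the per-category accuracies."""
    if not per_class_counts:
        raise ValueError("no label categories given")
    accuracies = [class_accuracy(c) for c in per_class_counts.values()]
    return sum(accuracies) / len(accuracies)
```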
The class classification accuracy in any of the above manners is the classification accuracy after normalization. As described above, when the classification model is tested, the output result of the classification model is a two-dimensional vector; the normalized classification accuracy is obtained based on this output result, and its value lies in the range 0 to 1.
In this embodiment, a Softmax function is used to normalize the class classification accuracy of each category, and the number of training samples to be added for each label category is then calculated by the formula (1-F11) × Dm, where F11 is the normalized class classification accuracy and Dm is either a set sample base number or the total number of samples of the label category that have already been used in training.
508. Smoothing the classification accuracy of the different label classification test samples to obtain the overall classification accuracy of the test sample data;
in this embodiment, the first test sample data includes test samples of different label categories, and the test samples of each label category include a positive test sample with a label belonging to the label category and a negative test sample with a label not belonging to the label category.
The classification model is tested with the test samples: each test sample is input into the current classification model according to its input features, and whether the label classification of each test sample is correct is determined by comparing the label category output by the current classification model with the label corresponding to the test sample. The classification accuracy of the test samples of the different label categories is evaluated according to whether the label classification of each test sample is correct, and the overall classification accuracy of the first test sample data is determined from the classification accuracies of the different label categories. When evaluating the classification accuracy of the test samples of the different label categories, an existing function for evaluating the classification accuracy of a classification model, such as a loss function, may be used.
509. When the overall classification accuracy does not reach the expected effect, acquiring new training sample data of different label categories from a preset database, and training the classification model through the new training sample data until the overall classification accuracy meets the preset requirement to obtain a target classification model;
In this embodiment, each time training samples of different label categories are re-extracted from the preset database, training samples that have not yet been extracted are drawn from the preset database to form the second training sample data. When samples are added on the basis of the initial first training sample data to form the second training sample data, training of the initial classification model with the second training sample data is triggered to obtain the target classification model.
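Drawing only samples that have not yet been extracted can be sketched as below. The preset database is modeled as a dictionary from label category to sample identifiers, and the names `draw_new_samples`, `already_used` and `counts` are hypothetical; random selection with a fixed seed is an assumption made for reproducibility.

```python
import random
from typing import Dict, List, Set


def draw_new_samples(
    database: Dict[str, List[str]],
    already_used: Set[str],
    counts: Dict[str, int],
    seed: int = 0,
) -> List[str]:
    """Draw up to counts[label] previously unused samples per label category.

    already_used is updated in place so that later calls never return
    the same sample twice.
    """
    rng = random.Random(seed)
    drawn: List[str] = []
    for label, wanted in counts.items():
        pool = [s for s in database.get(label, []) if s not in already_used]
        drawn.extend(rng.sample(pool, min(wanted, len(pool))))
    already_used.update(drawn)
    return drawn
```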
The second training sample data comprises training samples of different label categories, and the training samples of each label category comprise positive training samples with labels belonging to the label category and negative training samples with labels not belonging to the label category.
And training the classification model by using the new training sample, wherein the training sample is used as an input characteristic, and the label corresponding to the training sample is used as an output characteristic.
Specifically, the corresponding linear reduction coefficient may be obtained according to the classification accuracy of each class, and the increased number of training samples of each label class may be obtained by multiplying the linear reduction coefficient by the set sample base number/the total number of training samples of the label class that completes training. As an alternative embodiment, the increased number of training samples of each label category decreases linearly as the classification accuracy of each category increases, and the increased number of training samples of each label category is obtained by multiplying the linear decrease coefficient by the set base number of samples.
As another alternative, the increased number of training samples of each label class is obtained by multiplying the linear reduction coefficient by the total number of training samples of the label class that has completed training. The above-mentioned class classification accuracy is the class classification accuracy of the normalization process, and assuming that F11 is the class classification accuracy of the normalization process of a certain label class, the increment number of the training samples of the label class is (1-F11) × Dm, where Dm is the set sample base number or the total number of samples of the label class that has completed training. In the specific example 1, the classification accuracy is 0.8, and the set sample base is 100, the number of samples to be added is (1-0.8) × 100, that is, the number of samples to be added is 20. In the specific example 2, the class classification accuracy is 0.8, the total number of samples of the label class after the training is currently 100, and the number of samples to be added is (1-0.8) × 100, i.e. 20 samples to be added.
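The increment formula (1-F11) × Dm and the two specific examples above can be checked with a short sketch. The function name `additional_samples` is hypothetical, and rounding to the nearest integer is an assumption, since the text does not say how fractional results are handled.

```python
def additional_samples(normalized_accuracy: float, base: int) -> int:
    """Number of training samples to add for one label category,
    computed as (1 - F11) * Dm and rounded to the nearest integer
    (the rounding is an assumption)."""
    if not 0.0 <= normalized_accuracy <= 1.0:
        raise ValueError("normalized accuracy must lie between 0 and 1")
    if base < 0:
        raise ValueError("base must be non-negative")
    return round((1.0 - normalized_accuracy) * base)
```

With an accuracy of 0.8 and a base of 100, the function returns 20, matching both specific examples; the higher the accuracy of a category, the fewer samples are added for it.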
510. Acquiring real-time service data of a monitored object, and inputting the real-time service data into a preset target classification model;
511. classifying and predicting the real-time service data through a classification model to obtain the document type of the real-time service data;
512. calculating and integrating the real-time service data according to the document type, and detecting whether the document indexes after precise calculation and integration meet the corresponding preset alarm rules;
513. and when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, carrying out alarm reminding on the monitored object through a preset monitoring host.
Steps 501-505 and 510-513 in the present embodiment are similar to steps 201-205 and 101-104 in the first embodiment, and are not described herein again.
In the embodiment of the invention, real-time service data of a monitored object is acquired and input into a preset target classification model; the real-time service data is classified and predicted through the classification model to obtain the document type of the real-time service data; the real-time service data is calculated and integrated according to the document type, and whether the document index after precise calculation and integration meets the corresponding preset alarm rule is detected; and when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, an alarm reminder is issued for the abnormal real-time service data through a preset monitoring host. According to the invention, the document types of the financial products to be maintained in different life cycles are determined based on artificial intelligence learning, and credit batch monitoring is rapidly realized according to the document types, so that a product manager or operation and maintenance staff does not need to configure related credit batch rules for different products in time; a program can calculate, according to the rules configured by the product manager, whether the corresponding credit batch is delayed or wrong and, if so, send an alarm at regular intervals, which improves the efficiency of data anomaly monitoring.
The document anomaly monitoring method in the embodiment of the present invention is described above; the document anomaly monitoring apparatus in the embodiment of the present invention is described below. Referring to fig. 6, a first embodiment of the document anomaly monitoring apparatus in the embodiment of the present invention includes:
the first obtaining module 601 is configured to obtain real-time service data of a monitored object, and input the real-time service data into a preset target classification model;
the prediction module 602 is configured to perform classification prediction on the real-time service data through the classification model to obtain a document type of the real-time service data;
the detection module 603 is configured to perform calculation and integration on the real-time service data according to the document type, and detect whether a document index after precision calculation and integration meets a corresponding preset alarm rule;
and an alarm module 604, configured to perform alarm reminding on the monitored object through a preset monitoring host when it is determined that the document index after calculation and integration meets a corresponding preset alarm rule.
In the embodiment of the invention, real-time service data of a monitored object is acquired and input into a preset target classification model; the real-time service data is classified and predicted through the classification model to obtain the document type of the real-time service data; the real-time service data is calculated and integrated according to the document type, and whether the document index after precise calculation and integration meets the corresponding preset alarm rule is detected; and when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, an alarm reminder is issued for the abnormal real-time service data through a preset monitoring host. According to the invention, the document types of the financial products to be maintained in different life cycles are determined based on artificial intelligence learning, and credit batch monitoring is rapidly realized according to the document types, so that a product manager or operation and maintenance staff does not need to configure related credit batch rules for different products in time; a program can calculate, according to the rules configured by the product manager, whether the corresponding credit batch is delayed or wrong and, if so, send an alarm at regular intervals, which improves the efficiency of data anomaly monitoring.
Referring to fig. 7, a document anomaly monitoring device according to a second embodiment of the present invention specifically includes:
the first obtaining module 601 is configured to obtain real-time service data of a monitored object, and input the real-time service data into a preset target classification model;
the prediction module 602 is configured to perform classification prediction on the real-time service data through the classification model to obtain a document type of the real-time service data;
the detection module 603 is configured to perform calculation and integration on the real-time service data according to the document type, and detect whether a document index after precision calculation and integration meets a corresponding preset alarm rule;
and the alarm module 604 is configured to perform alarm reminding on the monitored object through a preset monitoring host when it is determined that the computed and integrated document index meets the corresponding preset alarm rule.
In this embodiment, the document abnormality monitoring apparatus further includes:
a second obtaining module 605, configured to obtain historical service data of a monitored object and a label corresponding to the historical service data, and divide the historical service data into first training sample data and first test sample data according to a preset ratio, where the historical service data includes document data and batch data;
a feature extraction module 606, configured to perform feature extraction on the first training sample data based on a feature extraction algorithm to obtain a category vector corresponding to the first training sample data;
a building module 607, configured to build a convolutional neural network model, and input a class vector corresponding to the first training sample data into a classification network of the convolutional neural network model, to obtain a first probability of each class corresponding to the first training sample data;
a calculating module 608, configured to obtain target representation weight vectors of each category, and calculate a target loss function of the convolutional neural network model according to the target representation weight vectors, the first probability, and a label corresponding to the training sample data;
and the updating module 609 is configured to iteratively update the convolutional neural network model based on the target loss function until the convolutional neural network model converges to obtain an initial classification model.
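The division of historical service data by a preset proportion, as performed by the second obtaining module 605, can be sketched as follows. The function name `split_by_ratio`, the default ratio of 0.8 and the seeded shuffle are illustrative assumptions; the embodiment only states that a preset proportion is used.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_by_ratio(
    samples: Sequence[T],
    train_ratio: float = 0.8,
    seed: int = 0,
) -> Tuple[List[T], List[T]]:
    """Shuffle the samples and split them into training and test sets
    according to train_ratio (an assumed preset proportion)."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```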
In this embodiment, the feature extraction module 606 is specifically configured to:
inputting the first training sample data into a preset bidirectional LSTM model, and extracting a hidden state sequence corresponding to the first training sample data through the bidirectional LSTM model;
performing self-attention processing on the hidden state sequence corresponding to the first training sample data through an attention mechanism to obtain a characterization vector corresponding to the first training sample data;
and constructing a category vector corresponding to the characterization vector corresponding to the first training sample data.
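The self-attention step applied to the hidden state sequence can be sketched as below. This is a simplified illustration, not the configuration of the feature extraction module 606: the hidden states are taken as given (no bidirectional LSTM is run), scaled dot-product self-attention without learned projection matrices is assumed, and mean pooling is assumed for reducing the attended sequence to a single characterization vector.

```python
import math
from typing import List

Vector = List[float]


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_pool(hidden_states: List[Vector]) -> Vector:
    """Scaled dot-product self-attention over the hidden states
    (queries = keys = values, an assumption), followed by mean pooling."""
    if not hidden_states:
        raise ValueError("hidden state sequence is empty")
    dim = len(hidden_states[0])
    scale = math.sqrt(dim)
    attended: List[Vector] = []
    for query in hidden_states:
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in hidden_states
        ]
        weights = softmax(scores)
        attended.append([
            sum(w * state[i] for w, state in zip(weights, hidden_states))
            for i in range(dim)
        ])
    # Mean pooling over time steps gives one vector per sample.
    return [sum(row[i] for row in attended) / len(attended) for i in range(dim)]
```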
In this embodiment, the calculating module 608 is specifically configured to:
inputting the first training sample data into a feature embedding network of a convolutional neural network model to obtain a characterization vector corresponding to the first training sample data;
acquiring initial characterization weight vectors of all classes, and determining initial distances between the characterization vectors corresponding to the first training sample data and the initial characterization weight vectors of all classes;
determining second probabilities of the classes corresponding to the first training sample data based on the initial distance;
constructing a first loss function according to the label corresponding to the first training sample data and the second probability;
and training the feature embedded network of the convolutional neural network model through the first loss function, and obtaining target characterization weight vectors corresponding to all classes when the training stopping condition is met.
In this embodiment, the update module 609 includes:
an adjusting unit 6091 configured to train a convolutional neural network model through the target loss function, and adjust a weight parameter of a feature extraction layer and a weight parameter of a full connection layer in a classification network;
a training unit 6092, configured to stop training when a preset condition is met, and obtain a target weight of a feature extraction layer and a target weight of a full connection layer in the classification network; and the target weight of the feature extraction layer and the target weight of the full connection layer are parameters in the trained convolutional neural network model.
In this embodiment, the document abnormality monitoring apparatus further includes:
the test module 610 is configured to test the initial classification model through the test sample data, and determine, according to the test result, the class classification accuracy of the test sample data of different label categories and the overall classification accuracy of the test sample data;
the training module 611 is configured to, when the overall classification accuracy does not reach an expected effect, obtain new training sample data of different label categories from a preset database, and train the classification model through the new training sample data until the overall classification accuracy meets a preset requirement, so as to obtain a target classification model.
In this embodiment, the test module 610 is specifically configured to:
determining a confusion matrix corresponding to the test sample of each label category according to the test result of the test sample of each label category, wherein the confusion matrix comprises test result parameters corresponding to a positive test sample and a negative test sample;
calculating the class classification accuracy of the test sample of each label category according to the test result parameters in the confusion matrix;
and smoothing the class classification accuracy of the test samples of the different label categories to obtain the overall classification accuracy of the test sample data.
In the embodiment of the invention, real-time service data of a monitored object is acquired and input into a preset target classification model; the real-time service data is classified and predicted through the classification model to obtain the document type of the real-time service data; the real-time service data is calculated and integrated according to the document type, and whether the document index after precise calculation and integration meets the corresponding preset alarm rule is detected; and when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, an alarm reminder is issued for the abnormal real-time service data through a preset monitoring host. According to the invention, the document types of the financial products to be maintained in different life cycles are determined based on artificial intelligence learning, and credit batch monitoring is rapidly realized according to the document types, so that a product manager or operation and maintenance staff does not need to configure related credit batch rules for different products in time; a program can calculate, according to the rules configured by the product manager, whether the corresponding credit batch is delayed or wrong and, if so, send an alarm at regular intervals, which improves the efficiency of data anomaly monitoring.
Fig. 6 and fig. 7 above describe the document anomaly monitoring apparatus in the embodiment of the present invention in detail from the perspective of modular functional entities; the document anomaly monitoring device in the embodiment of the present invention is described in detail below from the perspective of hardware processing.
Fig. 8 is a schematic structural diagram of a document anomaly monitoring device according to an embodiment of the present invention. The document anomaly monitoring device 800 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 810, a memory 820, and one or more storage media 830 (e.g., one or more mass storage devices) storing an application 833 or data 832. The memory 820 and the storage medium 830 may be transient or persistent storage. The program stored on the storage medium 830 may include one or more modules (not shown), each of which may include a series of instruction operations for the document anomaly monitoring device 800. Further, the processor 810 may be configured to communicate with the storage medium 830 and execute the series of instruction operations in the storage medium 830 on the document anomaly monitoring device 800, so as to implement the steps of the document anomaly monitoring method provided by the above method embodiments.
Document anomaly monitoring device 800 may also include one or more power supplies 840, one or more wired or wireless network interfaces 850, one or more input-output interfaces 860, and/or one or more operating systems 831, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc. Those skilled in the art will appreciate that the configuration of the document anomaly monitoring device shown in FIG. 8 does not constitute a limitation of the document anomaly monitoring devices provided herein, and may include more or fewer components than shown, or some components in combination, or a different arrangement of components.
The present invention also provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium, or a volatile computer-readable storage medium, where instructions are stored, and when the instructions are executed on a computer, the instructions cause the computer to execute the steps of the above document anomaly monitoring method.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working processes of the above-described systems, apparatuses, and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
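As an illustration only, the steps of the document anomaly monitoring method (acquire real-time service data, classify it by document type, calculate and integrate the data per type, check the preset alarm rules, and raise an alarm) could be sketched in Python as follows. The names `Record`, `AlarmRule`, and `monitor`, the `amount` field, the use of a summed total as the document index, and the threshold form of the alarm rule are all assumptions made for this sketch; the text does not specify them. The `classify` callable merely stands in for the trained target classification model, and `notify` stands in for the monitoring host.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Record:
    """One item of real-time service data (hypothetical structure)."""
    text: str      # raw content passed to the classifier
    amount: float  # hypothetical numeric field aggregated into the index


@dataclass
class AlarmRule:
    """Hypothetical alarm rule: met when the index exceeds max_total."""
    doc_type: str
    max_total: float


def monitor(
    records: Iterable[Record],
    classify: Callable[[Record], str],
    rules: List[AlarmRule],
    notify: Callable[[str], None],
) -> List[str]:
    """Classify records, aggregate per document type, and check alarm rules.

    Returns the document types whose rule was met.
    """
    # Classification prediction followed by per-type aggregation.
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[classify(record)] += record.amount

    # Compare each aggregated index with its preset rule and alert if met.
    triggered: List[str] = []
    for rule in rules:
        total = totals.get(rule.doc_type, 0.0)
        if total > rule.max_total:
            notify(f"{rule.doc_type}: index {total} exceeds {rule.max_total}")
            triggered.append(rule.doc_type)
    return triggered
```

Passing the classifier and the notifier as callables keeps the sketch independent of any particular model or alerting channel; a real deployment would substitute its own.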

Claims (10)

1. A document anomaly monitoring method, characterized by comprising the following steps:
acquiring real-time service data of a monitored object, and inputting the real-time service data into a preset target classification model;
carrying out classification prediction on the real-time service data through the classification model to obtain the document type of the real-time service data;
calculating and integrating the real-time service data according to the document type, and detecting whether the document index after calculation and integration meets a corresponding preset alarm rule;
and when the document index after calculation and integration is determined to meet the corresponding preset alarm rule, carrying out alarm reminding on the monitored object through a preset monitoring host.
2. The document anomaly monitoring method according to claim 1, wherein, before the acquiring of real-time service data of the monitored object and the inputting of the real-time service data into the preset target classification model, the method further comprises:
obtaining historical service data of the monitored object and a label corresponding to the historical service data, and dividing the historical service data into first training sample data and first test sample data according to a preset proportion, wherein the historical service data comprises document data and letter data;
extracting features of the first training sample data based on a feature extraction algorithm to obtain a category vector corresponding to the first training sample data;
building a convolutional neural network model, and inputting the category vector corresponding to the first training sample data into a classification network of the convolutional neural network model to obtain a first probability of each category corresponding to the first training sample data;
obtaining target characterization weight vectors of all categories, and calculating a target loss function of the convolutional neural network model according to the target characterization weight vectors, the first probability and the labels corresponding to the first training sample data;
and iteratively updating the convolutional neural network model based on the target loss function until the convolutional neural network model is converged to obtain an initial classification model.
3. The document anomaly monitoring method according to claim 2, wherein the performing feature extraction on the first training sample data based on a feature extraction algorithm to obtain the category vector corresponding to the first training sample data comprises:
inputting the first training sample data into a preset bidirectional LSTM model, and extracting a hidden state sequence corresponding to the first training sample data through the bidirectional LSTM model;
performing self-attention processing on the hidden state sequence corresponding to the first training sample data through an attention mechanism to obtain a characterization vector corresponding to the first training sample data;
and constructing a category vector corresponding to the characterization vector corresponding to the first training sample data.
4. The document anomaly monitoring method according to claim 2, wherein the obtaining of the target characterization weight vector of each category comprises:
inputting the first training sample data into a feature embedding network of a convolutional neural network model to obtain a characterization vector corresponding to the first training sample data;
acquiring initial characterization weight vectors of all classes, and determining initial distances between the characterization vectors corresponding to the first training sample data and the initial characterization weight vectors of all classes;
determining second probabilities of the classes corresponding to the first training sample data based on the initial distance;
constructing a first loss function according to the label corresponding to the first training sample data and the second probability;
and training the feature embedding network of the convolutional neural network model through the first loss function, and obtaining the target characterization weight vectors corresponding to all categories when a training stop condition is met.
5. The document anomaly monitoring method according to claim 2, wherein the iteratively updating the convolutional neural network model based on the target loss function until the convolutional neural network model converges to obtain an initial classification model comprises:
training the convolutional neural network model through the target loss function, and adjusting the weight parameters of a feature extraction layer and the weight parameters of a fully connected layer in the classification network;
stopping training when a preset condition is met, and obtaining the target weight of the feature extraction layer and the target weight of the fully connected layer in the classification network, wherein the target weight of the feature extraction layer and the target weight of the fully connected layer are parameters of the trained convolutional neural network model.
6. The document anomaly monitoring method according to claim 2, wherein, after the iteratively updating the convolutional neural network model based on the target loss function until the convolutional neural network model converges to obtain an initial classification model, the method further comprises:
testing the initial classification model with the first test sample data, and determining, according to the test result, the classification accuracy of the test sample data of each label class and the overall classification accuracy of the test sample data;
and when the overall classification accuracy does not reach the expected level, acquiring new training sample data of different label classes from a preset database, and training the initial classification model with the new training sample data until the overall classification accuracy meets the preset requirement, to obtain the target classification model.
7. The document anomaly monitoring method according to claim 6, wherein the testing of the initial classification model with the first test sample data, and the determining, according to the test result, of the classification accuracy of the test sample data of each label class and the overall classification accuracy of the test sample data, comprises:
determining a confusion matrix corresponding to each intention type test sample according to the test result of each intention type test sample, wherein the confusion matrix comprises test result parameters corresponding to a positive test sample and a negative test sample;
calculating the classification precision of the intention type test sample according to the test result parameters in the confusion matrix;
and carrying out smoothing treatment on the class classification precision of the test samples with different intention classes to obtain the overall classification precision of the test sample set.
8. A document anomaly monitoring device, characterized by comprising:
a first acquisition module, used for acquiring real-time service data of a monitored object and inputting the real-time service data into a preset target classification model;
the prediction module is used for carrying out classification prediction on the real-time service data through the classification model to obtain the document type of the real-time service data;
the detection module is used for calculating and integrating the real-time service data according to the document type and detecting whether the document index after calculation and integration meets a corresponding preset alarm rule;
and the warning module is used for carrying out warning reminding on the monitored object through a preset monitoring host when the document index after calculation and integration is determined to meet the corresponding preset warning rule.
9. A document anomaly monitoring device, comprising: a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line;
the at least one processor invoking the instructions in the memory to cause the document anomaly monitoring device to perform the steps of the document anomaly monitoring method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the document anomaly monitoring method according to any one of claims 1 to 7.
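The evaluation step of claim 7 (a confusion matrix per class, a per-class classification precision, and a smoothed overall figure) can be sketched as below. This is a minimal sketch under stated assumptions: the claims do not give formulas, so the per-class score is taken here as precision TP / (TP + FP), and the "smoothing" is assumed to be a plain arithmetic mean over classes. The function names are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence


def one_vs_rest_confusion(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable], cls: Hashable
) -> Dict[str, int]:
    """Count TP, FP, FN, TN for one class, treating all other classes as negative."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == cls and truth == cls:
            tp += 1
        elif pred == cls:
            fp += 1
        elif truth == cls:
            fn += 1
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


def class_precision(cm: Dict[str, int]) -> float:
    """Precision from confusion counts; 0.0 if the class was never predicted."""
    predicted_positive = cm["tp"] + cm["fp"]
    return cm["tp"] / predicted_positive if predicted_positive else 0.0


def overall_precision(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> float:
    """Assumed smoothing: unweighted mean of the per-class precisions."""
    classes = set(y_true)
    if not classes:
        return 0.0
    scores: List[float] = [
        class_precision(one_vs_rest_confusion(y_true, y_pred, c)) for c in classes
    ]
    return sum(scores) / len(scores)
```

An unweighted mean treats rare and frequent classes equally; weighting by class frequency would be an equally plausible reading of the smoothing step.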

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210590897.2A CN114936600A (en) 2022-05-27 2022-05-27 Document abnormity monitoring method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114936600A (en) 2022-08-23

Family

ID=82866563

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202210590897.2A Pending CN114936600A (en) 2022-05-27 2022-05-27 Document abnormity monitoring method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114936600A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115421848A (en) * 2022-11-04 2022-12-02 Ping An Bank Co Ltd Model processing method, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination