CN113919291A - Master-slave parallel operation current sharing method based on analog control - Google Patents

Info

Publication number
CN113919291A
Authority
CN
China
Prior art keywords
word
master
method based
analog control
sharing method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111129205.6A
Other languages
Chinese (zh)
Inventor
金鑫
李鹏辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Alphainsight Technology Co ltd
Original Assignee
Shanghai Alphainsight Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-09-26
Filing date
2021-09-26
Publication date
2022-01-11
Application filed by Shanghai Alphainsight Technology Co ltd
Priority to CN202111129205.6A
Publication of CN113919291A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Abstract

The invention discloses a master-slave parallel operation current sharing method based on analog control, which comprises the following steps: step 1, preprocessing a table and a text; step 2, annotating the corpus with the brat annotation tool, and labeling the corpus directly in the Excel file; step 3, performing named entity recognition with a BiLSTM + CRF model according to the brat annotation results, and performing classification with a BERT model according to the Excel annotation results; step 4, fusing the extracted entity elements and the classified texts according to their correspondence. By performing named entity recognition and text classification on bank loan approval opinions, the method analyzes the elements of the opinions and returns the related element information.

Description

Master-slave parallel operation current sharing method based on analog control
Technical Field
The invention relates to the technical field of network information, in particular to a master-slave parallel operation current sharing method based on analog control.
Background
Traditionally, the disbursement review of public credit business is completed by credit operations staff who manually review paper documents and system information. The intelligent credit operations review system aims to explore how far review can be automated with artificial intelligence technologies. It targets three pain points. The first is strengthening risk control, which includes reducing the influence of human subjective or objective factors and raising the compliance rate of loan document review. The second is improving efficiency and user experience, especially in remote review mode. The third is easing the shortage of human resources by freeing up professional reviewers. At the same time, it strengthens data accumulation and the construction of an intelligent research and development platform.
Taking the implementation of the scene of the approval opinions as an example, the analysis necessity and value comprise three aspects:
1) Human resources:
Over the whole of 2019, bank-wide credit operations reviewed a total of 193789 credit transactions. Assuming one approval opinion per transaction and an average review time of 15 minutes per opinion: 193789 × 15 / 60 = 48447.25 hours. That is, intelligent review of approval opinions alone could release 48447.25 hours of disbursement review work per year.
2) Improved service handling efficiency:
Once manual labor is replaced by system automation, business handling efficiency improves directly, the response time to market demand is greatly shortened, customer experience improves, and the market competitiveness of the client is further enhanced.
3) Improved risk management and control capability:
Operations completed by the system benefit from strong computing power, run around the clock and are not affected by fatigue, which directly raises the operation compliance rate and reduces the influence of human subjective factors in this step.
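The labor estimate in item 1) can be reproduced with a short calculation. The figures come from the text above; the variable names are illustrative only.

```python
# Figures taken from item 1); variable names are illustrative, not from the patent.
transactions_2019 = 193789   # credit transactions reviewed in 2019
minutes_per_opinion = 15     # assumed average review time per approval opinion

# Convert total review minutes into hours.
hours_released = transactions_2019 * minutes_per_opinion / 60
print(hours_released)
```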
Disclosure of Invention
The invention aims to provide a master-slave parallel operation current sharing method based on analog control, so as to solve the problems in the background technology.
In order to achieve the purpose, the invention provides the following technical scheme:
a master-slave parallel operation current sharing method based on analog control comprises the following steps:
step 1, preprocessing a table and a text;
step 2, annotating the corpus with the brat annotation tool; labeling the corpus directly in the Excel file;
step 3, performing named entity recognition with a BiLSTM + CRF model according to the brat annotation results; performing classification with a BERT model according to the Excel annotation results;
step 4, fusing the extracted entity elements and the classified texts according to their correspondence.
As a further technical solution of the present invention, step 1 is specifically: the preprocessing of the table text comprises extracting the rows of the table in which the opinions are located, merging the extracted opinion content and converting it into text, and cleaning the characters.
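The preprocessing in step 1 might look roughly like the sketch below. It is only an illustration under stated assumptions: the table is taken to be already loaded as a list of dictionaries, the keys `item` and `content`, the marker word `opinion`, and the particular cleaning rules are all hypothetical, since the patent does not specify them.

```python
import re

def extract_opinion_text(rows, label_key="item", text_key="content", marker="opinion"):
    """Keep rows whose label mentions the marker and merge their text (assumed layout)."""
    parts = [row[text_key] for row in rows if marker in row[label_key].lower()]
    return "\n".join(parts)

def clean_text(text):
    """Drop control and zero-width characters, then normalise whitespace."""
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\u200b\ufeff]", "", text)
    text = text.replace("\u3000", " ")      # full-width space -> ASCII space
    text = re.sub(r"[ \t]+", " ", text)     # collapse runs of spaces and tabs
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)

# Hypothetical table rows; only the "opinion" rows are kept and merged.
rows = [
    {"item": "Customer", "content": "Company A"},
    {"item": "Approval opinion", "content": "  Agree to grant the loan.\u3000 "},
    {"item": "Approval opinion", "content": "Collateral must be registered\tfirst.\ufeff"},
]
opinion_text = clean_text(extract_opinion_text(rows))
print(opinion_text)
```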
As a further technical solution of the present invention, step 2 includes the following substeps:
step 21, annotating the corpus with the brat annotation tool using the BIO tagging scheme, and storing the results in a .txt file and a .ann file respectively;
step 22, labeling the cell following each corpus cell in the Excel file with an integer label, and finally writing all the labeled corpus into a txt file.
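To illustrate step 21, the sketch below converts brat text-bound annotations into character-level BIO tags. It assumes the standard brat standoff line layout (`ID<TAB>TYPE START END<TAB>TEXT`); the entity type `Precondition`, the sample sentence, and the choice of character-level tagging are assumptions made for the example, not details given in the patent.

```python
def parse_ann(ann_text):
    """Parse brat text-bound annotations of the form 'T1<TAB>TYPE START END<TAB>TEXT'."""
    spans = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue                    # skip relations, attributes and notes
        _id, type_and_offsets, _surface = line.split("\t", 2)
        if ";" in type_and_offsets:
            continue                    # discontinuous spans are ignored in this sketch
        label, start, end = type_and_offsets.split()
        spans.append((int(start), int(end), label))
    return spans

def to_bio(text, spans):
    """Assign one BIO tag per character: B- at the span start, I- inside, O elsewhere."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = "B-" + label
        for i in range(start + 1, end):
            tags[i] = "I-" + label
    return tags

# Hypothetical contents of a .txt / .ann pair.
text = "放款前落实抵押登记"
ann = "T1\tPrecondition 3 9\t落实抵押登记"
tags = to_bio(text, parse_ann(ann))
for char, tag in zip(text, tags):
    print(char, tag)
```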
As a further technical solution of the present invention, step 3 includes the following substeps:
step 31, segmenting the training data into words with the jieba toolkit;
step 32, ordering the segmented words to obtain their corresponding indices, and storing the indices in a word_to_index array;
step 33, converting the segmented data into the corresponding index vectors through word_to_index, and truncating them to a fixed length;
step 34, loading the downloaded word2vec word vectors, and constructing a word vector matrix according to the indices stored in word_to_index;
step 35, feeding the word vector knowledge into the model by encoding the index vectors with the word vector matrix obtained in step 34;
step 36, inputting the encoded vectors into the long short-term memory (LSTM) network, where the module also takes the hidden state of the previous time step as an input to the current unit and uses a gating mechanism, that is, it selectively retains part of the previous hidden state and fuses it with the information at the current time step, finally obtaining the hidden-layer information;
step 37, inputting the hidden-layer information into a CRF layer, and computing the entity label result with the Viterbi decoding algorithm of the CRF layer;
step 38, inputting the hidden-layer information into a softmax function to obtain a probability matrix over the classification labels, and finally obtaining the classification label of each paragraph through an argmax function.
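Steps 32 to 34 can be sketched in plain Python as follows. This is a minimal illustration, not the patent's implementation: the input is assumed to be already segmented (for example by jieba), `word_to_index` is modeled as a dictionary, and the `<PAD>` and `<UNK>` entries, the padding of short sequences, and the toy two-dimensional vectors standing in for word2vec are all assumptions added for the example.

```python
def build_word_to_index(segmented_sentences, pad="<PAD>", unk="<UNK>"):
    """Assign each distinct word an index in order of first appearance."""
    word_to_index = {pad: 0, unk: 1}
    for sentence in segmented_sentences:
        for word in sentence:
            if word not in word_to_index:
                word_to_index[word] = len(word_to_index)
    return word_to_index

def encode(sentence, word_to_index, max_len, pad="<PAD>", unk="<UNK>"):
    """Map words to indices, truncate to max_len, and pad shorter sequences."""
    ids = [word_to_index.get(w, word_to_index[unk]) for w in sentence][:max_len]
    return ids + [word_to_index[pad]] * (max_len - len(ids))

def build_embedding_matrix(word_to_index, vectors, dim):
    """Row i holds the pretrained vector of the word with index i, or zeros if missing."""
    matrix = [[0.0] * dim for _ in range(len(word_to_index))]
    for word, idx in word_to_index.items():
        if word in vectors:
            matrix[idx] = list(vectors[word])
    return matrix

# Toy data standing in for jieba output and downloaded word2vec vectors.
sentences = [["loan", "approved"],
             ["loan", "requires", "collateral", "registration"]]
word_to_index = build_word_to_index(sentences)
truncated = encode(sentences[1], word_to_index, max_len=3)
padded = encode(["loan", "denied"], word_to_index, max_len=3)
vectors = {"loan": [0.1, 0.2], "approved": [0.3, 0.4]}
matrix = build_embedding_matrix(word_to_index, vectors, dim=2)
print(truncated, padded, len(matrix))
```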
As a further technical solution of the present invention, step 4 includes the following substeps:
step 41, for a single approval opinion, calling the single service to obtain the number, precondition, management requirement and risk prompt elements in the text;
step 42, for a comprehensive approval opinion, calling the comprehensive service to obtain the classification label of each sentence: 11, 0, 4, 5, 6, where 11 represents a sentence containing a subsidiary; 0 represents none; 4 represents a sentence containing a management requirement; 5 represents a sentence containing a precondition; and 6 represents a sentence containing a risk prompt.
Compared with the prior art, the invention has the following beneficial effect: it performs named entity recognition and text classification on bank loan approval opinions to analyze their elements and return the related element information.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, a master-slave parallel operation current sharing method based on analog control includes the following steps:
Step 1, preprocessing the table and the text;
specifically: the preprocessing of the table text comprises extracting the rows of the table in which the opinions are located, merging the extracted opinion content and converting it into text, and cleaning the characters.
Step 2, annotating the corpus with the brat annotation tool; labeling the corpus directly in the Excel file;
which includes the following substeps:
Step 21, annotating the corpus with the brat annotation tool using the BIO tagging scheme, and storing the results in a .txt file and a .ann file respectively.
Step 22, labeling the cell following each corpus cell in the Excel file with an integer label, and finally writing all the labeled corpus into a txt file.
Step 3, performing named entity recognition with a BiLSTM + CRF model according to the brat annotation results; performing classification with a BERT model according to the Excel annotation results;
which includes the following substeps:
Step 31, segmenting the training data into words with the jieba toolkit.
Step 32, ordering the segmented words to obtain their corresponding indices, and storing the indices in a word_to_index array.
Step 33, converting the segmented data into the corresponding index vectors through word_to_index, and truncating them to a fixed length.
Step 34, loading the downloaded word2vec word vectors, and constructing a word vector matrix according to the indices stored in word_to_index.
Step 35, feeding the word vector knowledge into the model by encoding the index vectors with the word vector matrix obtained in step 34.
Step 36, inputting the encoded vectors into the long short-term memory (LSTM) network, where the module also takes the hidden state of the previous time step as an input to the current unit and uses a gating mechanism, that is, it selectively retains part of the previous hidden state and fuses it with the information at the current time step, finally obtaining the hidden-layer information.
Step 37, inputting the hidden-layer information into a CRF layer, and computing the entity label result with the Viterbi decoding algorithm of the CRF layer.
Step 38, inputting the hidden-layer information into a softmax function to obtain a probability matrix over the classification labels, and finally obtaining the classification label of each paragraph through an argmax function.
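Steps 37 and 38 can be illustrated with a small, dependency-free sketch. The Viterbi routine below is the textbook dynamic program for a linear-chain CRF; the emission and transition scores, the three-tag set (O, B, I) and the example logits are made-up values for illustration, not parameters from the patent.

```python
import math

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag path.

    emissions[t][j]: score of tag j at position t.
    transitions[i][j]: score of moving from tag i to tag j.
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_tags):
            # Best previous tag for arriving at tag j.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    return path[::-1]

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

# Tags: 0 = O, 1 = B, 2 = I. The transition O -> I is heavily penalised.
transitions = [[0, 0, -10], [0, 0, 1], [0, 0, 1]]
emissions = [[2, 1, 0], [0, 1, 2], [0, 0, 2]]
path = viterbi_decode(emissions, transitions)

# Paragraph classification: pick the most probable label from hypothetical logits.
probs = softmax([1.0, 3.0, 2.0])
label = argmax(probs)
print(path, label)
```

In this toy example, taking the best emission score at each position independently would give the tag sequence O, I, I, which is not a valid BIO sequence; the transition scores steer the decoder to B, I, I instead, which is the reason for placing a CRF layer on top of the recurrent network.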
Step 4, fusing the extracted entity elements and the classified texts according to their correspondence;
which includes the following substeps:
Step 41, for a single approval opinion, calling the single service to obtain the number, precondition, management requirement and risk prompt elements in the text.
Step 42, for a comprehensive approval opinion, calling the comprehensive service to obtain the classification label of each sentence: 11, 0, 4, 5, 6, where 11 represents a sentence containing a subsidiary; 0 represents none; 4 represents a sentence containing a management requirement; 5 represents a sentence containing a precondition; and 6 represents a sentence containing a risk prompt.
Step 43, according to the credit client, adding the management requirements, preconditions and risk prompts of the corresponding subsidiary to the corresponding elements of the single approval opinion.
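The fusion in steps 42 and 43 might be organized as in the sketch below. The label codes (11, 0, 4, 5, 6) are those listed in step 42; everything else is an assumption made for illustration, including the data shapes, the element names, and the premise that each classified sentence has already been associated with a credit client.

```python
# Label codes from step 42; the element names on the right are illustrative.
LABELS = {11: "subsidiary", 0: "none", 4: "management_requirement",
          5: "precondition", 6: "risk_prompt"}
ELEMENT_LABELS = {4, 5, 6}   # only these labels carry elements to merge

def fuse(single_elements, classified_sentences):
    """Merge classified sentences into per-client element lists.

    single_elements: {client: {element_name: [text, ...]}}  (assumed shape)
    classified_sentences: [(client, label_id, sentence), ...]  (assumed shape)
    """
    # Copy the input so the caller's data is not modified.
    fused = {client: {name: list(texts) for name, texts in elems.items()}
             for client, elems in single_elements.items()}
    for client, label_id, sentence in classified_sentences:
        if label_id not in ELEMENT_LABELS:
            continue
        elems = fused.setdefault(client, {})
        elems.setdefault(LABELS[label_id], []).append(sentence)
    return fused

# Hypothetical outputs of the single service and the comprehensive service.
single = {"Subsidiary A": {"precondition": ["Register the mortgage."]}}
classified = [
    ("Subsidiary A", 11, "Subsidiary A is granted a credit line."),
    ("Subsidiary A", 6, "Watch cash-flow risk."),
    ("Subsidiary A", 5, "Provide a guarantee."),
    ("Subsidiary B", 0, "No further comments."),
]
fused = fuse(single, classified)
print(fused)
```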
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single independent technical solution. The description is written this way for clarity only; those skilled in the art should treat the description as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments understandable to those skilled in the art.

Claims (5)

1. A master-slave parallel operation current sharing method based on analog control, characterized by comprising the following steps:
step 1, preprocessing a table and a text;
step 2, annotating the corpus with the brat annotation tool; labeling the corpus directly in the Excel file;
step 3, performing named entity recognition with a BiLSTM + CRF model according to the brat annotation results; performing classification with a BERT model according to the Excel annotation results;
step 4, fusing the extracted entity elements and the classified texts according to their correspondence.
2. The master-slave parallel operation current sharing method based on analog control according to claim 1, wherein step 1 is specifically: the preprocessing of the table text comprises extracting the rows of the table in which the opinions are located, merging the extracted opinion content and converting it into text, and cleaning the characters.
3. The master-slave parallel operation current sharing method based on analog control according to claim 2, wherein step 2 includes the following substeps:
step 21, annotating the corpus with the brat annotation tool using the BIO tagging scheme, and storing the results in a .txt file and a .ann file respectively;
step 22, labeling the cell following each corpus cell in the Excel file with an integer label, and finally writing all the labeled corpus into a txt file.
4. The master-slave parallel operation current sharing method based on analog control according to claim 1, wherein step 3 includes the following substeps:
step 31, segmenting the training data into words with the jieba toolkit;
step 32, ordering the segmented words to obtain their corresponding indices, and storing the indices in a word_to_index array;
step 33, converting the segmented data into the corresponding index vectors through word_to_index, and truncating them to a fixed length;
step 34, loading the downloaded word2vec word vectors, and constructing a word vector matrix according to the indices stored in word_to_index;
step 35, feeding the word vector knowledge into the model by encoding the index vectors with the word vector matrix obtained in step 34;
step 36, inputting the encoded vectors into the long short-term memory (LSTM) network, where the module also takes the hidden state of the previous time step as an input to the current unit and uses a gating mechanism, that is, it selectively retains part of the previous hidden state and fuses it with the information at the current time step, finally obtaining the hidden-layer information;
step 37, inputting the hidden-layer information into a CRF layer, and computing the entity label result with the Viterbi decoding algorithm of the CRF layer;
step 38, inputting the hidden-layer information into a softmax function to obtain a probability matrix over the classification labels, and finally obtaining the classification label of each paragraph through an argmax function.
5. The master-slave parallel operation current sharing method based on analog control according to claim 1, wherein step 4 includes the following substeps:
step 41, for a single approval opinion, calling the single service to obtain the number, precondition, management requirement and risk prompt elements in the text;
step 42, for a comprehensive approval opinion, calling the comprehensive service to obtain the classification label of each sentence: 11, 0, 4, 5, 6, where 11 represents a sentence containing a subsidiary; 0 represents none; 4 represents a sentence containing a management requirement; 5 represents a sentence containing a precondition; and 6 represents a sentence containing a risk prompt.
CN202111129205.6A 2021-09-26 2021-09-26 Master-slave parallel operation current sharing method based on analog control Pending CN113919291A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111129205.6A CN113919291A (en) 2021-09-26 2021-09-26 Master-slave parallel operation current sharing method based on analog control

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111129205.6A CN113919291A (en) 2021-09-26 2021-09-26 Master-slave parallel operation current sharing method based on analog control

Publications (1)

Publication Number Publication Date
CN113919291A (en) 2022-01-11

Family

ID=79236177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111129205.6A Pending CN113919291A (en) 2021-09-26 2021-09-26 Master-slave parallel operation current sharing method based on analog control

Country Status (1)

Country Link
CN (1) CN113919291A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116777607A (en) * 2023-08-24 2023-09-19 上海银行股份有限公司 Intelligent auditing method based on NLP technology

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116777607A (en) * 2023-08-24 2023-09-19 上海银行股份有限公司 Intelligent auditing method based on NLP technology
CN116777607B (en) * 2023-08-24 2023-11-07 上海银行股份有限公司 Intelligent auditing method based on NLP technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination