CN117495538B - Risk assessment method and model training method for order financing - Google Patents

Info

Publication number
CN117495538B
Authority
CN
China
Prior art keywords
news
target
information
language model
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311423786.3A
Other languages
Chinese (zh)
Other versions
CN117495538A (en)
Inventor
祝捷
孙杰
黄聪
陈灏
王悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ronghe Cloud Chain Technology Co ltd
Original Assignee
Beijing Ronghe Cloud Chain Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing
    • G06Q30/0635 Processing of requisition or of purchase orders

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Databases & Information Systems (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Technology Law (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a risk assessment method and model training methods for order financing, relating to the technical fields of natural language processing and deep learning. The method comprises: acquiring target information of an order to be financed, and acquiring at least one initial keyword of the target information; inputting the at least one initial keyword into a knowledge graph model to obtain at least one target keyword; performing news retrieval on a target database based on the at least one target keyword to obtain at least one first retrieval news; and generating a risk assessment report of the order to be financed by using a first language model according to the target information, the at least one target keyword and the at least one first retrieval news. In this way, the risk assessment report of the order to be financed can be generated automatically, labor costs can be reduced, and inaccuracy in the risk assessment report caused by human subjectivity can be effectively avoided.

Description

Risk assessment method and model training method for order financing
Technical Field
The disclosure relates to the technical field of natural language processing and deep learning, in particular to a risk assessment method for order financing, a training method for a knowledge graph model and a training method for a language model.
Background
With the rapid development of the global economy, enterprises increasingly need flexible funding support in market competition to meet the needs of order execution. Order financing, as a supply chain finance model, provides enterprises with an effective means of financing, helps them resolve funding shortages, and improves their competitiveness.
A supplier refers to an upstream enterprise that provides raw materials to a core enterprise. Order financing refers to a manner in which a supplier obtains funding support by pledging future orders to a financial institution, so as to meet the production and delivery requirements of the orders.
The characteristics of order financing mainly include the following:
1. High flexibility: a financing scheme, including the financing amount, term, and so on, can be formulated according to the specific needs of the enterprise. Order financing can also be adjusted according to the specific conditions of the order, so as to meet the funding needs of the enterprise at different stages.
2. High efficiency and speed: order financing is usually operated and managed through an online platform, which can greatly shorten the financing process and improve financing efficiency. Through order financing, enterprises can quickly obtain the required funds and execute orders faster.
3. Controllable risk: order financing typically uses the order as collateral, and the financial institution can assess risk based on the authenticity and executability of the order. The financial institution can also reduce the occurrence of risk and the resulting losses through reasonable risk management measures.
After receiving a financing application from an enterprise, the financial institution evaluates the authenticity and executability of the order, and performs a credit investigation and risk assessment of the enterprise.
Disclosure of Invention
The present disclosure aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, a first object of the present disclosure is to propose a risk assessment method for order financing.
A second object of the present disclosure is to provide a training method of a knowledge-graph model.
A third object of the present disclosure is to propose a training method of a language model.
A fourth object of the present disclosure is to provide a risk assessment apparatus for order financing.
A fifth object of the present disclosure is to provide a training device for a knowledge-graph model.
A sixth object of the present disclosure is to provide a training apparatus for language models.
A seventh object of the present disclosure is to propose an electronic device.
An eighth object of the present disclosure is to propose a computer readable storage medium.
A ninth object of the present disclosure is to propose a computer program product.
To achieve the above object, an embodiment of a first aspect of the present disclosure provides a risk assessment method for order financing, including: acquiring target information of an order to be financed, and acquiring at least one initial keyword of the target information; inputting the at least one initial keyword into a knowledge graph model to obtain at least one target keyword; performing news retrieval on a target database based on the at least one target keyword to obtain at least one first retrieval news; and generating a risk assessment report of the order to be financed by using a first language model according to the target information, the at least one target keyword and the at least one first retrieval news.
According to one embodiment of the disclosure, generating the risk assessment report of the order to be financed by using the first language model according to the target information, the at least one target keyword and the at least one first retrieval news includes: screening the at least one first retrieval news to obtain at least one target news; splicing the target information, the at least one target keyword and the at least one target news, each in a set format, to obtain splicing information; and generating the risk assessment report of the order to be financed by using the first language model according to the splicing information.
According to one embodiment of the disclosure, generating the risk assessment report of the order to be financed by using the first language model according to the splicing information includes: filling the splicing information into the corresponding filling positions in a first prompt template to obtain first prompt information; compressing the first prompt information to obtain compressed information; and inputting the compressed information into the first language model to obtain the risk assessment report output by the first language model.
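The template-filling and compression steps described above can be illustrated with a minimal sketch. The template wording, the function names `build_prompt` and `compress`, and the compression strategy are all assumptions for illustration; the disclosure does not specify the prompt template or the compression algorithm, so whitespace collapsing plus truncation is used here only as a placeholder.

```python
from typing import List

# Hypothetical prompt template; the filling positions are the named fields.
PROMPT_TEMPLATE = (
    "Order information:\n{target_information}\n\n"
    "Keywords:\n{target_keywords}\n\n"
    "Related news:\n{target_news}\n\n"
    "Write a risk assessment report for this order."
)


def build_prompt(target_information: str,
                 target_keywords: List[str],
                 target_news: List[str]) -> str:
    """Fill the spliced information into the template's filling positions."""
    return PROMPT_TEMPLATE.format(
        target_information=target_information.strip(),
        target_keywords=", ".join(target_keywords),
        target_news="\n".join(f"- {item}" for item in target_news),
    )


def compress(prompt: str, max_chars: int) -> str:
    """Placeholder compression (an assumption, not the patented step):
    collapse all whitespace runs and truncate to max_chars characters."""
    collapsed = " ".join(prompt.split())
    return collapsed[:max_chars]


prompt = build_prompt(
    " Contract: A buys a batch of batteries from B ",
    ["A", "B", "battery"],
    ["B announces a new production line"],
)
compressed = compress(prompt, max_chars=2000)
```

In this sketch the compressed string is what would be passed to the first language model; a real system would likely use a token-aware method rather than character truncation.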
According to one embodiment of the present disclosure, screening the at least one first retrieval news to obtain the at least one target news includes: obtaining, by using the first language model, a first reliability score corresponding to each first retrieval news; obtaining, by using a second language model, a second reliability score corresponding to each first retrieval news; for any one of the first retrieval news, performing a weighted summation of the first reliability score and the second reliability score corresponding to that first retrieval news to obtain a weight corresponding to that first retrieval news; and screening the target news from the at least one first retrieval news according to the weight corresponding to each first retrieval news.
According to one embodiment of the present disclosure, screening the target news from the at least one first retrieval news according to the weight corresponding to each first retrieval news includes: sorting the first retrieval news in descending order of weight to obtain a first sorted sequence; and determining the first retrieval news whose serial number in the first sorted sequence is smaller than a set serial number as the target news.
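The weighted summation and ranking described above can be sketched as follows. This is a minimal illustration under stated assumptions: the reliability scores are taken as already computed by the two language models, the function name `screen_news`, the coefficients `w1` and `w2`, and the 1-based serial numbering are all hypothetical, since the disclosure does not fix them.

```python
from typing import List, Sequence


def screen_news(news: Sequence[str],
                first_scores: Sequence[float],
                second_scores: Sequence[float],
                w1: float = 0.5,
                w2: float = 0.5,
                set_serial: int = 3) -> List[str]:
    """Keep the news items whose 1-based rank is smaller than set_serial."""
    if not (len(news) == len(first_scores) == len(second_scores)):
        raise ValueError("news and score lists must have the same length")
    # Weighted summation of the two reliability scores gives each item's weight.
    weights = [w1 * a + w2 * b for a, b in zip(first_scores, second_scores)]
    # Sort in descending order of weight to obtain the first sorted sequence.
    ranked = sorted(zip(news, weights), key=lambda pair: pair[1], reverse=True)
    return [item for rank, (item, _) in enumerate(ranked, start=1)
            if rank < set_serial]


target_news = screen_news(
    ["n1", "n2", "n3"],
    first_scores=[0.9, 0.2, 0.6],
    second_scores=[0.7, 0.4, 0.8],
)
print(target_news)  # ['n1', 'n3']
```

With equal coefficients the weights are about 0.8, 0.3 and 0.7, so `n1` and `n3` occupy serial numbers 1 and 2 and pass the threshold of 3.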
According to one embodiment of the present disclosure, acquiring the at least one initial keyword of the target information includes: acquiring the at least one initial keyword of the target information by using the first language model according to the target information.
According to one embodiment of the present disclosure, the inputting the at least one initial keyword into the knowledge-graph model to obtain at least one target keyword includes: inputting the at least one initial keyword into the knowledge graph model to obtain an entity relationship object with an association relationship with each initial keyword; and taking each entity relation object and each initial keyword as the target keywords.
To achieve the above object, an embodiment of a second aspect of the present disclosure provides a training method for a knowledge graph model, including: acquiring first initial training data; wherein the first initial training data includes supply chain financial information and order financing information; acquiring a plurality of groups of first entity relation triples by adopting a third language model based on the first initial training data; based on the first initial training data, a relation extraction model is adopted to obtain a plurality of groups of second entity relation triples; acquiring first target training data according to the plurality of groups of first entity relation triples and the plurality of groups of second entity relation triples; and training the initial knowledge-graph model based on the first target training data to obtain a trained knowledge-graph model.
According to one embodiment of the disclosure, the acquiring, based on the first initial training data, a plurality of sets of first entity-relationship triples using a third language model includes: inputting the first initial training data to the third language model to extract at least one set of third entity-relationship triples from the first initial training data; wherein any one of the third entity relationship triples comprises two first entities; news searching is carried out on each first entity through a first search engine so as to obtain a plurality of first search news; and inputting each first search news to the third language model to obtain a plurality of groups of first entity relation triples.
According to one embodiment of the disclosure, obtaining the plurality of sets of second entity relationship triples by using the relationship extraction model based on the first initial training data includes: extracting entity relationship triples from the first initial training data through the relationship extraction model to obtain at least one set of fourth entity relationship triples, wherein any one of the fourth entity relationship triples comprises two second entities; performing a news search on each second entity through a second search engine to obtain a plurality of second search news; and extracting entity relationship triples from each second search news through the relationship extraction model to obtain the plurality of sets of second entity relationship triples.
According to one embodiment of the disclosure, obtaining the first target training data according to the plurality of sets of first entity relationship triples and the plurality of sets of second entity relationship triples includes: in response to at least one first target triplet existing among the plurality of sets of first entity relationship triples and the plurality of sets of second entity relationship triples, determining each first target triplet as first target training data, wherein a first target triplet is a triplet that is both a first entity relationship triplet and a second entity relationship triplet; determining each entity relationship triplet that is only a first entity relationship triplet, and each entity relationship triplet that is only a second entity relationship triplet, as a second target triplet; and for any second target triplet, verifying the second target triplet and, if it passes verification without error, determining it as first target training data.
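The merging rule above maps naturally onto set operations: triples extracted by both sources are accepted directly, and triples found by only one source are accepted only after verification. The sketch below assumes triples are represented as `(head, relation, tail)` tuples and that verification is supplied as a callback; the names `build_target_training_data` and `verify` are hypothetical, and the disclosure does not say how verification is carried out.

```python
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def build_target_training_data(first: Iterable[Triple],
                               second: Iterable[Triple],
                               verify: Callable[[Triple], bool]) -> Set[Triple]:
    """Combine triples from the language model and the relation extraction model."""
    first_set, second_set = set(first), set(second)
    # First target triples: extracted by both models, accepted as-is.
    agreed = first_set & second_set
    # Second target triples: extracted by only one of the two models.
    disputed = first_set ^ second_set
    # Keep a disputed triple only if verification finds no error.
    verified = {t for t in disputed if verify(t)}
    return agreed | verified


llm_triples = [("A", "competitor", "C"), ("battery", "contains", "lithium battery")]
re_triples = [("A", "competitor", "C"), ("B", "supplies", "A")]

# Stand-in verifier for illustration only: accepts "supplies" relations.
data = build_target_training_data(
    llm_triples, re_triples, verify=lambda t: t[1] == "supplies"
)
```

Here `("A", "competitor", "C")` is kept because both sources agree, `("B", "supplies", "A")` is kept because it passes the stand-in check, and the remaining triple is dropped.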
To achieve the above object, an embodiment of a third aspect of the present disclosure provides a training method for a language model, including: acquiring a self-supervision training data set, a positive and negative sample training data set and an instruction fine tuning training data set; performing first training on the initial first language model based on the self-supervision training data set to obtain a first language model subjected to the first training; performing second training on the first language model subjected to the first training based on the positive and negative sample training data sets to obtain a first language model subjected to the second training; and performing third training on the first language model subjected to the second training based on the instruction fine-tuning training data set to obtain a trained first language model.
To achieve the above object, an embodiment of a fourth aspect of the present disclosure provides a risk assessment apparatus for order financing, including:
the acquisition module is used for acquiring target information of an order to be financed and acquiring at least one initial keyword of the target information;
the input module is used for inputting the at least one initial keyword into the knowledge graph model so as to obtain at least one target keyword;
the retrieval module is used for carrying out news retrieval on the target database based on the at least one target keyword so as to obtain at least one first retrieval news;
and the generation module is used for generating the risk assessment report of the order to be financed by adopting a first language model according to the target information, the at least one target keyword and the at least one first retrieval news.
To achieve the above object, an embodiment of a fifth aspect of the present disclosure provides a training device for a knowledge-graph model, the device including:
the first acquisition module is used for acquiring first initial training data; wherein the first initial training data includes supply chain financial information and order financing information;
the second acquisition module is used for acquiring a plurality of groups of first entity relation triples by adopting a third language model based on the first initial training data;
the third acquisition module is used for acquiring a plurality of groups of second entity relationship triples by adopting a relationship extraction model based on the first initial training data;
the fourth acquisition module is used for acquiring first target training data according to the plurality of groups of first entity relationship triples and the plurality of groups of second entity relationship triples;
and the training module is used for training the initial knowledge-graph model based on the first target training data so as to obtain a trained knowledge-graph model.
To achieve the above object, an embodiment of a sixth aspect of the present disclosure provides a training apparatus for a language model, the apparatus including:
the acquisition module is used for acquiring the self-supervision training data set, the positive and negative sample training data set and the instruction fine tuning training data set;
the first training module is used for carrying out first training on the initial first language model based on the self-supervision training data set to obtain a first language model after the first training;
the second training module is used for carrying out second training on the first language model subjected to the first training based on the positive and negative sample training data set to obtain the first language model subjected to the second training;
and the third training module is used for performing third training on the first language model subjected to the second training based on the instruction fine tuning training data set to obtain a trained first language model.
To achieve the above object, an embodiment of a seventh aspect of the present disclosure proposes an electronic device, including: a processor, and a memory communicatively coupled to the processor; the memory stores computer-executable instructions; the processor executes the computer-executable instructions stored in the memory to implement the risk assessment method for order financing according to the first aspect of the present disclosure, or the training method for the knowledge graph model according to the second aspect of the present disclosure, or the training method for the language model according to the third aspect of the present disclosure.
To achieve the above object, an eighth aspect of the present disclosure proposes a computer-readable storage medium storing computer-executable instructions, wherein the computer-executable instructions, when executed by a processor, are configured to implement a risk assessment method for order financing according to an embodiment of the first aspect of the present disclosure, or a training method for a knowledge graph model according to an embodiment of the second aspect, or a training method for a language model according to an embodiment of the third aspect.
To achieve the above object, an embodiment of a ninth aspect of the present disclosure proposes a computer program product, comprising a computer program, which when executed by a processor is configured to implement a risk assessment method for order financing according to an embodiment of the first aspect of the present disclosure, or a training method for a knowledge graph model according to an embodiment of the second aspect, or a training method for a language model according to an embodiment of the third aspect.
According to the risk assessment method for order financing of the present disclosure, target information of an order to be financed is acquired, and at least one initial keyword of the target information is acquired; the at least one initial keyword is input into a knowledge graph model to obtain at least one target keyword; news retrieval is performed on a target database based on the at least one target keyword to obtain at least one first retrieval news; and a risk assessment report of the order to be financed is generated by using a first language model according to the target information, the at least one target keyword and the at least one first retrieval news. In this way, the risk assessment report of the order to be financed can be generated automatically, labor costs can be reduced, and inaccuracy in the risk assessment report caused by human subjectivity can be effectively avoided.
Drawings
FIG. 1 is a flowchart of a risk assessment method for order financing according to a first embodiment of the present disclosure;
FIG. 2 is a flowchart of a risk assessment method for order financing according to a second embodiment of the present disclosure;
FIG. 3 is a flowchart of a risk assessment method for order financing according to a third embodiment of the present disclosure;
FIG. 4 is a flowchart of a training method of a knowledge graph model according to a fourth embodiment of the present disclosure;
FIG. 5 is a flowchart of a training method of a language model according to a fifth embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a risk assessment device for order financing according to a sixth embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a training device for a knowledge graph model according to a seventh embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a training device for a language model according to an eighth embodiment of the present disclosure;
FIG. 9 is a schematic diagram of an electronic device according to a ninth embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are exemplary and intended for the purpose of explaining the present disclosure and are not to be construed as limiting the present disclosure.
In the related art, risk analysis and evaluation of orders to be financed in the supply chain finance of some industries (such as the power industry) often use the following methods:
1. Prejudgment is made by relying on the experience of domain experts, based on the relevant information and data of core enterprises and suppliers. The drawbacks of this method are that it is time-consuming, its prediction accuracy fluctuates widely, experienced domain experts are scarce, and their risk analysis and evaluation is subjective to some degree.
2. A large number of indicators are established based on the data of suppliers and core enterprises, historical information is collected as samples, the samples are modeled with simple statistical learning methods such as logistic regression, and coefficients are fitted to obtain a suitable model, which is then used to analyze and evaluate the risk of orders to be financed in supply chain finance. The drawbacks of this method are that fine-grained feature engineering requires a large number of feature trials, the features used by the model can only be obtained through simple slice statistics and derivation, and the resulting model is insufficiently accurate when information is missing from the data of suppliers and core enterprises.
3. Risk analysis is performed using a deep learning model, such as a CNN (Convolutional Neural Network) model. However, current deep learning models analyze and evaluate the risk of an order to be financed only from a single angle, and a deep learning model that analyzes risk comprehensively is lacking.
To address at least one of the above problems, the disclosure provides a risk assessment method for order financing, a training method for a knowledge graph model and a training method for a language model.
The risk assessment method of order financing, the training method of a knowledge graph model, and the training method of a language model according to the embodiments of the present disclosure are described below with reference to the accompanying drawings.
Fig. 1 is a flowchart illustrating a risk assessment method for order financing according to an embodiment of the present disclosure.
The embodiment of the disclosure is described by taking as an example the case where the risk assessment method for order financing is configured in a risk assessment device for order financing; the device can be applied to any electronic device so that the electronic device can perform the risk assessment function for order financing.
The electronic device may be any device with computing capability, for example, a PC (Personal Computer), a mobile terminal, a server, and the like, and the mobile terminal may be, for example, a vehicle-mounted device, a mobile phone, a tablet computer, a personal digital assistant, a wearable device, or another hardware device with an operating system, a touch screen, and/or a display screen.
As shown in fig. 1, the risk assessment method for order financing may include the following steps:
Step S101, acquiring target information of an order to be financed, and acquiring at least one initial keyword of the target information.
In the embodiment of the disclosure, the target information may include order information of the order to be financed, such as an order contract, financing information, and the like. It should be noted that, where an analyst evaluation report exists, that is, an analyst's risk evaluation report on the order to be financed, the target information may also include the analyst evaluation report.
In the embodiment of the present disclosure, the initial keywords may include, for example, the name of the core enterprise, the name of the supplier's legal representative, the name of the supplier's general manager, the industry to which the order to be financed belongs, the start time of the order to be financed, the deadline of the order to be financed, and the like, which is not limited by the present disclosure.
In the embodiment of the disclosure, target information of the order to be financed may be acquired. For example, the target information may be uploaded manually or sent to the device on which the executing body of the present disclosure runs; or the target information may be obtained from local storage, and so on, which is not limited in this disclosure.
In the embodiment of the present disclosure, the initial keywords of the target information may be acquired. It should be noted that the number of initial keywords in the target information may be, but is not limited to, one, which is not limited in this disclosure.
As an example, assuming that the order information of the order to be financed is an order contract under which core enterprise A purchases a batch of batteries from supplier B, the initial keywords obtained from this order contract may include, for example, A, B, new energy, and batteries.
It should be noted that the above examples of the order information of the order to be financed and its initial keywords are merely exemplary, and may differ in practical applications.
Step S102, inputting at least one initial keyword into a knowledge graph model to obtain at least one target keyword.
In the embodiment of the present disclosure, the knowledge graph model may be, but is not limited to, a knowledge graph relation extraction model based on an LSTM (Long Short-Term Memory network), which is not limited in this disclosure.
In the embodiment of the disclosure, each initial keyword may be input into a knowledge graph model, and at least one target keyword may be obtained. It should be noted that the number of target keywords may be, but is not limited to, one, which is not limited by the present disclosure.
As a possible implementation manner, at least one initial keyword may be input into the knowledge graph model, an entity relationship object having an association relationship with each initial keyword may be obtained, and each entity relationship object and each initial keyword may be used as a target keyword.
Still referring to the above example, assuming that the initial keywords are "A" and "battery", inputting the initial keywords into the knowledge graph model yields an entity relationship object associated with the initial keyword "A" and an entity relationship object associated with the initial keyword "battery". For example, the entity relationship object associated with "A" is "C", where C is a core enterprise and the association relationship between core enterprise A and core enterprise C is a "competitor" relationship, that is, A and C are competitors; the entity relationship object associated with "battery" is "lithium battery", where the association relationship between battery and lithium battery is a "contains" relationship, that is, batteries include lithium batteries. After the entity relationship objects associated with the initial keywords are obtained, each entity relationship object and each initial keyword may be used as a target keyword, that is, "A", "battery", "C", and "lithium battery" are all used as target keywords.
It should be noted that the above examples of the entity relationship object having the association relationship with the initial keyword are merely exemplary, and may be other in practical application, which is not limited in this disclosure.
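The keyword expansion in the example above can be sketched as follows. This is only an illustration under stated assumptions: a small hand-written dictionary (`TOY_GRAPH`, hypothetical) stands in for the trained knowledge graph model, and the function name `expand_keywords` is likewise hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the trained knowledge graph model:
# keyword -> list of (relation, related entity).
TOY_GRAPH: Dict[str, List[Tuple[str, str]]] = {
    "A": [("competitor", "C")],
    "battery": [("contains", "lithium battery")],
}


def expand_keywords(initial_keywords: List[str],
                    graph: Dict[str, List[Tuple[str, str]]]) -> List[str]:
    """Return each initial keyword plus its related entities, without duplicates."""
    targets: List[str] = []
    for keyword in initial_keywords:
        related = [entity for _relation, entity in graph.get(keyword, [])]
        for candidate in [keyword] + related:
            if candidate not in targets:
                targets.append(candidate)
    return targets


target_keywords = expand_keywords(["A", "battery"], TOY_GRAPH)
```

With the toy graph above, the target keywords are "A", "C", "battery" and "lithium battery", matching the example in the text.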
Step S103, news searching is conducted on the target database based on the at least one target keyword to obtain at least one first search news.
In the embodiment of the present disclosure, the first search news may be, for example, enterprise news, industry news, provider news, and the like, which is not limited by the present disclosure.
In the embodiment of the disclosure, news retrieval can be performed on the target database according to each target keyword to obtain the first search news. It should be noted that the number of the first search news may be, but is not limited to, one, which is not limited in this disclosure.
As a possible implementation, after the at least one first search news is acquired, the at least one first search news may be subjected to a cleaning process.
As an example, performing the cleaning process on the at least one first search news may include:
1. Removing duplicate news from all the first search news;
2. Removing, according to the source website of each first search news, target news whose source website belongs to the set source websites; the set source websites may be preset, which is not limited in this disclosure.
3. Removing irrelevant information from all the first search news, for example, HTML tag information picked up by the crawler.
Therefore, the cleaning process is carried out on at least one first search news, so that noise can be effectively reduced, and subsequent data processing is facilitated.
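A minimal sketch of the three cleaning steps is given below. The record fields (`source`, `text`), the function name `clean_news` and the regular-expression tag stripping are all assumptions made for illustration; a real crawler pipeline would likely use a proper HTML parser. The sketch strips tags before deduplicating so that copies differing only in markup are also caught, which is a design choice rather than something the text prescribes.

```python
import re
from typing import Dict, List, Set

# Crude tag pattern, assumed sufficient for this illustration only.
TAG_RE = re.compile(r"<[^>]+>")


def clean_news(news_items: List[Dict[str, str]],
               blocked_sources: Set[str]) -> List[Dict[str, str]]:
    """Drop news from blocked sources, strip HTML tags, and remove duplicates."""
    seen_texts: Set[str] = set()
    cleaned: List[Dict[str, str]] = []
    for item in news_items:
        if item["source"] in blocked_sources:
            continue  # step 2: source website is in the set source websites
        text = TAG_RE.sub("", item["text"])  # step 3: remove HTML tag residue
        text = " ".join(text.split())        # normalize whitespace
        if text in seen_texts:
            continue  # step 1: duplicate news
        seen_texts.add(text)
        cleaned.append({"source": item["source"], "text": text})
    return cleaned


raw_news = [
    {"source": "news.example.com", "text": "<p>A signs battery deal</p>"},
    {"source": "news.example.com", "text": "A signs battery deal"},
    {"source": "spam.example.com", "text": "Buy now"},
]
cleaned_news = clean_news(raw_news, {"spam.example.com"})
```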
Step S104, according to the target information, the at least one target keyword and the at least one first search news, a first language model is adopted to generate a risk assessment report of the order to be financed.
In an embodiment of the present disclosure, the first language model may be, for example, but not limited to, a trained Baichuan-13B large model, which is not limited by the present disclosure.
In the embodiment of the disclosure, a risk assessment report of the order to be financed can be generated by adopting the first language model according to the target information, the target keywords and the first search news.
In one possible implementation manner of the embodiment of the present disclosure, in order to obtain an initial keyword in the target information, a first language model may be used to obtain at least one initial keyword of the target information according to the target information. For example, the target information may be input to the first language model, and the output of the first language model may be used as an initial keyword of the target information.
According to the risk assessment method for order financing, target information of an order to be financed is obtained, and at least one initial keyword of the target information is obtained; the at least one initial keyword is input into a knowledge graph model to obtain at least one target keyword; news retrieval is performed on the target database based on the at least one target keyword to obtain at least one first search news; and a risk assessment report of the order to be financed is generated by adopting a first language model according to the target information, the at least one target keyword and the at least one first search news. Therefore, the risk assessment report of the order to be financed can be generated automatically, which reduces the cost of human resources and effectively avoids inaccuracy in the risk assessment report caused by human subjectivity.
In order to clearly illustrate how, in the above embodiments of the present disclosure, a risk assessment report of the order to be financed is generated by using the first language model according to the target information, the at least one target keyword, and the at least one first search news, the present disclosure further proposes a risk assessment method for order financing.
Fig. 2 is a flowchart of a risk assessment method for order financing according to a second embodiment of the present disclosure.
As shown in fig. 2, the risk assessment method for order financing may include the following steps:
Step S201, obtaining target information of an order to be financed, and obtaining at least one initial keyword of the target information.
Step S202, inputting at least one initial keyword into a knowledge graph model to obtain at least one target keyword.
In step S203, news searching is performed on the target database based on the at least one target keyword, so as to obtain at least one first search news.
The explanation of step S201 to step S203 may be referred to the related description in any embodiment of the disclosure, and will not be repeated here.
Step S204, screening the at least one first search news to obtain at least one target news.
In the embodiment of the disclosure, the obtained at least one first search news may be screened, and at least one target news may be obtained from the at least one first search news.
It should be noted that the number of the target news may be, but is not limited to, one, which is not limited in this disclosure.
In order to acquire the target news, in one possible implementation manner of the embodiment of the disclosure, the following steps may be adopted to implement acquisition of the target news:
Step S2041, according to at least one first search news, a first language model is adopted to obtain a first reliability score corresponding to each first search news.
In the embodiment of the disclosure, the first language model may evaluate the reliability of each first search news.
In the embodiment of the disclosure, for any first search news, a first language model may be used for the first search news to obtain a first reliability score corresponding to the first search news. For example, the first search news may be input to the first language model, resulting in a first reliability score corresponding to the first search news.
Step S2042, according to at least one first search news, a second language model is adopted, and a second reliability score corresponding to each first search news is obtained.
In the embodiment of the present disclosure, the second language model may be, for example, a large-scale model of the text-to-text, a large-scale model of the meaning, and the like, which is not limited by the present disclosure.
It should be noted that, in the present disclosure, the second language model is different from the first language model.
In the embodiment of the disclosure, the second language model may also evaluate the reliability of each first search news.
In the embodiment of the disclosure, for any first search news, a second language model may be used for the first search news to obtain a second reliability score corresponding to the first search news. For example, the first search news may be input to a second language model, resulting in a second reliability score corresponding to the first search news.
It should be noted that, the execution timing of step S2041 and step S2042 is not limited by the present disclosure, that is, step S2042 may be executed in parallel with step S2041, or step S2042 may be executed before step S2041, and the present disclosure is merely exemplified by step S2041 executed before step S2042, which is not limited by the present disclosure.
Step S2043, for any first search news, performing weighted summation on the first reliability score and the second reliability score corresponding to the first search news to obtain the weight corresponding to the first search news.
In the embodiment of the disclosure, for any first search news, the first reliability score and the second reliability score corresponding to the first search news are weighted and summed to obtain the weight corresponding to the first search news.
As an example, assume that the reliability score consists of a logic score and a proportion of effective information, that is, the reliability of a first search news is evaluated in terms of its logic (for example, whether the news content is self-contradictory) and the proportion of effective information it contains. Assume that the first search news is New1. For New1, the first language model yields a first reliability score of [logic score A1, proportion of effective information B1], and the second language model yields a second reliability score of [logic score A2, proportion of effective information B2]. The first reliability score and the second reliability score corresponding to New1 may then be weighted and summed according to the following formula to obtain the weight W corresponding to New1:
W=50%*(A1+A2)/2+50%*(B1+B2)/2; (1)
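Formula (1) can be written as a small function. This is a sketch, assuming both scores are given as (logic score, proportion of effective information) pairs on a common numeric scale; the function name `news_weight` and the sample values are hypothetical.

```python
from typing import Tuple


def news_weight(first_score: Tuple[float, float],
                second_score: Tuple[float, float]) -> float:
    """Formula (1): W = 50% * (A1 + A2) / 2 + 50% * (B1 + B2) / 2."""
    a1, b1 = first_score    # from the first language model
    a2, b2 = second_score   # from the second language model
    return 0.5 * (a1 + a2) / 2 + 0.5 * (b1 + b2) / 2


# Illustrative values only, assumed to lie in [0, 1].
w_new1 = news_weight((0.75, 0.5), (0.25, 0.5))
```

In effect, each model's logic score and effective-information proportion is averaged across the two models, and the two averages contribute equally to W.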
Step S2044, screening and obtaining target news from at least one first search news according to the weight corresponding to each first search news.
In the embodiment of the disclosure, the target news may be screened from at least one first search news according to the weight corresponding to each first search news.
As a possible implementation manner, the first search news may be ranked in descending order of weight to obtain a first ranking sequence, and the first search news whose sequence numbers in the first ranking sequence are smaller than the set sequence number may be determined as the target news.
In the embodiment of the present disclosure, the set sequence number may be preset, for example, may be 4, 5, etc., which is not limited in this disclosure.
As an example, assume that there are first search news N1, N2, N3, N4 and N5, and that the set sequence number is 3. Assume that ranking the first search news in descending order of weight yields the first ranking sequence N3, N2, N4, N1, N5. The first search news whose sequence numbers in the first ranking sequence are smaller than the set sequence number, that is, N3 and N2, may then be determined as the target news.
Thus, the target news can be effectively determined.
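The ranking and selection in this example can be sketched as follows, assuming 1-based sequence numbers as in the example; the function name `select_target_news` and the weight values are hypothetical.

```python
from typing import Dict, List


def select_target_news(weights: Dict[str, float],
                       set_sequence_number: int) -> List[str]:
    """Rank news by descending weight and keep those whose 1-based
    sequence number is strictly smaller than the set sequence number."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    return ranked[: max(set_sequence_number - 1, 0)]


# Illustrative weights chosen so that the ranking is N3, N2, N4, N1, N5.
weights = {"N1": 0.4, "N2": 0.8, "N3": 0.9, "N4": 0.6, "N5": 0.1}
target_news = select_target_news(weights, set_sequence_number=3)
```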
Step S205, splicing the target information, at least one target keyword and at least one target news which are all in the set format to obtain splicing information.
In the embodiment of the present disclosure, the set format may be preset, for example, may be JSON (JavaScript Object Notation) format, XML (Extensible Markup Language) format, or the like, which is not limited in this disclosure.
In the embodiment of the disclosure, the target information, at least one target keyword and at least one target news which are all in the set format can be spliced to obtain the spliced information.
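As a sketch of the splicing step when the set format is JSON, the three inputs can be collected into one object and serialized. The field names (`target_info`, `target_keywords`, `target_news`) and the function name are assumptions; the text does not specify the schema.

```python
import json
from typing import Any, Dict, List


def splice_inputs(target_info: Dict[str, Any],
                  target_keywords: List[str],
                  target_news: List[str]) -> str:
    """Serialize the three inputs into one JSON string (the splicing information)."""
    payload = {
        "target_info": target_info,          # hypothetical field name
        "target_keywords": target_keywords,  # hypothetical field name
        "target_news": target_news,          # hypothetical field name
    }
    # ensure_ascii=False keeps Chinese text readable instead of \u escapes.
    return json.dumps(payload, ensure_ascii=False)


splicing_info = splice_inputs(
    {"supplier": "A", "product": "battery"},
    ["A", "battery", "C", "lithium battery"],
    ["A signs battery deal"],
)
```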
Step S206, generating a risk assessment report of the order to be financed by adopting a first language model according to the splicing information.
In the embodiment of the disclosure, a risk assessment report of the order to be financed can be generated by adopting the first language model according to the splicing information.
According to the risk assessment method for order financing, at least one first search news is screened to obtain at least one target news; the target information, the at least one target keyword and the at least one target news, all in the set format, are spliced to obtain splicing information; and a risk assessment report of the order to be financed is generated by adopting the first language model according to the splicing information. Therefore, the splicing information can be effectively acquired, and the risk assessment report of the order to be financed can be effectively acquired based on the splicing information.
In order to clearly explain how a risk assessment report of the order to be financed is generated according to the splicing information by using the first language model in the above embodiments of the present disclosure, the present disclosure further provides a risk assessment method for order financing.
Fig. 3 is a flowchart of a risk assessment method for order financing according to a third embodiment of the present disclosure.
As shown in fig. 3, according to the above embodiment of the present disclosure, the risk assessment method for order financing may further include the steps of:
Step S301, filling the splicing information into the corresponding filling position in the first prompt template to obtain first prompt information.
In the embodiment of the disclosure, the splicing information may be filled into the corresponding filling position in the first prompt template to obtain the first prompt information.
As an example, assume that the first hint template is:
Role: You are a risk assessment expert at a financial institution who is skilled in data analysis
Task: Based on the JSON-format information, assess the risk of the order shown by the supplier enterprise and its voucher contract and of the financing target credit of TARGETCREDIT RMB, and generate a risk assessment report of no more than 2000 words. Please treat the accuracy and authenticity of the report as the highest priority. If the JSON-format information contains analyst reports, please score each analyst report.
JSON information:
$$$${(filling position here)}$$$$
Evaluation report format:
···
Risk scoring:
Analysts report scores (if any):
report text:
···
Assuming that the splicing information is splicing information 1, the splicing information 1 can be filled in the corresponding filling position in the first prompting template, and the first prompting information can be obtained, namely:
Role: You are a risk assessment expert at a financial institution who is skilled in data analysis
Task: Based on the JSON-format information, assess the risk of the order shown by the supplier enterprise and its voucher contract and of the financing target credit of TARGETCREDIT RMB, and generate a risk assessment report of no more than 2000 words. Please treat the accuracy and authenticity of the report as the highest priority. If the JSON-format information contains analyst reports, please score each analyst report.
JSON information:
$$$${Splicing information 1}$$$$
Evaluation report format:
···
Risk scoring:
Analysts report scores (if any):
report text:
···
It should be noted that the foregoing example of the first prompt template is merely exemplary, and in practical applications, the first prompt template may take other forms, which is not limited in this disclosure.
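Filling the template can be sketched as a plain string substitution. The template text below is abbreviated, and the placeholder marker `{SPLICING_INFO}` is a hypothetical stand-in for the filling position; `str.replace` is used rather than `str.format` because the JSON being inserted itself contains braces.

```python
PLACEHOLDER = "{SPLICING_INFO}"  # hypothetical marker for the filling position

# Abbreviated, illustrative version of the first prompt template.
FIRST_PROMPT_TEMPLATE = (
    "Role: You are a risk assessment expert at a financial institution\n"
    "Task: Assess the risk based on the JSON-format information\n"
    "JSON information:\n"
    + PLACEHOLDER + "\n"
    "Evaluation report format:\n"
    "Risk scoring:\n"
    "Report text:\n"
)


def fill_prompt(template: str, splicing_info: str) -> str:
    """Insert the splicing information at the single filling position."""
    if template.count(PLACEHOLDER) != 1:
        raise ValueError("template must contain exactly one filling position")
    return template.replace(PLACEHOLDER, splicing_info)


first_prompt = fill_prompt(FIRST_PROMPT_TEMPLATE, '{"target_keywords": ["A"]}')
```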
Step S302, the first prompt information is compressed to obtain compressed information.
In the embodiment of the present disclosure, the first prompt information may be compressed to obtain compressed information.
As an example, the following steps may be employed to compress the first hint information:
Step 1, compressing the order information when the target information (or the splicing information) in the first prompt information includes order information:
1.1 when the first total number of tokens (character units) included in the order information is greater than a first set number (such as 1000, 1500, etc.), the order information may be compressed to a second set number of tokens (such as 800, 1000, etc.) by using the first language model;
1.2 when the first total number of tokens included in the order information is not greater than the first set number, the order information may be left uncompressed.
Step 2, compressing the analyst evaluation report when the target information (or the splicing information) in the first prompt information includes an analyst evaluation report:
2.1 when the second total number of tokens included in the analyst evaluation report is greater than a third set number (e.g., 500, 600, etc.), the analyst evaluation report may be compressed to a fourth set number of tokens (e.g., 500, 400, etc.) by using the first language model;
2.2 when the second total number of tokens included in the analyst evaluation report is not greater than the third set number, the analyst evaluation report may be left uncompressed.
Step 3, compressing at least one target news in the splicing information in the first prompt information:
3.1 determining the maximum number of tokens that the first language model can accept, a first value of the number of tokens in the target information in the splicing information, a second value of the number of tokens in the splicing information other than those in the target information and in each target news, and a third value of the number of tokens in the first prompt information other than those in the splicing information;
3.2 determining a fourth value, that is, the number of tokens that can still be input to the first language model, according to the maximum value, the first value, the second value and the third value;
For example, assuming that the maximum value is n, the first value is k, the second value is q, and the third value is p, the fourth value m may be determined according to the following formula:
m=n-p-q-k; (2)
3.3 compressing the at least one target news in the splicing information so that the total number of tokens in all target news after compression is not greater than the fourth value.
As an example, a fifth value of the number of tokens in each target news in the splicing information is determined, and the fifth values of all target news are summed to obtain a first sum y. Assuming that the fourth value is m, when m < y < 10m, all target news are spliced to obtain a first spliced news; the first spliced news is divided into multiple first sub-news; each first sub-news is compressed to a fifth set number of tokens, where the fifth set number is bounded so that the compressed first sub-news together do not exceed the fourth value m; and the compressed first sub-news are merged to obtain the compressed at least one target news.
Continuing the above example, when y > 10m, first-class news and second-class news are determined in the at least one target news, where the first-class news may be enterprise news and/or creator news, and the second-class news may be industry news. For the first-class news, the weight (or priority) of each first-class news is obtained, and the first-class news are sorted in descending order of weight to obtain a second ranking sequence; the first-class news are spliced according to the second ranking sequence to obtain a second spliced news; a first intercepted news containing 5m tokens is intercepted from the starting position of the second spliced news; the first intercepted news is divided into 5 (= 5m/m) second sub-news; each second sub-news is compressed to a sixth set number of tokens, where the sixth set number is bounded so that the compressed news as a whole does not exceed the fourth value m; and the compressed second sub-news are merged to obtain the compressed first-class news.
For the second-class news, the weight (or priority) of each second-class news is obtained, and the second-class news are sorted in descending order of weight to obtain a third ranking sequence; the second-class news are spliced according to the third ranking sequence to obtain a third spliced news; a second intercepted news containing 5m tokens is intercepted from the starting position of the third spliced news; the second intercepted news is divided into 5 (= 5m/m) third sub-news; each third sub-news is compressed to a seventh set number of tokens, where the seventh set number is bounded in the same way; and the compressed third sub-news are merged to obtain the compressed second-class news.
Therefore, the first prompt information can be compressed through the steps 1,2 and 3, so that compressed information can be obtained.
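The token budget of formula (2) and the choice between the two news-compression branches can be sketched as below. Several details are assumptions, because the text does not state them: the number of sub-news in the first branch is taken as the ceiling of y/m, the per-part token target is taken as the budget divided evenly across parts, news that already fits (y ≤ m) is left alone, and the boundary case y = 10m is treated like y > 10m. All function names are hypothetical.

```python
import math
from typing import Dict, Union


def remaining_token_budget(n: int, k: int, q: int, p: int) -> int:
    """Formula (2): m = n - p - q - k."""
    return n - p - q - k


def plan_news_compression(y: int, m: int) -> Dict[str, Union[str, int]]:
    """Choose a compression plan for target news totalling y tokens, given budget m."""
    if m <= 0:
        raise ValueError("no token budget left for target news")
    if y <= m:
        # Assumption: news that already fits is not compressed.
        return {"mode": "keep", "parts": 0, "tokens_per_part": 0}
    if y < 10 * m:
        # Assumption: split into ceil(y / m) parts, each with an equal share of m.
        parts = math.ceil(y / m)
        return {"mode": "split_all", "parts": parts, "tokens_per_part": m // parts}
    # Two classes, each truncated to 5m tokens and split into 5 parts (10 in total).
    # Assumption: the budget m is shared evenly across the 10 parts.
    return {"mode": "split_by_class", "parts": 10, "tokens_per_part": m // 10}


m = remaining_token_budget(n=4096, k=800, q=96, p=200)
plan = plan_news_compression(y=7000, m=m)
```

Under these assumptions, the sum of the compressed parts never exceeds m, which is the constraint stated in step 3.3.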
Step S303, inputting the compressed information into the first language model to obtain a risk assessment report output by the first language model.
In the embodiment of the disclosure, the compressed information can be input into the first language model, so that a risk assessment report output by the first language model can be obtained.
According to the risk assessment method for order financing, the splicing information is filled into the corresponding filling position in the first prompt template to obtain first prompt information; the first prompt information is compressed to obtain compressed information; and the compressed information is input into the first language model to obtain a risk assessment report output by the first language model. Therefore, the compressed information can be effectively acquired based on the splicing information, and the risk assessment report of the order to be financed can be effectively acquired based on the compressed information.
The above-mentioned embodiments correspond to the risk assessment method for financing orders, and the following describes how to train the knowledge graph model.
Fig. 4 is a flowchart of a training method of a knowledge graph model according to a fourth embodiment of the disclosure.
As shown in fig. 4, the training of the knowledge-graph model may include the following steps:
Step S401, acquiring first initial training data.
It should be noted that, the first initial training data may include supply chain financial information and order financing information.
Step S402, based on the first initial training data, a third language model is adopted to obtain a plurality of groups of first entity relation triples.
In the embodiment of the present disclosure, the third language model may be, for example, a large-scale model of the text-to-text, a large-scale model of the meaning, and the like, which is not limited by the present disclosure.
It should be noted that, in the present disclosure, the third language model is different from the first language model.
It should be further noted that the third language model may be the same as the second language model, or may be different, which is not limited by the present disclosure.
In an embodiment of the present disclosure, multiple sets of first entity-relationship triples may be obtained using a third language model based on the first initial training data.
As one possible implementation, first, the first initial training data may be input to a third language model to extract at least one set of third entity-relationship triples from the first initial training data; any third entity relationship triplet may include two first entities; secondly, news searching is carried out on each first entity through a first search engine, so that a plurality of first search news can be obtained; finally, each first search news may be input to a third language model, thereby obtaining a plurality of sets of first entity relationship triples.
It should be noted that the number of third entity relationship triples may be, but is not limited to, one group, which is not limited by the present disclosure.
Thus, the acquisition of the first entity relationship triplet may be achieved.
Step S403, based on the first initial training data, a relation extraction model is adopted to obtain a plurality of groups of second entity relation triples.
In the embodiment of the present disclosure, the relation extraction model may be, for example, a BERT model, a BiLSTM-CRF (Bi-directional Long Short-Term Memory with Conditional Random Field) model, or the like, which is not limited in this disclosure.
In the embodiment of the disclosure, a relationship extraction model may be used to obtain multiple sets of second entity relationship triples based on the first initial training data.
As a possible implementation manner, first, the entity relationship triples may be extracted from the first initial training data through a relationship extraction model, so as to obtain at least one group of fourth entity relationship triples; wherein any fourth entity relationship triplet may include two second entities; secondly, news searching can be carried out on each second entity through a second search engine so as to acquire a plurality of second search news; and finally, extracting entity relation triples from each second search news through a relation extraction model, so as to obtain a plurality of groups of second entity relation triples.
It should be noted that the number of the fourth entity relationship triples may be, but is not limited to, one group, which is not limited in this disclosure.
Thereby, the acquisition of the second entity relationship triplet may be achieved.
Step S404, obtaining first target training data according to a plurality of groups of first entity relation triples and a plurality of groups of second entity relation triples.
As one possible implementation manner, in response to there being at least one group of first target triples among the multiple groups of first entity relationship triples and the multiple groups of second entity relationship triples, each first target triplet may be determined as first target training data, where a first target triplet is a triplet that is both a first entity relationship triplet and a second entity relationship triplet. Entity relationship triples that are only first entity relationship triples and entity relationship triples that are only second entity relationship triples may both be determined as second target triples; for any second target triplet, if the second target triplet is checked and found to be error-free, it may also be determined as first target training data.
As an example, assuming that the multiple sets of first entity relationship triples include triples a, triples B, and triples C, and the multiple sets of second entity relationship triples include triples a, triples C, and triples E, comparing the multiple sets of first entity relationship triples with the multiple sets of second entity relationship triples, where the triples a and the triples C are both the first entity relationship triples and the second entity relationship triples, it may be determined that both the triples a and the triples C are first target triples; any first target triplet may be determined as first target training data;
And the entity relationship triplet which is only the first entity relationship triplet comprises a triplet B and the entity relationship triplet which is only the second entity relationship triplet comprises a triplet E, both the triplet B and the triplet E can be determined as second target triples; assuming that the triplet B is checked, determining the triplet B as the first target training data under the condition that the triplet B is checked without errors; assuming that the triplet E is checked, in the case that the triplet E is checked with errors, the triplet E is not determined as the first target training data.
Thereby, a determination of the first target training data may be achieved.
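The selection rule in this example maps onto set operations: triples found by both extractors are kept directly, and triples found by only one are kept if they pass a check. In the sketch below, triples are represented as (head, relation, tail) tuples and the check is passed in as a callable; both choices, and all names and sample triples, are assumptions, since the text does not say how the check is carried out.

```python
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def build_training_triples(first: Iterable[Triple],
                           second: Iterable[Triple],
                           is_error_free: Callable[[Triple], bool]) -> Set[Triple]:
    """Keep triples both sources agree on, plus single-source triples that pass the check."""
    first_set, second_set = set(first), set(second)
    first_targets = first_set & second_set    # in both: first target triples
    second_targets = first_set ^ second_set   # in exactly one: second target triples
    return first_targets | {t for t in second_targets if is_error_free(t)}


# Hypothetical triples standing in for triples A, B, C and E of the example.
A = ("A", "competitor", "C")
B = ("A", "supplier", "D")
C = ("battery", "contains", "lithium battery")
E = ("C", "supplier", "F")

# Assume the check accepts B and rejects E, as in the example.
training_triples = build_training_triples([A, B, C], [A, C, E], lambda t: t == B)
```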
Step S405, training the initial knowledge-graph model based on the first target training data to obtain a trained knowledge-graph model.
In the embodiment of the disclosure, the initial knowledge-graph model may be trained based on the first target training data, and a trained knowledge-graph model may be obtained.
According to the training method of the knowledge graph model, first initial training data are obtained; wherein the first initial training data includes supply chain financial information and order financing information; acquiring a plurality of groups of first entity relation triples by adopting a third language model based on the first initial training data; based on the first initial training data, a relation extraction model is adopted to obtain a plurality of groups of second entity relation triples; acquiring first target training data according to a plurality of groups of first entity relation triples and a plurality of groups of second entity relation triples; based on the first target training data, training the initial knowledge-graph model to obtain a trained knowledge-graph model. Therefore, training of the knowledge graph model can be achieved, and the prediction capability of the model on the entities in the entity relation triples is improved.
The following describes how the first language model is trained.
Fig. 5 is a flowchart of a training method of a language model according to a fifth embodiment of the present disclosure.
As shown in fig. 5, the training method of the language model may include the steps of:
Step S501, a self-supervision training dataset, a positive and negative sample training dataset and an instruction fine tuning training dataset are obtained.
In the embodiment of the present disclosure, the self-supervised training dataset may include, for example, financial news from public news media or platforms, financial news from social media, news related to the newspapers of group companies, order information and financing information authorized by group companies, encyclopedia data, the open-source LLaMA-Alpaca Chinese dataset, a Chinese core journal metadata dataset, and the like, which is not limited in this disclosure.
In the disclosed embodiments, the positive and negative sample training data sets may include positive sample news and negative sample news.
In one possible implementation of an embodiment of the present disclosure, the positive sample news may include news labeled "correct and relevant (i.e., True And Relative)" and news labeled "correct but irrelevant (i.e., True But Inrelative)"; the negative sample news may include news labeled "wrong but relevant (i.e., False But Relative)" and news labeled "wrong and irrelevant (i.e., False And Inrelative)".
It should be noted that the present disclosure does not limit the number of news labeled "correct and relevant (i.e., True And Relative)" or the number of news labeled "correct but irrelevant (i.e., True But Inrelative)" in the positive sample news; that is, the two numbers may be the same or different.
Similarly, the present disclosure does not limit the number of news labeled "wrong but relevant (i.e., False But Relative)" or the number of news labeled "wrong and irrelevant (i.e., False And Inrelative)" in the negative sample news; that is, the two numbers may be the same or different.
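The mapping from the four labels to positive and negative samples can be sketched as follows. Labels are compared case-insensitively so that spelling variants of the same label are treated alike; the function name and the sample news strings are hypothetical.

```python
from typing import Iterable, List, Tuple

# Labels as given in the text, lowercased for comparison.
POSITIVE_LABELS = {"true and relative", "true but inrelative"}
NEGATIVE_LABELS = {"false but relative", "false and inrelative"}


def split_by_label(labeled_news: Iterable[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Split (news text, label) pairs into positive and negative sample lists."""
    positives: List[str] = []
    negatives: List[str] = []
    for text, label in labeled_news:
        key = " ".join(label.lower().split())
        if key in POSITIVE_LABELS:
            positives.append(text)
        elif key in NEGATIVE_LABELS:
            negatives.append(text)
        else:
            raise ValueError(f"unknown label: {label!r}")
    return positives, negatives


positive_news, negative_news = split_by_label([
    ("news 1", "True And Relative"),
    ("news 2", "True But Inrelative"),
    ("news 3", "False But Relative"),
    ("news 4", "FALSE AND INRELATIVE"),
])
```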
In embodiments of the present disclosure, the instruction fine-tuning training dataset may be obtained by manually constructing samples.
As an example, when manually constructing a sample, multiple sets of instructions, questions, and answers related to order financing, supply chain finance, etc. may be preset, such as:
1. instructions to: please answer the following questions with a sentence
Problems: what are the main repayment sources of the real estate mortgage service?
Answer: sale return fund and mortgage change
2. Instructions to: please select the correct sequence number of the following sentences, separated by commas
Problems: 1. the security warehouse is suitable for commodities with high sales pressure, and future right-of-goods mortgages are generally suitable for pretty commodities. 2. The warranty warehouse must approve the credit limit for the manufacturer, and future credit mortgages do not have to approve the credit limit for the manufacturer. 3. Both the security warehouse and future right-of-goods mortgage mode
Answer: 1,2,3
Samples are then generated from the preset instructions, questions and answers. The samples may be:
{
"instruction": "Please select the sequence numbers of the correct statements below, separated by commas",
"input": "1. The confirming warehouse is suitable for commodities under high sales pressure, whereas the future goods-rights mortgage is generally suitable for best-selling commodities. 2. The confirming warehouse requires a credit limit to be approved for the manufacturer, whereas the future goods-rights mortgage does not. 3. Both the confirming warehouse and the future goods-rights mortgage follow a payment/bill-first, goods-later mode.",
"result": "1,2,3"
}
{
"instruction": "Please answer the following question in one sentence",
"input": "What are the main repayment sources of the real estate mortgage service?",
"result": "Sales proceeds and liquidation of the mortgaged assets"
}
It should be noted that the above instructions, questions, answers and samples are merely exemplary; in practical applications, the instructions, questions, answers and corresponding samples may be set as needed.
Thus, in the present disclosure, after a plurality of samples are manually constructed, the plurality of samples may be used as the instruction fine-tuning training dataset.
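The sample construction described above can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the function name `build_sample`, the abbreviated question text and the JSON Lines storage format are assumptions; only the three fields (instruction, input, result) come from the sample shown in the text.

```python
import json

def build_sample(instruction: str, question: str, answer: str) -> dict:
    """Map one preset (instruction, question, answer) set to a sample record."""
    return {"instruction": instruction, "input": question, "result": answer}

# Hypothetical presets paraphrasing the examples in the text; the second
# question is abbreviated here for brevity.
presets = [
    ("Please answer the following question in one sentence",
     "What are the main repayment sources of the real estate mortgage service?",
     "Sales proceeds and liquidation of the mortgaged assets"),
    ("Please select the sequence numbers of the correct statements, separated by commas",
     "1. (statement one) 2. (statement two) 3. (statement three)",
     "1,2,3"),
]

# The manually constructed samples together form the dataset.
dataset = [build_sample(*preset) for preset in presets]

# Storing one JSON object per line is an assumed format, not one named in the text.
jsonl = "\n".join(json.dumps(sample, ensure_ascii=False) for sample in dataset)
print(jsonl)
```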
In embodiments of the present disclosure, a self-supervised training data set, a positive and negative sample training data set, and an instruction fine-tuning training data set may thus be obtained.
Step S502, performing first training on the initial first language model based on the self-supervised training data set, to obtain the first language model subjected to the first training.
In the embodiment of the present disclosure, the initial first language model may be an open-source large language model, such as the open-source Baichuan-13B model (i.e., a pretrained Baichuan-13B model), which is not limited by the present disclosure.
In the embodiment of the disclosure, the initial first language model may be subjected to first training based on the self-supervised training data set, so as to obtain the first language model subjected to the first training.
As an example, when the first training is performed, the initial first language model may be loaded and its bottom-layer parameters frozen; the self-supervised training data set may then be input directly into the first language model with the frozen bottom-layer parameters for autoregressive training, thereby fine-tuning the initial first language model to obtain the first language model subjected to the first training.
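The two ideas in this step, freezing the bottom layers and training autoregressively, can be illustrated without any deep-learning framework. The sketch below is an assumption-laden illustration only: the layer names, the number of frozen layers and the token ids are hypothetical, and the text does not state how many layers are frozen.

```python
def split_frozen(layer_names: list[str], num_frozen: int) -> tuple[list[str], list[str]]:
    """Split layers, ordered bottom to top, into frozen and trainable groups."""
    return layer_names[:num_frozen], layer_names[num_frozen:]

def next_token_pairs(token_ids: list[int]) -> list[tuple[list[int], int]]:
    """Autoregressive targets: every prefix is paired with the token that follows it."""
    return [(token_ids[:i], token_ids[i]) for i in range(1, len(token_ids))]

# Hypothetical 6-layer model in which the 4 bottom layers are frozen.
layers = [f"layer_{i}" for i in range(6)]
frozen, trainable = split_frozen(layers, num_frozen=4)

# Hypothetical token ids of one self-supervised training text.
pairs = next_token_pairs([11, 42, 7, 99])

print(frozen, trainable)
print(pairs)
```

Only the parameters in `trainable` would receive gradient updates; each `(context, target)` pair is one next-token prediction example.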
Step S503, performing second training on the first language model subjected to the first training based on the positive and negative sample training data set, to obtain the first language model subjected to the second training.
In the embodiment of the disclosure, the first language model subjected to the first training may be subjected to second training based on the positive and negative sample training data set, so as to obtain the first language model subjected to the second training.
As an example, when the second training is performed, the positive sample news and the negative sample news may be input to the first language model subjected to the first training, and conventional supervised classification training may be carried out to obtain the first language model subjected to the second training. The positive sample news may include news labeled "correct and relevant" (i.e., True and Relevant) and news labeled "correct but irrelevant" (i.e., True but Irrelevant), and the negative sample news may include news labeled "incorrect but relevant" (i.e., False but Relevant) and news labeled "incorrect and irrelevant" (i.e., False and Irrelevant). It should be noted that, after the second training, the first language model can identify, among news items, correct and relevant news, correct but irrelevant news, incorrect but relevant news, and incorrect and irrelevant news.
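The labeling scheme used for the second training can be sketched as a four-way label with a positive/negative grouping. The names `NewsLabel`, `is_positive` and `split_samples`, and the example news strings, are hypothetical and serve only to make the grouping concrete.

```python
from enum import Enum

class NewsLabel(Enum):
    TRUE_RELEVANT = "correct and relevant"
    TRUE_IRRELEVANT = "correct but irrelevant"
    FALSE_RELEVANT = "incorrect but relevant"
    FALSE_IRRELEVANT = "incorrect and irrelevant"

# Per the text, both "correct" labels are positive samples
# and both "incorrect" labels are negative samples.
POSITIVE = {NewsLabel.TRUE_RELEVANT, NewsLabel.TRUE_IRRELEVANT}

def is_positive(label: NewsLabel) -> bool:
    return label in POSITIVE

def split_samples(news: list[tuple[str, NewsLabel]]) -> tuple[list[str], list[str]]:
    """Split labeled news into positive sample news and negative sample news."""
    positives = [text for text, label in news if is_positive(label)]
    negatives = [text for text, label in news if not is_positive(label)]
    return positives, negatives

# Hypothetical labeled news items.
labeled = [
    ("Supplier A wins a large contract", NewsLabel.TRUE_RELEVANT),
    ("Local weather report", NewsLabel.TRUE_IRRELEVANT),
    ("Fabricated rumor about Supplier A", NewsLabel.FALSE_RELEVANT),
    ("Fabricated celebrity story", NewsLabel.FALSE_IRRELEVANT),
]
positives, negatives = split_samples(labeled)
print(positives, negatives)
```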
Step S504, performing third training on the first language model subjected to the second training based on the instruction fine-tuning training data set, to obtain a trained first language model.
In the embodiment of the disclosure, the first language model subjected to the second training may be subjected to third training based on the instruction fine-tuning training data set, so as to obtain the trained first language model.
As an example, the instruction fine-tuning training data set may be input to the first language model subjected to the second training for instruction fine-tuning, i.e., for the third training. It should be noted that, when the open-source first language model provides an interface for instruction fine-tuning, the interface may be called directly to perform the third training on the model. When the open-source first language model does not provide such an interface, and the instruction fine-tuning training data set is generated from manually set instructions, questions and answers, the manually set instructions, questions and answers may be spliced to obtain splicing instructions, and the instruction fine-tuning training data set may be constructed based on the splicing instructions, so that the first language model subjected to the second training can be subjected to the third training based on that data set to obtain the trained first language model.
For example, assume that the manually set instruction, question and answer are respectively:
Instruction: Please answer the following question in one sentence
Question: What are the main repayment sources of the real estate mortgage service?
Answer: Sales proceeds and liquidation of the mortgaged assets
Splicing the manually set instruction, question and answer may yield the following splicing instruction:
User: [CLS] + "Please answer the following question in one sentence" + [SEP] + "What are the main repayment sources of the real estate mortgage service?" + [SEP], System: [CLS] + "Sales proceeds and liquidation of the mortgaged assets" + [SEP]
The splicing instruction uses User and System as the headers marking the instruction and question on the one hand and the answer on the other, and uses separators (such as [SEP], [CLS], etc.) to separate the tokens of the different components.
Therefore, after the splicing instructions are obtained in this manner, the instruction fine-tuning training data set can be constructed based on the splicing instructions, and the first language model subjected to the second training can then be subjected to the third training based on the instruction fine-tuning training data set, so as to obtain the trained first language model.
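The splicing described above can be sketched as a small string-building function. This is an illustrative assumption, not the disclosed implementation: the function name `splice` and the single space joining the User and System parts are hypothetical, and the special tokens are treated as plain strings.

```python
CLS, SEP = "[CLS]", "[SEP]"

def splice(instruction: str, question: str, answer: str) -> str:
    """Build one splicing instruction with User/System headers and separators."""
    user = f"User: {CLS}{instruction}{SEP}{question}{SEP}"
    system = f"System: {CLS}{answer}{SEP}"
    # Joining the two parts with a space is an assumption of this sketch.
    return f"{user} {system}"

spliced = splice(
    "Please answer the following question in one sentence",
    "What are the main repayment sources of the real estate mortgage service?",
    "Sales proceeds and liquidation of the mortgaged assets",
)
print(spliced)
```

Applying `splice` to every manually set (instruction, question, answer) set would give the list of splicing instructions from which the instruction fine-tuning training data set is built.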
According to the training method of the language model, a self-supervision training data set, a positive and negative sample training data set and an instruction fine tuning training data set are obtained; performing first training on the initial first language model based on the self-supervision training data set to obtain a first language model subjected to the first training; performing second training on the first language model subjected to the first training based on the positive and negative sample training data sets to obtain the first language model subjected to the second training; and performing third training on the first language model subjected to the second training based on the instruction fine tuning training data set to obtain a trained first language model. Thus, training of the first language model can be achieved.
Corresponding to the risk assessment method for order financing provided by the embodiments of fig. 1 to 3, an embodiment of the present disclosure further provides a risk assessment device for order financing. Since the device corresponds to that method, the implementations of the method also apply to the device and are not described in detail in this embodiment.
Fig. 6 is a schematic structural diagram of a risk assessment device for order financing according to a sixth embodiment of the present disclosure.
As shown in fig. 6, the risk assessment device 600 for financing an order includes: an acquisition module 601, an input module 602, a retrieval module 603 and a generation module 604.
The acquiring module 601 is configured to acquire target information of an order to be financed, and acquire at least one initial keyword of the target information.
An input module 602, configured to input at least one initial keyword into the knowledge-graph model to obtain at least one target keyword.
The retrieving module 603 is configured to perform news retrieval on the target database based on the at least one target keyword, so as to obtain at least one first retrieval news.
The generating module 604 is configured to generate a risk assessment report of the order to be financed by using the first language model according to the target information, the at least one target keyword and the at least one first search news.
In one embodiment of the present disclosure, the generating module 604 is configured to: screen the at least one first search news to obtain at least one target news; splice the target information, the at least one target keyword and the at least one target news, all in a set format, to obtain splicing information; and generate the risk assessment report of the order to be financed by using the first language model according to the splicing information.
In one embodiment of the present disclosure, the generating module 604 is configured to: fill the splicing information into the corresponding filling position in a first prompt template to obtain first prompt information; compress the first prompt information to obtain compressed information; and input the compressed information into the first language model to obtain the risk assessment report output by the first language model.
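Template filling and the token budget that drives compression can be sketched as follows. The budget formula m = n - p - q - k is the one stated in the claims (n: maximum tokens the model accepts; k: tokens of the target information; q: tokens of the splicing information other than the target information and the target news; p: tokens of the prompt outside the splicing information). Everything else is an assumption of this sketch: the template wording, the function names, the whitespace tokenizer standing in for the model's tokenizer, and the simplification that q covers only the target keywords.

```python
# Hypothetical prompt template with one filling position.
TEMPLATE = "Assess the financing risk of this order.\n{splice}\nWrite the report."

def count_tokens(text: str) -> int:
    """Stand-in tokenizer: counts whitespace-separated words (an assumption)."""
    return len(text.split())

def fill_template(splice_info: str) -> str:
    """Fill the splicing information into the template's filling position."""
    return TEMPLATE.format(splice=splice_info)

def news_budget(n: int, target_info: str, keywords: str) -> int:
    """Tokens left for the target news: m = n - p - q - k."""
    p = count_tokens(TEMPLATE.format(splice=""))  # prompt text outside the splice
    q = count_tokens(keywords)                    # splice minus target info and news
    k = count_tokens(target_info)                 # target information
    return n - p - q - k

# Hypothetical inputs.
target_info = "order contract text"
keywords = "SupplierA steel"
news = ["SupplierA expands capacity", "Steel prices fall"]

prompt = fill_template(" ".join([target_info, keywords, *news]))
m = news_budget(100, target_info, keywords)
print(prompt)
print(m)
```

If the target news together exceed `m` tokens, they would be compressed until their total fits within that budget before the prompt is passed to the first language model.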
In one embodiment of the present disclosure, the generating module 604 is configured to: obtain, according to the at least one first search news, a first reliability score corresponding to each first search news by using the first language model; obtain, according to the at least one first search news, a second reliability score corresponding to each first search news by using a second language model; for any first search news, perform weighted summation on the first reliability score and the second reliability score corresponding to the first search news to obtain a weight corresponding to the first search news; and screen the target news from the at least one first search news according to the weight corresponding to each first search news.
In one embodiment of the present disclosure, the generating module 604 is configured to: sort the first search news in descending order of weight to obtain a first ordering sequence; and determine the first search news whose serial number in the first ordering sequence is smaller than a set serial number as the target news.
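The scoring and screening described in the two preceding paragraphs can be sketched as a weighted sum followed by a sort. The function name, the equal default coefficients `w1` and `w2`, the zero-based serial numbers and the example scores are all assumptions; the text does not specify the coefficients or the set serial number.

```python
def select_target_news(news, scores_1, scores_2, w1=0.5, w2=0.5, set_serial=3):
    """Rank news by w1 * score_1 + w2 * score_2 and keep the top entries.

    scores_1 / scores_2 are the reliability scores from the first and second
    language models; serial numbers are zero-based in this sketch.
    """
    weights = [w1 * s1 + w2 * s2 for s1, s2 in zip(scores_1, scores_2)]
    # Sort in descending order of weight to obtain the first ordering sequence.
    ranked = sorted(zip(news, weights), key=lambda pair: pair[1], reverse=True)
    # Keep the news whose serial number is smaller than the set serial number.
    return [item for item, _ in ranked[:set_serial]]

# Hypothetical news items and reliability scores.
news = ["news_a", "news_b", "news_c"]
scores_1 = [0.9, 0.2, 0.6]
scores_2 = [0.7, 0.4, 0.8]

targets = select_target_news(news, scores_1, scores_2, set_serial=2)
print(targets)
```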
In one embodiment of the present disclosure, the acquiring module 601 is configured to: acquire, according to the target information, at least one initial keyword of the target information by using the first language model.
In one embodiment of the present disclosure, the input module 602 is configured to: inputting at least one initial keyword into a knowledge graph model, and obtaining an entity relation object with an association relation with each initial keyword; and taking each entity relation object and each initial keyword as target keywords.
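Keyword expansion through the knowledge graph can be sketched with a dictionary standing in for the knowledge graph model. The graph contents, the relation names and the function name `expand_keywords` are hypothetical; the sketch only shows that the target keywords are the initial keywords plus the entity relationship objects associated with them.

```python
def expand_keywords(initial: list[str], graph: dict[str, list[tuple[str, str]]]) -> list[str]:
    """Return the initial keywords plus every object related to them in the graph."""
    targets = list(initial)
    for keyword in initial:
        for _relation, obj in graph.get(keyword, []):
            if obj not in targets:  # avoid duplicate target keywords
                targets.append(obj)
    return targets

# Hypothetical graph: entity -> list of (relation, related object).
graph = {
    "SupplierA": [("subsidiary_of", "GroupB"), ("supplies", "steel")],
}

target_keywords = expand_keywords(["SupplierA", "steel"], graph)
print(target_keywords)
```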
The risk assessment device for order financing in the embodiment of the present disclosure acquires target information of an order to be financed and acquires at least one initial keyword of the target information; inputs the at least one initial keyword into a knowledge graph model to obtain at least one target keyword; performs news retrieval on the target database based on the at least one target keyword to obtain at least one first search news; and generates a risk assessment report of the order to be financed by using a first language model according to the target information, the at least one target keyword and the at least one first search news. Therefore, the risk assessment report of the order to be financed can be generated automatically, which reduces labor cost and effectively avoids inaccuracy in the report caused by human subjectivity.
Corresponding to the training method of the knowledge graph model provided in the embodiment of fig. 4, an embodiment of the present disclosure further provides a training device of the knowledge graph model. Since the device corresponds to that method, the implementations of the method also apply to the device and are not described in detail in this embodiment.
Fig. 7 is a schematic structural diagram of a training device for a knowledge-graph model according to a seventh embodiment of the disclosure.
As shown in fig. 7, the training apparatus 700 of the knowledge graph model includes: a first acquisition module 701, a second acquisition module 702, a third acquisition module 703, a fourth acquisition module 704, and a training module 705.
The first acquiring module 701 is configured to acquire first initial training data; wherein the first initial training data includes supply chain financial information and order financing information.
The second obtaining module 702 is configured to obtain a plurality of groups of first entity relationship triples using a third language model based on the first initial training data.
A third obtaining module 703, configured to obtain a plurality of sets of second entity relationship triples by using a relationship extraction model based on the first initial training data.
A fourth obtaining module 704, configured to obtain the first target training data according to the multiple sets of first entity relationship triples and the multiple sets of second entity relationship triples.
The training module 705 is configured to train the initial knowledge-graph model based on the first target training data, so as to obtain a trained knowledge-graph model.
In one embodiment of the present disclosure, the second obtaining module 702 is configured to: inputting the first initial training data into a third language model to extract at least one set of third entity-relationship triples from the first initial training data; wherein any third entity relationship triplet comprises two first entities; news searching is carried out on each first entity through a first search engine so as to obtain a plurality of first search news; and inputting each first search news into a third language model to obtain a plurality of groups of first entity relation triples.
In one embodiment of the present disclosure, the third obtaining module 703 is configured to: extracting entity relation triples from the first initial training data through a relation extraction model to obtain at least one group of fourth entity relation triples; wherein any fourth entity relationship triplet comprises two second entities; news searching is carried out on each second entity through a second search engine so as to obtain a plurality of second search news; and extracting entity relation triples of the second search news through the relation extraction model to obtain a plurality of groups of second entity relation triples.
In one embodiment of the present disclosure, the fourth obtaining module 704 is configured to: in response to at least one set of first target triples existing among the plurality of sets of first entity relationship triples and the plurality of sets of second entity relationship triples, determine any first target triplet as first target training data, wherein any first target triplet is both a first entity relationship triplet and a second entity relationship triplet; determine an entity relationship triplet that is only a first entity relationship triplet, and an entity relationship triplet that is only a second entity relationship triplet, as second target triples; and for any second target triplet, determine the second target triplet as first target training data in the case that verification of the second target triplet finds no error.
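The merging rule above maps naturally onto set operations: triples found by both extractors are accepted directly, while triples found by only one extractor are accepted only if they pass verification. In the sketch below the function name and the `verify` callback are hypothetical; the callback stands in for whatever check is applied to the second target triples.

```python
from typing import Callable

Triple = tuple[str, str, str]  # (head entity, relation, tail entity)

def build_training_triples(
    llm_triples: set[Triple],
    re_triples: set[Triple],
    verify: Callable[[Triple], bool],
) -> set[Triple]:
    """Merge triples from the language model and the relation extraction model."""
    first_targets = llm_triples & re_triples   # found by both: accepted directly
    second_targets = llm_triples ^ re_triples  # found by only one: must be verified
    return first_targets | {t for t in second_targets if verify(t)}

# Hypothetical extraction results.
llm = {("SupplierA", "supplies", "BuyerB"), ("SupplierA", "owned_by", "GroupC")}
rel = {("SupplierA", "supplies", "BuyerB"), ("BuyerB", "located_in", "Beijing")}

# Hypothetical verification that rejects one of the disputed triples.
def verify(triple: Triple) -> bool:
    return triple != ("SupplierA", "owned_by", "GroupC")

training = build_training_triples(llm, rel, verify)
print(sorted(training))
```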
The training device of the knowledge graph model of the embodiment of the disclosure obtains first initial training data; wherein the first initial training data includes supply chain financial information and order financing information; acquiring a plurality of groups of first entity relation triples by adopting a third language model based on the first initial training data; based on the first initial training data, a relation extraction model is adopted to obtain a plurality of groups of second entity relation triples; acquiring first target training data according to a plurality of groups of first entity relation triples and a plurality of groups of second entity relation triples; based on the first target training data, training the initial knowledge-graph model to obtain a trained knowledge-graph model. Therefore, training of the knowledge graph model can be achieved, and the prediction capability of the model on the entities in the entity relation triples is improved.
Corresponding to the training method of the language model provided in the embodiment of fig. 5, an embodiment of the present disclosure further provides a training device of the language model. Since the device corresponds to that method, the implementations of the method also apply to the device and are not described in detail in this embodiment.
Fig. 8 is a schematic structural diagram of a training device for language model according to an embodiment of the present disclosure.
As shown in fig. 8, the training apparatus 800 for language model includes: acquisition module 801, first training module 802, second training module 803, and third training module 804.
The acquiring module 801 is configured to acquire a self-supervised training data set, a positive and negative sample training data set, and an instruction fine-tuning training data set.
The first training module 802 is configured to perform a first training on the initial first language model based on the self-supervision training data set, to obtain a first trained first language model.
And the second training module 803 is configured to perform a second training on the first language model after the first training based on the positive and negative sample training data set, so as to obtain the first language model after the second training.
The third training module 804 is configured to perform third training on the second trained first language model based on the instruction fine tuning training data set, to obtain a trained first language model.
The training device of the language model of the embodiment of the disclosure acquires a self-supervised training data set, a positive and negative sample training data set and an instruction fine-tuning training data set; performs first training on the initial first language model based on the self-supervised training data set to obtain the first language model subjected to the first training; performs second training on the first language model subjected to the first training based on the positive and negative sample training data set to obtain the first language model subjected to the second training; and performs third training on the first language model subjected to the second training based on the instruction fine-tuning training data set to obtain a trained first language model. Thus, training of the first language model can be achieved.
In order to implement the foregoing embodiments, a ninth embodiment of the present disclosure further proposes an electronic device 900. As shown in fig. 9, the electronic device 900 includes at least one processor 901 and a memory 902 communicatively coupled to the at least one processor 901, the memory 902 storing instructions executable by the at least one processor 901, the instructions being executed by the at least one processor 901 to implement the risk assessment method for order financing according to an embodiment of the first aspect of the present disclosure.
To achieve the foregoing embodiments, the embodiments of the present disclosure further provide a non-transitory computer-readable storage medium storing computer instructions, where the computer instructions are configured to cause a computer to implement a risk assessment method for order financing according to an embodiment of the first aspect of the present disclosure, or a training method for a knowledge graph model according to an embodiment of the second aspect, or a training method for a language model according to an embodiment of the third aspect.
To achieve the above embodiments, the embodiments of the present disclosure further provide a computer program product, including a computer program, which when executed by a processor, implements a risk assessment method for order financing according to an embodiment of the first aspect of the present disclosure, or a training method for a knowledge graph model according to an embodiment of the second aspect, or a training method for a language model according to an embodiment of the third aspect.
In the description of the present disclosure, it should be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial", "circumferential", etc. indicate orientations or positional relationships based on those shown in the drawings. They are used merely for convenience in describing the present disclosure and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation or be configured and operated in a specific orientation; they therefore should not be construed as limiting the present disclosure.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present disclosure, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those embodiments or examples, provided that they do not contradict one another.
Although embodiments of the present disclosure have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the present disclosure, and that changes, modifications, substitutions, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the present disclosure.

Claims (6)

1. A method for risk assessment of order financing, the method comprising:
Acquiring target information of an order to be financed, and acquiring at least one initial keyword of the target information, wherein the target information comprises order information of the order to be financed, the order information of the order to be financed comprises an order contract and financing information, and the initial keyword comprises a provider name, a name of the provider's legal representative, a name of the provider's general manager, an industry to which the order to be financed belongs, a start time of the order to be financed and a deadline of the order to be financed;
Inputting the at least one initial keyword into a knowledge graph model to obtain at least one target keyword, wherein the at least one initial keyword is input into the knowledge graph model to obtain an entity relationship object with an association relationship with each initial keyword; taking each entity relation object and each initial keyword as the target keywords;
news searching is conducted on the target database based on the at least one target keyword to obtain at least one first search news, wherein the first search news comprises industry news and provider news;
according to the at least one first search news, a first language model is adopted to obtain a first reliability score corresponding to each first search news;
According to the at least one first retrieval news, a second language model is adopted to obtain a second reliability score corresponding to each first retrieval news;
For any first search news, carrying out weighted summation on the first reliability score and the second reliability score corresponding to the first search news to obtain the weight corresponding to the first search news;
Screening and obtaining target news from at least one first search news according to the weight corresponding to each first search news;
Splicing the target information, the at least one target keyword and the at least one target news which are all in the set format to obtain splicing information;
filling the spliced information into corresponding filling positions in a first prompt template to obtain first prompt information;
Compressing the first prompt information to obtain compressed information, wherein the order information is compressed when the spliced information in the first prompt information comprises order information, and the analyst evaluation report is compressed when the spliced information in the first prompt information comprises the analyst evaluation report, and at least one target news in the spliced information in the first prompt information is compressed;
inputting the compressed information into the first language model to obtain a risk assessment report output by the first language model;
the training method of the knowledge graph model comprises the following steps:
acquiring first initial training data; wherein the first initial training data includes supply chain financial information and order financing information;
inputting the first initial training data into a third language model to extract at least one set of third entity relationship triples from the first initial training data; wherein any one of the third entity relationship triples comprises two first entities;
news searching is carried out on each first entity through a first search engine so as to obtain a plurality of first search news;
Inputting each first search news to the third language model to obtain a plurality of groups of first entity relation triples;
extracting entity relation triples from the first initial training data through a relation extraction model to obtain at least one group of fourth entity relation triples; wherein any one of the fourth entity relationship triples comprises two second entities;
news searching is carried out on each second entity through a second search engine so as to obtain a plurality of second search news;
Extracting entity relation triples of the second search news through the relation extraction model to obtain a plurality of groups of second entity relation triples;
In response to at least one set of first target triples existing among the plurality of sets of first entity relationship triples and the plurality of sets of second entity relationship triples, determining any one of the first target triples as first target training data;
wherein any one of the first target triples is both the first entity relationship triplet and the second entity relationship triplet;
Determining an entity relationship triplet that is only the first entity relationship triplet and an entity relationship triplet that is only the second entity relationship triplet as a second target triplet;
for any second target triplet, determining the second target triplet as first target training data in the case that verification of the second target triplet finds no error;
training the initial knowledge-graph model based on the first target training data to obtain a trained knowledge-graph model;
The training method of the first language model comprises the following steps:
acquiring a self-supervision training data set, a positive and negative sample training data set and an instruction fine tuning training data set, wherein the positive and negative sample training data set comprises positive sample news and negative sample news;
Performing first training on an initial first language model based on the self-supervision training dataset to obtain a first trained first language model, wherein when the initial first language model is subjected to first training, loading the initial first language model, freezing bottom parameters of the initial first language model, directly inputting the self-supervision training dataset into the first language model frozen with the bottom parameters, so as to perform autoregressive training, and fine-tuning the initial first language model to obtain the first trained first language model;
Performing second training on the first language model subjected to the first training based on the positive and negative sample training data sets to obtain a first language model subjected to the second training;
Performing third training on the first language model subjected to the second training based on the instruction fine-tuning training data set to obtain a trained first language model;
Under the condition that the splicing information in the first prompt information comprises order information, compressing the order information comprises the following steps: when a first total number of character unit tokens included in the order information is greater than a first set number, compressing the order information to a second set number by adopting the first language model; and when the first total number of character unit tokens included in the order information is not greater than the first set number, not performing a compression operation on the order information;
Under the condition that the splicing information in the first prompt information comprises an analyst evaluation report, compressing the analyst evaluation report comprises the following steps:
When the second total number of character unit tokens included in the analyst evaluation report is greater than the third set number, compressing the analyst evaluation report to a fourth set number by adopting the first language model;
when the second total number of character unit tokens included in the analyst evaluation report is not greater than the third set number, not performing a compression operation on the analyst evaluation report;
compressing the at least one target news in the splicing information in the first prompt information comprises:
determining a maximum value of tokens that the first language model can accept, a first value of tokens in the target information in the splicing information, a second value of tokens in the splicing information other than those in the target information and in each target news, and a third value of tokens in the first prompt information other than those in the splicing information;
determining, according to the maximum value, the first value, the second value, and the third value, a fourth value of the remaining tokens that can be input to the first language model, wherein, assuming that the maximum value is n, the first value is k, the second value is q, and the third value is p, the fourth value m is determined according to the following formula:
m=n-p-q-k;
compressing the at least one target news in the splicing information so that the sum of the numbers of tokens in the compressed target news is not greater than the fourth value;
determining a fifth value of tokens in each target news in the splicing information, and summing the fifth values of all the target news to obtain a first sum y; assuming that the fourth value is m, when m < y < 10m, splicing all the target news to obtain first spliced news; dividing the first spliced news into […] first sub-news; for any first sub-news, compressing the first sub-news to a fifth set number of tokens, wherein the fifth set number is smaller than […]; merging the compressed first sub-news to obtain the at least one compressed target news;
when y > 10m, determining first-type news and second-type news in the at least one target news, wherein the first-type news may be enterprise news and/or creator news, and the second-type news may be industry news; for the first-type news, acquiring the weight (or priority) of each first-type news, and sorting the first-type news in descending order of weight to obtain a second sorted sequence; splicing the first-type news according to the second sorted sequence to obtain second spliced news; intercepting, from the start position of the second spliced news, first intercepted news containing 5m tokens; dividing the first intercepted news into 5 (= 5m/m) second sub-news; for any second sub-news, compressing the second sub-news to a sixth set number of tokens, wherein the sixth set number is smaller than […]; merging the compressed second sub-news to obtain compressed first-type news;
for the second-type news, acquiring the weight (or priority) of each second-type news, and sorting the second-type news in descending order of weight to obtain a third sorted sequence; splicing the second-type news according to the third sorted sequence to obtain third spliced news; intercepting, from the start position of the third spliced news, second intercepted news containing 5m tokens; dividing the second intercepted news into 5 (= 5m/m) third sub-news; for any third sub-news, compressing the third sub-news to a seventh set number of tokens, wherein the seventh set number is smaller than […]; and
merging the compressed third sub-news to obtain compressed second-type news.
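By way of illustration only, the token budget m = n - p - q - k and the y > 10m branch (sort by weight, splice, intercept 5m tokens, divide into sub-news of m tokens each) can be sketched as follows. The function names are hypothetical, whitespace splitting is an assumed stand-in for the model's tokenizer, and the final compression of each sub-news by the first language model is deliberately left out because the claim text does not give its target length.

```python
from typing import List, Tuple


def remaining_token_budget(n: int, k: int, q: int, p: int) -> int:
    """Fourth value m = n - p - q - k, as stated in the claim."""
    return n - p - q - k


def tokenize(text: str) -> List[str]:
    # Assumption: whitespace-separated words stand in for model tokens.
    return text.split()


def rank_truncate_split(
    news: List[Tuple[str, float]], m: int, parts: int = 5
) -> List[List[str]]:
    """Sort (text, weight) pairs by descending weight, splice them,
    keep the first parts * m tokens, and cut the result into chunks of m tokens."""
    if m <= 0:
        raise ValueError("no token budget left for news")
    ordered = sorted(news, key=lambda item: item[1], reverse=True)
    spliced = [tok for text, _ in ordered for tok in tokenize(text)]
    intercepted = spliced[: parts * m]  # "intercepted news" of at most 5m tokens
    return [intercepted[i:i + m] for i in range(0, len(intercepted), m)]
```

Each returned chunk corresponds to one sub-news that would then be passed to the language model for compression; if a news category holds fewer than 5m tokens, this sketch simply returns fewer chunks.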
2. The method of claim 1, wherein screening the target news from the at least one first search news according to the weight corresponding to each first search news comprises:
ranking the first search news in descending order of weight to obtain a first ranking sequence;
and determining the first search news whose sequence numbers in the first ranking sequence are smaller than a set sequence number as the target news.
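The screening in claim 2 amounts to a descending sort followed by a cut-off on the sequence number. A minimal sketch is given below, assuming 1-based sequence numbers and a hypothetical mixing coefficient `alpha` for the weighted summation of the two reliability scores; the claims do not specify either detail.

```python
from typing import Dict, List


def combined_weight(score_1: float, score_2: float, alpha: float = 0.5) -> float:
    # alpha is a hypothetical coefficient; the claims only state that the two
    # reliability scores are combined by weighted summation.
    return alpha * score_1 + (1.0 - alpha) * score_2


def select_target_news(weights: Dict[str, float], set_serial: int) -> List[str]:
    """Rank news identifiers by descending weight and keep those whose
    1-based sequence number is smaller than set_serial."""
    ranked = sorted(weights, key=lambda news_id: weights[news_id], reverse=True)
    return [
        news_id
        for serial, news_id in enumerate(ranked, start=1)
        if serial < set_serial
    ]
```

With a set sequence number of 3, for example, only the two highest-weighted news items are kept as target news.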
3. The method of claim 1, wherein obtaining the at least one initial keyword of the target information comprises:
acquiring the at least one initial keyword of the target information from the target information by using the first language model.
4. A risk assessment device for order financing, the device comprising:
an acquisition module, configured to acquire target information of an order to be financed and at least one initial keyword of the target information, wherein the target information comprises order information of the order to be financed, the order information comprises an order contract and financing information, and the initial keywords comprise a provider name, the name of the provider's legal representative, the name of the provider's general manager, the industry to which the order to be financed belongs, the start time of the order to be financed, and the end time of the order to be financed;
an input module, configured to input the at least one initial keyword into a knowledge graph model to obtain at least one target keyword, wherein the at least one initial keyword is input into the knowledge graph model to obtain entity relationship objects having an association relationship with each initial keyword, and each entity relationship object and each initial keyword are taken as the target keywords;
a retrieval module, configured to perform news retrieval on a target database based on the at least one target keyword to obtain at least one first search news, wherein the first search news comprises industry news and provider news;
a generation module, configured to: obtain, by using a first language model, a first reliability score corresponding to each first search news according to the at least one first search news;
obtain, by using a second language model, a second reliability score corresponding to each first search news according to the at least one first search news;
for any first search news, perform weighted summation on the first reliability score and the second reliability score corresponding to the first search news to obtain the weight corresponding to the first search news;
screen target news from the at least one first search news according to the weight corresponding to each first search news;
splice the target information, the at least one target keyword, and the at least one target news, all in a set format, to obtain splicing information;
fill the splicing information into the corresponding filling position in a first prompt template to obtain first prompt information;
compress the first prompt information to obtain compressed information, wherein the order information is compressed in the case that the splicing information in the first prompt information comprises order information, the analyst evaluation report is compressed in the case that the splicing information in the first prompt information comprises an analyst evaluation report, and the at least one target news in the splicing information in the first prompt information is compressed;
input the compressed information into the first language model to obtain a risk assessment report output by the first language model;
wherein a training device of the knowledge graph model comprises:
a first acquisition module, configured to acquire first initial training data, wherein the first initial training data comprises supply chain financial information and order financing information;
a second acquisition module, configured to: input the first initial training data into a third language model to extract at least one group of third entity relationship triples from the first initial training data, wherein any third entity relationship triple comprises two first entities;
perform a news search on each first entity through a first search engine to obtain a plurality of first search news;
input each first search news into the third language model to obtain a plurality of groups of first entity relationship triples;
a third acquisition module, configured to: perform entity relationship triple extraction on the first initial training data through a relation extraction model to obtain at least one group of fourth entity relationship triples, wherein any fourth entity relationship triple comprises two second entities;
perform a news search on each second entity through a second search engine to obtain a plurality of second search news;
perform entity relationship triple extraction on each second search news through the relation extraction model to obtain a plurality of groups of second entity relationship triples;
a fourth acquisition module, configured to: in response to at least one first target triple existing among the plurality of groups of first entity relationship triples and the plurality of groups of second entity relationship triples, determine each first target triple as first target training data, wherein each first target triple is both a first entity relationship triple and a second entity relationship triple;
determine an entity relationship triple that is only a first entity relationship triple, or only a second entity relationship triple, as a second target triple;
for any second target triple, in the case that the second target triple is verified to be correct, determine the verified second target triple as first target training data;
a training module, configured to train an initial knowledge graph model based on the first target training data to obtain a trained knowledge graph model;
wherein the training device of the first language model comprises:
an acquisition module, configured to acquire a self-supervised training dataset, a positive and negative sample training dataset, and an instruction fine-tuning training dataset, wherein the positive and negative sample training dataset comprises positive sample news and negative sample news;
a first training module, configured to perform first training on an initial first language model based on the self-supervised training dataset to obtain a first-trained first language model, wherein the first training comprises: loading the initial first language model, freezing the bottom-layer parameters of the initial first language model, and inputting the self-supervised training dataset directly into the first language model with the bottom-layer parameters frozen so as to perform autoregressive training, thereby fine-tuning the initial first language model and obtaining the first-trained first language model;
a second training module, configured to perform second training on the first-trained first language model based on the positive and negative sample training dataset to obtain a second-trained first language model;
a third training module, configured to perform third training on the second-trained first language model based on the instruction fine-tuning training dataset to obtain a trained first language model;
wherein, in the case that the splicing information in the first prompt information comprises order information, compressing the order information comprises: when a first total number of tokens (character units) in the order information is greater than a first set number, compressing the order information to a second set number of tokens by using the first language model; and when the first total number of tokens in the order information is not greater than the first set number, performing no compression operation on the order information;
in the case that the splicing information in the first prompt information comprises an analyst evaluation report, compressing the analyst evaluation report comprises:
when a second total number of tokens in the analyst evaluation report is greater than a third set number, compressing the analyst evaluation report to a fourth set number of tokens by using the first language model;
when the second total number of tokens in the analyst evaluation report is not greater than the third set number, performing no compression operation on the analyst evaluation report;
compressing the at least one target news in the splicing information in the first prompt information comprises:
determining a maximum value of tokens that the first language model can accept, a first value of tokens in the target information in the splicing information, a second value of tokens in the splicing information other than those in the target information and in each target news, and a third value of tokens in the first prompt information other than those in the splicing information;
determining, according to the maximum value, the first value, the second value, and the third value, a fourth value of the remaining tokens that can be input to the first language model, wherein, assuming that the maximum value is n, the first value is k, the second value is q, and the third value is p, the fourth value m is determined according to the following formula:
m=n-p-q-k;
compressing the at least one target news in the splicing information so that the sum of the numbers of tokens in the compressed target news is not greater than the fourth value;
determining a fifth value of tokens in each target news in the splicing information, and summing the fifth values of all the target news to obtain a first sum y; assuming that the fourth value is m, when m < y < 10m, splicing all the target news to obtain first spliced news; dividing the first spliced news into […] first sub-news; for any first sub-news, compressing the first sub-news to a fifth set number of tokens, wherein the fifth set number is smaller than […]; merging the compressed first sub-news to obtain the at least one compressed target news;
when y > 10m, determining first-type news and second-type news in the at least one target news, wherein the first-type news may be enterprise news and/or creator news, and the second-type news may be industry news; for the first-type news, acquiring the weight (or priority) of each first-type news, and sorting the first-type news in descending order of weight to obtain a second sorted sequence; splicing the first-type news according to the second sorted sequence to obtain second spliced news; intercepting, from the start position of the second spliced news, first intercepted news containing 5m tokens; dividing the first intercepted news into 5 (= 5m/m) second sub-news; for any second sub-news, compressing the second sub-news to a sixth set number of tokens, wherein the sixth set number is smaller than […]; merging the compressed second sub-news to obtain compressed first-type news;
for the second-type news, acquiring the weight (or priority) of each second-type news, and sorting the second-type news in descending order of weight to obtain a third sorted sequence; splicing the second-type news according to the third sorted sequence to obtain third spliced news; intercepting, from the start position of the third spliced news, second intercepted news containing 5m tokens; dividing the second intercepted news into 5 (= 5m/m) third sub-news;
for any third sub-news, compressing the third sub-news to a seventh set number of tokens, wherein the seventh set number is smaller than […]; and merging the compressed third sub-news to obtain compressed second-type news.
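The selection of first target training data performed by the fourth acquisition module can be read as a set operation over the triples produced by the two extraction routes: triples found by both routes are accepted directly, and triples found by only one route are accepted once verified. The sketch below is a simplified reading under that assumption; the `Triple` layout and the `is_verified` callback are hypothetical, since the claim does not say how verification is carried out.

```python
from typing import Callable, Iterable, Set, Tuple

# Hypothetical layout: (head entity, relation, tail entity).
Triple = Tuple[str, str, str]


def select_training_triples(
    first_triples: Iterable[Triple],
    second_triples: Iterable[Triple],
    is_verified: Callable[[Triple], bool],
) -> Set[Triple]:
    """Return the triples that would serve as first target training data."""
    first, second = set(first_triples), set(second_triples)
    first_targets = first & second   # found by both extraction routes
    second_targets = first ^ second  # found by exactly one route
    verified = {t for t in second_targets if is_verified(t)}
    return first_targets | verified
```

Here `first_triples` would come from the third language model and `second_triples` from the relation extraction model; the callback could wrap a manual review step or any other check.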
5. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
The processor executes computer-executable instructions stored by the memory to implement the method of any one of claims 1-3.
6. A computer-readable storage medium having computer-executable instructions stored therein, wherein the computer-executable instructions, when executed by a processor, implement the method of any one of claims 1-3.
CN202311423786.3A 2023-10-30 2023-10-30 Risk assessment method and model training method for order financing Active CN117495538B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311423786.3A CN117495538B (en) 2023-10-30 2023-10-30 Risk assessment method and model training method for order financing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311423786.3A CN117495538B (en) 2023-10-30 2023-10-30 Risk assessment method and model training method for order financing

Publications (2)

Publication Number Publication Date
CN117495538A (en) 2024-02-02
CN117495538B (en) 2024-08-13

Family

ID=89680800

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311423786.3A Active CN117495538B (en) 2023-10-30 2023-10-30 Risk assessment method and model training method for order financing

Country Status (1)

Country Link
CN (1) CN117495538B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118395456A (en) * 2024-06-24 2024-07-26 合肥天帷信息安全技术有限公司 Iso-insurance assessment risk analysis method, system and computer readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112231494A (en) * 2020-12-16 2021-01-15 完美世界(北京)软件科技发展有限公司 Information extraction method and device, electronic equipment and storage medium
CN112434812A (en) * 2020-11-26 2021-03-02 中山大学 Knowledge graph link prediction method and system based on dual quaternion
CN114519524A (en) * 2022-02-18 2022-05-20 平安国际智慧城市科技股份有限公司 Enterprise risk early warning method and device based on knowledge graph and storage medium
CN115456584A (en) * 2022-09-16 2022-12-09 深圳今日人才信息科技有限公司 Similar JD recall and recommendation method based on deep learning model and expert system
CN116910105A (en) * 2023-09-12 2023-10-20 成都瑞华康源科技有限公司 Medical information query system and method based on pre-training large model

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598702A (en) * 2020-04-14 2020-08-28 徐佳慧 Knowledge graph-based method for searching investment risk semantics
CN111967761B (en) * 2020-08-14 2024-04-02 国网数字科技控股有限公司 Knowledge graph-based monitoring and early warning method and device and electronic equipment
CN113537796A (en) * 2021-07-22 2021-10-22 大路网络科技有限公司 Enterprise risk assessment method, device and equipment
CN113792122A (en) * 2021-09-29 2021-12-14 中国银行股份有限公司 Method and device for extracting entity relationship, electronic equipment and storage medium
CN116955646A (en) * 2023-06-30 2023-10-27 腾讯科技(深圳)有限公司 Knowledge graph generation method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN117495538A (en) 2024-02-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant