CN113742495B - Rating feature weight determining method and device based on prediction model and electronic equipment - Google Patents

Rating feature weight determining method and device based on prediction model and electronic equipment

Info

Publication number
CN113742495B
CN113742495B CN202111043898.7A
Authority
CN
China
Prior art keywords
feature
entities
entity
target
prediction model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111043898.7A
Other languages
Chinese (zh)
Other versions
CN113742495A (en)
Inventor
舒畅
陈又新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202111043898.7A priority Critical patent/CN113742495B/en
Publication of CN113742495A publication Critical patent/CN113742495A/en
Application granted granted Critical
Publication of CN113742495B publication Critical patent/CN113742495B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a rating feature weight determining method and device based on a prediction model, and an electronic device, belonging to the technical field of artificial intelligence. The method comprises the following steps: acquiring a knowledge graph, wherein the knowledge graph is constructed from collected rating feature data and comprises a plurality of feature entities and the association relations among them, the feature entities including a plurality of target entities; generating associated vector features of each target entity according to the knowledge graph, the associated vector features being vector representations of feature entities that have an association relation with the target entity; and training a predetermined prediction model with the associated vector features of each target entity to obtain a trained prediction model, from which a weight matrix of the rating features is determined. Objective feature weight distribution is thus realized on the basis of a rating feature knowledge system, without intervention of human factors, which is conducive to the fairness of ESG rating.

Description

Rating feature weight determining method and device based on prediction model and electronic equipment
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a rating feature weight determining method and device based on a prediction model and electronic equipment.
Background
ESG ratings, i.e., Environmental, Social, and Governance ratings, provide intelligent tools and data support for integrated applications such as enterprise ESG risk control, model building, and portfolio management. In existing approaches, an enterprise ESG rating system covers three aspects: qualitative factors, quantitative factors, and negative behaviors and risks. Each aspect relies on characteristic indexes and corresponding weights summarized from human experience, which are then quantified by a model. As a result, the weight distribution of ESG rating characteristic indexes involves considerable subjective judgment and is not accurate enough.
Disclosure of Invention
The application provides a rating feature weight determining method and device based on a prediction model, and an electronic device, with the main aim of improving the accuracy of ESG rating.
To achieve the above object, an embodiment of the present application provides a method for determining a rating feature weight based on a prediction model, the method including the steps of:
acquiring a knowledge graph, wherein the knowledge graph is constructed according to the collected rating characteristic data, the knowledge graph comprises a plurality of characteristic entities and association relations among the characteristic entities, and the characteristic entities comprise a plurality of target entities; generating associated vector features of each target entity according to the knowledge graph, wherein the associated vector features are vector representations of feature entities with association relation with the target entity; training a predetermined prediction model by using the associated vector features of each target entity to obtain a trained prediction model; and determining a weight matrix of the rating feature according to the trained prediction model.
To achieve the above object, an embodiment of the present application further provides a rating feature weight determining apparatus based on a prediction model, where the apparatus includes:
the system comprises an acquisition module, a classification module and a classification module, wherein the acquisition module is used for acquiring a knowledge graph, the knowledge graph is constructed according to the acquired rating characteristic data, the knowledge graph comprises a plurality of characteristic entities and association relations among the characteristic entities, and the characteristic entities comprise a plurality of target entities;
the generation module is used for generating association vector features of each target entity according to the knowledge graph, wherein the association vector features are vector representations of feature entities with association relation with the target entity;
the training module is used for training a predetermined prediction model by utilizing the associated vector characteristics of each target entity to obtain a prediction model after training;
and the determining module is used for determining a weight matrix of the rating characteristic according to the trained prediction model.
To achieve the above object, an embodiment of the present application further proposes an electronic device comprising a memory, a processor, a program stored on the memory and executable on the processor, and a data bus for enabling connection and communication between the processor and the memory, wherein the program, when executed by the processor, implements the steps of the aforementioned method.
To achieve the above object, the present application provides a storage medium for computer-readable storage, the storage medium storing one or more programs executable by one or more processors to implement the steps of the foregoing method.
According to the rating feature weight determining method and device based on a prediction model and the electronic device provided herein, the knowledge graph is constructed from the collected rating feature data, so that each feature entity used for rating and its association relations can be effectively structured into an objective and transparent knowledge system. The plurality of feature entities includes a plurality of target entities. On this basis, by acquiring the knowledge graph, the feature entities associated with each target entity can be determined and the corresponding associated vector features generated; these associated vector features are used to train a predetermined prediction model, so that parameter learning of the prediction model combines each target entity with the feature associations of different feature entities, and the trained prediction model finally obtained can be used to determine a weight matrix of the rating features. The whole process therefore involves no intervention of human factors; instead, objective feature weight distribution is realized on the basis of a rating feature knowledge system, which is conducive to the accuracy of ESG rating.
Drawings
Fig. 1 is a block diagram of an electronic device to which an embodiment of the present application is applied.
Fig. 2 is a flowchart of a method for determining a rating feature weight based on a prediction model according to an embodiment of the present application.
FIG. 3 is a schematic diagram of a data structure of an ontology description model according to an embodiment of the present application.
Fig. 4 is a flowchart of a method for determining a rating feature weight based on a prediction model according to a second embodiment of the present application.
Fig. 5 is a block diagram of a rating feature weight determining apparatus based on a prediction model applied to an embodiment of the present application.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In the following description, suffixes such as "module", "component", or "unit" for representing elements are used only for facilitating the description of the present application, and have no specific meaning in themselves. Thus, "module," "component," or "unit" may be used in combination.
The application provides a rating feature weight determining method based on a prediction model, which is applied to electronic equipment. Referring to fig. 1, fig. 1 is a block diagram of an electronic device to which an embodiment of the present application is applied.
In this embodiment, the electronic device may be a terminal device having an operation function, such as a server, a smart phone, a tablet computer, a portable computer, or a desktop computer. The server may be an independent server, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms.
The electronic device includes: memory 11, processor 12, network interface 13, and data bus 14.
The memory 11 includes at least one type of readable storage medium, which may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card memory, or the like. In some embodiments, the readable storage medium may be an internal storage unit of an electronic device, such as a hard disk of the electronic device. In other embodiments, the readable storage medium may also be an external memory of the electronic device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. that are provided on the electronic device.
In the present embodiment, the readable storage medium of the memory 11 is generally used to store the rating feature weight determining program 10, various sample sets, a prediction model, and the like mounted on the electronic device. The memory 11 may also be used to temporarily store data that has been output or is to be output.
The processor 12 may in some embodiments be a central processing unit (Central Processing Unit, CPU), microprocessor or other data processing chip for running program code or processing data stored in the memory 11, e.g. executing a rating feature weight determining program or the like.
The network interface 13 may optionally comprise a standard wired interface, a wireless interface (e.g., WI-FI interface), typically used to establish a communication connection between the electronic device and other electronic devices.
The data bus 14 is used to enable connection communications between these components.
Fig. 1 shows only an electronic device having components 11-14, but it should be understood that not all of the illustrated components are required to be implemented and that more or fewer components may be implemented instead.
The following specifically describes a rating feature weight determining method based on a prediction model.
Example 1
As shown in fig. 2, fig. 2 is a flowchart of a method for determining a rating feature weight based on a prediction model according to an embodiment of the present application. Based on the electronic device shown in fig. 1, the processor 12 implements the following steps when executing the rating feature weight determining program 10 stored in the memory 11:
step S210: and obtaining a knowledge graph.
In the embodiment of the application, the knowledge graph is constructed according to the collected rating characteristic data. The rating feature data may include collected data obtained for a plurality of rating features, respectively, and the rating features represent feature types for ESG rating, which may be as shown in table 1, and the number and classification of the feature types are not particularly limited.
Table 1. Classification and association table of feature types
It can be understood that a rating feature belonging to the second subclass may be further subdivided, with the association relationship between each first subclass and second subclass and the association relationships between different second subclasses recorded; the number of subdivision levels is not particularly limited. For example, "management event" may be subdivided into second-subclass rating features subordinate to "management event", such as acquisition events and senior-executive resignation events; alternatively, "pollution event" may be subdivided into second-subclass rating features subordinate to "pollution event", such as pollution leakage events and illegal emission events.
In some alternative implementations, prior to step S210, the mining element may be determined based on a specified variety of rating features. The mining element may represent keywords involved in data collection for various rating features, and may specifically include keywords of the rating feature itself and keywords related to the rating feature (such as time, location, emission, penalty amount, investment target, etc.). And then, data acquisition is carried out according to the mining elements, so that rating characteristic data for constructing the knowledge graph is obtained. Optionally, the data acquisition mode may include, but is not limited to: collecting data from related notices of corporate announcements, financial news, etc. and other various internet sources using search engines or crawler technology; data querying and mining are performed from the specified ontology database and cloud database through data warehouse technology (extract transform load, ETL).
In this embodiment of the present application, the knowledge graph includes a plurality of feature entities and the association relationships between them, and the plurality of feature entities includes a plurality of target entities. Each feature entity corresponds to one rating feature and is specifically one piece of collected data of that rating feature; for example, for the rating feature "senior executives", three feature entities, Manager A, Manager B and Manager C, may be collected. A target entity is collected data of a target feature, i.e., a feature type, specified among the plurality of rating features, that relates to the purpose and object of the ESG rating. By way of example, if the method is applied to ESG ratings of various enterprises, the target feature may be the enterprise.
In some alternative implementations, before the mining elements are determined, a defined ontology description model may also be obtained. The ontology description model may be a data model that represents a plurality of rating features and the association relations between different rating features, forming a structured rating feature system. The ontology description model may adopt the Web Ontology Language (OWL), the Resource Description Framework (RDF), RDF Schema, or the like, without specific limitation. Taking fig. 3 as an example, fig. 3 is a schematic diagram of the data structure of an ontology description model in an embodiment of the present application. As shown in fig. 3, in an OWL-based ontology description model, rating features with association relations are connected by connecting edges, each connecting edge representing the type of the corresponding association relation; the types of association relations may include, but are not limited to, subordinate relations, inverse relations, symmetric relations, transitive relations, and other association relations with semantic labels. For example, the association type between the first subclass 11 (or the first subclass 12) and major class 1 is a subordinate type, and the association type between the first subclass 31 and the second subclass 32 may be an inverse relation type. The ontology description model can therefore provide rapid and flexible data modeling capability and realize efficient automatic reasoning.
In practical application, the ontology description model can be defined manually, and the mining elements are then determined based on the rating features contained in the ontology description model and the association relations between different rating features, so that data can be collected, the ontology description model instantiated, and the knowledge graph constructed. In one implementation, each major class may be used as a classification reference for its first subclasses, and data mining is performed on the major class to obtain the feature entities corresponding to each first subclass having a subordinate relationship with that major class; for example, data mining on major class 3 yields the feature entity E1 corresponding to the first subclass 32. Further, by way of example, mining the feature entity E1 in combination with the association relations of the first subclass 32 yields the feature entity E2 corresponding to the second subclass 321 and the feature entity E3 corresponding to the first subclass 31. Mining the feature entity E2 in combination with the association relations of the second subclass 321 yields the feature entity E4 corresponding to the first subclass 21 and the feature entity E5 corresponding to the second subclass 121. Mining the feature entity E5 in combination with the association relations of the second subclass 121 yields the feature entity E6 corresponding to the first subclass 12.
It can be appreciated that the association relationship between any two feature entities in the knowledge graph may be consistent with the association relationship between the rating features corresponding to the two feature entities. For example, if the first subclass 31 and the second subclass 321 are both directly connected to the first subclass 32, the above-mentioned feature entity E2 and feature entity E3 are both directly associated with the feature entity E1. The first subclass 21 and the second subclass 121 are both connected to the first subclass 32 through the second subclass 321, and then the above-mentioned feature entity E4 and feature entity E5 both have an association relationship with the feature entity E1 through the feature entity E2. The first sub-class 12 is connected with the first sub-class 32 sequentially through the second sub-class 121 and the second sub-class 321, and then the above-mentioned feature entity E6 has an association relationship with the feature entity E1 sequentially through the feature entity E5 and the feature entity E2.
Furthermore, the knowledge graph can be continuously updated and supplemented through multiple data acquisition, so that the structural system of the ontology description model is continuously perfected. For example, when a newly added feature entity exists in the knowledge graph and the rating feature corresponding to the newly added feature entity does not exist in the ontology description model, the rating feature corresponding to the newly added feature entity may be added to the ontology description model. When the association relationship exists between any two feature entities in the knowledge graph, but the association relationship between the rating feature corresponding to one feature entity and the rating feature corresponding to the other feature entity does not exist in the ontology description model, the association relationship can be established for the rating features corresponding to the two feature entities in the ontology description model.
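As an illustrative sketch only (the patent does not prescribe a storage format), the knowledge graph described above can be held as a set of (head feature entity, association relation, tail feature entity) triples and extended as newly collected feature entities arrive; the entity and relation names below are hypothetical.

```python
from collections import defaultdict

class KnowledgeGraph:
    """Triple store: (head feature entity, association relation, tail feature entity)."""

    def __init__(self):
        self.triples = set()
        self.neighbors = defaultdict(set)    # entity -> directly associated entities

    def add_triple(self, head, relation, tail):
        """Add an association relation; re-adding an existing triple is a no-op."""
        self.triples.add((head, relation, tail))
        self.neighbors[head].add(tail)
        self.neighbors[tail].add(head)

kg = KnowledgeGraph()
kg.add_triple("Enterprise_X", "invested_by", "Fund_1")       # hypothetical entities
kg.add_triple("Fund_1", "managed_by", "Fund_Manager_A")
print(kg.neighbors["Enterprise_X"])                          # {'Fund_1'}
```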
Step S220: and generating the association vector features of each target entity according to the knowledge graph.
In the embodiment of the application, the association vector feature is a vector representation of a feature entity having an association relationship with a target entity. Optionally, a predetermined encoder may be used to encode the feature entity having an association relationship with the target entity, so as to obtain an association vector feature, so as to facilitate analysis of association information of the target entity in the knowledge graph. The encoder may be a word2vec model, and the like, which is not limited thereto.
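The encoder is left open here, with word2vec mentioned as one option. A minimal sketch using the gensim 4.x Word2Vec API, treating sequences of entity identifiers as "sentences", might look as follows; the walk data, dimensions, and entity names are assumptions for illustration.

```python
from gensim.models import Word2Vec

# Hypothetical sequences of feature-entity identifiers sampled from the knowledge graph.
walks = [
    ["Enterprise_X", "Fund_1", "Fund_Manager_A"],
    ["Enterprise_X", "Manager_B", "Pollution_Event_1"],
]
model = Word2Vec(sentences=walks, vector_size=64, window=3, min_count=1, sg=1)
assoc_vector = model.wv["Fund_1"]    # vector representation of one associated feature entity
```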
Step S230: and training a predetermined prediction model by utilizing the associated vector features of each target entity to obtain a trained prediction model.
In the embodiment of the present application, the prediction model may adopt a neural network model (such as a convolutional neural network model or a recurrent neural network model), a Bayesian model, or an attention model, without limitation. Specifically, for each target entity, the associated vector features may include at least one association vector, each association vector corresponding to a different type of rating feature, so that after the associated vector features are input into the prediction model, the prediction model can allocate a weight to each association vector.
In some implementations, the associated vector features of each target entity may be respectively input into the predetermined prediction model for training. The training step of the prediction model may then be: obtain the training weight of each target entity as a validation set and the associated vector features of each target entity as a training set; train the predetermined prediction model with the training set to obtain the output of the prediction model for each associated vector feature; verify the accuracy of the corresponding output with the validation set; if the accuracy is greater than or equal to a preset accuracy, end the training, and if the accuracy is less than the preset accuracy, continue executing the training step. The preset accuracy is a parameter that needs to be set in advance and can be adjusted according to user requirements.
Alternatively, the training weights of the target entities may be obtained as follows: determine a ranking index, and acquire the ranking data corresponding to each target entity according to the ranking index. The ranking index may be, without limitation, the yield, any of the above-mentioned rating features, and the like. Training weights are then allocated to the target entities by combining the ranking data corresponding to each target entity. Specific allocation manners may include, but are not limited to, normalizing the ranking data corresponding to each target entity (for example, by linear-function normalization or zero-mean normalization) to obtain the training weight of each target entity.
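For illustration, a minimal sketch of the linear-function (min-max) normalization mentioned above, turning ranking data such as yields into training weights, could be the following (all values hypothetical):

```python
# Hypothetical ranking data (yields) for three target entities.
yields = {"Enterprise_X": 0.12, "Enterprise_Y": 0.05, "Enterprise_Z": 0.20}

lo, hi = min(yields.values()), max(yields.values())
train_weights = {name: (y - lo) / (hi - lo) for name, y in yields.items()}
print(train_weights)    # Enterprise_Z -> 1.0, Enterprise_Y -> 0.0, Enterprise_X ~ 0.467
```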
In other implementations, pairwise comparison processing may be performed on all target entities in the knowledge graph to obtain comparison results. For each comparison processing, a training ending condition is determined according to the comparison result, and the predetermined prediction model is trained with the associated vector features of the two target entities under comparison until the output of the prediction model meets the training ending condition. In this way, training of the prediction model can be guided by the comparison results of pairs of target entities, so that the prediction model continuously learns and optimizes its parameter configuration with outputs that satisfy the comparison results as the standard, and the actual comparison orderings of different target entities are fused into the weight distribution of the rating features.
Specifically, the comparison processing may refer to pairwise comparison of the comparison data of all target entities, where the comparison data may be data collected for a target entity according to a specified index, and the specified index may be the above-mentioned ranking index, without limitation.
Step S240: and determining a weight matrix of the rating feature according to the trained prediction model.
In the embodiment of the present application, the model parameters of the prediction model may represent weight distribution results for multiple types of rating features, so by obtaining the model parameters of the prediction model after training, a final weight matrix may be determined, and the sum of all elements included in the weight matrix is 1.
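The patent does not specify how this normalization constraint is enforced; one common choice (an assumption here, not mandated by the text) is to store unconstrained parameters in the model and apply a softmax when reading the weight matrix out, which guarantees non-negative elements summing to 1:

```python
import numpy as np

theta = np.array([1.2, 0.7, 0.1])                    # learned, unconstrained parameters (toy values)
weight_matrix = np.exp(theta) / np.exp(theta).sum()  # non-negative elements that sum to 1
print(weight_matrix, weight_matrix.sum())
```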
In practical application, for each target entity, the score of at least one target rating feature related to the target entity can also be obtained. After the weight matrix of the rating features is determined, the weight matrix is used to perform weighted calculation on the scores of the target rating features, so that the total score value of the target entity is obtained and the complete rating process is realized.
Therefore, by implementing the embodiment of the method, each characteristic entity and the association relation thereof for enterprise rating are effectively structured to form an objective transparent knowledge system, and the parameter learning of the prediction model is realized by combining the characteristic association of each target entity and different characteristic entities, so that the trained prediction model is finally obtained and used for determining the weight matrix of the rating characteristics. Therefore, the whole process has no intervention of human factors, but realizes objective characteristic weight distribution based on a rating characteristic knowledge system, and is beneficial to the accuracy and fairness of ESG rating.
Example two
As shown in fig. 4, fig. 4 is a flowchart of a method for determining a rating feature weight based on a prediction model according to a second embodiment of the present application. Based on the electronic device shown in fig. 1, the processor 12 implements the following steps when executing the rating feature weight determining program 10 stored in the memory 11:
step S410: and obtaining a knowledge graph.
Step S420: and obtaining the feature vector of each feature entity in the knowledge graph.
In some alternative embodiments, step S420 may specifically be:
and carrying out random walk based on the knowledge graph to obtain a plurality of triplets, wherein the triplets comprise two directly-related characteristic entities and an association relationship between the two directly-related characteristic entities. And substituting the triples into a predetermined objective function for training aiming at each triplet to obtain the respective feature vectors of the two feature entities included in the triples when the objective function is minimum, thereby realizing vectorization of the feature entities and the direct association relation thereof in the knowledge graph. Wherein the objective function satisfies:
f(h, t) = || v_h + v_r - v_t ||^2

wherein f(h, t) is the objective function, h and t are the two feature entities comprised by the triplet, v_h is the feature vector of h, v_t is the feature vector of t, and v_r is the feature vector of the association relationship r between h and t; v_h = h·M, v_t = t·M and v_r = r·M, where M is the training matrix corresponding to the association relationship between h and t.
Specifically, the random walk algorithm may employ node2vec, depth-first walk, or breadth-first walk, etc. It can be seen that minimizing the above objective function allows vector representations of two directly related feature entities to be close to each other in space after projection.
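A minimal sketch of one way to minimize the objective above for a single triplet, assuming PyTorch, toy dimensions, and a relation vector already living in the projected space; a full implementation would iterate over all triplets obtained from the random walks and typically add negative sampling.

```python
import torch

dim_e, dim_r = 64, 32
v_h = torch.randn(dim_e, requires_grad=True)        # raw representation of entity h
v_t = torch.randn(dim_e, requires_grad=True)        # raw representation of entity t
v_r = torch.randn(dim_r, requires_grad=True)        # vector of the association relationship
M = torch.randn(dim_e, dim_r, requires_grad=True)   # training matrix for this relationship

opt = torch.optim.SGD([v_h, v_t, v_r, M], lr=0.01)
for _ in range(200):
    opt.zero_grad()
    loss = ((v_h @ M + v_r - v_t @ M) ** 2).sum()   # || h*M + r - t*M ||^2
    loss.backward()
    opt.step()

h_vec, t_vec = (v_h @ M).detach(), (v_t @ M).detach()  # feature vectors at the (local) minimum
```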
In other optional embodiments, step S420 may specifically further use a translation-based model, such as a TransH model, a TransR model, or a TransD model, to obtain feature vectors of each feature entity in the knowledge-graph.
Step S430: and aiming at each target entity, acquiring m entity groups corresponding to the target entities from the knowledge graph.
In the embodiment of the present application, m is a positive integer. The m entity groups corresponding to a target entity all have an association relationship with the target entity, where the j-th entity group comprises feature entities that have an association relationship with the target entity through j-1 intermediate feature entities, j being a positive integer and j ∈ [1, m]. For example, the 1st entity group includes feature entities directly associated with the target entity, and the 2nd entity group includes feature entities associated with the target entity through one intermediate feature entity. It can be understood that the greater the value of j, the weaker the association between the j-th entity group and the target entity.
Continuing with the example of fig. 3 above: after the knowledge graph is constructed based on the ontology description model shown in fig. 3, assuming that the feature entity E1 corresponding to the first subclass 32 is a target entity, the feature entities E2 and E3 may be added to the 1st entity group of the target entity, the feature entities E4 and E5 to the 2nd entity group, and the feature entity E6 to the 3rd entity group.
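A minimal sketch of building the m entity groups by hop count, using hypothetical E1–E6 connections mirroring the Fig. 3 example; the adjacency structure is assumed, not taken from the patent.

```python
def entity_groups(neighbors, target, m):
    """Return m lists: the j-th list holds entities reached through j-1 intermediate entities."""
    groups, seen, frontier = [], {target}, [target]
    for _ in range(m):
        nxt = []
        for entity in frontier:
            for nb in neighbors.get(entity, ()):
                if nb not in seen:
                    seen.add(nb)
                    nxt.append(nb)
        groups.append(nxt)
        frontier = nxt
    return groups

# Hypothetical adjacency mirroring the E1-E6 example above.
neighbors = {"E1": {"E2", "E3"}, "E2": {"E1", "E4", "E5"}, "E5": {"E2", "E6"}}
print(entity_groups(neighbors, "E1", 3))   # [['E2', 'E3'], ['E4', 'E5'], ['E6']] (order may vary)
```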
Step S440: and aiming at each target entity, carrying out fusion processing on the feature vectors of all feature entities in the entity groups according to each entity group corresponding to the target entity to obtain fusion vectors.
Based on step S440, feature-vector fusion of the feature entities under the same hop count is achieved. Specifically, the fusion processing may include, but is not limited to: summing the feature vectors of all feature entities in the entity group; or summing the feature vectors of all feature entities in the entity group and then averaging them.
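A minimal sketch of the summation-and-average fusion for one entity group; the per-entity feature vectors are assumed to come from step S420.

```python
import numpy as np

def fuse(entity_group, vectors):
    """entity_group: list of entity ids; vectors: dict entity id -> feature vector (np.ndarray)."""
    stacked = np.stack([vectors[e] for e in entity_group])
    return stacked.mean(axis=0)      # summation followed by averaging over the group

vectors = {"E2": np.ones(4), "E3": np.zeros(4)}   # toy feature vectors
print(fuse(["E2", "E3"], vectors))                # [0.5 0.5 0.5 0.5]
```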
Step S450: for each target entity, respectively input the m fusion vectors corresponding to the target entity into a predetermined graph attention neural network for training, obtain the m association vectors corresponding to the target entity, and take the m association vectors as the associated vector features of the target entity.
In the embodiment of the present application, the graph attention neural network may specifically adopt a Transformer neural network model, but is not limited thereto. Based on the Transformer neural network model, the fusion vectors of different hop counts can be encoded by combining the statistics and semantic information associated with the target entity under different hop counts, so that the association vectors output by the graph attention neural network more accurately reflect the degree of association with the target entity.
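The patent's graph attention network is a trained Transformer-style model; the following is only a stripped-down, untrained scaled-dot-product-attention sketch showing how the m fusion vectors of a target entity could be re-weighted by their relevance to that entity. Building the query from the target entity's own vector is an assumption for illustration.

```python
import numpy as np

def attend(fusion_vectors, target_vector):
    """fusion_vectors: (m, d) array; target_vector: (d,) vector of the target entity."""
    d = fusion_vectors.shape[1]
    scores = fusion_vectors @ target_vector / np.sqrt(d)   # relevance of each hop to the target
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                                   # attention weights over the m hops
    return alpha[:, None] * fusion_vectors                 # m re-weighted association vectors

m, d = 3, 64
association_vectors = attend(np.random.randn(m, d), np.random.randn(d))
```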
Step S460: perform pairwise comparison processing on all target entities in the knowledge graph to obtain comparison results.
In the embodiment of the present application, the two target entities under comparison may include a first target entity and a second target entity. In an alternative embodiment, step S460 may specifically be: perform yield-based pairwise comparison processing on all target entities in the knowledge graph to obtain comparison results. Correspondingly, the training ending condition includes: if the comparison result indicates that the yield of the first target entity is greater than that of the second target entity, the output of the prediction model is greater than a preset output value; or, if the comparison result indicates that the yield of the first target entity is smaller than that of the second target entity, the output of the prediction model is smaller than the preset output value. The preset output value may be manually set and adjusted, for example to 0.5, without specific limitation. In this way, the target entities can also be dynamically compared and ranked by their yields, and the yield ranking of the target entities is further combined to guide the weight distribution of the rating features, so that the obtained weight distribution result meets investors' rating requirements for portfolio configuration and strategy updating.
In another alternative embodiment, a designated time may be determined, and step S460 may specifically be: perform pairwise comparison processing on all target entities in the knowledge graph based on the comparison data within the designated time, to obtain comparison results. Optionally, before step S440, for each target entity, feature entities that do not belong to the designated time may be removed from the m entity groups corresponding to the target entity to obtain m updated entity groups. Correspondingly, step S440 then specifically is: for each target entity, according to each updated entity group corresponding to the target entity, perform fusion processing on the feature vectors of all feature entities in the updated entity group to obtain the fusion vectors.
Each feature entity records a corresponding acquisition time; the designated time may be a designated time period, and whether each feature entity belongs to the designated time can be determined by comparing its acquisition time with the designated time. Therefore, by screening features within a specified time range, dynamic feature selection and weight updating can be realized while the knowledge graph is being constructed, perfected, and updated, which improves the timeliness of the weight distribution and allows data collected in key time periods to be flexibly incorporated to meet more diversified weight-distribution requirements.
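A minimal sketch of removing feature entities collected outside the designated time from the entity groups; the acquisition-time mapping and the time window are hypothetical.

```python
from datetime import date

def filter_by_time(entity_groups, collected_on, start, end):
    """Drop entities whose acquisition date falls outside [start, end]."""
    return [[e for e in group if start <= collected_on[e] <= end] for group in entity_groups]

collected_on = {"E2": date(2021, 3, 1), "E3": date(2020, 11, 5)}    # hypothetical acquisition times
updated = filter_by_time([["E2", "E3"]], collected_on, date(2021, 1, 1), date(2021, 6, 30))
print(updated)    # [['E2']]
```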
Step S470: for each comparison processing, determining a training ending condition according to the comparison result, and training a predetermined prediction model by using the association vector characteristics corresponding to each of the two target entities subjected to the comparison processing until the output of the prediction model meets the training ending condition.
In the embodiment of the present application, specifically, the predetermined prediction model may satisfy:
O = Σ_{i=1}^{m} w_i · f(x_i^(1), x_i^(2))

wherein i is a positive integer, O is the output of the prediction model, f is the predetermined prediction function, x_i^(1) is the i-th association vector corresponding to the first target entity, x_i^(2) is the i-th association vector corresponding to the second target entity, and w_i is the i-th element in the weight matrix of the rating features.
For example, assuming that m takes the value 3, the output of the prediction model is O = w_1·f(x_1^(1), x_1^(2)) + w_2·f(x_2^(1), x_2^(2)) + w_3·f(x_3^(1), x_3^(2)).
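A minimal sketch of this pairwise output and the training-ending check, assuming a hypothetical prediction function f (a sigmoid over the mean difference of the two association vectors); the actual f is whatever predetermined prediction function the model uses and is not fixed by this sketch.

```python
import numpy as np

def f(a, b):
    """Assumed pairwise prediction function: sigmoid of the mean difference."""
    return 1.0 / (1.0 + np.exp(-(a - b).mean()))

def predict(w, assoc_first, assoc_second):
    """w: (m,) weight matrix; assoc_*: (m, d) association vectors of the two target entities."""
    return sum(w[i] * f(assoc_first[i], assoc_second[i]) for i in range(len(w)))

m, d = 3, 64
w = np.array([0.5, 0.3, 0.2])                  # elements sum to 1
O = predict(w, np.random.randn(m, d), np.random.randn(m, d))

# Training-ending check derived from the comparison result (preset output value 0.5 assumed):
first_has_higher_yield = True
training_done = (O > 0.5) if first_has_higher_yield else (O < 0.5)
```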
Step S480: and determining a weight matrix of the rating feature according to the trained prediction model.
It can be appreciated that the specific implementation manner of steps S410, S470 and S480 in this embodiment may also refer to the descriptions of steps S210, S230 and S240 in the first embodiment, which are not repeated herein.
In the embodiment of the present application, the i-th element in the weight matrix may represent the weight value of the i-th-hop rating feature, where the i-th-hop rating feature satisfies: the feature entity corresponding to the i-th-hop rating feature has an association relationship with the target entity through i-1 intermediate feature entities in the knowledge graph. The weighted calculation of the scores of the target rating features with the weight matrix may then specifically be: for each target rating feature, determine the hop count w of the target rating feature relative to the target entity; according to the hop count w, acquire the w-th element of the weight matrix and multiply it by the score of the target rating feature to obtain a weighted score; finally, sum the weighted scores of all target rating features to obtain the total score value of the target entity.
Taking Tables 1 and 2 as an example, assume that the weight matrix W = [w_1 w_2 w_3] = [0.5 0.3 0.2]. Among the rating features shown in Table 1, the hop count corresponding to investment funds, senior executives, upstream industry chains, and business management events is 1; the hop count corresponding to pollution events is 2; and the hop count corresponding to legal persons, fund managers, and downstream industry chains is 3. The scores of each rating feature for a certain enterprise are shown in Table 2:
Table 2. Score table for an enterprise
Then, the total score value of the enterprise is Score = 46.7.
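A minimal sketch of the hop-weighted total score; the weight matrix is the [0.5, 0.3, 0.2] of the example above, while the per-feature hop counts and scores below are hypothetical examples rather than the values of Table 2, so the printed result differs from 46.7.

```python
W = [0.5, 0.3, 0.2]                              # weights of 1-hop, 2-hop and 3-hop rating features
features = {                                     # target rating feature -> (hop count, score); toy values
    "investment fund":           (1, 60),
    "pollution event":           (2, 40),
    "downstream industry chain": (3, 30),
}
total_score = sum(W[hop - 1] * score for hop, score in features.values())
print(total_score)                               # 0.5*60 + 0.3*40 + 0.2*30 = 48.0
```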
Therefore, by implementing the embodiment of the method, each characteristic entity and the association relation thereof for enterprise rating are effectively structured to form an objective transparent knowledge system, and the parameter learning of the prediction model is realized by combining the characteristic association of each target entity and different characteristic entities, so that the trained prediction model is finally obtained and used for determining the weight matrix of the rating characteristics. Therefore, the whole process does not have intervention of artificial factors, but realizes objective characteristic weight distribution based on a rating characteristic knowledge system, and is beneficial to the accuracy of ESG rating. In addition, the training of the prediction model can be guided by combining the comparison results of the two target entities, so that the prediction model continuously learns and optimizes the configuration of model parameters by taking the output of the comparison results as a standard, and the actual comparison sequences of different target entities are fused to perform the weight distribution of the rating characteristics.
The embodiment of the application also provides a rating characteristic weight determining device based on the prediction model. Referring to fig. 5, fig. 5 is a block diagram illustrating a rating feature weight determining apparatus based on a prediction model according to an embodiment of the present application. As shown in fig. 5, the prediction model-based rating feature weight determining apparatus 500 includes:
the obtaining module 510 is configured to obtain a knowledge graph, where the knowledge graph is constructed according to the collected rating feature data, the knowledge graph includes a plurality of feature entities and association relationships between the feature entities, and the feature entities include a plurality of target entities.
The generating module 520 is configured to generate, according to the knowledge graph, an association vector feature of each target entity, where the association vector feature is a vector representation of a feature entity that has an association relationship with the target entity.
The training module 530 is configured to train a predetermined prediction model by using the associated vector features of each target entity, so as to obtain a trained prediction model.
A determining module 540, configured to determine a weight matrix of the rating feature according to the trained prediction model.
It should be noted that, the specific implementation process of the present embodiment may refer to the specific implementation process described in the foregoing method embodiment, and will not be described again.
In addition, the embodiment of the application can acquire and process related data based on artificial intelligence technology. Among these, artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and extend human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings, and are not thereby limiting the scope of the claims of the present application. Any modifications, equivalent substitutions and improvements made by those skilled in the art without departing from the scope and spirit of the present application shall fall within the scope of the claims of the present application.

Claims (7)

1. A method for determining a rating feature weight based on a predictive model, the method comprising:
acquiring a knowledge graph, wherein the knowledge graph is constructed according to collected rating feature data, the knowledge graph comprises a plurality of feature entities and association relations among the feature entities, the feature entities comprise a plurality of target entities, the rating feature data comprise a plurality of rating features, each feature entity corresponds to one rating feature, and the target entities are enterprises;
generating associated vector features of each target entity according to the knowledge graph, wherein the associated vector features are vector representations of feature entities with association relation with the target entity;
training a predetermined prediction model by using the associated vector features of each target entity to obtain a trained prediction model;
determining a weight matrix of the rating feature according to the trained prediction model;
the training of the predetermined prediction model by using the associated vector features of each target entity to obtain a trained prediction model comprises the following steps:
performing pairwise comparison processing on all target entities in the knowledge graph to obtain a comparison result;
for each comparison processing, determining a training ending condition according to the comparison result, and training the predetermined prediction model by using the associated vector features corresponding to the two target entities under comparison until the output of the prediction model meets the training ending condition; wherein the associated vector features comprise m association vectors, the two target entities under comparison comprise a first target entity and a second target entity, and the predetermined prediction model satisfies:
O = Σ_{i=1}^{m} w_i · f(x_i^(1), x_i^(2))
wherein m is a positive integer, i is a positive integer, O is the output of the prediction model, f is the prediction function, x_i^(1) is the i-th association vector corresponding to the first target entity, x_i^(2) is the i-th association vector corresponding to the second target entity, and w_i is the i-th element in the weight matrix of the rating features;
and the performing of pairwise comparison processing on all the target entities in the knowledge graph to obtain a comparison result comprises:
performing yield-based pairwise comparison processing on all the target entities in the knowledge graph to obtain a comparison result;
the training ending conditions include:
if the comparison result indicates that the yield of the first target entity is greater than that of the second target entity, the output of the prediction model is greater than a preset output value;
or,
and if the comparison result indicates that the yield of the first target entity is smaller than that of the second target entity, the output of the prediction model is smaller than a preset output value.
2. The method of claim 1, wherein the relevance vector feature includes m relevance vectors, m being a positive integer, the method further comprising:
acquiring feature vectors of all feature entities in the knowledge graph;
and generating the association vector features of each target entity according to the knowledge graph, wherein the association vector features comprise the following steps:
for each target entity, obtaining m entity groups corresponding to the target entity from the knowledge graph, wherein the j-th entity group comprises feature entities that have an association relationship with the target entity through j-1 intermediate feature entities, j being a positive integer and j ∈ [1, m];
aiming at each target entity, carrying out fusion processing on feature vectors of all feature entities in the entity groups according to each entity group corresponding to the target entity to obtain fusion vectors;
and for each target entity, respectively inputting the m fusion vectors corresponding to the target entity into a predetermined graph attention neural network for training, to obtain the m association vectors corresponding to the target entity.
3. The method according to claim 2, wherein, for each target entity, before performing fusion processing on feature vectors of all feature entities in the entity group according to each entity group corresponding to the target entity, the method further includes:
determining a designated time, and removing feature entities which do not belong to the designated time from m entity groups corresponding to the target entities for each target entity to obtain m updated entity groups;
and for each target entity, according to each entity group corresponding to the target entity, performing fusion processing on feature vectors of all feature entities in the entity group to obtain fusion vectors, including:
and aiming at each target entity, carrying out fusion processing on the feature vectors of all feature entities in the updated entity group according to each updated entity group corresponding to the target entity to obtain fusion vectors.
4. The method according to claim 2, wherein the obtaining feature vectors of each feature entity in the knowledge-graph includes:
performing random walk based on the knowledge graph to obtain a plurality of triplets, wherein the triplets comprise two directly related characteristic entities and an association relationship between the two directly related characteristic entities;
for each triplet, substituting the triplet into a predetermined objective function for training, to obtain the respective feature vectors of the two feature entities comprised by the triplet when the objective function is minimum, the objective function satisfying:
f(h, t) = || v_h + v_r - v_t ||^2
wherein f(h, t) is the objective function, h and t are the two feature entities comprised by the triplet, v_h is the feature vector of h, v_t is the feature vector of t, and v_r is the feature vector of the association relationship between h and t.
5. A predictive model-based rating feature weight determination apparatus, the apparatus comprising:
the system comprises an acquisition module, a classification module and a classification module, wherein the acquisition module is used for acquiring a knowledge graph, the knowledge graph is constructed according to collected rating feature data, the knowledge graph comprises a plurality of feature entities and association relations among the feature entities, the feature entities comprise a plurality of target entities, the rating feature data comprise a plurality of rating features, each feature entity corresponds to one type of rating feature, and the target entities are enterprises;
the generation module is used for generating association vector features of each target entity according to the knowledge graph, wherein the association vector features are vector representations of feature entities with association relation with the target entity;
the training module is used for training a predetermined prediction model by utilizing the associated vector characteristics of each target entity to obtain a prediction model after training;
the determining module is used for determining a weight matrix of the rating feature according to the trained prediction model;
the training of the predetermined prediction model by using the associated vector features of each target entity to obtain a trained prediction model comprises the following steps:
performing pairwise comparison processing on all target entities in the knowledge graph to obtain a comparison result;
for each comparison processing, determining a training ending condition according to the comparison result, and training the predetermined prediction model by using the associated vector features corresponding to the two target entities under comparison until the output of the prediction model meets the training ending condition; wherein the associated vector features comprise m association vectors, the two target entities under comparison comprise a first target entity and a second target entity, and the predetermined prediction model satisfies:
O = Σ_{i=1}^{m} w_i · f(x_i^(1), x_i^(2))
wherein m is a positive integer, i is a positive integer, O is the output of the prediction model, f is the prediction function, x_i^(1) is the i-th association vector corresponding to the first target entity, x_i^(2) is the i-th association vector corresponding to the second target entity, and w_i is the i-th element in the weight matrix of the rating features;
and the performing of pairwise comparison processing on all the target entities in the knowledge graph to obtain a comparison result comprises:
performing yield-based pairwise comparison processing on all the target entities in the knowledge graph to obtain a comparison result;
the training ending conditions include:
if the comparison result indicates that the yield of the first target entity is greater than that of the second target entity, the output of the prediction model is greater than a preset output value;
or,
and if the comparison result indicates that the yield of the first target entity is smaller than that of the second target entity, the output of the prediction model is smaller than a preset output value.
6. An electronic device comprising a memory, a processor, a program stored on the memory and executable on the processor, and a data bus for enabling a connection communication between the processor and the memory, the program when executed by the processor implementing the steps of the predictive model based rating feature weight determination method according to any of claims 1 to 4.
7. A storage medium for computer readable storage, wherein the storage medium stores one or more programs executable by one or more processors to implement the steps of the predictive model-based rating feature weight determination method of any of claims 1 to 4.
CN202111043898.7A 2021-09-07 2021-09-07 Rating feature weight determining method and device based on prediction model and electronic equipment Active CN113742495B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111043898.7A CN113742495B (en) 2021-09-07 2021-09-07 Rating feature weight determining method and device based on prediction model and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111043898.7A CN113742495B (en) 2021-09-07 2021-09-07 Rating feature weight determining method and device based on prediction model and electronic equipment

Publications (2)

Publication Number Publication Date
CN113742495A (en) 2021-12-03
CN113742495B (en) 2024-02-23

Family

ID=78736543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111043898.7A Active CN113742495B (en) 2021-09-07 2021-09-07 Rating feature weight determining method and device based on prediction model and electronic equipment

Country Status (1)

Country Link
CN (1) CN113742495B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110309234A (en) * 2019-06-14 2019-10-08 广发证券股份有限公司 A kind of client of knowledge based map holds position method for early warning, device and storage medium
CN111143672A (en) * 2019-12-16 2020-05-12 华南理工大学 Expert specialty scholars recommendation method based on knowledge graph
CN111797406A (en) * 2020-07-15 2020-10-20 智博云信息科技(广州)有限公司 Medical fund data analysis processing method and device and readable storage medium
CN111897970A (en) * 2020-07-27 2020-11-06 平安科技(深圳)有限公司 Text comparison method, device and equipment based on knowledge graph and storage medium
WO2020232879A1 (en) * 2019-05-20 2020-11-26 平安科技(深圳)有限公司 Risk conduction association map optimization method and apparatus, computer device and storage medium
CN112613762A (en) * 2020-12-25 2021-04-06 北京知因智慧科技有限公司 Knowledge graph-based group rating method and device and electronic equipment
CN113095697A (en) * 2021-04-20 2021-07-09 华南师范大学 Urban marginal zone three-generation space evaluation analysis method, system, equipment and medium
WO2021139283A1 (en) * 2020-06-16 2021-07-15 平安科技(深圳)有限公司 Knowledge graph question-answer method and apparatus based on deep learning technology, and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9047628B2 (en) * 2013-03-13 2015-06-02 Northeastern University Systems and methods for securing online content ratings
US10152478B2 (en) * 2015-06-07 2018-12-11 Apple Inc. Apparatus, system and method for string disambiguation and entity ranking

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020232879A1 (en) * 2019-05-20 2020-11-26 平安科技(深圳)有限公司 Risk conduction association map optimization method and apparatus, computer device and storage medium
CN110309234A (en) * 2019-06-14 2019-10-08 广发证券股份有限公司 A kind of client of knowledge based map holds position method for early warning, device and storage medium
CN111143672A (en) * 2019-12-16 2020-05-12 华南理工大学 Expert specialty scholars recommendation method based on knowledge graph
WO2021139283A1 (en) * 2020-06-16 2021-07-15 平安科技(深圳)有限公司 Knowledge graph question-answer method and apparatus based on deep learning technology, and device
CN111797406A (en) * 2020-07-15 2020-10-20 智博云信息科技(广州)有限公司 Medical fund data analysis processing method and device and readable storage medium
CN111897970A (en) * 2020-07-27 2020-11-06 平安科技(深圳)有限公司 Text comparison method, device and equipment based on knowledge graph and storage medium
CN112613762A (en) * 2020-12-25 2021-04-06 北京知因智慧科技有限公司 Knowledge graph-based group rating method and device and electronic equipment
CN113095697A (en) * 2021-04-20 2021-07-09 华南师范大学 Urban marginal zone three-generation space evaluation analysis method, system, equipment and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
信用评级系统的设计与开发 (Design and Development of a Credit Rating System); 蔡玉宝, 左春, 张正; 计算机工程与设计 (Computer Engineering and Design); 2008-04-16 (07); full text *

Also Published As

Publication number Publication date
CN113742495A (en) 2021-12-03

Similar Documents

Publication Publication Date Title
JP2021504789A (en) ESG-based corporate evaluation execution device and its operation method
CN111222305A (en) Information structuring method and device
CN108734591A (en) Cheat appraisal procedure, device, storage medium and the terminal of case
CN113011889B (en) Account anomaly identification method, system, device, equipment and medium
CN110309234B (en) Knowledge graph-based customer warehouse-holding early warning method and device and storage medium
CN111198970A (en) Resume matching method and device, electronic equipment and storage medium
CN112905868A (en) Event extraction method, device, equipment and storage medium
CN111199469A (en) User payment model generation method and device and electronic equipment
CN110310012B (en) Data analysis method, device, equipment and computer readable storage medium
CN109284500A (en) Information transmission system and method based on merchants inviting work process and reading preference
CN117520503A (en) Financial customer service dialogue generation method, device, equipment and medium based on LLM model
Dai et al. Study of online learning resource recommendation based on improved BP neural network
CN111209403B (en) Data processing method, device, medium and electronic equipment
CN113487241A (en) Method, device, equipment and storage medium for classifying enterprise environment-friendly credit grades
CN113742495B (en) Rating feature weight determining method and device based on prediction model and electronic equipment
CN113222471B (en) Asset wind control method and device based on new media data
CN115618297A (en) Method and device for identifying abnormal enterprise
CN114268625B (en) Feature selection method, device, equipment and storage medium
CN109308565B (en) Crowd performance grade identification method and device, storage medium and computer equipment
CN114741515A (en) Social network user attribute prediction method and system based on graph generation
CN111080463B (en) Key communication node identification method, device and medium
CN113850483A (en) Enterprise credit risk rating system
Thangarasu et al. Detection of Cyberbullying Tweets in Twitter Media Using Random Forest Classification
CN111143533A (en) Customer service method and system based on user behavior data
Song et al. TINet: multi-dimensional traffic data imputation via transformer network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant