CN110956224B

CN110956224B - Evaluation model generation and evaluation data processing method, device, equipment and medium

Info

Publication number: CN110956224B
Application number: CN201911398404.XA
Authority: CN
Inventors: 陈娴娴; 阮晓雯; 徐亮
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2019-08-01
Filing date: 2019-12-30
Publication date: 2024-03-08
Anticipated expiration: 2039-12-30
Also published as: CN110956224A

Abstract

The invention discloses a method, device, equipment and medium for evaluation model generation and evaluation data processing. The method comprises the following steps: acquiring a first target feature data set; inputting the first target feature data set into a multilayer perceptron containing initial parameters for processing, and inputting a first output result of the multilayer perceptron into a logistic regression model to obtain a second output result; inputting first probability evaluation data in a standard result and second probability evaluation data in the second output result into a loss function corresponding to the logistic regression model to obtain a loss value, and judging whether the loss value is smaller than or equal to a preset loss value; and if the loss value is smaller than or equal to the preset loss value, confirming that training of the evaluation model integrating the multilayer perceptron and the logistic regression model is completed. The evaluation model enables accurate and efficient probability evaluation of the characteristics of the item to be evaluated.

Description

Evaluation model generation and evaluation data processing method, device, equipment and medium

Technical Field

The present invention relates to the field of prediction models, and in particular, to a method, apparatus, device, and medium for generating an evaluation model and processing evaluation data.

Background

At present, with the development of science and technology, vehicles have become a common means of transport for households, and users place ever higher requirements on vehicle safety and comfort. Both therefore need to be evaluated when a vehicle is manufactured and used, so that safety and comfort are ensured at the same time. In the prior art, the safety and comfort of a vehicle are usually evaluated manually from a massive number of vehicle parameters. This is inefficient and subject to the evaluator's subjective judgment, so evaluation accuracy is low and a cliff effect easily arises between comfort and safety: vehicle manufacturing leans heavily toward one of the two, the balance between them cannot be guaranteed, the user experience is very poor, and the safety of the vehicle's users is threatened to some degree. Moreover, when the evaluator is not a technician who knows the vehicle well, evaluation efficiency and accuracy are lower still. A technical solution is therefore needed to solve the above problems of low evaluation efficiency and low evaluation accuracy.

Disclosure of Invention

Based on the foregoing, it is necessary to provide an evaluation model generation method, an evaluation data processing method, an apparatus, a device, and a medium, by which an evaluation model is trained, and by which an item to be evaluated is evaluated, so as to evaluate the characteristics of the item to be evaluated with high efficiency and accuracy.

An evaluation model generation method, comprising:

acquiring a first target feature data set; the first target feature data set is associated with a standard result; the standard result comprises first probability evaluation data for evaluating the characteristics of an evaluation sample corresponding to the first target characteristic data set in advance;

inputting the first target feature data set into a multilayer perceptron containing initial parameters for processing, and inputting a first output result of the multilayer perceptron containing the initial parameters into a logistic regression model to obtain a second output result; the second output result comprises second probability evaluation data of the characteristics of the evaluation sample corresponding to the first target feature data set;

respectively inputting the first probability evaluation data in the standard result and the second probability evaluation data in the second output result into a loss function corresponding to the logistic regression model for operation, obtaining a loss value output by the loss function, and judging whether the loss value is smaller than or equal to a preset loss value;

and if the loss value is smaller than or equal to the preset loss value, confirming that training of the evaluation model integrating the multilayer perceptron and the logistic regression model is completed.
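The four steps above can be sketched as a small training loop. This is only an illustrative sketch under stated assumptions, not the patented implementation: the single tanh hidden layer, the binary cross-entropy loss, the learning rate, the toy samples and the names `forward`, `cross_entropy` and `train` are all hypothetical choices that the text does not specify.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, params):
    """One tanh hidden layer (assumed) followed by a logistic-regression head."""
    W1, b1, w2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    p = sigmoid(sum(w * hi for w, hi in zip(w2, h)) + b2)
    return h, p

def cross_entropy(y, p, eps=1e-12):
    """Binary cross-entropy between the standard probability y and prediction p."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def train(samples, hidden=4, lr=0.5, preset_loss=0.2, max_epochs=5000, seed=0):
    """Train until the mean loss is <= preset_loss or max_epochs is reached."""
    rng = random.Random(seed)
    n = len(samples[0][0])
    W1 = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    loss = float("inf")
    for _ in range(max_epochs):
        loss = 0.0
        for x, y in samples:
            h, p = forward(x, (W1, b1, w2, b2))
            loss += cross_entropy(y, p)  # loss measured before this update
            d_out = p - y  # gradient w.r.t. the head's pre-activation
            for j in range(hidden):
                d_h = d_out * w2[j] * (1.0 - h[j] ** 2)  # back-propagate through tanh
                w2[j] -= lr * d_out * h[j]
                for i in range(n):
                    W1[j][i] -= lr * d_h * x[i]
                b1[j] -= lr * d_h
            b2 -= lr * d_out
        loss /= len(samples)
        if loss <= preset_loss:  # the stopping test described in the method
            return (W1, b1, w2, b2), loss, True
    return (W1, b1, w2, b2), loss, False

# Toy feature vectors with hypothetical standard results (1.0 = "safe").
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
params, final_loss, done = train(samples)
```

The returned flag distinguishes the case in which the loss reached the preset value from the case in which the epoch budget ran out first; the budget itself is an added safeguard that the method does not mention.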

An evaluation data processing method comprising:

acquiring a second evaluation tag corresponding to the item to be evaluated;

acquiring all second image dimensions of the second evaluation labels from a preset knowledge graph, and collecting second image dimension data corresponding to various second image dimensions of each second evaluation label;

performing data cleaning on the second image dimension data corresponding to various second image dimensions of the second evaluation labels, performing feature engineering on the second image dimension data after data cleaning to extract second target features in the second image dimension data, and generating a second target feature data set corresponding to the second evaluation labels according to the second target features extracted from the second image dimension data;

inputting the second target characteristic data set into a preset evaluation model to obtain a probability evaluation result of the characteristics of the item to be evaluated; the evaluation model is obtained by the evaluation model generation method.

An evaluation model generation apparatus comprising:

the acquisition module is used for acquiring a first target characteristic data set; the first target feature data set is associated with a standard result; the standard result comprises first probability evaluation data for evaluating the characteristics of an evaluation sample corresponding to the first target characteristic data set in advance;

the input module is used for inputting the first target feature data set into the multilayer perceptron containing the initial parameters for processing, and inputting a first output result of the multilayer perceptron containing the initial parameters into a logistic regression model to obtain a second output result; the second output result comprises second probability evaluation data of the characteristics of the evaluation sample corresponding to the first target feature data set;

the judging module is used for respectively inputting the first probability evaluation data in the standard result and the second probability evaluation data in the second output result into a loss function corresponding to the logistic regression model for operation, obtaining a loss value output by the loss function, and judging whether the loss value is smaller than or equal to a preset loss value;

and the confirming module is used for confirming that training of the evaluation model integrating the multilayer perceptron and the logistic regression model is completed if the loss value is smaller than or equal to the preset loss value.

An evaluation data processing apparatus comprising:

the second evaluation tag acquisition module is used for acquiring a second evaluation tag corresponding to the item to be evaluated;

the second image dimension data collection module is used for obtaining all second image dimensions of the second evaluation labels from a preset knowledge graph and collecting second image dimension data corresponding to various second image dimensions of each second evaluation label;

the second target feature data set generating module is used for carrying out data cleaning on the second image dimension data corresponding to the second image dimensions of each type of the second evaluation labels, carrying out feature engineering on the second image dimension data after data cleaning to extract second target features in the second image dimension data, and generating the second target feature data set corresponding to the second evaluation labels according to the second target features extracted from the second image dimension data;

the second target characteristic data set input module is used for inputting the second target characteristic data set into a preset evaluation model to obtain a probability evaluation result of the characteristics of the item to be evaluated; wherein the evaluation model is obtained by the evaluation model generation method according to any one of claims 1 to 5.

A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the above-mentioned evaluation model generation method when executing the computer program.

A computer readable storage medium storing a computer program which, when executed by a processor, implements the above-described evaluation model generation method.

In the evaluation model generation method, a first target feature data set is acquired; the first target feature data set is associated with a standard result; the standard result comprises first probability evaluation data obtained by evaluating in advance the characteristics of an evaluation sample corresponding to the first target feature data set; the first target feature data set is input into a multilayer perceptron containing initial parameters for processing, and a first output result of the multilayer perceptron containing the initial parameters is input into a logistic regression model to obtain a second output result; the second output result comprises second probability evaluation data of the characteristics of the evaluation sample corresponding to the first target feature data set; the first probability evaluation data in the standard result and the second probability evaluation data in the second output result are respectively input into a loss function corresponding to the logistic regression model for operation, a loss value output by the loss function is obtained, and whether the loss value is smaller than or equal to a preset loss value is judged; and if the loss value is smaller than or equal to the preset loss value, training of the evaluation model integrating the multilayer perceptron and the logistic regression model is confirmed to be completed.

In the evaluation data processing method, a second evaluation tag corresponding to the item to be evaluated is obtained; acquiring all second image dimensions of the second evaluation labels from a preset knowledge graph, and collecting second image dimension data corresponding to various second image dimensions of each second evaluation label; performing data cleaning on the second image dimension data corresponding to various second image dimensions of the second evaluation labels, performing feature engineering on the second image dimension data after data cleaning to extract second target features in the second image dimension data, and generating a second target feature data set corresponding to the second evaluation labels according to the second target features extracted from the second image dimension data; and inputting the second target characteristic data set into a preset evaluation model to obtain a probability evaluation result of the characteristics of the item to be evaluated. According to the invention, the second target feature set is input into the evaluation model obtained by the evaluation model generation method, so that a probability evaluation result for efficiently and accurately evaluating the characteristics of the item to be evaluated can be obtained.

The evaluation model trained by the invention is an integrated model of the multilayer perceptron and the logistic regression model. Because the hidden layers of the multilayer perceptron carry more comprehensive information, passing that information (namely, the first output result output by the fully connected layer) directly to the logistic regression model through the fully connected layer retains the input information of the logistic regression model to the greatest extent; the logistic regression model is also one of the most computationally efficient multi-class (including binary) classification models. Therefore, the evaluation model integrating the multilayer perceptron and the logistic regression model can intelligently and efficiently evaluate the item to be evaluated to determine its characteristics, which helps prevent a cliff effect among those characteristics and allows them to be evaluated efficiently and accurately.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic view of an application environment of an evaluation model generation method or an evaluation data processing method according to an embodiment of the present invention;

FIG. 2 is a flowchart of a method for generating an evaluation model according to an embodiment of the present invention;

FIG. 3 is a flowchart illustrating a method for generating an evaluation model according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of an evaluation model generating apparatus according to an embodiment of the present invention;

FIG. 5 is a flow chart of a method for evaluating data processing according to an embodiment of the invention;

FIG. 6 is a schematic diagram of an evaluation data processing apparatus according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of a computer device in accordance with an embodiment of the invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The evaluation model generation method provided by the invention can be applied to an application environment as shown in fig. 1, wherein a client communicates with a server through a network. The clients may be, but are not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.

In one embodiment, as shown in fig. 2, an evaluation model generating method is provided, and the method is applied to the server in fig. 1 for illustration, and includes the following steps:

S10, acquiring a first target feature data set; the first target feature data set is associated with a standard result; the standard result comprises first probability evaluation data obtained by evaluating in advance the characteristics of an evaluation sample corresponding to the first target feature data set.

It will be appreciated that the first target feature data set includes at least one first target feature (for how the first target feature data set is obtained, see steps S101 to S103). The characteristics of the evaluation sample include a first characteristic and a second characteristic; for example, when the evaluation sample is a vehicle evaluation sample, the first characteristic may be the safety of the vehicle evaluation sample and the second characteristic may be its comfort.

In the present embodiment, the standard result is a result after the pre-probability evaluation of the safety and comfort of the vehicle evaluation sample (safety and comfort refer to two characteristics of the evaluation sample), and the standard result will be used as a training target in this model training process.

Further, as shown in fig. 3, before the first target feature data set is acquired, the method further includes:

S101, acquiring a first evaluation tag corresponding to the evaluation sample, and carrying out pre-evaluation on the first evaluation tag to obtain the standard result associated with the first evaluation tag; the first evaluation tag is a key element with uniqueness extracted from information to be evaluated of the evaluation sample;

Understandably, the evaluation sample is a vehicle evaluation sample; the information to be evaluated is the data information of the vehicle evaluation sample; the information to be evaluated is associated with the evaluation sample.

Specifically, the server can directly obtain the data information of the vehicle evaluation sample through a plurality of data acquisition modes such as a preset data interface acquisition mode or a user uploading mode; the data information of the vehicle evaluation sample is acquired through various ways, so that the wide information acquisition range and effective acquisition source can be ensured.

In this embodiment, after the data information of the vehicle evaluation sample (the information to be evaluated of the evaluation sample) is obtained, a unique key element is extracted from it. The key element must cover the main content of the data information (for example, for the data information of a vehicle evaluation sample of a given vehicle model, the key elements may be keywords extracted from the driving data of that model, keywords from all users' comfort evaluations, and so on). The extracted key element is the first evaluation tag, which is the object on which the pre-evaluation of the vehicle evaluation sample is performed to obtain the standard result. Understandably, the standard result is the result of a probability evaluation of the safety and comfort of the vehicle evaluation sample (the characteristics of the evaluation sample) and is used as the training target in this model training process (the training target is therefore supported by data and is reasonable).

S102, acquiring all first image dimensions of the first evaluation labels from a preset knowledge graph, and collecting first image dimension data corresponding to various first image dimensions of each first evaluation label; one of the first image dimension data corresponding to each type of the first image dimension contains the key element;

It can be understood that a knowledge graph is essentially a knowledge base in the form of a semantic network; from a practical point of view, it can be understood simply as a multi-relational graph. The first image dimensions may include actual indicators, already contained in the constructed knowledge graph, that correspond to the safety and comfort of the current vehicle evaluation sample (in this example, safety and comfort are the first and second characteristics of the evaluation sample, respectively), such as vehicle collision data, original factory data, service life statistics, damage data under various operating environments, and safety and comfort evaluation data. They may also include data on the vehicle evaluation samples themselves, such as the vehicle type of the sample, how long the vehicle has been in use, and the key elements of the data information of the sample.

Specifically, in this step, the tag to be evaluated is entered through a preset input interface; the first image dimensions associated with the tag are retrieved from the preset knowledge graph through that interface, and the first image dimension data corresponding to each first image dimension is then collected. Understandably, the first image dimension data of all first image dimensions related to the tag to be evaluated is gathered into the knowledge graph by various data collection means, including online queries, acquisition through computer interfaces associated with the vehicle evaluation samples of major brands, and manual pre-storage in a preset database. The data information of the vehicle evaluation sample received by the server, and the key elements extracted from it, are also collected into the first image dimension data of one of the first image dimensions of the knowledge graph.

In this embodiment, the knowledge graph expands a single tag to be evaluated into multiple first image dimensions, each with its own first image dimension data. This expands the data available for the tag and ensures that the first image dimension data used as training samples is comprehensive.
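The lookup in step S102 can be pictured with a very small multi-relational graph. The sketch below is an assumption for illustration only: the class `KnowledgeGraph`, its methods, the tag `model_x` and the dimension names are hypothetical, and a production knowledge graph would normally live in a dedicated graph store.

```python
from collections import defaultdict

class KnowledgeGraph:
    """Toy multi-relational graph: tag -> list of (dimension, data) edges."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, tag, dimension, data):
        # Each edge links an evaluation tag to one datum of one dimension.
        self._edges[tag].append((dimension, data))

    def dimensions(self, tag):
        """All dimensions recorded for the tag, in sorted order."""
        return sorted({dim for dim, _ in self._edges.get(tag, [])})

    def collect(self, tag):
        """Dimension data grouped by dimension for the tag."""
        grouped = defaultdict(list)
        for dim, data in self._edges.get(tag, []):
            grouped[dim].append(data)
        return dict(grouped)

graph = KnowledgeGraph()
graph.add("model_x", "collision_data", {"test": "frontal", "score": 4.5})
graph.add("model_x", "collision_data", {"test": "side", "score": 4.0})
graph.add("model_x", "comfort_evaluation", {"user": "u1", "rating": 3})
graph.add("model_x", "service_life", {"years": 8})

profile = graph.collect("model_x")
```

One tag fans out into several dimensions here, which is the data expansion effect the embodiment describes.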

S103, carrying out data cleaning on the first image dimension data corresponding to various first image dimensions of the first evaluation tag, carrying out feature engineering on the first image dimension data after data cleaning to extract first target features in the first image dimension data, generating a first target feature data set corresponding to the first evaluation tag according to the first target features extracted from the first image dimension data, and associating the first target feature data set with the standard result corresponding to the first evaluation tag.

In this embodiment, the purpose of data cleaning is to remove abnormal data from the first image dimension data (the raw data). Data is a carrier of information, but the first image dimension data contains a large amount of noise and does not express its information concisely. Feature engineering therefore uses a series of engineering activities to represent the information in the image dimension data with a more efficient encoding (the first target features). It consists of feature extraction on each piece of cleaned first image dimension data (extraction methods include, but are not limited to, PCA, LDA and LCA dimensionality reduction), followed by feature selection on the extracted features (selection methods include, but are not limited to, the Pearson correlation coefficient, the maximal information coefficient, linear models, regularization, and random forest models). Information represented by the target features has the following advantages: less information is lost, and the regularities contained in the first image dimension data are preserved.
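Of the selection methods listed above, the Pearson correlation coefficient is the simplest to illustrate. The following sketch assumes that each candidate feature is a numeric column and that the training target is a column of probabilities; the function names, column names and the 0.5 cut-off are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)

def select_features(columns, target, threshold=0.5):
    """Keep the columns whose absolute correlation with the target is high enough."""
    return [name for name, col in columns.items()
            if abs(pearson(col, target)) >= threshold]

columns = {
    "brake_score": [1, 2, 3, 4, 5],   # strongly related to the target
    "paint_code": [3, 1, 3, 1, 3],    # unrelated to the target
    "constant": [7, 7, 7, 7, 7],      # carries no information
}
target = [0.1, 0.3, 0.5, 0.7, 0.9]    # hypothetical standard safety probabilities
selected = select_features(columns, target)
```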

Further, the data cleaning of the first image dimension data corresponding to the first image dimensions of each type of the first evaluation tag includes:

judging, by an anomaly detection method, whether the first image dimension data contains abnormal data;

and if the first image dimension data contains abnormal data, deleting the abnormal data from the first image dimension data.

It will be appreciated that abnormal data may be cleared by an anomaly detection method, for example by filtering out unnecessary data according to business conditions or by applying an outlier detection algorithm. Outlier detection algorithms include: deviation detection (clustering, nearest neighbor, and the like); statistics-based outlier detection (outliers are determined from statistics such as the range, interquartile range, mean deviation and standard deviation); distance-based outlier detection (the distance between each point of the first image dimension data and most other points is determined and compared with a preset distance threshold, and a point whose distance exceeds the threshold is determined to be an outlier; commonly used distance measures include the Manhattan distance, Euclidean distance and Mahalanobis distance); and density-based outlier detection (such as the LOF algorithm).
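Two of the detection styles named above, statistics-based and distance-based, can be sketched briefly. The sketch rests on assumptions that the text leaves open: the conventional 1.5 multiplier on the interquartile range, and the mean Euclidean distance to all other points as a stand-in for "the distance to most points". The function names and sample values are hypothetical.

```python
import math
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (statistics-based)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]

def distance_outliers(points, threshold):
    """Flag points whose mean Euclidean distance to the others exceeds threshold."""
    flags = []
    for i, p in enumerate(points):
        dists = [math.dist(p, q) for j, q in enumerate(points) if j != i]
        flags.append(sum(dists) / len(dists) > threshold)
    return flags

readings = [10, 11, 12, 11, 10, 12, 11, 95]       # one implausible reading
value_flags = iqr_outliers(readings)
cleaned = [v for v, bad in zip(readings, value_flags) if not bad]

points = [(0, 0), (1, 0), (0, 1), (10, 10)]       # one far-away point
point_flags = distance_outliers(points, threshold=8.0)
```

Dropping the flagged entries, as done for `cleaned`, corresponds to deleting the abnormal data from the first image dimension data.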

Further, the generating the first target feature data set corresponding to the first evaluation tag according to the first target feature extracted from each piece of first image dimension data includes:

detecting whether the first target features extracted from the first image dimension data have preset data problems or not;

acquiring a data processing method corresponding to the data problem when the first target feature has the data problem, and determining the characteristics of the first target feature after it has been processed according to the data processing method;

determining, when the first target feature does not have the data problem, the characteristics of the first target feature;

and screening all first target features whose characteristics meet preset training requirements, and generating the first target feature data set from all the screened first target features.

It will be appreciated that the data problems include, but are not limited to, inconsistent dimensions (units of measure), information redundancy, qualitative features that cannot be used directly, and missing values. The data processing methods include nondimensionalization methods (standardization, interval scaling and normalization), binarization (discretization) of quantitative features, one-hot encoding of qualitative features, and missing value processing methods (deletion, statistical filling, unified filling, linear interpolation and the decision tree method). The characteristics include whether the preprocessed target feature diverges (divergence) and the degree of correlation between the preprocessed target feature and the training target (correlation).

Specifically, after a data problem is detected in an extracted first target feature, the data processing method corresponding to that data problem is looked up. For example, inconsistent dimensions correspond to the standardization, interval scaling and normalization methods; information redundancy corresponds to binarization of quantitative features; qualitative features that cannot be used directly correspond to one-hot encoding of the qualitative features; and missing values correspond to the deletion, statistical filling, unified filling, linear interpolation and decision tree methods.
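The mapping from data problem to processing method can be illustrated with four small helpers. This is a sketch under assumptions: population standard deviation for standardization, min-max scaling for interval scaling, alphabetical category order for one-hot encoding, and hypothetical function names and sample values throughout.

```python
import statistics

def standardize(values):
    """Standardization: zero mean, unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std if std else 0.0 for v in values]

def interval_scale(values):
    """Interval scaling: map the values linearly onto [0, 1]."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) if high > low else 0.0 for v in values]

def binarize(values, threshold):
    """Binarization of a quantitative feature: 1 above the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]

def one_hot(values):
    """One-hot encoding of a qualitative feature (categories sorted by name)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

z_scores = standardize([1.0, 2.0, 3.0])
scaled = interval_scale([10, 20, 30])
heavy_use = binarize([1, 5, 9], threshold=4)          # hypothetical years in use
body_type = one_hot(["suv", "sedan", "suv"])
```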

In this embodiment, the handling of missing values is described as an example, because missing values are the most common data problem. After missing values are found, the server determines whether the number of missing values in the first target feature (or in each image dimension data) exceeds a preset number. If it does, the first target feature (or image dimension data) is deleted and discarded (deletion method). If it does not, missing values of a numerical type are filled with a statistic computed over all the image dimension data, such as the mean, median, mode, maximum or minimum (statistical filling method), and missing values of a non-numerical type are uniformly filled with a custom value (unified filling method). Alternatively, depending on the type of the missing value, a function fitted to the non-missing values may be used to compute the fill value (linear interpolation method), or a model may be trained with the non-missing data as training features and the missing entries as the points to be predicted, and the trained model used to fill the missing values. The preprocessed first target features are then selected by a feature selection method (filter, wrapper or embedded methods), with the filter method as the main one: each preprocessed first target feature is scored by its divergence or correlation, each score is compared with a threshold set in advance, and every preprocessed first target feature whose score exceeds the preset threshold is selected. All selected first target features are then integrated to generate the first target feature data set.
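The missing value handling and the filter-style selection described above might look as follows. This is a sketch under assumptions: `None` marks a missing value, the median is the chosen statistic, variance stands in for divergence, and the function names, the `"unknown"` fill value and the thresholds are hypothetical.

```python
import statistics

def fill_missing(column, max_missing, fill_value="unknown"):
    """Delete the column (return None) or fill its missing entries."""
    present = [v for v in column if v is not None]
    missing = len(column) - len(present)
    if missing > max_missing or not present:
        return None  # deletion method: too many missing values
    if all(isinstance(v, (int, float)) for v in present):
        fill = statistics.median(present)  # statistical filling method
    else:
        fill = fill_value  # unified filling method for non-numerical data
    return [fill if v is None else v for v in column]

def linear_interpolate(column):
    """Fill interior gaps on a straight line between the known neighbours."""
    known = [i for i, v in enumerate(column) if v is not None]
    out = list(column)
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            out[i] = column[a] + t * (column[b] - column[a])
    return out

def variance_filter(columns, threshold):
    """Filter method: keep columns whose variance (divergence) exceeds threshold."""
    return [name for name, col in columns.items()
            if statistics.pvariance(col) > threshold]

filled = fill_missing([1.0, None, 3.0, 5.0], max_missing=1)
dropped = fill_missing([None, None, 1.0], max_missing=1)
labels = fill_missing(["suv", None, "sedan"], max_missing=1)
interpolated = linear_interpolate([0.0, None, 4.0, None, None, 10.0])
kept = variance_filter({"flat": [1, 1, 1], "varied": [1, 2, 3]}, threshold=0.1)
```

Gaps at the very start or end of a column are left untouched by `linear_interpolate`, since there is no second neighbour to interpolate from; one of the other filling methods would be needed there.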

S20, inputting the first target feature data set into a multilayer perceptron containing initial parameters for processing, and inputting a first output result of the multilayer perceptron containing the initial parameters into a logistic regression model to obtain a second output result; the second output result includes second probability evaluation data of the characteristics of the evaluation sample corresponding to the first target feature data set.

In one embodiment, the multilayer perceptron comprises an input layer, hidden layers connected to the input layer, and a fully connected layer connected to the last group of hidden layers; the first output result of the multilayer perceptron is the output of the fully connected layer. The last group of hidden layers of the multilayer perceptron is connected to the fully connected layer, and the fully connected layer is connected to the logistic regression model; the output of the fully connected layer is taken as the input of the logistic regression model.

It will be appreciated that a multilayer perceptron (MLP), a type of artificial neural network (ANN), generally comprises an input layer, an output layer, and a number of hidden layers disposed between the input layer and the output layer (the simplest multilayer perceptron has only one hidden layer between the input layer and the output layer). In this embodiment, a fully connected layer is attached to the last group of hidden layers (the last group of hidden layers contains the most data), each neuron in the fully connected layer being connected to all neurons in its previous layer, and the fully connected layer is connected to a logistic regression model. The multilayer perceptron is trained with the back-propagation (BP) algorithm; the logistic regression (LR) model is a classification model in traditional machine learning (it can perform binary or multi-class classification), and the second output result predicted by the logistic regression model has the advantage of high accuracy. Understandably, since the information contained in the hidden layers of the multilayer perceptron is more comprehensive, outputting that information directly to the logistic regression model through the fully connected layer (i.e., as the output of the fully connected layer) retains the input information of the logistic regression model to the maximum extent; moreover, the logistic regression model is among the most computationally efficient multi-class (including binary) classification models (logistic regression is more efficient than the output layer of the original multilayer perceptron). Therefore, the evaluation model integrating the multilayer perceptron and the logistic regression model can evaluate the safety and comfort of the vehicle intelligently and efficiently, improving both the accuracy and the efficiency of that evaluation.

Specifically, the target feature data set is input into the input layer of the multilayer perceptron (for example, if the target feature data set is an n-dimensional vector, the input layer has n neurons). The hidden layer receives the data from the input layer; because the input layer is fully connected to the hidden layer, with X denoting the data input to the input layer and H denoting the output of the hidden layer, H = f(W1X + B1), where W1 is the weight of the hidden layer, B1 is the bias of the hidden layer, and f is a common activation function, usually nonlinear, such as the Sigmoid, tanh, ReLU or Softmax function. The mapping from the hidden layer to the output layer can be regarded as a binary or multi-class logistic regression: with Z denoting the data output by the output layer and X1 denoting the output of the hidden layer (X1 = H = f(W1X + B1)), Z = G(W2X1 + B2), where W2 is the weight of the output layer, B2 is the bias of the output layer, and G is a multi-class activation function such as the Softmax function. The fully connected layer (each neuron of which is connected to all neurons in the previous layer) is attached to the last group of hidden layers (which contains the most data), the matrix output by the fully connected layer is obtained, and this matrix, serving as the normalized feature data set (the first output result of the multilayer perceptron), is input into the logistic regression model to obtain the second output result output by the logistic regression model.
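The data flow described above (input layer, hidden layer, fully connected layer, logistic regression) can be sketched as follows. This is a minimal illustration only, not the implementation of the embodiment: the layer sizes, the ReLU activation, the random initial parameters, and the helper names `dense`, `init_layer` and `evaluate` are all assumptions made for the example, and a binary (sigmoid) logistic regression is assumed.

```python
import math
import random

def relu(v):
    """Element-wise ReLU, used here as the activation function f."""
    return [max(0.0, x) for x in v]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(W, b, x):
    """Fully connected layer: computes W x + b, with W given as a list of rows."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def init_layer(n_out, n_in, rng):
    """Hypothetical initial parameters: small random weights, zero biases."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def evaluate(x, params):
    (W1, B1), (Wfc, Bfc), (w_lr, b_lr) = params
    h = relu(dense(W1, B1, x))         # hidden layer: H = f(W1 X + B1)
    first_output = dense(Wfc, Bfc, h)  # fully connected layer: first output result
    z = sum(w * v for w, v in zip(w_lr, first_output)) + b_lr
    return sigmoid(z)                  # logistic regression: second output result

rng = random.Random(0)
n = 4  # dimension of the target feature vector (assumed)
params = (
    init_layer(8, n, rng),                           # hidden layer
    init_layer(3, 8, rng),                           # fully connected layer
    ([rng.uniform(-0.5, 0.5) for _ in range(3)], 0.0),  # logistic regression
)
p = evaluate([0.2, 0.5, 0.1, 0.9], params)
print(f"second output result (probability): {p:.4f}")
```

The point of the sketch is that no separate MLP output layer is used: the fully connected layer's output is passed straight to the logistic regression, whose result is a probability in (0, 1).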

In this embodiment, the first output result of the multilayer perceptron is input into the logistic regression model, so that the prediction of the safety and comfort of the vehicle is accurate; the coefficients involved in the training process are easy to understand and to explain; the training process is also fast; and finally, the second output result output by the integrated model formed by the multilayer perceptron and the logistic regression model is a probability value, so that the second output result can be used to build a ranking model.

Further, the evaluation model generation method further includes: adding a random deactivation (dropout) mechanism to the propagation process of each of said hidden layers of said multilayer perceptron; during forward propagation of the multilayer perceptron, the random deactivation mechanism discards nodes of at least one of the hidden layers, each hidden layer comprising a plurality of the nodes; during back-propagation of the multilayer perceptron, the random deactivation mechanism adjusts the weights of the nodes in at least one of the hidden layers that were not discarded. It can be understood that the random deactivation mechanism (dropout) is a method for optimizing an artificial neural network with a deep structure; it reduces the interdependence among nodes by randomly zeroing part of the weights or outputs of a hidden layer during learning, thereby regularizing the neural network and reducing its structural risk; the nodes are neurons. In this embodiment, after the random deactivation mechanism is added to the hidden layers during model training, nodes of the hidden layers are discarded (i.e., the outputs of those nodes are zeroed), and the coefficients (including weights) of the hidden layers in the multilayer perceptron are then adjusted, thereby preventing over-fitting during training.
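As a rough illustration of the mechanism just described, the sketch below zeroes hidden-node outputs at random during forward propagation and then updates only the weights of the nodes that were kept. It is a simplified sketch under stated assumptions: the function names `dropout_forward` and `masked_update` are hypothetical, each node is given a single weight for brevity, the gradients are placeholder values, and the rescaling of kept outputs used by common "inverted dropout" implementations is omitted.

```python
import random

def dropout_forward(h, drop_prob, rng):
    """Zero each hidden-node output with probability drop_prob.

    Returns the masked outputs and the keep-mask (1 = kept, 0 = discarded).
    """
    mask = [0 if rng.random() < drop_prob else 1 for _ in h]
    return [v * m for v, m in zip(h, mask)], mask

def masked_update(weights, grads, mask, lr):
    """Gradient step applied only to nodes that were not discarded."""
    return [w - lr * g if m else w for w, g, m in zip(weights, grads, mask)]

rng = random.Random(1)
h = [0.3, 1.2, 0.7, 0.0, 2.1]            # hidden-layer outputs (example values)
dropped, mask = dropout_forward(h, 0.5, rng)
# One weight per node and constant gradients, purely for illustration.
new_w = masked_update([1.0] * 5, [0.1] * 5, mask, lr=0.5)
print(mask, dropped, new_w)
```

Discarded nodes contribute nothing to the forward pass and keep their previous weights for that training step, which is what limits co-adaptation between nodes.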

Further, the pre-evaluating the first evaluation tag to obtain the standard result associated with the first evaluation tag includes:

classifying the acquired first evaluation tag and obtaining a classification result corresponding to the first evaluation tag; one of the classification results is associated with at least one scoring party, and one of the scoring parties is associated with only one scoring rule;

determining the scoring party and the scoring rule of the first evaluation tag according to the classification result, and receiving an evaluation score after the scoring party evaluates the evaluation sample corresponding to the first evaluation tag according to the scoring rule; in this step, the determined scoring party is a scoring party selected from at least one scoring party associated with the classification result, and the determined scoring rule is a scoring rule associated with the selected scoring party.

Preprocessing the evaluation score, importing the preprocessed evaluation score into a preset label evaluation model, acquiring a label evaluation result of the label evaluation model, and determining the standard result related to the first evaluation label according to the label evaluation result and a preset multi-score principle.

It is understood that the classification result refers to a category in which an evaluation sample (vehicle evaluation sample) corresponding to the first evaluation tag is defined. Understandably, each scoring party is associated with each classification result.

Preferably, the evaluation of the evaluation sample corresponding to the first evaluation tag by the scoring party according to the scoring rule includes:

a scoring party related to the vehicle evaluation sample (this scoring party is familiar with the safety of the vehicle evaluation sample and is taken as a first scoring party) and a crowd using the vehicle evaluation sample (this crowd can be taken as a first scoring crowd) together evaluate the safety (one of the characteristics of the evaluation sample, namely the first characteristic);

a scoring party related to the vehicle evaluation sample (this scoring party is familiar with the comfort of the vehicle evaluation sample and is taken as a second scoring party; the second scoring party can also be the same group of people as the first scoring party) and a crowd using the vehicle evaluation sample (this crowd can be taken as a second scoring crowd; the second scoring crowd can also be the same group of people as the first scoring crowd) together complete a questionnaire on the comfort (another characteristic of the evaluation sample, namely the second characteristic), and evaluate the comfort through the questionnaire (the questionnaire includes the evaluation content for the comfort).

The preprocessing of the evaluation scores means that the evaluation scores given by each scoring party are averaged, for example to obtain the average values mentioned below.

In one embodiment, the tag evaluation model is as follows:

wherein λ is a first weight coefficient, and the value of λ is within the range [0,1]; γ is a second weight coefficient, and the value of γ is within the range [0,1]; the first average value is the average of the evaluation scores obtained by the first scoring party evaluating the first characteristic; the second average value is the average of the evaluation scores obtained by the second scoring party evaluating the second characteristic; x₁ is the average of the first average value and the second average value; σ is a first weight value; the third average value is the average of the evaluation scores obtained by the first scoring crowd evaluating the first characteristic; the fourth average value is the average of the evaluation scores obtained by the second scoring crowd evaluating the second characteristic; y₁ is the average of the third average value and the fourth average value; ω is a second weight value.

After the evaluation score related to the first evaluation tag is preprocessed, the preprocessed evaluation score is imported into the tag evaluation model, and a tag evaluation result is obtained from the tag evaluation model; the tag evaluation result takes the form of an evaluation value within a preset range, and the standard result corresponding to the evaluation value (namely the standard result associated with the first evaluation tag) is determined according to the multi-score principle. For example, if the multi-score principle is a two-class principle, a standard result of 0 is obtained when the evaluation value corresponding to the tag evaluation result is smaller than or equal to a preset value, and a standard result of 1 is obtained when the evaluation value is larger than the preset value. It is understood that the standard result under the multi-score principle is not limited to the two values 0 and 1, but may take a plurality of values within the range [0,1] (for example, when the multi-score principle distinguishes more than two classes, the number of possible values is greater than two).
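The preprocessing (averaging) and the mapping from an evaluation value to a standard result can be sketched as below. The tag evaluation model itself is not reproduced here, so the evaluation value is simply taken as an input; the function names, the example scores, and in particular the evenly spaced multi-class mapping in `standard_result_multi` are assumptions for illustration, not the rule used by the embodiment.

```python
def mean(scores):
    """Preprocessing step: average the evaluation scores of one scoring party."""
    return sum(scores) / len(scores)

def standard_result(evaluation_value, preset_value):
    """Two-class case: 0 if the value is <= the preset value, otherwise 1."""
    return 0 if evaluation_value <= preset_value else 1

def standard_result_multi(evaluation_value, cut_points):
    """Assumed multi-class case: count the cut points exceeded and map the
    count to an evenly spaced value in [0, 1]."""
    k = sum(1 for c in cut_points if evaluation_value > c)
    return k / len(cut_points)

print(mean([4, 5, 3]))                            # averaged score
print(standard_result(0.50, 0.5))                 # at the threshold
print(standard_result(0.51, 0.5))                 # above the threshold
print(standard_result_multi(0.45, [0.3, 0.6]))    # three-class example
```

With two cut points the multi-class variant yields the three values 0, 0.5 and 1, which is one way of taking "a plurality of values within the range [0,1]".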

In this embodiment, the training target (standard result) in the training process of the subsequent evaluation model may be determined in advance by the above method, so that the initial parameters of the evaluation model may be adjusted according to the training target to obtain an evaluation model suitable for the training target (standard result).

S30, respectively inputting the first probability evaluation data in the standard result and the second probability evaluation data in the second output result into a loss function corresponding to the logistic regression model for operation, obtaining a loss value output by the loss function, and judging whether the loss value is smaller than or equal to a preset loss value.

It will be appreciated that the formula of the loss function for measuring the gap between the neural network output result and the standard result may be:

wherein a⁽ⁱ⁾ is the first probability evaluation data for the i-th evaluation sample, y⁽ⁱ⁾ is the second probability evaluation data for the i-th evaluation sample, c is the loss value output by the loss function, and m is the total number of evaluation samples.
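The formula itself is not reproduced in this text. As an illustration only, the sketch below uses the binary cross-entropy that is commonly paired with logistic regression; this choice is an assumption, not a reconstruction of the formula of the embodiment. In the sketch, `targets` plays the role of the first probability evaluation data (standard result), `predictions` the role of the second probability evaluation data, and the return value the role of the loss value c averaged over m samples.

```python
import math

def cross_entropy(targets, predictions, eps=1e-12):
    """Mean binary cross-entropy over m samples (assumed loss form).

    Predictions are clipped to [eps, 1 - eps] to avoid log(0).
    """
    m = len(targets)
    total = 0.0
    for t, p in zip(targets, predictions):
        p = min(max(p, eps), 1.0 - eps)
        total += t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return -total / m

uninformed = cross_entropy([1, 0], [0.5, 0.5])   # equals ln 2
confident = cross_entropy([1, 0], [0.9, 0.1])    # closer to the targets
print(uninformed, confident)
```

A smaller value indicates that the model output is closer to the standard result, which is what the comparison against the preset loss value relies on.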

In this embodiment, it is judged whether the loss value is smaller than or equal to the preset loss value (the closer the preset loss value is to the minimum of the loss function, the better the training effect). When the loss value is smaller than or equal to the preset loss value, an evaluation model that matches the training target (standard result) is obtained; when the loss value is larger than the preset loss value, the initial parameters (including weights and biases) of each layer in the multilayer perceptron are adjusted until an evaluation model that suits the training target is obtained, at which point the training is completed successfully. The evaluation model trained in this way has the advantages of high evaluation efficiency and high accuracy.

S40, if the loss value is smaller than or equal to the preset loss value, confirming that the training of the evaluation model integrated by the multilayer perceptron and the logistic regression model is completed.

It can be understood that the evaluation model is an integrated model of the multilayer perceptron and the logistic regression model, the input of the evaluation model is a target feature data set, and the output result of the evaluation model is the second output result output by the logistic regression model.

In this embodiment, when the loss value is less than or equal to the preset loss value, this indicates that the parameters of the multilayer perceptron suit the training target (standard result) at this time, or equivalently that the evaluation model suits the training target.

Further, after the step S40, the method further includes:

if the loss value is larger than the preset loss value, iteratively updating the initial parameters of the multilayer perceptron until the loss value output by the loss function corresponding to the logistic regression model is smaller than or equal to the preset loss value.

In this embodiment, when the loss value is greater than the preset loss value, this indicates that the initial parameters of the multilayer perceptron do not suit the training target (standard result) at this time, so the initial parameters of the multilayer perceptron need to be updated iteratively until an integrated model suited to the target feature data set is found.
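The stopping rule described above (keep updating parameters until the loss value is smaller than or equal to the preset loss value) can be sketched as follows. To stay short, the sketch trains only a single logistic unit by gradient descent on toy data and stands in for the full integrated model; the data, learning rate, preset loss value, the `max_iters` safety cap, and the function names are all assumptions for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_grads(w, b, X, y):
    """Mean cross-entropy loss and its gradients for one logistic unit."""
    m = len(X)
    loss, gw, gb = 0.0, [0.0] * len(w), 0.0
    for x, t in zip(X, y):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        pc = min(max(p, 1e-12), 1.0 - 1e-12)  # clip to avoid log(0)
        loss -= (t * math.log(pc) + (1 - t) * math.log(1.0 - pc)) / m
        for j, xj in enumerate(x):
            gw[j] += (p - t) * xj / m
        gb += (p - t) / m
    return loss, gw, gb

def train(X, y, preset_loss, lr=0.5, max_iters=10000):
    """Update parameters until loss <= preset_loss (or the iteration cap)."""
    w, b = [0.0] * len(X[0]), 0.0
    loss = float("inf")
    for _ in range(max_iters):
        loss, gw, gb = loss_and_grads(w, b, X, y)
        if loss <= preset_loss:          # step S40: training is completed
            return w, b, loss, True
        w = [wi - lr * g for wi, g in zip(w, gw)]  # otherwise update and repeat
        b -= lr * gb
    return w, b, loss, False             # cap reached without meeting the target

X = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]  # toy feature vectors
y = [0, 0, 1, 1]                                       # toy standard results
w, b, final_loss, converged = train(X, y, preset_loss=0.2)
print(converged, round(final_loss, 4))
```

The iteration cap is not part of the described method; it is added only so that the loop terminates if the preset loss value is set lower than the data allow.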

In summary, the foregoing provides a method for generating an evaluation model, comprising: acquiring a first target feature data set, the first target feature data set being associated with a standard result, and the standard result comprising first probability evaluation data obtained by evaluating in advance the characteristics of an evaluation sample corresponding to the first target feature data set; inputting the first target feature data set into a multilayer perceptron containing initial parameters for processing, and inputting a first output result of the multilayer perceptron containing the initial parameters into a logistic regression model to obtain a second output result, the second output result comprising second probability evaluation data of the characteristics of the evaluation sample corresponding to the first target feature data set; respectively inputting the first probability evaluation data in the standard result and the second probability evaluation data in the second output result into a loss function corresponding to the logistic regression model for operation, obtaining a loss value output by the loss function, and judging whether the loss value is smaller than or equal to a preset loss value; and if the loss value is smaller than or equal to the preset loss value, confirming that the training of the evaluation model integrated by the multilayer perceptron and the logistic regression model is completed.
The evaluation model trained by the invention is an integrated model of the multilayer perceptron and the logistic regression model. Because the information contained in the hidden layers of the multilayer perceptron is more comprehensive, outputting that information directly to the logistic regression model through the fully connected layer (namely, as the first output result output by the fully connected layer) retains the input information of the logistic regression model to the maximum extent, and the logistic regression model is among the most computationally efficient multi-class (including binary) classification models. Therefore, the evaluation model integrating the multilayer perceptron and the logistic regression model can evaluate the safety and comfort of the vehicle intelligently and efficiently, ensuring the accuracy of that evaluation and preventing the cliff effect; that is, both the evaluation efficiency and the evaluation accuracy are improved.

It should be understood that the sequence numbers of the steps in the foregoing embodiment do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not limit the implementation process of the embodiment of the present invention.

In an embodiment, an evaluation model generation apparatus is provided, which corresponds to the evaluation model generation method in the above embodiment one by one. As shown in fig. 4, the evaluation model generation apparatus includes an acquisition module 11, an input module 12, a judgment module 13, and a confirmation module 14. The functional modules are described in detail as follows:

an acquisition module 11, configured to acquire a first target feature data set; the first target feature data set is associated with a standard result; the standard result comprises first probability evaluation data for evaluating the characteristics of an evaluation sample corresponding to the first target characteristic data set in advance;

the input module 12 is configured to input the first target feature data set into a multilayer perceptron including initial parameters for processing, and input a first output result of the multilayer perceptron including the initial parameters into a logistic regression model to obtain a second output result; the second output result comprises second probability evaluation data of the characteristics of the evaluation sample corresponding to the first target feature data set;

The judging module 13 is configured to input the first probability evaluation data in the standard result and the second probability evaluation data in the second output result into a loss function corresponding to the logistic regression model respectively for operation, obtain a loss value output by the loss function, and judge whether the loss value is less than or equal to a preset loss value;

and a confirmation module 14, configured to confirm that the training of the evaluation model integrated by the multilayer perceptron and the logistic regression model is completed if the loss value is less than or equal to the preset loss value.

Further, the evaluation model generation apparatus further includes:

the evaluation module is used for acquiring a first evaluation tag corresponding to the evaluation sample, and performing pre-evaluation on the first evaluation tag to obtain the standard result associated with the first evaluation tag; the first evaluation tag is a key element with uniqueness extracted from information to be evaluated of the evaluation sample;

the collecting module is used for acquiring all first image dimensions of the first evaluation labels from a preset knowledge graph and collecting first image dimension data corresponding to various first image dimensions of each first evaluation label; one of the first image dimension data corresponding to each type of the first image dimension contains the key element;

The generating module is used for carrying out data cleaning on the first image dimension data corresponding to various first image dimensions of the first evaluation tag, carrying out feature engineering on the first image dimension data after data cleaning to extract first target features in the first image dimension data, generating a first target feature data set corresponding to the first evaluation tag according to the first target features extracted from the first image dimension data, and associating the first target feature data set with the standard result corresponding to the first evaluation tag.

Further, the evaluation module includes:

the classification sub-module is used for classifying the acquired first evaluation tag and obtaining a classification result corresponding to the first evaluation tag; one of the classification results is associated with at least one scoring party, and one of the scoring parties is associated with only one scoring rule;

the receiving sub-module is used for determining the scoring party and the scoring rule of the first evaluation tag according to the classification result, and receiving the evaluation score after the scoring party evaluates the evaluation sample corresponding to the first evaluation tag according to the scoring rule;

The first determining submodule is used for preprocessing the evaluation score, importing the preprocessed evaluation score into a preset label evaluation model, acquiring a label evaluation result of the label evaluation model, and determining the standard result related to the first evaluation label according to the label evaluation result and a preset multi-score principle.

Further, the generating module includes:

the judging submodule is used for judging whether the first portrait dimension data has data abnormality or not through an abnormality detection method;

and the deleting submodule is used for deleting the abnormality in the first portrait dimension data with abnormal data if the first portrait dimension data has abnormal data.

Further, the generating module includes:

the detection submodule is used for detecting whether the first target feature extracted from each piece of first image dimension data has a preset data problem or not;

a second determining submodule, configured to obtain a data processing method corresponding to the data problem when the first target feature has the data problem, and determine a feature of the first target feature for which data processing has been completed after performing data processing on the data problem according to the data processing method;

A third determination submodule for determining, when the first target feature does not have the data problem, a feature of the first target feature that does not have the data problem;

the generation sub-module is used for screening all the first target features with the characteristics meeting preset training requirements and generating the first target feature data set according to all the screened first target features.

For specific limitations of the evaluation model generation apparatus, reference is made to the above description of the evaluation model generation method, and no further description is given here. The respective modules in the above-described evaluation model generation apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in software form in a memory in the computer device, so that the processor can call and execute the operations corresponding to the above modules.

The evaluation data processing method provided by the invention can be applied to an application environment as shown in fig. 1, wherein a client communicates with a server through a network. The clients may be, but are not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.

In one embodiment, as shown in fig. 5, an evaluation data processing method is provided, and the method is applied to the server in fig. 1 for illustration, and includes the following steps:

s50, acquiring a second evaluation tag corresponding to the item to be evaluated;

s60, acquiring all second image dimensions of the second evaluation labels from a preset knowledge graph, and collecting second image dimension data corresponding to various second image dimensions of each second evaluation label;

s70, carrying out data cleaning on the second image dimension data corresponding to the various second image dimensions of the second evaluation tag, carrying out feature engineering on the second image dimension data after data cleaning to extract second target features in the second image dimension data, and generating a second target feature data set corresponding to the second evaluation tag according to the second target features extracted from the second image dimension data;

s80, inputting the second target characteristic data set into a preset evaluation model to obtain a probability evaluation result of the characteristics of the item to be evaluated.

It is understood that the item to be evaluated is a vehicle to be evaluated; the second evaluation tag is a key element with uniqueness extracted from evaluation information of the item to be evaluated, the evaluation information is data information of the vehicle to be evaluated, and the evaluation information is associated with the item to be evaluated; the characteristics of the items to be evaluated include the safety of the vehicle to be evaluated and the comfort of the vehicle to be evaluated; the specific process of the present embodiment of step S50 to step S70 may refer to the content corresponding to step S101 to step S103, and the specific process of step S80 may refer to the content corresponding to step S10 to step S40.

In summary, the above provides an evaluation data processing method, which obtains a second evaluation tag corresponding to an item to be evaluated; acquiring all second image dimensions of the second evaluation labels from a preset knowledge graph, and collecting second image dimension data corresponding to various second image dimensions of each second evaluation label; performing data cleaning on the second image dimension data corresponding to various second image dimensions of the second evaluation labels, performing feature engineering on the second image dimension data after data cleaning to extract second target features in the second image dimension data, and generating a second target feature data set corresponding to the second evaluation labels according to the second target features extracted from the second image dimension data; and inputting the second target characteristic data set into a preset evaluation model to obtain a probability evaluation result of the characteristics of the item to be evaluated. According to the invention, the second target feature set is input into the evaluation model obtained by the evaluation model generation method, so that a probability evaluation result for efficiently and accurately evaluating the characteristics of the item to be evaluated can be obtained.

In one embodiment, an evaluation data processing apparatus is provided, which corresponds to the evaluation data processing method in the above embodiment one by one. As shown in fig. 6, the evaluation data processing apparatus includes a second evaluation tag acquisition module 21, a second image dimension data collection module 22, a second target feature data set generation module 23, and a second target feature data set input module 24. The functional modules are described in detail as follows:

a second evaluation tag acquisition module 21, configured to acquire a second evaluation tag corresponding to the item to be evaluated;

the second image dimension data collection module 22 is configured to obtain all second image dimensions of the second evaluation tags from a preset knowledge graph, and collect second image dimension data corresponding to each type of second image dimensions of each second evaluation tag;

a second target feature data set generating module 23, configured to perform data cleaning on the second image dimension data corresponding to the second image dimensions of each type of the second evaluation tag, and perform feature engineering on each piece of the second image dimension data after data cleaning to extract a second target feature in each piece of the second image dimension data, and generate the second target feature data set corresponding to the second evaluation tag according to the second target feature extracted from each piece of the second image dimension data;

And the second target feature data set input module 24 is configured to input the second target feature data set into a preset evaluation model, so as to obtain a probability evaluation result of the characteristic of the item to be evaluated.

For specific limitations of the evaluation data processing apparatus, reference is made to the above description of the evaluation data processing method, and no further description is given here. The respective modules in the above-described evaluation data processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in software form in a memory in the computer device, so that the processor can call and execute the operations corresponding to the above modules.

In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing data involved in the evaluation model generation method and the evaluation data processing method. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements an evaluation model generation method, or the computer program, when executed by the processor, implements an evaluation data processing method.

In one embodiment, a computer device is provided that includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the method for generating an assessment model in the above embodiment when executing the computer program, or the processor implements the steps of the method for processing assessment data in the above embodiment when executing the computer program.

In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, implements the steps of the evaluation model generation method in the above embodiment, or which computer program, when being executed by a processor, implements the steps of the evaluation data processing method in the above embodiment.

Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium, which, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.

The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims

1. An evaluation model generation method, characterized by comprising:

before the first target feature data set is acquired, the method further comprises:

acquiring a first evaluation tag corresponding to the evaluation sample, and performing pre-evaluation on the first evaluation tag to obtain the standard result associated with the first evaluation tag; the first evaluation tag is a key element with uniqueness extracted from information to be evaluated of the evaluation sample;

acquiring all first image dimensions of the first evaluation tag from a preset knowledge graph, and collecting first image dimension data corresponding to each type of first image dimension of each first evaluation tag; one item of the first image dimension data corresponding to each type of the first image dimension contains the key element;

performing data cleaning on the first image dimension data corresponding to various first image dimensions of the first evaluation tag, performing feature engineering on the first image dimension data after data cleaning to extract first target features in the first image dimension data, generating a first target feature data set corresponding to the first evaluation tag according to the first target features extracted from the first image dimension data, and associating the first target feature data set with the standard result corresponding to the first evaluation tag;

the generating of the first target feature data set corresponding to the first evaluation tag according to the first target features extracted from each first image dimension data comprises:

screening all the first target features with the characteristics meeting preset training requirements, and generating a first target feature data set according to all the screened first target features;

2. The method for generating an evaluation model according to claim 1, wherein the pre-evaluating the first evaluation tag to obtain the standard result associated with the first evaluation tag comprises:

determining the scoring party and the scoring rule of the first evaluation tag according to the classification result, and receiving an evaluation score after the scoring party evaluates the evaluation sample corresponding to the first evaluation tag according to the scoring rule;

3. The method for generating an evaluation model according to claim 1, wherein the data cleaning the first image dimension data corresponding to each type of the first image dimension of the first evaluation tag includes:

4. An evaluation data processing method, comprising:

acquiring a second evaluation tag corresponding to the item to be evaluated;

inputting the second target characteristic data set into a preset evaluation model to obtain a probability evaluation result of the characteristics of the item to be evaluated; wherein the evaluation model is obtained by the evaluation model generation method according to any one of claims 1 to 3.

5. An evaluation model generation apparatus, characterized in that the apparatus implements the evaluation model generation method according to any one of claims 1 to 3, comprising:

6. An evaluation data processing apparatus, comprising:

the second target characteristic data set input module is used for inputting the second target characteristic data set into a preset evaluation model to obtain a probability evaluation result of the characteristics of the item to be evaluated; wherein the evaluation model is obtained by the evaluation model generation method according to any one of claims 1 to 3.

7. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the evaluation model generation method according to any one of claims 1 to 3 when executing the computer program, or the processor implements the evaluation data processing method according to claim 4 when executing the computer program.

8. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the evaluation model generation method according to any one of claims 1 to 3, or implements the evaluation data processing method according to claim 4.