CN112541579A - Model training method, poverty degree information identification method, device and storage medium - Google Patents


Info

Publication number
CN112541579A
CN112541579A (application number CN202011540120.2A; granted publication CN112541579B)
Authority
CN
China
Prior art keywords
artificial intelligence
neural network
intelligence model
training
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011540120.2A
Other languages
Chinese (zh)
Other versions
CN112541579B (en)
Inventor
奚宇航
蔡庆秋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Beiming Digital Technology Co ltd
Original Assignee
Beijing Beiming Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Beiming Digital Technology Co ltd
Priority to CN202011540120.2A
Publication of CN112541579A
Application granted
Publication of CN112541579B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/047 Probabilistic or stochastic networks
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES
    • G06Q 50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q 50/205 Education administration or guidance
    • G06Q 50/26 Government or public services

Abstract

The invention discloses an artificial intelligence model training method, a poverty degree information identification method, a computer device, and a storage medium. The trained model can identify the poverty degree information to which student data belong. During training, the weight set of the feature data used to train the next neural network in the order is adjusted according to the training result of the previous neural network, so that each network's training result is passed on to the training of the next, and the resulting artificial intelligence model has strong classification performance. The invention is widely applicable in the field of computer technology.

Description

Model training method, poverty degree information identification method, device and storage medium
Technical Field
The invention relates to the technical field of computers, in particular to an artificial intelligence model training method, a poverty degree information identification method, a computer device and a storage medium.
Background
In fields such as social poverty alleviation and campus poverty alleviation, there is a need to identify, from a population, the individuals who meet poverty-assistance criteria. In the prior art, each individual's information is gathered by manual investigation, and whether the individual is poor is judged against subjective standards. As a result, the prior art identifies poor individuals inefficiently, and its strong subjectivity leaves it vulnerable to cheating, making the identification results inaccurate.
Disclosure of Invention
In view of at least one of the above technical problems, it is an object of the present invention to provide an artificial intelligence model training method, a poverty level information identifying method, a computer device, and a storage medium.
In one aspect, an embodiment of the present invention includes an artificial intelligence model training method, including:
acquiring a plurality of neural networks; the neural networks have an order relation;
acquiring a plurality of feature data and tag values corresponding to the feature data; the feature data are extracted from student data, and each tag value comprises information describing the poverty degree of the corresponding student;
determining a set of weights; the elements of the weight set are weights corresponding to the characteristic data respectively;
executing a multi-round training test process until a termination condition is met; in a round of training and testing process, taking the feature data and the weight set as input, taking a label value corresponding to the feature data as expected output, sequentially training each neural network according to an order relation, and testing each neural network after completing the training of each neural network, wherein after completing the training of the neural network in a previous order, the weight set is updated according to the incorrect classification condition of the neural network in the previous order, and the updated weight set is taken as the input of the neural network in a next order;
determining respective corresponding weights of each of the neural networks;
and combining the neural networks into the artificial intelligence model according to the corresponding weights.
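The sequential scheme above resembles boosting. A minimal sketch in Python (the interfaces `fit`, `predict`, and `reweight` are hypothetical illustrations; the patent does not prescribe an implementation):

```python
def train_sequentially(networks, features, labels, weights, reweight):
    """Train each network in its order; after each one finishes, update the
    sample weight set and pass it on to the next network (steps P4-P5).
    All interfaces here are illustrative, not from the patent."""
    for net in networks:                         # order relation = list order
        net.fit(features, labels, weights)       # weighted training
        preds = [net.predict(x) for x in features]
        weights = reweight(weights, preds, labels)  # pass result forward
    return weights
```

The returned weight set reflects which samples the whole sequence of networks found hard to classify.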
Further, the combining the neural networks into the artificial intelligence model according to the corresponding weights includes:
taking the output of each neural network as the input of the artificial intelligence model,
determining a weighted sum of the outputs of each of the neural networks with the corresponding weight of each of the neural networks;
and taking the weighted sum as the output of the artificial intelligence model.
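The combination step can be sketched as a weighted sum of per-network class-score vectors (illustrative names, assuming plain Python lists):

```python
def ensemble_output(network_outputs, network_weights):
    """Weighted sum of the T networks' outputs (step P6).

    network_outputs: list of T class-score vectors, one per network.
    network_weights: list of T scalar weights, one per network.
    """
    n_classes = len(network_outputs[0])
    combined = [0.0] * n_classes
    for scores, w in zip(network_outputs, network_weights):
        for i, s in enumerate(scores):
            combined[i] += w * s       # each output adjusted by its weight
    return combined
```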
Further, the executing the multiple rounds of training test processes until the termination condition is met includes:
after executing a round of the training test process, when the termination condition is determined to be met, terminating the execution of the training test process; and when the termination condition is determined not to be met, performing parameter adjustment, neuron number adjustment and/or hidden layer adjustment on part or all of the neural network, and executing the next round of training test process.
Further, the termination condition includes:
the average value of the test errors of the cross validation of the artificial intelligence model is less than or equal to the expected threshold value.
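Assuming the average is an unweighted mean over the cross-validation folds (the patent does not give the exact formula), the termination check might look like:

```python
def should_terminate(cv_test_errors, threshold):
    """True when the mean cross-validation test error is at or below the
    expected threshold; the unweighted mean is an assumption here."""
    return sum(cv_test_errors) / len(cv_test_errors) <= threshold
```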
Further, said updating said set of weights according to said previously ordered classification incorrect condition of said neural network comprises:
when the neural network of the current sequence classifies the feature data incorrectly, the weight corresponding to the feature data is increased;
and when the neural network in the current sequence correctly classifies the feature data, the weight corresponding to the feature data is reduced.
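The update direction is specified above, but the exact magnitude is not; the multiplicative factor below is an assumption for illustration:

```python
def update_weights(weights, predictions, labels, factor=2.0):
    """Increase the weight of misclassified samples and decrease the weight
    of correctly classified ones. The factor of 2 is illustrative only;
    the patent specifies the direction of the update, not its size."""
    return [w * factor if pred != label else w / factor
            for w, pred, label in zip(weights, predictions, labels)]
```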
Further, the artificial intelligence model training method further comprises the following steps:
performing integration, cleaning, standardization, word embedding, and oversampling on the feature data;
each of the tag values is one-hot encoded.
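One-hot encoding of a tag value can be sketched as follows (the poverty-level names in the example are hypothetical):

```python
def one_hot(label, classes):
    """One-hot encode one tag value against an ordered list of classes."""
    return [1 if c == label else 0 for c in classes]
```

For instance, with hypothetical levels `["none", "moderate", "severe"]`, the tag `"moderate"` encodes to the second position.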
Further, the obtaining a plurality of neural networks includes:
establishing an input layer;
establishing a plurality of hidden layers; in each hidden layer, the number of neurons is 100 and the activation function is the ELU function; a dropout layer is added after each hidden layer, with an initial dropout probability of 0.01;
establishing an output layer; in the output layer, the number of neurons equals the number of sample categories, and the activation function is the Softmax function;
building the neural network with the input layer, hidden layer, and output layer;
setting weight decay, with the cross-entropy loss function as the loss function of the neural network and the weight-decay hyperparameter initially set to 3; updating parameters by stochastic gradient descent and optimizing with the momentum method, where the momentum hyperparameter is initially set to 0.5, the learning rate to 0.002, the batch size to 400, and the number of training passes to 10.
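The stated activation functions and optimizer settings can be illustrated with a small sketch (not the patent's implementation; only the standard `math` module is used, and the parameter defaults mirror the initial values given above):

```python
import math

def elu(x, alpha=1.0):
    """ELU activation used in the hidden layers."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def softmax(scores):
    """Softmax activation used in the output layer."""
    m = max(scores)                        # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sgd_momentum_step(param, grad, velocity, lr=0.002, momentum=0.5):
    """One SGD-with-momentum update for a scalar parameter, using the
    initial learning rate and momentum hyperparameter stated above."""
    v = momentum * velocity - lr * grad
    return param + v, v
```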
On the other hand, the embodiment of the invention also comprises a poverty degree information identification method, which comprises the following steps:
acquiring student data;
inputting the student data into a trained artificial intelligence model;
acquiring an output result of the artificial intelligence model;
and determining poverty degree information corresponding to the individual student according to the output result of the artificial intelligence model.
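The steps above can be sketched as a small pipeline (hypothetical names; `model` stands in for the trained artificial intelligence model, assumed to return a class-score vector):

```python
def identify_poverty_level(student_record, model, level_names):
    """Run one student record through the trained model and map the
    highest-scoring class to a poverty-level name (steps S2-S4).
    The level names are hypothetical illustrations."""
    scores = model(student_record)                     # model output
    best = max(range(len(scores)), key=scores.__getitem__)
    return level_names[best]                           # poverty degree info
```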
In another aspect, embodiments of the present invention further include a computer apparatus, including a memory and a processor, where the memory is configured to store at least one program, and the processor is configured to load the at least one program to perform the artificial intelligence model training method and/or the poverty degree information identification method according to the embodiments.
In another aspect, the present invention further includes a storage medium in which a processor-executable program is stored, where the processor-executable program, when executed by a processor, is configured to perform the artificial intelligence model training method and/or the poverty level information identification method according to the embodiments.
The invention has the beneficial effects that: in the artificial intelligence model training method of the embodiments, training gives each neural network the capability to identify the poverty degree information to which a student individual belongs. During training, the weight set of the feature data used to train the next neural network in the order is adjusted according to the training result of the previous neural network, so that each network's training result is passed on to the training of the next. The resulting artificial intelligence model therefore has strong classification performance, and even when trained with feature data of poor quality, its performance does not degrade in proportion.
The poverty degree information identification method of the embodiments uses the artificial intelligence model obtained by the training method described above; its strong classification performance allows the poverty degree information corresponding to a student individual to be identified, and thus the student's poverty level to be determined. Because part of the manual work is eliminated, recognition efficiency improves and the possibility of cheating decreases, while the model's high classification performance provides high recognition accuracy.
Drawings
FIG. 1 is a flow chart of an artificial intelligence model training method in an embodiment;
FIGS. 2 and 3 are schematic diagrams of artificial intelligence model training methods in an embodiment;
fig. 4 is a flowchart of the poverty level information identifying method in the embodiment.
Detailed Description
In this embodiment, the training-tested artificial intelligence model is used to identify the poverty degree information. Before the artificial intelligence model is used, the artificial intelligence model is trained. Referring to fig. 1, training an artificial intelligence model mainly includes the following steps:
p1, acquiring a plurality of neural networks; the neural networks have an order relation;
p2, acquiring a plurality of feature data and tag values corresponding to the feature data; the feature data are extracted from student data, and the tag value comprises information describing the poverty degree corresponding to the student;
p3, determining a weight set; the elements of the weight set are weights corresponding to the characteristic data respectively;
p4, executing a multi-round training test process until a termination condition is met; in the process of one round of training and testing, taking characteristic data and a weight set as input, taking a label value corresponding to the characteristic data as expected output, sequentially training each neural network according to an order relation, and testing each neural network after finishing the training of each neural network, wherein after finishing the training of the neural network in the previous order, the weight set is updated according to the incorrect classification condition of the neural network in the previous order, and the updated weight set is used as the input of the neural network in the next order;
p5. determining the weights corresponding to the neural networks;
and P6, combining the neural networks into an artificial intelligence model according to corresponding weights.
In this embodiment, the structure of the artificial intelligence model to be trained and the principle of training it are shown in fig. 2. The artificial intelligence model includes T neural networks with an order relation among them; in this embodiment, the order of the neural networks is indicated by their sequence numbers. The T neural networks are combined with weight 1, weight 2, ..., weight T respectively; that is, the sum of each neural network's output, adjusted by its corresponding weight, is used as the output of the artificial intelligence model.
In step P1, the neural network is obtained by:
p101, establishing an input layer;
p102, establishing a plurality of hidden layers;
p103, establishing an output layer; p104, building a neural network by using an input layer, a hidden layer and an output layer;
and P105, setting weight attenuation by taking a cross entropy loss function as a loss function of the neural network.
In this embodiment, one neural network is obtained by performing steps P101 to P105 once, and a plurality of neural networks are obtained by performing steps P101 to P105 multiple times. The parameters of each neural network are: in each hidden layer, the number of neurons is 100 and the activation function is the ELU function; a dropout layer follows each hidden layer, with an initial dropout probability of 0.01; in the output layer, the number of neurons equals the number of sample categories, and the activation function is the Softmax function; the weight-decay hyperparameter is initially set to 3; parameters are updated by stochastic gradient descent and optimized with the momentum method, where the momentum hyperparameter is initially set to 0.5, the learning rate to 0.002, the batch size to 400, and the number of training passes to 10.
In this embodiment, the acquired student data is the behavior characteristics of the student over a period of time. For example, when the artificial intelligence model is required to be applied to identification of poor students, the acquired student data may be campus card consumption data, score data, dormitory entrance guard data, library entrance guard data, book borrowing data and other basic information of the students in the past 1 year.
In this embodiment, by performing feature extraction on the student data, the following features can be extracted:
(1) features are built for campus card consumption data, including but not limited to:
a. The total consumption amount, the maximum consumption amount, and the number of consumption transactions, on non-holidays and holidays, for each of 12 consumption categories such as hot-water rooms and electricity.
b. The average number of breakfasts and lunches eaten on holidays and non-holidays.
(2) Construct features for the performance data, including but not limited to:
a. The student's overall score ranking.
b. The student's score ranking within each gender.
(3) Construct features for dormitory access control data, including but not limited to:
a. The number of times of entering or exiting the dormitory in the morning, at noon, and in the evening on non-holidays and holidays.
b. The average earliest time of leaving the dormitory and the average latest time of returning to it, per day, on non-holidays and holidays.
c. The number of days and the length of stay in the dormitory on non-holidays and holidays.
(4) Features are built for library access data, including but not limited to:
a. The number of times of entering and exiting the library on non-holidays and holidays.
b. The number of days of entering or exiting the library on non-holidays and holidays.
c. The length of stay in the library on non-holidays and holidays.
(5) Features are built for the book borrowing data, including but not limited to:
a. The number of borrowings and the borrowing duration, on non-holidays and holidays, for each of 10 categories of books such as examination preparation, programming, and mathematics.
(6) Features are built for student basic information, including but not limited to:
a. The student's gender, native place, college entrance examination score ranking, household registration type, hobbies, height, weight, and age.
In this embodiment, the real financial status of each individual may be determined through investigation, and the individual's tag value determined accordingly; the tag value includes information describing the poverty degree corresponding to the student data, such as the student's poverty level.
After the student data and the tag values are obtained, the student data can be integrated, cleaned, standardized, word embedded and oversampled, and the tag values are subjected to one-hot encoding, so that the student data and the tag values are preprocessed.
In step P3, referring to fig. 2, initial weights X11, X12, ..., X1n are set for the feature data to be input to neural network 1.
In this embodiment, the principle of step P4 is shown in fig. 3. In step P4, a k-round training test procedure is performed. After one round of training test process is executed, whether a termination condition is met or not is detected, if the termination condition is met, the training test process is terminated, if the termination condition is not met, parameter adjustment, neuron number adjustment and/or hidden layer adjustment are/is carried out on part or all of the neural networks in the artificial intelligent model, and the next round of training test process is executed after adjustment until the termination condition is met.
Referring to fig. 3, in one round of training and testing, neural network 1 is trained first: the feature data and the weight set serve as its input, and the tag values corresponding to the feature data serve as its expected output. After neural network 1 is trained, the weight set is updated according to its incorrect classifications: if neural network 1 classified a piece of feature data incorrectly, the weight corresponding to that feature data is increased; if it classified a piece of feature data correctly, the corresponding weight is decreased. In this way, the training samples misclassified by neural network 1 receive more attention in the subsequent training of neural network 2, while correctly classified samples receive correspondingly less attention, saving computing resources. After the training of neural network 1 is completed, the weight set is updated to X21, X22, ..., X2n. The feature data and the updated weight set are then used as the input of neural network 2, with the corresponding tag values as its expected output, to train neural network 2. After neural network 2 is trained, the weight set is updated again and neural network 3 is trained, and so on until all the neural networks are trained. After one round of training of the T neural networks is completed, the networks are tested using the feature data that were not used for training.
In this embodiment, after a round of the training and testing process is performed, the training error and test error of each neural network are calculated. Then, referring to fig. 2, the training error of neural network 1 is weighted by weight 1, the training error of neural network 2 by weight 2, ..., and the training error of neural network T by weight T, and the average training error is determined; the test errors are weighted in the same way to determine the average test error. In this embodiment, the test error is computed as the Macro-F1 score. A total of k rounds of the training and testing process are completed. When the average cross-validation test error of the artificial intelligence model is less than or equal to the expected threshold, the training and testing process is no longer executed.
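The Macro-F1 score mentioned above is the unweighted mean of per-class F1 scores; a minimal sketch:

```python
def macro_f1(y_true, y_pred, classes):
    """Macro-averaged F1: compute precision, recall, and F1 for each class,
    then take the unweighted mean across classes."""
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)
```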
In this embodiment, the output of each neural network is used as the input of the artificial intelligence model, the sum of the outputs of each neural network weighted according to the corresponding weight is used as the output of the artificial intelligence model, and the whole formed by each neural network becomes the artificial intelligence model.
In this embodiment, training gives each neural network the capability to identify the poverty degree information of the individual to which student data belong. During training, the weight set of the feature data used to train the next neural network in the order is adjusted according to the training result of the previous neural network, so that each network's training result is passed on to the training of the next. The resulting artificial intelligence model has strong classification performance, and even when trained with feature data of poor quality, its performance does not degrade in proportion. In particular, the structure and parameter settings of each neural network, such as the dropout layers and weight decay, effectively alleviate the overfitting problem.
In this embodiment, after the training process of the artificial intelligence model is completed, the poverty degree information identification method may be executed using the artificial intelligence model. Referring to fig. 4, the poverty degree information recognition method includes the steps of:
s1, acquiring student data;
s2, inputting student data into an artificial intelligence model;
s3, obtaining an output result of the artificial intelligence model;
and S4, determining poverty degree information corresponding to the individual students according to the output result of the artificial intelligence model.
In this embodiment, the student data in S1 are of the same type and structure as the student data in P2. For example, if the student data in P2 are the students' campus card consumption data, score data, dormitory access control data, library access control data, book borrowing data, and other basic information over the past year, then the student data in S1 may have the same structure but different values.
By using the artificial intelligence model obtained through training by the training method, the poverty degree information corresponding to the student data can be identified by using the strong classification performance of the artificial intelligence model, so that the poverty degree grade of the individual corresponding to the student data is determined. In the process, the process of partial manual participation is avoided, so that the recognition efficiency can be improved, the possibility of cheating is reduced, and the high classification performance of the artificial intelligence model can provide high recognition accuracy.
In this embodiment, a computer apparatus includes a memory and a processor, where the memory is used to store at least one program, and the processor is used to load the at least one program to perform the artificial intelligence model training method in the embodiment, so as to achieve the same technical effects as those described in the embodiment.
In this embodiment, a storage medium stores therein a program executable by a processor, and the program executable by the processor is used to execute the artificial intelligence model training method in the embodiment, thereby achieving the same technical effects as those described in the embodiment.
It should be noted that, unless otherwise specified, when a feature is referred to as being "fixed" or "connected" to another feature, it may be directly fixed or connected to the other feature or indirectly fixed or connected to the other feature. Furthermore, the descriptions of upper, lower, left, right, etc. used in the present disclosure are only relative to the mutual positional relationship of the constituent parts of the present disclosure in the drawings. As used in this disclosure, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. In addition, unless defined otherwise, all technical and scientific terms used in this example have the same meaning as commonly understood by one of ordinary skill in the art. The terminology used in the description of the embodiments herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this embodiment, the term "and/or" includes any combination of one or more of the associated listed items.
It will be understood that, although the terms first, second, third, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element of the same type from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the present disclosure. The use of any and all examples, or exemplary language ("e.g.," such as "or the like") provided with this embodiment is intended merely to better illuminate embodiments of the invention and does not pose a limitation on the scope of the invention unless otherwise claimed.
It should be recognized that embodiments of the present invention can be realized and implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The methods may be implemented in a computer program using standard programming techniques, including a non-transitory computer-readable storage medium configured with the computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner, according to the methods and figures described in the detailed description. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.
Further, operations of processes described in this embodiment can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described in this embodiment (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) collectively executed on one or more processors, by hardware, or combinations thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the method may be implemented in any type of computing platform operatively connected to a suitable interface, including but not limited to a personal computer, mini computer, mainframe, workstation, networked or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and the like. Aspects of the invention may be embodied in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optically read and/or write storage medium, RAM, ROM, or the like, such that it may be read by a programmable computer, which when read by the storage medium or device, is operative to configure and operate the computer to perform the procedures described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention described in this embodiment includes these and other different types of non-transitory computer-readable storage media when such media include instructions or programs that implement the steps described above in conjunction with a microprocessor or other data processor. The invention also includes the computer itself when programmed according to the methods and techniques described herein.
A computer program can be applied to input data to perform the functions described in the present embodiment to convert the input data to generate output data that is stored to a non-volatile memory. The output information may also be applied to one or more output devices, such as a display. In a preferred embodiment of the invention, the transformed data represents physical and tangible objects, including particular visual depictions of physical and tangible objects produced on a display.
The above description covers only preferred embodiments of the present invention, and the present invention is not limited to the above embodiments. Any modification, equivalent substitution, or improvement made within the spirit and principle of the present invention falls within the protection scope of the present invention, as long as the technical effects of the present invention are achieved by the same means. The technical solution and/or its implementation may be modified and varied in other ways within the protection scope of the invention.

Claims (10)

1. An artificial intelligence model training method, comprising:
acquiring a plurality of neural networks; the neural networks have an order relation;
acquiring a plurality of feature data and label values corresponding to the feature data, the feature data being extracted from student data, wherein the label values comprise information describing the poverty degree corresponding to the student data;
determining a weight set; the elements of the weight set are the weights respectively corresponding to the feature data;
executing multiple rounds of a training test process until a termination condition is met; in one round of the training test process, taking the feature data and the weight set as input and the label values corresponding to the feature data as expected output, training each neural network in turn according to the order relation, and testing each neural network after its training is completed, wherein after the training of the neural network in the previous order is completed, the weight set is updated according to the incorrect classification condition of the neural network in the previous order, and the updated weight set is used as the input of the neural network in the next order;
determining respective corresponding weights of each of the neural networks;
and combining the neural networks into the artificial intelligence model according to the corresponding weights.
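The training steps of claim 1 follow the familiar AdaBoost pattern: networks are trained in their order relation, sample weights are raised wherever the previous network erred, and the trained networks are finally combined with per-network weights. The loop can be sketched as below; the single-softmax-layer "network" and the SAMME-style error/weight formulas are illustrative assumptions, since the claim fixes neither the network internals nor the exact update formula.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

class TinyNet:
    """Stand-in for one neural network of the claim: a single softmax layer
    trained with sample-weighted cross-entropy (not the patent's full
    multi-hidden-layer architecture)."""
    def __init__(self, n_in, n_out, seed):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.1, (n_in, n_out))
        self.b = np.zeros(n_out)

    def fit(self, X, y_onehot, sample_w, lr=0.5, epochs=200):
        for _ in range(epochs):
            p = softmax(X @ self.W + self.b)
            g = (p - y_onehot) * sample_w[:, None]  # weighted cross-entropy gradient
            self.W -= lr * (X.T @ g)
            self.b -= lr * g.sum(axis=0)
        return self

    def predict(self, X):
        return softmax(X @ self.W + self.b).argmax(axis=1)

def train_ensemble(X, y, n_classes, n_nets=3):
    n = len(X)
    w = np.full(n, 1.0 / n)              # the "weight set": one weight per sample
    y_onehot = np.eye(n_classes)[y]
    nets, alphas = [], []
    for i in range(n_nets):              # train the networks in their order relation
        net = TinyNet(X.shape[1], n_classes, seed=i).fit(X, y_onehot, w)
        miss = net.predict(X) != y
        err = np.clip(w[miss].sum(), 1e-10, 1.0 - 1e-10)
        alpha = 0.5 * np.log((1.0 - err) / err)        # this network's combination weight
        w = w * np.exp(np.where(miss, alpha, -alpha))  # boost misclassified samples
        w = w / w.sum()                                # updated set feeds the next network
        nets.append(net)
        alphas.append(alpha)
    return nets, alphas, w

# toy demo with two separable clusters standing in for student feature data
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(-1.0, 0.5, (30, 2)), rng.normal(1.0, 0.5, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
nets, alphas, w = train_ensemble(X, y, n_classes=2)
```

The returned `alphas` play the role of the "respective corresponding weights of each of the neural networks" in the final combination step.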
2. The artificial intelligence model training method according to claim 1, wherein the combining the neural networks into the artificial intelligence model according to the corresponding weights comprises:
taking the output of each neural network as the input of the artificial intelligence model,
determining a weighted sum of the outputs of each of the neural networks with the corresponding weight of each of the neural networks;
and taking the weighted sum as the output of the artificial intelligence model.
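The combination step of claim 2 reduces to a weighted sum of the networks' outputs. A minimal sketch, assuming each network emits one row of class probabilities per sample:

```python
import numpy as np

def ensemble_output(outputs, net_weights):
    """Weighted sum of per-network class-probability outputs; the class with
    the largest combined score is the ensemble model's prediction."""
    combined = sum(a * o for a, o in zip(net_weights, outputs))
    return combined, combined.argmax(axis=1)

# hypothetical outputs of three networks for two samples over three classes
outs = [np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]),
        np.array([[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]]),
        np.array([[0.1, 0.1, 0.8], [0.1, 0.7, 0.2]])]
combined, labels = ensemble_output(outs, [0.5, 0.3, 0.2])
```

Because the network weights here sum to 1 and each output row is a probability distribution, each combined row also sums to 1.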
3. The artificial intelligence model training method of claim 1, wherein the executing multiple rounds of the training test process until a termination condition is met comprises:
after executing one round of the training test process, terminating execution of the training test process when the termination condition is determined to be met; and when the termination condition is determined not to be met, adjusting the parameters, the number of neurons, and/or the hidden layers of some or all of the neural networks, and then executing the next round of the training test process.
4. The artificial intelligence model training method of any one of claims 1-3, wherein the termination condition comprises:
the average value of the test errors of the cross validation of the artificial intelligence model is less than or equal to the expected threshold value.
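The termination test of claim 4 compares the mean cross-validation test error against an expected threshold. The check can be sketched as follows; the fold count and the `train_and_eval` callback are illustrative assumptions, as the claim does not fix the cross-validation procedure:

```python
import numpy as np

def cv_mean_error(n_samples, k, train_and_eval, seed=0):
    """k-fold cross-validation: returns the mean test error over the k folds.
    train_and_eval(train_idx, test_idx) is assumed to train the model on the
    training fold and return its error rate on the held-out test fold."""
    folds = np.array_split(np.random.default_rng(seed).permutation(n_samples), k)
    errors = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        errors.append(train_and_eval(train_idx, test_idx))
    return float(np.mean(errors))

def termination_met(mean_error, expected_threshold):
    return mean_error <= expected_threshold

# demo with a stub evaluator returning fixed per-fold errors
fold_errors = iter([0.10, 0.20, 0.15])
mean_err = cv_mean_error(9, 3, lambda tr, te: next(fold_errors))
```

With these fold errors the mean is 0.15, so training would stop for a threshold of 0.2 but continue for a threshold of 0.1.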
5. The artificial intelligence model training method of claim 1, wherein the updating the weight set according to the incorrect classification condition of the neural network in the previous order comprises:
when the neural network in the current order classifies the feature data incorrectly, increasing the weight corresponding to the feature data;
and when the neural network in the current order classifies the feature data correctly, decreasing the weight corresponding to the feature data.
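The update rule of claim 5 can be sketched with a symmetric scaling factor; the factor of 2 is a hypothetical choice, since the claim fixes only the direction of the change, not its magnitude:

```python
import numpy as np

def update_weight_set(w, misclassified, factor=2.0):
    """Raise the weights of feature data the current network classified
    incorrectly, lower the weights of correctly classified data, then
    renormalize so the weight set still sums to 1."""
    w = np.where(misclassified, w * factor, w / factor)
    return w / w.sum()

w0 = np.full(4, 0.25)                          # uniform initial weight set
w1 = update_weight_set(w0, np.array([True, False, False, False]))
```

Here the one misclassified sample's weight rises from 0.25 to 4/7 while the three correctly classified samples drop to 1/7 each, so the next network in the order concentrates on the hard sample.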
6. The artificial intelligence model training method of claim 1, wherein the artificial intelligence model training method further comprises:
integrating, cleaning, standardizing, word-embedding and oversampling the feature data;
and one-hot encoding each of the label values.
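Two of the preprocessing steps of claim 6, one-hot encoding and oversampling, can be sketched as below. Random duplication of minority-class rows up to the majority count is one common oversampling choice; the claim does not fix the algorithm.

```python
import numpy as np

def one_hot(labels, n_classes):
    """One-hot encode the integer label values."""
    return np.eye(n_classes, dtype=int)[labels]

def oversample(X, y, seed=0):
    """Random oversampling: duplicate minority-class rows until every class
    matches the majority class count."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for c, n in zip(classes, counts):
        if n < target:
            extra = rng.choice(np.flatnonzero(y == c), size=target - n, replace=True)
            X_parts.append(X[extra])
            y_parts.append(y[extra])
    return np.vstack(X_parts), np.concatenate(y_parts)

# demo: 3 samples of class 0 and 2 of class 1 become 3 and 3
X = np.arange(10, dtype=float).reshape(5, 2)
y = np.array([0, 0, 0, 1, 1])
X_bal, y_bal = oversample(X, y)
encoded = one_hot(y_bal, 2)
```

Balancing the classes before training matters here because poor students are typically a small minority of the student data, which would otherwise bias the cross-entropy loss toward the majority class.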
7. The artificial intelligence model training method of claim 1, wherein the obtaining a plurality of neural networks comprises:
establishing an input layer;
establishing a plurality of hidden layers; in each hidden layer, the number of neurons is 100 and the activation function is an ELU function; a dropout layer is added after each hidden layer, and the initial value of the dropout probability of the dropout layer is 0.01;
establishing an output layer; in each output layer, the number of neurons is the number of sample categories, and the activation function is a Softmax function;
building the neural network with the input layer, hidden layer, and output layer;
setting the cross entropy loss function as the loss function of the neural network and applying weight decay, wherein the initial value of the weight decay hyperparameter is set to 3; and updating parameters by stochastic gradient descent and optimizing by the momentum method, wherein the momentum hyperparameter is initially set to 0.5, the learning rate is initially set to 0.002, the batch size is initially set to 400, and the number of training passes is initially set to 10.
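The forward topology of claim 7, ELU hidden layers of 100 neurons each followed by a dropout ("discarding") layer with probability 0.01 and a softmax output layer, can be sketched framework-free in NumPy. Inverted dropout is an implementation assumption, and the training hyperparameters of the claim (SGD with momentum 0.5, learning rate 0.002, batch size 400, weight decay) are not exercised by this forward pass.

```python
import numpy as np

def elu(x, alpha=1.0):
    """ELU activation used in each hidden layer."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def softmax(z):
    """Softmax activation of the output layer."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward(X, layers, drop_p=0.01, train=True, seed=0):
    """One pass through the claimed topology: every layer except the last is
    an ELU hidden layer followed by (inverted) dropout; the last is softmax."""
    rng = np.random.default_rng(seed)
    h = X
    for W, b in layers[:-1]:
        h = elu(h @ W + b)
        if train:
            mask = rng.random(h.shape) >= drop_p
            h = h * mask / (1.0 - drop_p)   # inverted dropout, keep-prob scaling
    W, b = layers[-1]
    return softmax(h @ W + b)

# toy instance: 6 input features, two hidden layers of 100 neurons, 3 classes
rng = np.random.default_rng(1)
sizes = [6, 100, 100, 3]
layers = [(rng.normal(0, 0.1, (a, b)), np.zeros(b)) for a, b in zip(sizes, sizes[1:])]
probs = forward(rng.normal(size=(4, 6)), layers)
```

The output-layer width equals the number of sample categories, matching the claim; each output row is a probability distribution over those categories.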
8. A poverty degree information identification method is characterized by comprising the following steps:
acquiring student data;
inputting the student data into an artificial intelligence model; the artificial intelligence model is trained by the artificial intelligence model training method of any one of claims 1-7;
acquiring an output result of the artificial intelligence model;
and determining poverty degree information corresponding to the individual student according to the output result of the artificial intelligence model.
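The final step of claim 8 maps the model's output back to poverty degree information for the student. A minimal sketch with hypothetical category names, since the patent does not enumerate the degrees:

```python
import numpy as np

# hypothetical degree names; the patent only states that the output
# identifies a poverty degree per student
POVERTY_DEGREES = ["not poor", "moderately poor", "extremely poor"]

def identify_poverty_degree(model_output):
    """Map the artificial intelligence model's output (class probabilities
    for one student) to the corresponding poverty degree information."""
    idx = int(np.argmax(model_output))
    return POVERTY_DEGREES[idx], float(model_output[idx])

degree, confidence = identify_poverty_degree(np.array([0.1, 0.7, 0.2]))
```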
9. A computer apparatus comprising a memory for storing at least one program and a processor for loading the at least one program to perform the artificial intelligence model training method of any one of claims 1-7 and/or the poverty level information recognition method of claim 8.
10. A storage medium in which a processor-executable program is stored, wherein the processor-executable program, when executed by a processor, is configured to perform the artificial intelligence model training method of any one of claims 1 to 7 and/or the poverty level information recognition method of claim 8.
CN202011540120.2A 2020-12-23 2020-12-23 Model training method, lean degree information identification method, device and storage medium Active CN112541579B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011540120.2A CN112541579B (en) 2020-12-23 2020-12-23 Model training method, lean degree information identification method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011540120.2A CN112541579B (en) 2020-12-23 2020-12-23 Model training method, lean degree information identification method, device and storage medium

Publications (2)

Publication Number Publication Date
CN112541579A true CN112541579A (en) 2021-03-23
CN112541579B CN112541579B (en) 2023-08-08

Family

ID=75017740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011540120.2A Active CN112541579B (en) 2020-12-23 2020-12-23 Model training method, lean degree information identification method, device and storage medium

Country Status (1)

Country Link
CN (1) CN112541579B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7035287B1 (en) * 2021-06-23 2022-03-15 中国科学院西北生態環境資源研究院 Analytical method of poverty causes in poor areas based on multi-source data
WO2022259089A1 (en) * 2021-06-07 2022-12-15 International Business Machines Corporation Bias reduction during artifical intelligence module training

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951568A (en) * 2017-04-07 2017-07-14 中南大学 Student's poverty Forecasting Methodology based on data mining
CN107480774A (en) * 2017-08-11 2017-12-15 山东师范大学 Dynamic neural network model training method and device based on integrated study
CN108960273A (en) * 2018-05-03 2018-12-07 淮阴工学院 A kind of poor student's identification based on deep learning
CN109472345A (en) * 2018-09-28 2019-03-15 深圳百诺名医汇网络技术有限公司 A kind of weight update method, device, computer equipment and storage medium
CN110097142A (en) * 2019-05-15 2019-08-06 杭州华网信息技术有限公司 Poor student's prediction technique of behavior is serialized for student
CN110516861A (en) * 2019-08-19 2019-11-29 北京桃花岛信息技术有限公司 A kind of student's poverty degree prediction technique based on topological extension network model
CN111415099A (en) * 2020-03-30 2020-07-14 西北大学 Poverty identification method based on multi-classification BP-Adaboost
CN111598182A (en) * 2020-05-22 2020-08-28 北京市商汤科技开发有限公司 Method, apparatus, device and medium for training neural network and image recognition
US20200387789A1 (en) * 2019-06-06 2020-12-10 Riskfuel Analytics Inc. Neural network training
CN112102135A (en) * 2020-09-10 2020-12-18 重庆商务职业学院 Precise subsidy model for college poverty based on LSTM neural network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHEN Zhangquan, KONG Fansheng: "Research on dynamic-weight neural network ensemble based on individual selection", Computer Engineering and Applications *
CHEN Wenyu; LIU Jingbo; SUN Shixin: "Neural network ensemble method based on analytic hierarchy process", Journal of University of Electronic Science and Technology of China *


Also Published As

Publication number Publication date
CN112541579B (en) 2023-08-08

Similar Documents

Publication Publication Date Title
CN111444952B (en) Sample recognition model generation method, device, computer equipment and storage medium
CN109345260A (en) A kind of fraud detection model training method and device and fraud detection method and device
CN112541458B (en) Domain self-adaptive face recognition method, system and device based on meta learning
CN111401433A (en) User information acquisition method and device, electronic equipment and storage medium
CN111444951B (en) Sample recognition model generation method, device, computer equipment and storage medium
CN110826453A (en) Behavior identification method by extracting coordinates of human body joint points
CN112541579A (en) Model training method, poverty degree information identification method, device and storage medium
CN109903053B (en) Anti-fraud method for behavior recognition based on sensor data
CN113761259A (en) Image processing method and device and computer equipment
CN112560948B (en) Fundus image classification method and imaging method under data deviation
Domashova et al. Selecting an optimal architecture of neural network using genetic algorithm
CN111950622A (en) Behavior prediction method, behavior prediction device, behavior prediction terminal and storage medium based on artificial intelligence
CN112801805A (en) Medical insurance small card fraud detection method and system based on deep self-supervision neural network
Al Sheikh et al. Developing and implementing a barcode based student attendance system
CN111709790B (en) Method, device, equipment and storage medium for identifying abnormal electricity price of day-ahead market
CN111881822A (en) Access control method, device, equipment and storage medium based on face recognition
CN113343106A (en) Intelligent student recommendation method and system
CN116340793A (en) Data processing method, device, equipment and readable storage medium
Chrol-Cannon et al. Learning structure of sensory inputs with synaptic plasticity leads to interference
CN115760127A (en) Transaction fraud detection method and system based on rule attention mechanism
Jeny et al. PassNet-Country Identification by Classifying Passport Cover Using Deep Convolutional Neural Networks
Ekong et al. A Machine Learning Approach for Prediction of Students’ Admissibility for Post-Secondary Education using Artificial Neural Network
Madhuravania Prediction Exploration for Coronary Heart Disease Aid of Machine Learning B. Madhuravania, Divya Priya Degalab, M. Anjaneyuluc and B. Dhanalaxmi
CN113704637A (en) Object recommendation method, device and storage medium based on artificial intelligence
CN112685654B (en) Student identification method and device, computing equipment and readable computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant