CN112541579B

CN112541579B - Model training method, lean degree information identification method, device and storage medium

Info

Publication number: CN112541579B
Application number: CN202011540120.2A
Authority: CN
Inventors: 奚宇航; 蔡庆秋
Original assignee: Beijing Beiming Digital Technology Co ltd
Current assignee: Beijing Beiming Digital Technology Co ltd
Priority date: 2020-12-23
Filing date: 2020-12-23
Publication date: 2023-08-08
Anticipated expiration: 2040-12-23
Also published as: CN112541579A

Abstract

The invention discloses an artificial intelligent model training method, a lean degree information identification method, a computer device and a storage medium. The invention has the capability of identifying the poverty degree information of student data, and the training result of the previous order neural network can be transmitted to the training process of the next order neural network by adjusting the weight set of the characteristic data used for the next order neural network training according to the training result of the previous order neural network, so that the finally obtained artificial intelligent model has strong classification performance. The invention is widely applied to the technical field of computers.

Description

Model training method, lean degree information identification method, device and storage medium

Technical Field

The invention relates to the technical field of computers, in particular to an artificial intelligent model training method, a poverty degree information identification method, a computer device and a storage medium.

Background

In the fields of social and campus poverty relief, there is a need to identify individuals who meet poverty standards from a population. In the prior art, the information of each personnel and individual is investigated manually, and whether the personnel and individual belongs to the poor individuals is determined according to subjective standards, so that the identification efficiency of the poor individuals in the prior art is low, and the identification result of the poor individuals is inaccurate due to the fact that the subjectivity is strong and the interference of cheating factors is easy to occur.

Disclosure of Invention

In view of at least one of the above problems, an object of the present invention is to provide an artificial intelligence model training method, a lean information identifying method, a computer device and a storage medium.

In one aspect, an embodiment of the present invention includes an artificial intelligence model training method, including:

acquiring a plurality of neural networks; the neural networks have order relation;

acquiring a plurality of characteristic data and tag values corresponding to the characteristic data; the characteristic data is extracted from student data, and the tag value comprises lean degree information for describing the students;

determining a weight set; the elements of the weight set are weights respectively corresponding to the characteristic data;

executing a multi-round training test process until a termination condition is met; in the training test process, the characteristic data and the weight set are taken as input, the label value corresponding to the characteristic data is taken as expected output, each neural network is trained in sequence according to the sequence relation, each neural network is tested after training of each neural network is completed, wherein after training of the neural network in the previous sequence is completed, the weight set is updated according to the incorrect classification condition of the neural network in the previous sequence, and the updated weight set is taken as input of the neural network in the subsequent sequence;

determining the weight of each neural network;

and synthesizing the artificial intelligent model by each neural network according to the corresponding weight group.

Further, the synthesizing the artificial intelligence model by each neural network according to the corresponding weight group comprises the following steps:

taking the output of each neural network as the input of the artificial intelligence model,

determining a weighted sum of outputs of the neural networks according to the corresponding weights of the neural networks;

and taking the weighted sum as the output of the artificial intelligence model.

Further, the performing a multi-round training test procedure until a termination condition is met includes:

after executing a round of the training test process, terminating execution of the training test process when the termination condition is determined to be satisfied; and when the termination condition is not met, carrying out parameter adjustment, neuron number adjustment and/or hidden layer adjustment on part or all of the neural network, and executing the next round of training test process.

Further, the termination condition includes:

the mean value of the test errors of the cross validation of the artificial intelligence model is less than or equal to the expected threshold.

Further, the updating the set of weights according to the incorrect classification of the neural network of the previous order includes:

the neural network in the previous order does not classify the characteristic data correctly, and the weight corresponding to the characteristic data is increased;

the neural network in the previous order classifies the characteristic data correctly, and the weight corresponding to the characteristic data is reduced.

Further, the artificial intelligence model training method further comprises the following steps:

integrating, cleaning, standardizing, word embedding and oversampling the characteristic data;

and performing one-time thermal coding on each tag value.

Further, the acquiring a plurality of neural networks includes:

establishing an input layer;

establishing a plurality of hidden layers; in each hidden layer, the number of neurons is 100, the activation function is an ELU function, a discarding layer is added behind each hidden layer, and the initial value of discarding probability of the discarding layer is 0.01;

establishing an output layer; in each output layer, the number of neurons is the number of sample categories, and the activation function is a Softmax function;

constructing the neural network by the input layer, the hidden layer and the output layer;

setting weight attenuation by taking a cross entropy loss function as a loss function of the neural network, setting an initial value of a weight attenuation super parameter to be 3, using a random gradient descent update parameter, using a momentum method as optimization, setting the initial value of the momentum super parameter to be 0.5, setting the initial learning rate to be 0.002, setting the initial batch processing size to be 400, and setting the initial processing number to be 10.

On the other hand, the embodiment of the invention also comprises a lean degree information identification method, which comprises the following steps:

acquiring student data;

inputting the student data into a trained artificial intelligence model;

obtaining an output result of the artificial intelligent model;

and determining the poverty degree information corresponding to the student individuals according to the output result of the artificial intelligence model.

In another aspect, embodiments of the present invention also include a computer apparatus comprising a memory for storing at least one program and a processor for loading the at least one program to perform the artificial intelligence model training method and/or the lean information identification method of the embodiments.

In another aspect, embodiments of the present invention further include a storage medium having stored therein a processor-executable program which, when executed by a processor, is configured to perform the artificial intelligence model training method and/or the lean information identification method of the embodiments.

The beneficial effects of the invention are as follows: according to the artificial intelligent model training method in the embodiment, through training of each neural network, each neural network has the capability of identifying the poor degree information of the student individuals, in the training process of each neural network, the training result of the neural network in the previous order can be transmitted to the training process of the neural network in the subsequent order by adjusting the weight set of the characteristic data used by the training of the neural network in the subsequent order according to the training result of the neural network in the previous order, and the finally obtained artificial intelligent model has strong classification performance, and even if the training is performed by using the characteristic data with poor quality, the training result can not be reduced according to the same proportion.

According to the lean degree information identification method, the artificial intelligent model obtained through training by the training method can be used for identifying the lean degree information corresponding to the student individuals by utilizing the strong classification performance of the artificial intelligent model, so that the lean degree grade of the individuals corresponding to the student individuals is determined. In the process, the identification efficiency can be improved, the possibility of cheating is reduced, and the high classification performance of the artificial intelligent model can provide high identification accuracy.

Drawings

FIG. 1 is a flow chart of an artificial intelligence model training method in an embodiment;

FIGS. 2 and 3 are schematic diagrams of artificial intelligence model training methods in an embodiment;

FIG. 4 is a flowchart of a lean information identifying method according to an embodiment.

Detailed Description

In this embodiment, the trained artificial intelligence model is used to identify the poverty information. Training is performed prior to using the artificial intelligence model. Referring to FIG. 1, training an artificial intelligence model mainly includes the steps of:

p1. obtaining a plurality of neural networks; the neural networks have order relation;

p2, acquiring a plurality of tag values corresponding to the feature data; the characteristic data is extracted from student data, and the tag value comprises lean degree information for describing the students;

p3. determining a weight set; the elements of the weight set are weights corresponding to the feature data respectively;

p4, executing a multi-round training test process until the termination condition is met; in a round of training test process, characteristic data and weight sets are used as input, label values corresponding to the characteristic data are used as expected output, each neural network is trained sequentially according to sequence relation, each neural network is tested after training of each neural network is completed, wherein after training of a neural network in a previous sequence is completed, the weight sets are updated according to incorrect classification of the neural network in the previous sequence, and the updated weight sets are used as input of the neural network in a next sequence;

p5. determining the respective weights of the neural networks;

and P6, combining the artificial intelligent models by using the neural networks according to the corresponding weights.

In this embodiment, the structure of the artificial intelligence model to be trained and the principle of training the artificial intelligence model are shown in fig. 2. The artificial intelligence model includes T neural networks having an order relationship therebetween, in this embodiment, the order of the neural networks is indicated by their sequence numbers. The T neural networks are respectively combined together by weight 1 and weight 2 … … weight T, namely the output of each neural network is respectively regulated by corresponding weight to be obtained and used as the output of the artificial intelligent model.

In step P1, the neural network is acquired by:

p101. establish an input layer;

p102. establishing a plurality of hidden layers;

p103. establishing an output layer; p104. constructing a neural network with an input layer, a hidden layer, and an output layer;

p105. the cross entropy loss function is used as the loss function of the neural network, and the weight attenuation is set.

In this embodiment, one neural network can be obtained by performing steps P101 to P105 once, and a plurality of neural networks can be obtained when steps P101 to P105 are performed a plurality of times. The parameters of each neural network are: in each hidden layer, the number of neurons is 100, the activation function is an ELU function, a discarding layer is added behind each hidden layer, and the initial value of discarding probability of the discarding layer is 0.01; in each output layer, the number of neurons is the number of sample categories, and the activation function is a Softmax function; the initial value of the weight attenuation super-parameter is set to 3, the random gradient descent update parameter is used, the momentum method is used for optimization, the momentum super-parameter is set to 0.5, the learning rate is set to 0.002, the batch processing size is set to 400, and the processing time value is set to 10 times.

In this embodiment, the acquired student data is a behavior feature of the student over a period of time. For example, when the artificial intelligence model is required to be applied to the identification of poor students, the acquired student data may be campus card consumption data, achievement data, dormitory access data, library access data, book borrowing data and other basic information of the students in the past 1 year.

In this embodiment, the present invention is applicable to a variety of applications. By extracting features from student data, the following features can be extracted:

(1) Features are built for campus card consumption data including, but not limited to:

a. the total consumption amount, the maximum consumption amount and the consumption times of non-holidays and holidays of 12 consumption modes such as canteen, supermarket, boiled water room, electricity consumption and the like.

b. Average early noon meal times for holidays and non-holidays.

(2) Features are built for performance data including, but not limited to:

a. student performance ranking.

b. Ranking of students at respective sexes.

(3) Features are built for dormitory access data including, but not limited to:

a. non-holidays and the number of morning, evening, and evening times of holidays entering and exiting the dormitory.

b. Non-holidays and average time of day of holidays to leave the dormitory earliest and average time to return to the dormitory latest.

c. Days and residence time in dormitory for non-holidays and holidays.

(4) Features are built for library access data including, but not limited to:

a. non-holidays and the number of times holidays come in and out of the library.

b. Non-holidays and holidays days in and out of the library.

c. Non-holidays and holiday residence time in the library.

(5) Features are built for book borrowing data including, but not limited to:

a. the borrowing times and the borrowing time of non-holidays and holidays of 10 books such as borrowing study, programming class, support, mathematics class and the like.

(6) Features are built for student base information including, but not limited to:

a. student gender, native place, college entrance score ranking, family type, hobbies, height, weight, age.

In this embodiment, according to a survey of an individual, the real financial condition of the individual may be determined, so as to determine a tag value of the individual, where the tag value includes lean degree information for describing the correspondence of student data, such as a level of leanness of a student, and the like.

After the student data and the tag value are obtained, the student data can be integrated, cleaned, standardized, word embedding processed and oversampled, and the tag value is subjected to single-heat coding, so that the pretreatment of the student data and the tag value is realized.

In step P3, referring to fig. 2, initial weights X are set for the feature data to be input to the neural network 1 ₁₁ 、X ₁₂ 、X _1n 。

In this embodiment, the principle of step P4 is shown in fig. 3. In step P4, a k-wheel training test procedure is performed. After the first round of training test process is executed, whether the termination condition is met is detected, if the termination condition is met, the execution of the training test process is terminated, if the termination condition is not met, parameter adjustment, neuron number adjustment and/or hidden layer adjustment are carried out on part or all of the neural networks in the artificial intelligent model, and after adjustment, the next round of training test process is executed until the termination condition is met.

Referring to fig. 3, in a training test, the neural network 1 is trained first, specifically, the feature data and the weight set are taken as inputs of the neural network 1, and the tag value corresponding to the feature data is taken as an expected output of the neural network 1. After training the neural network 1, according to the neural network 1If the classification of the neural network 1 is incorrect, the weight corresponding to the feature data is adjusted and increased, and if the classification of the neural network 1 is correct, the weight corresponding to the feature data is adjusted and reduced, so that the training sample of the neural network 1 with the incorrect classification can get more attention in the subsequent training of the neural network 2, and the training sample of the neural network 1 with the incorrect classification can correspondingly reduce the attention in the subsequent training of the neural network 2 to save the computing resource. After training of the neural network 1 is completed, the weight set is updated to X ₂₁ 、X ₂₂ 、X _2n . The feature data and the updated weight set are taken as inputs of the neural network 2, and the tag value corresponding to the feature data is taken as an expected output of the neural network 2, so as to train the neural network 2. After training the neural network 2, the set of weights is then updated, and the neural network 3 is trained, eventually training the entire neural network according to similar steps. After one round of training of T neural networks is completed, each neural network is tested using feature data that is not used for training among all feature data.

In this embodiment, after a round of training and testing process is performed, respective training errors and testing errors of each neural network are calculated, then referring to fig. 2, the training errors of the neural network 1 are weighted according to a weight 1, the training errors of the neural network 2 are weighted … … according to a weight 2, and the training errors of the neural network T are weighted according to a weight T, so as to determine a training error average value; the average value of the test errors is determined by weighting the test errors of the neural network 1 by the weight 1, weighting the test errors of the neural network 2 by the weight 2 … …, and weighting the test errors of the neural network T by the weight T. In this embodiment, the calculated test error is Macro-F1Score. In this embodiment, the training test process of k rounds is completed altogether. When the average value of the test errors of the intersection verification of the artificial intelligence model is smaller than or equal to the expected threshold value, the training test process is not executed.

In this embodiment, the output of each neural network is used as the input of the artificial intelligence model, the sum of the outputs of each neural network weighted according to the corresponding weights is used as the output of the artificial intelligence model, and the whole formed by each neural network becomes the artificial intelligence model.

In this embodiment, through training each neural network, each neural network has the capability of identifying the poverty degree information of the individual to which the student data belongs, and in the training process of each neural network, by adjusting the weight set of the feature data used for the training of the neural network of the next sequence according to the training result of the neural network of the previous sequence, the training result of the neural network of the previous sequence can be transmitted to the training process of the neural network of the next sequence, and the finally obtained artificial intelligent model has strong classification performance, even if the training is performed by using the feature data with poor quality, the training result can not be reduced according to the same proportion. Specifically, the structure and parameter of each neural network are set, for example, a discarding layer, weight attenuation and other means are set, so that the overfitting problem can be effectively solved.

In this embodiment, after the training process for the artificial intelligence model is completed, the lean information identification method may be performed using the artificial intelligence model. Referring to fig. 4, the lean degree information identification method includes the steps of:

s1, acquiring student data;

s2, inputting student data into an artificial intelligent model;

s3, obtaining an output result of the artificial intelligent model;

s4, determining the poverty degree information corresponding to the student individuals according to the output result of the artificial intelligent model.

In this embodiment, the student data of S1 is the same type or structure as the student data of P2, for example, the student data of P2 in this embodiment is campus card consumption data, achievement data, dormitory access data, library access data, book borrowing data and other basic information of the student in the past 1 year, and then the student data of S1 may also be data with the same structure but different values.

By using the artificial intelligence model obtained through training by the training method of the embodiment, the poverty degree information corresponding to the student data can be identified by utilizing the strong classification performance of the artificial intelligence model, so that the poverty degree level of the individual corresponding to the student data can be determined. In the process, the identification efficiency can be improved, the possibility of cheating is reduced, and the high classification performance of the artificial intelligent model can provide high identification accuracy.

In this embodiment, a computer apparatus includes a memory for storing at least one program and a processor for loading the at least one program to execute the artificial intelligence model training method in the embodiment, to achieve the same technical effects as described in the embodiment.

In this embodiment, a storage medium has stored therein a processor-executable program for performing the artificial intelligence model training method in the embodiment when executed by a processor, achieving the same technical effects as described in the embodiment.

It should be noted that, unless otherwise specified, when a feature is referred to as being "fixed" or "connected" to another feature, it may be directly or indirectly fixed or connected to the other feature. Further, the descriptions of the upper, lower, left, right, etc. used in this disclosure are merely with respect to the mutual positional relationship of the various components of this disclosure in the drawings. As used in this disclosure, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. In addition, unless defined otherwise, all technical and scientific terms used in this example have the same meaning as commonly understood by one of ordinary skill in the art. The terminology used in the description of the embodiments is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and/or" as used in this embodiment includes any combination of one or more of the associated listed items.

It should be understood that although the terms first, second, third, etc. may be used in this disclosure to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element of the same type from another. For example, a first element could also be termed a second element, and, similarly, a second element could also be termed a first element, without departing from the scope of the present disclosure. The use of any and all examples, or exemplary language (e.g., "such as") provided herein, is intended merely to better illuminate embodiments of the invention and does not pose a limitation on the scope of the invention unless otherwise claimed.

It should be appreciated that embodiments of the invention may be implemented or realized by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The methods may be implemented in a computer program using standard programming techniques, including a non-transitory computer readable storage medium configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner, in accordance with the methods and drawings described in the specific embodiments. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.

Furthermore, the operations of the processes described in the present embodiments may be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes (or variations and/or combinations thereof) described in this embodiment may be performed under control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications), by hardware, or combinations thereof, that collectively execute on one or more processors. The computer program includes a plurality of instructions executable by one or more processors.

Further, the method may be implemented in any type of computing platform operatively connected to a suitable computing platform, including, but not limited to, a personal computer, mini-computer, mainframe, workstation, network or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and so forth. Aspects of the invention may be implemented in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optical read and/or write storage medium, RAM, ROM, etc., such that it is readable by a programmable computer, which when read by a computer, is operable to configure and operate the computer to perform the processes described herein. Further, the machine readable code, or portions thereof, may be transmitted over a wired or wireless network. When such media includes instructions or programs that, in conjunction with a microprocessor or other data processor, implement the steps described above, the invention described in this embodiment includes these and other different types of non-transitory computer-readable storage media. The invention also includes the computer itself when programmed according to the methods and techniques of the present invention.

The computer program can be applied to the input data to perform the functions described in this embodiment, thereby converting the input data to generate output data that is stored to the non-volatile memory. The output information may also be applied to one or more output devices such as a display. In a preferred embodiment of the invention, the transformed data represents physical and tangible objects, including specific visual depictions of physical and tangible objects produced on a display.

The present invention is not limited to the above embodiments, but can be modified, equivalent, improved, etc. by the same means to achieve the technical effects of the present invention, which are included in the spirit and principle of the present invention. Various modifications and variations are possible in the technical solution and/or in the embodiments within the scope of the invention.

Claims

1. A method for training an artificial intelligence model, comprising:

acquiring a plurality of characteristic data and tag values corresponding to the characteristic data; the characteristic data is extracted from student data, and the tag value comprises lean degree information corresponding to the student data;

determining the weight of each neural network;

synthesizing the artificial intelligent model by each neural network according to the corresponding weight group;

the synthesizing the artificial intelligent model by each neural network according to the corresponding weight group comprises the following steps:

taking the weighted sum as an output of the artificial intelligence model;

updating the set of weights according to a classification incorrect condition of the neural network of a previous order, comprising:

2. The artificial intelligence model training method of claim 1, wherein the performing a plurality of training test runs until a termination condition is met comprises:

3. The artificial intelligence model training method of claim 1 or 2, wherein the termination condition comprises:

the mean value of the cross-validated test errors for the artificial intelligence model is less than or equal to the expected threshold.

4. The artificial intelligence model training method of claim 1, further comprising:

and performing one-time thermal coding on each tag value.

5. The artificial intelligence model training method of claim 1, wherein the acquiring a plurality of neural networks comprises:

establishing an input layer;

6. A lean degree information identification method, characterized by comprising:

acquiring student data;

inputting the student data into an artificial intelligence model; the artificial intelligence model is produced by any one of claims 1-5;

obtaining an output result of the artificial intelligent model;

7. A computer device comprising a memory for storing at least one program and a processor for loading the at least one program to perform the artificial intelligence model training method of any one of claims 1-5 and/or the lean information identification method of claim 6.

8. A storage medium having stored therein a processor-executable program, wherein the processor-executable program when executed by a processor is for performing the artificial intelligence model training method of any one of claims 1-5 and/or the lean information identification method of claim 6.