CN111414539A - Recommendation system neural network training method and device based on feature enhancement - Google Patents

Recommendation system neural network training method and device based on feature enhancement Download PDF

Info

Publication number
CN111414539A
CN111414539A CN202010197501.9A CN202010197501A CN111414539A CN 111414539 A CN111414539 A CN 111414539A CN 202010197501 A CN202010197501 A CN 202010197501A CN 111414539 A CN111414539 A CN 111414539A
Authority
CN
China
Prior art keywords
samples
attribute
neural network
enhancement
feature information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010197501.9A
Other languages
Chinese (zh)
Other versions
CN111414539B (en
Inventor
施韶韵
张敏
郝斌
李大任
张瑞
于新星
单厚智
刘奕群
马少平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Zhizhe Sihai Beijing Technology Co Ltd
Original Assignee
Tsinghua University
Zhizhe Sihai Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, Zhizhe Sihai Beijing Technology Co Ltd filed Critical Tsinghua University
Priority to CN202010197501.9A priority Critical patent/CN111414539B/en
Publication of CN111414539A publication Critical patent/CN111414539A/en
Application granted granted Critical
Publication of CN111414539B publication Critical patent/CN111414539B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/335Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43Querying
    • G06F16/435Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure relates to a recommendation system neural network training method and device based on feature enhancement, wherein the method comprises the following steps: inputting a plurality of first samples in the first training set into a neural network to be trained in the t-th round for processing to obtain prediction scores corresponding to the plurality of first samples; according to the feature information of the first samples and the prediction scores corresponding to the first samples, the attention of the neural network to each attribute is respectively determined; respectively determining the enhancement probability of each attribute according to the attention threshold and the attention of the neural network to each attribute; determining feature information to be updated from the feature information of the plurality of first samples according to the first enhancement rate and the enhancement probability; updating a first sample in the first training set according to the feature information to be updated and the noise feature value to obtain an updated second training set; and performing the t-th round of training on the neural network according to the second training set. Embodiments of the present disclosure may improve the robustness of neural networks.

Description

Recommendation system neural network training method and device based on feature enhancement
Technical Field
The disclosure relates to the field of machine learning, in particular to a recommendation system neural network training method and device based on feature enhancement.
Background
Deep learning is one of machine learning, and mainly utilizes a deep neural network to analyze and model data so as to discover rules between input features and predicted targets. Deep learning has achieved significant effects in many fields such as computer vision, computational linguistics, information retrieval, and the like.
The design of the deep neural network usually focuses on network architecture, feature representation and the like, and the deep neural network is easy to over-fit in the training process, so that some features are over-depended and other features are ignored. For example, when the obesity degree of a person is predicted by using a deep neural network, if characteristics such as sex, age, weight, height, circumference, shoulder width and the like are taken as input, and after a long-time training without constraint, the deep neural network is likely to be over-fitted, only two characteristics such as sex and weight are concerned, and other relatively indirect characteristics are not sufficiently utilized. Furthermore, during the use of the deep neural network, some features may be noisy, for example, the weight may be inaccurate, and the deep neural network may rely too much on these noisy features to make the prediction result less accurate.
Disclosure of Invention
In view of this, the present disclosure provides a method and an apparatus for training a neural network of a recommendation system based on feature enhancement.
According to an aspect of the present disclosure, there is provided a recommendation system neural network training method based on feature enhancement, the method including:
inputting a plurality of first samples in a preset first training set into a t-th round of neural network to be trained for processing to obtain prediction scores corresponding to the plurality of first samples, wherein t is a positive integer, and the first samples comprise characteristic information representing user attributes and characteristic information representing object attributes of objects to be recommended;
according to the feature information of the first samples and the prediction scores corresponding to the first samples, the attention degree of the neural network to each attribute is respectively determined;
respectively determining the enhancement probability of each attribute according to a preset attention threshold and the attention of the neural network to each attribute;
determining feature information to be updated from the feature information of the plurality of first samples according to a preset first enhancement rate of the feature information of the plurality of first samples and the enhancement probability of each attribute;
updating a first sample in the first training set according to the feature information to be updated and a preset noise feature value to obtain an updated second training set;
performing a t-th round of training on the neural network according to the second training set,
the neural network is applied to a recommendation system and used for predicting the scoring of a user on an object to be recommended in the recommendation system.
In a possible implementation manner, determining the attention of the neural network to each attribute according to the feature information of the plurality of first samples and the prediction scores corresponding to the plurality of first samples respectively includes:
for any first sample in a first training set, respectively determining a first contribution value of each feature information of the first sample to a prediction score according to the feature information of the first sample and the prediction score corresponding to the first sample;
for any attribute in the plurality of attributes, determining a second contribution value of the feature information corresponding to the attribute from the first contribution values of the feature information of the first samples;
and determining the average value of the second contribution values as the attention degree of the neural network to the attribute.
In a possible implementation manner, determining feature information to be updated from the feature information of the plurality of first samples according to a preset first enhancement rate of the feature information of the plurality of first samples and the enhancement probability of each attribute, includes:
determining the enhancement quantity of the feature information of the plurality of first samples according to a preset first enhancement rate of the feature information of the plurality of first samples;
randomly selecting a plurality of second samples from a plurality of first samples of the first training set, wherein the number of the second samples is the same as the enhancement number;
and for any second sample, randomly selecting one attribute from a plurality of attributes according to the enhanced probability of each attribute, and determining the feature information corresponding to the randomly selected attribute in the second sample as the feature information to be updated.
In one possible implementation, the method further includes:
determining a second enhancement rate of the characteristic information of the plurality of first samples during the t-th round of training according to a preset initial enhancement rate, a preset maximum enhancement rate and a preset change value of each round of enhancement rate;
and determining a first enhancement rate of the feature information of the plurality of first samples during the t-th round of training according to the maximum enhancement rate and the second enhancement rate.
In a possible implementation manner, determining, according to a preset attention threshold and the attention of the neural network to each attribute, an enhanced probability of each attribute respectively includes:
for any attribute, determining the attention degree of the neural network as the enhancement probability of the attribute under the condition that the attention degree of the neural network to the attribute is smaller than a preset attention degree threshold value.
In a possible implementation manner, determining, according to a preset attention threshold and the attention of the neural network to each attribute, an enhanced probability of each attribute respectively further includes:
for any attribute, determining the product of the attention degree and a preset adjustment proportion as the enhancement probability of the attribute under the condition that the attention degree of the neural network to the attribute is greater than or equal to a preset attention degree threshold value.
In one possible implementation manner, the neural network includes an input layer, an N-level intermediate layer and an output layer, the input layer inputs the feature information of each first sample, the output layer outputs the prediction score corresponding to each first sample, the N-level intermediate layer outputs N-level intermediate feature information in the processing process respectively, N is a positive integer,
determining a first contribution value of each feature information of the first sample to the prediction score according to the feature information of the first sample and the prediction score corresponding to the first sample, respectively, including:
according to the prediction scores corresponding to the first samples, determining the contribution values of the N-th-level intermediate characteristic information to the prediction scores respectively;
determining the contribution value of each N-1 level intermediate characteristic information to the prediction score according to the contribution value of each N-level intermediate characteristic information to the prediction score, the N-level intermediate characteristic information and the N-1 level intermediate characteristic information;
determining the contribution value of each i-1 level intermediate characteristic information to the prediction score according to the contribution value of each i-level intermediate characteristic information to the prediction score, the i-level intermediate characteristic information and the i-1 level intermediate characteristic information, wherein i is an integer and is more than or equal to 2 and less than or equal to N;
and respectively determining a first contribution value of each characteristic information of the first sample to the prediction score according to the contribution value of each level 1 intermediate characteristic information to the prediction score, the level 1 intermediate characteristic information and the characteristic information of the first sample.
In one possible implementation, the feature information of the plurality of first samples in the first training set is represented by a feature matrix, each row of the feature matrix represents one first sample, and each column of the feature matrix represents one attribute.
According to another aspect of the present disclosure, there is provided a recommendation system neural network training device based on feature enhancement, the device including:
the system comprises a prediction score determining module, a recommendation score determining module and a recommendation score determining module, wherein the prediction score determining module is used for inputting a plurality of first samples in a preset first training set into a t-th to-be-trained neural network for processing to obtain prediction scores corresponding to the first samples, t is a positive integer, and the first samples comprise characteristic information representing user attributes and characteristic information representing object attributes of objects to be recommended;
the attention degree determining module is used for respectively determining the attention degrees of the neural network to each attribute according to the feature information of the first samples and the prediction scores corresponding to the first samples;
the enhancement probability determination module is used for respectively determining the enhancement probability of each attribute according to a preset attention threshold and the attention of the neural network to each attribute;
the to-be-updated feature determination module is used for determining feature information to be updated from the feature information of the plurality of first samples according to a preset first enhancement rate of the feature information of the plurality of first samples and the enhancement probability of each attribute;
the training set updating module is used for updating a first sample in the first training set according to the feature information to be updated and a preset noise feature value to obtain an updated second training set;
a training module for performing the t-th round of training on the neural network according to the second training set,
the neural network is applied to a recommendation system and used for predicting the scoring of a user on an object to be recommended in the recommendation system.
In one possible implementation, the apparatus further includes:
the first enhancement rate determining module is used for determining second enhancement rates of the characteristic information of the plurality of first samples during the t-th round of training according to a preset initial enhancement rate, a preset maximum enhancement rate and a preset change value of each round of enhancement rate;
and the second enhancement rate determining module is used for determining a first enhancement rate of the feature information of the plurality of first samples during the tth round of training according to the maximum enhancement rate and the second enhancement rate.
According to the embodiment of the disclosure, when the method is applied to the neural network training of the recommendation system, the enhancement probability of each attribute can be determined according to the attention degree of the neural network to be trained in the current training turn to each attribute, further determining a training set which is used in the current training round and is subjected to characteristic enhancement according to the enhancement probability of each attribute and a preset first enhancement rate, training the neural network by using the training set, thereby optimizing the attention degree of the neural network to different attributes in the training process of the neural network, so that the neural network can comprehensively utilize all the characteristic information during prediction, avoid overfitting or over-dependence on partial characteristic information, meanwhile, when part of characteristic information of the neural network is noisy, other characteristic information is fully utilized for prediction, the robustness of the neural network is improved, and meanwhile, the accuracy of neural network prediction is improved.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a flow diagram of a method for feature enhancement based recommendation system neural network training, in accordance with an embodiment of the present disclosure.
Fig. 2 shows a schematic diagram of an application scenario of a feature enhancement based recommendation system neural network training method according to an embodiment of the present disclosure.
FIG. 3 shows a block diagram of a feature enhancement based recommendation system neural network training apparatus, according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
The method for training a Neural Network of a recommendation system based on feature enhancement according to the embodiments of the present disclosure may be applied to a processor, which may be a general-purpose processor, such as a CPU (Central Processing Unit), or an artificial Intelligence Processor (IPU) for performing artificial intelligence operations, such as a GPU (Graphics Processing Unit), an NPU (Neural-Network Processing Unit), a DSP (Digital Signal Processing Unit), and the like. The present disclosure is not limited to a particular type of processor.
The feature enhancement according to the embodiment of the present disclosure may be to randomly conceal part of feature information in the initial feature information by setting an invalid value, a noise value, and the like. That is, more noisy/invalid features are included in the feature enhanced feature information than the initial feature information. Accordingly, the score corresponding to the enhanced sample is more difficult to predict than the initial sample. Training the neural network according to the training set with enhanced features can improve the robustness of the neural network.
In a possible implementation manner, the neural network of the recommendation system may be a neural network applied to a recommendation system, and is used for predicting the rating of the user on the object to be recommended in the recommendation system. The recommendation system can comprise various recommendation systems, such as a movie and television work recommendation system, a commodity recommendation system, a literature recommendation system, a shared knowledge recommendation system in a knowledge sharing platform, and the like. The objects to be recommended in the recommendation system can also comprise various objects, such as movie works, commodities, literature works, shared knowledge, multimedia data, documents and the like. The present disclosure does not limit the specific application scenario of the recommendation system and the specific content of the object to be recommended.
Fig. 1 shows a flow diagram of a method for feature enhancement based recommendation system neural network training, in accordance with an embodiment of the present disclosure. As shown in fig. 1, the method includes:
step S11, inputting a plurality of first samples in a preset first training set into a t-th round of neural network to be trained for processing, so as to obtain prediction scores corresponding to the plurality of first samples, wherein t is a positive integer, and the first samples comprise characteristic information representing user attributes and characteristic information representing object attributes of objects to be recommended;
step S12, respectively determining attention degrees of the neural network to each attribute according to the feature information of the first samples and the prediction scores corresponding to the first samples;
step S13, determining the enhancement probability of each attribute according to the preset attention threshold and the attention of the neural network to each attribute;
step S14, determining feature information to be updated from the feature information of the plurality of first samples according to a preset first enhancement rate of the feature information of the plurality of first samples and the enhancement probability of each attribute;
step S15, updating the first sample in the first training set according to the feature information to be updated and a preset noise feature value to obtain an updated second training set;
and step S16, performing the t-th round of training on the neural network according to the second training set.
According to the embodiment of the disclosure, when the method is applied to the neural network training of the recommendation system, the enhancement probability of each attribute can be determined according to the attention degree of the neural network to be trained in the current training turn to each attribute, further determining a training set which is used in the current training round and is subjected to characteristic enhancement according to the enhancement probability of each attribute and a preset first enhancement rate, training the neural network by using the training set, thereby optimizing the attention degree of the neural network to different attributes in the training process of the neural network, so that the neural network can comprehensively utilize all the characteristic information during prediction, avoid overfitting or over-dependence on partial characteristic information, meanwhile, when part of characteristic information of the neural network is noisy, other characteristic information is fully utilized for prediction, the robustness of the neural network is improved, and meanwhile, the accuracy of neural network prediction is improved.
In one possible implementation, the first training set may be determined prior to training the neural network. The first training set may include a plurality of first samples and reference scores corresponding to the plurality of first samples. Wherein, each first sample can comprise characteristic information representing the user attribute and characteristic information representing the object attribute of the object to be recommended.
In one possible implementation, the user attributes may include user identification, age, gender, occupation, city, and the like. The objects to be recommended are different, and the object attributes may be different. For example, when the object to be recommended is a movie work, the object attributes may include movie identification, name, director, main actors, year of showing, region of propagation, multiple movie categories (e.g., science fiction, love, war), and the like; when the object to be recommended is a commodity, the object attributes of the object can comprise a commodity identification, a name, a production date, a manufacturer, a commodity price and the like; when the object to be recommended is a literary work, the object attribute can comprise an identification, a name, an author, a keyword and the like of the literary work; when the object to be recommended is shared knowledge, the object attributes of the object to be recommended can comprise identification, name, keywords, access amount and the like of the shared knowledge; when the object to be recommended is multimedia data, the object attribute can include the identification, name, format, keyword, size, etc. of the multimedia data; when the object to be recommended is a document, the object attribute may include an identifier, a name, a format, a keyword, a size, and the like of the document. The user identifier (i.e. user ID, Identity) may be used to uniquely identify the user, and the identifier of the object to be recommended may be used to uniquely identify the object to be recommended.
In one possible implementation manner, the object attribute of the object to be recommended may be determined according to a specific application scenario of the neural network. The application scenes are different, the objects to be recommended may be different, and the object attributes may also be different. It should be understood that, a person skilled in the art may set specific contents of the user attribute and the object attribute of the object to be recommended according to a specific application scenario of the neural network, and the present disclosure does not limit this.
In a possible implementation manner, after the first training set is determined, in step S11, a plurality of preset first samples in the first training set are input into the t-th to-be-trained neural network for processing, scores of the to-be-recommended objects by the user in each first sample are respectively predicted, and prediction scores corresponding to the plurality of first samples are obtained, where t is a positive integer. Wherein, the t-th round represents the current training round of the neural network.
In one possible implementation manner, in step S12, the attention of the neural network to each attribute may be determined according to the feature information of the plurality of first samples and the prediction scores corresponding to the plurality of first samples. Wherein, the attention degree of the neural network to each attribute can be used for representing the relevance of the prediction score output by the neural network to each attribute. The sum of the attention of the neural network to all the attributes is equal to 1, or the difference between the sum and 1 is within an error range.
In a possible implementation manner, the attention degree of the neural network to each attribute may be determined respectively through inter-layer correlation Propagation (L eye-wise independence Propagation) according to the feature information of the plurality of first samples and the prediction scores corresponding to the plurality of first samples.
In a possible implementation manner, after determining the attention degree of the neural network to each attribute, in step S13, the enhanced probability of each attribute may be determined according to a preset attention degree threshold and the attention degree of the neural network to each attribute. Wherein, the value range of the attention threshold is more than 0 and less than 1.
In one possible implementation, the attention threshold may be different for different training rounds, that is, the attention threshold may be changed according to the change of the current training round t. For example, the attention threshold may increase with increasing t. The attention threshold value in the tth round of training can be preset by a person skilled in the art according to actual conditions, and the disclosure does not limit this.
In one possible implementation, the higher the interest of a certain attribute by the neural network, the more important the attribute is. When determining the enhancement probability of each attribute, classifying the attention of each attribute according to a preset attention threshold, then adjusting the attention of each classified attribute according to the training requirement, and determining the adjusted attention of each attribute as the enhancement probability of each attribute in the t-th round of training.
In a possible implementation manner, in step S14, the feature information to be updated may be determined from the feature information of the plurality of first samples according to a preset first enhancement rate of the feature information of the plurality of first samples and an enhancement probability of each attribute. Wherein, the value range of the first enhancement rate is more than 0 and less than 1.
In a possible implementation manner, the preset first enhancement rate of the feature information of the plurality of first samples may represent an enhancement ratio of the feature information of the plurality of first samples. The first enhancement rate may be different for different training rounds, i.e. the first enhancement rate may vary depending on the current training round t, e.g. the first enhancement rate may increase with increasing t. The person skilled in the art can preset the first enhancement rate in the tth round of training according to practical situations, which is not limited by the present disclosure.
In a possible implementation manner, the enhancement quantity of the feature information of the plurality of first samples may be determined according to a preset first enhancement rate of the feature information of the plurality of first samples, and then the feature information to be updated is randomly determined from the feature information of the plurality of first samples according to the enhancement quantity and the enhancement probability of each attribute.
In a possible implementation manner, after determining the feature information to be updated, in step S15, the first sample in the first training set is updated according to the feature information to be updated and the preset noise feature value, so as to obtain an updated second training set.
In one possible implementation, the preset noise characteristic value may be represented as a specific number. For example, the noise characteristic value U may be represented by a number 0. The specific value of the noise characteristic value can be preset by a person skilled in the art according to actual conditions, and the disclosure does not limit this.
In a possible implementation manner, when the first sample in the first training set is updated, the preset noise feature value may be used to replace the feature information to be updated, so as to obtain an updated second training set. The second training set is a feature enhanced training set compared to the initial first training set.
In one possible implementation, after obtaining the second training set, in step S16, the neural network may be trained according to the second training set in a t-th round. The plurality of samples in the second training set can be input into the neural network for processing to obtain the prediction values corresponding to the plurality of samples in the second training set, and the parameters of the neural network are adjusted according to the error between the prediction values and the corresponding reference values to obtain the neural network finished in the t-th round of training.
In one possible implementation, when training the neural network, the first enhancement rate and the attention threshold may be adjusted as the training turns increase, for example, the first enhancement rate and the attention threshold may gradually increase as the training turns increase until a preset maximum value is reached. Through the mode, the quantity of the characteristic information to be updated can be gradually increased to the maximum value, meanwhile, the importance degree of the attribute corresponding to the characteristic information to be updated can be gradually increased, so that the strength of characteristic enhancement of the second training set can be gradually improved along with the increase of training rounds, and the stability of the neural network training process is further improved.
In one possible implementation, the second training set of each training round is updated from the initial first training set, and the updates are not accumulated on the second training set of the previous round. By the method, the second training sets of the training rounds are independent from each other, and the diversity of training samples is increased.
In a possible implementation manner, when the neural network meets a preset training end condition, the training is ended to obtain a trained neural network. The preset training end condition may be set according to an actual situation, for example, the training end condition may be that the effect of the neural network on the verification set decreases in a continuous preset turn (for example, continuous 5 turns); the training end condition can also be that the loss function of the neural network is reduced to a certain degree or converged within a certain threshold value; the training end condition may also be other conditions. The present disclosure does not limit the specific contents of the training end condition.
In one possible implementation, the trained neural network may be applied to a recommendation system for predicting a user's score for an object to be recommended in the recommendation system. According to the specific application scene of the recommendation system, determining the user attribute and the object attribute of the object to be recommended, and determining input data corresponding to the user attribute and the object attribute; inputting the input data into a trained neural network for processing, and predicting the score of the user on the object to be recommended; according to the score predicted by the neural network, the recommendation system can determine a preset number of recommended objects from the objects to be recommended and recommend the recommended objects to the user.
In one possible implementation, the method may further include: determining a second enhancement rate of the characteristic information of the plurality of first samples during the t-th round of training according to a preset initial enhancement rate, a preset maximum enhancement rate and a preset change value of each round of enhancement rate; and determining a first enhancement rate of the feature information of the plurality of first samples during the t-th round of training according to the maximum enhancement rate and the second enhancement rate.
Wherein the preset value range of the initial enhancement rate is more than or equal to 0 and less than 1; the preset value range of the maximum enhancement rate is more than 0 and less than or equal to 1; the preset value range of each round of the enhancement rate change value is more than 0 and less than 1. The specific values of the initial enhancement rate, the maximum enhancement rate and the variation value of the enhancement rate in each round can be set by those skilled in the art according to practical situations, and the disclosure does not limit this.
In a possible implementation manner, second enhancement rates of the feature information of the plurality of first samples during the t-th round of training can be determined according to a preset initial enhancement rate, a preset maximum enhancement rate and a preset change value of each round of enhancement rate; then, judging the relation between the second enhancement rate and the maximum enhancement rate, and determining the second enhancement rate as a first enhancement rate of the characteristic information of a plurality of first samples during the t-th round of training under the condition that the second enhancement rate is less than or equal to the maximum enhancement rate; and determining the maximum enhancement rate as a first enhancement rate of the feature information of the plurality of first samples in the t-th training, if the second enhancement rate is greater than the maximum enhancement rate.
In one possible implementation, the first enhancement rate s of the feature information of the plurality of first samples in the t-th round of training can be determined by the following formula (1)t
st=min(s,s0+Δ·t) (1)
In the above equation (1), s represents a preset maximum enhancement rate and s ∈ (0, 1)],s0Represents a predetermined initial enhancement rate and s0∈ [0,1), where Δ represents a preset change in the rate of enhancement for each round.
In this embodiment, the second enhancement rate of the feature information of the plurality of first samples during the t-th training can be determined according to the initial enhancement rate, the maximum enhancement rate, and the change value of the enhancement rate per round, and the minimum value between the maximum enhancement rate and the second enhancement rate is determined as the first enhancement rate of the feature information of the plurality of first samples during the t-th training, so that the first enhancement rate can be gradually increased from the initial enhancement rate to the maximum enhancement rate with the increase of the training round, and then kept unchanged.
In one possible implementation, the feature information of the plurality of first samples in the first training set may be represented by a feature matrix, each row of the feature matrix representing one first sample, and each column of the feature matrix representing one attribute.
For example, the first training set includes n first samples, each of which includes m pieces of feature information corresponding to m preset attributes, and the feature information of the plurality of first samples in the first training set may be represented as a feature matrix D ═ Du,v}n×mWherein, the u-th row of the feature matrix D represents the u-th first sample, the v-th column of the feature matrix D represents the v-th attribute, and the element D of the feature matrix Du,vAnd representing the characteristic information corresponding to the v-th attribute in the u-th first sample, wherein n, m, u and v are positive integers, u is more than or equal to 1 and less than or equal to n, and v is more than or equal to 1 and less than or equal to m.
In this embodiment, a plurality of first samples in the first training set are represented as a feature matrix, which facilitates neural network processing and can improve processing efficiency of the neural network.
In one possible implementation, step S12 may include:
for any first sample in a first training set, respectively determining a first contribution value of each feature information of the first sample to a prediction score according to the feature information of the first sample and the prediction score corresponding to the first sample;
for any attribute in the plurality of attributes, determining a second contribution value of the feature information corresponding to the attribute from the first contribution values of the feature information of the first samples;
and determining the average value of the second contribution values as the attention degree of the neural network to the attribute.
In one possible implementation, when determining the attention of the neural network to each attribute, a first contribution value of each feature information of the first sample to the prediction score may be determined first. For any first sample in the first training set, the first contribution value of each feature information of the first sample to the prediction score thereof can be respectively determined through inter-layer correlation propagation according to the feature information of the first sample and the prediction score corresponding to the first sample.
For example, for any first sample in the first training set, assuming that the first sample includes m pieces of feature information, according to the m pieces of feature information of the first sample and the prediction score corresponding to the first sample, a first contribution value of each piece of feature information of the first sample to the prediction score thereof may be determined through inter-layer correlation propagation, that is, each piece of feature information in the first sample corresponds to one first contribution value, and the first sample includes m pieces of feature information, and m pieces of first contribution values may be determined.
In a possible implementation manner, after the first contribution value is determined, for any attribute of the plurality of attributes, a second contribution value of the feature information corresponding to the attribute may be determined from the first contribution values of the feature information of the respective first samples, and the determined second contribution values are averaged to determine the average value as the attention of the neural network to the attribute.
For example, the first training set includes n first samples, and for the v-th attribute, a first contribution value of the feature information corresponding to the v-th attribute may be selected from first contribution values of the n first samples and determined as a second contribution value, where the second contribution value is n in total; then averaging the n second contribution valuesThe average is determined as the interest of the neural network for the vth attribute. I.e. attention of the neural network to the v-th attribute
Figure BDA0002418139920000131
Wherein the content of the first and second substances,
Figure BDA0002418139920000132
a first contribution value representing a prediction score of the vth characteristic information of the u-th first sample.
In this embodiment, first contribution values of the feature information of the first samples to the prediction scores of the first samples may be determined, then, for any attribute, second contribution values of the feature information corresponding to the attribute may be determined from the first contribution values, and an average value of the second contribution values may be determined as the attention of the neural network to the attribute, so that accuracy of the attention may be improved.
In one possible implementation manner, the neural network may include an input layer, an N-level intermediate layer, and an output layer, the input layer inputs the feature information of each first sample, the output layer outputs the prediction score corresponding to each first sample, the N-level intermediate layer outputs N-level intermediate feature information in the processing process, respectively, N is a positive integer,
determining a first contribution value of each feature information of the first sample to the prediction score according to the feature information of the first sample and the prediction score corresponding to the first sample, respectively, including:
according to the prediction scores corresponding to the first samples, determining the contribution values of the N-th-level intermediate characteristic information to the prediction scores respectively;
determining the contribution value of each N-1 level intermediate characteristic information to the prediction score according to the contribution value of each N-level intermediate characteristic information to the prediction score, the N-level intermediate characteristic information and the N-1 level intermediate characteristic information;
determining the contribution value of each i-1 level intermediate characteristic information to the prediction score according to the contribution value of each i-level intermediate characteristic information to the prediction score, the i-level intermediate characteristic information and the i-1 level intermediate characteristic information, wherein i is an integer and is more than or equal to 2 and less than or equal to N;
and respectively determining a first contribution value of each characteristic information of the first sample to the prediction score according to the contribution value of each level 1 intermediate characteristic information to the prediction score, the level 1 intermediate characteristic information and the characteristic information of the first sample.
In a possible implementation manner, the neural network may include an input layer, an N-level intermediate layer, and an output layer, where the input layer inputs the feature information of each first sample, the output layer outputs the prediction score corresponding to each first sample, and the N-level intermediate layer outputs N-level intermediate feature information in the processing process, respectively.
In a possible implementation manner, for any first sample, when determining a first contribution value of each feature information of the first sample to the prediction score thereof, starting from the prediction score output by the output layer, and proceeding layer by layer according to the hierarchical structure of the neural network, through inter-layer correlation propagation, determining the contribution value of the intermediate feature information output by each layer to the prediction score in turn until determining the first contribution value of each feature information of the input first sample to the prediction score.
In a possible implementation manner, the contribution value of each nth-level intermediate feature information to the prediction score may be determined according to the prediction score corresponding to the first sample. For example, assuming that the nth-level intermediate feature information is a predicted value of the E classification tags (where E is a positive integer), according to the prediction score corresponding to the first sample, the contribution value of the nth-level intermediate feature information corresponding to the correct classification tag in the E nth-level intermediate feature information to the prediction score may be determined to be 1, and the contribution values of the other nth-level intermediate feature information to the prediction score may be determined to be 0.
Then, according to the contribution value of each N-level intermediate characteristic information to the prediction score, the N-level intermediate characteristic information and the N-1-level intermediate characteristic information, the contribution value of each N-1-level intermediate characteristic information to the prediction score can be determined through correlation propagation between the N-1-level intermediate layer and the N-1-level intermediate layer.
In a possible implementation mode, according to the contribution value of each i-th-level intermediate characteristic information to the prediction score, the i-th-level intermediate characteristic information and the i-1-th-level intermediate characteristic information, the contribution value of each i-1-th-level intermediate characteristic information to the prediction score is respectively determined through correlation propagation between the i-1-th-level intermediate layer and the i-th-level intermediate layer, wherein i is an integer and is more than or equal to 2 and less than or equal to N;
for example, the input of the neural network is any first sample, the i-th level intermediate layer of the neural network is a full connection layer, and the q-th i-th level intermediate characteristic information output by the i-th level intermediate layer
Figure BDA0002418139920000151
Can be determined by the following equation (2):
Figure BDA0002418139920000152
in the above-mentioned formula (2),
Figure BDA0002418139920000153
represents the p (i-1) th level intermediate characteristic information of the (i-1) th level intermediate layer output,
Figure BDA0002418139920000154
the weight of the fully-connected layer is represented,
Figure BDA0002418139920000155
denotes the bias of the fully connected layer, relu (x) max (0, x) is a nonlinear activation function, and p and q are both positive integers.
The kth i-1 level intermediate characteristic information according to the back propagation of the full connection layer
Figure BDA0002418139920000156
Contribution to predictive score
Figure BDA0002418139920000157
Can be determined by the following equation (3):
Figure BDA0002418139920000161
in the above formula (3), k is a positive integer,
Figure BDA0002418139920000162
the weight of the fully-connected layer is represented,
Figure BDA0002418139920000163
represents the contribution value of the q-th i-th level intermediate characteristic information to the prediction score,
Figure BDA0002418139920000164
is a parameter in propagation of inter-layer correlation, and>sign (Z) is a sign function, and sign (Z) is 1 when Z is more than or equal to 0, otherwise sign (Z) is-1.
In a possible implementation manner, the first contribution value of each feature information of the first sample to the prediction score may be determined by inter-layer correlation propagation according to the contribution value of each level 1 intermediate feature information to the prediction score, the level 1 intermediate feature information, and the feature information of the first sample.
In this embodiment, starting from the prediction score corresponding to the first sample, according to the hierarchical structure of the neural network, the first contribution value of each piece of feature information of the first sample to the prediction score is determined layer by layer through interlayer correlation propagation, and the accuracy of the first contribution value can be improved.
In one possible implementation, step S13 may include: for any attribute, determining the attention degree of the neural network as the enhancement probability of the attribute under the condition that the attention degree of the neural network to the attribute is smaller than a preset attention degree threshold value.
In one possible implementation, the preset attention threshold may be determined according to an increasing function of the current training turn t, and is used to automatically control the enhancement probability of the feature information according to the current training turn. The increasing function can be set by those skilled in the art according to practical needs, and the present disclosure does not limit this.
In a possible implementation manner, for any attribute, the relationship between the attention of the neural network to the attribute and a preset attention threshold value can be judged; in the event that the attention of the neural network to the attribute is less than the attention threshold, the attention of the neural network to the attribute may be determined as an enhanced probability of the attribute.
In one possible implementation, step S13 may further include: for any attribute, determining the product of the attention degree and a preset adjustment proportion as the enhancement probability of the attribute under the condition that the attention degree of the neural network to the attribute is greater than or equal to a preset attention degree threshold value.
The preset adjustment ratio is greater than 0 and less than 1, for example, the preset adjustment ratio is 0.1. The specific value of the adjustment ratio can be set by a person skilled in the art according to training needs, and the disclosure does not limit this.
In a possible implementation manner, for any attribute, in the case that the attention degree of the neural network to the attribute is greater than or equal to a preset attention degree threshold, the importance degree of the attribute may be considered to be higher, and in order to maintain the routine training of the neural network, the enhancement probability of the attribute may be reduced. The product of the attention of the neural network to the attribute and the preset adjustment proportion can be determined as the enhancement probability of the attribute.
In one possible implementation, the enhanced probability P of the v-th attribute may be determined by the following equation (4)v
Figure BDA0002418139920000171
In the above formula (4), FvRepresents the attention of the neural network to the v-th attribute and represents the preset adjustment ratio (0)<<1) σ (t) is an increasing function with respect to t, max { F }1,…,FmDenotes the attention F of the neural network to m attributes1,…,FmMaximum value of (1), σ (t). max { F }1,…,FmAnd represents the attention threshold value in the t-th round of training.
In one possible implementation, after determining the enhanced probabilities of the attributes, normalization may be performed. The normalized enhanced probability of an attribute can be determined by the following equation (5):
Figure BDA0002418139920000172
in the above formula (5), P'vExpressing the normalized enhanced probability, P, of the v-th attributejRepresents the enhanced probability of j ≦ m before normalization processing, ∑jPjRepresenting the sum of the enhanced probabilities of all attributes before normalization.
In this embodiment, when the attention of the neural network to the attribute is greater than or equal to the preset attention threshold, the product of the attention and the preset adjustment ratio is determined as the enhancement probability of the attribute, and the enhancement probability of the attribute with a higher importance degree may be reduced in an initial stage of training (for example, in the previous rounds) to maintain the conventional training, that is, in the initial stage of training, to ensure the utilization of the important features by the neural network.
In one possible implementation, step S14 may include:
determining the enhancement quantity of the feature information of the plurality of first samples according to a preset first enhancement rate of the feature information of the plurality of first samples;
randomly selecting a plurality of second samples from a plurality of first samples of the first training set, wherein the number of the second samples is the same as the enhancement number;
and for any second sample, randomly selecting one attribute from a plurality of attributes according to the enhanced probability of each attribute, and determining the feature information corresponding to the randomly selected attribute in the second sample as the feature information to be updated.
In a possible implementation manner, the enhancement amount of the feature information of the plurality of first samples may be determined according to a preset first enhancement rate of the feature information of the plurality of first samples. For example, in the t-th round of training, the preset first enhancement rate of the feature information of the plurality of first samples is stIf the total amount of the feature information of the first samples in the first training set is n × m, the enhanced amount of the feature information of the first samples is n × m × st
In one possible implementation, a plurality of second samples may be randomly selected from the plurality of first samples of the first training set, the number of the second samples being the same as the enhancement number of the feature information of the plurality of first samples, wherein the randomly selected plurality of second samples may be repeatedtThe number of randomly chosen second samples is the same as the enhancement number, also n × m × st
In a possible implementation manner, for any second sample, one attribute may be randomly selected from the multiple attributes according to the enhanced probability of each attribute, and the feature information corresponding to the randomly selected attribute in the second sample is determined as the feature information to be updated.
In one possible implementation, the number of enhancements to the feature information of the plurality of first samples may be greater than the number of the plurality of first samples. In this case, there may be repeated feature information to be updated. When repeated feature information to be updated exists, the feature information to be updated can be reselected from the plurality of first samples according to the number of the repeated feature information until all the feature information to be updated is not repeated. For example, the enhancement number of the feature information of the plurality of first samples is 100, 5 of the selected 100 feature information to be updated are repeated with other features, and 5 of the non-repeated feature information to be updated are re-selected from the plurality of first samples, so that none of the 100 feature information to be updated is repeated.
The following illustrates the determination process of the feature information to be updated. Assuming that the first enhancement rate of the feature information of the preset multiple first samples is s during the t-th round of trainingtThe first training set is represented as the feature matrix D ═ Du,v}n×mFirst, the enhancement number G of the feature information of the first samples n × m × s may be determinedt(ii) a Then, in the feature matrix D, G rows are randomly selected to determine G second samples GaA is positiveAn is an integer of 1 to G; for any second sample gaAccording to the enhancement probability of each attribute, randomly selecting one attribute c from the plurality of attributes (wherein c is more than or equal to 1 and less than or equal to m, and the probability of the selected c attribute is the enhancement probability P 'of the c attribute'c) Obtaining a second sample gaCorresponding row column pair (g)aAnd c); using the same method, row and column pairs corresponding to G second samples can be obtained; judging whether the G row-column pairs have repeated row-column pairs or not, if so, reselecting the row-column pairs according to the number of the repeated row-column pairs until the G row-column pairs are not repeated; and then determining the characteristic information of the corresponding positions of the G row-column pairs as the characteristic information to be updated.
Assuming that the preset noise eigenvalue U is equal to 0, 0 may be used to replace the characteristic information of the corresponding positions of G row-column pairs in the characteristic matrix D, and the other characteristic information remains unchanged, so as to obtain an updated characteristic matrix
Figure BDA0002418139920000191
DGRepresented as the updated second training set. Can be based on the feature matrix DGAnd carrying out the t-th round of training on the neural network.
In this embodiment, the enhancement number of the feature information of the plurality of first samples can be determined according to the first enhancement rate of the feature information of the plurality of first samples, the plurality of second samples are randomly selected from the plurality of first samples according to the enhancement number, and the feature information to be updated in the plurality of second samples is determined according to the enhancement probability of each attribute, so that the determined feature information to be updated meets the first enhancement rate and the enhancement probability of each attribute during the t-th round of training, and the accuracy of the feature information to be updated is improved.
Fig. 2 shows a schematic diagram of an application scenario of a feature enhancement based recommendation system neural network training method according to an embodiment of the present disclosure. As shown in fig. 2, an initial first training set may be determined in step S201, where the first training set may include a plurality of first samples, and may be represented as a feature matrix D, then, in step S202, a current training round t is determined, and in step S203, the plurality of first samples in the first training set are input into a t-th round of neural network to be trained for processing, so as to obtain prediction scores corresponding to the plurality of first samples;
then, in step S204, the attention degree of the neural network to each attribute may be determined according to the feature information of the plurality of first samples and the prediction scores corresponding to the plurality of first samples, and F may be usedvRepresenting the attention of the neural network to the v attribute; in step S205, an attention threshold value in the t-th training round is determined, for example, the attention threshold value in the t-th training round is σ (t) · max { F }1,…,Fm};
Then, it can be determined in step S206 whether the attention of the neural network to each attribute is greater than or equal to the attention threshold, and if the attention of the neural network to the attribute is greater than or equal to the attention threshold, step S207 is executed, where the enhanced probability of the attribute is equal to the preset adjustment ratio ×, for example, the attention F of the neural network to the v-th attributevWhen the value is greater than or equal to the attention threshold, the enhancement probability P of the v-th attributev=·FvWherein denotes the adjustment ratio; otherwise, step S208 is executed, where the enhanced probability of the attribute is the attention degree of the neural network to the attribute, for example, the attention degree F of the neural network to the v-th attributevThe enhancement probability P of the v-th attribute is smaller than the attention thresholdv=Fv
Then, step S209 is executed, and the enhancement probabilities of the attributes determined in steps S207 and S208 are normalized to obtain normalized enhancement probabilities of the attributes;
after step S202 is performed, in step S210, a first enhancement rate S of the feature information of the plurality of first samples during the t-th round of training may be determinedtAnd in step S211, determining the enhancement quantity of the feature information of the plurality of first samples according to the first enhancement rate determined in step S210;
after steps S211 and S209 are performed, in step S212, the number of enhancements of the feature information of the plurality of first samples determined in step S211 and the attribute values of the respective attributes determined in step S209 may be determinedEnhancing probability, determining the characteristic information to be updated, updating the first sample in the first training set according to the characteristic information to be updated and a preset noise characteristic value to obtain an updated second training set, wherein a characteristic matrix of the updated second training set is represented as DG
After the second training set is determined in step S212, step S213 may be executed to perform a tth round of training on the neural network according to the second training set; after the t-th round of training is completed, in step S214, it may be determined whether the neural network meets a preset training end condition, and when the neural network does not meet the training end condition, step S215 is executed, where 1 is added to the training round, that is, t is t +1, and then step S202 is executed to perform the next round of training; when the neural network meets the training end condition, the training can be ended to obtain the trained neural network.
The following describes a method for training a neural network of a recommendation system based on feature enhancement with reference to a specific example.
The method comprises the steps that a neural network is assumed to be applied to a movie and television work recommendation system and used for predicting scores of users on movie and television works, a prediction result is expressed as a score, and the value range of the score is 1-5; the user attributes are 4, including user identification, age, gender and occupation, the object to be recommended is a movie work, the object attributes are 21, including movie identification, year and 19 movie categories (science fiction, love, war, etc.).
The total number of samples is 10 thousands, and the characteristic information of each sample comprises characteristic information representing the attribute of the user and characteristic information representing the attribute of the object of the film and television work.
During the t-th round of training, the specific values of the preset variables are shown in the following table 1:
TABLE 1
Figure BDA0002418139920000211
All samples may be first represented as a feature matrix D1Feature matrix D1Has 1 × 105Row, 25 columns; then, inputting each sample into the neural network to be trained in the t-th round respectively for processing to obtain the sampleA corresponding prediction score; determining the attention degree of the neural network to each attribute through interlayer correlation propagation according to the prediction scores corresponding to the samples and the characteristic information of the samples, wherein the attention degree of the neural network to the v' th attribute can be represented as Fv′Wherein v' is 1, …, 25;
can be determined according to the formula σ (t) ═ 1.1/(1+ e)3-t) And 0.1 and the attention of the neural network to each attribute, the enhancement probability of each attribute is determined by the above formula (4). Enhanced probability P of the v' th attributev′Comprises the following steps:
Figure BDA0002418139920000221
and carrying out normalization processing by the formula (5) to obtain normalized enhancement probability of each attribute, wherein the normalized enhancement probability of the v 'th attribute can be expressed as P'v′
Then, the maximum enhancement rate s is 0.2, and the initial enhancement rate s is determined0The first enhancement rate s of the feature information of all samples at the time of the tth round of training is determined by the above equation (1) with the enhancement rate change value Δ of 0.005 for each round of 0.1t
st=min(s,s0+Δ·t)=min(0.2,0.1+0.005t);
The enhancement number G of feature information for all samples can then be determined:
G=n×m×st=105×25×min(0.2,0.1+0.005t);
can be derived from the feature matrix D according to the enhancement quantity G1Randomly selecting G rows, and randomly selecting one column according to the enhanced probability of each attribute for any one row in the selected G rows, wherein the probability of the selected v 'column is P'v′(ii) a The characteristic information of the corresponding positions of the G row-column pairs is selected, namely the characteristic information to be updated; replacing the feature matrix D with a noise eigenvalue U-01Obtaining the updated feature matrix from the feature information to be updated
Figure BDA0002418139920000222
With the initial feature matrix D1Feature matrix
Figure BDA0002418139920000223
Is a feature matrix after feature enhancement.
All samples can be divided into a training set, a test set and a validation set according to a ratio of 8:1: 1. Accordingly, the feature matrix may be scaled by 8:1:1
Figure BDA0002418139920000224
Partitioning into feature matrices corresponding to training sets
Figure BDA0002418139920000225
Feature matrices corresponding to test sets
Figure BDA0002418139920000226
And a test matrix corresponding to the validation set
Figure BDA0002418139920000227
Feature matrices corresponding to the training set may be used
Figure BDA0002418139920000228
And carrying out the t round training on the neural network. After the t round of training is completed, the feature matrix corresponding to the test set can be used
Figure BDA0002418139920000229
Evaluating the effect of a neural network using a test matrix corresponding to a validation set
Figure BDA00024181399200002210
Judging whether the neural network meets a preset training end condition, for example, the training end condition can be that the effect of the neural network is reduced on the verification set in 5 continuous rounds; when the neural network does not meet the training end condition, carrying out the next round of training; and when the neural network meets the training end condition, ending the training to obtain the trained neural network.
In one possible implementation, RMSE (Root mean square Error) may be used when evaluating the effect of the neural network. The smaller the RMSE value, the better the neural network. Compared with a neural network which does not use the training method, the RMSE value of the neural network which uses the training method is smaller, the effect of the neural network which uses the training method is better, and the accuracy of the prediction result of the neural network is higher.
According to the embodiment of the disclosure, when the method is applied to neural network training of a recommendation system, the attention threshold and the first enhancement rate can be adjusted according to the current training round, so that the feature information corresponding to the attribute with low attention can be enhanced in the initial stage (for example, the first rounds) of the neural network training, and along with the enhancement of the training round, the feature information corresponding to the attribute with high attention is gradually enhanced, so that the neural network is promoted to learn the condition that part of the feature information is noisy, the over-fitting or over-dependence of the neural network on part of the feature information can be avoided, the robustness of the neural network is improved, and meanwhile, the accuracy of neural network prediction is also improved.
It should be noted that, although the above embodiments are described as examples of the feature enhancement based recommendation system neural network training method, those skilled in the art can understand that the disclosure should not be limited thereto. In fact, the user can flexibly set each step according to personal preference and/or actual application scene, as long as the technical scheme of the disclosure is met.
FIG. 3 shows a block diagram of a feature enhancement based recommendation system neural network training apparatus, according to an embodiment of the present disclosure. As shown in fig. 3, the apparatus includes:
the predicted score determining module 31 is configured to input a plurality of first samples in a preset first training set into a t-th to-be-trained neural network for processing, so as to obtain predicted scores corresponding to the plurality of first samples, where t is a positive integer, and the first samples include feature information representing user attributes and feature information representing object attributes of an object to be recommended;
an attention degree determining module 32, configured to determine attention degrees of the neural network to each attribute according to the feature information of the plurality of first samples and the prediction scores corresponding to the plurality of first samples;
an enhanced probability determining module 33, configured to determine enhanced probabilities of the attributes according to a preset attention threshold and attention of the neural network to the attributes, respectively;
a to-be-updated feature determining module 34, configured to determine, according to a preset first enhancement rate of the feature information of the multiple first samples and the enhancement probability of each attribute, feature information to be updated from the feature information of the multiple first samples;
a training set updating module 35, configured to update a first sample in the first training set according to the feature information to be updated and a preset noise feature value, so as to obtain an updated second training set;
a training module 36, configured to perform a tth round of training on the neural network according to the second training set,
the neural network is applied to a recommendation system and used for predicting the scoring of a user on an object to be recommended in the recommendation system.
In one possible implementation, the apparatus further includes:
the first enhancement rate determining module is used for determining second enhancement rates of the characteristic information of the plurality of first samples during the t-th round of training according to a preset initial enhancement rate, a preset maximum enhancement rate and a preset change value of each round of enhancement rate;
and the second enhancement rate determining module is used for determining a first enhancement rate of the feature information of the plurality of first samples during the tth round of training according to the maximum enhancement rate and the second enhancement rate.
According to the embodiment of the disclosure, when the method is applied to the neural network training of the recommendation system, the enhancement probability of each attribute can be determined according to the attention degree of the neural network to be trained in the current training turn to each attribute, further determining a training set which is used in the current training round and is subjected to characteristic enhancement according to the enhancement probability of each attribute and a preset first enhancement rate, training the neural network by using the training set, thereby optimizing the attention degree of the neural network to different attributes in the training process of the neural network, so that the neural network can comprehensively utilize all the characteristic information during prediction, avoid overfitting or over-dependence on partial characteristic information, meanwhile, when part of characteristic information of the neural network is noisy, other characteristic information is fully utilized for prediction, the robustness of the neural network is improved, and meanwhile, the accuracy of neural network prediction is improved.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A recommendation system neural network training method based on feature enhancement is characterized by comprising the following steps:
inputting a plurality of first samples in a preset first training set into a t-th round of neural network to be trained for processing to obtain prediction scores corresponding to the plurality of first samples, wherein t is a positive integer, and the first samples comprise characteristic information representing user attributes and characteristic information representing object attributes of objects to be recommended;
according to the feature information of the first samples and the prediction scores corresponding to the first samples, the attention degree of the neural network to each attribute is respectively determined;
respectively determining the enhancement probability of each attribute according to a preset attention threshold and the attention of the neural network to each attribute;
determining feature information to be updated from the feature information of the plurality of first samples according to a preset first enhancement rate of the feature information of the plurality of first samples and the enhancement probability of each attribute;
updating a first sample in the first training set according to the feature information to be updated and a preset noise feature value to obtain an updated second training set;
performing a t-th round of training on the neural network according to the second training set,
the neural network is applied to a recommendation system and used for predicting the scoring of a user on an object to be recommended in the recommendation system.
2. The method of claim 1, wherein determining the attention of the neural network to each attribute according to the feature information of the first samples and the prediction scores corresponding to the first samples comprises:
for any first sample in a first training set, respectively determining a first contribution value of each feature information of the first sample to a prediction score according to the feature information of the first sample and the prediction score corresponding to the first sample;
for any attribute in the plurality of attributes, determining a second contribution value of the feature information corresponding to the attribute from the first contribution values of the feature information of the first samples;
and determining the average value of the second contribution values as the attention degree of the neural network to the attribute.
3. The method according to claim 1, wherein determining feature information to be updated from the feature information of the plurality of first samples according to a preset first enhancement rate of the feature information of the plurality of first samples and the enhancement probability of each attribute comprises:
determining the enhancement quantity of the feature information of the plurality of first samples according to a preset first enhancement rate of the feature information of the plurality of first samples;
randomly selecting a plurality of second samples from a plurality of first samples of the first training set, wherein the number of the second samples is the same as the enhancement number;
and for any second sample, randomly selecting one attribute from a plurality of attributes according to the enhanced probability of each attribute, and determining the feature information corresponding to the randomly selected attribute in the second sample as the feature information to be updated.
4. The method of claim 1, further comprising:
determining a second enhancement rate of the characteristic information of the plurality of first samples during the t-th round of training according to a preset initial enhancement rate, a preset maximum enhancement rate and a preset change value of each round of enhancement rate;
and determining a first enhancement rate of the feature information of the plurality of first samples during the t-th round of training according to the maximum enhancement rate and the second enhancement rate.
5. The method of claim 1, wherein determining the enhanced probability of each attribute according to a preset attention threshold and the attention of the neural network to each attribute respectively comprises:
for any attribute, determining the attention degree of the neural network as the enhancement probability of the attribute under the condition that the attention degree of the neural network to the attribute is smaller than a preset attention degree threshold value.
6. The method of claim 5, wherein determining the enhanced probability of each attribute according to a preset attention threshold and the attention of the neural network to each attribute respectively, further comprises:
for any attribute, determining the product of the attention degree and a preset adjustment proportion as the enhancement probability of the attribute under the condition that the attention degree of the neural network to the attribute is greater than or equal to a preset attention degree threshold value.
7. The method according to claim 2, wherein the neural network includes an input layer that inputs the feature information of each first sample, an N-level intermediate layer that outputs the prediction score corresponding to each first sample, and an output layer that outputs N-level intermediate feature information in the process, respectively, where N is a positive integer,
determining a first contribution value of each feature information of the first sample to the prediction score according to the feature information of the first sample and the prediction score corresponding to the first sample, respectively, including:
according to the prediction scores corresponding to the first samples, determining the contribution values of the N-th-level intermediate characteristic information to the prediction scores respectively;
determining the contribution value of each N-1 level intermediate characteristic information to the prediction score according to the contribution value of each N-level intermediate characteristic information to the prediction score, the N-level intermediate characteristic information and the N-1 level intermediate characteristic information;
determining the contribution value of each i-1 level intermediate characteristic information to the prediction score according to the contribution value of each i-level intermediate characteristic information to the prediction score, the i-level intermediate characteristic information and the i-1 level intermediate characteristic information, wherein i is an integer and is more than or equal to 2 and less than or equal to N;
and respectively determining a first contribution value of each characteristic information of the first sample to the prediction score according to the contribution value of each level 1 intermediate characteristic information to the prediction score, the level 1 intermediate characteristic information and the characteristic information of the first sample.
8. The method of claim 1, wherein the feature information of the first samples in the first training set is represented by a feature matrix, each row of the feature matrix representing one first sample, and each column of the feature matrix representing one attribute.
9. A recommendation system neural network training apparatus based on feature enhancement, the apparatus comprising:
the system comprises a prediction score determining module, a recommendation score determining module and a recommendation score determining module, wherein the prediction score determining module is used for inputting a plurality of first samples in a preset first training set into a t-th to-be-trained neural network for processing to obtain prediction scores corresponding to the first samples, t is a positive integer, and the first samples comprise characteristic information representing user attributes and characteristic information representing object attributes of objects to be recommended;
the attention degree determining module is used for respectively determining the attention degrees of the neural network to each attribute according to the feature information of the first samples and the prediction scores corresponding to the first samples;
the enhancement probability determination module is used for respectively determining the enhancement probability of each attribute according to a preset attention threshold and the attention of the neural network to each attribute;
the to-be-updated feature determination module is used for determining feature information to be updated from the feature information of the plurality of first samples according to a preset first enhancement rate of the feature information of the plurality of first samples and the enhancement probability of each attribute;
the training set updating module is used for updating a first sample in the first training set according to the feature information to be updated and a preset noise feature value to obtain an updated second training set;
a training module for performing the t-th round of training on the neural network according to the second training set,
the neural network is applied to a recommendation system and used for predicting the scoring of a user on an object to be recommended in the recommendation system.
10. The apparatus of claim 9, further comprising:
the first enhancement rate determining module is used for determining second enhancement rates of the characteristic information of the plurality of first samples during the t-th round of training according to a preset initial enhancement rate, a preset maximum enhancement rate and a preset change value of each round of enhancement rate;
and the second enhancement rate determining module is used for determining a first enhancement rate of the feature information of the plurality of first samples during the tth round of training according to the maximum enhancement rate and the second enhancement rate.
CN202010197501.9A 2020-03-19 2020-03-19 Recommendation system neural network training method and device based on feature enhancement Active CN111414539B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010197501.9A CN111414539B (en) 2020-03-19 2020-03-19 Recommendation system neural network training method and device based on feature enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010197501.9A CN111414539B (en) 2020-03-19 2020-03-19 Recommendation system neural network training method and device based on feature enhancement

Publications (2)

Publication Number Publication Date
CN111414539A true CN111414539A (en) 2020-07-14
CN111414539B CN111414539B (en) 2023-09-01

Family

ID=71493180

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010197501.9A Active CN111414539B (en) 2020-03-19 2020-03-19 Recommendation system neural network training method and device based on feature enhancement

Country Status (1)

Country Link
CN (1) CN111414539B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021152941A (en) * 2020-11-16 2021-09-30 バイドゥ オンライン ネットワーク テクノロジー (ベイジン) カンパニー リミテッド Object recommendation method, neural network and training method thereof, device and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10108902B1 (en) * 2017-09-18 2018-10-23 CS Disco, Inc. Methods and apparatus for asynchronous and interactive machine learning using attention selection techniques
CN109509054A (en) * 2018-09-30 2019-03-22 平安科技(深圳)有限公司 Method of Commodity Recommendation, electronic device and storage medium under mass data
CN109902222A (en) * 2018-11-30 2019-06-18 华为技术有限公司 Recommendation method and device
WO2020020088A1 (en) * 2018-07-23 2020-01-30 第四范式(北京)技术有限公司 Neural network model training method and system, and prediction method and system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10108902B1 (en) * 2017-09-18 2018-10-23 CS Disco, Inc. Methods and apparatus for asynchronous and interactive machine learning using attention selection techniques
WO2020020088A1 (en) * 2018-07-23 2020-01-30 第四范式(北京)技术有限公司 Neural network model training method and system, and prediction method and system
CN109509054A (en) * 2018-09-30 2019-03-22 平安科技(深圳)有限公司 Method of Commodity Recommendation, electronic device and storage medium under mass data
CN109902222A (en) * 2018-11-30 2019-06-18 华为技术有限公司 Recommendation method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021152941A (en) * 2020-11-16 2021-09-30 バイドゥ オンライン ネットワーク テクノロジー (ベイジン) カンパニー リミテッド Object recommendation method, neural network and training method thereof, device and medium
JP7194233B2 (en) 2020-11-16 2022-12-21 バイドゥ オンライン ネットワーク テクノロジー(ペキン) カンパニー リミテッド Object recommendation method, neural network and its training method, device and medium

Also Published As

Publication number Publication date
CN111414539B (en) 2023-09-01

Similar Documents

Publication Publication Date Title
CN109408731B (en) Multi-target recommendation method, multi-target recommendation model generation method and device
CN111563164B (en) Specific target emotion classification method based on graph neural network
CN111460130B (en) Information recommendation method, device, equipment and readable storage medium
CN111966914B (en) Content recommendation method and device based on artificial intelligence and computer equipment
CN107633444B (en) Recommendation system noise filtering method based on information entropy and fuzzy C-means clustering
CN114492363B (en) Small sample fine adjustment method, system and related device
CN110688479B (en) Evaluation method and sequencing network for generating abstract
CN112464100B (en) Information recommendation model training method, information recommendation method, device and equipment
CN112861945B (en) Multi-mode fusion lie detection method
CN108763367B (en) Method for recommending academic papers based on deep alignment matrix decomposition model
CN112380451A (en) Favorite content recommendation method based on big data
CN113221019A (en) Personalized recommendation method and system based on instant learning
CN112256965A (en) Neural collaborative filtering model recommendation method based on lambdamat
CN112749737A (en) Image classification method and device, electronic equipment and storage medium
CN116680363A (en) Emotion analysis method based on multi-mode comment data
CN117216281A (en) Knowledge graph-based user interest diffusion recommendation method and system
CN113516094B (en) System and method for matching and evaluating expert for document
CN114428910A (en) Resource recommendation method and device, electronic equipment, product and medium
CN113627151A (en) Cross-modal data matching method, device, equipment and medium
CN111414539B (en) Recommendation system neural network training method and device based on feature enhancement
CN111523311B (en) Search intention recognition method and device
CN112434512A (en) New word determining method and device in combination with context
CN116452293A (en) Deep learning recommendation method and system integrating audience characteristics of articles
CN115344698A (en) Label processing method, label processing device, computer equipment, storage medium and program product
CN114936890A (en) Counter-fact fairness recommendation method based on inverse tendency weighting method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant