CN117350403A - Method and device for training correlation analysis model - Google Patents

Method and device for training correlation analysis model

Info

Publication number
CN117350403A
CN117350403A (Application No. CN202311289029.1A)
Authority
CN
China
Prior art keywords
correlation
analysis model
scores
correlation analysis
sample object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311289029.1A
Other languages
Chinese (zh)
Inventor
郑凌瀚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202311289029.1A
Publication of CN117350403A
Legal status: Pending

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/24 - Querying
    • G06F 16/245 - Query processing
    • G06F 16/2457 - Query processing with adaptation to user needs
    • G06F 16/24578 - Query processing with adaptation to user needs using ranking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 - Details of database functions independent of the retrieved data types
    • G06F 16/95 - Retrieval from the web
    • G06F 16/953 - Querying, e.g. by the use of web search engines
    • G06F 16/9536 - Search customisation based on social or collaborative filtering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/092 - Reinforcement learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/0985 - Hyperparameter optimisation; Meta-learning; Learning-to-learn

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present specification provide a method and an apparatus for training a correlation analysis model. One embodiment of the method comprises the following steps: acquiring sample data, wherein the sample data comprises a sample object, a plurality of contents to be matched and a plurality of correlation labels between the sample object and the plurality of contents; inputting the sample object and the plurality of contents into a correlation analysis model to obtain a plurality of prediction scores of correlation between the sample object and the plurality of contents; smoothing the plurality of prediction scores to obtain a plurality of processing scores; determining a correlation coefficient between the plurality of processing scores and the plurality of correlation labels as a prediction loss; and adjusting model parameters of the correlation analysis model with the aim of minimizing the prediction loss.

Description

Method and device for training correlation analysis model
Technical Field
The embodiment of the specification relates to the technical field of computers, in particular to a method and a device for training a correlation analysis model.
Background
In the field of computer technology, a variety of tasks involve correlation analysis. For example, in an item recommendation task, the items to be recommended can be filtered and ranked based on a correlation analysis between the user and the items to be recommended, so that items highly correlated with the user are selected and ranked higher, improving the effectiveness of the whole recommendation pipeline. For another example, in an information retrieval task, the search results can be filtered and ranked based on a correlation analysis between the search information and the search results, so that results highly correlated with the search information are selected and ranked higher, improving the information retrieval effect. Accordingly, correlation analysis and ranking of data are necessary for a variety of tasks.
Disclosure of Invention
Embodiments of the present specification describe a method and an apparatus for training a correlation analysis model. When the correlation analysis model is trained, the plurality of prediction scores obtained from the correlation analysis model are smoothed, so that the distribution of the resulting processing scores becomes smoother. Then, the correlation coefficient between the plurality of processing scores and the plurality of correlation labels is taken as the prediction loss, and the model parameters of the correlation analysis model are adjusted based on this loss. Because the distribution of the processing scores is smoother, it better reflects the ordering among the prediction scores of the model; adjusting the model parameters based on the prediction loss between the processing scores and the correlation labels therefore makes the prediction scores output by the trained correlation analysis model more suitable for ranking.
According to a first aspect, there is provided a method of training a correlation analysis model, comprising: acquiring sample data, wherein the sample data comprises a sample object, a plurality of contents to be matched and a plurality of correlation labels between the sample object and the plurality of contents; inputting the sample object and the plurality of contents into a correlation analysis model to obtain a plurality of prediction scores of correlation between the sample object and the plurality of contents; smoothing the plurality of prediction scores to obtain a plurality of processing scores; determining a correlation coefficient between the plurality of processing scores and the plurality of correlation labels as a prediction loss; and adjusting model parameters of the correlation analysis model with the aim of minimizing the prediction loss.
According to a second aspect, there is provided a relevance ranking method, comprising: acquiring a target object and a plurality of candidate contents; inputting the target object and the plurality of candidate contents into a correlation analysis model to obtain a plurality of prediction scores of the correlation between the target object and the plurality of candidate contents, wherein the correlation analysis model is trained according to the method described in the first aspect; and determining a ranking result of the plurality of candidate contents according to the plurality of prediction scores.
According to a third aspect, there is provided an apparatus for training a correlation analysis model, comprising: a first acquisition unit configured to acquire sample data including a sample object, a plurality of pieces of content to be matched, and a plurality of correlation tags between the sample object and the plurality of pieces of content; a first input unit configured to input the sample object and the plurality of contents into a correlation analysis model, and obtain a plurality of prediction scores of correlations between the sample object and the plurality of contents; a smoothing unit configured to perform smoothing processing on the plurality of prediction scores to obtain a plurality of processing scores; a determining unit configured to determine, as a prediction loss, a correlation coefficient between the plurality of processing scores and the plurality of correlation labels; and an adjustment unit configured to adjust model parameters of the correlation analysis model with the objective of minimizing the prediction loss.
According to a fourth aspect, there is provided a relevance ranking apparatus comprising: a second acquisition unit configured to acquire a target object and a plurality of candidate contents; a second input unit configured to input the target object and the plurality of candidate contents into a correlation analysis model, and obtain a plurality of prediction scores of correlations between the target object and the plurality of candidate contents; wherein the correlation analysis model is trained according to the method described in the first aspect; and a ranking unit configured to determine ranking results of the plurality of candidate contents according to the plurality of prediction scores.
According to a fifth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform a method as described in any of the implementations of the first or second aspects.
According to a sixth aspect, there is provided a computing device comprising a memory and a processor, wherein the memory has executable code stored therein, and wherein the processor, when executing the executable code, implements a method as described in any implementation of the first or second aspect.
According to the method and the apparatus for training a correlation analysis model provided by the embodiments of the present specification, the sample data comprises a sample object, a plurality of pieces of content to be matched, and a plurality of correlation labels between the sample object and the plurality of pieces of content. The sample object and the plurality of contents are input into the correlation analysis model to obtain a plurality of prediction scores of the correlation between the sample object and the plurality of contents. Then, the plurality of prediction scores are smoothed to obtain a plurality of processing scores. The correlation coefficient between the plurality of processing scores and the plurality of correlation labels is taken as the prediction loss, and the model parameters of the correlation analysis model are adjusted with the aim of minimizing the prediction loss. Because the distribution of the processing scores is smoother, it better reflects the ordering among the prediction scores of the model; adjusting the model parameters based on this prediction loss therefore makes the prediction scores output by the trained correlation analysis model more suitable for ranking.
Drawings
FIG. 1 shows a schematic diagram of one application scenario in which embodiments of the present description may be applied;
FIG. 2 illustrates a flow chart of a method of training a correlation analysis model, according to one embodiment;
FIG. 3 illustrates a flow diagram of a relevance ranking method according to one embodiment;
FIG. 4 shows a schematic block diagram of an apparatus for training a correlation analysis model according to one embodiment;
FIG. 5 shows a schematic block diagram of a relevance ranking apparatus according to one embodiment.
Detailed Description
The technical scheme provided in the present specification is further described in detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings. It should be noted that, without conflict, the embodiments of the present specification and features in the embodiments may be combined with each other.
As previously mentioned, correlation analysis and ordering of data is necessary for a variety of tasks.
To this end, embodiments of the present disclosure provide a method of training a correlation analysis model, such that the correlation prediction scores output by the trained correlation analysis model are more suitable for ranking. Fig. 1 shows a schematic diagram of one application scenario in which embodiments of the present description may be applied. In this application scenario, the correlation analysis model is used to analyze the correlation between a user and items to be recommended. The sample data used in training the correlation analysis model comprises a target user U, a plurality of items to be recommended I(1)-I(N), where N≥2 and N is a positive integer, and a plurality of correlation labels Y(1)-Y(N) between the target user and the plurality of items. Based on this, the electronic device 101 for training the correlation analysis model may input the target user U and the plurality of items to be recommended I(1)-I(N) into the correlation analysis model 1011, and the correlation analysis model 1011 outputs a plurality of prediction scores X(1)-X(N) of the correlation between the target user U and the plurality of items to be recommended I(1)-I(N). For example, the correlation analysis model 1011 may obtain a prediction score corresponding to each item to be recommended based on the target user U and each of the items I(1)-I(N). Then, the plurality of prediction scores X(1)-X(N) are smoothed to obtain a plurality of processing scores S(1)-S(N). The correlation coefficient between the plurality of processing scores S(1)-S(N) and the plurality of correlation labels Y(1)-Y(N) is determined as the prediction loss, and the model parameters of the correlation analysis model 1011 are adjusted with the aim of minimizing the prediction loss.
With continued reference to fig. 2, fig. 2 illustrates a flow chart of a method of training a correlation analysis model according to one embodiment. It is understood that the method may be performed by any apparatus, device, platform, or device cluster having computing and processing capabilities. As shown in fig. 2, the method of training the correlation analysis model may include the following steps:
In step S201, sample data is acquired.
In this embodiment, each item of sample data used to train the correlation analysis model may include a sample object, a plurality of pieces of content to be matched, and a plurality of correlation labels between the sample object and the plurality of pieces of content. Here, the items of content and the correlation labels are in one-to-one correspondence, so the plurality of items of content and the plurality of correlation labels are equal in number. The correlation analysis model may be a neural network model with any of various network structures, which is not limited herein.
In some implementations, the plurality of relevance labels between the sample object and the plurality of items of content may be relevance indicators between the sample object and the items of content, which may be derived statistically from actual interaction data. For example, assume that the sample object is a user and the plurality of items of content are a plurality of items to be recommended. A relevance indicator between the user and an item may be determined based on user operations on the item, such as searching, clicking, adding to favorites, or purchasing; for example, the more such operations the user performs, the higher the relevance indicator value, and the relevance label is then obtained based on the relevance indicator value.
In some implementations, the plurality of relevance labels between the sample object and the plurality of items of content may be converted from ranking labels between the sample object and the items of content. In some scenarios, the directly available data is often a ranking order with respect to relevance, i.e., a ranking label, between a sample object and multiple items of content. For example, in a manual annotation scenario, an annotator typically only needs to annotate the relevance ordering between the various items of content and the sample object. For another example, in the item recommendation scenario above, the order in which the user clicks on the items can be obtained as the ranking label. Having obtained the ranking labels, they can be converted into relevance labels representing the degree of relevance by some simple algorithm. For example, if there are n items of content, the content ranked at position i in the relevance ranking may be assigned a relevance label value of (n-i)/n. The conversion algorithm may vary, as long as content ranked lower receives a lower relevance label value.
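As a concrete illustration of this conversion, the following Python sketch (a hypothetical helper, not part of the specification) assigns the content ranked at position i among n items the label value (n-i)/n mentioned above; any other mapping that decreases with rank would equally satisfy the stated requirement.

```python
def ranking_to_relevance_labels(ranked_content_ids):
    """Convert an ordered list of content ids (most relevant first, rank i = 1..n)
    into relevance label values using the (n - i) / n rule described above."""
    n = len(ranked_content_ids)
    # Rank 1 gets the highest label value (n - 1) / n; the last rank gets 0.
    return {content_id: (n - i) / n
            for i, content_id in enumerate(ranked_content_ids, start=1)}

# Example: five items annotated only with their relevance ordering.
labels = ranking_to_relevance_labels(["item_c", "item_a", "item_e", "item_b", "item_d"])
# {'item_c': 0.8, 'item_a': 0.6, 'item_e': 0.4, 'item_b': 0.2, 'item_d': 0.0}
```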
Thus, multiple relevance labels between a sample object and multiple items of content can be obtained in a number of ways.
Then, in step S203, the sample object and the plurality of contents are input into the correlation analysis model, and a plurality of prediction scores of correlation between the sample object and the plurality of contents are obtained.
In this embodiment, the sample object and the plurality of contents in the sample data may be input into the correlation analysis model to obtain a prediction score of the correlation between the sample object and each item of content, yielding a plurality of prediction scores, with each item of content corresponding to one prediction score. Here, the higher the prediction score, the stronger the correlation between the sample object and that item of content. The plurality of prediction scores and the plurality of correlation labels are equal in number.
In some implementations, the sample object may include a target user, and the plurality of items of content may include a plurality of items to be recommended for the target user. In this application scenario, step S203 above may be specifically performed as follows: a user characterization vector corresponding to the target user and an item characterization vector corresponding to each item are generated through the correlation analysis model, and a first correlation between the user characterization vector and each item characterization vector is determined as the plurality of prediction scores.
In this implementation, the target user may correspond to user profile information, which may include, for example but not limited to, the user's occupation, registration duration, and hobbies. The plurality of items to be recommended may be not only commodities but also content-type objects, such as movies, music, and articles. Each of the plurality of items may correspond to item information, which may include, for example but not limited to, the name, price, pictures, video, and introduction of the item. In this way, a user characterization vector corresponding to the target user and an item characterization vector corresponding to each item can be generated through the correlation analysis model, and a first correlation between the user characterization vector and each item characterization vector can be determined; the obtained first correlations are used as the plurality of prediction scores. For example, the first correlation may be obtained by calculating a dot product between the user characterization vector and the item characterization vector.
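As an illustration of this style of scoring, the following PyTorch sketch encodes the user and the items into characterization vectors and uses their dot products as prediction scores; the encoder architecture, feature dimensions, and class name are assumptions for illustration, not the specification's prescribed design.

```python
import torch
import torch.nn as nn

class TwoTowerCorrelationModel(nn.Module):
    """Illustrative correlation analysis model: encodes the user and each item
    into characterization vectors and scores them by dot product."""
    def __init__(self, user_feat_dim, item_feat_dim, hidden_dim=128):
        super().__init__()
        self.user_encoder = nn.Sequential(nn.Linear(user_feat_dim, hidden_dim), nn.ReLU(),
                                          nn.Linear(hidden_dim, hidden_dim))
        self.item_encoder = nn.Sequential(nn.Linear(item_feat_dim, hidden_dim), nn.ReLU(),
                                          nn.Linear(hidden_dim, hidden_dim))

    def forward(self, user_features, item_features):
        # user_features: (user_feat_dim,); item_features: (N, item_feat_dim)
        u = self.user_encoder(user_features)   # user characterization vector
        v = self.item_encoder(item_features)   # item characterization vectors
        return v @ u                           # N prediction scores (dot products)

model = TwoTowerCorrelationModel(user_feat_dim=32, item_feat_dim=64)
scores = model(torch.randn(32), torch.randn(5, 64))   # five prediction scores X(1)..X(5)
```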
In some implementations, the sample object may include retrieval information, and the plurality of items of content may include a plurality of retrieval results. In this application scenario, step S203 above may be specifically performed as follows: a retrieval characterization vector corresponding to the retrieval information and a retrieval result characterization vector corresponding to each retrieval result are generated through the correlation analysis model, and a second correlation between the retrieval characterization vector and each retrieval result characterization vector is determined as the plurality of prediction scores.
In this implementation, the retrieval information may be any of various kinds of information used for information retrieval, for example, text (such as search terms), pictures, or voice. The plurality of items of content may include a plurality of retrieval results corresponding to the retrieval information. In this way, the retrieval characterization vector corresponding to the retrieval information and the retrieval result characterization vector corresponding to each retrieval result can be generated through the correlation analysis model, and the second correlation between the retrieval characterization vector and each retrieval result characterization vector can be calculated; the obtained second correlations are used as the plurality of prediction scores. For example, the second correlation may be obtained by calculating a dot product between the retrieval characterization vector and the retrieval result characterization vector.
In some implementations, the sample object may include target information, and the plurality of items of content may include a plurality of tags. In this application scenario, step S203 above may be specifically performed as follows: correlations between the target information and the respective tags are generated through the correlation analysis model as the plurality of prediction scores.
This implementation involves information understanding tasks, in which multiple tags often need to be extracted from a piece of information to describe its content. In this example, the target information may include information such as text, images, audio, and video, and may also include multi-modal information in which information of multiple modalities (e.g., text, image, audio, video) is fused. The plurality of items of content may include a plurality of tags. In this way, the correlation between the target information and each of the plurality of tags can be generated through the correlation analysis model, and the plurality of correlations are taken as the plurality of prediction scores.
In some implementations, the sample object may include a target question, and the plurality of items of content may include a plurality of items of answer information. The correlation analysis model may be implemented by a pre-trained language model. In this application scenario, step S203 above may be specifically performed as follows: the target question is concatenated with each item of answer information into a text, and each text is input into the pre-trained language model to obtain a correlation score between the target question and that answer information; these correlation scores serve as the plurality of prediction scores.
In one embodiment, the correlation analysis model described above may be used as a reward model in a reinforcement learning system. Reinforcement learning is a machine learning method that aims to design an agent that can learn and improve autonomously in a specific environment, continuously optimizing its behavior policy by interacting with the environment so as to achieve some predetermined goal. A reward model in reinforcement learning is used to evaluate the effect of the agent's behavior; it guides the agent to learn the correct policy by giving reward or penalty signals, thereby continuously improving its behavior. The reward model may involve a variety of different metrics and algorithms, such as rule-based evaluation, adaptive evaluation, and model prediction. The design and implementation of these models has a significant impact on the performance and effectiveness of a reinforcement learning system.
In practice, reinforcement learning may be combined with a language model; for example, a chatbot may be trained based on reinforcement learning. In this example, the correlation analysis model may be used as the reward model and may be implemented by a pre-trained language model. Thus, for a plurality of items of answer information corresponding to the target question, the target question and each item of answer information can be concatenated into a text and input into the pre-trained language model to obtain a correlation score between the target question and that answer information as a prediction score. These prediction scores may be used as the rewards for the agent (i.e., the chatbot) outputting the respective answers, for further training of the chatbot.
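The sketch below illustrates this splice-and-score pattern using the Hugging Face transformers sequence-classification API; the backbone checkpoint, the separator format, and the function name are assumptions for illustration, and the single-score head is untrained until the reward model is fitted with the training method described in this specification.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical choice of backbone; any pre-trained language model with a
# single-score regression head could play the role of the reward model.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
reward_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=1)   # score head is randomly initialized until trained

def score_answers(question, answers):
    """Splice the target question with each answer into one text and score each text."""
    texts = [f"{question} [SEP] {a}" for a in answers]
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        scores = reward_model(**batch).logits.squeeze(-1)   # one prediction score per answer
    return scores
```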
In step S205, a plurality of prediction scores are smoothed to obtain a plurality of processing scores.
In this embodiment, various data smoothing methods (for example, binning-based smoothing, regression-based smoothing, additive smoothing, etc.) may be employed to smooth the plurality of prediction scores, and the plurality of processing scores are obtained after the smoothing.
In some implementations, the plurality of prediction scores may be processed using a softmax function with a temperature coefficient to obtain the plurality of smoothed processing scores, where the temperature coefficient may be a preset value, for example a value set manually according to actual needs. Processing the prediction scores with the temperature-scaled softmax smooths the distribution of the resulting processing scores, so that they carry richer information and better reflect the ordering.
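As an illustration, such temperature-scaled softmax smoothing can be written as follows (a minimal sketch; the temperature value 2.0 is only an example of a manually preset value):

```python
import torch

def smooth_scores(prediction_scores, temperature=2.0):
    """Smooth a vector of prediction scores with a temperature-scaled softmax.
    A larger temperature yields a flatter (smoother) distribution of processing scores."""
    return torch.softmax(prediction_scores / temperature, dim=-1)

x = torch.tensor([3.2, 1.5, 0.4, 2.8])   # prediction scores X(1)..X(N)
s = smooth_scores(x)                      # processing scores S(1)..S(N), summing to 1
```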
In step S207, a correlation coefficient between the plurality of processing scores and the plurality of correlation labels is determined as a prediction loss.
In this embodiment, a loss function may be preset. By way of example and not limitation, the loss function may include a mean squared error (MSE) loss, a mean absolute error (MAE) loss, a cross-entropy (CE) loss, a correlation coefficient, and the like. The prediction loss may be calculated based on the loss function, the plurality of processing scores, and the plurality of correlation labels.
In some implementations, the correlation coefficient may include a Spearman correlation coefficient. The Spearman correlation coefficient is a nonparametric measure of the dependence between two variables; it evaluates the relationship between two statistical variables using a monotonic function. As an example, a formula using the Spearman correlation coefficient as the loss function may be as follows:
loss = \frac{\sum_{i=1}^{n}\left(S_i-\bar{S}\right)\left(y_i-\bar{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(S_i-\bar{S}\right)^{2}}\sqrt{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2}}},\qquad S_i=\frac{\exp\left(x_i/T\right)}{\sum_{j=1}^{n}\exp\left(x_j/T\right)}
wherein x_i represents the i-th prediction score among the plurality of prediction scores; S_i represents the i-th processing score obtained by processing x_i with the softmax function with the temperature coefficient; y_i represents the i-th correlation label among the plurality of correlation labels; \bar{S} represents the mean of S; \bar{y} represents the mean of y; T represents the temperature coefficient; and n represents the number of prediction scores.
Because the Spearman correlation coefficient measures only the monotonic relationship between variables, it is well suited to modeling scenarios that fit the monotonicity of a ranking result. Using it as the loss function therefore makes the prediction scores output by the trained correlation analysis model more suitable for ranking.
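For illustration only, a differentiable sketch of this loss in PyTorch might look as follows; it follows the correlation formula above and negates the coefficient so that minimizing the returned value maximizes the correlation (the sign convention is an assumption, not stated by the specification). The helper name, temperature value, and epsilon are likewise illustrative.

```python
import torch

def correlation_ranking_loss(prediction_scores, relevance_labels, temperature=2.0, eps=1e-8):
    """Smoothed correlation loss: negated correlation coefficient between the
    temperature-softmax processing scores S and the relevance labels y."""
    s = torch.softmax(prediction_scores / temperature, dim=-1)   # processing scores S_i
    s_c = s - s.mean()                                           # S_i - mean(S)
    y_c = relevance_labels - relevance_labels.mean()             # y_i - mean(y)
    corr = (s_c * y_c).sum() / (s_c.norm() * y_c.norm() + eps)
    return -corr   # minimizing this loss maximizes the correlation
```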
In step S209, the model parameters of the correlation analysis model are adjusted with the aim of minimizing the prediction loss.
In this embodiment, various implementations may be used to adjust the model parameters of the correlation analysis model with the goal of minimizing the prediction loss. For example, the back propagation (BP) algorithm, the stochastic gradient descent (SGD) algorithm, or the like may be employed to adjust the model parameters of the correlation analysis model.
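As an illustration of such an update loop, the sketch below combines the pieces sketched above (the illustrative two-tower model, the temperature-softmax smoothing, and the correlation loss); the optimizer choice, learning rate, and feature dimensions are assumptions for illustration, not values prescribed by the specification.

```python
import torch

# Assumes the illustrative TwoTowerCorrelationModel and correlation_ranking_loss above.
model = TwoTowerCorrelationModel(user_feat_dim=32, item_feat_dim=64)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

def train_step(user_features, item_features, relevance_labels):
    prediction_scores = model(user_features, item_features)              # step S203
    loss = correlation_ranking_loss(prediction_scores, relevance_labels) # steps S205-S207
    optimizer.zero_grad()
    loss.backward()    # backpropagation through the smoothed correlation loss
    optimizer.step()   # gradient-descent update of the model parameters (step S209)
    return loss.item()
```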
Reviewing the above procedure, in the above embodiment, when the correlation analysis model is trained, the plurality of prediction scores obtained from the correlation analysis model are smoothed, so that the distribution of the resulting processing scores becomes smoother. Then, the correlation coefficient between the plurality of processing scores and the plurality of correlation labels is taken as the prediction loss, and the model parameters of the correlation analysis model are adjusted based on this loss. Because the distribution of the processing scores is smoother, it better reflects the ordering among the prediction scores of the model; adjusting the model parameters based on this prediction loss therefore makes the prediction scores output by the trained correlation analysis model more suitable for ranking.
The above describes the training process of the correlation analysis model. The correlation analysis model thus obtained can process an object and a plurality of candidate contents input into it to obtain a plurality of prediction scores of the correlation between the object and the candidate contents. The prediction scores thus obtained are more suitable for ranking; therefore, ranking the plurality of candidate contents based on these prediction scores yields a more accurate ranking result.
With continued reference to fig. 3, fig. 3 illustrates a flow chart of a relevance ranking method according to one embodiment. It is understood that the method may be performed by any apparatus, device, platform, or device cluster having computing and processing capabilities. The execution body that trains the correlation analysis model and the execution body that performs the relevance ranking method may be the same or different. As shown in fig. 3, the relevance ranking method may include the following steps:
In step S301, a target object and a plurality of candidate contents are acquired.
In this embodiment, different application scenarios may have different target objects and corresponding pluralities of candidate contents. For example, in an application scenario in which items are recommended to a user, the target object may be the user, who corresponds to user profile information, and the plurality of candidate contents may be a plurality of items to be recommended for the user, each item corresponding to item information. For another example, in an information retrieval application scenario, the target object may be retrieval information, and the plurality of candidate contents may be a plurality of retrieval results obtained based on the retrieval information. For another example, in an application scenario of tagging information, the target object may include various information to be tagged, such as text, images, and video, and the plurality of candidate contents may include a plurality of tags.
Step S303, inputting the target object and the plurality of candidate contents into a correlation analysis model to obtain a plurality of prediction scores of correlation between the target object and the plurality of candidate contents.
In this embodiment, the correlation analysis model may be a model trained according to the method described in fig. 2. The model can perform correlation analysis on the target object and multiple candidate contents, so as to obtain a prediction score of correlation between the target object and each candidate content, wherein each candidate content corresponds to one prediction score. Here, the higher the prediction score, the stronger the correlation between the target object and the candidate content.
Step S305, determining the sorting result of the plurality of candidate contents according to the plurality of prediction scores.
In this embodiment, a prediction score can be obtained for each candidate content based on the processing of the correlation analysis model. Based on the plurality of prediction scores corresponding to the plurality of candidate contents, the plurality of candidate contents can be ranked to obtain a ranking result. For example, the plurality of candidate contents may be ranked in descending order of their corresponding prediction scores.
In some implementations, the above relevance ranking method may further include the following, not shown in fig. 3: several candidate contents are recalled from the plurality of candidate contents as recommendation results according to the ranking result.
In this implementation, several candidate contents may be recalled from the plurality of candidate contents as recommendation results according to the ranking result of step S305. For example, assuming the ranking result orders the candidates from high to low by their corresponding prediction scores, the candidate contents ranked in the top several positions may be selected as the recommendation results.
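As a concrete illustration, the ranking and recall described above reduce to sorting by prediction score and keeping the top entries, as in the following sketch (the function name and the cutoff k are illustrative):

```python
def rank_and_recall(candidate_contents, prediction_scores, k=3):
    """Sort candidate contents by prediction score (descending) and recall the top k."""
    ranked = sorted(zip(candidate_contents, prediction_scores),
                    key=lambda pair: pair[1], reverse=True)
    ranking_result = [content for content, _ in ranked]
    recommendations = ranking_result[:k]   # recalled candidates used as recommendation results
    return ranking_result, recommendations
```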
Reviewing the above, in the above embodiment, the plurality of candidate contents are ranked based on the plurality of prediction scores output by the correlation analysis model. Since the prediction scores output by the trained correlation analysis model are more suitable for ranking, the ranking result obtained by the method of this embodiment can be more accurate.
According to an embodiment of another aspect, an apparatus for training a correlation analysis model is provided. The above-described means for training a correlation analysis model may be deployed in any device, platform or cluster of devices having computing, processing capabilities.
FIG. 4 illustrates a schematic block diagram of an apparatus for training a correlation analysis model, according to one embodiment. As shown in fig. 4, the apparatus 400 for training a correlation analysis model may include: a first obtaining unit 401 configured to obtain sample data, including a sample object, a plurality of contents to be matched, and a plurality of correlation labels between the sample object and the plurality of contents; a first input unit 402 configured to input the sample object and the plurality of contents into a correlation analysis model, and obtain a plurality of prediction scores of correlations between the sample object and the plurality of contents; a smoothing unit 403 configured to perform smoothing processing on the plurality of prediction scores to obtain a plurality of processing scores; a determining unit 404 configured to determine, as a prediction loss, a correlation coefficient between the plurality of processing scores and the plurality of correlation labels; and an adjustment unit 405 configured to adjust model parameters of the correlation analysis model with the objective of minimizing the prediction loss.
In some optional implementations of this embodiment, the correlation coefficient is a spearman correlation coefficient.
In some optional implementations of this embodiment, the smoothing unit 403 is further configured to: the plurality of prediction scores are processed by a softmax function having a temperature coefficient, wherein the temperature coefficient is a preset value.
In some optional implementations of this embodiment, the plurality of relevance labels are converted from an ordering label between the sample object and the plurality of items of content.
In some optional implementations of this embodiment, the sample object includes a target user, and the plurality of items of content includes a plurality of items to be recommended; and the first input unit 402 is further configured to: generate, through the correlation analysis model, a user characterization vector corresponding to the target user and an item characterization vector corresponding to each item; and determine a first correlation between the user characterization vector and each item characterization vector as the plurality of prediction scores.
In some optional implementations of this embodiment, the sample object includes retrieval information, and the plurality of items of content includes a plurality of retrieval results; and the first input unit 402 is further configured to: generate, through the correlation analysis model, a retrieval characterization vector corresponding to the retrieval information and a retrieval result characterization vector corresponding to each retrieval result; and determine a second correlation between the retrieval characterization vector and each retrieval result characterization vector as the plurality of prediction scores.
In some optional implementations of this embodiment, the sample object includes target information, and the plurality of items of content includes a plurality of tags; and, the first input unit 402 is further configured to: and generating correlations between the target information and the tags as a plurality of prediction scores by the correlation analysis model.
In some optional implementations of this embodiment, the sample object includes a target question, and the plurality of items of content includes a plurality of items of answer information; the correlation analysis model is implemented by a pre-trained language model, and the first input unit 402 is further configured to: concatenate the target question with each item of answer information into a text, and input the text into the pre-trained language model to obtain a correlation score between the target question and that answer information as one of the plurality of prediction scores.
According to an embodiment of another aspect, a relevance ranking apparatus is provided. The relevance ranking means described above may be deployed in any device, platform or cluster of devices having computing, processing capabilities.
FIG. 5 shows a schematic block diagram of a relevance ranking apparatus according to one embodiment. As shown in fig. 5, the relevance ranking apparatus 500 may include: a second acquisition unit 501 configured to acquire a target object and a plurality of candidate contents; a second input unit 502 configured to input the target object and the plurality of candidate contents into a correlation analysis model, and obtain a plurality of prediction scores of correlations between the target object and the plurality of candidate contents; wherein, the correlation analysis model is trained according to the method described in fig. 2; and a ranking unit 503 configured to determine ranking results of the plurality of candidate contents according to the plurality of prediction scores.
In some optional implementations of this embodiment, the apparatus 500 further includes: a recall unit (not shown) configured to recall a plurality of candidate contents from the plurality of candidate contents as recommendation results according to the sorting result.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in fig. 2 or fig. 3.
According to an embodiment of a further aspect, there is also provided a computing device including a memory and a processor, wherein the memory stores executable code, and the processor, when executing the executable code, implements the method described in fig. 2 or fig. 3.
Those of ordinary skill would further appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Those of ordinary skill in the art may implement the described functionality using different approaches for each particular application, but such implementation is not to be considered as beyond the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
While the foregoing embodiments have been described in some detail for purposes of clarity of understanding, it will be appreciated that the invention is not limited to the details of construction and the embodiments described above, but is intended to cover various modifications, equivalents, improvements and changes within the spirit and principles of the invention.

Claims (22)

1. A method of training a correlation analysis model, comprising:
acquiring sample data, wherein the sample data comprises a sample object, a plurality of contents to be matched and a plurality of correlation labels between the sample object and the plurality of contents;
inputting the sample object and the plurality of contents into a correlation analysis model to obtain a plurality of prediction scores of correlation between the sample object and the plurality of contents;
smoothing the plurality of prediction scores to obtain a plurality of processing scores;
determining a correlation coefficient between the plurality of processing scores and a plurality of correlation labels as a predictive loss;
and adjusting model parameters of the correlation analysis model with the aim of minimizing the prediction loss.
2. The method of claim 1, wherein the correlation coefficient is a spearman correlation coefficient.
3. The method of claim 1, wherein smoothing the plurality of prediction scores comprises:
the plurality of prediction scores are processed using a softmax function having a temperature coefficient, wherein the temperature coefficient is a preset value.
4. The method of claim 1, wherein the plurality of relevance labels are transformed from an ordering label between the sample object and the plurality of items of content.
5. The method of claim 1, wherein the sample object comprises a target user, and the plurality of items of content comprises a plurality of items to be recommended; and
wherein inputting the sample object and the plurality of contents into the correlation analysis model to obtain the plurality of prediction scores of the correlation between the sample object and the plurality of contents comprises:
generating, through the correlation analysis model, a user characterization vector corresponding to the target user and an item characterization vector corresponding to each item; and determining a first correlation between the user characterization vector and each item characterization vector as the plurality of prediction scores.
6. The method of claim 1, wherein the sample object comprises retrieval information, and the plurality of items of content comprises a plurality of retrieval results; and
wherein inputting the sample object and the plurality of contents into the correlation analysis model to obtain the plurality of prediction scores of the correlation between the sample object and the plurality of contents comprises:
generating, through the correlation analysis model, a retrieval characterization vector corresponding to the retrieval information and a retrieval result characterization vector corresponding to each retrieval result; and determining a second correlation between the retrieval characterization vector and each retrieval result characterization vector as the plurality of prediction scores.
7. The method of claim 1, wherein the sample object comprises target information, and the plurality of items of content comprises a plurality of tags; and
wherein inputting the sample object and the plurality of contents into the correlation analysis model to obtain the plurality of prediction scores of the correlation between the sample object and the plurality of contents comprises:
and generating correlations between the target information and the tags as a plurality of prediction scores through the correlation analysis model.
8. The method of claim 1, wherein the sample object comprises a target question, and the plurality of items of content comprises a plurality of items of answer information; the correlation analysis model is implemented by a pre-trained language model; and
wherein inputting the sample object and the plurality of contents into the correlation analysis model to obtain the plurality of prediction scores of the correlation between the sample object and the plurality of contents comprises:
and respectively splicing the target questions and the answer information into texts, and inputting a pre-trained language model to obtain the correlation scores of the target questions and the answer information as a plurality of prediction scores.
9. A method of relevance ranking comprising:
acquiring a target object and a plurality of candidate contents;
inputting the target object and the plurality of candidate contents into a correlation analysis model to obtain a plurality of prediction scores of correlation between the target object and the plurality of candidate contents; wherein the correlation analysis model is trained according to the method of claim 1;
and determining the sequencing result of the candidate contents according to the prediction scores.
10. The method of claim 9, wherein the method further comprises:
and recalling a plurality of candidate contents from the plurality of candidate contents as recommended results according to the sorting results.
11. An apparatus for training a correlation analysis model, comprising:
a first acquisition unit configured to acquire sample data including a sample object, a plurality of pieces of content to be matched, and a plurality of correlation tags between the sample object and the plurality of pieces of content;
a first input unit configured to input the sample object and the plurality of pieces of content into a correlation analysis model, and obtain a plurality of prediction scores of correlation between the sample object and the plurality of pieces of content;
a smoothing unit configured to perform smoothing processing on the plurality of prediction scores to obtain a plurality of processing scores;
a determining unit configured to determine, as a predictive loss, a correlation coefficient between the plurality of processing scores and the plurality of correlation labels;
and an adjustment unit configured to adjust model parameters of the correlation analysis model with the aim of minimizing the prediction loss.
12. The apparatus of claim 11, wherein the correlation coefficient is a spearman correlation coefficient.
13. The apparatus of claim 11, wherein the smoothing unit is further configured to:
the plurality of prediction scores are processed using a softmax function having a temperature coefficient, wherein the temperature coefficient is a preset value.
14. The apparatus of claim 11, wherein the plurality of relevance labels are transformed from an ordering label between the sample object and the plurality of items of content.
15. The apparatus of claim 11, wherein the sample object comprises a target user, the plurality of items of content comprising a plurality of items to be recommended; and the first input unit is further configured to:
generating, through the correlation analysis model, a user characterization vector corresponding to the target user and an item characterization vector corresponding to each item; and determining a first correlation between the user characterization vector and each item characterization vector as the plurality of prediction scores.
16. The apparatus of claim 11, wherein the sample object comprises retrieval information and the plurality of items of content comprises a plurality of retrieval results; and the first input unit is further configured to:
generating, through the correlation analysis model, a retrieval characterization vector corresponding to the retrieval information and a retrieval result characterization vector corresponding to each retrieval result; and determining a second correlation between the retrieval characterization vector and each retrieval result characterization vector as the plurality of prediction scores.
17. The apparatus of claim 11, wherein the sample object comprises target information and the plurality of items of content comprises a plurality of tags; and the first input unit is further configured to:
and generating correlations between the target information and the tags as a plurality of prediction scores through the correlation analysis model.
18. The apparatus of claim 11, wherein the sample object comprises a target question, the plurality of items of content comprising a plurality of answer information; the correlation analysis model is implemented by a pre-trained language model, and the first input unit is further configured to:
and respectively splicing the target questions and the answer information into texts, and inputting a pre-trained language model to obtain the correlation scores of the target questions and the answer information as a plurality of prediction scores.
19. A relevance ranking apparatus, comprising:
a second acquisition unit configured to acquire a target object and a plurality of candidate contents;
a second input unit configured to input the target object and the plurality of candidate contents into a correlation analysis model, and obtain a plurality of prediction scores of correlations between the target object and the plurality of candidate contents; wherein the correlation analysis model is trained according to the method of claim 1;
and the ordering unit is configured to determine ordering results of the plurality of candidate contents according to the plurality of prediction scores.
20. The apparatus of claim 19, wherein the apparatus further comprises:
and the recall unit is configured to recall a plurality of candidate contents from the plurality of candidate contents as recommendation results according to the sorting results.
21. A computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any of claims 1-10.
22. A computing device comprising a memory and a processor, wherein the memory has executable code stored therein, which when executed by the processor, implements the method of any of claims 1-10.
CN202311289029.1A 2023-09-28 2023-09-28 Method and device for training correlation analysis model Pending CN117350403A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311289029.1A CN117350403A (en) 2023-09-28 2023-09-28 Method and device for training correlation analysis model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311289029.1A CN117350403A (en) 2023-09-28 2023-09-28 Method and device for training correlation analysis model

Publications (1)

Publication Number Publication Date
CN117350403A true CN117350403A (en) 2024-01-05

Family

ID=89360543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311289029.1A Pending CN117350403A (en) 2023-09-28 2023-09-28 Method and device for training correlation analysis model

Country Status (1)

Country Link
CN (1) CN117350403A (en)

Similar Documents

Publication Publication Date Title
CN112487278A (en) Training method of recommendation model, and method and device for predicting selection probability
CN110019736B (en) Question-answer matching method, system, equipment and storage medium based on language model
CN113240130B (en) Data classification method and device, computer readable storage medium and electronic equipment
CN112699305A (en) Multi-target recommendation method, device, computing equipment and medium
CN113537630B (en) Training method and device of business prediction model
US20140317034A1 (en) Data classification
CN111639247A (en) Method, apparatus, device and computer-readable storage medium for evaluating quality of review
CN111881359A (en) Sorting method, system, equipment and storage medium in internet information retrieval
CN111666416A (en) Method and apparatus for generating semantic matching model
CN111695024A (en) Object evaluation value prediction method and system, and recommendation method and system
CN112380421A (en) Resume searching method and device, electronic equipment and computer storage medium
CN112819024A (en) Model processing method, user data processing method and device and computer equipment
CN118043802A (en) Recommendation model training method and device
CN114118526A (en) Enterprise risk prediction method, device, equipment and storage medium
CN113806501A (en) Method for training intention recognition model, intention recognition method and equipment
CN117668157A (en) Retrieval enhancement method, device, equipment and medium based on knowledge graph
CN110378486B (en) Network embedding method and device, electronic equipment and storage medium
CN110262906B (en) Interface label recommendation method and device, storage medium and electronic equipment
CN111340605A (en) Method and device for training user behavior prediction model and user behavior prediction
CN116975686A (en) Method for training student model, behavior prediction method and device
CN117251619A (en) Data processing method and related device
CN116910357A (en) Data processing method and related device
CN117350403A (en) Method and device for training correlation analysis model
Zhang et al. Continuous reinforcement learning to adapt multi-objective optimization online for robot motion
CN117217324A (en) Model training method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination