CN112836520A - Method and device for generating user description text based on user characteristics - Google Patents

Method and device for generating user description text based on user characteristics

Info

Publication number
CN112836520A
Authority
CN
China
Prior art keywords
feature
vector
word
model
sublayer
Prior art date
Legal status
Pending
Application number
CN202110189542.8A
Other languages
Chinese (zh)
Inventor
李怀松
黄涛
王睿祺
金先明
张天翼
Current Assignee
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202110189542.8A priority Critical patent/CN112836520A/en
Publication of CN112836520A publication Critical patent/CN112836520A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present specification provide a method and a device for generating a user description text based on user characteristics. The method comprises the following steps: inputting the feature names of various features of a target user and the feature values corresponding to the feature names into a first encoder to obtain initial user feature vectors corresponding to the features; inputting the initial user feature vectors into a retrieval model and performing K iterations to obtain K sentences, where each iteration comprises determining the attention coefficients of the current iteration corresponding to the features, performing weighted summation on the initial user feature vectors according to the attention coefficients to obtain a comprehensive characterization vector, and retrieving a sentence from an artificial knowledge base according to the comprehensive characterization vector; inputting the K sentences into a second encoder, which encodes them based on an attention mechanism to obtain a semantic representation vector; and inputting the initial user feature vectors and the semantic representation vector into a generation model to generate the user description text of the target user. Both efficiency and text quality can thus be achieved.

Description

Method and device for generating user description text based on user characteristics
Technical Field
One or more embodiments of the present specification relate to the field of computers, and more particularly, to a method and apparatus for generating user description text based on user characteristics.
Background
Since a user's characteristics are associated with the user's category, users can be classified based on their user characteristics. The user characteristics may be data such as the user's age, education level, and income, and the user categories may include a plurality of predetermined categories, for example, whether there is a payment risk, whether there is a money-laundering risk, and so on. Generally, giving only the user characteristics and the user category is not convincing, so after the user characteristics are obtained, a user description text needs to be generated based on them. The user description text comprises a plurality of sentences and embodies the association between the user characteristics and the user category. It is required to be a well-structured report with tight logic, sufficient argumentation, and concise, easy-to-understand wording.
In the prior art, there are two ways to generate a user description text based on user characteristics: one is to compose the user description text manually, which is inefficient; the other is to generate it by machine, which yields poor text quality.
Accordingly, improved solutions are desired that allow for both efficiency and text quality.
Disclosure of Invention
One or more embodiments of the present specification describe a method and apparatus for generating a user description text based on user characteristics, which can take both efficiency and text quality into account.
In a first aspect, a method for generating a user description text based on user characteristics is provided, and the method includes:
inputting feature names of various features of a target user and feature values corresponding to the feature names into a first encoder, and obtaining initial user feature vectors corresponding to the various features through the first encoder;
inputting the initial user feature vectors into a retrieval model, and performing K iterations through the retrieval model to obtain K sentences; each iteration comprises determining the attention coefficients of the current iteration corresponding to the features, performing weighted summation on the initial user feature vectors according to the attention coefficients to obtain a comprehensive characterization vector, and retrieving a sentence from an artificial knowledge base according to the comprehensive characterization vector;
inputting the K sentences into a second encoder, and encoding the K sentences through the second encoder based on an attention mechanism to obtain semantic representation vectors corresponding to the K sentences;
and inputting the initial user feature vectors and the semantic representation vectors into a generation model, and generating a user description text of the target user through the generation model.
In a possible implementation, the types of the features include:
numeric or text.
Further, before obtaining each initial user feature vector corresponding to each feature through the first encoder, the method further includes:
performing word segmentation processing on the original feature value of a feature whose type is text, to obtain a plurality of word segmentation results;
the inputting of the feature names of the features of the target user and the feature values corresponding to the feature names into the first encoder includes:
and inputting the feature name of a feature of the target user whose type is text, together with the plurality of word segmentation results corresponding to the feature name, into the first encoder.
In a possible embodiment, the first encoder comprises a first embedding matrix, a second embedding matrix and a first coding model; obtaining, by the first encoder, initial user feature vectors corresponding to the features respectively includes:
taking any one of the features as a target item feature, and converting the feature name of the target item feature into a first embedded vector through a first embedded matrix;
converting the eigenvalue corresponding to the target item characteristic into a second embedded vector through a second embedded matrix;
and inputting the first embedded vectors and the second embedded vectors corresponding to the respective features into the first coding model, which encodes them based on an attention mechanism to obtain the initial user feature vectors corresponding to the respective features.
In a possible implementation manner, the determining attention coefficients of the current iteration corresponding to the features respectively includes:
and determining attention coefficients corresponding to each feature of the iteration respectively according to the initial user feature vectors and the comprehensive characterization vector obtained by the last iteration.
In a possible embodiment, the retrieving a statement from an artificial knowledge base according to the comprehensive characterization vector includes:
passing the comprehensive characterization vector through a full connection layer to obtain an output vector with dimensions of the number of sentences contained in the artificial knowledge base;
normalizing the output vector to obtain a normalized vector;
and selecting the dimension corresponding to the maximum numerical value in the normalized vector, and taking the statement corresponding to the dimension as the statement retrieved from the artificial knowledge base in the iteration.
In one possible embodiment, the second encoder comprises a second coding model and a self-attention layer; said encoding, by the second encoder, the K statements based on an attention mechanism, comprising:
inputting the word embedding vector of each word included in the K sentences into a second coding model, and determining the word coding vector of each word through the second coding model;
inputting the word coding vector of each word into a self-attention layer, determining an attention coefficient corresponding to each word through the self-attention layer, and performing weighted summation on the word coding vector of each word according to the attention coefficient corresponding to each word to obtain semantic representation vectors corresponding to the K sentences.
In one possible embodiment, the generative model comprises a first sublayer, a second sublayer and an intermediate layer, wherein the first sublayer and the second sublayer are time-sequence-based neural network layers;
the generating of the user description text of the target user through the generative model includes:
the first sublayer takes a word generated at the last moment and the hidden state of the second sublayer at the last moment as the current moment input of the first sublayer to generate the hidden state of the first sublayer at the current moment; wherein the semantic representation vector is used as a hidden state of the second sublayer at the initial moment;
the intermediate layer determines each weight coefficient corresponding to each feature according to the hidden state of the first sublayer at the current moment and each initial user feature vector, and performs weighted summation on each initial user feature vector according to each weight coefficient to obtain an intermediate characterization vector;
the second sublayer takes the intermediate characterization vector and the hidden state of the first sublayer at the current moment as the current moment input of the second sublayer, and generates the hidden state of the second sublayer at the current moment; the hidden state of the second sublayer at the current time is used to determine the word generated at the current time.
In one possible embodiment, the method further comprises:
adjusting parameters of at least one of the first encoder, the retrieval model, the second encoder and the generation model by using a preset total loss function; the total loss function is determined by a first loss function and a second loss function, wherein the function value of the first loss function depends on the generation probability of each word in the artificial description text of the target user in the generation model, and the function value of the second loss function depends on whether each word in the artificial description text of the target user exists in a preset label text or not and the generation probability of each word in the label text in the generation model.
Further, the generation model is a time sequence-based model which sequentially generates words in the user description text at a plurality of moments; the generation probability of each word in the generative model comprises the generation probability of each word obtained by the generative model at each moment.
In a second aspect, an apparatus for generating a user description text based on user characteristics is provided, the apparatus comprising:
the first coding unit is used for inputting the feature names of various features of a target user and the feature values corresponding to the feature names into a first coder, and obtaining various initial user feature vectors corresponding to the various features through the first coder;
the retrieval unit is used for inputting each initial user feature vector obtained by the first coding unit into a retrieval model, and performing K iterations through the retrieval model to obtain K sentences through the K iterations; each iteration comprises the steps of determining each attention coefficient of the current iteration corresponding to each feature, carrying out weighted summation on each initial user feature vector according to each attention coefficient to obtain a comprehensive characterization vector, and retrieving a statement from an artificial knowledge base according to the comprehensive characterization vector;
the second coding unit is used for inputting the K sentences obtained by the retrieval unit into a second coder, and coding the K sentences through the second coder based on an attention mechanism to obtain semantic representation vectors corresponding to the K sentences;
and the generating unit is used for inputting each initial user feature vector obtained by the first encoding unit and the semantic representation vector obtained by the second encoding unit into a generating model, and generating the user description text of the target user through the generating model.
In a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
In a fourth aspect, there is provided a computing device comprising a memory having stored therein executable code and a processor that, when executing the executable code, implements the method of the first aspect.
According to the method and the device provided by the embodiments of the present specification, first, the feature names of the various features of a target user and the feature values corresponding to the feature names are input into a first encoder, and the initial user feature vectors corresponding to the features are obtained through the first encoder; then the initial user feature vectors are input into a retrieval model, and K iterations are performed through the retrieval model to obtain K sentences; each iteration comprises determining the attention coefficients of the current iteration corresponding to the features, performing weighted summation on the initial user feature vectors according to the attention coefficients to obtain a comprehensive characterization vector, and retrieving a sentence from an artificial knowledge base according to the comprehensive characterization vector; the K sentences are input into a second encoder and encoded based on an attention mechanism to obtain the semantic representation vector corresponding to the K sentences; finally, the initial user feature vectors and the semantic representation vector are input into a generation model, and the user description text of the target user is generated through the generation model. The user description text is thus generated automatically by machine, which is efficient. In this process, both the initial user feature vectors corresponding to the features of the target user and the semantic representation vector corresponding to the K retrieved sentences are used, and the K sentences come from the artificial knowledge base, so the manual experience most relevant to the target user can be exploited effectively, problems such as repeated words and wrong words are well handled, the applicability is strong, the text quality is good, and both efficiency and text quality can be achieved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic diagram illustrating an implementation scenario of an embodiment disclosed herein;
FIG. 2 illustrates a flow diagram of a method for generating user description text based on user characteristics, according to one embodiment;
FIG. 3 illustrates a retrieval system architecture diagram according to one embodiment;
FIG. 4 shows a schematic block diagram of a second encoder according to an embodiment;
FIG. 5 illustrates a structural diagram of a generative model according to one embodiment;
FIG. 6 shows a schematic block diagram of an apparatus for generating user description text based on user characteristics, according to one embodiment.
Detailed Description
The scheme provided by the specification is described below with reference to the accompanying drawings.
Fig. 1 is a schematic view of an implementation scenario of an embodiment disclosed in this specification. The scenario involves generating a user description text based on user features. A user may be classified based on the user's characteristics. The user characteristics may be data such as the user's age, education level, and income, and the user categories may include a plurality of predetermined categories, for example, whether there is a payment risk, whether there is a money-laundering risk, and so on. The user description text comprises a plurality of sentences and embodies the association between the user characteristics and the user category. It is required to be a well-structured report with tight logic, sufficient argumentation, and concise, easy-to-understand wording.
Referring to fig. 1, the table lists the feature names and the corresponding feature values of the various features of user A. With user A as the target user, for example, the feature name "age" has the feature value "50", the feature name "education level" has the feature value "high school", … …, and the feature name "annual income" has the feature value "30,000 yuan". Based on the feature names and feature values of user A, the generated user description text may be "User A is older, has a lower education level, has a lower annual income, … …, and therefore has a repayment risk." In the embodiments of the present specification, the user characteristics may include, but are not limited to, user attribute features such as the age, education level and annual income listed above, and may further include historical behavior features of the user in a specific application, for example, the historical loan amount, whether payments have been overdue, and so on. The specific content and manner of generation of the user description text are generally not fixed. In the embodiments of the present specification, the user description text is generated by combining expert experience with machine learning. The user description text is generated automatically by machine with the help of expert experience, i.e. manual experience, so the efficiency is high; in this process, not only the various features of the target user but also sentences retrieved from the artificial knowledge base are used, so the manual experience most relevant to the target user can be exploited effectively, problems such as repeated words and wrong words are well handled, the applicability is strong, the text quality is good, and both efficiency and text quality can be achieved.
Fig. 2 shows a flowchart of a method for generating a user description text based on user characteristics according to an embodiment, which may be based on the implementation scenario shown in fig. 1. As shown in fig. 2, the method in this embodiment includes the following steps: step 21, inputting the feature names of the various features of a target user and the feature values corresponding to the feature names into a first encoder, and obtaining the initial user feature vectors corresponding to the features through the first encoder; step 22, inputting the initial user feature vectors into a retrieval model, and performing K iterations through the retrieval model to obtain K sentences; each iteration comprises determining the attention coefficients of the current iteration corresponding to the features, performing weighted summation on the initial user feature vectors according to the attention coefficients to obtain a comprehensive characterization vector, and retrieving a sentence from an artificial knowledge base according to the comprehensive characterization vector; step 23, inputting the K sentences into a second encoder, and encoding them through the second encoder based on an attention mechanism to obtain the semantic representation vector corresponding to the K sentences; and step 24, inputting the initial user feature vectors and the semantic representation vector into a generation model, and generating the user description text of the target user through the generation model. Specific execution modes of these steps are described below.
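To make the overall data flow of steps 21 to 24 concrete, the following minimal Python sketch (using PyTorch tensors) strings the four components together; the callable names (first_encoder, retrieval_model, and so on), the zero initialization of the comprehensive characterization vector and the value of K are illustrative assumptions, not the reference implementation of this specification.

```python
import torch

def generate_description(feature_names, feature_values, first_encoder,
                         retrieval_model, knowledge_base, second_encoder,
                         generation_model, K=3):
    # Step 21: encode (feature name, feature value) pairs into initial user feature vectors, shape [N, dim]
    X = first_encoder(feature_names, feature_values)

    # Step 22: K retrieval iterations, each producing one sentence from the artificial knowledge base
    sentences = []
    C = torch.zeros(X.size(1))              # comprehensive characterization vector before the first iteration
    for _ in range(K):
        C, idx = retrieval_model(X, C)      # attention over the features conditioned on the previous C
        sentences.append(knowledge_base[idx])

    # Step 23: encode the K retrieved sentences into one semantic representation vector H
    H = second_encoder(sentences)

    # Step 24: decode the user description text from the feature vectors X and the semantic vector H
    return generation_model(X, H)
```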
First, in step 21, the feature names of the various features of the target user and the feature values corresponding to the feature names are input into the first encoder, and the initial user feature vectors corresponding to the features are obtained through the first encoder. It is understood that the first encoder may be based on various model structures, such as a Transformer, a long short-term memory network (LSTM), a gated recurrent unit (GRU), and so on.
In one example, the types of the various features include:
numeric or text.
For example, the age of user A is 50, which is a numerical feature: the feature name is age and the corresponding feature value is 50. The place of user A is Beijing and Shanghai, which is a text-type feature: the feature name is place and the corresponding feature values are Beijing and Shanghai.
It will be appreciated that the type of feature is also the type of its corresponding feature value.
In the embodiment of the present specification, for a feature whose type is a numerical type, a feature name of the feature and an original feature value corresponding to the feature name may be input to a first encoder; for the feature with text type, the corresponding original feature value can be firstly subjected to word segmentation to obtain a plurality of word segmentation results, and then the feature name of the feature and the plurality of word segmentation results corresponding to the feature name are input into the first encoder.
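As a small illustration of this pre-processing for text-type features, the sketch below uses the open-source jieba tokenizer; the feature name and value shown are assumed examples, and any Chinese word segmentation tool could be used instead.

```python
import jieba

feature_name = "所在地"                    # an assumed text-type feature (place)
raw_value = "北京上海"                     # its original feature value

segments = jieba.lcut(raw_value)           # word segmentation results, e.g. ["北京", "上海"]
encoder_input = (feature_name, segments)   # fed to the first encoder together with the feature name
```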
In one example, the first encoder includes a first embedding matrix, a second embedding matrix, and a first coding model; obtaining, by the first encoder, initial user feature vectors corresponding to the features respectively includes:
taking any one of the features as a target item feature, and converting the feature name of the target item feature into a first embedded vector through a first embedded matrix;
converting the eigenvalue corresponding to the target item characteristic into a second embedded vector through a second embedded matrix;
and inputting the first embedded vectors and the second embedded vectors corresponding to the respective features into the first coding model, which encodes them based on an attention mechanism to obtain the initial user feature vectors corresponding to the respective features.
For example, the first embedded vector corresponding to the i-th feature is denoted x_i_feature, and the second embedded vector corresponding to the i-th feature is denoted x_i_value. The first coding model has a Transformer structure, and x_i = [x_i_feature, x_i_value] is encoded with the Transformer to obtain the initial user feature vector corresponding to the i-th feature. The Transformer structure mainly comprises an attention layer, a residual layer, a normalization layer, a feedforward layer, and the like.
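A minimal PyTorch sketch of such a first encoder is given below; the vocabulary handling, embedding sizes and the use of nn.TransformerEncoder as the first coding model are illustrative assumptions rather than the exact configuration of this embodiment.

```python
import torch
import torch.nn as nn

class FirstEncoder(nn.Module):
    def __init__(self, name_vocab, value_vocab, dim=64):
        super().__init__()
        self.name_emb = nn.Embedding(name_vocab, dim)    # first embedding matrix (feature names)
        self.value_emb = nn.Embedding(value_vocab, dim)  # second embedding matrix (feature values)
        layer = nn.TransformerEncoderLayer(d_model=2 * dim, nhead=4, batch_first=True)
        self.coder = nn.TransformerEncoder(layer, num_layers=2)  # first coding model

    def forward(self, name_ids, value_ids):
        # name_ids, value_ids: [N] token ids of the feature names / feature values of one user;
        # for simplicity one value id per feature (a text feature with several segments could be pooled first)
        x_feature = self.name_emb(name_ids)        # first embedded vectors,  [N, dim]
        x_value = self.value_emb(value_ids)        # second embedded vectors, [N, dim]
        x = torch.cat([x_feature, x_value], dim=-1)        # x_i = [x_i_feature, x_i_value]
        X = self.coder(x.unsqueeze(0)).squeeze(0)          # attention across the N features
        return X                                           # initial user feature vectors, [N, 2*dim]
```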
Then, in step 22, the initial user feature vectors are input into the retrieval model, and K iterations are performed through the retrieval model to obtain K sentences; each iteration comprises determining the attention coefficients of the current iteration corresponding to the features, performing weighted summation on the initial user feature vectors according to the attention coefficients to obtain a comprehensive characterization vector, and retrieving a sentence from the artificial knowledge base according to the comprehensive characterization vector. It can be understood that the sentences in the artificial knowledge base embody manual experience, and the manual experience relevant to the target user can be obtained through retrieval.
In an example, the determining attention coefficients of the current iteration respectively corresponding to the features includes:
and determining attention coefficients corresponding to each feature of the iteration respectively according to the initial user feature vectors and the comprehensive characterization vector obtained by the last iteration.
For example, let X_i denote each initial user feature vector, with i from 1 to N, i.e. there are N features in total, and let X denote the vector obtained by combining the X_i:

X = [X_1, X_2, …, X_N]

W_X and W_C are parameters to be learned, C_{t-1} is the comprehensive characterization vector obtained in the last iteration (the comprehensive characterization vector of the last moment), and C_t is the comprehensive characterization vector obtained in the current iteration (the comprehensive characterization vector of the current moment).

First, z_t can be determined by the following formula:

z_t = tanh(W_X · X + W_C · C_{t-1})

where tanh denotes the activation function.

Then z_t is normalized by the following formula to obtain the attention coefficients:

α_t = softmax(z_t), where α_t denotes the vector formed by the attention coefficients α_{t,i}.

Finally, the attention coefficients are used to perform a weighted summation over the X_i, giving the comprehensive characterization vector of this iteration:

C_t = Σ_{i=1}^{N} α_{t,i} · X_i
in one example, the retrieving a statement from an artificial knowledge base based on the comprehensive characterization vector includes:
passing the comprehensive characterization vector through a full connection layer to obtain an output vector with dimensions of the number of sentences contained in the artificial knowledge base;
normalizing the output vector to obtain a normalized vector;
and selecting the dimension corresponding to the maximum numerical value in the normalized vector, and taking the statement corresponding to the dimension as the statement retrieved from the artificial knowledge base in the iteration.
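The following sketch shows one retrieval iteration as described above, with the attention step following the formulas for z_t, α_t and C_t, and the sentence selection implemented as a fully connected layer plus softmax over the knowledge-base sentences; the module layout and dimension choices are assumptions for illustration.

```python
import torch
import torch.nn as nn

class RetrievalStep(nn.Module):
    def __init__(self, n_features, dim, kb_size):
        super().__init__()
        self.W_X = nn.Linear(n_features * dim, n_features, bias=False)  # acts on the combined vector X
        self.W_C = nn.Linear(dim, n_features, bias=False)               # acts on the previous C_{t-1}
        self.fc = nn.Linear(dim, kb_size)                               # fully connected layer over KB sentences

    def forward(self, X, C_prev):
        # X: [N, dim] initial user feature vectors; C_prev: [dim] comprehensive vector of the last iteration
        z = torch.tanh(self.W_X(X.reshape(-1)) + self.W_C(C_prev))      # z_t, one score per feature, [N]
        alpha = torch.softmax(z, dim=0)                                  # attention coefficients alpha_t
        C = (alpha.unsqueeze(-1) * X).sum(dim=0)                         # C_t = sum_i alpha_{t,i} X_i
        probs = torch.softmax(self.fc(C), dim=0)                         # normalized vector over KB sentences
        idx = int(torch.argmax(probs))                                   # dimension with the largest value
        return C, idx
```

Running this step K times, with C initialized, for example, to a zero vector, yields the K sentence indices used in step 22.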
FIG. 3 shows an architecture diagram of a retrieval system according to one embodiment. Referring to fig. 3, the retrieval system includes the first encoder and the retrieval model. The features of the target user are input into the first encoder, the initial user feature vectors corresponding to the features are obtained through the first encoder, X denotes the vector obtained by combining the initial user feature vectors, X is input into the retrieval model, and K sentences are retrieved from the artificial knowledge base through the retrieval model. Here N is the total number of sentences contained in the artificial knowledge base; N is usually large, for example hundreds or thousands, and K is a predetermined number, for example 2, 3 or 5. The retrieval system performs the actions of steps 21 and 22 above to obtain the K sentences.
Then, in step 23, the K sentences are input into the second encoder and encoded through the second encoder based on an attention mechanism to obtain the semantic representation vector corresponding to the K sentences. It will be appreciated that the second encoder may be based on various model structures, for example a Transformer, an LSTM, a GRU, and so on.
In one example, the second encoder includes a second coding model and a self attention layer; said encoding, by the second encoder, the K statements based on an attention mechanism, comprising:
inputting the word embedding vector of each word included in the K sentences into a second coding model, and determining the word coding vector of each word through the second coding model;
inputting the word coding vector of each word into a self-attention layer, determining an attention coefficient corresponding to each word through the self-attention layer, and performing weighted summation on the word coding vector of each word according to the attention coefficient corresponding to each word to obtain semantic representation vectors corresponding to the K sentences.
For example, the second coding model has a Transformer structure, the K sentences contain l words in total, w_i is the word embedding vector of the i-th word in the K sentences, s_i is the word encoding vector of the i-th word determined by the Transformer, α_i is the attention coefficient of the i-th word, H is the semantic representation vector, and W_s is a parameter to be learned.

First, s_i can be determined by the following formula:

s_i = Transformer(w_i)

Then, α_i is determined by the following formula:

α_i = softmax(W_s · tanh(s_i))

where tanh denotes the activation function and the softmax is taken over the l words.

Finally, H is determined by the following formula:

H = Σ_{i=1}^{l} α_i · s_i
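A corresponding sketch of the second encoder (Transformer word encoding followed by self-attention pooling) is shown below; the hyper-parameters and the exact scoring form are assumptions consistent with the formulas above.

```python
import torch
import torch.nn as nn

class SecondEncoder(nn.Module):
    def __init__(self, dim=128, nhead=4, layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=nhead, batch_first=True)
        self.coder = nn.TransformerEncoder(layer, num_layers=layers)  # second coding model
        self.W_s = nn.Linear(dim, 1, bias=False)                      # self-attention scoring (W_s)

    def forward(self, word_embeddings):
        # word_embeddings: [l, dim] embedding vectors of the l words in the K retrieved sentences
        s = self.coder(word_embeddings.unsqueeze(0)).squeeze(0)             # word encoding vectors s_i, [l, dim]
        alpha = torch.softmax(self.W_s(torch.tanh(s)).squeeze(-1), dim=0)   # attention coefficient per word, [l]
        H = (alpha.unsqueeze(-1) * s).sum(dim=0)                            # semantic representation vector H
        return H
```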
Fig. 4 shows a schematic structural diagram of the second encoder according to an embodiment. Referring to fig. 4, the second encoder includes the second coding model and a self-attention layer. The word embedding vector w_i of each word included in the K sentences is input into the second coding model, which determines the word encoding vector s_i of each word; the word encoding vectors s_i are then input into the self-attention layer, which outputs the semantic representation vector H corresponding to the K sentences.
Finally, in step 24, the initial user feature vectors and the semantic representation vector are input into the generation model, and the user description text of the target user is generated through the generation model. It can be understood that the generation model functions as a decoder. The generated user description text can be used directly as the final user description text, or it can be further edited and processed manually to form the final user description text, which helps to improve the efficiency of producing the text manually.
In one example, the generative model comprises a first sublayer, a second sublayer and an intermediate layer, wherein the first sublayer and the second sublayer are time-sequence-based neural network layers;
the generating of the user description text of the target user through the generative model includes:
the first sublayer takes a word generated at the last moment and the hidden state of the second sublayer at the last moment as the current moment input of the first sublayer to generate the hidden state of the first sublayer at the current moment; wherein the semantic representation vector is used as a hidden state of the second sublayer at the initial moment;
the intermediate layer determines each weight coefficient corresponding to each feature according to the hidden state of the first sublayer at the current moment and each initial user feature vector, and performs weighted summation on each initial user feature vector according to each weight coefficient to obtain an intermediate characterization vector;
the second sublayer takes the intermediate characterization vector and the hidden state of the first sublayer at the current moment as the current moment input of the second sublayer, and generates the hidden state of the second sublayer at the current moment; the hidden state of the second sublayer at the current time is used to determine the word generated at the current time.
For example, let h2_{t-1} denote the hidden state of the second sublayer at time t-1, w_{t-1} the word generated at time t-1, h1_t the hidden state of the first sublayer at time t, and f_j the initial user feature vector of the j-th feature (f_j and the aforementioned X_i are not essentially different; both denote initial user feature vectors); w^T, W_fb and W_hb are parameters to be learned.

First, h1_t is determined through the first sublayer, which can be expressed by the following formula:

h1_t = Sublayer_1(w_{t-1}, h2_{t-1})

Then the intermediate characterization vector c_t is determined through the intermediate layer, which involves the following formulas:

b_{j,t} = w^T · tanh(W_fb · f_j + W_hb · h1_t)

where tanh denotes the activation function.

β_t = softmax(b_t), where b_t denotes the vector formed by the b_{j,t} corresponding to the features, and β_t denotes the vector formed by the weight coefficients corresponding to the features.

c_t = Σ_{j=1}^{M} β_{t,j} · f_j

where M denotes how many user features there are in total.

Finally, the hidden state h2_t of the second sublayer at time t is determined through the second sublayer, which can be expressed by the following formula:

h2_t = Sublayer_2(c_t, h1_t)
FIG. 5 illustrates a structural schematic of the generative model according to one embodiment. Referring to fig. 5, the generative model includes a first sublayer, a second sublayer and an intermediate layer; the first sublayer and the second sublayer are time-sequence-based neural network layers, which can be, but are not limited to, Transformer, LSTM or GRU layers. The first sublayer takes the word w_{t-1} generated at the previous moment and the hidden state h2_{t-1} of the second sublayer at the previous moment as its current-moment input and generates the hidden state h1_t of the first sublayer at the current moment, where the semantic representation vector H is used as the hidden state h2_0 of the second sublayer at the initial moment. The intermediate layer determines the weight coefficients β_t corresponding to the features according to the hidden state h1_t of the first sublayer at the current moment and the initial user feature vectors [f_1, …, f_M], and performs weighted summation on the initial user feature vectors according to the weight coefficients to obtain the intermediate characterization vector c_t. The second sublayer takes the intermediate characterization vector c_t and the hidden state h1_t of the first sublayer at the current moment as its current-moment input and generates the hidden state h2_t of the second sublayer at the current moment; the hidden state h2_t is used to determine the word w_t generated at the current moment.
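The per-moment decoding step of FIG. 5 can be sketched as follows; LSTM cells stand in for the time-sequence-based sublayers, and the embedding and output projections are simplified assumptions.

```python
import torch
import torch.nn as nn

class GenerationModel(nn.Module):
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dim)
        self.sub1 = nn.LSTMCell(2 * dim, dim)        # first sublayer (time-sequence based)
        self.sub2 = nn.LSTMCell(2 * dim, dim)        # second sublayer
        self.w = nn.Linear(dim, 1, bias=False)       # w^T
        self.W_fb = nn.Linear(dim, dim, bias=False)  # applied to the feature vectors f_j
        self.W_hb = nn.Linear(dim, dim, bias=False)  # applied to the first-sublayer hidden state
        self.out = nn.Linear(dim, vocab_size)        # maps the second-sublayer state to a word distribution

    def step(self, prev_word, state1, state2, F):
        # prev_word: LongTensor [1] holding w_{t-1}; F: [M, dim] initial user feature vectors f_j
        # state1/state2: (h, c) tuples of the two LSTM sublayers, each tensor of shape [1, dim]
        h2_prev = state2[0]                                            # hidden state of sublayer 2 at t-1
        x1 = torch.cat([self.word_emb(prev_word), h2_prev], dim=-1)    # input (w_{t-1}, h2_{t-1}), [1, 2*dim]
        state1 = self.sub1(x1, state1)
        h1 = state1[0]                                                 # hidden state of sublayer 1 at t

        b = self.w(torch.tanh(self.W_fb(F) + self.W_hb(h1))).squeeze(-1)   # b_{j,t}, [M]
        beta = torch.softmax(b, dim=0)                                      # weight coefficients beta_t
        c = (beta.unsqueeze(-1) * F).sum(dim=0, keepdim=True)               # intermediate vector c_t, [1, dim]

        x2 = torch.cat([c, h1], dim=-1)                                     # input (c_t, h1_t), [1, 2*dim]
        state2 = self.sub2(x2, state2)
        logits = self.out(state2[0])                                        # distribution used to pick w_t
        return logits, state1, state2
```

At decoding time, the semantic representation vector H would serve as the initial hidden state of the second sublayer, e.g. state2 = (H.view(1, -1), torch.zeros(1, dim)).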
According to the embodiments of the present specification, the generative model can make fuller use of the original features and of the hidden state at each moment, so the quality of the generated user description text is better guaranteed.
In one example, the method further comprises:
adjusting parameters of at least one of the first encoder, the retrieval model, the second encoder and the generation model by using a preset total loss function; the total loss function is determined by a first loss function and a second loss function, wherein the function value of the first loss function depends on the generation probability of each word in the artificial description text of the target user in the generation model, and the function value of the second loss function depends on whether each word in the artificial description text of the target user exists in a preset label text or not and the generation probability of each word in the label text in the generation model.
Further, the generation model is a time sequence-based model which sequentially generates words in the user description text at a plurality of moments; the generation probability of each word in the generative model comprises the generation probability of each word obtained by the generative model at each moment.
The total loss function described above can be formulated, for example, as a combination of the two parts:

L_total = L_1 + α · L_2

where α may be a predetermined constant, L_1 is the first loss function and L_2 is the second loss function.

The first loss function corresponds to the expert-experience part: manual experience, i.e. expert rules, forms a description text for each client, and the probability of each word in that text is taken from the generation model. With m denoting the number of words in the text, the maximum generation probability of each word over the decoding moments is taken and its logarithm is computed to form the loss function of the expert-experience part, which is used to correct the model parameters: when the machine generates a correct word, the probability of that correct word is reinforced; when the machine generates a wrong word, the probability of the wrong word is reduced and the probability of the correct word is increased, which also accelerates the convergence of the model. The first loss function may also take other forms, for example replacing the maximum of the generation probabilities with their average or minimum.

The second loss function corresponds to the machine-learning part: the cross entropy of an ordinary generative model can be used as this loss function.
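As a sketch of how such a combined loss could be computed, assume the expert-experience term uses, for each word of the manual description text, the maximum generation probability across the decoding moments, and the machine-learning term is the ordinary cross entropy against the label text; the weighting α and the normalization chosen here are assumptions.

```python
import torch
import torch.nn.functional as F

def total_loss(step_probs, expert_word_ids, label_word_ids, alpha=0.5):
    # step_probs: [T, vocab] generation probabilities of the model at each of the T decoding moments
    # expert_word_ids: LongTensor of ids of the words in the manually written (expert) description text
    # label_word_ids: LongTensor of ids of the words in the preset label text

    # first loss (expert-experience part): for each expert word, take the maximum probability
    # the model ever assigns to it across the T moments, then the negative log
    expert_probs = step_probs[:, expert_word_ids].max(dim=0).values      # [m]
    loss_expert = -(torch.log(expert_probs + 1e-9)).mean()

    # second loss (machine-learning part): ordinary cross entropy against the label text,
    # assuming the first len(label_word_ids) moments are aligned with the label words
    T = len(label_word_ids)
    loss_ce = F.nll_loss(torch.log(step_probs[:T] + 1e-9), label_word_ids)

    return loss_expert + alpha * loss_ce
```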
According to the embodiment of the specification, the expert experience part is added into the total loss function, so that the accuracy and the coverage rate of the generated user description text can be improved, and a good effect can be achieved under the condition that the number of the label texts is small.
In the embodiments of the present specification, the evaluation index is also optimized. Evaluation indexes are generally used to judge how good a generated text is, and the quality of generation is mostly evaluated through text similarity. In a specific field such as anti-money laundering, however, it is more important that the risk points are described accurately and covered as completely as possible, so the following evaluation indexes are proposed:
Recall: the original text and the generated text are split into sentences; the number of sentences appearing in both texts is the numerator and the number of sentences in the original text is the denominator; the resulting ratio is the recall.
Precision: the numerator is the same as that of Recall, the denominator is the number of sentences in the text generated by the model, and the resulting ratio is the precision.
Human Evaluation: manual sampling inspection. For example, given 100 reports generated by the model, the number of qualified reports is judged manually; if 90 are qualified, the quality of the text generated by the model is 90%.
According to the method provided by the embodiments of the present specification, first, the feature names of the various features of a target user and the feature values corresponding to the feature names are input into a first encoder, and the initial user feature vectors corresponding to the features are obtained through the first encoder; then the initial user feature vectors are input into a retrieval model, and K iterations are performed through the retrieval model to obtain K sentences; each iteration comprises determining the attention coefficients of the current iteration corresponding to the features, performing weighted summation on the initial user feature vectors according to the attention coefficients to obtain a comprehensive characterization vector, and retrieving a sentence from an artificial knowledge base according to the comprehensive characterization vector; the K sentences are input into a second encoder and encoded based on an attention mechanism to obtain the semantic representation vector corresponding to the K sentences; finally, the initial user feature vectors and the semantic representation vector are input into a generation model, and the user description text of the target user is generated through the generation model. The user description text is thus generated automatically by machine, which is efficient. In this process, both the initial user feature vectors corresponding to the features of the target user and the semantic representation vector corresponding to the K retrieved sentences are used, and the K sentences come from the artificial knowledge base, so the manual experience most relevant to the target user can be exploited effectively, problems such as repeated words and wrong words are well handled, the applicability is strong, the text quality is good, and both efficiency and text quality can be achieved.
According to an embodiment of another aspect, an apparatus for generating a user description text based on a user characteristic is further provided, and the apparatus is configured to perform the method for generating a user description text based on a user characteristic provided in the embodiments of the present specification. FIG. 6 shows a schematic block diagram of an apparatus for generating user description text based on user characteristics, according to one embodiment. As shown in fig. 6, the apparatus 600 includes:
a first encoding unit 61, configured to input feature names of various features of a target user and feature values corresponding to the feature names into a first encoder, and obtain, by using the first encoder, initial user feature vectors corresponding to the various features respectively;
a retrieval unit 62, configured to input each initial user feature vector obtained by the first encoding unit 61 into a retrieval model, and perform K iterations through the retrieval model to obtain K statements through the K iterations; each iteration comprises the steps of determining each attention coefficient of the current iteration corresponding to each feature, carrying out weighted summation on each initial user feature vector according to each attention coefficient to obtain a comprehensive characterization vector, and retrieving a statement from an artificial knowledge base according to the comprehensive characterization vector;
a second encoding unit 63, configured to input the K statements obtained by the retrieval unit 62 into a second encoder, and encode the K statements by using the second encoder based on an attention mechanism to obtain semantic representation vectors corresponding to the K statements;
and a generating unit 64, configured to input each initial user feature vector obtained by the first encoding unit 61 and the semantic representation vector obtained by the second encoding unit 63 into a generation model, and generate a user description text of the target user through the generation model.
Optionally, as an embodiment, the types of the features include:
numeric or text.
Further, the apparatus further comprises:
a word segmentation unit, configured to perform word segmentation processing on an original feature value of a feature of a text type before the first encoding unit 61 obtains, through the first encoder, each initial user feature vector corresponding to each feature, so as to obtain a plurality of word segmentation results;
the first encoding unit 61 is specifically configured to input the feature name of a feature of the target user whose type is text, together with the plurality of word segmentation results corresponding to the feature name, into the first encoder.
Optionally, as an embodiment, the first encoder includes a first embedding matrix, a second embedding matrix, and a first coding model; the first encoding unit 61 includes:
the first embedding subunit is used for taking any one of the features as a target item feature and converting the feature name of the target item feature into a first embedding vector through a first embedding matrix;
the second embedding subunit is used for converting the eigenvalue corresponding to the target item characteristic into a second embedding vector through a second embedding matrix;
and the coding subunit is configured to input, for each feature, the first embedded vector obtained by the first embedding subunit and the second embedded vector obtained by the second embedding subunit into the first coding model, which encodes them based on an attention mechanism to obtain the initial user feature vectors corresponding to the respective features.
Optionally, as an embodiment, the retrieving unit 62 is specifically configured to determine, according to the initial user feature vectors and the comprehensive characterization vector obtained in the last iteration, attention coefficients corresponding to each feature of the current iteration respectively.
Optionally, as an embodiment, the retrieving unit 62 includes:
the full-connection subunit is used for enabling the comprehensive characterization vector to pass through a full-connection layer to obtain an output vector with the dimensionality being the number of the sentences contained in the artificial knowledge base;
the normalization subunit is used for normalizing the output vector obtained by the full-connection subunit to obtain a normalized vector;
and the determining subunit is used for selecting the dimension corresponding to the maximum numerical value in the normalized vector obtained by the normalizing subunit, and taking the statement corresponding to the dimension as the statement retrieved from the artificial knowledge base in the iteration.
Optionally, as an embodiment, the second encoder includes a second coding model and a self attention layer; the second encoding unit 63 includes:
the coding subunit is used for inputting the word embedding vector of each word included in the K sentences into a second coding model, and determining the word coding vector of each word through the second coding model;
and the self-attention subunit is used for inputting the word coding vector of each word obtained by the coding subunit into a self-attention layer, determining the attention coefficient corresponding to each word through the self-attention layer, and performing weighted summation on the word coding vector of each word according to the attention coefficient corresponding to each word to obtain the semantic representation vector corresponding to the K sentences.
Optionally, as an embodiment, the generative model comprises a first sublayer, a second sublayer and an intermediate layer, wherein the first sublayer and the second sublayer are time-sequence-based neural network layers;
the generating unit 64 includes:
the first processing subunit is configured to input, as the current time of the first sublayer, a word generated at a previous time and a hidden state of the second sublayer at the previous time as input of the first sublayer, and generate a hidden state of the first sublayer at the current time; wherein the semantic representation vector is used as a hidden state of the second sublayer at the initial moment;
the intermediate processing subunit is configured to determine, through the intermediate layer, each weight coefficient corresponding to each feature according to the hidden state of the first sublayer at the current time and each initial user feature vector generated by the first processing subunit, and perform weighted summation on each initial user feature vector according to each weight coefficient to obtain an intermediate characterization vector;
the second processing subunit is configured to use, through the second sublayer, the intermediate characterization vector obtained by the intermediate processing subunit and the hidden state of the first sublayer at the current time, generated by the first processing subunit, as its current time input, and generate the hidden state of the second sublayer at the current time; the hidden state of the second sublayer at the current time is used to determine the word generated at the current time.
Optionally, as an embodiment, the apparatus further includes:
a parameter adjusting unit, configured to adjust a parameter of at least one of the first encoder, the search model, the second encoder, and the generation model using a preset total loss function; the total loss function is determined by a first loss function and a second loss function, wherein the function value of the first loss function depends on the generation probability of each word in the artificial description text of the target user in the generation model, and the function value of the second loss function depends on whether each word in the artificial description text of the target user exists in a preset label text or not and the generation probability of each word in the label text in the generation model.
Further, the generation model is a time sequence-based model which sequentially generates words in the user description text at a plurality of moments; the generation probability of each word in the generative model comprises the generation probability of each word obtained by the generative model at each moment.
With the device provided by the embodiments of the present specification, first, the first encoding unit 61 inputs the feature names of the various features of a target user and the feature values corresponding to the feature names into the first encoder, and obtains the initial user feature vectors corresponding to the features through the first encoder; then the retrieval unit 62 inputs the initial user feature vectors into the retrieval model and performs K iterations through the retrieval model to obtain K sentences; each iteration comprises determining the attention coefficients of the current iteration corresponding to the features, performing weighted summation on the initial user feature vectors according to the attention coefficients to obtain a comprehensive characterization vector, and retrieving a sentence from the artificial knowledge base according to the comprehensive characterization vector; next, the second encoding unit 63 inputs the K sentences into the second encoder and encodes them based on an attention mechanism to obtain the semantic representation vector corresponding to the K sentences; finally, the generating unit 64 inputs the initial user feature vectors and the semantic representation vector into the generation model and generates the user description text of the target user through the generation model. The user description text is thus generated automatically by machine, which is efficient. In this process, both the initial user feature vectors corresponding to the features of the target user and the semantic representation vector corresponding to the K retrieved sentences are used, and the K sentences come from the artificial knowledge base, so the manual experience most relevant to the target user can be exploited effectively, problems such as repeated words and wrong words are well handled, the applicability is strong, the text quality is good, and both efficiency and text quality can be achieved.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 2.
According to an embodiment of yet another aspect, there is also provided a computing device comprising a memory having stored therein executable code, and a processor that, when executing the executable code, implements the method described in connection with fig. 2.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The above embodiments further describe the objects, technical solutions and advantages of the present invention in detail. It should be understood that they are only exemplary embodiments of the present invention and are not intended to limit its scope; any modifications, equivalent substitutions, improvements and the like made on the basis of the technical solutions of the present invention shall fall within the scope of the present invention.

Claims (22)

1. A method of generating user description text based on user characteristics, the method comprising:
inputting feature names of various features of a target user and feature values corresponding to the feature names into a first encoder, and obtaining initial user feature vectors corresponding to the various features through the first encoder;
inputting the initial user feature vectors into a retrieval model, and performing K iterations through the retrieval model to obtain K sentences; each iteration comprises determining attention coefficients of the current iteration respectively corresponding to the features, performing a weighted summation of the initial user feature vectors according to the attention coefficients to obtain a comprehensive characterization vector, and retrieving a sentence from an artificial knowledge base according to the comprehensive characterization vector;
inputting the K sentences into a second encoder, and encoding the K sentences through the second encoder based on an attention mechanism to obtain semantic representation vectors corresponding to the K sentences;
and inputting the initial user feature vectors and the semantic representation vectors into a generation model, and generating a user description text of the target user through the generation model.
2. The method of claim 1, wherein the types of features comprise:
numeric or text.
3. The method of claim 2, wherein before obtaining, by the first encoder, initial user feature vectors corresponding to respective features, the method further comprises:
performing word segmentation on the original feature value of a feature whose type is text, to obtain a plurality of word segmentation results;
the inputting of the feature names of the features of the target user and the feature values corresponding to the feature names into the first encoder includes:
and inputting, into the first encoder, the feature name of a feature of the target user whose type is text together with the plurality of word segmentation results corresponding to that feature name.
4. The method of claim 1, wherein the first encoder comprises a first embedding matrix, a second embedding matrix, and a first coding model; obtaining, by the first encoder, initial user feature vectors corresponding to the features respectively includes:
taking any one of the features as a target item feature, and converting the feature name of the target item feature into a first embedded vector through the first embedding matrix;
converting the feature value corresponding to the target item feature into a second embedded vector through the second embedding matrix;
and inputting the first embedded vectors and the second embedded vectors respectively corresponding to the features into the first coding model, and encoding them through the first coding model based on an attention mechanism to obtain the initial user feature vectors respectively corresponding to the features.
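Purely as an illustration of claim 4, a minimal PyTorch sketch of such a first encoder is given below; the vocabulary sizes, dimensions, the additive combination of the two embeddings and the use of multi-head attention are assumptions of this example rather than requirements of the claim.

```python
# Illustrative first encoder: two embedding matrices plus an attention-based
# coding model; all hyperparameters and the attention variant are assumptions.
import torch
import torch.nn as nn

class FirstEncoder(nn.Module):
    def __init__(self, name_vocab, value_vocab, dim, n_heads=4):
        super().__init__()
        self.name_emb = nn.Embedding(name_vocab, dim)    # first embedding matrix (feature names)
        self.value_emb = nn.Embedding(value_vocab, dim)  # second embedding matrix (feature values)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, name_ids, value_ids):
        # name_ids, value_ids: (batch, n_features)
        x = self.name_emb(name_ids) + self.value_emb(value_ids)  # combine the two embeddings
        out, _ = self.attn(x, x, x)                               # attention over all features
        return out                                                # one initial vector per feature
```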
5. The method of claim 1, wherein the determining attention coefficients of the current iteration corresponding to the features respectively comprises:
and determining the attention coefficients of the current iteration respectively corresponding to the features according to the initial user feature vectors and the comprehensive characterization vector obtained in the previous iteration.
6. The method of claim 1, wherein the retrieving of a sentence from the artificial knowledge base according to the comprehensive characterization vector comprises:
passing the comprehensive characterization vector through a fully connected layer to obtain an output vector whose dimension equals the number of sentences contained in the artificial knowledge base;
normalizing the output vector to obtain a normalized vector;
and selecting the dimension corresponding to the maximum value in the normalized vector, and taking the sentence corresponding to that dimension as the sentence retrieved from the artificial knowledge base in the current iteration.
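By way of illustration of claims 5 and 6, one retrieval iteration could be sketched as follows; the scoring function, the shapes and the module names are assumptions made for this example only.

```python
# Hedged sketch of one retrieval iteration: attention over the feature vectors
# conditioned on the previous comprehensive vector, then a fully connected layer
# over the knowledge-base sentences, softmax normalization and argmax selection.
import torch
import torch.nn as nn

class RetrievalStep(nn.Module):
    def __init__(self, feat_dim, n_sentences):
        super().__init__()
        self.score = nn.Linear(2 * feat_dim, 1)               # attention scorer (assumed form)
        self.to_sentences = nn.Linear(feat_dim, n_sentences)  # fully connected layer

    def forward(self, feature_vectors, prev_comprehensive):
        # feature_vectors: (n_features, feat_dim); prev_comprehensive: (feat_dim,)
        n = feature_vectors.size(0)
        pairs = torch.cat([feature_vectors,
                           prev_comprehensive.unsqueeze(0).expand(n, -1)], dim=-1)
        coeffs = torch.softmax(self.score(pairs).squeeze(-1), dim=0)      # attention coefficients
        comprehensive = coeffs @ feature_vectors                          # weighted summation
        probs = torch.softmax(self.to_sentences(comprehensive), dim=-1)   # normalized vector
        idx = int(torch.argmax(probs))                                    # index of retrieved sentence
        return comprehensive, idx
```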
7. The method of claim 1, wherein the second encoder comprises a second coding model and a self-attention layer; the encoding, by the second encoder, of the K sentences based on an attention mechanism comprises:
inputting the word embedding vector of each word included in the K sentences into a second coding model, and determining the word coding vector of each word through the second coding model;
inputting the word coding vector of each word into a self-attention layer, determining an attention coefficient corresponding to each word through the self-attention layer, and performing weighted summation on the word coding vector of each word according to the attention coefficient corresponding to each word to obtain semantic representation vectors corresponding to the K sentences.
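As an illustrative sketch of claim 7, the second encoder could, for example, be realized as below; the use of a GRU as the second coding model and the single-score self-attention pooling are assumptions of this example, not requirements of the claim.

```python
# Hedged sketch of the second encoder: a sequence encoder for the words of the K
# retrieved sentences followed by a self-attention pooling layer.
import torch
import torch.nn as nn

class SecondEncoder(nn.Module):
    def __init__(self, emb_dim, hidden_dim):
        super().__init__()
        self.coding_model = nn.GRU(emb_dim, hidden_dim, batch_first=True)  # second coding model (assumed GRU)
        self.attn = nn.Linear(hidden_dim, 1)                               # self-attention scorer

    def forward(self, word_embeddings):
        # word_embeddings: (1, total_words, emb_dim) for the concatenated K sentences
        encodings, _ = self.coding_model(word_embeddings)     # word coding vectors
        coeffs = torch.softmax(self.attn(encodings), dim=1)   # attention coefficient per word
        semantic = (coeffs * encodings).sum(dim=1)            # weighted sum -> semantic representation
        return semantic
```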
8. The method of claim 1, wherein the generation model comprises a first sublayer, a second sublayer and an intermediate layer, wherein the first sublayer and the second sublayer are time-sequence-based neural network layers;
the generating of the user description text of the target user through the generation model comprises:
the first sublayer takes the word generated at the previous moment and the hidden state of the second sublayer at the previous moment as its input at the current moment, and generates the hidden state of the first sublayer at the current moment; wherein the semantic representation vector serves as the hidden state of the second sublayer at the initial moment;
the intermediate layer determines each weight coefficient corresponding to each feature according to the hidden state of the first sublayer at the current moment and each initial user feature vector, and performs weighted summation on each initial user feature vector according to each weight coefficient to obtain an intermediate characterization vector;
the second sublayer takes the intermediate characterization vector and the hidden state of the first sublayer at the current moment as its input at the current moment, and generates the hidden state of the second sublayer at the current moment; the hidden state of the second sublayer at the current moment is used to determine the word generated at the current moment.
9. The method of claim 1, wherein the method further comprises:
adjusting parameters of at least one of the first encoder, the retrieval model, the second encoder and the generation model by using a preset total loss function; the total loss function is determined by a first loss function and a second loss function, wherein the function value of the first loss function depends on the generation probability of each word in the artificial description text of the target user in the generation model, and the function value of the second loss function depends on whether each word in the artificial description text of the target user exists in a preset label text or not and the generation probability of each word in the label text in the generation model.
10. The method of claim 9, wherein the generation model is a time-sequence-based model that sequentially generates the words of the user description text at a plurality of moments; the generation probability of each word in the generation model comprises the generation probability of that word obtained by the generation model at each moment.
11. An apparatus for generating user description text based on user characteristics, the apparatus comprising:
the first encoding unit is used for inputting the feature names of various features of a target user and the feature values corresponding to the feature names into a first encoder, and obtaining, through the first encoder, the initial user feature vectors respectively corresponding to the features;
the retrieval unit is used for inputting each initial user feature vector obtained by the first encoding unit into a retrieval model, and performing K iterations through the retrieval model to obtain K sentences; each iteration comprises determining attention coefficients of the current iteration respectively corresponding to the features, performing a weighted summation of the initial user feature vectors according to the attention coefficients to obtain a comprehensive characterization vector, and retrieving a sentence from an artificial knowledge base according to the comprehensive characterization vector;
the second encoding unit is used for inputting the K sentences obtained by the retrieval unit into a second encoder, and encoding the K sentences through the second encoder based on an attention mechanism to obtain semantic representation vectors corresponding to the K sentences;
and the generating unit is used for inputting each initial user feature vector obtained by the first encoding unit and the semantic representation vector obtained by the second encoding unit into a generating model, and generating the user description text of the target user through the generating model.
12. The apparatus of claim 11, wherein the types of features comprise:
numeric or text.
13. The apparatus of claim 12, wherein the apparatus further comprises:
the word segmentation unit is used for performing word segmentation on the original feature value of a feature whose type is text to obtain a plurality of word segmentation results, before the first encoding unit obtains, through the first encoder, the initial user feature vectors respectively corresponding to the features;
the first encoding unit is specifically configured to input, into the first encoder, the feature name of a feature of the target user whose type is text together with the plurality of word segmentation results corresponding to that feature name.
14. The apparatus of claim 11, wherein the first encoder comprises a first embedding matrix, a second embedding matrix, and a first coding model; the first encoding unit includes:
the first embedding subunit is used for taking any one of the features as a target item feature and converting the feature name of the target item feature into a first embedded vector through the first embedding matrix;
the second embedding subunit is used for converting the feature value corresponding to the target item feature into a second embedded vector through the second embedding matrix;
and the coding subunit is configured to input the first embedded vectors obtained by the first embedding subunit and the second embedded vectors obtained by the second embedding subunit, which respectively correspond to the features, into the first coding model, and to encode them through the first coding model based on an attention mechanism so as to obtain the initial user feature vectors respectively corresponding to the features.
15. The apparatus according to claim 11, wherein the retrieving unit is specifically configured to determine, according to the initial user feature vectors and the comprehensive characterization vector obtained in the previous iteration, attention coefficients corresponding to the features of the current iteration, respectively.
16. The apparatus of claim 11, wherein the retrieving unit comprises:
the full-connection subunit is used for passing the comprehensive characterization vector through a fully connected layer to obtain an output vector whose dimension equals the number of sentences contained in the artificial knowledge base;
the normalization subunit is used for normalizing the output vector obtained by the full-connection subunit to obtain a normalized vector;
and the determining subunit is used for selecting the dimension corresponding to the maximum value in the normalized vector obtained by the normalization subunit, and taking the sentence corresponding to that dimension as the sentence retrieved from the artificial knowledge base in the current iteration.
17. The apparatus of claim 11, wherein the second encoder comprises a second coding model and a self-attention layer; the second encoding unit includes:
the coding subunit is used for inputting the word embedding vector of each word included in the K sentences into a second coding model, and determining the word coding vector of each word through the second coding model;
and the self-attention subunit is used for inputting the word coding vector of each word obtained by the coding subunit into a self-attention layer, determining the attention coefficient corresponding to each word through the self-attention layer, and performing weighted summation on the word coding vector of each word according to the attention coefficient corresponding to each word to obtain the semantic representation vector corresponding to the K sentences.
18. The apparatus of claim 11, wherein the generation model comprises a first sublayer, a second sublayer and an intermediate layer, wherein the first sublayer and the second sublayer are time-sequence-based neural network layers;
the generation unit includes:
the first processing subunit is configured to take, through the first sublayer, the word generated at the previous time and the hidden state of the second sublayer at the previous time as the current-time input of the first sublayer, and to generate the hidden state of the first sublayer at the current time; wherein the semantic representation vector serves as the hidden state of the second sublayer at the initial time;
the intermediate processing subunit is configured to determine, through the intermediate layer, each weight coefficient corresponding to each feature according to the hidden state of the first sublayer at the current time and each initial user feature vector generated by the first processing subunit, and perform weighted summation on each initial user feature vector according to each weight coefficient to obtain an intermediate characterization vector;
the second processing subunit is configured to take, through the second sublayer, the intermediate characterization vector obtained by the intermediate processing subunit and the hidden state of the first sublayer at the current time generated by the first processing subunit as the current-time input of the second sublayer, and to generate the hidden state of the second sublayer at the current time; the hidden state of the second sublayer at the current time is used to determine the word generated at the current time.
19. The apparatus of claim 11, wherein the apparatus further comprises:
a parameter adjusting unit, configured to adjust parameters of at least one of the first encoder, the retrieval model, the second encoder and the generation model by using a preset total loss function; the total loss function is determined by a first loss function and a second loss function, where the value of the first loss function depends on the generation probability, in the generation model, of each word in the artificial description text of the target user, and the value of the second loss function depends on whether each word of the artificial description text exists in a preset label text and on the generation probability, in the generation model, of each word in the label text.
20. The apparatus of claim 19, wherein the generation model is a time-sequence-based model that sequentially generates the words of the user description text at a plurality of moments; the generation probability of each word in the generation model comprises the generation probability of that word obtained by the generation model at each moment.
21. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 1-10.
22. A computing device comprising a memory having stored therein executable code and a processor that, when executing the executable code, implements the method of any of claims 1-10.
CN202110189542.8A 2021-02-19 2021-02-19 Method and device for generating user description text based on user characteristics Pending CN112836520A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110189542.8A CN112836520A (en) 2021-02-19 2021-02-19 Method and device for generating user description text based on user characteristics

Publications (1)

Publication Number Publication Date
CN112836520A true CN112836520A (en) 2021-05-25

Family

ID=75933865

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110189542.8A Pending CN112836520A (en) 2021-02-19 2021-02-19 Method and device for generating user description text based on user characteristics

Country Status (1)

Country Link
CN (1) CN112836520A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106383815A (en) * 2016-09-20 2017-02-08 清华大学 Neural network sentiment analysis method in combination with user and product information
CN109472031A (en) * 2018-11-09 2019-03-15 电子科技大学 A kind of aspect rank sentiment classification model and method based on double memory attentions
US20190341025A1 (en) * 2018-04-18 2019-11-07 Sony Interactive Entertainment Inc. Integrated understanding of user characteristics by multimodal processing
WO2020107878A1 (en) * 2018-11-30 2020-06-04 平安科技(深圳)有限公司 Method and apparatus for generating text summary, computer device and storage medium
CN112131469A (en) * 2020-09-22 2020-12-25 安徽农业大学 Deep learning recommendation method based on comment text
CN112214652A (en) * 2020-10-19 2021-01-12 支付宝(杭州)信息技术有限公司 Message generation method, device and equipment
WO2021012645A1 (en) * 2019-07-22 2021-01-28 创新先进技术有限公司 Method and device for generating pushing information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zhang Lanxia; Hu Wenxin: "Research on Person Relation Extraction from Chinese Text Based on Bidirectional GRU Neural Network and Two-Layer Attention Mechanism", Computer Applications and Software, no. 11 *
Zhao Xiaohu; Yin Liangfei; Zhao Chenglong: "Image Semantic Description Algorithm Based on Global-Local Features and Adaptive Attention Mechanism", Journal of Zhejiang University (Engineering Science), no. 01 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination