CN113987159A - Recommendation information determining method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN113987159A
Authority
CN
China
Prior art keywords
text information
candidate recommended
information
historical search
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111335048.4A
Other languages
Chinese (zh)
Inventor
黄腾玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing IQIYI Science and Technology Co Ltd
Original Assignee
Beijing IQIYI Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing IQIYI Science and Technology Co Ltd
Priority to CN202111335048.4A
Publication of CN113987159A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a recommendation information determining method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring historical search text information searched by a user; splitting the historical search text information to obtain at least one first segmented word corresponding to the historical search text information; obtaining semantic features of each first segmented word; determining the user's interest tendency toward each piece of candidate recommended text information according to the semantic features of each first segmented word corresponding to the historical search text information and the semantic features of each second segmented word corresponding to each piece of candidate recommended text information; and determining at least one piece of target recommendation information based on the magnitude of the user's interest tendency toward each piece of candidate recommended text information. The method improves the accuracy of determining the user's interest tendency toward each piece of candidate recommended text information and thereby improves the information recommendation effect.

Description

Recommendation information determining method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of information recommendation technologies, and in particular, to a recommendation information determining method and apparatus, an electronic device, and a storage medium.
Background
At present, various kinds of application software recommend information to users so as to improve the use experience of the users. For example, video software may recommend videos of interest to a user.
A commonly used method for determining recommendation information is as follows: a plurality of pieces of candidate recommendation information are recalled for a user from a large amount of information; the correlation between each piece of candidate recommendation information and the user is determined according to features of user behavior information (such as the user's historical search information); and recommendation information is then selected from the candidates according to the correlation and recommended to the user.
However, in a keyword recommendation scenario, the keywords searched by users change in real time, so some search terms may have no corresponding features in the feature library. Such search terms cannot be used when determining the correlation between candidate recommendation information and the user, which reduces the accuracy of the determined correlation and, in turn, degrades the effect of recommending information to the user.
Disclosure of Invention
The embodiment of the invention aims to provide a recommendation information determination method, a recommendation information determination device, electronic equipment and a storage medium, so as to improve the accuracy of determining the correlation between candidate recommendation information and a user.
In a first aspect of the present invention, there is provided a recommendation information determining method, including:
acquiring historical search text information searched by a user;
splitting the historical search text information to obtain at least one first segmented word corresponding to the historical search text information, and obtaining semantic features of each first segmented word;
determining the user's interest tendency toward each piece of candidate recommended text information according to the semantic features of each first segmented word corresponding to the historical search text information and the semantic features of each second segmented word corresponding to each piece of candidate recommended text information, wherein the second segmented words are obtained by splitting the candidate recommended text information;
and determining at least one piece of target recommendation information based on the magnitude of the user's interest tendency toward each piece of candidate recommended text information.
In a second aspect of the present invention, there is also provided a recommendation information determining apparatus, including:
the information acquisition module is used for acquiring historical search text information searched by a user;
the splitting module is used for splitting the historical search text information to obtain at least one first segmented word corresponding to the historical search text information, and for obtaining semantic features of each first segmented word;
the interest tendency determining module is used for determining the user's interest tendency toward each piece of candidate recommended text information according to the semantic features of each first segmented word corresponding to the historical search text information and the semantic features of each second segmented word corresponding to each piece of candidate recommended text information, wherein the second segmented words are obtained by splitting the candidate recommended text information;
and the recommendation information determining module is used for determining at least one piece of target recommendation information based on the magnitude of the user's interest tendency toward each piece of candidate recommended text information.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
and a processor for implementing the steps of any of the above recommendation information determining methods when executing the program stored in the memory.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium in which a computer program is stored, where the computer program, when executed by a processor, implements the steps of any of the above recommendation information determining methods.
In a fifth aspect, an embodiment of the present invention further provides a computer program product containing instructions, which when run on a computer, causes the computer to perform any of the steps of the recommendation information determination method described above.
With the method provided by the embodiment of the invention, the historical search text information is split to obtain at least one first segmented word corresponding to it, and the candidate recommended text information is split to obtain at least one second segmented word corresponding to it. The semantic features of each first segmented word and of each second segmented word can then be obtained, the user's interest tendency toward the candidate recommended text information can be determined from these semantic features, and the target recommendation information can be determined from the candidate recommended text information according to the interest tendency and recommended to the user. Because a search term that has no corresponding features can be split into segmented words whose semantic features are obtainable, the method can use such a search term to determine the user's interest tendency toward the candidate recommended text information, which solves the prior-art problem that search terms without corresponding features cannot be used for determining recommendation information. Compared with the prior art, the method therefore draws on a more complete set of search terms, which improves the accuracy of determining the user's interest tendency toward the candidate recommendation information and thus the effect of recommending information to the user.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a flowchart of a recommendation information determining method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for determining an interest tendency according to an embodiment of the present invention;
Fig. 3 is a diagram illustrating the determination of the correlation between candidate recommended text information and user A;
Fig. 4 is another flowchart of a method for determining an interest tendency according to an embodiment of the present invention;
Fig. 5 is a flowchart illustrating a method for determining an interest tendency according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a recommendation information determining apparatus according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.
In order to improve the accuracy of determining the correlation between candidate recommendation information and a user, embodiments of the present invention provide a recommendation information determining method, apparatus, electronic device, computer-readable storage medium, and computer program product. The following describes a recommendation information determination method provided by an embodiment of the present invention.
The recommendation information determining method provided by the embodiment of the present invention may be applied to any electronic device that needs information recommendation, for example, an electronic device such as a server, a processor, and a computer, and is not limited specifically herein.
The recommendation information determining method provided by the embodiment of the present invention may be applied to any application scenario that needs information recommendation, for example, a scenario that recommends video or video information based on video-related text information, a scenario that recommends a search term, a scenario that recommends a product or product information based on product-related text information, and the like, and is not limited specifically herein.
Fig. 1 is a flowchart of a recommendation information determining method according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step 101, acquiring historical search text information searched by a user.
Step 102, splitting the historical search text information to obtain at least one first segmented word corresponding to the historical search text information, and obtaining semantic features of each first segmented word.
Step 103, determining the user's interest tendency toward each piece of candidate recommended text information according to the semantic features of each first segmented word corresponding to the historical search text information and the semantic features of each second segmented word corresponding to each piece of candidate recommended text information.
The second segmented words are obtained by splitting the candidate recommended text information.
Step 104, determining at least one piece of target recommendation information based on the magnitude of the user's interest tendency toward each piece of candidate recommended text information.
In the embodiment of the invention, the pieces of candidate recommended text information can be ranked in descending order of the user's interest tendency toward them, and a preset number of pieces of target recommendation information with the highest interest tendency can then be selected according to the ranking. The preset number is at least one and can be customized in practical applications.
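The ranking and selection described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_top_n`, the dictionary input, and the example scores are all assumptions introduced here.

```python
def select_top_n(tendencies, n=1):
    """Return the n candidate texts with the highest interest tendency.

    tendencies: dict mapping candidate recommended text -> interest tendency
                (hypothetical representation; the source does not fix one).
    n:          the preset number, which must be at least one.
    """
    if n < 1:
        raise ValueError("the preset number must be at least one")
    # Sort candidates from the largest interest tendency to the smallest.
    ranked = sorted(tendencies.items(), key=lambda item: item[1], reverse=True)
    return [text for text, _ in ranked[:n]]


# Illustrative scores only.
scores = {"candidate a": 0.2, "candidate b": 0.9, "candidate c": 0.5}
top_two = select_top_n(scores, n=2)
```

Because `sorted` is stable, candidates with equal interest tendency keep their original relative order; any other tie-breaking rule would need to be added explicitly.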
The target recommendation information is the determined candidate recommendation text information, or may be information corresponding to the determined candidate recommendation text information.
For example, in a video recommendation scenario, the candidate recommendation text information may be a brief introduction text of a video, and the target recommendation information may be a video corresponding to the determined brief introduction text, or further other introduction information of the video, such as a video keyword introduction, and generated video recommendation information that meets a preset rule or format.
In the product recommendation scenario, the candidate recommendation text information may be a function profile text of a product, and the target recommendation information may be a product corresponding to the determined function profile text, or further other information of the product corresponding to the determined function profile text, such as product manufacturing location information or product component information.
The determined target recommendation information can be recommended to the user directly. For a specific application scenario, related information corresponding to the target recommendation information in that scenario may also be recommended. For example, in a video recommendation scenario where the determined target recommendation information is video A, video A may be recommended to the user directly, and videos whose titles, tags, or comments include a keyword of video A may also be recommended.
Specifically, the recommendation information determining method provided by the embodiment of the invention can determine at least one piece of target recommendation information for the user and recommend it to the user, and the method can be applied to any video recommendation scenario. The specific way of recommending the target recommendation information may include, but is not limited to, at least one of the following: displaying the target recommendation information when the user opens the target software; sending push information containing the target recommendation information directly to the user once the target recommendation information is determined; recommending the target recommendation information while the user is casually watching videos; or determining at least one piece of target recommendation information for the user at intervals of a preset duration and pushing it to the user. The preset duration can be, for example, 30 minutes or 1 hour, and can be customized in practical applications. For example, in a video recommendation scenario with the preset duration set to 1 hour, the target video software may determine at least one piece of target recommendation information (such as a brief introduction text of a video or a keyword introduction of a video) for the user every hour, and may either display it when the user opens the target video software or send push information containing it directly once it is determined.
In addition, the target recommendation information can also be applied to ranking-related scenarios. For example, when video search results are displayed, the target recommendation information can be used as positive weighting data to re-rank the video search results. Such uses are not listed exhaustively here.
With the method provided by the embodiment of the invention, the historical search text information is split to obtain at least one first segmented word corresponding to it, and the candidate recommended text information is split to obtain at least one second segmented word corresponding to it. The semantic features of each first segmented word and of each second segmented word can then be obtained, the user's interest tendency toward the candidate recommended text information can be determined from these semantic features, and the target recommendation information can be determined from the candidate recommended text information according to the interest tendency and recommended to the user. Because a search term that has no corresponding features can be split into segmented words whose semantic features are obtainable, the method can use such a search term to determine the user's interest tendency toward the candidate recommended text information, which solves the prior-art problem that search terms without corresponding features cannot be used for determining recommendation information. Compared with the prior art, the method therefore draws on a more complete set of search terms, which improves the accuracy of determining the user's interest tendency toward the candidate recommendation information and thus the effect of recommending information to the user.
In the embodiment of the invention, a plurality of pieces of candidate recommended text information can be recalled from massive user behavior information. Taking as an example an application scenario in which video software recommends video information to a user, the electronic device may recall a plurality of pieces of candidate recommended text information from text information such as the text searched by a large number of users of the video software and the names of the videos they watched. The recall method may be, for example, a popularity-based recall method, a collaborative filtering recall method, or a user-portrait-based recall method, which is not specifically limited herein.
In the embodiment of the invention, the historical search text information is text information that the user searched for, before the current moment, in the search list of the target software and/or of software associated with the target software. The associated software may be application software that has a cooperative relationship with the target software, or application software that belongs to the same business entity as the target software. The target software can be video software, novel-reading software, music software, or the like, and the associated software can be of the same type as the target software or of a different type. For example, if the target software is video software, the associated software can be other video software of the same type, or novel-reading software that has a cooperative relationship with the target software. The number of pieces of associated software is not specifically limited in the embodiment of the present invention and may be one or more.
Specifically, in the embodiment of the present invention, all historical search text information searched by the user before the current time may be acquired, and the historical search text information of the user in a preset time period may also be acquired. The preset time period is set according to the actual application situation, and may be set as follows: a time period corresponding to a week from the current time or a time period corresponding to a month from the current time, etc., which are not specifically limited herein.
Taking the video information recommendation scenario as an example: for a user A of video software X, the historical search text information of user A in video software X within a preset time period can be acquired, for example "learning ppt" and "learning excel".
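Restricting the history to a preset time period can be sketched as below. This is only an illustration under assumptions: the log format (a list of timestamp and query pairs), the function name `recent_queries`, the one-week default, and the dates are all hypothetical.

```python
from datetime import datetime, timedelta


def recent_queries(search_log, now, window=timedelta(days=7)):
    """Keep the queries searched within `window` before `now`.

    search_log: list of (timestamp, query) pairs (assumed log format).
    """
    cutoff = now - window
    return [query for timestamp, query in search_log if cutoff <= timestamp <= now]


# Hypothetical log for user A; the dates are illustrative only.
now = datetime(2021, 11, 11, 12, 0)
log = [
    (datetime(2021, 11, 10), "learning ppt"),
    (datetime(2021, 10, 1), "an older query"),
    (datetime(2021, 11, 9), "learning excel"),
]
history = recent_queries(log, now)
```

Passing `window=timedelta(days=30)` would correspond to the one-month period mentioned in the text.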
In the embodiment of the invention, each piece of historical search text information can be split according to the splitting rule of the minimum splittable phrase to obtain at least one first segmented word corresponding to it, and the semantic features of each first segmented word can be acquired from a preset semantic feature library. Specifically, a preset number of characters corresponding to the minimum splittable phrase may be set, and each piece of historical search text information is split, in character order, into first segmented words that each contain the preset number of characters. If the number of characters in the historical search text information is greater than n times the preset number of characters and less than (n+1) times the preset number of characters, (n-1) first segmented words of the preset number of characters are split off in character order, and the phrase formed by the remaining characters is taken as the nth first segmented word. For example, with the preset number of characters set to 2, the historical search text information "learning ppt" (where "learning" stands for a two-character word in the original Chinese) is split into 2 first segmented words: "learning" and "ppt".
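The fixed-width splitting rule above, in which the last segmented word absorbs any leftover characters, can be sketched as follows. The function name `split_min_phrase` and the placeholder strings are assumptions; the behavior for text shorter than the preset number of characters (returned as a single segmented word) is also an assumption, since the source does not address it.

```python
def split_min_phrase(text, preset_chars=2):
    """Split text into chunks of preset_chars characters, in character order.

    The final chunk takes all remaining characters, so a text whose length
    lies between n and (n+1) times preset_chars yields exactly n chunks.
    """
    if preset_chars < 1:
        raise ValueError("preset_chars must be at least 1")
    if not text:
        return []
    n = len(text) // preset_chars
    if n <= 1:
        # Too short to split off a full chunk: keep the text whole (assumed).
        return [text]
    # The first (n - 1) chunks have exactly preset_chars characters each.
    head = [text[i * preset_chars:(i + 1) * preset_chars] for i in range(n - 1)]
    # The nth chunk is whatever remains.
    return head + [text[(n - 1) * preset_chars:]]


# A five-character placeholder string with preset_chars = 2 gives n = 2.
parts = split_min_phrase("abcde", preset_chars=2)
```

With five characters and a preset number of 2, one two-character word is split off and the remaining three characters form the second word, which mirrors the two-plus-three split in the "learning ppt" example.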
In the embodiment of the invention, for historical search text information that contains both Chinese characters and English, the Chinese characters can serve as separators for the English and the English as separators for the Chinese characters, so that the Chinese and English runs are separated in the order in which they appear and each run is taken as a first segmented word. For example, the historical search text information "learning excel", which contains both Chinese characters and English, can be split into 2 first segmented words: "learning" and "excel".
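One way to sketch the Chinese/English separation is to group consecutive characters by script. The helper names `is_cjk` and `split_by_script` are hypothetical, the test for Chinese characters uses only the basic CJK Unified Ideographs range (an assumption that ignores rarer extension blocks), and the mixed string "学习excel" is an assumed illustration, since the source gives only an English rendering of its example.

```python
from itertools import groupby


def is_cjk(ch):
    """True if ch lies in the basic CJK Unified Ideographs block (assumed test)."""
    return "\u4e00" <= ch <= "\u9fff"


def split_by_script(text):
    """Split text at every boundary between Chinese and non-Chinese characters.

    Each maximal run of same-script characters becomes one segmented word,
    in the order in which the runs appear.
    """
    return ["".join(run) for _, run in groupby(text, key=is_cjk)]


# Assumed mixed-script example: two Chinese characters followed by English.
words = split_by_script("学习excel")
```

Since every non-Chinese character (including digits and spaces) falls into the same group as English letters here, a production version would likely need a finer classification.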
In the embodiment of the invention, each piece of historical search text information can also be split according to a single-character splitting rule. For example, the historical search text information "sky" (a two-character word in the original Chinese) may be split into its two characters, rendered individually as "day" and "empty".
In the embodiment of the invention, each historical search text message can be split preferentially according to the splitting rule of the minimum split phrase.
In an embodiment of the present invention, semantic analysis may be performed directly on each first segmented word obtained by splitting to obtain its semantic features. Specifically, the semantic features of a first segmented word can be obtained by analyzing it with a pre-trained text model, and/or extracted directly with any feature extraction tool. The pre-trained text model is trained in advance using a plurality of sample segmented words and the semantic features of those sample segmented words.
In another embodiment of the present invention, the semantic features of each first segmented word obtained by splitting may instead be looked up in a preset semantic feature library. The semantic feature library stores semantic features of text information, and may store each piece of text information (or its identification information) in correspondence with its semantic features.
In the embodiment of the invention, the semantic feature library can be constructed using at least one way of analyzing the semantic features of segmented words. For example, a pre-trained text model can be used to extract semantic features from a large amount of text information determined based on user behavior (e.g., text information searched by users and text information corresponding to videos watched by users), and the extracted semantic features are stored in a preset library to obtain the constructed semantic feature library. Alternatively, for the segmented words obtained by splitting the text information, the semantic features of each segmented word can be extracted with the pre-trained text model and stored in the preset library, so that the constructed semantic feature library stores the semantic features of each segmented word. Of course, the method for extracting the semantic features of segmented words is not limited to these, and any method capable of extracting such features may be applied to the embodiments of the present invention.
The second segmented words are obtained by splitting the candidate recommended text information. Specifically, the candidate recommended text information is split into second segmented words in the same way that the historical search text information is split into first segmented words, and the semantic features of each second segmented word may be extracted in the same way as those of each first segmented word; details are not repeated here.
In a possible implementation, the step of splitting the historical search text information may specifically include steps A1-A2:
Step A1, determining whether semantic features corresponding to the historical search text information can be obtained from a preset semantic feature library.
The semantic feature library is used for storing semantic features of text information.
Step A2, if not, splitting the historical search text information according to the splitting rule of the minimum splittable phrase.
In the embodiment of the invention, for each piece of historical search text information, the semantic features corresponding to it can first be looked up in the preset semantic feature library; if they cannot be found there, the historical search text information is split according to the splitting rule of the minimum splittable phrase.
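The look-up-then-split logic of steps A1-A2 can be sketched as below. Everything concrete here is an assumption made for illustration: the library is modeled as a dictionary from text to a feature vector, `features_for_query` is a hypothetical name, whitespace splitting (`str.split`) stands in for the actual splitting rule, and segmented words missing from the library are simply skipped, whereas the source also allows analyzing them directly with a model.

```python
def features_for_query(query, feature_library, split):
    """Return the semantic feature vectors usable for one search query.

    Step A1: if the whole query has features in the library, use them.
    Step A2: otherwise split the query and look up each segmented word.
    """
    if query in feature_library:
        return [feature_library[query]]
    # Fall back to the segmented words; words absent from the library are
    # skipped in this sketch.
    return [feature_library[word] for word in split(query) if word in feature_library]


# Illustrative library with made-up two-dimensional features.
library = {"learning": [0.1, 0.9], "excel": [0.8, 0.2], "sky": [0.5, 0.5]}
whole_hit = features_for_query("sky", library, str.split)
fallback = features_for_query("learning excel", library, str.split)
```

The `split` argument is passed in so that any of the splitting rules described earlier can be substituted without changing the look-up logic.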
In this embodiment of the present invention, the step of determining the user's interest tendency toward each piece of candidate recommended text information according to the semantic features of each first segmented word corresponding to the historical search text information and the semantic features of each second segmented word corresponding to each piece of candidate recommended text information may include, but is not limited to, the following: for each piece of candidate recommended text information, calculating the correlation between it and each piece of historical search text information using the semantic features of the first segmented words of that historical search text information and the semantic features of the second segmented words of the candidate recommended text information, and then calculating the user's interest tendency toward the candidate recommended text information from its correlations with the pieces of historical search text information.
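The two-stage computation above (per-query correlation first, then an overall interest tendency) can be sketched as follows. The source does not fix the formulas at this point, so this sketch assumes cosine similarity between feature vectors, a plain mean over word pairs for the correlation, and a plain mean over queries for the interest tendency; the function names are hypothetical as well.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def correlation(query_features, candidate_features):
    """Assumed correlation between one query and one candidate: the mean
    similarity over all (first word, second word) feature pairs."""
    sims = [cosine(q, c) for q in query_features for c in candidate_features]
    return sum(sims) / len(sims) if sims else 0.0


def interest_tendency(history_features, candidate_features):
    """Assumed aggregation: mean correlation over all historical queries."""
    scores = [correlation(q, candidate_features) for q in history_features]
    return sum(scores) / len(scores) if scores else 0.0


# Two historical queries, each with one segmented word; one candidate word.
history = [[[1.0, 0.0]], [[0.0, 1.0]]]
candidate = [[1.0, 0.0]]
tendency = interest_tendency(history, candidate)
```

Other aggregations, such as a maximum or a recency-weighted mean, would fit the same structure by changing only the last line of each function.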
Specifically, there may be various implementation manners for determining the interest tendency of the user in each candidate recommended text information, which will be described in detail below.
Fig. 2 is a flowchart of a method for determining a tendency of interest according to an embodiment of the present invention, and as shown in fig. 2, the step of determining the tendency of interest of the user in each candidate recommended text information according to semantic features of each first word segmentation information corresponding to history search text information and semantic features of each second word segmentation information corresponding to each candidate recommended text information may include:
Step 201, performing feature interaction on the semantic features of the first participles and the semantic features of the second participles to obtain the correlation between the first participles and the second participles.
The correlation between a first participle and a second participle is determined from at least one of the similarities calculated by the following similarity calculation modes a-c. A single similarity calculated by one of these modes may be used as the correlation, or similarities calculated by two or more of these modes may be combined to obtain the correlation. For example, the similarity between the first participle and the second participle is calculated by each of the similarity calculation modes a-c, and the average of these similarities (or a weighted average, or the result of any other custom algorithm, not enumerated exhaustively here) is used as the correlation between the first participle and the second participle. Of course, the calculation of the correlation between participles is not limited to the following similarity calculation modes; in the embodiment of the present invention, another similarity calculation method may be applied to the semantic features of each first participle and each second participle, and the resulting similarity used as the correlation between each first participle and each second participle.
Similarity calculation mode a: the cosine similarity between each first participle and each second participle can be calculated according to the semantic features of each first participle and the semantic features of each second participle, and is used as the correlation between each first participle and each second participle. Specifically, the correlation between a first participle and a second participle is related to the semantic features of the first participle and the semantic features of the second participle by the following formula:
$$\mathrm{Similarity}(x, y) = \frac{\vec{x} \cdot \vec{y}}{\lVert \vec{x} \rVert \, \lVert \vec{y} \rVert} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2} \, \sqrt{\sum_{i=1}^{n} y_i^2}}$$
wherein Similarity(x, y) represents the cosine similarity between the first participle and the second participle, namely the correlation; $\vec{x}$ represents the vector of semantic features of the first participle, and $\vec{y}$ represents the vector of semantic features of the second participle; $x_i$ represents the respective components of the vector of semantic features of the first participle, and $y_i$ represents the respective components of the vector of semantic features of the second participle; n is the number of components of the vector of any semantic feature.
Similarity calculation mode b: the Euclidean distance between each first participle and each second participle may be calculated according to the semantic features of each first participle and the semantic features of each second participle, and used as the correlation between each first participle and each second participle.
Similarity calculation mode c: the Manhattan distance between each first participle and each second participle may be calculated according to the semantic features of each first participle and the semantic features of each second participle, and used as the correlation between each first participle and each second participle.
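The three similarity calculation modes a-c can be illustrated with the following minimal sketch. The function names are hypothetical, and the vectors are assumed to be plain sequences of numbers of equal length. Note that the two distances grow as the vectors move apart, whereas cosine similarity grows as they align; how a distance is mapped onto a correlation is not specified in the text and is left out here.

```python
import math
from typing import Sequence


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Mode a: dot product divided by the product of the vector norms.

    Raises ZeroDivisionError if either vector is all zeros.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Mode b: square root of the summed squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Mode c: sum of the absolute component differences."""
    return sum(abs(a - b) for a, b in zip(x, y))
```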
For example, for a user A of the video software X, the historical search text information of user A in the video software X within a preset time period may be acquired: "learning ppt" and "learning excel". The current candidate recommended text information includes "excel course". According to the splitting rule that Chinese characters and English characters act as separators for each other, the historical search text information "learning ppt" is split to obtain two first participles ["learning", "ppt"], and the historical search text information "learning excel" is split to obtain two first participles ["learning", "excel"]. The candidate recommended text information "excel course" is split to obtain two second participles ["excel", "course"].
In this embodiment, the semantic features of each first participle and each second participle may be looked up in the semantic feature library: the semantic features of "learning" are [-1, 4, 5, -6], those of "ppt" are [1, 2, 3, 4], those of "excel" are [1.1, 1.9, 3, 4.1], and those of "course" are [1, 1.5, 2.6, 4].
Further, the cosine similarity between each first participle and each second participle can be calculated by using the above calculation formula of cosine similarity, and is used as the correlation. For example, fig. 3 is a schematic diagram of determining the correlation between the candidate recommended text information and the user a, and referring to fig. 3, the correlation between the first participle "learning" in the historical search text information "learning PPT" and the second participle "EXCEL" in the candidate recommended text information "EXCEL course" is calculated:
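$$\mathrm{Similarity}(\text{learning}, \text{EXCEL}) = \frac{(-1)(1.1) + (4)(1.9) + (5)(3) + (-6)(4.1)}{\sqrt{(-1)^2 + 4^2 + 5^2 + (-6)^2} \, \sqrt{1.1^2 + 1.9^2 + 3^2 + 4.1^2}} = \frac{-3.1}{\sqrt{78} \, \sqrt{30.63}} \approx -0.063$$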
based on the same method, the following can be calculated:
the correlation Similarity(PPT, EXCEL) between the first participle "PPT" in the historical search text information "learning PPT" and the second participle "EXCEL" in the candidate recommended text information "EXCEL course";
the correlation Similarity(learning, course) between the first participle "learning" in the historical search text information "learning PPT" and the second participle "course" in the candidate recommended text information "EXCEL course";
the correlation Similarity(PPT, course) between the first participle "PPT" in the historical search text information "learning PPT" and the second participle "course" in the candidate recommended text information "EXCEL course";
the correlation Similarity(learning, EXCEL) between the first participle "learning" in the historical search text information "learning EXCEL" and the second participle "EXCEL" in the candidate recommended text information "EXCEL course";
the correlation Similarity(EXCEL, EXCEL) between the first participle "EXCEL" in the historical search text information "learning EXCEL" and the second participle "EXCEL" in the candidate recommended text information "EXCEL course";
the correlation Similarity(learning, course) between the first participle "learning" in the historical search text information "learning EXCEL" and the second participle "course" in the candidate recommended text information "EXCEL course";
the correlation Similarity(EXCEL, course) between the first participle "EXCEL" in the historical search text information "learning EXCEL" and the second participle "course" in the candidate recommended text information "EXCEL course".
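The pairwise correlations listed above can be reproduced with a short sketch that applies cosine similarity to the example feature vectors. The dictionary `FEATURES` and the function `correlation_matrix` are hypothetical names; laying the result out with one row per first participle and one column per second participle is an assumption made for readability.

```python
import math

# Example semantic features taken from the text.
FEATURES = {
    "learning": [-1, 4, 5, -6],
    "ppt": [1, 2, 3, 4],
    "excel": [1.1, 1.9, 3, 4.1],
    "course": [1, 1.5, 2.6, 4],
}


def cosine_similarity(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def correlation_matrix(first_participles, second_participles):
    """Rows follow the first participles, columns the second participles."""
    return [
        [cosine_similarity(FEATURES[f], FEATURES[s]) for s in second_participles]
        for f in first_participles
    ]


matrix_ppt = correlation_matrix(["learning", "ppt"], ["excel", "course"])
matrix_excel = correlation_matrix(["learning", "excel"], ["excel", "course"])
```

In `matrix_excel`, the entry for ("excel", "excel") is 1 up to rounding, since a vector is perfectly aligned with itself.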
Step 202, for each candidate recommended text information, calculating the correlation between the candidate recommended text information and the historical search text information based on the correlation between each second participle corresponding to the candidate recommended text information and each first participle corresponding to the historical search text information.
In the embodiment of the present invention, the following steps B1-B2 may be adopted to calculate the correlation between each candidate recommended text information and each historical search text information:
Step B1: for each candidate recommended text information, determining a matrix whose elements are the correlations between each second participle corresponding to the candidate recommended text information and each first participle corresponding to the historical search text information.
Step B2: inputting the matrix into a pre-trained correlation determination model to obtain the correlation between the candidate recommended text information and the historical search text information.
In the embodiment of the invention, the correlation determination model may be trained in advance on matrices whose elements are the correlations between each sample second participle corresponding to sample candidate recommended text information and each sample first participle corresponding to sample historical search text information.
For example, the sample second participles corresponding to the sample candidate recommended text information "word learning" are "word" and "learning", and the sample first participles corresponding to the sample historical search text information "PPT making" are "PPT" and "making". A matrix [Similarity(word, PPT), Similarity(word, making), Similarity(learning, PPT), Similarity(learning, making)], whose elements are the correlations between each sample second participle and each sample first participle, is input into the correlation determination model to be trained to obtain a predicted correlation between the sample candidate recommended text information and the sample historical search text information. The parameters of the correlation determination model to be trained are then adjusted according to the real correlation and the predicted correlation between the sample candidate recommended text information and the sample historical search text information until a correlation constraint condition is satisfied, at which point the parameters of the model are finally determined. For example, the correlation constraint condition may be: when the difference between the real correlation and the predicted correlation between the sample candidate recommended text information and the sample historical search text information is smaller than a preset correlation difference threshold, the correlation determination model to be trained is determined to have converged, and the correlation determination model is obtained based on the model parameters at convergence. The preset correlation difference threshold may be set according to practical applications, and is not specifically limited herein.
In the embodiment of the present invention, a matrix whose elements are the correlations between each second participle corresponding to the candidate recommended text information and each first participle corresponding to the historical search text information may be input into the pre-trained correlation determination model to obtain the correlation between the candidate recommended text information and the historical search text information. For example, for the candidate recommended text information "EXCEL course", the matrix [Similarity(learning, EXCEL), Similarity(learning, course), Similarity(PPT, EXCEL), Similarity(PPT, course)], whose elements are the correlations between each second participle corresponding to "EXCEL course" and each first participle corresponding to the historical search text information "learning PPT", may be input into the correlation determination model, and the corresponding output value is determined as the correlation between the candidate recommended text information "EXCEL course" and the historical search text information "learning PPT". Similarly, the matrix [Similarity(learning, EXCEL), Similarity(learning, course), Similarity(EXCEL, EXCEL), Similarity(EXCEL, course)], whose elements are the correlations between each second participle corresponding to "EXCEL course" and each first participle corresponding to the historical search text information "learning EXCEL", may be input into the correlation determination model, and the corresponding output value is determined as the correlation between the candidate recommended text information "EXCEL course" and the historical search text information "learning EXCEL".
In an embodiment of the present invention, the input matrix of the correlation determination model is limited to a matrix of N rows × M columns, where N and M are integers greater than 1 and may be equal or unequal. N is set to be greater than or equal to the number of rows of any matrix to be input into the correlation determination model (a matrix whose elements are the correlations between each second participle and each first participle corresponding to the historical search text information), and M is set to be greater than or equal to the number of columns of that matrix. If the number of rows (columns) of the matrix to be input into the correlation determination model is less than N (M), the matrix may be padded automatically, for example with 0, 1, or any other custom value.
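Padding a smaller correlation matrix up to the fixed N × M input size can be sketched as below. The function name `pad_matrix` and the default fill value of 0 are illustrative assumptions; the text allows any custom fill value.

```python
def pad_matrix(matrix, n_rows, n_cols, fill=0.0):
    """Pad a list-of-lists matrix to n_rows x n_cols with a fill value.

    Raises ValueError if the matrix is already larger than the target,
    since N and M are meant to be upper bounds on the input size.
    """
    if len(matrix) > n_rows or any(len(row) > n_cols for row in matrix):
        raise ValueError("matrix exceeds the model input size")
    # Extend every existing row to n_cols, then append whole filler rows.
    padded = [list(row) + [fill] * (n_cols - len(row)) for row in matrix]
    padded += [[fill] * n_cols for _ in range(n_rows - len(matrix))]
    return padded


padded = pad_matrix([[0.1, 0.2], [0.3, 0.4]], n_rows=3, n_cols=3)
```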
In another embodiment of the present invention, at least two correlation determination models may be provided, each expecting an input matrix with a different number of rows and columns. After determining, for each candidate recommended text information, the matrix whose elements are the correlations between each second participle corresponding to the candidate recommended text information and each first participle corresponding to the historical search text information, a correlation determination model matching the number of rows and columns of the matrix (for example, a model whose input data has the same number of rows and columns as the matrix) may be selected, and the matrix may then be input into the selected correlation determination model to obtain the correlation between the candidate recommended text information and the historical search text information.
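Selecting a model by matrix shape can be sketched as a dictionary lookup keyed on (rows, columns). Everything named here is hypothetical: `select_model`, the registry `models`, and in particular `mean_model`, which simply averages the matrix elements and only stands in for a trained correlation determination model.

```python
def select_model(matrix, models_by_shape):
    """Return the model registered for the shape of the given matrix."""
    shape = (len(matrix), len(matrix[0]) if matrix else 0)
    if shape not in models_by_shape:
        raise ValueError(f"no correlation model accepts shape {shape}")
    return models_by_shape[shape]


def mean_model(matrix):
    """Placeholder model: average of all elements (not a trained model)."""
    values = [value for row in matrix for value in row]
    return sum(values) / len(values)


# One entry per supported input shape; real entries would be trained models.
models = {(2, 2): mean_model, (2, 3): mean_model}

matrix = [[0.1, 0.3], [0.5, 0.7]]
correlation = select_model(matrix, models)(matrix)
```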
Step 203, determining the interest tendency of the user for each candidate recommended text information based on the correlation between each candidate recommended text information and the historical search text information.
For example, the user's interest tendency for the candidate recommended text information may be determined by a pre-trained tendency determination model based on the correlation between each candidate recommended text information and the historical search text information. Alternatively, the correlations between the candidate recommended text information and each historical search text information may be weighted according to the search time of each historical search text information and summed to obtain the user's interest tendency for the candidate recommended text information. Specific implementations of this step are described in detail below.
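A time-weighted summation of correlations might look like the following sketch. The weighting scheme is an assumption: the text only says that weights depend on search time, so an exponential decay with a hypothetical `half_life_days` parameter is used here, and the weights are normalised so the result stays on the same scale as the correlations.

```python
def interest_tendency(correlations, ages_in_days, half_life_days=7.0):
    """Weighted sum of correlations, giving recent searches more weight.

    correlations[i] is the correlation with the i-th historical search;
    ages_in_days[i] is how long ago that search was made.
    """
    # Assumed scheme: a search loses half its weight every half_life_days.
    weights = [0.5 ** (age / half_life_days) for age in ages_in_days]
    total = sum(weights)
    return sum(w * c for w, c in zip(weights, correlations)) / total


# Two searches: one made a week ago (correlation 0.2), one today (0.8).
tendency = interest_tendency([0.2, 0.8], [7.0, 0.0])
```

With these inputs the older search receives half the weight of the newer one, so the result is pulled toward 0.8.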
By adopting the method provided by the embodiment of the invention, the correlation between each first participle and each second participle can be determined first; the correlation between each candidate recommended text information and each historical search text information is then calculated according to the correlation between each second participle and each first participle; the user's interest tendency for each candidate recommended text information is determined based on the correlation between each candidate recommended text information and the historical search text information; and the target recommended text information can then be determined from the candidate recommended text information according to the interest tendency and recommended to the user. The method provided by the embodiment of the invention can segment search words that have no corresponding features, so that the semantic features of their participles can be obtained, and the user's interest tendency for the candidate recommended text information is determined using the semantic features of the participles of the search words. This solves the problem in the prior art that search words without corresponding features cannot be used for determining recommendation information. Therefore, compared with the prior art, the method provided by the embodiment of the invention can also use search words without corresponding features to determine recommendation information for the user, draws on a more complete set of search words, and improves the accuracy of determining the user's interest tendency for the candidate recommendation information, that is, the effect of recommending information to the user.
Fig. 4 is another flowchart of the method for determining a tendency of interest according to the embodiment of the present invention, and as shown in fig. 4, the step of determining the tendency of interest of the user in each candidate recommended text information according to the semantic features of each first word segmentation information corresponding to the history search text information and the semantic features of each second word segmentation information corresponding to each candidate recommended text information may include:
Step 401, determining semantic features corresponding to the historical search text information according to the semantic features of each first word segmentation information corresponding to the historical search text information.
Semantic features of each first participle can be extracted from a preset semantic feature library, and the semantic features of each first participle can be represented in a matrix. In the embodiment of the present invention, for each historical search text information, a matrix summation may be performed on a matrix of semantic features of each first participle corresponding to the historical search text information, and the matrix obtained by the summation is used to represent the semantic features corresponding to the historical search text information, or an arbitrary custom algorithm (for example, weighted summation, weighted averaging, and not exhaustive enumeration) may be used to calculate a matrix of semantic features of each first participle corresponding to the historical search text information, and a new matrix obtained by the calculation is used to represent the semantic features corresponding to the historical search text information.
For example, according to the splitting rule that Chinese characters and English characters act as separators for each other, the historical search text information "learning ppt" is split to obtain two first participles ["learning", "ppt"], and the historical search text information "learning excel" is split to obtain two first participles ["learning", "excel"]. Suppose the semantic features of each first participle found in the semantic feature library can be represented as matrices as follows: "learning" is [-1, 4, 5, -6], "ppt" is [1, 2, 3, 4], and "excel" is [1.1, 1.9, 3, 4.1]. In the embodiment of the present invention, for "learning ppt", matrix summation may be performed on the matrices of the semantic features of the first participles "learning" and "ppt" corresponding to the historical search text information "learning ppt": [-1, 4, 5, -6] + [1, 2, 3, 4] = [0, 6, 8, -2], and the summed matrix [0, 6, 8, -2] is used to represent the semantic features corresponding to the historical search text information "learning ppt". For "learning excel", matrix summation may be performed on the matrices of the semantic features of the first participles "learning" and "excel" corresponding to the historical search text information "learning excel": [-1, 4, 5, -6] + [1.1, 1.9, 3, 4.1] = [0.1, 5.9, 8, -1.9], and the summed matrix [0.1, 5.9, 8, -1.9] is used to represent the semantic features corresponding to the historical search text information "learning excel".
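The element-wise summation in this example can be checked with a small sketch. The function name `sum_features` is hypothetical, and plain Python lists stand in for the matrices of semantic features.

```python
def sum_features(vectors):
    """Element-wise sum of equally long feature vectors."""
    return [sum(components) for components in zip(*vectors)]


learning = [-1, 4, 5, -6]
ppt = [1, 2, 3, 4]
excel = [1.1, 1.9, 3, 4.1]

learning_ppt = sum_features([learning, ppt])
learning_excel = sum_features([learning, excel])
```

Because 1.1, 1.9, and 4.1 are not exactly representable in binary floating point, the second result matches [0.1, 5.9, 8, -1.9] only up to rounding.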
Step 402, for each candidate recommended text information, determining semantic features corresponding to the candidate recommended text information according to the semantic features of each second participle information corresponding to the candidate recommended text information.
The semantic features corresponding to the candidate recommended text information may be determined in the same manner as in step 401.
The execution sequence of steps 401 and 402 is not specifically limited.
Step 403, performing feature interaction on the semantic features corresponding to each candidate recommended text information and the semantic features corresponding to the historical search text information to obtain the correlation between the candidate recommended text information and the historical search text information.
In this step, the similarity between the candidate recommended text information and the historical search text information may be calculated according to the semantic features corresponding to each candidate recommended text information and the semantic features corresponding to the historical search text information, and the similarity is used as the correlation between the candidate recommended text information and the historical search text information.
Specifically, any one of the cosine similarity, Manhattan distance, Euclidean distance, Pearson correlation coefficient, or Hamming distance between the candidate recommended text information and the historical search text information, or a distance calculated by combining at least two of these, may be calculated according to the semantic features corresponding to each candidate recommended text information and the semantic features corresponding to the historical search text information, and used as the correlation between the candidate recommended text information and the historical search text information.
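Of the measures named above, the Pearson correlation coefficient and the Hamming distance have not been illustrated so far. The sketch below gives textbook definitions of both; the function names are hypothetical, and applying the Hamming distance to real-valued features by counting differing positions is an assumption, since the text does not say how the features would be discretised.

```python
import math


def pearson_correlation(x, y):
    """Covariance of x and y divided by the product of their deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def hamming_distance(x, y):
    """Number of positions at which two equally long sequences differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)
```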
Step 404, determining the interest tendency of the user for each candidate recommended text information based on the correlation between each candidate recommended text information and the historical search text information.
By adopting the method provided by the embodiment of the invention, the semantic features corresponding to each historical search text information can be determined according to the semantic features of each first participle, the semantic features corresponding to each candidate recommended text information can be determined according to the semantic features of each second participle, the correlation degree between each historical search text information and each candidate recommended text information is calculated according to the semantic features corresponding to each historical search text information and the semantic features corresponding to each candidate recommended text information, the interest tendency of the user on each candidate recommended text information is determined, and then the target recommended information can be determined from the candidate recommended text information according to the interest tendency and recommended to the user. The method provided by the embodiment of the invention can determine the semantic features corresponding to the search words by the semantic features of the participles of the search words without the corresponding features, and then determines the interest tendency of the user on the candidate recommended text information by utilizing the semantic features of the search words, thereby solving the problem that the search words without the corresponding features can not be used for determining the recommended information in the prior art. 
Therefore, compared with the prior art, the method provided by the embodiment of the invention can also determine recommendation information for the user by utilizing the search words without corresponding characteristics, determines the recommendation information for the user by utilizing more comprehensive search words, and improves the accuracy of determining the interest tendency of the user on the candidate recommendation information, namely the recommendation effect of recommending the information for the user.
Fig. 5 is still another flowchart of the method for determining a tendency of interest according to the embodiment of the present invention, and as shown in fig. 5, the step of determining the tendency of interest of the user in each candidate recommended text information according to the semantic features of each first word segmentation information corresponding to the history search text information and the semantic features of each second word segmentation information corresponding to each candidate recommended text information may further include:
Step 501, determining a first word segmentation sequence formed by each first word segmentation corresponding to the historical search text information, and determining a first sequence feature corresponding to the first word segmentation sequence according to the semantic feature of each first word segmentation.
In the embodiment of the present invention, for each piece of history search text information, the history search text information may be divided into a plurality of first participles, and a sequence obtained by arranging the respective first participles divided from the history search text information is used as a first participle sequence corresponding to the history search text information. The semantic features of each first participle in the first participle sequence can be extracted from a preset semantic feature library, and the semantic features of each first participle can be represented in a matrix. In the embodiment of the present invention, for each first word segmentation sequence, the matrices of the semantic features of each first word segmentation in the first word segmentation sequence may be spliced, and the spliced matrices represent the first sequence features of the first word segmentation sequence.
For example, according to the splitting rule that Chinese characters and English characters act as separators for each other, the historical search text information "learning ppt" is split to obtain two first participles "learning" and "ppt", giving the first word segmentation sequence ["learning", "ppt"] corresponding to "learning ppt"; the historical search text information "learning excel" is split to obtain two first participles "learning" and "excel", giving the first word segmentation sequence ["learning", "excel"] corresponding to "learning excel". Suppose the semantic features of each first participle found in the semantic feature library can be represented as matrices as follows: "learning" is [-1, 4, 5, -6], "ppt" is [1, 2, 3, 4], and "excel" is [1.1, 1.9, 3, 4.1]. In the embodiment of the present invention, for the first word segmentation sequence ["learning", "ppt"], the matrices of the semantic features of the first participles "learning" and "ppt" may be spliced: [-1, 4, 5, -6] and [1, 2, 3, 4] are spliced into a new matrix [-1, 4, 5, -6, 1, 2, 3, 4], and the spliced matrix represents the first sequence feature of the first word segmentation sequence ["learning", "ppt"]. For the first word segmentation sequence ["learning", "excel"], the matrices of the semantic features of the first participles "learning" and "excel" may be spliced: [-1, 4, 5, -6] and [1.1, 1.9, 3, 4.1] are spliced into a new matrix [-1, 4, 5, -6, 1.1, 1.9, 3, 4.1], and the spliced matrix represents the first sequence feature of the first word segmentation sequence ["learning", "excel"].
Step 502, determining a second word segmentation sequence formed by each second word segmentation corresponding to each candidate recommended text message, and determining a second sequence feature corresponding to the second word segmentation sequence according to the semantic feature of each second word segmentation.
The execution sequence of step 501 and step 502 is not specifically limited.
The method of determining the second sequence features in this step is the same as the method of determining the first sequence features in step 501.
For example, according to the splitting rule that Chinese characters and English characters act as separators for each other, the candidate recommended text information "excel course" is split to obtain two second participles "excel" and "course", giving the second word segmentation sequence ["excel", "course"] corresponding to the candidate recommended text information. Suppose the semantic features of each second participle found in the semantic feature library can be represented as matrices as follows: "excel" is [1.1, 1.9, 3, 4.1] and "course" is [1, 1.5, 2.6, 4]. In the embodiment of the present invention, for the second word segmentation sequence ["excel", "course"], the matrices of the semantic features of the second participles "excel" and "course" may be spliced: [1.1, 1.9, 3, 4.1] and [1, 1.5, 2.6, 4] are spliced into a new matrix [1.1, 1.9, 3, 4.1, 1, 1.5, 2.6, 4], and the spliced matrix represents the second sequence feature of the second word segmentation sequence ["excel", "course"].
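The splicing of participle features into a sequence feature amounts to concatenating the vectors in sequence order, as the following sketch shows. The names `FEATURES` and `sequence_feature` are hypothetical; the feature values are the ones used in the examples.

```python
from itertools import chain

FEATURES = {
    "learning": [-1, 4, 5, -6],
    "ppt": [1, 2, 3, 4],
    "excel": [1.1, 1.9, 3, 4.1],
    "course": [1, 1.5, 2.6, 4],
}


def sequence_feature(participles, feature_library):
    """Concatenate the feature vectors of the participles, in order."""
    return list(chain.from_iterable(feature_library[p] for p in participles))


first_sequence = sequence_feature(["learning", "ppt"], FEATURES)
second_sequence = sequence_feature(["excel", "course"], FEATURES)
```

Unlike summation, concatenation keeps word order: ["ppt", "learning"] would yield a different sequence feature from ["learning", "ppt"].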
Step 503, performing feature interaction on the second sequence feature corresponding to each candidate recommended text information and the first sequence feature corresponding to the historical search text information to obtain the correlation between the candidate recommended text information and the historical search text information.
In this step, according to the second sequence feature corresponding to each candidate recommended text information and the first sequence feature corresponding to the historical search text information, the similarity between the candidate recommended text information and the historical search text information may be calculated as the correlation between the candidate recommended text information and the historical search text information.
Specifically, the cosine similarity, Manhattan distance, Euclidean distance, Pearson correlation coefficient, or Hamming distance between the candidate recommended text information and the historical search text information may be calculated according to the second sequence feature corresponding to each candidate recommended text information and the first sequence feature corresponding to the historical search text information, and used as the correlation between the candidate recommended text information and the historical search text information.
Step 504, determining the interest tendency of the user for each candidate recommended text information based on the correlation between each candidate recommended text information and the historical search text information.
By adopting the method provided by the embodiment of the invention, the first sequence characteristic corresponding to each first participle sequence can be determined according to the semantic characteristic of each first participle, the second sequence characteristic corresponding to each second participle sequence can be determined according to the semantic characteristic of each second participle, the correlation degree between each historical search text information and each candidate recommended text information can be determined through the first sequence characteristic and the second sequence characteristic, the interest tendency degree of the user on each candidate recommended text information can be determined, and then the target recommended information can be determined from a plurality of candidate recommended text information according to the interest tendency degree and recommended to the user. The method provided by the embodiment of the invention can determine the semantic features corresponding to the search words by the semantic features of the participles of the search words without the corresponding features, and then determines the interest tendency of the user on the candidate recommended text information by utilizing the semantic features of the search words, thereby solving the problem that the search words without the corresponding features can not be used for determining the recommended information in the prior art. Therefore, compared with the prior art, the method provided by the embodiment of the invention can also determine recommendation information for the user by utilizing the search words without corresponding characteristics, determines the recommendation information for the user by utilizing more comprehensive search words, and improves the accuracy of determining the interest tendency of the user on the candidate recommendation information, namely the recommendation effect of recommending the information for the user.
In the embodiment of the invention, based on any one of the above embodiments, the correlation between each candidate recommended text information and the historical search text information can be obtained, and on this basis, the user's interest tendency for each candidate recommended text information can be determined based on the correlation between each candidate recommended text information and the historical search text information by using at least one of the following tendency calculation methods a-b.
Tendency calculation method a: the following steps C1-C2 may be adopted to determine the interest tendency of the user in each candidate recommended text information based on the correlation between each candidate recommended text information and the historical search text information:
Step C1: for each candidate recommended text information, take the matrix whose elements are the correlations between the candidate recommended text information and the historical search text information as the matrix corresponding to the candidate recommended text information.
Step C2: input the matrix corresponding to each candidate recommended text information into a pre-trained tendency degree determination model to obtain the interest tendency of the user for the candidate recommended text information.
In the embodiment of the invention, the tendency degree determination model can be trained in advance on matrices whose elements are the correlations between sample candidate recommended text information and sample historical search text information. For example, suppose the correlation between the sample candidate recommended text information "word learning" and the sample historical search text information "PPT preparation" is S1, and the correlation between "word learning" and the sample historical search text information "OFFICE preparation" is S2. The matrix [S1, S2], whose elements are the correlations between "word learning" and each sample historical search text information, is input into the tendency degree determination model to be trained to obtain the predicted interest tendency of the sample user for "word learning". The parameters of the model to be trained are then adjusted according to the sample user's true interest tendency and predicted interest tendency for "word learning". When the difference between the true interest tendency and the predicted interest tendency of the sample user for the sample candidate recommended text information is smaller than a preset interest tendency difference threshold, the model to be trained is determined to have converged, and the tendency degree determination model is obtained. The preset interest tendency difference threshold may be set according to the practical application and is not specifically limited here.
In the embodiment of the invention, for each candidate recommended text information, a matrix formed by taking the correlation between the candidate recommended text information and the historical search text information as elements can be input into a pre-trained tendency degree determination model, so that the tendency degree of interest of the user in the candidate recommended text information is obtained. For example, in each piece of historical search text information, if the correlation between the historical search text information "learning PPT" and the candidate recommended text information "EXCEL course" is S11, and the correlation between the historical search text information "learning EXCEL" and the candidate recommended text information "EXCEL course" is S21, then a matrix [ S11, S21] formed by elements of the correlation between each piece of historical search text information "learning PPT", "learning EXCEL" and the candidate recommended text information "EXCEL course" may be input into the pre-trained tendency determination model, so as to obtain the tendency of interest of the user in the candidate recommended text information "EXCEL course".
In an embodiment of the present invention, the input matrix of the tendency degree determination model may be restricted to a matrix of P rows × Q columns, where P and Q are integers greater than 1 and may be equal or unequal. P is set to be greater than or equal to the number of rows of any matrix intended to be input into the tendency degree determination model (a matrix whose elements are the correlations between the candidate recommended text information and the historical search text information), and Q is set to be greater than or equal to the number of columns of such a matrix. If the matrix to be input has fewer than P rows or fewer than Q columns, it can be padded automatically, for example with 0, 1, or any other custom value.
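The padding described above can be illustrated with a minimal sketch. The function name `pad_matrix`, its parameters, and the choice to raise an error when the matrix exceeds P × Q are assumptions made for illustration; the embodiment only states that missing positions may be filled with 0, 1, or another custom value.

```python
from typing import List


def pad_matrix(matrix: List[List[float]], p: int, q: int,
               fill: float = 0.0) -> List[List[float]]:
    """Pad a correlation matrix to exactly p rows and q columns.

    Hypothetical helper: positions that the original matrix does not
    cover are filled with `fill` (0 by default).
    """
    rows = len(matrix)
    cols = max((len(row) for row in matrix), default=0)
    if rows > p or cols > q:
        # Assumption: a matrix larger than the model input is rejected.
        raise ValueError("matrix is larger than the P x Q model input")
    # Extend every existing row to q columns, then append full fill rows.
    padded = [list(row) + [fill] * (q - len(row)) for row in matrix]
    padded += [[fill] * q for _ in range(p - rows)]
    return padded
```

For a 1 × 2 matrix such as [S11, S21] and a model input of 2 × 3, the result keeps the two correlations in the first row and fills every other position with the chosen value.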
In another embodiment of the present invention, at least two tendency degree determination models may be provided, each configured for a different number of rows and columns of the input matrix. After the matrix whose elements are the correlations between the candidate recommended text information and the historical search text information is obtained, a tendency degree determination model matching the matrix's number of rows and columns is selected (for example, a model whose input data has the same number of rows and columns as the matrix), and the matrix is input into the selected model to obtain the interest tendency of the user for the candidate recommended text information.
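Selecting a model by input shape can be sketched as a lookup keyed on (rows, columns). Everything named here is hypothetical: `select_model`, the registry dictionary, and `mean_model`, which is only a stand-in for a pre-trained tendency degree determination model and is not the model of the embodiment.

```python
from typing import Callable, Dict, List, Tuple

Matrix = List[List[float]]
Model = Callable[[Matrix], float]


def mean_model(matrix: Matrix) -> float:
    """Stand-in for a trained model: averages all correlations."""
    values = [v for row in matrix for v in row]
    return sum(values) / len(values)


def select_model(matrix: Matrix, models: Dict[Tuple[int, int], Model]) -> Model:
    """Return the model registered for the matrix's (rows, columns) shape."""
    shape = (len(matrix), len(matrix[0]) if matrix else 0)
    if shape not in models:
        # Assumption: no fallback model exists for unregistered shapes.
        raise LookupError(f"no tendency model registered for shape {shape}")
    return models[shape]
```

A caller would register one model per supported shape, for example `{(1, 2): model_a, (1, 3): model_b}`, and pass each correlation matrix to the model returned by `select_model`.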
Tendency calculation method b: the following steps D1-D2 may be adopted to determine the interest tendency of the user in each candidate recommended text information based on the correlation between each candidate recommended text information and the historical search text information:
Step D1: acquire the search time at which the user searched each historical search text information.
Step D2: for each candidate recommended text information, perform a weighted summation of the correlations between the candidate recommended text information and each historical search text information according to the search time, and determine the resulting sum as the interest tendency of the user for the candidate recommended text information.
In the embodiment of the present invention, the correlations between the candidate recommended text information and each historical search text information may be weighted and summed according to the search time, where the farther the search time of a historical search text information is from the current time, the smaller the weight given to its correlation with the candidate recommended text information. The specific weight assignment rule may be that the weight decreases as the difference between the search time and the current time increases. For example, a weight of 1.0 is given when the difference between the search time and the current time is within one day, and 0.1 is subtracted from the weight for each additional day. Thus, if the user searched historical search text information a within one day of the current time, a weight of 1.0 is given to the correlation between historical search text information a and the candidate recommended text information; if the user searched historical search text information b within 2 days of the current time, a weight of 0.9 is given to the correlation between historical search text information b and the candidate recommended text information; if the user searched historical search text information c within 3 days of the current time, a weight of 0.8 is given to the correlation between historical search text information c and the candidate recommended text information; and if the user searched historical search text information d within 4 days of the current time, a weight of 0.7 is given to the correlation between historical search text information d and the candidate recommended text information.
For example, suppose that among the historical search text information, the correlation between the historical search text information "learning PPT" and the candidate recommended text information "EXCEL course" is S11, and the correlation between the historical search text information "learning EXCEL" and the candidate recommended text information "EXCEL course" is S21. If the user searched "learning PPT" within 2 days of the current time and searched "learning EXCEL" within 4 days of the current time, a weight of 0.9 may be assigned to "learning PPT" and a weight of 0.7 may be assigned to "learning EXCEL". The correlations between the candidate recommended text information "EXCEL course" and each historical search text information are then weighted and summed: h = 0.9 × S11 + 0.7 × S21, and the resulting sum h is determined as the interest tendency of the user for the candidate recommended text information.
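The time-weighted summation of method b can be sketched as follows. The function names, the use of whole-day buckets computed with `math.ceil`, and the floor of 0 on the weight are assumptions; the embodiment gives only the example rule of 1.0 within one day and 0.1 less for each additional day.

```python
import math
from datetime import datetime
from typing import List, Tuple


def recency_weight(search_time: datetime, now: datetime) -> float:
    """Weight 1.0 within one day, minus 0.1 for each additional day."""
    elapsed_days = (now - search_time).total_seconds() / 86400.0
    bucket = max(1, math.ceil(elapsed_days))  # 1 = within one day, 2 = within 2 days
    # Work in tenths to avoid floating-point drift; never go below 0 (assumption).
    return max(0, 10 - (bucket - 1)) / 10


def interest_tendency(correlations: List[Tuple[float, datetime]],
                      now: datetime) -> float:
    """Sum each correlation multiplied by the weight of its search time."""
    return sum(recency_weight(t, now) * s for s, t in correlations)
```

With one search made within 2 days and another within 4 days, `interest_tendency` evaluates 0.9 × S11 + 0.7 × S21, matching the example above.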
The method provided by the embodiment of the invention solves the prior-art problem that historical search text information for which no corresponding semantic feature can be found cannot be used to determine recommendation information. The method segments such historical search text information into participles and obtains the semantic features of the participles for use in determining recommendation information. Compared with the prior art, the method can therefore also use historical search text information whose corresponding semantic features cannot be found, determines recommendation information for the user from a more comprehensive set of search terms, and improves the accuracy of the determined correlation between the candidate recommended text information and the user. By splitting the text information into participles and using the semantic features corresponding to those participles to determine recommendation information, the method improves the effect of recommending information to the user.
Based on the same inventive concept, according to the recommendation information determining method provided in the above embodiment of the present invention, correspondingly, another embodiment of the present invention further provides a recommendation information determining apparatus, a schematic structural diagram of which is shown in fig. 6, specifically including:
an information obtaining module 601, configured to obtain historical search text information searched by a user;
a splitting module 602, configured to split the historical search text information to obtain at least one piece of first word segmentation information corresponding to the historical search text information, and to obtain semantic features of each piece of first word segmentation information;
an interest tendency determining module 603, configured to determine, according to the semantic features of each first word segmentation information corresponding to the historical search text information and the semantic features of each second word segmentation information corresponding to each candidate recommended text information, an interest tendency of the user to each candidate recommended text information; the second word segmentation is obtained by splitting the candidate recommended text information;
and a recommendation information determining module 604, configured to determine at least one piece of target recommendation information based on the magnitude of the tendency of interest of the user in each candidate recommended text information.
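As an illustration of what the recommendation information determining module 604 might do, the sketch below ranks candidates by interest tendency and keeps the top k. Taking the k highest values is an assumption; the embodiment says only that target recommendation information is determined based on the magnitude of the interest tendency. The name `select_targets` is hypothetical.

```python
from typing import Dict, List


def select_targets(tendencies: Dict[str, float], k: int = 1) -> List[str]:
    """Return the k candidates with the highest interest tendency."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sort candidates from highest to lowest interest tendency.
    ranked = sorted(tendencies.items(), key=lambda item: item[1], reverse=True)
    return [candidate for candidate, _ in ranked[:k]]
```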
In an embodiment of the apparatus of the present invention, the interest tendency determining module 603 includes:
a first relevance determining sub-module, configured to perform feature interaction on the semantic features of each first participle and the semantic features of each second participle to obtain the correlation between each first participle and each second participle;
a second relevance determining sub-module, configured to calculate, for each candidate recommended text information, the correlation between the candidate recommended text information and the historical search text information based on the correlation between each second participle corresponding to the candidate recommended text information and each first participle corresponding to the historical search text information;
and a tendency determining sub-module, configured to determine the interest tendency of the user for each candidate recommended text information based on the correlation between each candidate recommended text information and the historical search text information.
In a possible implementation manner, the first relevance determining sub-module is specifically configured to, for any one of the first participles and any one of the second participles, calculate a cosine similarity between a semantic feature of the first participle and a semantic feature of the second participle, as a relevance between the first participle and the second participle.
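The cosine-similarity computation can be sketched as below, with semantic features represented as plain lists of floats. The function names, the row and column orientation of the resulting matrix (rows for second participles, columns for first participles), and returning 0.0 for a zero vector are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two semantic feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector is treated as unrelated
    return dot / (norm_a * norm_b)


def relevance_matrix(first_features: List[Sequence[float]],
                     second_features: List[Sequence[float]]) -> List[List[float]]:
    """One row per second participle, one column per first participle."""
    return [[cosine_similarity(s, f) for f in first_features]
            for s in second_features]
```

The matrix returned by `relevance_matrix` has the form that the second relevance determining sub-module would pass to a correlation degree determination model.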
In a possible implementation manner, the second relevance determining sub-module is specifically configured to determine a matrix formed by using, as elements, relevance between each second word corresponding to the candidate recommended text information and each first word corresponding to the historical search text information; and inputting the matrix into a pre-trained correlation degree determination model to obtain the correlation degree between the candidate recommended text information and the historical search text information.
In another apparatus embodiment of the present invention, the interest tendency determining module 603 is specifically configured to determine a semantic feature corresponding to the historical search text information according to a semantic feature of each first word segmentation information corresponding to the historical search text information; for each candidate recommended text message, determining semantic features corresponding to the candidate recommended text message according to the semantic features of each second participle message corresponding to the candidate recommended text message; performing feature interaction on semantic features corresponding to each candidate recommended text information and semantic features corresponding to the historical search text information to obtain the correlation degree between the candidate recommended text information and the historical search text information; and determining the interest tendency of the user to each candidate recommended text information based on the correlation degree between each candidate recommended text information and the historical search text information.
In another embodiment of the apparatus of the present invention, the interest tendency determining module 603 is specifically configured to determine a first word segmentation sequence formed by each first word segmentation corresponding to the historical search text information, and determine a first sequence feature corresponding to the first word segmentation sequence according to a semantic feature of each first word segmentation; determine a second word segmentation sequence formed by each second word segmentation corresponding to each candidate recommended text information, and determine a second sequence feature corresponding to the second word segmentation sequence according to the semantic feature of each second word segmentation; perform feature interaction on the second sequence feature corresponding to each candidate recommended text information and the first sequence feature corresponding to the historical search text information to obtain the correlation between the candidate recommended text information and the historical search text information; and determine the interest tendency of the user for each candidate recommended text information based on the correlation between each candidate recommended text information and the historical search text information.
In a possible implementation manner, the tendency determining sub-module is specifically configured to, for each candidate recommended text information, take the matrix whose elements are the correlations between the candidate recommended text information and the historical search text information as the matrix corresponding to the candidate recommended text information; and input the matrix corresponding to each candidate recommended text information into a pre-trained tendency degree determination model to obtain the interest tendency of the user for the candidate recommended text information.
In another possible implementation manner, the tendency determining sub-module is specifically configured to acquire the search time at which the user searched each historical search text information; and, for each candidate recommended text information, perform a weighted summation of the correlations between the candidate recommended text information and each historical search text information according to the search time, and determine the resulting sum as the interest tendency of the user for the candidate recommended text information.
In an embodiment of the apparatus of the present invention, the splitting module 602 is specifically configured to determine whether a semantic feature corresponding to the historical search text information can be acquired from a preset semantic feature library; the semantic feature library is used for storing semantic features of text information; and if not, splitting the historical search text information according to the splitting rule of the minimum split phrase.
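The lookup-then-split behavior of the splitting module can be sketched as follows. The semantic feature library is modeled as a dictionary from text to feature vector, and whitespace splitting stands in for the "minimum split phrase" rule, which the embodiment does not spell out here; both are assumptions, as are the function names. Chinese text would need a real word segmenter in place of `str.split`.

```python
from typing import Callable, Dict, List

FeatureLibrary = Dict[str, List[float]]


def lookup_or_split(text: str, library: FeatureLibrary,
                    split: Callable[[str], List[str]] = str.split) -> List[str]:
    """Keep the text whole if the library has its feature, else split it."""
    if text in library:
        return [text]
    return split(text)


def participle_features(text: str, library: FeatureLibrary) -> Dict[str, List[float]]:
    """Collect the semantic features of the participles found in the library."""
    return {p: library[p] for p in lookup_or_split(text, library) if p in library}
```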
The device provided by the embodiment of the invention solves the prior-art problem that historical search text information for which no corresponding semantic feature can be found cannot be used to determine recommendation information: such historical search text information is segmented into participles, and the semantic features of the participles are obtained for use in determining recommendation information. Compared with the prior art, the device can therefore also use historical search text information whose corresponding semantic features cannot be found, determines recommendation information for the user from a more comprehensive set of search terms, and improves the accuracy of the determined correlation between the candidate recommended text information and the user. By splitting the text information into participles and using the semantic features corresponding to those participles to determine recommendation information, the device improves the effect of recommending information to the user.
An embodiment of the present invention further provides an electronic device, as shown in fig. 7, including a processor 701, a communication interface 702, a memory 703 and a communication bus 704, where the processor 701, the communication interface 702, and the memory 703 communicate with one another through the communication bus 704;
the memory 703 is configured to store a computer program;
the processor 701 is configured to implement the recommendation information determining method according to any of the above embodiments when executing the program stored in the memory 703.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In still another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the recommendation information determination method described in any of the above embodiments.
In yet another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the recommendation information determination method of any of the above embodiments.
The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus, the electronic device and the storage medium, since they are substantially similar to the method embodiments, the description is relatively simple, and the relevant points can be referred to the partial description of the method embodiments.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (12)

1. A method for determining recommendation information, comprising:
acquiring historical search text information searched by a user;
splitting the historical search text information to obtain at least one first word segmentation corresponding to the historical search text information; obtaining semantic features of each first participle;
determining the interest tendency of the user to each candidate recommended text information according to the semantic features of each first word segmentation information corresponding to the historical search text information and the semantic features of each second word segmentation information corresponding to each candidate recommended text information; the second word segmentation is obtained by splitting the candidate recommended text information;
and determining at least one piece of target recommendation information based on the magnitude of the interest tendency of the user to the candidate recommendation text information.
2. The method according to claim 1, wherein the determining, according to the semantic features of the respective first segmentation information corresponding to the historical search text information and the semantic features of the respective second segmentation information corresponding to the respective candidate recommended text information, the user's interest tendency of the respective candidate recommended text information comprises:
performing feature interaction on the semantic features of the first participles and the semantic features of the second participles to obtain the correlation degree between the first participles and the second participles;
for each candidate recommended text message, calculating the correlation degree between the candidate recommended text message and the historical search text message based on the correlation degree between each second word corresponding to the candidate recommended text message and each first word corresponding to the historical search text message;
and determining the interest tendency of the user to each candidate recommended text information based on the correlation degree between each candidate recommended text information and the historical search text information.
3. The method of claim 2, wherein the performing feature interaction on the semantic features of each first participle and the semantic features of each second participle to obtain the correlation between each first participle and each second participle comprises:
and calculating cosine similarity between the semantic features of the first participle and the semantic features of the second participle as the correlation degree between the first participle and the second participle for any one of the first participles and any one of the second participles.
4. The method according to claim 2, wherein the calculating the correlation between the candidate recommended text information and the historical search text information based on the correlation between each second word corresponding to the candidate recommended text information and each first word corresponding to the historical search text information comprises:
determining a matrix formed by taking the correlation degree between each second word corresponding to the candidate recommended text information and each first word corresponding to the historical search text information as an element;
and inputting the matrix into a pre-trained correlation degree determination model to obtain the correlation degree between the candidate recommended text information and the historical search text information.
5. The method according to claim 1, wherein the determining, according to the semantic features of the respective first segmentation information corresponding to the historical search text information and the semantic features of the respective second segmentation information corresponding to the respective candidate recommended text information, the user's interest tendency of the respective candidate recommended text information comprises:
determining semantic features corresponding to the historical search text information according to the semantic features of each first word segmentation information corresponding to the historical search text information;
for each candidate recommended text message, determining semantic features corresponding to the candidate recommended text message according to the semantic features of each second participle message corresponding to the candidate recommended text message;
performing feature interaction on semantic features corresponding to each candidate recommended text information and semantic features corresponding to the historical search text information to obtain the correlation degree between the candidate recommended text information and the historical search text information;
and determining the interest tendency of the user to each candidate recommended text information based on the correlation degree between each candidate recommended text information and the historical search text information.
6. The method according to claim 1, wherein the determining, according to the semantic features of the respective first segmentation information corresponding to the historical search text information and the semantic features of the respective second segmentation information corresponding to the respective candidate recommended text information, the user's interest tendency of the respective candidate recommended text information comprises:
determining a first word segmentation sequence formed by each first word segmentation corresponding to the historical search text information, and determining a first sequence characteristic corresponding to the first word segmentation sequence according to the semantic characteristic of each first word segmentation;
determining a second word segmentation sequence formed by each second word segmentation corresponding to each candidate recommended text message, and determining a second sequence characteristic corresponding to the second word segmentation sequence according to the semantic characteristic of each second word segmentation;
performing feature interaction on a second sequence feature corresponding to each candidate recommended text information and a first sequence feature corresponding to the historical search text information to obtain the correlation degree between the candidate recommended text information and the historical search text information;
and determining the interest tendency of the user to each candidate recommended text information based on the correlation degree between each candidate recommended text information and the historical search text information.
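Claim 6 differs from claim 5 in that the feature depends on the order of the word segmentations. A minimal sketch of one order-sensitive encoding follows. The recurrent update rule, the `decay` parameter and the function name `sequence_feature` are assumptions for illustration; the claim does not state which sequence model is used.

```python
def sequence_feature(word_vecs, decay=0.5):
    """Fold word vectors into one vector, in order.
    Hypothetical update: h_t = decay * h_(t-1) + (1 - decay) * x_t,
    so later words contribute more and reordering changes the result."""
    state = [0.0] * len(word_vecs[0])
    for vec in word_vecs:
        state = [decay * s + (1.0 - decay) * x for s, x in zip(state, vec)]
    return state
```

Unlike a plain average, this encoding gives different features for the same words in a different order, which is the property a sequence feature needs. The resulting first and second sequence features could then be compared with any similarity measure to obtain the correlation degree.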
7. The method according to claim 2, 5 or 6, wherein the determining the interest tendency of the user for the candidate recommended text information based on the correlation between the candidate recommended text information and the historical search text information comprises:
for each candidate recommended text information, forming a matrix whose elements are the correlation degrees between the candidate recommended text information and the historical search text information, and taking that matrix as the matrix corresponding to the candidate recommended text information;
and inputting the matrix corresponding to each candidate recommended text information into a pre-trained tendency degree determining model to obtain the interest tendency of the user to the candidate recommended text information.
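The matrix construction in claim 7 can be sketched as follows. The claim does not state the layout of the matrix or the architecture of the pre-trained model, so two things here are assumptions: the matrix holds pairwise cosine similarities between history-side and candidate-side feature vectors, and `placeholder_model` is a hypothetical stand-in (mean of the elements passed through a sigmoid), not the trained model itself.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def correlation_matrix(history_vecs, candidate_vecs):
    """Rows follow the history-side vectors, columns the candidate-side vectors
    (an assumed layout)."""
    return [[cosine(h, c) for c in candidate_vecs] for h in history_vecs]

def placeholder_model(matrix, weight=1.0, bias=0.0):
    """Stand-in for the pre-trained tendency degree determining model:
    maps the matrix to a score in (0, 1)."""
    flat = [x for row in matrix for x in row]
    mean = sum(flat) / len(flat)
    return 1.0 / (1.0 + math.exp(-(weight * mean + bias)))
```

In a real system the placeholder would be replaced by whatever model was trained on labelled interaction data; the sketch only shows the data flow from correlation degrees to a single interest tendency value.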
8. The method according to claim 2, 5 or 6, wherein the determining the interest tendency of the user for the candidate recommended text information based on the correlation between the candidate recommended text information and the historical search text information comprises:
acquiring the search time at which the user searched for each piece of historical search text information;
and for each candidate recommended text information, carrying out weighted summation on the correlation degree between the candidate recommended text information and each historical search text information according to the search time, and determining the obtained sum value as the interest tendency of the user on the candidate recommended text information.
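The time-weighted summation of claim 8 can be sketched as below. The claim only says that the weights depend on the search time; the exponential half-life decay, the normalization of the weights to sum to 1, and the names `time_weights`, `interest_tendency` and `half_life_s` are assumptions for illustration.

```python
def time_weights(search_times, now, half_life_s=86400.0):
    """Weight each historical search by recency (timestamps in seconds).
    A search that is one half-life old gets half the raw weight of a fresh one."""
    raw = [0.5 ** ((now - t) / half_life_s) for t in search_times]
    total = sum(raw)
    return [w / total for w in raw]

def interest_tendency(correlations, search_times, now, half_life_s=86400.0):
    """Weighted sum of one candidate's correlation degrees with each
    historical search text; the sum is the interest tendency."""
    weights = time_weights(search_times, now, half_life_s)
    return sum(w * c for w, c in zip(weights, correlations))
```

For example, with a correlation degree of 1.0 against a search made just now and 0.0 against a search made one half-life ago, the raw weights are 1 and 0.5, the normalized weights are 2/3 and 1/3, and the interest tendency is 2/3.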
9. The method of claim 1, wherein the splitting the historical search textual information comprises:
determining whether semantic features corresponding to the historical search text information can be acquired from a preset semantic feature library; the semantic feature library is used for storing semantic features of text information;
and if not, splitting the historical search text information according to the splitting rule of the minimum split phrase.
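The lookup-then-split logic of claim 9 can be sketched as follows. The semantic feature library is modelled as a plain dictionary and the splitting rule as whitespace splitting; both are assumptions, as is the name `get_or_split`. Using the stored feature directly on a library hit is implied rather than stated by the claim.

```python
def get_or_split(query, feature_library, split_fn=str.split):
    """Return ("cached", feature) when the library already holds a semantic
    feature for the whole query; otherwise return ("split", segments).
    split_fn is a placeholder for the minimum-split-phrase rule."""
    feature = feature_library.get(query)
    if feature is not None:
        return "cached", feature
    return "split", split_fn(query)

# Usage with a hypothetical library holding one precomputed feature.
library = {"space drama": [0.1, 0.9]}
hit = get_or_split("space drama", library)
miss = get_or_split("new cooking show", library)
```

Checking the library first avoids splitting texts whose features are already known; only unseen texts fall through to word-level processing.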
10. A recommendation information determining apparatus, characterized by comprising:
the information acquisition module is used for acquiring historical search text information searched by a user;
the splitting module is used for splitting the historical search text information to obtain at least one piece of first word segmentation information corresponding to the historical search text information, and for acquiring the semantic features of each piece of first word segmentation information;
the interest tendency determining module is used for determining the interest tendency of the user to each candidate recommended text information according to the semantic features of each piece of first word segmentation information corresponding to the historical search text information and the semantic features of each piece of second word segmentation information corresponding to each candidate recommended text information; the second word segmentation information is obtained by splitting the candidate recommended text information;
and the recommendation information determining module is used for determining at least one piece of target recommendation information based on the interest tendency of the user to each candidate recommendation text information.
11. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 1-9 when executing a program stored in the memory.
12. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of the claims 1-9.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111335048.4A CN113987159A (en) 2021-11-11 2021-11-11 Recommendation information determining method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111335048.4A CN113987159A (en) 2021-11-11 2021-11-11 Recommendation information determining method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113987159A true CN113987159A (en) 2022-01-28

Family

ID=79748048

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111335048.4A Pending CN113987159A (en) 2021-11-11 2021-11-11 Recommendation information determining method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113987159A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116204688A (en) * 2023-05-04 2023-06-02 量子数科科技有限公司 Method for recommending user search terms based on typing search terms
CN116204688B (en) * 2023-05-04 2023-06-30 量子数科科技有限公司 Method for recommending user search terms based on typing search terms

Similar Documents

Publication Publication Date Title
CN110929206B (en) Click rate estimation method and device, computer readable storage medium and equipment
CN110321422B (en) Method for training model on line, pushing method, device and equipment
CN108829822B (en) Media content recommendation method and device, storage medium and electronic device
CN111984689B (en) Information retrieval method, device, equipment and storage medium
CN108304512B (en) Video search engine coarse sorting method and device and electronic equipment
CN109753601B (en) Method and device for determining click rate of recommended information and electronic equipment
CN111324771B (en) Video tag determination method and device, electronic equipment and storage medium
CN110543598A (en) information recommendation method and device and terminal
CN109582847B (en) Information processing method and device and storage medium
CN110321437B (en) Corpus data processing method and device, electronic equipment and medium
CN112464100B (en) Information recommendation model training method, information recommendation method, device and equipment
CN112862567B (en) Method and system for recommending exhibits in online exhibition
CN111400586A (en) Group display method, terminal, server, system and storage medium
CN111159563A (en) Method, device and equipment for determining user interest point information and storage medium
CN113342958B (en) Question-answer matching method, text matching model training method and related equipment
CN111107416A (en) Bullet screen shielding method and device and electronic equipment
CN112328889A (en) Method and device for determining recommended search terms, readable medium and electronic equipment
CN115374362A (en) Multi-way recall model training method, multi-way recall device and electronic equipment
CN115687690A (en) Video recommendation method and device, electronic equipment and storage medium
CN112989118B (en) Video recall method and device
CN113987159A (en) Recommendation information determining method and device, electronic equipment and storage medium
CN114222000A (en) Information pushing method and device, computer equipment and storage medium
CN108563648B (en) Data display method and device, storage medium and electronic device
CN110188277B (en) Resource recommendation method and device
CN111324733A (en) Content recommendation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination