WO2021027149A1

WO2021027149A1 - Portrait similarity-based information retrieval recommendation method and device and storage medium

Info

Publication number: WO2021027149A1
Application number: PCT/CN2019/117794
Authority: WO
Inventors: 刘利
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-08-14
Filing date: 2019-11-13
Publication date: 2021-02-18
Also published as: CN110598123A; CN110598123B

Abstract

A portrait similarity-based information retrieval recommendation method, a device, a system and a storage medium, the method comprising: obtaining user portraits of different users, and determining the user portrait similarity between the user portraits (S11); creating a dynamic user community on the basis of the user portrait similarity, and enabling users having similar portraits to be grouped into the same dynamic user community (S12); and performing information retrieval recommendation on the user according to the dynamic user community and a query statement of the user (S13). By calculating the similarity between the user portraits, the similarity between different users may be obtained, and personalized information retrieval and recommendation may be achieved.

Description

Information retrieval recommendation method, device and storage medium based on portrait similarity

This application claims priority to patent application No. 201910748591.3, filed on August 14, 2019 and entitled "Method, device and storage medium for information retrieval and recommendation based on portrait similarity".

Technical field

This application relates to the field of data analysis technology, and in particular to an information retrieval recommendation method, device, system and computer-readable storage medium based on the similarity of user portraits.

Background technique

Collaborative Information Retrieval (CIR) is an information retrieval method based on social relations. A CIR system can analyze users' interaction history to respond more effectively to subsequent user queries. However, when two users send the same query to the CIR system at the same time, their goals and behavior characteristics may differ, so they may be interested in two different document lists. In this case, CIR faces the problem of personalized query recommendation.

At present, information retrieval is the main way for users to query and obtain information. Information storage is the basis for information retrieval; the information to be stored includes original document data, pictures, videos, and audio. To enable retrieval, the original information must be converted into a machine-readable form and stored in a database; otherwise it cannot be processed by machine. After the user enters a query request according to their intention, the retrieval system searches the database for information related to the query, calculates the similarity between that information and the query through a matching mechanism, and outputs the information in descending order of similarity.

The inventor realizes that the existing information retrieval methods are either relatively complicated, or have poor retrieval accuracy and insufficient personalization, resulting in poor recommendation effects and poor user experience.

Summary of the invention

This application provides a method, electronic device, system and computer-readable storage medium for information retrieval and recommendation based on the similarity of user portraits. Its main purpose is to obtain the similarity between user portraits, and hence the similarity between different users, through the maximum matching of a weighted bipartite graph. The method can dynamically build user communities in a collaborative information retrieval environment and apply them to personalized information retrieval, improving retrieval accuracy and optimizing user experience.

To achieve the above objective, this application provides an information retrieval recommendation method based on the similarity of user portraits, which is applied to an electronic device, and the method includes:

Acquire user portraits of different users, and determine the similarity of user portraits between user portraits;

Create user dynamic communities based on the similarity of user portraits, so that users with similar portraits belong to the same user dynamic community;

Perform information retrieval recommendation for the user according to the user's dynamic community and the user's query statement.

In order to achieve the above objective, the present application also provides an electronic device comprising a memory and a processor, wherein the memory includes an information retrieval recommendation program based on the similarity of user portraits, and the program, when executed by the processor, implements the steps of the information retrieval recommendation method described above.

In order to achieve the above objective, this application also provides an information retrieval recommendation system based on the similarity of portraits, including:

The user portrait similarity determination unit is used to obtain user portraits of different users and determine the user portrait similarity between user portraits;

The dynamic community creation unit is used to create user dynamic communities based on the similarity of user portraits, so that users with similar portraits belong to the same user dynamic community;

The search recommendation unit is used to perform information search and recommendation for users according to the user's dynamic community and the user's query sentence.

In addition, in order to achieve the above objective, the present application also provides a computer-readable storage medium. The computer-readable storage medium includes an information retrieval recommendation program based on the similarity of user portraits which, when executed by a processor, implements the steps of the above information retrieval recommendation method based on the similarity of user portraits.

The method, device, system and computer-readable storage medium for information retrieval and recommendation based on the similarity of user portraits proposed in this application construct a weighted bipartite graph based on user portraits and obtain the maximum weighted matching value between user portraits by using the maximum matching of the weighted bipartite graph. In a collaborative information retrieval environment, a user community can thus be constructed dynamically based on the similarity of user portraits, and personalized information retrieval recommendations can be made based on the user community, which improves retrieval accuracy, optimizes user experience, and achieves personalized recommendation.

In order to achieve the above and related objects, one or more aspects of the present application include features that will be described in detail later. The following description and drawings illustrate certain exemplary aspects of the present application in detail. However, these aspects indicate only some of the various ways in which the principles of this application can be used. Furthermore, this application is intended to include all these aspects and their equivalents.

Description of the drawings

FIG. 1 is a schematic diagram of an application environment of a preferred embodiment of an information retrieval recommendation method based on the similarity of user portraits according to the present application;

FIG. 2 is a schematic diagram of modules of a preferred embodiment of an information retrieval recommendation system based on the similarity of user portraits according to the present application;

FIG. 3 is a flowchart of a preferred embodiment of an information retrieval recommendation method based on the similarity of user portraits according to the present application;

FIG. 4 is a flowchart of a method for calculating the similarity of user portraits based on graph algorithms;

FIG. 5 is a bipartite graph constructed based on user portraits of two different users.

The realization, functional characteristics, and advantages of the purpose of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed description

It should be understood that the specific embodiments described here are only used to explain the application, and are not used to limit the application.

This application provides an information retrieval and recommendation method based on the similarity of user portraits, which is applied to an electronic device 1. Referring to FIG. 1, it is a schematic diagram of the application environment of the preferred embodiment of the information retrieval recommendation method based on the similarity of user portraits of this application.

In this embodiment, the electronic device 1 may be a terminal device with arithmetic function, such as a server, a smart phone, a tablet computer, a portable computer, a desktop computer, and the like.

The electronic device 1 includes a processor 12, a memory 11, a network interface 14 and a communication bus 15.

The memory 11 includes at least one type of readable storage medium, which may be a non-volatile storage medium such as flash memory, a hard disk, a multimedia card, card-type memory, and the like. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1. In other embodiments, the readable storage medium may also be an external memory of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the electronic device 1.

In this embodiment, the readable storage medium of the memory 11 is generally used to store the information retrieval recommendation program 10 based on the similarity of user portraits installed in the electronic device 1 and the like. The memory 11 can also be used to temporarily store data that has been output or will be output.

In some embodiments, the processor 12 may be a central processing unit (CPU), a microprocessor or another data processing chip, which is used to run program code or process data stored in the memory 11, for example, to execute the information retrieval recommendation program 10 based on the similarity of user portraits.

The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is generally used to establish a communication connection between the electronic device 1 and other electronic devices.

The communication bus 15 is used to realize the connection and communication between these components.

FIG. 1 only shows the electronic device 1 with the components 11-15, but it should be understood that it is not required to implement all the illustrated components, and more or fewer components may be implemented instead.

Optionally, the electronic device 1 may also include a user interface. The user interface may include an input unit such as a keyboard, a voice input device such as a microphone or another device with a voice recognition function, and a voice output device such as a speaker or earphones. Optionally, the user interface may also include a standard wired interface and a wireless interface.

Optionally, the electronic device 1 may also include a display, and the display may also be called a display screen or a display unit. In some embodiments, it may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, and an organic light-emitting diode (Organic Light-Emitting Diode, OLED) touch device. The display is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.

Optionally, the electronic device 1 further includes a touch sensor. The area provided by the touch sensor for the user to perform a touch operation is called a touch area. In addition, the touch sensor described here may be a resistive touch sensor, a capacitive touch sensor, or the like. Moreover, the touch sensor includes not only a contact type touch sensor, but also a proximity type touch sensor and the like. In addition, the touch sensor may be a single sensor, or may be, for example, a plurality of sensors arranged in an array.

In addition, the area of the display of the electronic device 1 may be the same as or different from the area of the touch sensor. Optionally, the display and the touch sensor are stacked to form a touch display screen. The device detects the touch operation triggered by the user based on the touch screen.

Optionally, the electronic device 1 may also include a radio frequency (RF) circuit, a sensor, an audio circuit, etc., which will not be repeated here.

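In the device embodiment shown in FIG. 1, the memory 11, as a computer storage medium, may include an operating system and the information retrieval recommendation program 10 based on the similarity of user portraits. When the processor 12 executes the program 10 stored in the memory 11, the following steps are implemented: obtaining user portraits of different users and determining the user portrait similarity between the user portraits; creating user dynamic communities based on the user portrait similarity, so that users with similar portraits belong to the same user dynamic community; and performing information retrieval recommendation for the user according to the user's dynamic community and the user's query statement.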

In the above steps, the user portrait similarity between the user portraits of different users is determined by a user portrait similarity calculation method based on a graph algorithm.

Specifically, the user portrait similarity calculation method based on the graph algorithm includes the following steps:

Store the user portrait P as a set of pairs (q, $D_q$), where q represents any query record of the user, and $D_q$ represents all documents related to the query record q;

Construct a weighted bipartite graph based on the user portrait P(X) and the user portrait P(Y) to be processed, where P(X) is the user portrait of user X, P(Y) is the user portrait of user Y, and each vertex e of P(X) is connected to a vertex é of P(Y) through an edge (e, é);

Obtain the similarity between the vertex e of the user portrait P(X) and the vertex é of the user portrait P(Y) based on the weighted bipartite graph;

Determine the weight of the edge (e, é) according to the similarity between the vertex e of P(X) and the vertex é of P(Y);

Obtain the maximum weighted matching value between the user portrait P(X) and the user portrait P(Y) based on the weight of the edge (e, é);

The user portrait similarity of user X and user Y is obtained according to the maximum weighted matching value.

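Preferably, the user portrait P(X) of user X is stored as:

$P(X) = \{(q_1^X, D_{q_1}^X), (q_2^X, D_{q_2}^X), \ldots, (q_{n_X}^X, D_{q_{n_X}}^X)\}$

The user portrait P(Y) of user Y is stored as:

$P(Y) = \{(q_1^Y, D_{q_1}^Y), (q_2^Y, D_{q_2}^Y), \ldots, (q_{n_Y}^Y, D_{q_{n_Y}}^Y)\}$

where $q_i^X$ represents the i-th query of user X and $D_{q_i}^X$ represents all documents related to the query $q_i^X$; $q_j^Y$ represents the j-th query of user Y and $D_{q_j}^Y$ represents all documents related to the query $q_j^Y$.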
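As a non-limiting illustration, the storage of a user portrait as a collection of (query, related documents) pairs may be sketched in Python as follows. The class name `UserPortrait`, its fields, and the use of document identifiers in place of document contents are hypothetical choices made for this sketch and are not specified by this application.

```python
from dataclasses import dataclass, field


@dataclass
class UserPortrait:
    """A user portrait P: a collection of (q, D_q) pairs for one user."""
    user_id: str
    entries: list = field(default_factory=list)  # list of (q, D_q) tuples

    def add(self, query, documents):
        # Store one element (q, D_q); D_q is kept as a frozenset of document IDs.
        self.entries.append((query, frozenset(documents)))

    def all_documents(self):
        # Union of every D_q stored in this portrait.
        docs = set()
        for _, d_q in self.entries:
            docs |= d_q
        return docs


# Hypothetical portrait of user X with two query records.
p_x = UserPortrait("X")
p_x.add("python sorting", ["doc1", "doc2"])
p_x.add("merge sort", ["doc2", "doc3"])
print(len(p_x.entries), sorted(p_x.all_documents()))
```

Each entry of `entries` corresponds to one element of the portrait, that is, one vertex of the weighted bipartite graph built in the later steps.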

Preferably, the vertex e of the user portrait P(X) includes a corresponding first query element and a first document element, and the vertex é of the user portrait P(Y) includes a corresponding second query element and a second document element;

The process of obtaining the similarity between the vertex e of the user portrait P(X) and the vertex é of the user portrait P(Y) includes:

Acquiring the first similarity between the first query element and the second query element, and acquiring the second similarity between the first document element and the second document element;

The similarity between the vertex e and the vertex é is determined based on the first similarity and the second similarity.

Preferably, the first similarity between the first query element and the second query element is obtained through edit distance algorithm, Jaccard coefficient algorithm, TF algorithm, TFIDF algorithm, or Word2Vec algorithm;

The second similarity between the first document element and the second document element is obtained by the TFIDF algorithm or the space vector-based cosine algorithm.
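This application states that the vertex similarity is determined from the first (query) similarity and the second (document) similarity, but does not fix how the two are combined. The following Python sketch assumes, purely for illustration, a linear combination with a hypothetical mixing parameter `alpha`, and uses a simple set-overlap (Jaccard) measure as a stand-in for both similarity functions; any of the algorithms listed above could be plugged in instead.

```python
def set_jaccard(a, b):
    # Size of the intersection divided by size of the union; 1.0 for two empty sets.
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def vertex_similarity(e, e_prime, query_sim, doc_sim, alpha=0.5):
    """Combine the first (query) and second (document) similarity of two vertices.

    e and e_prime are (query, documents) pairs. The linear combination with
    weight alpha is an assumption of this sketch, not the application's formula.
    """
    q1, d1 = e
    q2, d2 = e_prime
    first = query_sim(q1, q2)   # similarity between the query elements
    second = doc_sim(d1, d2)    # similarity between the document elements
    return alpha * first + (1 - alpha) * second


# Hypothetical vertices: queries compared on word sets, documents on ID sets.
e = ("cheap flights paris", {"d1", "d2"})
e_prime = ("cheap flights rome", {"d2", "d3"})
sim = vertex_similarity(
    e, e_prime,
    query_sim=lambda a, b: set_jaccard(a.split(), b.split()),
    doc_sim=set_jaccard,
)
print(round(sim, 4))
```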

Preferably, the user portrait P(X) of the user X includes elements A, B, C, D, and E, wherein the elements A, B, C, D, and E include the first query element and the first document element;

The user portrait P(Y) of user Y includes elements 1, 2, 3, 4, and 5, where the elements 1, 2, 3, 4, and 5 include the second query element and the second document element;

Step 1: Obtain all weighted matching values of the weighted bipartite graph by the following formulas:

$M_1 = w(A,1) + w(B,3) + w(C,2) + w(D,4) + w(E,5)$

$M_2 = w(A,1) + w(B,3) + w(C,5) + w(D,4) + w(E,2)$

$M_3 = w(A,1) + w(B,4) + w(C,2) + w(D,3) + w(E,5)$

$M_4 = w(A,1) + w(B,4) + w(C,5) + w(D,3) + w(E,2)$

where w(i, j) represents the similarity between element i and element j, that is, the weight of edge ij;

Step 2: Determine the maximum weighted matching value from all weighted matching values.
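Steps 1 and 2 above can be illustrated by a brute-force Python sketch that enumerates every complete matching, sums its edge weights, and keeps the largest sum. The sketch assumes a square weight matrix in which every pair of elements has a weight (a missing edge could be given weight 0); the function name and the example weights are hypothetical. Enumeration grows factorially with the number of elements, so it is only suitable for small portraits such as the five-element example.

```python
from itertools import permutations


def max_weighted_matching(weights):
    """Return (best_value, assignment) over all complete matchings.

    weights[i][j] is w(i, j): the weight of the edge between the i-th element
    of P(X) and the j-th element of P(Y). The matrix is assumed to be square.
    """
    n = len(weights)
    best_value, best_assignment = float("-inf"), None
    for perm in permutations(range(n)):
        # perm[i] is the element of P(Y) matched to element i of P(X).
        value = sum(weights[i][perm[i]] for i in range(n))
        if value > best_value:
            best_value, best_assignment = value, perm
    return best_value, best_assignment


# Hypothetical 3x3 weight matrix (values chosen for illustration only).
w = [
    [0.5, 0.25, 0.0],
    [0.25, 0.5, 0.125],
    [0.0, 0.125, 0.5],
]
value, assignment = max_weighted_matching(w)
print(value, assignment)  # 1.5 (0, 1, 2)
```

For larger portraits, a polynomial-time assignment algorithm such as the Hungarian method would be a natural substitute for the enumeration.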

After the user portrait similarity of different users is obtained, a user community can be created based on the user portrait similarity between user X and user Y, and the user's query results can be ranked and recommended according to the created user community.

As a specific example, assuming that the sentence that user U needs to query is q, querying based on the user portrait similarity includes the following steps:

Step 1: Find the set A of historical query records similar to query q.

Let $A = \{(U_1, q_1, D_{q_1}), (U_2, q_2, D_{q_2}), \ldots, (U_m, q_m, D_{q_m})\}$, where

$s(q, q_i) > \theta$ and $s(P(U), P(U_i)) > \omega$, for $1 \le i \le m$

Here, $U_i$ represents a user, $q_i$ is a query of user $U_i$, $D_{q_i}$ is the set of all documents related to the query $q_i$, P(U) is the user portrait of user U, and $P(U_i)$ is the user portrait of user $U_i$; $s(P(U), P(U_i))$ is the user portrait similarity between user U and user $U_i$, and $s(q, q_i)$ is the similarity between sentence q and sentence $q_i$; θ and ω are the corresponding similarity thresholds. Both similarities can be obtained by the user portrait similarity calculation method based on the graph algorithm.

Step 2: Calculate the set of all documents related to query q.

$D_q = D_{q_1} \cup D_{q_2} \cup \ldots \cup D_{q_m}$

Next, for each document d in the corpus that satisfies $d \in D_q$, a score R(U,d,q) is calculated; for each d that does not belong to $D_q$, R(U,d,q)=0 by default;

Step 3: For each document d in the corpus, calculate the similarity between d and q to obtain the similarity r(d, q);

Step 4: Calculate the final ranking score of each document in the corpus:

$R_{\text{final}}(U,d,q) = a \cdot r(d,q) + b \cdot R(U,d,q)$

where a and b are preset coefficients.

Step 5: Sort the documents according to their final ranking scores to construct an output list, and return the output list as the result of the query q of user U.
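Steps 1 to 5 above may be sketched in Python as follows. The sketch is illustrative only: the similarity functions, the document relevance function r(d, q), and in particular the score R(U, d, q) for documents in D_q are passed in as hypothetical callables, because their concrete formulas are not given in this text; the default thresholds and coefficients are likewise assumptions.

```python
def recommend(user, q, history, corpus, query_sim, portrait_sim,
              relevance, community_score,
              theta=0.5, omega=0.5, a=0.5, b=0.5):
    """Rank corpus documents for query q of `user`, following Steps 1 to 5."""
    # Step 1: keep historical records (U_i, q_i, D_qi) whose query and whose
    # user portrait are both sufficiently similar.
    similar = [(u_i, q_i, d_qi) for (u_i, q_i, d_qi) in history
               if query_sim(q, q_i) > theta and portrait_sim(user, u_i) > omega]
    # Step 2: D_q is the union of the document sets of the retained records.
    d_q = set()
    for _, _, d_qi in similar:
        d_q |= set(d_qi)
    scored = []
    for d in corpus:
        # R(U, d, q) is supplied by the caller for d in D_q and is 0 otherwise.
        big_r = community_score(user, d, q, similar) if d in d_q else 0.0
        # Step 3: similarity r(d, q) between the document and the query.
        r = relevance(d, q)
        # Step 4: R_final = a * r(d, q) + b * R(U, d, q).
        scored.append((a * r + b * big_r, d))
    # Step 5: sort by final score, highest first, to build the output list.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored]


# Toy data; every value below is made up for illustration.
history = [("U1", "cheap flights", {"d1", "d2"}), ("U2", "hotel deals", {"d3"})]
ranking = recommend(
    "U", "cheap flights", history, ["d1", "d2", "d3"],
    query_sim=lambda q1, q2: 1.0 if q1 == q2 else 0.0,
    portrait_sim=lambda u, v: 0.9,
    relevance=lambda d, q: {"d1": 0.2, "d2": 0.6, "d3": 0.9}[d],
    community_score=lambda u, d, q, recs: 1.0,
)
print(ranking)  # ['d2', 'd1', 'd3']
```

In this toy run, d3 has the highest plain relevance but falls to last place because it was not retrieved by any similar user for a similar query.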

The electronic device 1 proposed in the above embodiment obtains the similarity between user portraits through the maximum matching of the weighted bipartite graph. In a collaborative information retrieval environment, it can dynamically construct a user community based on the similarity of user portraits and make personalized information retrieval recommendations according to the user community, which improves retrieval accuracy, optimizes user experience, and achieves personalized recommendation.

Corresponding to the above electronic device, this application also provides an information retrieval recommendation system based on the similarity of user portraits. Referring to FIG. 2, it is a program module diagram of a preferred embodiment of the information retrieval recommendation system based on the similarity of user portraits in the embodiment of this application. The information retrieval recommendation system based on the similarity of user portraits can be divided into:

The user portrait similarity determination unit 110 is configured to obtain user portraits of different users and determine the user portrait similarity between the user portraits;

The dynamic community creation unit 120 is configured to create a user dynamic community based on the similarity of user portraits, so that users with similar portraits belong to the same user dynamic community;

The search recommendation unit 130 is configured to perform information search and recommendation for users based on the user's dynamic community and the user's query sentence.

The user portrait similarity determination unit 110 further includes:

The user portrait storage module 111 is configured to store the user portrait P as a set of pairs (q, $D_q$), where q represents any query record of the user, and $D_q$ represents all documents related to the query record q;

The weighted bipartite graph construction module 112 is configured to construct a weighted bipartite graph based on the user portrait P(X) and the user portrait P(Y) to be processed, where P(X) is the user portrait of user X, P(Y) is the user portrait of user Y, and each vertex e of P(X) is connected to a vertex é of P(Y) through an edge (e, é);

The similarity acquisition module 113 is configured to acquire the similarity between the vertex e of the user portrait P(X) and the vertex é of the user portrait P(Y) based on the weighted bipartite graph;

The weight determination module 114 is configured to determine the weight of the edge (e, é) according to the similarity between the vertex e of P(X) and the vertex é of P(Y);

The maximum weighted matching value obtaining module 115 is configured to obtain the maximum weighted matching value between the user portrait P(X) and the user portrait P(Y) based on the weight of the edge (e, é);

The user portrait similarity determination module 116 is configured to obtain the user portrait similarity of the user X and the user Y according to the maximum weighted matching value.

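Specifically, the user portrait P(X) of user X is stored as:

$P(X) = \{(q_1^X, D_{q_1}^X), (q_2^X, D_{q_2}^X), \ldots, (q_{n_X}^X, D_{q_{n_X}}^X)\}$

The user portrait P(Y) of user Y is stored as:

$P(Y) = \{(q_1^Y, D_{q_1}^Y), (q_2^Y, D_{q_2}^Y), \ldots, (q_{n_Y}^Y, D_{q_{n_Y}}^Y)\}$

where $q_i^X$ represents the i-th query of user X and $D_{q_i}^X$ represents all documents related to the query $q_i^X$; $q_j^Y$ represents the j-th query of user Y and $D_{q_j}^Y$ represents all documents related to the query $q_j^Y$.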

Wherein, the vertex e of the user portrait P(X) includes the corresponding first query element and the first document element, and the vertex é of the user portrait P(Y) includes the corresponding second query element and the second document element;

The similarity acquisition module 113 includes:

The query element and document element similarity acquisition module 1131, configured to acquire the first similarity between the first query element and the second query element, and to acquire the second similarity between the first document element and the second document element;

The inter-vertex similarity determination module 1132 is configured to determine the similarity between the vertex e and the vertex é based on the first similarity and the second similarity.

The query element and document element similarity acquisition module 1131 includes:

The first similarity acquisition module is used to acquire the first similarity between the first query element and the second query element through the edit distance algorithm, the Jaccard coefficient algorithm, the TF algorithm, the TFIDF algorithm, or the Word2Vec algorithm;

The second similarity acquisition module is configured to acquire the second similarity between the first document element and the second document element through the TFIDF algorithm or the space vector-based cosine algorithm.

The user portrait P(X) of user X includes elements A, B, C, D, and E, where elements A, B, C, D, and E include the first query element and the first document element;

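The user portrait P(Y) of user Y includes elements 1, 2, 3, 4, and 5, where the elements 1, 2, 3, 4, and 5 include the second query element and the second document element;

All weighted matching values of the weighted bipartite graph are obtained by the following formulas:

$M_1 = w(A,1) + w(B,3) + w(C,2) + w(D,4) + w(E,5)$

$M_2 = w(A,1) + w(B,3) + w(C,5) + w(D,4) + w(E,2)$

$M_3 = w(A,1) + w(B,4) + w(C,2) + w(D,3) + w(E,5)$

$M_4 = w(A,1) + w(B,4) + w(C,5) + w(D,3) + w(E,2)$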

In addition, this application also provides an information retrieval recommendation method based on the similarity of user portraits. Referring to FIG. 3, it is a flowchart of a preferred embodiment of an information retrieval recommendation method based on the similarity of user portraits according to this application. The method can be executed by a device, and the device can be implemented by software and/or hardware.

In this embodiment, the information retrieval recommendation method based on the similarity of user portraits includes the following steps:

Step S11: Obtain user portraits of different users, and determine the user portrait similarity between the user portraits.

Step S12: Create a user dynamic community based on the similarity of user portraits, so that users with similar portraits belong to the same user dynamic community.

Step S13: Perform information search and recommendation on the user according to the user's dynamic community and the user's query sentence.
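Step S12 requires only that users with similar portraits end up in the same dynamic community; the grouping rule itself is not specified here. One possible reading, sketched below in Python, links any two users whose portrait similarity exceeds a threshold and takes the connected groups as communities (a union-find structure). The function name, the threshold, and the example similarities are assumptions of this sketch; the communities can be recomputed whenever portraits change.

```python
def build_communities(users, similarity, threshold):
    """Group users so that any pair with similarity > threshold shares a community."""
    parent = {u: u for u in users}

    def find(u):
        # Follow parent links to the representative, compressing the path.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for i, u in enumerate(users):
        for v in users[i + 1:]:
            if similarity(u, v) > threshold:
                parent[find(u)] = find(v)  # merge the two groups

    groups = {}
    for u in users:
        groups.setdefault(find(u), set()).add(u)
    return list(groups.values())


# Hypothetical pairwise portrait similarities between three users.
sims = {
    frozenset({"a", "b"}): 0.9,
    frozenset({"b", "c"}): 0.2,
    frozenset({"a", "c"}): 0.1,
}
communities = build_communities(
    ["a", "b", "c"], lambda u, v: sims[frozenset({u, v})], threshold=0.5)
print(sorted(sorted(c) for c in communities))  # [['a', 'b'], ['c']]
```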

As shown in the flowchart of the user portrait similarity calculation method based on the graph algorithm in FIG. 4, the above step S11 further includes the following steps:

Step S101: Store the user portrait P as a set of pairs (q, $D_q$), where q represents any query record of the user, and $D_q$ represents all documents related to the query record q.

User portraits are also known as user roles. As an effective tool for delineating target users and connecting user demands with design directions, user portraits have been widely used in various fields. In practice, the most plain, everyday words are often used to link users' attributes, behaviors and expectations. As virtual representatives of actual users, the user roles formed by user portraits are not constructed in isolation from the product and the market; they need to be representative of the main audience and target groups of the product.

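In this application, the user portrait P(X) of user X can be stored as:

$P(X) = \{(q_1^X, D_{q_1}^X), (q_2^X, D_{q_2}^X), \ldots, (q_{n_X}^X, D_{q_{n_X}}^X)\}$

The user portrait P(Y) of user Y can be stored as:

$P(Y) = \{(q_1^Y, D_{q_1}^Y), (q_2^Y, D_{q_2}^Y), \ldots, (q_{n_Y}^Y, D_{q_{n_Y}}^Y)\}$

where $q_i^X$ represents the i-th query of user X and $D_{q_i}^X$ represents all documents related to the query $q_i^X$; $q_j^Y$ represents the j-th query of user Y and $D_{q_j}^Y$ represents all documents related to the query $q_j^Y$.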

Therefore, calculating the User Profile Similarity (UPS) between user X and user Y amounts to calculating the similarity between the two sets P(X) and P(Y).

Step S102: Construct a weighted bipartite graph based on the user portrait P(X) and the user portrait P(Y) to be processed, where P(X) is the user portrait of user X, P(Y) is the user portrait of user Y, and each vertex e of P(X) is connected to a vertex é of P(Y) through an edge (e, é).

A bipartite graph (also called a bigraph) is a special model in graph theory. Let G=(V,E) be an undirected graph. If the vertex set V can be divided into two disjoint subsets (A, B) such that the two vertices i and j associated with each edge (i, j) in the graph belong to different subsets (i in A, j in B), then the graph G is called a bipartite graph.

A weighted bipartite graph G=(V=(P(X), P(Y)), E) is constructed based on the aforementioned user portrait P(X) and user portrait P(Y). The elements of P(X) form one part of the graph G, and the elements of P(Y) form the other part. Each vertex e of P(X) is connected to each vertex é of P(Y) by an edge (e, é). The weight of the edge (e, é) is equal to the similarity between the vertices (or elements) e and é, and it depends on the element type, which is either query or document.
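The construction described in this step can be sketched in Python as a dictionary of weighted edges, one for each pair consisting of a vertex of P(X) and a vertex of P(Y). The function name, the representation of a vertex as a (query, documents) pair, and the toy similarity used in the example are hypothetical; any vertex similarity function may be supplied.

```python
def build_bipartite_graph(p_x, p_y, vertex_sim):
    """Return {(i, j): weight} for every edge between P(X)[i] and P(Y)[j].

    p_x and p_y are lists of (query, documents) vertices; the weight of an
    edge equals the similarity between its two endpoint vertices.
    """
    return {
        (i, j): vertex_sim(e, e_prime)
        for i, e in enumerate(p_x)
        for j, e_prime in enumerate(p_y)
    }


# Toy portraits and a toy similarity (1.0 for identical queries, else 0.0).
p_x = [("a b", {"d1"}), ("c", {"d2"})]
p_y = [("a b", {"d1"}), ("x", {"d9"})]
edges = build_bipartite_graph(
    p_x, p_y, lambda e, f: 1.0 if e[0] == f[0] else 0.0)
print(edges)
```

Because every vertex of one part is connected to every vertex of the other, the graph has len(p_x) * len(p_y) edges.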

First, obtain the first similarity between the first query element and the second query element, and obtain the second similarity between the first document element and the second document element; then determine the similarity between vertex e and vertex é based on the first similarity and the second similarity.

Step S103: Obtain the similarity between the vertex e of the user portrait P(X) and the vertex é of the user portrait P(Y) based on the weighted bipartite graph.

Each vertex e of the user portrait P(X) includes a corresponding query element and document element, and each vertex é of the user portrait P(Y) likewise includes a corresponding query element and document element. To obtain the similarity between vertex e and vertex é, first obtain the similarity between the query elements of P(X) and P(Y) and the similarity between the document elements of P(X) and P(Y); based on these two similarities, the similarity between every pair of vertices e and é can be determined.

Specifically, calculating the similarity between the queries of the user portrait P(X) and the user portrait P(Y) amounts to calculating the similarity between query sentences. Current methods for obtaining the similarity of query sentences mainly include the edit distance algorithm, the Jaccard coefficient algorithm, the TF algorithm, the TFIDF algorithm, the Word2Vec algorithm, and so on.

Further, the edit distance, also called the Levenshtein distance, is the minimum number of edit operations required to convert one string into the other. The greater the distance, the more different the two strings are. The permitted edit operations are replacing one character with another, inserting a character, and deleting a character.

The Jaccard coefficient, also called the Jaccard index or Jaccard similarity coefficient, is used to compare the similarity and difference between finite sample sets. The larger the Jaccard coefficient, the higher the sample similarity. It is calculated by dividing the size of the intersection of the two sample sets by the size of their union. When the two sets are identical, the result is 1; when they have no elements in common, the result is 0.
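The two measures just described can be sketched in Python as follows. The edit distance uses the standard dynamic-programming recurrence over the three permitted operations. The function `edit_similarity`, which rescales the distance into the range 0 to 1, is an assumption of this sketch, since the text does not say how a distance is turned into a similarity.

```python
def edit_distance(s, t):
    """Levenshtein distance: minimum number of substitutions, insertions, deletions."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete cs
                            curr[j - 1] + 1,      # insert ct
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def edit_similarity(s, t):
    # Hypothetical normalisation: 1.0 for identical strings, 0.0 when every
    # position must be edited.
    longest = max(len(s), len(t))
    return 1.0 if longest == 0 else 1.0 - edit_distance(s, t) / longest


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


print(edit_distance("kitten", "sitting"))  # 3
print(jaccard({"a", "b"}, {"a", "b"}), jaccard({"a"}, {"b"}))  # 1.0 0.0
```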

In addition, the similarity calculation methods between the documents of the user profile P(X) and the user profile P(Y) mainly include the TFIDF algorithm and the cosine algorithm based on space vectors.

In other words, the first similarity between the first query element and the second query element is obtained by the edit distance algorithm, the Jaccard coefficient algorithm, the TF algorithm, the TFIDF algorithm, or the Word2Vec algorithm; the second similarity between the first document element and the second document element is obtained by the TFIDF algorithm or the cosine algorithm based on space vectors.
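As an illustration of the document-side similarity, the following Python sketch weights each document's terms by TF-IDF and compares two documents by the cosine of the angle between their weight vectors. Whitespace tokenisation and the smoothed IDF variant used here are assumptions of the sketch; the text does not prescribe a particular weighting formula.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dictionary per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n) / (1 + df[term])) + 1)  # smoothed IDF (assumed)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dictionaries."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy documents: the first two are identical, the third shares no terms.
vecs = tfidf_vectors(["a b c", "a b c", "x y z"])
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```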

Step S104: Determine the weight of the edge (e, é) according to the similarity between the vertex e of P(X) and the vertex é of P(Y).

Specifically, the weight of the edge (e, é) can be set equal to the similarity between the vertex e of P(X) and the vertex é of P(Y).

Step S105: Obtain the maximum weighted matching value between the user portrait P(X) and the user portrait P(Y) based on the weight of the edge (e, é).

The maximum matching of a bipartite graph is defined as follows: given a bipartite graph G and a subgraph M of G, if no two edges in the edge set of M are incident to the same vertex, then M is called a matching. Choosing such a matching with the largest number of edges is called the maximum matching problem of the graph. If every vertex in the graph is incident to an edge of the matching, the matching is called a complete matching, also known as a perfect matching.
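The definitions above translate directly into two small checks, sketched here in Python. An edge is written as a (left vertex, right vertex) pair, and the function names are hypothetical. The example uses element labels A to E and 1 to 5, as in the five-element example of this application.

```python
def is_matching(edges):
    """True if no two edges share an endpoint."""
    left_seen, right_seen = set(), set()
    for left, right in edges:
        if left in left_seen or right in right_seen:
            return False
        left_seen.add(left)
        right_seen.add(right)
    return True


def is_perfect_matching(edges, left_vertices, right_vertices):
    """True if the edges form a matching that covers every vertex of both parts."""
    return (is_matching(edges)
            and {left for left, _ in edges} == set(left_vertices)
            and {right for _, right in edges} == set(right_vertices))


m1 = [("A", 1), ("B", 3), ("C", 2), ("D", 4), ("E", 5)]
print(is_perfect_matching(m1, "ABCDE", [1, 2, 3, 4, 5]))  # True
print(is_matching([("A", 1), ("B", 1)]))  # False: both edges use vertex 1
```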

For example, the user portrait P(X) of user X includes elements A, B, C, D, and E, where A, B, C, D, and E each contain a first query element and a first document element, and the user portrait P(Y) of user Y includes elements 1, 2, 3, 4, and 5, where 1, 2, 3, 4, and 5 each contain a second query element and a second document element. The bipartite graph constructed from the user portrait P(X) and the user portrait P(Y) is shown in Figure 4.

For the bipartite graph constructed from the user portraits in Figure 5, the weighted matching value of each maximum matching is calculated by the following formulas:

M₁ = w(A,1) + w(B,3) + w(C,2) + w(D,4) + w(E,5)

M₂ = w(A,1) + w(B,3) + w(C,5) + w(D,4) + w(E,2)

M₃ = w(A,1) + w(B,4) + w(C,2) + w(D,3) + w(E,5)

M₄ = w(A,1) + w(B,4) + w(C,5) + w(D,3) + w(E,2)

Here, w(i, j) represents the similarity between element i and element j, that is, the weight of edge ij. For example, w(A, 1) represents the similarity between element A and element 1, which is also the weight of edge A1; w(B, 3), w(C, 2), ..., w(E, 5) are defined in the same way.

Furthermore, the maximum weighted matching value is determined from all the weighted matching values. In this specific embodiment, the maximum weighted matching value is 3.5.
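The enumerate-and-take-the-maximum procedure can be sketched in Python as below. This is a brute-force illustration under stated assumptions, not the implementation of the embodiment: the function name `max_weighted_matching` is hypothetical, the weight matrix is assumed to be square, missing edges are assumed to carry weight 0, and the 3×3 weights are made-up values unrelated to the 3.5 result above.

```python
from itertools import permutations


def max_weighted_matching(weights):
    """Return (best_value, best_pairs) over all perfect matchings.

    weights[i][j] is the weight w(i, j) of the edge between the i-th
    vertex of P(X) and the j-th vertex of P(Y). The matrix is assumed
    to be square, with 0 for pairs that have no edge.
    """
    n = len(weights)
    best_value, best_pairs = float("-inf"), []
    # Each permutation assigns every P(X) vertex a distinct P(Y) vertex.
    for perm in permutations(range(n)):
        value = sum(weights[i][perm[i]] for i in range(n))
        if value > best_value:
            best_value = value
            best_pairs = [(i, perm[i]) for i in range(n)]
    return best_value, best_pairs


# Hypothetical similarities between three elements of P(X) and of P(Y).
example_weights = [
    [0.9, 0.1, 0.2],
    [0.3, 0.8, 0.4],
    [0.2, 0.5, 0.7],
]
best_value, best_pairs = max_weighted_matching(example_weights)
```

Enumeration costs on the order of n! evaluations, so it is practical only for small portraits; a polynomial-time assignment method such as the Hungarian algorithm would be the usual substitute for larger ones.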

Step S106: Acquire the user portrait similarity of the user X and the user Y according to the maximum weighted matching value.

After the user portrait similarity between different users has been obtained, a user community can be created based on the similarity between the user portrait P(X) and the user portrait P(Y), and the user's query results can be ranked and recommended according to the created user community.

Step 1: Find a historical query record A similar to query q.

Let A = {(U₁, q₁, D_q1), (U₂, q₂, D_q2), ..., (U_m, q_m, D_qm)}

s(q, q_i) > θ and s(P(U), P(U_i)) > ω, 1 ≤ i ≤ m

Step 2: Calculate all document collections related to query q.

D_q = D_q1 ∪ D_q2 ∪ … ∪ D_qm

In addition, for each document d that does not belong to D_q, R(U,d,q) = 0 by default;

Step 4: Calculate the final ranking of each document in the corpus:

R_final(U,d,q) = a*r(d,q) + b*R(U,d,q)

where a and b are preset coefficients.
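The filtering, union, and final-ranking steps can be sketched as follows. This is a hedged illustration only: the function name `rank_documents` and all parameter names are hypothetical, and the four callables (`query_sim` for s(q, q_i), `portrait_sim` for s(P(U), P(U_i)), `relevance` for r(d, q), and `community_score` for R(U, d, q)) are placeholders that the caller must supply, since this passage does not specify how they are computed.

```python
def rank_documents(user, query, history, corpus, *, query_sim, portrait_sim,
                   relevance, community_score, theta, omega, a, b):
    """Rank `corpus` for (`user`, `query`) following Steps 1, 2 and 4.

    `history` is a list of (U_i, q_i, D_qi) triples. The four callables
    are caller-supplied placeholders; none of them is defined here.
    """
    # Step 1: keep records with s(q, q_i) > theta and s(P(U), P(U_i)) > omega.
    similar = [(u_i, q_i, docs) for (u_i, q_i, docs) in history
               if query_sim(query, q_i) > theta
               and portrait_sim(user, u_i) > omega]

    # Step 2: D_q is the union of the document sets of the retained records.
    d_q = set()
    for _, _, docs in similar:
        d_q.update(docs)

    # Step 4: R_final = a * r(d, q) + b * R(U, d, q), with R = 0 outside D_q.
    def final_score(d):
        r_community = community_score(user, d, query) if d in d_q else 0.0
        return a * relevance(d, query) + b * r_community

    return sorted(corpus, key=final_score, reverse=True)
```

With a = 1 and b = 0 the ranking reduces to ordering by r(d, q) alone, so the ratio of b to a controls how strongly the community signal reorders the results.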

With the above-mentioned information retrieval recommendation method based on the similarity of user portraits, the similarity between user portraits is obtained by the maximum matching method for weighted bipartite graphs, and user communities can be constructed dynamically based on that similarity in a collaborative information retrieval environment. Personalized information retrieval recommendation based on the user community can improve retrieval accuracy, optimize the user experience, and achieve personalized recommendation.

In addition, the embodiment of the present application also proposes a computer-readable storage medium. The computer-readable storage medium includes an information retrieval recommendation program based on the similarity of user portraits, and when the information retrieval recommendation program based on the similarity of user portraits is executed by a processor, the steps of the above-mentioned information retrieval recommendation method based on the similarity of user portraits are implemented.

The specific implementation of the computer-readable storage medium of the present application is substantially the same as the specific implementation of the above-mentioned information retrieval recommendation method, electronic device, and system based on the similarity of user portraits, and will not be repeated here.

It should be noted that, in this document, the terms "including", "comprising", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, device, article, or method that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, device, article, or method. Without further limitation, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, device, article, or method that includes the element.

The serial numbers of the foregoing embodiments of the present application are for description only and do not represent the relative merits of the embodiments. From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, or by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to cause a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to execute the methods described in the embodiments of the present application.

The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of this application, or any direct or indirect use in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims

An information retrieval recommendation method based on portrait similarity, applied to an electronic device, characterized in that the method includes:

Acquire user portraits of different users, and determine the user portrait similarity between the user portraits;

Create a user dynamic community based on the similarity of the user portraits, so that users with similar portraits belong to the same user dynamic community;

According to the user dynamic community and the query sentence of the user, information retrieval recommendation is performed on the user.
The method for information retrieval and recommendation based on portrait similarity according to claim 1, wherein the step of obtaining user portraits of different users and determining the user portrait similarity between the user portraits comprises:

Storing the user portrait P as a collection of (q, D_q) pairs, where q represents any query record of the user and D_q represents all documents related to the query record q;

Constructing a weighted bipartite graph based on the user portrait P(X) and the user portrait P(Y) to be processed, where P(X) is the user portrait of user X, P(Y) is the user portrait of user Y, and the vertex e of P(X) is connected to the vertex é of P(Y) through the edge (e, é);

Acquiring the similarity between the vertex e of the user portrait P(X) and the vertex é of the user portrait P(Y) based on the weighted bipartite graph;

Determining the weight of the edge (e, é) according to the similarity between the vertex e of the P(X) and the vertex é of the P(Y);

Obtaining the maximum weighted matching value between the user portrait P(X) and the user portrait P(Y) based on the weight of the edge (e, é);

The user portrait similarity of the user X and the user Y is obtained according to the maximum weighted matching value.
The information retrieval recommendation method based on the similarity of portraits according to claim 2, characterized in that,

The user portrait P(X) of the user X is stored as: [formula image not reproduced]

The user portrait P(Y) of the user Y is stored as: [formula image not reproduced]

where the symbols in the formulas represent, respectively, the i-th query of user X and all documents related to that query, and the j-th query of user Y and all documents related to that query.
The information retrieval recommendation method based on the similarity of portraits according to claim 2, characterized in that,

The vertex e of the user portrait P(X) includes a corresponding first query element and a first document element, and the vertex é of the user portrait P(Y) includes a corresponding second query element and a second document element;

The process of obtaining the similarity between the vertex e of the user portrait P(X) and the vertex é of the user portrait P(Y) includes:

Acquiring a first similarity between the first query element and a second query element, and acquiring a second similarity between the first document element and the second document element;

The similarity between the vertex e and the vertex é is determined based on the first similarity and the second similarity.
The information retrieval recommendation method based on the similarity of portraits according to claim 4, characterized in that,

The first similarity between the first query element and the second query element is obtained through an edit distance algorithm, a Jaccard coefficient algorithm, a TF algorithm, a TFIDF algorithm, or a Word2Vec algorithm;

The second similarity between the first document element and the second document element is obtained through a TFIDF algorithm or a vector-space-based cosine algorithm.
The information retrieval recommendation method based on the similarity of portraits according to claim 2, characterized in that,

The user portrait P(X) of the user X includes elements A, B, C, D, and E, where the elements A, B, C, D, and E include the first query element and the first document element;

The user portrait P(Y) of user Y includes elements 1, 2, 3, 4, and 5, where elements 1, 2, 3, 4, and 5 include the second query element and the second document element;

Step 1: Obtain all weighted matching values of the weighted bipartite graph by the following formula;

M₁ = w(A,1) + w(B,3) + w(C,2) + w(D,4) + w(E,5)

M₂ = w(A,1) + w(B,3) + w(C,5) + w(D,4) + w(E,2)

M₃ = w(A,1) + w(B,4) + w(C,2) + w(D,3) + w(E,5)

M₄ = w(A,1) + w(B,4) + w(C,5) + w(D,3) + w(E,2)

Among them, w(i, j) represents the similarity between element i and element j or the weight of edge ij;

Step 2: Determine the maximum weighted matching value from all weighted matching values.
The method for information retrieval and recommendation based on portrait similarity according to claim 1, wherein the step of assigning users with similar portraits to the same user dynamic community comprises:

All users whose user portrait similarity meets the preset similarity range belong to the same user dynamic community.
The method for information retrieval and recommendation based on portrait similarity according to claim 1, wherein the step of performing information retrieval and recommendation on the user according to the user dynamic community and the query sentence of the user comprises:

Obtaining all historical query records related to the query sentence;

Acquiring a document collection related to the historical query record;

Acquiring the similarity between each document in the document collection and the query sentence;

Sorting the documents in the document collection based on the similarity to construct an output list according to the sorting;

Output the query result corresponding to the query sentence based on the output list.
The method for information retrieval and recommendation based on the similarity of portraits according to claim 8, wherein the formula for obtaining all historical query records is as follows:

A = {(U₁, q₁, D_q1), (U₂, q₂, D_q2), ..., (U_m, q_m, D_qm)}

s(q, q_i) > θ and s(P(U), P(U_i)) > ω, 1 ≤ i ≤ m

where A represents the historical query records, U_m represents a user, q_m is a query of user U_m, D_qm is all documents related to the query q_m, P(U) is the user portrait of user U, P(U_i) is the user portrait of user U_i, s(P(U), P(U_i)) is the user portrait similarity between user U and user U_i, and s(q, q_i) is the similarity between the query sentence q and the query q_i.
The method for information retrieval and recommendation based on portrait similarity according to claim 8, wherein the formula for obtaining the document collection related to the historical query record is as follows:

D_q = D_q1 ∪ D_q2 ∪ … ∪ D_qm

where D_q is the document set, D_qm is all documents related to the query q_m, and each document d in the document set satisfies d ∈ D_q.
An electronic device, characterized in that the electronic device includes a memory and a processor, the memory includes an information retrieval recommendation program based on portrait similarity, and the following steps are implemented when the information retrieval recommendation program based on portrait similarity is executed by the processor:

Acquire user portraits of different users, and determine the user portrait similarity between the user portraits;

Create a user dynamic community based on the similarity of the user portraits, so that users with similar portraits belong to the same user dynamic community;

According to the user dynamic community and the query sentence of the user, information retrieval recommendation is performed on the user.
The electronic device according to claim 11, wherein the step of obtaining user portraits of different users and determining the similarity of the user portraits between the user portraits comprises:

Storing the user portrait P as a collection of (q, D_q) pairs, where q represents any query record of the user and D_q represents all documents related to the query record q;

Constructing a weighted bipartite graph based on the user portrait P(X) and the user portrait P(Y) to be processed, where P(X) is the user portrait of user X, P(Y) is the user portrait of user Y, and the vertex e of P(X) is connected to the vertex é of P(Y) through the edge (e, é);

Acquiring the similarity between the vertex e of the user portrait P(X) and the vertex é of the user portrait P(Y) based on the weighted bipartite graph;

Determining the weight of the edge (e, é) according to the similarity between the vertex e of the P(X) and the vertex é of the P(Y);

Obtaining the maximum weighted matching value between the user portrait P(X) and the user portrait P(Y) based on the weight of the edge (e, é);

The user portrait similarity of the user X and the user Y is obtained according to the maximum weighted matching value.
The electronic device according to claim 12, wherein the user portrait P(X) of the user X is stored as: [formula image not reproduced]

The user portrait P(Y) of the user Y is stored as: [formula image not reproduced]

where the symbols in the formulas represent, respectively, the i-th query of user X and all documents related to that query, and the j-th query of user Y and all documents related to that query.
An information retrieval recommendation system based on portrait similarity, which is characterized in that it includes:

The user portrait similarity determination unit is used to obtain user portraits of different users and determine the user portrait similarity between user portraits;

A dynamic community creation unit, configured to create a user dynamic community based on the user portrait similarity, so that users with similar portraits belong to the same user dynamic community;

The search recommendation unit is configured to perform information search and recommendation on the user according to the user dynamic community and the user's query sentence.
The information retrieval recommendation system based on portrait similarity according to claim 14, wherein the user portrait similarity determination unit comprises:

The user portrait storage module is configured to store the user portrait P as a collection of (q, D_q) pairs, where q represents any query record of the user and D_q represents all documents related to the query record q;

The weighted bipartite graph construction module is configured to construct a weighted bipartite graph based on the user portrait P(X) and the user portrait P(Y) to be processed, where P(X) is the user portrait of user X, P(Y) is the user portrait of user Y, and the vertex e of P(X) is connected to the vertex é of P(Y) through the edge (e, é);

A similarity acquisition module, configured to acquire the similarity between the vertex e of the user portrait P(X) and the vertex é of the user portrait P(Y) based on the weighted bipartite graph;

A weight determination module, configured to determine the weight of the edge (e, é) according to the similarity between the vertex e of the P(X) and the vertex é of the P(Y);

The maximum weighted matching value obtaining module is configured to obtain the maximum weighted matching value between the user portrait P(X) and the user portrait P(Y) based on the weight of the edge (e, é);

The user portrait similarity determination module is configured to obtain the user portrait similarity of the user X and the user Y according to the maximum weighted matching value.
The information retrieval recommendation system based on the similarity of portraits according to claim 15, characterized in that,

The user portrait P(X) of the user X is stored as: [formula image not reproduced]

The user portrait P(Y) of the user Y is stored as: [formula image not reproduced]

where the symbols in the formulas represent, respectively, the i-th query of user X and all documents related to that query, and the j-th query of user Y and all documents related to that query.
The information retrieval recommendation system based on the similarity of portraits according to claim 15, characterized in that,

The vertex e of the user portrait P(X) includes a corresponding first query element and a first document element, and the vertex é of the user portrait P(Y) includes a corresponding second query element and a second document element;

The similarity acquisition module includes:

The query element and document element similarity acquisition module is configured to acquire the first similarity between the first query element and the second query element, and to acquire the second similarity between the first document element and the second document element;

The inter-vertex similarity determination module is configured to determine the similarity between the vertex e and the vertex é based on the first similarity and the second similarity.
The information retrieval recommendation system based on portrait similarity according to claim 17, wherein said query element and document element similarity acquisition module comprises:

The first similarity acquisition module is configured to acquire the first similarity between the first query element and the second query element through edit distance algorithm, Jaccard coefficient algorithm, TF algorithm, TFIDF algorithm or Word2Vec algorithm;

The second similarity acquisition module is configured to acquire the second similarity between the first document element and the second document element through the TFIDF algorithm or the space vector-based cosine algorithm.
The information retrieval recommendation system based on portrait similarity according to claim 15, characterized in that,

The user portrait P(X) of the user X includes elements A, B, C, D, and E, where the elements A, B, C, D, and E include the first query element and the first document element;

The user portrait P(Y) of user Y includes elements 1, 2, 3, 4, and 5, where elements 1, 2, 3, 4, and 5 include the second query element and the second document element;

Step 1: Obtain all weighted matching values of the weighted bipartite graph by the following formula;

M₁ = w(A,1) + w(B,3) + w(C,2) + w(D,4) + w(E,5)

M₂ = w(A,1) + w(B,3) + w(C,5) + w(D,4) + w(E,2)

M₃ = w(A,1) + w(B,4) + w(C,2) + w(D,3) + w(E,5)

M₄ = w(A,1) + w(B,4) + w(C,5) + w(D,3) + w(E,2)

Among them, w(i, j) represents the similarity between element i and element j or the weight of edge ij;

Step 2: Determine the maximum weighted matching value from all weighted matching values.
A computer-readable storage medium, wherein the computer-readable storage medium includes an information retrieval recommendation program based on portrait similarity, and when the information retrieval recommendation program based on portrait similarity is executed by a processor, the steps of the information retrieval recommendation method based on portrait similarity according to any one of claims 1 to 10 are implemented.