WO2021004124A1

WO2021004124A1 - Data comparison-based information recommendation method and device, and storage medium

Info

Publication number: WO2021004124A1
Application number: PCT/CN2020/086286
Authority: WO
Inventors: 郭鸿程
Original assignee: 深圳壹账通智能科技有限公司
Priority date: 2019-07-05
Filing date: 2020-04-23
Publication date: 2021-01-14
Also published as: CN110457574A

Abstract

The present application discloses a data comparison-based information recommendation method, comprising: acquiring first data of a target user and second data of a compared user, wherein the first data and the second data are related to a preconfigured theme, and the first data and the second data are homomorphically encrypted; comparing the size of the first data and the size of the second data by means of a homomorphic operation, and obtaining a ranking result; acquiring, from the ranking result, the position of the target user in the ranking, and returning the ranking result to the target user; if the position of the target user is a preconfigured position, acquiring recommended product information corresponding to the preconfigured theme; and transmitting to the target user the recommended product information corresponding to the preconfigured theme. The present application further provides a data comparison-based information recommendation device, and a storage medium. The present application protects private data of a user, and accurately performs personalized recommendation.

Description

Information recommendation method, device and storage medium based on data comparison

This application claims priority to Chinese patent application No. 201910605697.8, filed with the Chinese Patent Office on July 5, 2019 and entitled "Data comparison-based information recommendation method, device and storage medium", the entire content of which is incorporated herein by reference.

Technical field

This application relates to the field of big data technology, and in particular to an information recommendation method, device and computer-readable storage medium based on data comparison.

Background art

In recent years, with the emergence of social platforms such as WeChat Moments and Weibo, people increasingly like to share information through these platforms, including ranking-style sharing such as posting salaries, spending, weight, or age. The inventor realized that this ranking-style sharing compares the user's personal data with other people's data and then publishes the comparison result, which can easily leak the user's private data and create information security problems for the user. On the other hand, when the user's private data cannot be obtained, personalized products cannot be effectively recommended to the user. Therefore, how to protect the security of the user's personal information while still performing accurate personalized recommendation is an urgent problem to be solved.

Summary of the invention

This application provides an information recommendation method, device, and computer-readable storage medium based on data comparison, the main purpose of which is not only to protect the user's private data, but also to accurately perform personalized recommendations.

To achieve the above objective, this application provides an information recommendation method based on data comparison, which includes:

Obtaining first data of a target user and second data of a comparison user, where the first data and the second data are data related to a preset theme, and the first data and the second data are homomorphically encrypted data;

Comparing the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result;

Acquiring the ranking of the target user from the ranking result, and returning the ranking result to the target user;

If the ranking of the target user is a preset ranking, obtaining recommended product information corresponding to the preset theme;

Sending recommended product information corresponding to the preset theme to the target user.

Optionally, the obtaining recommended product information corresponding to the preset theme includes:

Acquiring relevant information about the preset theme;

Mapping the relevant information of the preset theme to the target dictionary of the preset BOW model to obtain the target histogram feature vector, the target dictionary being obtained by clustering processing of training samples;

Inputting the target histogram feature vector into the naive Bayes classifier used to construct the preset BOW model, and classifying the relevant information of the preset theme through the naive Bayes classifier to obtain the category of the relevant information of the preset theme;

Obtaining the product information to be recommended corresponding to the category of the related information of the preset theme;

It is determined that the product information to be recommended corresponding to the category of the related information of the preset theme is the recommended product information corresponding to the preset theme.

In addition, in order to achieve the above objective, the present application also provides an information recommendation device based on data comparison. The device includes a memory and a processor. The memory stores an information recommendation program based on data comparison that can run on the processor, and when the information recommendation program based on data comparison is executed by the processor, the following steps are implemented:

In addition, in order to achieve the above objective, this application also provides a computer device, including:

One or more processors;

A memory; and

One or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the one or more computer programs are configured to execute an information recommendation method based on data comparison, wherein the information recommendation method based on data comparison includes:

In addition, in order to achieve the above objective, the present application also provides a computer-readable storage medium that stores an information recommendation program based on data comparison, and the information recommendation program based on data comparison can be executed by one or more processors to implement the steps of the information recommendation method based on data comparison as described above.

The information recommendation method, device, and computer-readable storage medium based on data comparison proposed in this application acquire first data of a target user and second data of a comparison user, where the first data and the second data are data related to a preset theme and are homomorphically encrypted data; compare the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result; acquire the ranking of the target user from the sorting result and return the sorting result to the target user; if the ranking of the target user is a preset ranking, acquire recommended product information corresponding to the preset theme; and send the recommended product information corresponding to the preset theme to the target user. Since the first data of the target user and the second data of other users are encrypted data, and the data are compared through homomorphic operations, this application keeps the details of the data from being disclosed while the data are compared. At the same time, the users can still be sorted accurately without obtaining the details of their data, and personalized recommendation can then be made according to each user's ranking. Therefore, this application achieves the purpose of both protecting the user's private data and accurately performing personalized recommendation.

Description of the drawings

FIG. 1 is a schematic flowchart of an information recommendation method based on data comparison provided by an embodiment of this application;

FIG. 2 is a schematic diagram of the internal structure of an information recommendation device based on data comparison provided by an embodiment of this application;

FIG. 3 is a schematic diagram of modules of an information recommendation program based on data comparison in an information recommendation device based on data comparison provided by an embodiment of the application.

Detailed description

This application provides an information recommendation method based on data comparison. Referring to FIG. 1, it is a schematic flowchart of an information recommendation method based on data comparison provided by an embodiment of this application. The method can be executed by a device, and the device can be implemented by software and/or hardware.

Optionally, the device is a data comparison center built mainly on the big data technologies Hadoop and Spark. Hadoop consists of HDFS (responsible for cluster storage management) and YARN (responsible for system resource scheduling), and Spark is used for the specific calculation logic.

Preferably, the data comparison center performs network security protection measures based on the big data architecture obtained by cloud computing.

In an optional embodiment, the adopted network security protection measures may include:

(1) East-west traffic monitoring. With virtual firewall technology, all traffic flows through a virtual firewall, which forwards the data to the target virtual host, so as to isolate, control, and security-inspect the traffic between different virtual machines on the same physical host and between different physical hosts. The functions of a virtual firewall are consistent with those of a physical firewall, and it can be divided into different security zones such as Trust, Untrust, Local, and DMZ (demilitarized zone). Security policies can be flexibly pre-configured for the different security zones, and users can control the flow of data packets to achieve security protection. By default, the network between different virtual firewalls is blocked, which to a certain extent solves the problem of controlling horizontal traffic on the core device.

(2) Deployment of IDS/IPS intrusion detection and prevention equipment. IPS intrusion prevention equipment is deployed between the platform core router and the egress firewall, and IDS intrusion detection equipment is attached alongside the core switch, to defend the application layer, for example against worms, viruses, Trojan horses, denial-of-service attacks, spyware, VoIP attacks, and peer-to-peer application abuse. Malicious traffic is blocked before losses occur, avoiding external application-layer attacks.

In this embodiment, the information recommendation method based on data comparison includes:

Step S101: Obtain first data of a target user and second data of a comparison user, where the first data and the second data are data related to a preset theme, and the first data and the second data are homomorphically encrypted data.

In this embodiment, the target user is a user who wants to compare data. The number of the comparison users may be multiple, and the second data is the second data of each comparison user.

The preset theme may be consumption amount, height, age, etc. within a period of time.

For example, the target user is user A and the comparison user is user B; the first data of the target user is the first cumulative consumption amount of user A in the past six months, and the second data of the comparison user is the second cumulative consumption amount of user B in the past six months.

In this embodiment, the first data and the second data are both homomorphically encrypted data. In an alternative embodiment, the first data is homomorphically encrypted by the client of the target user and then transmitted to the data comparison center, and the second data is homomorphically encrypted by the client of the comparison user and then transmitted to the data comparison center.

Homomorphic encryption works as follows: a given plaintext $(x_1, x_2, \ldots, x_n)$ is encrypted with a homomorphic encryption algorithm to obtain the ciphertext $c$. Fully homomorphic encryption allows anyone to perform an arbitrary operation $f$ on the ciphertext $c$; decrypting the resulting ciphertext $f(c)$ gives the same result as $f(x_1, x_2, \ldots, x_n)$. In this process, $(x_1, x_2, \ldots, x_n)$, $f(x_1, x_2, \ldots, x_n)$, and any intermediate plaintext are not leaked; the input, output, and intermediate values always remain encrypted. The requirements on the final ciphertext form of $f(x_1, x_2, \ldots, x_n)$ vary. The minimum requirement is that it can be decrypted correctly to obtain $f(x_1, x_2, \ldots, x_n)$; satisfying different ciphertext computation properties leads to different forms of homomorphic encryption.

Homomorphic encryption includes partially homomorphic encryption and fully homomorphic encryption. Partially homomorphic encryption means that the encryption scheme satisfies either additive homomorphism or multiplicative homomorphism. The RSA algorithm satisfies multiplicative homomorphism, and the Paillier algorithm satisfies additive homomorphism.

For example, for the RSA algorithm with public key $(e, N)$, the encryption of a plaintext $M$ is expressed as $C = E(M) = M^e \bmod N$.

For any $M_1$ and $M_2$:

$$E(M_1) \cdot E(M_2) = (M_1^e \bmod N)(M_2^e \bmod N) \equiv (M_1 M_2)^e \equiv E(M_1 \cdot M_2) \pmod{N}$$

That is, for any plaintexts $M_1, M_2, \ldots, M_n$:

$$E(M_1) \cdot E(M_2) \cdots E(M_n) \equiv E(M_1 \cdot M_2 \cdots M_n) \pmod{N}$$

In other words, the RSA algorithm satisfies the multiplicative homomorphic operation.
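The multiplicative property of RSA can be checked numerically. The following is a minimal sketch using textbook RSA without padding; the tiny primes, the exponent, and the sample plaintexts are illustrative assumptions and are far too small for real use.

```python
# Toy textbook RSA (no padding, tiny primes): illustrative only, not secure.
p, q = 61, 53              # assumed small primes
n = p * q                  # modulus N = 3233
phi = (p - 1) * (q - 1)    # Euler totient of N
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)


def encrypt(m):
    """E(M) = M^e mod N."""
    return pow(m, e, n)


def decrypt(c):
    """D(C) = C^d mod N."""
    return pow(c, d, n)


m1, m2 = 7, 12
# Multiply the two ciphertexts without ever decrypting them.
c_product = encrypt(m1) * encrypt(m2) % n
# The product of ciphertexts equals the ciphertext of the product.
homomorphic_ok = c_product == encrypt(m1 * m2 % n)
recovered = decrypt(c_product)
```

Because the product of the plaintexts stays below the modulus here, decrypting `c_product` recovers the plain product directly.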

In an optional embodiment of the present application, the first data and the second data may be obtained through encryption with an asymmetric encryption algorithm (RSA). Specifically, the first data and the second data may be encrypted with a public key provided by the data comparison center.

Step S201: Comparing the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result.

In this embodiment, a homomorphic operation is performed on the first data and the second data. For example, the homomorphic operation is to add the first data and the second data to, or multiply them by, a standard number and then compare the results in magnitude.

In this embodiment, the sorting result is which one of the first data and the second data is larger and which one is smaller.

When the second data comprises the data of multiple comparison users, the first data is compared with each of the multiple second data separately to obtain the sorting result.

Optionally, in another embodiment of the present application, the comparing the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result includes:

Adding the first data to the negative of the second data to obtain a first calculation result; if the first calculation result is a positive number, obtaining a sorting result in which the first data is greater than the second data, and if the first calculation result is a negative number, obtaining a sorting result in which the first data is smaller than the second data; or

Adding the negative of the first data to the second data to obtain a second calculation result; if the second calculation result is a positive number, obtaining a sorting result in which the first data is less than the second data, and if the second calculation result is a negative number, obtaining a sorting result in which the first data is greater than the second data.
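The subtraction-based comparison requires an additively homomorphic scheme. The sketch below uses a toy Paillier cryptosystem, which the text names as additively homomorphic; RSA alone would only support multiplication. The key size, the helper names, and the assumption that the party holding the private key reads the sign of the difference are all illustrative and are not specified in the source.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key with generator g = n + 1 (tiny primes, not secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                               # inverse of lam mod n
    return n, (lam, mu)


def encrypt(n, m):
    """Enc(m) = (1 + m*n) * r^n mod n^2 for a random r coprime with n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return ((1 + (m % n) * n) % n2) * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """Decrypt and map the result to a signed integer in (-n/2, n/2]."""
    lam, mu = priv
    m = ((pow(c, lam, n * n) - 1) // n) * mu % n
    return m - n if m > n // 2 else m


def negate(n, c):
    """Enc(-m): the modular inverse of the ciphertext."""
    return pow(c, -1, n * n)


def add(n, c1, c2):
    """Enc(m1 + m2): the product of the two ciphertexts."""
    return c1 * c2 % (n * n)


def compare(n, priv, c_first, c_second):
    """Return 1, 0 or -1 for first >, == or < second, using only the sign."""
    diff = decrypt(n, priv, add(n, c_first, negate(n, c_second)))
    return (diff > 0) - (diff < 0)


n, priv = keygen()
c_a = encrypt(n, 5200)   # hypothetical first data (user A's consumption)
c_b = encrypt(n, 4800)   # hypothetical second data (user B's consumption)
result = compare(n, priv, c_a, c_b)
```

Only the difference is ever decrypted in this sketch, so the individual amounts stay hidden from the comparing party as long as the values are well below half the modulus.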

Step S301: Obtain the ranking of the target user from the ranking result, and return the ranking result to the target user.

For example, the ranking of the target user obtained from the ranking result is the first; or the ranking of the target user obtained from the ranking result is the second; the ranking of the target user obtained from the ranking result is the third.

In an optional embodiment, returning the sorting result to the target user includes returning the sorting result to the client of the target user, and the client of the target user may share and display the sorting result through the user's sharing operation.

Optionally, in another embodiment of the present application, the returning the sorting result to the target user includes:

Encrypt the sorting result by using the received public key sent by the target user to obtain the encrypted sorting result;

Returning the encrypted sorting result to the target user.

In this embodiment, when the sorting result is returned to the target user, it can be encrypted with the public key of the target user. After receiving the encrypted sorting result, the client of the target user decrypts it with the private key to obtain the specific sorting result, which improves security during data transmission.

Step S401: If the target user's ranking is a preset ranking, obtain recommended product information corresponding to the preset theme.

In an optional embodiment, the ranking of the target users as the preset ranking includes: the ranking of the target users is the first.

In another optional embodiment, the ranking of the target users as the preset ranking includes: the ranking of the target users is the top third.

In another optional example, the order of the target users as the preset order includes: the order of the target users is the first third and the first third.

The recommended product information corresponding to the preset theme may be preset. For example, the preset recommended products corresponding to the preset theme are low-fat snacks and low-fat drinks. When the user's body fat percentage ranks the lowest, the low-fat snacks and low-fat drinks are recommended to the user.

In other embodiments of the present application, different product information can also be recommended according to different rankings of preset themes and target users.
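The rule "recommend when the target user holds a preset ranking for a preset theme" can be sketched as a small lookup. The rule table, the theme keys, and the function names below are hypothetical; only the low-fat example comes from the text, and the second rule is an invented placeholder.

```python
def position(comparison_signs):
    """Rank of the target user, where 1 means the largest value.

    comparison_signs holds, for each comparison user, the sign of
    (target data - comparison data): 1, 0 or -1.
    """
    return 1 + sum(1 for s in comparison_signs if s < 0)


# Hypothetical rule table: theme -> list of (ranking predicate, products).
RULES = {
    # From the text: lowest body-fat ranking -> low-fat products.
    "body_fat": [(lambda pos, total: pos == total,
                  ["low-fat snacks", "low-fat drinks"])],
    # Invented placeholder rule for illustration only.
    "consumption": [(lambda pos, total: pos == 1,
                     ["premium membership"])],
}


def recommend(theme, comparison_signs):
    """Return products whose preset-ranking predicate matches the user."""
    total = len(comparison_signs) + 1      # comparison users plus the target
    pos = position(comparison_signs)
    products = []
    for predicate, items in RULES.get(theme, []):
        if predicate(pos, total):
            products.extend(items)
    return products


# The target user has a lower body-fat value than all three comparison users.
picked = recommend("body_fat", [-1, -1, -1])
```

Keeping the predicates in a table makes it straightforward to attach different products to different rankings of the same theme.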

Optionally, in another implementation of the present application, the obtaining recommended product information corresponding to the preset theme includes:

Acquiring relevant information about the preset theme;

In this embodiment, the information related to the preset theme is information related to the preset theme.

For example, the preset theme is a consumption theme, and the consumption-related information of the preset theme includes consumption history records (such as information about commodities purchased at different consumption times and locations).

In this embodiment, the BOW model is established in advance. Specifically, the BOW model is constructed with a clustering algorithm (such as the k-means algorithm) and a naive Bayes classifier.

In an optional embodiment, the preset BOW can be constructed in the following manner:

(1) Use a clustering algorithm (such as the k-means algorithm) to cluster the big data and find the cluster center points (that is, the vocabulary). Clustering means dividing data objects with higher similarity into the same cluster and data objects with higher dissimilarity into different clusters, according to the principle of similarity. In the k-means algorithm, k represents the number of clusters and means represents the mean value of the data objects in a cluster (this mean describes the center of the cluster); therefore, the k-means algorithm is also called the k-average algorithm. The k-means algorithm is a partition-based clustering algorithm that uses distance as the measure of similarity between data objects: the smaller the distance between data objects, the higher their similarity and the more likely they are to be in the same cluster. In the embodiment of this application, the distance between data objects is calculated as the Euclidean distance. Assuming that $x_i$ and $x_j$ are data objects and $D$ is the number of attributes of a data object, the distance between the two is:

$$\mathrm{dist}(x_i, x_j) = \sqrt{\sum_{d=1}^{D} (x_{i,d} - x_{j,d})^2}$$

where $x_{i,d}$ represents the $d$-th coordinate of the $i$-th point, and $x_{j,d}$ represents the $d$-th coordinate of the $j$-th point.

At the same time, define the cluster center of the $k$-th cluster as $\mathrm{Center}_k$, which is updated as:

$$\mathrm{Center}_k = \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i$$

where $|C_k|$ represents the number of data objects in the $k$-th cluster $C_k$, and $\mathrm{Center}_k$ is a vector containing $D$ attributes.

Finally, use the sum-of-squared-errors criterion function to obtain the final clustering result $J$, where $K$ is the number of clusters:

$$J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \mathrm{dist}(x_i, \mathrm{Center}_k)^2$$

The training data are then mapped onto the cluster centers to obtain a low-dimensional representation of each training sample in the cluster-center space. The final clustering result $J$ serves as the basis of the histogram: the basis vectors are used to construct the other vectors, and the mapping yields the histogram statistics for each of the different categories. This process is also the feature extraction process of the BOW model.
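The clustering and histogram mapping described above can be sketched in plain Python. This is a minimal illustration under assumptions: the sample points, the random initialization, and the function names are hypothetical, and a production system would more likely run this on Spark as the text suggests.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two points with the same dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, centers):
    """Index of the cluster center closest to the point."""
    return min(range(len(centers)), key=lambda k: euclidean(point, centers[k]))


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means: assign points to the nearest center, then re-average."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)        # assumed initialization strategy
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        new_centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:         # assignments stopped changing
            break
        centers = new_centers
    return centers


def sse(points, centers):
    """Sum-of-squared-errors criterion J for the given centers."""
    return sum(euclidean(p, centers[nearest(p, centers)]) ** 2 for p in points)


def bow_histogram(features, centers):
    """Count how many features fall on each dictionary entry (center)."""
    hist = [0] * len(centers)
    for f in features:
        hist[nearest(f, centers)] += 1
    return hist


# Hypothetical two-dimensional training features forming two groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
dictionary = sorted(kmeans(points, 2))
hist = bow_histogram([(0.2, 0.1), (9, 9), (10, 12)], dictionary)
```

The histogram has one bin per cluster center, so its length equals the dictionary size regardless of how many features the input contains.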

After obtaining the low-dimensional representation of each training sample, a multinomial naive Bayes classifier is selected for training. Naive Bayes is a classifier with low variance and high bias. It relies on a conditional independence assumption among the features: given the category, all features are independent of each other. For a given sample $x = (x_1, x_2, \ldots, x_d)^T$, the posterior probability of belonging to category $w_i$ is:

$$P(w_i \mid x) = \frac{P(w_i)\,P(x \mid w_i)}{P(x)} = \frac{P(w_i)}{P(x)} \prod_{k=1}^{d} P(x_k \mid w_i)$$

where $d$ is the feature dimension and $x_k$ is the value of the sample on the $k$-th feature. To avoid the problem of data sparseness, smoothing can first be applied to the data:

$$P(x_k \mid w_i) = \frac{|D_{i,x_k}| + \alpha}{|D_i| + \alpha c_k}$$

where $c_k$ represents the number of possible values of the $k$-th feature, and $\alpha$ is the smoothing coefficient. This application uses maximum likelihood estimation (MLE) to obtain:

$$P(x_k \mid w_i) = \frac{|D_{i,x_k}|}{|D_i|}$$

where $D_i$ represents the set of training samples of class $w_i$, and the numerator $|D_{i,x_k}|$ is the number of samples in $D_i$ whose $k$-th feature takes the value $x_k$.
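A naive Bayes classifier with additive smoothing over per-feature value counts can be sketched as follows. The training histograms, the category labels, and the choice of taking the number of possible values of each feature from the values seen in training are assumptions made for illustration.

```python
import math
from collections import Counter


def train(samples, labels, alpha=1.0):
    """Collect class priors and per-feature value counts for each class."""
    d = len(samples[0])
    # c_k: number of distinct values observed for feature k (an assumption).
    values = [{s[k] for s in samples} for k in range(d)]
    model = {"alpha": alpha, "values": values, "classes": {}}
    for w in sorted(set(labels)):
        class_samples = [s for s, y in zip(samples, labels) if y == w]
        model["classes"][w] = {
            "prior": len(class_samples) / len(samples),
            "size": len(class_samples),
            "counts": [Counter(s[k] for s in class_samples) for k in range(d)],
        }
    return model


def log_posterior(model, w, x):
    """log P(w) + sum_k log P(x_k | w), with additive smoothing."""
    info = model["classes"][w]
    alpha = model["alpha"]
    score = math.log(info["prior"])
    for k, xk in enumerate(x):
        c_k = len(model["values"][k])
        # Counter returns 0 for values never seen in this class.
        numerator = info["counts"][k][xk] + alpha
        score += math.log(numerator / (info["size"] + alpha * c_k))
    return score


def classify(model, x):
    """Pick the class with the largest (unnormalized) log posterior."""
    return max(model["classes"], key=lambda w: log_posterior(model, w, x))


# Hypothetical histogram feature vectors and category labels.
samples = [(2, 0), (3, 0), (0, 2), (0, 3)]
labels = ["food", "food", "electronics", "electronics"]
model = train(samples, labels)
predicted = classify(model, (2, 0))
```

Working in log space avoids numerical underflow when the dictionary, and therefore the number of multiplied probabilities, is large.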

In this embodiment, after the preset BOW model has been constructed and the relevant information about the preset theme has been obtained, the relevant information is mapped to the target dictionary of the preset BOW model, where the target dictionary is the cluster-center space obtained by clustering when the BOW model was constructed.

In this embodiment, the product information to be recommended corresponding to each category of related information may be preset; that is, the correspondence between categories and the product information to be recommended is preset. After the category to which the related information of the preset theme belongs has been obtained, the product information to be recommended is obtained according to that category.

Optionally, in another embodiment of the present application, the determining that the product information corresponding to the category of the related information of the preset theme is the recommended product information corresponding to the preset theme includes:

Performing word frequency feature vector extraction on the relevant information of the preset theme to obtain a word frequency vector;

Calculating the similarity between the word frequency vector and the product information to be recommended;

It is determined that among the product information to be recommended, the product information whose similarity with the word frequency vector is greater than the preset similarity is the recommended product information corresponding to the preset theme.

In an embodiment, the similarity can be calculated by the cosine similarity.

Cosine similarity uses the cosine of the angle between two vectors in a vector space as the measure of the difference between two individuals. The closer the cosine value is to 1, the closer the angle is to 0 degrees, that is, the more similar the two vectors are. For the obtained theme-related information posted by the user and the recommended product information, the similarity is calculated with the following formula:

$$\cos\theta = \frac{\sum_{i=1}^{n} X_i Y_i}{\sqrt{\sum_{i=1}^{n} X_i^2}\,\sqrt{\sum_{i=1}^{n} Y_i^2}}$$

where $X$ is the vector representing the theme-related information posted by the user, $Y$ is the vector representing the recommended product information, $X_i$ is a component of the vector $X$, and $Y_i$ is a component of the vector $Y$.

The similarity obtained by the above formula ranges from -1 to 1, where -1 means that the two vectors point in opposite directions, 1 means that their directions are exactly the same, and 0 usually means that they are independent.

In this embodiment, the similarity is judged based on the calculated value, so that recommended product information with high similarity is recommended to the target user, so that products that are more suitable for the user can be recommended.
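The similarity filter can be sketched as follows. The word-frequency vectors, the product names, and the threshold value are hypothetical; returning 0 for an all-zero vector is a convention chosen here, not something the source specifies.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0                 # assumed convention for zero vectors
    return dot / (norm_x * norm_y)


def select_products(theme_vector, candidates, threshold):
    """Keep candidates whose similarity to the theme exceeds the threshold."""
    return [
        name
        for name, vector in candidates.items()
        if cosine_similarity(theme_vector, vector) > threshold
    ]


# Hypothetical word-frequency vectors over a three-word vocabulary.
theme_vector = [1, 1, 0]
candidates = {"phone": [2, 1, 0], "snack": [0, 0, 3]}
chosen = select_products(theme_vector, candidates, threshold=0.8)
```

Because word-frequency vectors are non-negative, the similarity in this use case stays between 0 and 1 even though the general range is -1 to 1.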

Optionally, in another embodiment of the present application, before the mapping related information of the preset theme to the target dictionary of the preset BOW model, the method further includes:

Performing text processing on the related information of the preset theme, where the text processing includes performing word segmentation on the related information of the preset theme through a hidden Markov model, and rewriting the segmented information through a preset keyword extraction algorithm.

In this embodiment, text processing is performed on the related information of the preset theme first, and then the operation of mapping to the target dictionary of the preset BOW model is performed according to the related information of the preset theme obtained after processing.

Text rewriting (Rewrite) means first applying Chinese word segmentation to a text, then cleaning it up, retaining the main words, and performing semantic enhancement (supplementing synonyms and related words) on the main words.

First, this application performs word segmentation on the related information of the preset theme by constructing a hidden Markov model. Text is assumed to satisfy the Markov property, that is, the probability of the $m$-th word appearing depends only on the $n-1$ words immediately preceding it and not on any other words in the text. The purpose of the N-gram model is therefore to give the probability of the $m$-th word given the words before it, specifically expressed as:

$$P(W_m \mid W_1, \ldots, W_{m-1}) = P(W_m \mid W_{m-n+1}, \ldots, W_{m-1})$$

where $m$ is the position of any word in the text, and $n$ is the order of the model, so that the $n-1$ words preceding the $m$-th word are used as context.

If the sentence $S$ consists of the word sequence $\{W_1, W_2, \ldots, W_m\}$, the probability of the sentence under this word order is:

$$P(S) = P(W_1 W_2 \cdots W_m) = P(W_1)\,P(W_2 \mid W_1) \cdots P(W_m \mid W_{m-n+1}, \ldots, W_{m-1})$$

where the conditional probability $P(W_m \mid W_{m-n+1}, \ldots, W_{m-1})$ is the probability that $W_m$ appears given that the string $W_{m-n+1}, \ldots, W_{m-1}$ has appeared. Trained on a large-scale corpus, this application uses the bigram model, so the probability model of the sentence is:

$$P(S) = P(W_1) \prod_{i=2}^{m} P(W_i \mid W_{i-1})$$

The sentence $S$ is segmented with the full segmentation method to obtain all possible Chinese word segmentations; the probability of each segmentation is then calculated, and the segmentation with the highest probability is selected as the final text segmentation result. The selection process finds the maximum of $P(S)$:

$$S^{*} = \arg\max_{S} P(S)$$
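The full-segmentation step followed by bigram scoring can be sketched as follows. The lexicon, all probability values, and the floor used for unseen bigrams are invented for illustration; real values would come from training on a large corpus.

```python
import math


def all_segmentations(text, lexicon):
    """Enumerate every way to split text into words found in the lexicon."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            for rest in all_segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results


def log_prob(words, unigram, bigram, floor=1e-8):
    """Bigram log probability: log P(W1) + sum of log P(Wi | Wi-1)."""
    total = math.log(unigram.get(words[0], floor))
    for prev, cur in zip(words, words[1:]):
        # Unseen bigrams get a small floor probability (an assumption).
        total += math.log(bigram.get((prev, cur), floor))
    return total


def segment(text, unigram, bigram):
    """Return the candidate segmentation with the highest probability."""
    candidates = all_segmentations(text, set(unigram))
    if not candidates:
        return list(text)          # fall back to single characters
    return max(candidates, key=lambda ws: log_prob(ws, unigram, bigram))


# Hypothetical probabilities for a classic ambiguous sentence.
unigram = {"研究": 0.02, "研究生": 0.01, "生命": 0.02, "命": 0.001, "起源": 0.01}
bigram = {
    ("研究", "生命"): 0.1,
    ("生命", "起源"): 0.2,
    ("研究生", "命"): 0.001,
    ("命", "起源"): 0.01,
}
best = segment("研究生命起源", unigram, bigram)
```

Enumerating every segmentation grows exponentially with sentence length, so a practical system would typically replace the enumeration with dynamic programming over the same bigram scores.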

Since there are narratives that have nothing to do with the theme in the related information of the preset theme, this application performs keyword extraction in the case of word segmentation based on the hidden Markov model.

The keyword extraction algorithm uses statistical information, word vector information, and dependency syntax information between words. It calculates the strength of association between words by constructing a dependency graph, and iteratively calculates the importance score of each word with the TextRank algorithm. First, an undirected graph over all non-stop words is built from the dependency parsing result of the sentence; the edge weights are then calculated from the gravity value between words and their degree of dependency association. The dependency association degree of any two words $W_i$ and $W_j$ is:

$$\mathrm{Dep}(W_i, W_j) = \frac{1}{b^{\,\mathrm{len}(W_i, W_j)}}$$

where $\mathrm{len}(W_i, W_j)$ represents the length of the dependency path between words $W_i$ and $W_j$, and $b$ is a hyperparameter.

At the same time, the IDF value is introduced and the word frequency is replaced with the TF-IDF value, which takes more global information into account. This gives a new formula for the gravity between the text words $W_i$ and $W_j$:

$$f_{grav}(W_i, W_j) = \frac{\mathrm{tfidf}(W_i)\,\mathrm{tfidf}(W_j)}{d^2}$$

where $\mathrm{tfidf}(W)$ is the TF-IDF value of word $W$, and $d$ is the Euclidean distance between the word vectors of words $W_i$ and $W_j$.

Therefore, the degree of association between the two words is:

$$\mathrm{weight}(W_i, W_j) = \mathrm{Dep}(W_i, W_j) \cdot f_{grav}(W_i, W_j)$$

Finally, the present application builds an undirected graph $G = (V, E)$ with the TextRank algorithm, where $V$ is the set of vertices and $E$ is the set of edges, and calculates the score $WS(W_i)$ of vertex $W_i$ according to the following formula:

$$WS(W_i) = (1 - \eta) + \eta \sum_{W_j \in C(W_i)} \frac{\mathrm{weight}(W_j, W_i)}{\sum_{W_k \in C(W_j)} \mathrm{weight}(W_j, W_k)}\, WS(W_j)$$

where $C(W_i)$ is the set of vertices associated with vertex $W_i$ (the set of vertices pointing to that vertex), $\eta$ is the damping coefficient, $W_k$ represents a vertex in the undirected graph $G$, and $WS(W_j)$ is the score of vertex $W_j$. In this embodiment, the several words with the highest scores can be selected as the main words, and the main words can then be semantically enhanced.
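The weighted TextRank iteration can be sketched as follows. The exact forms used for the dependency relatedness (an inverse power of the path length) and the word gravity (TF-IDF product over squared distance) are assumptions about the formulas the text refers to, and all words, TF-IDF values, distances, and path lengths are invented for illustration.

```python
def dependency_relatedness(path_len, b=2.0):
    """Assumed form: relatedness decays as 1 / b ** path_len."""
    return 1.0 / (b ** path_len)


def gravity(tfidf_i, tfidf_j, distance):
    """Assumed form: TF-IDF product divided by squared vector distance."""
    return tfidf_i * tfidf_j / (distance ** 2)


def textrank(edge_weights, damping=0.85, iters=200, tol=1e-8):
    """Weighted TextRank on an undirected graph given as {(u, v): weight}."""
    neighbors = {}
    for (u, v), w in edge_weights.items():
        neighbors.setdefault(u, {})[v] = w
        neighbors.setdefault(v, {})[u] = w
    scores = {v: 1.0 for v in neighbors}
    for _ in range(iters):
        updated = {}
        for v in neighbors:
            # Each neighbor passes on its score in proportion to edge weight.
            incoming = sum(
                w / sum(neighbors[u].values()) * scores[u]
                for u, w in neighbors[v].items()
            )
            updated[v] = (1 - damping) + damping * incoming
        delta = max(abs(updated[v] - scores[v]) for v in neighbors)
        scores = updated
        if delta < tol:
            break
    return scores


# Hypothetical words with invented TF-IDF values, distances and path lengths.
edge_weights = {
    ("phone", "screen"): dependency_relatedness(1) * gravity(0.5, 0.4, 1.0),
    ("phone", "battery"): dependency_relatedness(2) * gravity(0.5, 0.3, 0.5),
}
scores = textrank(edge_weights)
top_word = max(scores, key=scores.get)
```

In this small star-shaped graph the central word receives score from both neighbors, so it ends up ranked highest regardless of the individual edge weights.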

Step S501: Send recommended product information corresponding to the preset theme to the target user.

For example, if the recommended product information corresponding to the preset consumption theme is the information of the m electronic product and the information of the n electronic product, the information of the m electronic product and the information of the n electronic product are sent to the user.

After obtaining the recommended product information corresponding to the preset theme, the recommended product information is sent to the target user, so that information can be accurately recommended to the target user.

The information recommendation method based on data comparison proposed in this embodiment obtains first data of a target user and second data of a comparison user, where the first data and the second data are data related to a preset theme and are homomorphically encrypted data; compares the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result; obtains the ranking of the target user from the sorting result and returns the sorting result to the target user; if the ranking of the target user is a preset ranking, obtains recommended product information corresponding to the preset theme; and sends the recommended product information corresponding to the preset theme to the target user. Since the first data of the target user and the second data of other users are encrypted data, and the data are compared through homomorphic operations, this application keeps the details of the data from being disclosed while the data are compared. At the same time, the users can still be sorted accurately without obtaining the details of their data, and personalized recommendation can then be made according to each user's ranking. Therefore, this application achieves the purpose of both protecting the user's private data and accurately performing personalized recommendation.

This application also provides an information recommendation device based on data comparison. Referring to FIG. 2, it is a schematic diagram of the internal structure of an information recommendation device based on data comparison provided by an embodiment of this application.

In this embodiment, the information recommendation device 1 based on data comparison may be a PC (personal computer), or a terminal device such as a smart phone, a tablet computer, or a portable computer. The information recommendation device 1 based on data comparison includes at least a memory 11, a processor 12, a communication bus 13, and a network interface 14.

The memory 11 includes at least one type of readable storage medium, such as flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, a magnetic disk, or an optical disk. In some embodiments, the memory 11 may be an internal storage unit of the information recommendation device 1 based on data comparison, such as a hard disk of the device. In some other embodiments, the memory 11 may be an external storage device of the information recommendation device 1 based on data comparison, such as a plug-in hard disk, a smart media card (SMC), a Secure Digital (SD) card, or a flash card equipped on the device. Further, the memory 11 may include both an internal storage unit and an external storage device of the information recommendation device 1 based on data comparison. The memory 11 can be used not only to store application software installed in the information recommendation device 1 based on data comparison and various data, such as the code of the information recommendation program 01 based on data comparison, but also to temporarily store data that has been output or is to be output.

In some embodiments, the processor 12 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor or another data processing chip, and is used to run the program code stored in the memory 11 or to process data, for example to execute the information recommendation program 01 based on data comparison. The communication bus 13 is used to realize connection and communication between these components. The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface), and is usually used to establish a communication connection between the device 1 and other electronic devices.

Optionally, the device 1 may also include a user interface. The user interface may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch liquid crystal display, an organic light-emitting diode (OLED) touch device, or the like. The display may also be called a display screen or a display unit, and is used to display the information processed in the information recommendation device 1 based on data comparison and to display a visualized user interface.

FIG. 2 only shows the information recommendation device 1 based on data comparison with components 11-14 and the information recommendation program 01 based on data comparison. Those skilled in the art can understand that the structure shown in FIG. 2 does not limit the information recommendation device 1 based on data comparison, which may include fewer or more components than shown, a combination of certain components, or a different arrangement of components.

In the embodiment of the device 1 shown in FIG. 2, the memory 11 stores the information recommendation program 01 based on data comparison; the processor 12 implements the following steps when executing the information recommendation program 01 based on the data comparison stored in the memory 11:

Obtain first data of the target user and second data of the comparison user, where the first data and the second data are data related to a preset theme and are homomorphically encrypted data.

In this embodiment, the first data and the second data are both homomorphically encrypted data. In an optional embodiment, the first data is homomorphically encrypted by the client of the target user and then transmitted to the data comparison center, and the second data is homomorphically encrypted by the client of the comparison user and then transmitted to the data comparison center.

For any plaintexts M₁ and M₂:

That is, for any plaintexts M₁, M₂, …, Mₙ:
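As a hedged illustration, the homomorphic property referred to here is commonly written as below. The symbols are assumptions rather than notation confirmed by this text: $E$ and $D$ are the encryption and decryption functions, $\oplus$ is an operation on plaintexts, and $\odot$ is the corresponding operation on ciphertexts.

```latex
D\bigl(E(M_1) \odot E(M_2)\bigr) = M_1 \oplus M_2,
\qquad
D\bigl(E(M_1) \odot E(M_2) \odot \cdots \odot E(M_n)\bigr)
  = M_1 \oplus M_2 \oplus \cdots \oplus M_n .
```

For an additively homomorphic scheme, $\oplus$ is ordinary addition, which is what a comparison by subtraction relies on.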

The size of the first data and the second data is compared through a homomorphic operation to obtain a sorting result.
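A minimal sketch of such a comparison, assuming an additively homomorphic scheme. The text does not name a scheme, so textbook Paillier is swapped in here purely for illustration, with toy primes that are far too small for real use. All function names are hypothetical, as is the assumption that a key holder decrypts only the encrypted difference in order to read its sign.

```python
import math
import random
from functools import cmp_to_key


def keygen(p=104729, q=1299709):
    """Toy Paillier key pair (assumption: g = n + 1, tiny demo primes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                               # inverse of lam mod n
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m; negative values are stored modulo n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return ((1 + (m % n) * n) * pow(r, n, n2)) % n2


def decrypt_signed(n, priv, c):
    """Decrypt and map the result into the signed range (-n/2, n/2]."""
    lam, mu = priv
    n2 = n * n
    m = ((pow(c, lam, n2) - 1) // n) * mu % n
    return m - n if m > n // 2 else m


def he_sub(n, c1, c2):
    """Ciphertext of (m1 - m2): multiply c1 by the modular inverse of c2."""
    n2 = n * n
    return (c1 * pow(c2, -1, n2)) % n2


def compare(n, priv, c1, c2):
    """Return 1, -1 or 0 from the sign of the decrypted difference."""
    diff = decrypt_signed(n, priv, he_sub(n, c1, c2))
    return (diff > 0) - (diff < 0)


def rank_users(n, priv, encrypted):
    """Order user ids from the largest to the smallest encrypted value."""
    key = cmp_to_key(lambda u, v: compare(n, priv, encrypted[v], encrypted[u]))
    return sorted(encrypted, key=key)
```

Only the difference is ever decrypted in this sketch, so the individual values stay hidden from whoever performs the ranking; plaintexts must stay well below n/2 in magnitude for the sign to be read correctly.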

Obtain the ranking of the target user from the ranking result, and return the ranking result to the target user.

Returning the encrypted sorting result to the target user.

If the ranking of the target user is a preset ranking, obtain recommended product information corresponding to the preset theme.

Acquiring relevant information about the preset theme;

For example, the preset theme is a consumption theme, and the consumption-related information of the preset theme includes consumption history records (such as information on commodities purchased at different consumption times and locations).

In an embodiment, the similarity can be calculated by the cosine similarity.

The cosine similarity uses the cosine of the angle between two vectors in a vector space as a measure of the difference between two individuals. The closer the cosine value is to 1, the closer the angle is to 0 degrees, that is, the more similar the two vectors are. For the obtained related information X of the theme posted by the customer and the recommended product information Y, the similarity is calculated with the following formula:

$$\cos\theta = \frac{X \cdot Y}{\lVert X \rVert\,\lVert Y \rVert} = \frac{\sum_{i=1}^{n} X_i Y_i}{\sqrt{\sum_{i=1}^{n} X_i^2}\,\sqrt{\sum_{i=1}^{n} Y_i^2}}$$
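The cosine similarity between word frequency vectors can be sketched as follows. This is an illustrative example only: whitespace tokenization, the function names, and the sample threshold are assumptions, not details given in this text.

```python
import math
from collections import Counter


def term_frequency(text):
    """Word frequency vector as a Counter (assumption: whitespace tokens)."""
    return Counter(text.lower().split())


def cosine_similarity(x, y):
    """Cosine of the angle between two sparse word frequency vectors."""
    dot = sum(x[t] * y[t] for t in x.keys() & y.keys())
    norm_x = math.sqrt(sum(v * v for v in x.values()))
    norm_y = math.sqrt(sum(v * v for v in y.values()))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_x * norm_y)


def select_recommendations(related_info, candidates, threshold):
    """Keep candidates whose similarity to the related information
    exceeds the preset similarity threshold."""
    x = term_frequency(related_info)
    return [
        name
        for name, description in candidates.items()
        if cosine_similarity(x, term_frequency(description)) > threshold
    ]
```

For instance, `select_recommendations("phone camera battery", {"m": "phone with camera", "n": "kitchen blender"}, 0.3)` keeps only the candidate that shares vocabulary with the related information.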

Optionally, in another embodiment of the present application, before the related information of the preset theme is mapped to the target dictionary of the preset BOW model, text processing is performed on the related information of the preset theme. The text processing includes performing word segmentation on the related information of the preset theme through a hidden Markov model, and rewriting the text of the word-segmented information through a preset keyword extraction algorithm.

$$P(W_m \mid W_1, \ldots, W_{m-1}) = P(W_m \mid W_1, \ldots, W_{m-n+1}, \ldots, W_{m-1})$$

Therefore, the degree of relevance between the two words is:

$$\mathrm{weight}(W_i, W_j) = \mathrm{Dep}(W_i, W_j) \cdot f_{\mathrm{grav}}(W_i, W_j)$$
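The BOW mapping and naive Bayes classification that this text processing feeds into can be sketched as follows. This is a simplified illustration under stated assumptions: the target dictionary is a fixed word list rather than one obtained by clustering training samples, the classifier is a plain multinomial naive Bayes with Laplace smoothing, and all names and sample data are hypothetical.

```python
import math
from collections import Counter


def to_histogram(tokens, dictionary):
    """Map tokens onto the dictionary to get a histogram feature vector."""
    counts = Counter(t for t in tokens if t in dictionary)
    return [counts[word] for word in dictionary]


class NaiveBayes:
    """Multinomial naive Bayes over histogram feature vectors."""

    def fit(self, histograms, labels):
        self.classes = sorted(set(labels))
        size = len(histograms[0])
        totals = {c: [0] * size for c in self.classes}
        docs = Counter(labels)
        for hist, label in zip(histograms, labels):
            for i, value in enumerate(hist):
                totals[label][i] += value
        self.log_prior = {
            c: math.log(docs[c] / len(labels)) for c in self.classes
        }
        self.log_like = {}
        for c in self.classes:
            denom = sum(totals[c]) + size  # Laplace (add-one) smoothing
            self.log_like[c] = [math.log((v + 1) / denom) for v in totals[c]]
        return self

    def predict(self, hist):
        def score(c):
            return self.log_prior[c] + sum(
                v * l for v, l in zip(hist, self.log_like[c])
            )
        return max(self.classes, key=score)


# Hypothetical dictionary and training samples for a consumption theme.
DICTIONARY = ["phone", "laptop", "camera", "shoes", "jacket", "dress"]
SAMPLES = [
    ("phone camera phone", "electronics"),
    ("laptop phone", "electronics"),
    ("shoes jacket", "clothing"),
    ("dress shoes dress", "clothing"),
]

model = NaiveBayes().fit(
    [to_histogram(text.split(), DICTIONARY) for text, _ in SAMPLES],
    [label for _, label in SAMPLES],
)
```

The predicted category would then be used to look up the product information to be recommended for that category.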

The information recommendation device based on data comparison proposed in this embodiment obtains first data of a target user and second data of a comparison user, where the first data and the second data are data related to a preset theme and are homomorphically encrypted data; compares the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result; obtains the ranking of the target user from the sorting result and returns the sorting result to the target user; if the ranking of the target user is a preset ranking, obtains recommended product information corresponding to the preset theme; and sends the recommended product information corresponding to the preset theme to the target user. Because the first data of the target user and the second data of the other users are encrypted and are compared through homomorphic operations, this application compares the data without disclosing its details. At the same time, users can still be ranked accurately without their detailed data being obtained, and personalized recommendations can then be made according to each user's ranking. This application therefore both protects users' private data and performs personalized recommendation accurately.

Optionally, in other embodiments, the information recommendation program based on data comparison may also be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (in this embodiment, by the processor 12) to implement this application. A module in this application refers to a series of computer program instruction segments capable of completing a specific function, and is used to describe the execution process of the information recommendation program based on data comparison in the information recommendation device based on data comparison.

For example, FIG. 3 is a schematic diagram of the program modules of the information recommendation program based on data comparison in an embodiment of the information recommendation device based on data comparison of this application. In this embodiment, the information recommendation program based on data comparison can be divided into a first acquisition module 10, a comparison module 20, a first transmission module 30, a second acquisition module 40, and a second transmission module 50. Exemplarily:

The first acquisition module 10 is configured to acquire first data of a target user and second data of a comparison user, where the first data and the second data are data related to a preset theme and are homomorphically encrypted data;

The comparison module 20 is configured to compare the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result;

The first transmission module 30 is configured to: obtain the ranking of the target user from the ranking result, and return the ranking result to the target user;

The second acquisition module 40 is configured to: if the ranking of the target user is a preset ranking, acquire recommended product information corresponding to the preset theme;

The second transmission module 50 is configured to send recommended product information corresponding to the preset theme to the target user.

The functions or operation steps implemented when program modules such as the first acquisition module 10, the comparison module 20, the first transmission module 30, the second acquisition module 40, and the second transmission module 50 are executed are substantially the same as those in the foregoing embodiment, and are not repeated here.
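How the five modules could chain together is sketched below. This is a hypothetical skeleton, not the program's actual structure: the class name, the in-memory `store`, the product lookup, and the use of plain integers with an injected `compare` function as a stand-in for homomorphically encrypted data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from functools import cmp_to_key
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class RecommendationProgram:
    compare: Callable[[object, object], int]   # used by comparison module 20
    preset_rankings: Set[int]                  # rankings that trigger a recommendation
    products_by_theme: Dict[str, List[str]]    # lookup for second acquisition module 40

    def acquire_data(self, store, theme):
        """First acquisition module 10: per-user data for the preset theme."""
        return dict(store[theme])

    def rank(self, data):
        """Comparison module 20: 1-based ranking, largest value first."""
        order = sorted(
            data,
            key=cmp_to_key(lambda u, v: self.compare(data[v], data[u])),
        )
        return {user: pos for pos, user in enumerate(order, start=1)}

    def run(self, store, theme, target) -> Tuple[int, List[str]]:
        data = self.acquire_data(store, theme)
        ranking = self.rank(data)
        position = ranking[target]             # first transmission module 30 returns this
        products: List[str] = []
        if position in self.preset_rankings:   # second acquisition module 40
            products = list(self.products_by_theme.get(theme, []))
        return position, products              # second transmission module 50 sends products


# Usage with plain integers standing in for ciphertexts.
program = RecommendationProgram(
    compare=lambda a, b: (a > b) - (a < b),
    preset_rankings={1},
    products_by_theme={"consumption": ["m electronic product", "n electronic product"]},
)
store = {"consumption": {"alice": 5200, "bob": 4700, "carol": 6100}}
```

In a real deployment the injected `compare` would operate on ciphertexts through a homomorphic operation, so the program itself never sees the underlying values.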

In addition, the present application also provides a computer device, including: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the one or more computer programs are configured to execute the information recommendation method based on data comparison described in the foregoing embodiments.

The specific implementation of the computer equipment of this application is basically the same as the foregoing embodiments of the information recommendation device and method based on data comparison, and will not be repeated here.

In addition, an embodiment of the present application also proposes a computer-readable storage medium that stores an information recommendation program based on data comparison, and the information recommendation program based on data comparison can be executed by one or more processors to implement the operations of the information recommendation method based on data comparison described in the foregoing embodiments.

The computer-readable storage medium of the present application may be a volatile storage medium or a non-volatile storage medium. Its specific implementation is basically the same as the foregoing embodiments of the information recommendation device and method based on data comparison, and is not repeated here.

It should be noted that the serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments. The terms "include", "comprise" and any other variants thereof in this document are intended to cover non-exclusive inclusion, so that a process, device, article or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, device, article, or method. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, device, article or method that includes the element.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus the necessary general hardware platform. They can of course also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as a ROM/RAM, magnetic disk, or optical disk) and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, network device, etc.) to execute the method described in each embodiment of the present invention.

The above are only preferred embodiments of the present invention and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present invention.

Claims

1. An information recommendation method based on data comparison, wherein the method includes:

Obtaining first data of a target user and second data of a comparison user, where the first data and the second data are data related to a preset theme and are homomorphically encrypted data;

Comparing the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result;

Acquiring the ranking of the target user from the ranking result, and returning the ranking result to the target user;

If the ranking of the target user is a preset ranking, obtaining recommended product information corresponding to the preset theme;

Sending recommended product information corresponding to the preset theme to the target user.
2. The information recommendation method based on data comparison according to claim 1, wherein said obtaining recommended product information corresponding to said preset theme comprises:

Acquiring relevant information about the preset theme;

Mapping the relevant information of the preset theme to the target dictionary of the preset BOW model to obtain the target histogram feature vector, the target dictionary being obtained by clustering processing of training samples;

Inputting the target histogram feature vector into the naive Bayes classifier used to construct the preset BOW model, and classifying the relevant information of the preset theme with the naive Bayes classifier to obtain the category of the relevant information of the preset theme;

Obtaining the product information to be recommended corresponding to the category of the related information of the preset theme;

It is determined that the product information to be recommended corresponding to the category of the related information of the preset theme is the recommended product information corresponding to the preset theme.
3. The information recommendation method based on data comparison according to claim 2, wherein the determining that the product information corresponding to the category of the related information of the preset theme is the recommended product information corresponding to the preset theme comprises:

Performing word frequency feature vector extraction on the relevant information of the preset theme to obtain a word frequency vector;

Calculating the similarity between the word frequency vector and the product information to be recommended;

It is determined that among the product information to be recommended, the product information whose similarity with the word frequency vector is greater than the preset similarity is the recommended product information corresponding to the preset theme.
4. The information recommendation method based on data comparison according to claim 2, wherein before the mapping the related information of the preset theme to the target dictionary of the preset BOW model, the method further comprises:

Performing text processing on the related information of the preset theme, the text processing including performing word segmentation on the related information of the preset theme through a hidden Markov model, and rewriting the text of the word-segmented information through a preset keyword extraction algorithm.
5. The information recommendation method based on data comparison according to any one of claims 1 to 4, wherein the comparing the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result comprises:

Adding the first data to the negative of the second data to obtain a first calculation result; if the first calculation result is a positive number, obtaining a sorting result in which the first data is greater than the second data, and if the first calculation result is a negative number, obtaining a sorting result in which the first data is smaller than the second data; or

Adding the negative of the first data to the second data to obtain a second calculation result; if the second calculation result is a positive number, obtaining a sorting result in which the first data is smaller than the second data, and if the second calculation result is a negative number, obtaining a sorting result in which the first data is greater than the second data.
6. The information recommendation method based on data comparison according to any one of claims 1 to 4, wherein the returning the ranking result to the target user comprises:

Encrypt the sorting result by using the received public key sent by the target user to obtain the encrypted sorting result;

Returning the encrypted sorting result to the target user.
7. An information recommendation device based on data comparison, wherein the device includes a memory and a processor, the memory stores an information recommendation program based on data comparison that can be run on the processor, and when the information recommendation program based on data comparison is executed by the processor, the following steps are implemented:

Obtaining first data of a target user and second data of a comparison user, where the first data and the second data are data related to a preset theme and are homomorphically encrypted data;

Comparing the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result;

Acquiring the ranking of the target user from the ranking result, and returning the ranking result to the target user;

If the ranking of the target user is a preset ranking, obtaining recommended product information corresponding to the preset theme;

Sending recommended product information corresponding to the preset theme to the target user.
8. The information recommendation device based on data comparison according to claim 7, wherein said obtaining recommended product information corresponding to said preset theme comprises:

Acquiring relevant information about the preset theme;

Mapping the relevant information of the preset theme to the target dictionary of the preset BOW model to obtain the target histogram feature vector, the target dictionary being obtained by clustering processing of training samples;

Inputting the target histogram feature vector into the naive Bayes classifier used to construct the preset BOW model, and classifying the relevant information of the preset theme with the naive Bayes classifier to obtain the category of the relevant information of the preset theme;

Obtaining the product information to be recommended corresponding to the category of the related information of the preset theme;

It is determined that the product information to be recommended corresponding to the category of the related information of the preset theme is the recommended product information corresponding to the preset theme.
9. The information recommendation device based on data comparison according to claim 7 or 8, wherein the comparing the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result comprises:

Adding the first data to the negative of the second data to obtain a first calculation result; if the first calculation result is a positive number, obtaining a sorting result in which the first data is greater than the second data, and if the first calculation result is a negative number, obtaining a sorting result in which the first data is smaller than the second data; or

Adding the negative of the first data to the second data to obtain a second calculation result; if the second calculation result is a positive number, obtaining a sorting result in which the first data is smaller than the second data, and if the second calculation result is a negative number, obtaining a sorting result in which the first data is greater than the second data.
10. A computer device including:

One or more processors;

A memory; and

One or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the one or more computer programs are configured to execute an information recommendation method based on data comparison, wherein the information recommendation method based on data comparison includes:

Obtaining first data of a target user and second data of a comparison user, where the first data and the second data are data related to a preset theme and are homomorphically encrypted data;

Comparing the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result;

Acquiring the ranking of the target user from the ranking result, and returning the ranking result to the target user;

If the ranking of the target user is a preset ranking, obtaining recommended product information corresponding to the preset theme;

Sending recommended product information corresponding to the preset theme to the target user.
11. The computer device according to claim 10, wherein said obtaining recommended product information corresponding to said preset theme comprises:

Acquiring relevant information about the preset theme;

Mapping the relevant information of the preset theme to the target dictionary of the preset BOW model to obtain the target histogram feature vector, the target dictionary being obtained by clustering processing of training samples;

Inputting the target histogram feature vector into the naive Bayes classifier used to construct the preset BOW model, and classifying the relevant information of the preset theme with the naive Bayes classifier to obtain the category of the relevant information of the preset theme;

Obtaining the product information to be recommended corresponding to the category of the related information of the preset theme;

It is determined that the product information to be recommended corresponding to the category of the related information of the preset theme is the recommended product information corresponding to the preset theme.
12. The computer device according to claim 11, wherein the determining that the product information corresponding to the category of the related information of the preset theme is the recommended product information corresponding to the preset theme comprises:

Performing word frequency feature vector extraction on the relevant information of the preset theme to obtain a word frequency vector;

Calculating the similarity between the word frequency vector and the product information to be recommended;

It is determined that among the product information to be recommended, the product information whose similarity with the word frequency vector is greater than the preset similarity is the recommended product information corresponding to the preset theme.
13. The computer device according to claim 11, wherein before the mapping the related information of the preset theme to the target dictionary of the preset BOW model, the method further comprises:

Performing text processing on the related information of the preset theme, the text processing including performing word segmentation on the related information of the preset theme through a hidden Markov model, and rewriting the text of the word-segmented information through a preset keyword extraction algorithm.
14. The computer device according to any one of claims 10 to 13, wherein the comparing the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result comprises:

Adding the first data to the negative of the second data to obtain a first calculation result; if the first calculation result is a positive number, obtaining a sorting result in which the first data is greater than the second data, and if the first calculation result is a negative number, obtaining a sorting result in which the first data is smaller than the second data; or

Adding the negative of the first data to the second data to obtain a second calculation result; if the second calculation result is a positive number, obtaining a sorting result in which the first data is smaller than the second data, and if the second calculation result is a negative number, obtaining a sorting result in which the first data is greater than the second data.
15. The computer device according to any one of claims 10 to 13, wherein the returning the sorting result to the target user comprises:

Encrypt the sorting result by using the received public key sent by the target user to obtain the encrypted sorting result;

Returning the encrypted sorting result to the target user.
16. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, an information recommendation method based on data comparison is implemented, wherein the information recommendation method based on data comparison includes the following steps:

Obtaining first data of a target user and second data of a comparison user, where the first data and the second data are data related to a preset theme and are homomorphically encrypted data;

Comparing the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result;

Acquiring the ranking of the target user from the ranking result, and returning the ranking result to the target user;

If the ranking of the target user is a preset ranking, obtaining recommended product information corresponding to the preset theme;

Sending recommended product information corresponding to the preset theme to the target user.
17. The computer-readable storage medium according to claim 16, wherein said obtaining recommended product information corresponding to said preset theme comprises:

Acquiring relevant information about the preset theme;

Mapping the relevant information of the preset theme to the target dictionary of the preset BOW model to obtain the target histogram feature vector, the target dictionary being obtained by clustering processing of training samples;

Inputting the target histogram feature vector into the naive Bayes classifier used to construct the preset BOW model, and classifying the relevant information of the preset theme with the naive Bayes classifier to obtain the category of the relevant information of the preset theme;

Obtaining the product information to be recommended corresponding to the category of the related information of the preset theme;

It is determined that the product information to be recommended corresponding to the category of the related information of the preset theme is the recommended product information corresponding to the preset theme.
18. The computer-readable storage medium of claim 17, wherein the determining that the product information corresponding to the category of the related information of the preset theme is the recommended product information corresponding to the preset theme comprises:

Performing word frequency feature vector extraction on the relevant information of the preset theme to obtain a word frequency vector;

Calculating the similarity between the word frequency vector and the product information to be recommended;

It is determined that among the product information to be recommended, the product information whose similarity with the word frequency vector is greater than the preset similarity is the recommended product information corresponding to the preset theme.
19. The computer-readable storage medium according to claim 17, wherein before the mapping the relevant information of the preset theme to the target dictionary of the preset BOW model, the method further comprises:

Performing text processing on the related information of the preset theme, the text processing including performing word segmentation on the related information of the preset theme through a hidden Markov model, and rewriting the text of the word-segmented information through a preset keyword extraction algorithm.
20. The computer-readable storage medium according to any one of claims 16 to 19, wherein the comparing the sizes of the first data and the second data through a homomorphic operation to obtain a sorting result comprises:

Adding the first data to the negative of the second data to obtain a first calculation result; if the first calculation result is a positive number, obtaining a sorting result in which the first data is greater than the second data, and if the first calculation result is a negative number, obtaining a sorting result in which the first data is smaller than the second data; or

Adding the negative of the first data to the second data to obtain a second calculation result; if the second calculation result is a positive number, obtaining a sorting result in which the first data is smaller than the second data, and if the second calculation result is a negative number, obtaining a sorting result in which the first data is greater than the second data.