CN109672706B - Information recommendation method and device, server and storage medium - Google Patents

Information recommendation method and device, server and storage medium

Info

Publication number
CN109672706B
CN109672706B · CN201710960175.0A
Authority
CN
China
Prior art keywords
knowledge point
chain
target document
word
document
Prior art date
Legal status
Active
Application number
CN201710960175.0A
Other languages
Chinese (zh)
Other versions
CN109672706A (en)
Inventor
许瑾
Current Assignee
Baidu Online Network Technology Beijing Co Ltd
Original Assignee
Baidu Online Network Technology Beijing Co Ltd
Priority date
Filing date
Publication date
Application filed by Baidu Online Network Technology Beijing Co Ltd filed Critical Baidu Online Network Technology Beijing Co Ltd
Priority to CN201710960175.0A priority Critical patent/CN109672706B/en
Publication of CN109672706A publication Critical patent/CN109672706A/en
Application granted granted Critical
Publication of CN109672706B publication Critical patent/CN109672706B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/50 Network services
    • H04L 67/55 Push-based network services

Abstract

The embodiment of the invention discloses an information recommendation method, an information recommendation device, a server and a storage medium. The method comprises the following steps: determining the relevance weight of a target document and each word chain knowledge point according to the co-occurrence word chain knowledge points of each word chain knowledge point in all documents, the reverse file frequency of each word chain knowledge point in all documents, and the word chain knowledge points contained in the target document; determining the word chain knowledge points to be recommended for the target document according to the relevance weight of the target document and each word chain knowledge point; and pushing information according to the word chain knowledge points to be recommended. The technical scheme provided by the embodiment of the invention improves the accuracy of information recommendation, increases the stickiness between users and the pushed information, and thus improves the user experience.

Description

Information recommendation method and device, server and storage medium
Technical Field
The invention relates to the technical field of internet application, in particular to an information recommendation method, an information recommendation device, a server and a storage medium.
Background
In recent years, with the rapid development of the internet, the internet has been flooded with massive amounts of content, and how to present better content to users and help them find what they want is one of the problems that urgently needs to be solved.
Currently, common recommendation methods include collaborative filtering, a classic recommendation algorithm that decides which items should be recommended to a user based on the user's preference data for items, and latent factor analysis, a classic derivative of recommendation systems. These two common recommendation methods have achieved great success in the field of internet advertising.
However, collaborative filtering often suffers from the cold start problem due to a lack of user interaction data; for example, when a new user has no browsing or purchasing records, the user's characteristics cannot be described and recommended items cannot be matched. The latent factor analysis model, in turn, is complex and computationally demanding, and the decomposed factors are abstract, unreadable vectors, so it cannot meet the fast iteration requirements of internet products. In addition, building word chains by entity mining is inefficient.
Disclosure of Invention
The embodiment of the invention provides an information recommendation method, an information recommendation device, a server and a storage medium, which can improve the accuracy of information recommendation, increase the stickiness between users and pushed information, and improve the user experience.
In a first aspect, an embodiment of the present invention provides an information recommendation method, where the method includes:
determining the relevance weight of the target document and each character chain knowledge point according to the co-occurrence character chain knowledge point of each character chain knowledge point in all documents, the reverse file frequency of each character chain knowledge point in all documents and the character chain knowledge point contained in the target document;
determining the word chain knowledge points to be recommended of the target document according to the relevance weight of the target document and each word chain knowledge point;
and pushing information according to the word chain knowledge points to be recommended.
In a second aspect, an embodiment of the present invention further provides an information recommendation apparatus, where the apparatus includes:
the relevance weight determining module is used for determining the relevance weight of the target document and each character chain knowledge point according to the co-occurrence character chain knowledge point of each character chain knowledge point in all documents, the reverse file frequency of each character chain knowledge point in all documents and the character chain knowledge point contained in the target document;
the word chain to be recommended determining module is used for determining word chain knowledge points to be recommended of the target document according to the relevance weight of the target document and each word chain knowledge point;
and the information pushing module is used for pushing information according to the word chain knowledge points to be recommended.
In a third aspect, an embodiment of the present invention further provides a server, where the server includes:
one or more processors;
storage means for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the information recommendation method of any of the first aspects.
In a fourth aspect, an embodiment of the present invention further provides a storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the information recommendation method according to any one of the first aspects.
According to the information recommendation method, device, server and storage medium provided by the embodiments of the invention, the relevance weight of the target document and each word chain knowledge point is determined according to the co-occurrence word chain knowledge points of each word chain knowledge point in all documents, the reverse file frequency of each word chain knowledge point in all documents and the word chain knowledge points contained in the target document; the word chain knowledge points to be recommended for the target document are determined according to the relevance weights and pushed to the user. This solves the problems of the conventional recommendation methods, such as cold start, complex weight determination models and low efficiency of word chain construction, improves the accuracy of information recommendation, increases the stickiness between users and the pushed information, and thus improves the user experience.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings:
fig. 1 is a flowchart of an information recommendation method according to a first embodiment of the present invention;
fig. 2 is a flowchart of an information recommendation method provided in the second embodiment of the present invention;
fig. 3 is a flowchart of a method for determining a knowledge point of a text chain included in a document based on NLP and a knowledge graph according to a third embodiment of the present invention;
fig. 4 is a block diagram of an information recommendation apparatus according to a fourth embodiment of the present invention;
fig. 5 is a schematic structural diagram of a server provided in the fifth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some but not all of the relevant aspects of the present invention are shown in the drawings.
Example one
Fig. 1 is a flowchart of an information recommendation method according to the first embodiment of the present invention. This embodiment is applicable to the case where word chains based on NLP and a knowledge graph are applied on a library product line to help users find the documents and content they want. The method can be executed by the information recommendation device/server/computer-readable storage medium provided by the embodiments of the present invention, and the device/server/computer-readable storage medium can be implemented in software and/or hardware. Referring to fig. 1, the method specifically includes:
s110, determining the relevance weight of the target document and each character chain knowledge point according to the co-occurrence character chain knowledge point of each character chain knowledge point in all documents, the reverse file frequency of each character chain knowledge point in all documents and the character chain knowledge point contained in the target document.
Here, NLP (Natural Language Processing) is a subfield of Artificial Intelligence (AI) that studies theories and methods for enabling effective communication between humans and computers in natural language. A knowledge graph is a semantic network with entities and concepts as nodes and semantic relations as edges; it is essentially a semantic network formed by interconnected knowledge points. In general, a knowledge graph is a relational network obtained by connecting all kinds of information together. Knowledge graphs provide the ability to analyze problems from a "relational" perspective and make knowledge acquisition more direct.
The word chain knowledge points are the knowledge points selected from all the knowledge points contained in a document to serve as word chains. Word chains address the user's need to further explore, within the document content, the knowledge the user wants to learn about. Word chains can improve the distribution efficiency of content: by clicking a word chain, the user enters a next web page containing documents related to the word chain knowledge point as well as explanations of that knowledge. The word chains, which may be marked in green, can be derived from entities in an educational knowledge graph. If one word chain knowledge point and another word chain knowledge point appear in the same document, the two are co-occurrence word chain knowledge points. The Inverse Document Frequency (IDF) is a measure of the general importance of a term; the IDF of a particular term can be obtained by dividing the total number of documents by the number of documents containing the term and taking the logarithm of the quotient. The target document may be the Kth of all documents, for example one document selected from ten million documents.
S120, determining the word chain knowledge points to be recommended of the target document according to the relevance weight of the target document and each word chain knowledge point.
The word chain knowledge points to be recommended are those capable of expressing more of the user's potential interests. The obtained relevance weights of the target document and each word chain knowledge point are processed, for example by sorting the relevance weights and, according to the sorting result, screening out the word chain knowledge points that can express the user's intention, which are then pushed to the user as word chain knowledge points to be recommended.
Illustratively, S120 may specifically include: sorting the word chain knowledge points according to the relevance weight of the target document and each word chain knowledge point; filtering out the word chain knowledge points already contained in the target document from the sorting result; and, from the filtered result, selecting a preset number of word chain knowledge points with the highest relevance weights as the word chain knowledge points to be recommended for the target document. The preset number may be set as needed, for example 6. To capture the user's potential needs, the word chain knowledge points already contained in the target document are removed, and from the remaining word chain knowledge points those with the highest weights, such as the top 6 in descending order of weight, are selected as the word chain knowledge points to be recommended.
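For illustration only, a minimal Python sketch of this sort-filter-select step might look as follows; the function and variable names are hypothetical and not part of the patent:

    def select_chain_points_to_recommend(weights, doc_chain_points, top_n=6):
        """Pick the word chain knowledge points to recommend for one target document.

        weights          -- dict mapping word chain knowledge point -> relevance weight
        doc_chain_points -- set of word chain knowledge points already in the document
        top_n            -- preset number of knowledge points to recommend (e.g. 6)
        """
        # Sort all word chain knowledge points by relevance weight, highest first.
        ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
        # Filter out knowledge points the target document already contains.
        candidates = [(p, w) for p, w in ranked if p not in doc_chain_points]
        # Keep the preset number with the highest relevance weights.
        return [p for p, _ in candidates[:top_n]]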
And S130, carrying out information push according to the word chain knowledge points to be recommended.
The information pushing mode may be displaying on the right side of the search result display area. Specifically, the content related to the word chain knowledge point to be recommended, such as pictures, news or articles, can be found in the knowledge graph or the library according to the word chain knowledge point to be recommended, and the content is displayed on the right side of the search result display area so as to be recommended to the user.
According to the information recommendation method provided by the embodiment of the invention, the relevance weight of the target document and each word chain knowledge point is determined according to the co-occurrence word chain knowledge points of each word chain knowledge point in all documents, the reverse file frequency of each word chain knowledge point in all documents and the word chain knowledge points contained in the target document; the word chain knowledge points to be recommended for the target document are determined according to the relevance weights and pushed to the user. This solves the problems of the conventional recommendation methods, such as cold start, complex weight determination models and low efficiency of word chain construction, improves the accuracy of information recommendation, increases the stickiness between users and the pushed information, and thus improves the user experience.
Example two
Fig. 2 is a flowchart of an information recommendation method according to a second embodiment of the present invention, which is based on the first embodiment of the present invention and further provides a method for determining a relevance weight between a target document and each text chain knowledge point according to a co-occurrence text chain knowledge point of each text chain knowledge point in all documents, a reverse file frequency of each text chain knowledge point in all documents, and a text chain knowledge point included in the target document. Correspondingly, the method comprises the following steps:
s210, constructing a co-occurrence matrix according to the co-occurrence character chain knowledge points of the character chain knowledge points in all the documents.
If one word chain knowledge point and the other word chain knowledge point occur together in one document, the two word chain knowledge points are the co-occurrence word chain knowledge points. Taking the existing ten-million documents as an example, the co-occurrence matrix G can be obtained by counting the co-occurrence text chains in the ten-million documents for each text chain knowledge point.
For example, constructing the co-occurrence matrix according to the co-occurrence word chain knowledge points of each word chain knowledge point in all the documents may include: counting all the documents to determine the co-occurrence word chain knowledge points of each word chain knowledge point; if the jth word chain knowledge point and the kth word chain knowledge point appear together in a document, G_jk in the co-occurrence matrix takes 1; otherwise, G_jk takes 0, where j = 1, …, N, k = 1, …, N, and N is the number of all word chain knowledge points.
For example, taking N = 1,300,000, according to the above rule a 1,300,000 × 1,300,000 co-occurrence matrix G can be obtained, which can be expressed as a matrix of 0/1 entries:

G = [G_jk], j = 1, …, N, k = 1, …, N, with each G_jk taking the value 0 or 1
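A minimal sketch of how such a co-occurrence matrix could be assembled is given below (Python/NumPy). The point_index mapping, the dense int8 matrix and the treatment of the diagonal are illustrative assumptions; at the scale of 1.3 million knowledge points a sparse representation would be required in practice.

    import numpy as np

    def build_cooccurrence_matrix(doc_chain_points, point_index):
        """Build the 0/1 co-occurrence matrix G.

        doc_chain_points -- iterable of sets; each set holds the word chain
                            knowledge points found in one document
        point_index      -- dict mapping each word chain knowledge point to an
                            index 0..N-1 (a hypothetical helper, not from the patent)
        """
        n = len(point_index)
        G = np.zeros((n, n), dtype=np.int8)       # dense only for illustration
        for points in doc_chain_points:
            idx = [point_index[p] for p in points]
            for j in idx:
                for k in idx:
                    if j != k:                    # diagonal handling is not specified
                        G[j, k] = 1               # G_jk = 1 if j and k co-occur
        return G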
s220, determining a reverse file vector according to the reverse file frequency of each character chain knowledge point in all the documents.
The reverse file frequency IDF of a word chain knowledge point can be obtained by dividing the total number of documents by the number of documents containing that word chain knowledge point and taking the logarithm. The reverse file vector, denoted IDF below, is the vector formed by the reverse file frequencies of all the word chain knowledge points. For example, its element IDF_n1 may be determined according to the following formula:

IDF_n1 = log(M / E_n)

where M is the total number of all documents, E_n is the number of documents containing the nth word chain knowledge point, n = 1, …, N, and N is the number of all word chain knowledge points. Taking N = 1,300,000, a 1,300,000 × 1 column vector is obtained.
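For illustration, the reverse file vector could be computed along the following lines. This is a sketch that assumes a natural logarithm, since the text does not fix the base, and reuses the hypothetical point_index mapping from the sketch above.

    import numpy as np

    def build_idf_vector(doc_chain_points, point_index, total_docs):
        """Compute the reverse file vector, one entry per word chain knowledge point."""
        n = len(point_index)
        doc_counts = np.zeros(n)                  # E_n: documents containing point n
        for points in doc_chain_points:
            for p in points:
                doc_counts[point_index[p]] += 1
        # IDF_n = log(M / E_n); assumes every indexed point occurs in at least one document
        return np.log(total_docs / doc_counts)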
s230, determining a word chain correlation matrix of the target document according to the word chain knowledge points contained in the target document.
Illustratively, the word chain correlation matrix of the target document may be determined as follows: if the target document contains the ith word chain knowledge point, X_1i in the word chain correlation matrix of the target document takes 1; otherwise, X_1i takes 0, where i = 1, …, N, and N is the number of all word chain knowledge points. For example, if there are ten million documents and 1,300,000 word chain knowledge points, the word chain correlation matrix X over all documents is a 10,000,000 × 1,300,000 matrix of 0/1 entries. Taking the Kth of the ten million documents as the target document, its word chain correlation matrix X_K (the Kth row of X) can be expressed as:

X_K = [0 0 1 0 0 0 0 … 0 1]
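A sketch of building the word chain correlation row X_K for one target document, again using the hypothetical point_index mapping:

    import numpy as np

    def build_correlation_row(target_points, point_index):
        """Build X_K: a length-N 0/1 row, 1 where the target document contains that point."""
        x = np.zeros(len(point_index), dtype=np.int8)
        for p in target_points:
            x[point_index[p]] = 1
        return x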
s240, determining the relevance vector of the target document and each character chain knowledge point according to the character chain relevance matrix, the co-occurrence matrix and the reverse file vector of the target document.
For example, the relevance vector of the target document and each word chain knowledge point can be determined according to the following formula:

W = (X · G) × IDF

where W is the relevance vector of the target document and each word chain knowledge point, X is the word chain correlation matrix of the target document, G is the co-occurrence matrix, and IDF is the reverse file vector. Here "·" denotes the matrix product and "×" denotes the element-wise product of two vectors.

When the Kth document is taken as the target document, steps S210, S220 and S230 are executed to obtain the co-occurrence matrix G, the reverse file vector IDF and the correlation matrix X_K, which are substituted into the above formula to obtain the relevance vector W_K. Taking N = 1,300,000, the specific operation process is as follows:

First, the 1 × N matrix X_K is multiplied with the N × N matrix G to obtain the 1 × N matrix X'_K, namely:

X'_K = X_K · G

Then, to facilitate the subsequent element-wise multiplication, the matrix X'_K is transposed into an N × 1 column vector.

Finally, the relevance vector is obtained as the element-wise product of the two vectors, namely:

W_K = X'_K × IDF

each element of which corresponds to one word chain knowledge point.
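Putting the pieces together, the computation of S240 reduces to one matrix product followed by an element-wise product. A minimal sketch under the assumptions of the earlier sketches:

    import numpy as np

    def relevance_vector(x_k, G, idf):
        """Compute W_K = (X_K . G) x IDF, element-wise in the last step.

        x_k -- length-N 0/1 row for the target document (X_K)
        G   -- N x N co-occurrence matrix
        idf -- length-N reverse file vector
        """
        x_prime = x_k @ G        # X'_K: for each knowledge point, how many of the
                                 # document's points co-occur with it
        return x_prime * idf     # element-wise product with the reverse file vector

Each entry of the returned vector is then read, as in S250, as the relevance weight of the target document with respect to the corresponding word chain knowledge point.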
and S250, determining the relevance weight of the target document and each character chain knowledge point according to the relevance vector of the target document and each character chain knowledge point.
Each value in the relevance vector W_K obtained in step S240 represents the relevance weight of the target document to the corresponding word chain knowledge point.
And S260, determining the word chain knowledge points to be recommended of the target document according to the relevance weight of the target document and each word chain knowledge point.
And S270, carrying out information push according to the word chain knowledge points to be recommended.
According to the information recommendation method provided by this embodiment of the invention, the relevance vector of the target document and each word chain knowledge point is determined according to the word chain correlation matrix of the target document, the co-occurrence matrix and the reverse file vector, giving the relevance weight of the target document and each word chain knowledge point; the word chain knowledge points to be recommended for the target document are determined according to the relevance weights and pushed to the user. This solves the problem that existing weight determination models are complex, improves the accuracy of information recommendation, increases the stickiness between users and the pushed information, and thus improves the user experience.
EXAMPLE III
Fig. 3 is a flowchart of a method for determining a word chain knowledge point included in a document based on NLP and a knowledge graph according to a third embodiment of the present invention, where the method is based on the above-mentioned embodiment of the present invention, and the specific method may include the following steps:
s310, taking the product of the word frequency of each knowledge point in the document and the reverse file frequency of each knowledge point contained in the document as the correlation degree of each knowledge point and the document.
Here, the word frequency refers to the number of times a given knowledge point occurs in the document and can be represented by C(d, e), where d represents the document and e represents the knowledge point in the knowledge graph; the reverse file frequency of each knowledge point can be represented by IDF_e. The correlation degree refers to the degree to which two things are related to each other and is commonly used to evaluate the importance of a word to one document in a collection or corpus, i.e., here, the degree of association between a given knowledge point and the document in which it occurs. Expressed as the product of the word frequency and the reverse file frequency, it can be written as: C(d, e) × IDF_e.
S320, determining the weight of each knowledge point according to the correlation degree of each knowledge point and the document, the information amount of each knowledge point and the similarity of each knowledge point and the document title.
The information amount is a measure of the amount of information, and the information amount of each knowledge point can be determined according to the following formula:

I_e = log_2(len(e))

where e is a knowledge point, I_e is the information amount of e, and len(e) is the length of e.
The specific operation process for determining the similarity between each knowledge point and the document title is as follows: extract the subject terms in the document title and calculate the semantic similarity between the subject terms and each knowledge point, expressed in mathematical symbols as sim(t, e), where t represents the title of document d.

The weight of each knowledge point is the proportion of that knowledge point in the document and is represented by Q(d, e). According to the correlation degree C(d, e) × IDF_e of each knowledge point with the document, the information amount I_e = log_2(len(e)) of each knowledge point, and the similarity sim(t, e) of each knowledge point with the document title, the weight Q(d, e) of each knowledge point can be expressed as:

Q(d, e) = α(C(d, e) × IDF_e) + β·log_2(len(e)) + θ·sim(t, e)

where α, β and θ are tuning parameters whose values differ slightly for documents in different fields.
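As a sketch only, the weight Q(d, e) could be computed as follows; the tuning parameters α, β, θ and the similarity value sim(t, e) are passed in as placeholders, since their concrete values and computation are left open here:

    import math

    def knowledge_point_weight(tf, idf_e, length_e, title_sim,
                               alpha=1.0, beta=1.0, theta=1.0):
        """Q(d, e) = alpha*(C(d, e)*IDF_e) + beta*log2(len(e)) + theta*sim(t, e).

        tf        -- C(d, e): number of times knowledge point e occurs in document d
        idf_e     -- reverse file frequency of e
        length_e  -- len(e): length of the knowledge point e
        title_sim -- sim(t, e): semantic similarity between e and the document title
        """
        relevance = tf * idf_e                # degree of correlation with the document
        information = math.log2(length_e)     # information amount I_e
        return alpha * relevance + beta * information + theta * title_sim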
S330, screening the knowledge points contained in the document according to the weight of the knowledge points, and taking the screened knowledge points as character chain knowledge points.
The knowledge points are sorted from largest to smallest according to the weights calculated in step S320, and the knowledge points within a preset range are selected as word chain knowledge points. The preset range is user-defined according to the actual situation, for example 10 or 15; for instance, the 10 knowledge points with the largest weights are selected as word chain knowledge points.
According to the method for determining the character chain knowledge points in the document based on the NLP and the knowledge graph, provided by the embodiment of the invention, the weight of each knowledge point is determined according to the correlation degree of each knowledge point and the document, the information content of each knowledge point and the similarity of each knowledge point and the title of the document; and screening the knowledge points contained in the document according to the weight of the knowledge points to determine the text chain knowledge points in the document. The problem that the efficiency of constructing the character chain is low in the conventional recommendation method is solved.
Example four
Fig. 4 is a block diagram of an information recommendation apparatus according to a fourth embodiment of the present invention, which is capable of executing an information recommendation method according to any embodiment of the present invention, and has functional modules and beneficial effects corresponding to the execution method. As shown in fig. 4, the apparatus may include:
a relevance weight determining module 401, configured to determine a relevance weight of the target document and each text chain knowledge point according to a co-occurrence text chain knowledge point of each text chain knowledge point in all documents, a reverse file frequency of each text chain knowledge point in all documents, and a text chain knowledge point included in the target document;
a module 402 for determining a word chain to be recommended, configured to determine a word chain knowledge point to be recommended for the target document according to the relevance weight between the target document and each word chain knowledge point;
and an information pushing module 403, configured to push information according to the word chain knowledge point to be recommended.
According to the information recommendation device provided by the embodiment of the invention, the relevance weight of the target document and each word chain knowledge point is determined according to the co-occurrence word chain knowledge points of each word chain knowledge point in all documents, the reverse file frequency of each word chain knowledge point in all documents and the word chain knowledge points contained in the target document; the word chain knowledge points to be recommended for the target document are determined according to the relevance weights and pushed to the user. This solves the problems of the conventional recommendation methods, such as cold start, complex weight determination models and low efficiency of word chain construction, improves the accuracy of information recommendation, increases the stickiness between users and the pushed information, and thus improves the user experience.
Illustratively, the relevance weight determining module 401 includes:
the co-occurrence matrix construction unit is used for constructing a co-occurrence matrix according to the co-occurrence character chain knowledge points of the character chain knowledge points in all the documents;
the reverse file vector determining unit is used for determining a reverse file vector according to the reverse file frequency of each character chain knowledge point in all the documents;
the correlation matrix determining unit is used for determining a word chain correlation matrix of the target document according to word chain knowledge points contained in the target document;
a relevance vector determining unit, configured to determine a relevance vector between the target document and each text chain knowledge point according to the text chain relevance matrix of the target document, the co-occurrence matrix, and the reverse file vector;
and the relevance weight determining unit is used for determining the relevance weight of the target document and each character chain knowledge point according to the relevance vector of the target document and each character chain knowledge point.
Optionally, the correlation matrix determining unit may be specifically configured to:
if the target document K contains the ith word chain knowledge point, the word chain correlation matrix of the target document
Figure BDA0001435059110000131
Taking 1; if not, then,
Figure BDA0001435059110000132
take 0, where i ═ 1, …, N, is the number of knowledge points for all literal chains.
Optionally, the co-occurrence matrix constructing unit may specifically be configured to:
counting all the documents to determine co-occurrence text chain knowledge points of the text chain knowledge points;
if the jth word chain knowledge point and the kth word chain knowledge point appear together in a document, G_jk in the co-occurrence matrix takes 1; otherwise, G_jk takes 0, where j = 1, …, N, k = 1, …, N, and N is the number of all word chain knowledge points.
Optionally, the inverse file vector determining unit may be specifically configured to:
determining IDF_n1 in the reverse file vector according to the following formula:

IDF_n1 = log(M / E_n)

where M is the total number of all documents, E_n is the number of documents containing the nth word chain knowledge point, n = 1, …, N, and N is the number of all word chain knowledge points.
Optionally, the correlation vector determination unit may be specifically configured to:
determine the relevance vector of the target document and each word chain knowledge point according to the word chain correlation matrix of the target document, the co-occurrence matrix and the reverse file vector, namely according to the following formula:

W_K = (X_K · G) × IDF

where W_K is the relevance vector of the target document K and each word chain knowledge point, X_K is the word chain correlation matrix of the target document K, G is the co-occurrence matrix, and IDF is the reverse file vector.
For example, the module 402 for determining word chain to be recommended may include:
the word chain knowledge point sequencing unit is used for sequencing the word chain knowledge points according to the relevance weight of the target document and each word chain knowledge point;
the word chain knowledge point filtering unit is used for filtering word chain knowledge points contained in the target document from the sequencing result;
and the word chain to be recommended determining unit is used for selecting a preset number of word chain knowledge points with high relevance weight as the word chain knowledge points to be recommended of the target document according to the filtering result.
Optionally, the apparatus may further include a text chain knowledge point determining module; the word chain knowledge point determination module may be specifically configured to:
taking the product of the word frequency of each knowledge point in the document and the reverse file frequency of each knowledge point contained in the document as the correlation degree of each knowledge point and the document;
determining the weight of each knowledge point according to the correlation degree of each knowledge point and the document, the information content of each knowledge point and the similarity of each knowledge point and the title of the document;
and screening the knowledge points contained in the document according to the weight of the knowledge points, and taking the screened knowledge points as character chain knowledge points.
Optionally, the information amount of each knowledge point is determined according to the following formula:
I_e = log_2(len(e))

where e is a knowledge point, I_e is the information amount of e, and len(e) is the length of e.
EXAMPLE five
Fig. 5 is a schematic structural diagram of a server according to a fifth embodiment of the present invention. FIG. 5 illustrates a block diagram of an exemplary server 12 suitable for use in implementing embodiments of the present invention. The server 12 shown in fig. 5 is only an example, and should not bring any limitation to the function and the scope of use of the embodiment of the present invention.
As shown in fig. 5, the server 12 is in the form of a general purpose computing device. The components of the server 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
The server 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by server 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)30 and/or cache memory 32. The server 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 5 and commonly referred to as a "hard drive"). Although not shown in FIG. 5, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
The server 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with the device, and/or with any devices (e.g., network card, modem, etc.) that enable the server 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, the server 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via the network adapter 20. As shown, the network adapter 20 communicates with the other modules of the server 12 via the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the server 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing, for example, implementing an information recommendation method provided by an embodiment of the present invention, by executing a program stored in the system memory 28.
EXAMPLE six
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, can implement any one of the information recommendation methods in the foregoing embodiments.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer-readable storage medium may be, for example but not limited to: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The above example numbers are for description only and do not represent the merits of the examples.
It will be understood by those skilled in the art that the modules or steps of the invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of computing devices, and optionally they may be implemented by program code executable by a computing device, such that it may be stored in a memory device and executed by a computing device, or it may be separately fabricated into various integrated circuit modules, or it may be fabricated by fabricating a plurality of modules or steps thereof into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts in the embodiments are referred to each other.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (12)

1. An information recommendation method, comprising:
determining the relevance weight of the target document and each character chain knowledge point according to the co-occurrence character chain knowledge point of each character chain knowledge point in all documents, the reverse file frequency of each character chain knowledge point in all documents and the character chain knowledge point contained in the target document;
determining the word chain knowledge points to be recommended of the target document according to the relevance weight of the target document and each word chain knowledge point;
carrying out information pushing according to the word chain knowledge points to be recommended, wherein the word chain knowledge points are a part selected from all the knowledge points contained in the document;
determining the relevance weight of the target document and each character chain knowledge point according to the co-occurrence character chain knowledge point of each character chain knowledge point in all documents, the reverse file frequency of each character chain knowledge point in all documents and the character chain knowledge point contained in the target document, wherein the relevance weight comprises the following steps:
constructing a co-occurrence matrix according to the co-occurrence text chain knowledge points of the text chain knowledge points in all the documents;
determining a reverse file vector according to the reverse file frequency of each character chain knowledge point in all the documents;
determining a word chain correlation matrix of a target document according to word chain knowledge points contained in the target document;
determining a correlation vector of the target document and each word chain knowledge point according to the word chain correlation matrix of the target document, the co-occurrence matrix and the reverse file vector;
determining the relevance weight of the target document and each character chain knowledge point according to the relevance vector of the target document and each character chain knowledge point;
wherein, the determining the relevance vector of the target document and each word chain knowledge point according to the word chain relevance matrix of the target document, the co-occurrence matrix and the reverse file vector comprises:
determining a relevance vector of the target document and each word chain knowledge point according to the following formula:
W = (X · G) × IDF

wherein W is the relevance vector of the target document and each word chain knowledge point, X is the word chain correlation matrix of the target document, G is the co-occurrence matrix, and IDF is the reverse file vector.
2. The method of claim 1, wherein determining a word chain correlation matrix of a target document according to word chain knowledge points contained in the target document comprises:
if the target document contains the ith word chain knowledge point, X_1i in the word chain correlation matrix of the target document takes 1; otherwise, X_1i takes 0, where i = 1, …, N, and N is the number of all word chain knowledge points.
3. The method of claim 1, wherein constructing a co-occurrence matrix from co-occurrence word chain knowledge points of each word chain knowledge point in all documents comprises:
counting all the documents to determine co-occurrence text chain knowledge points of the text chain knowledge points;
if the jth word chain knowledge point and the kth word chain knowledge point appear together in a document, G_jk in the co-occurrence matrix takes 1; otherwise, G_jk takes 0, where j = 1, …, N, k = 1, …, N, and N is the number of all word chain knowledge points.
4. The method of claim 1, wherein determining a reverse document vector based on the reverse document frequency of each word chain knowledge point in all documents comprises:
determining IDF_n1 in the reverse file vector according to the following formula:

IDF_n1 = log(M / E_n)

where M is the total number of all documents, E_n is the number of documents containing the nth word chain knowledge point, n = 1, …, N, and N is the number of all word chain knowledge points.
5. The method of claim 1, wherein determining the word chain knowledge points to be recommended for the target document according to the relevance weight of the target document and each word chain knowledge point comprises:
sequencing the knowledge points of each character chain according to the relevance weight of the target document and the knowledge points of each character chain;
filtering out word chain knowledge points contained in the target document from the sequencing result;
and selecting a preset number of character chain knowledge points with high relevance weight as character chain knowledge points to be recommended of the target document according to the filtering result.
6. The method of claim 1, wherein determining word-chain knowledge points contained in the document comprises:
taking the product of the word frequency of each knowledge point in the document and the reverse file frequency of each knowledge point contained in the document as the correlation degree of each knowledge point and the document;
determining the weight of each knowledge point according to the correlation degree of each knowledge point and the document, the information content of each knowledge point and the similarity of each knowledge point and the title of the document;
and screening the knowledge points contained in the document according to the weight of the knowledge points, and taking the screened knowledge points as character chain knowledge points.
7. The method of claim 6, comprising:
the information amount of each knowledge point is determined according to the following formula:
I_e = log_2(len(e))

wherein e is a knowledge point, I_e is the information amount of e, and len(e) is the length of e.
8. An information recommendation apparatus, comprising:
the relevance weight determining module is used for determining the relevance weight of the target document and each character chain knowledge point according to the co-occurrence character chain knowledge point of each character chain knowledge point in all documents, the reverse file frequency of each character chain knowledge point in all documents and the character chain knowledge point contained in the target document;
the word chain to be recommended determining module is used for determining word chain knowledge points to be recommended of the target document according to the relevance weight of the target document and each word chain knowledge point;
the information pushing module is used for pushing information according to the word chain knowledge points to be recommended, wherein the word chain knowledge points are a part selected from all the knowledge points contained in a document;
the correlation weight determination module includes:
the co-occurrence matrix construction unit is used for constructing a co-occurrence matrix according to the co-occurrence character chain knowledge points of the character chain knowledge points in all the documents;
the reverse file vector determining unit is used for determining a reverse file vector according to the reverse file frequency of each character chain knowledge point in all the documents;
the correlation matrix determining unit is used for determining a word chain correlation matrix of the target document according to word chain knowledge points contained in the target document;
a relevance vector determining unit, configured to determine a relevance vector between the target document and each text chain knowledge point according to the text chain relevance matrix of the target document, the co-occurrence matrix, and the reverse file vector;
the relevance weight determining unit is used for determining the relevance weight of the target document and each character chain knowledge point according to the relevance vector of the target document and each character chain knowledge point;
wherein the correlation vector determination unit is specifically configured to:
determining a relevance vector of the target document and each word chain knowledge point according to the following formula:
W = (X · G) × IDF

wherein W is the relevance vector of the target document and each word chain knowledge point, X is the word chain correlation matrix of the target document, G is the co-occurrence matrix, and IDF is the reverse file vector.
9. The apparatus of claim 8, wherein the text chain to be recommended determining module comprises:
the word chain knowledge point sequencing unit is used for sequencing the word chain knowledge points according to the relevance weight of the target document and the word chain knowledge points;
the word chain knowledge point filtering unit is used for filtering word chain knowledge points contained in the target document from the sequencing result;
and the word chain to be recommended determining unit is used for selecting a preset number of word chain knowledge points with high relevance weight as the word chain knowledge points to be recommended of the target document according to the filtering result.
10. The apparatus of claim 9, further comprising a text chain knowledge point determination module; the word chain knowledge point determination module is specifically configured to:
taking the product of the word frequency of each knowledge point in the document and the reverse file frequency of each knowledge point contained in the document as the correlation degree of each knowledge point and the document;
determining the weight of each knowledge point according to the correlation degree of each knowledge point and the document, the information content of each knowledge point and the similarity of each knowledge point and the title of the document;
and screening the knowledge points contained in the document according to the weight of the knowledge points, and taking the screened knowledge points as character chain knowledge points.
11. A server, characterized in that the server comprises:
one or more processors;
storage means for storing one or more programs;
when executed by the one or more processors, cause the one or more processors to implement the information recommendation method of any of claims 1-7.
12. A storage medium on which a computer program is stored, which program, when being executed by a processor, is adapted to carry out the information recommendation method according to any one of claims 1-7.
CN201710960175.0A 2017-10-16 2017-10-16 Information recommendation method and device, server and storage medium Active CN109672706B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710960175.0A CN109672706B (en) 2017-10-16 2017-10-16 Information recommendation method and device, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710960175.0A CN109672706B (en) 2017-10-16 2017-10-16 Information recommendation method and device, server and storage medium

Publications (2)

Publication Number Publication Date
CN109672706A CN109672706A (en) 2019-04-23
CN109672706B true CN109672706B (en) 2022-06-14

Family

ID=66139318

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710960175.0A Active CN109672706B (en) 2017-10-16 2017-10-16 Information recommendation method and device, server and storage medium

Country Status (1)

Country Link
CN (1) CN109672706B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111611344B (en) * 2020-05-06 2023-06-13 北京智通云联科技有限公司 Complex attribute query method, system and equipment based on dictionary and knowledge graph
CN112329964B (en) * 2020-11-24 2024-03-29 北京百度网讯科技有限公司 Method, device, equipment and storage medium for pushing information
CN112434173B (en) * 2021-01-26 2021-04-20 浙江口碑网络技术有限公司 Search content output method and device, computer equipment and readable storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101432714A (en) * 2004-09-14 2009-05-13 A9.Com, Inc. Methods and apparatus for automatic generation of recommended links
CN104239298A (en) * 2013-06-06 2014-12-24 Tencent Technology (Shenzhen) Co., Ltd. Text message recommendation method, server, browser and system
CN104965902A (en) * 2015-06-30 2015-10-07 北京奇虎科技有限公司 Enriched URL (uniform resource locator) recognition method and apparatus
CN105653737A (en) * 2016-03-01 2016-06-08 广州神马移动信息科技有限公司 Method, equipment and electronic equipment for content document sorting
CN105740460A (en) * 2016-02-24 2016-07-06 中国科学技术信息研究所 Webpage collection recommendation method and device
CN105808636A (en) * 2016-02-03 2016-07-27 北京中搜云商网络技术有限公司 APP information data based hypertext link pushing system
CN106339407A (en) * 2016-08-09 2017-01-18 百度在线网络技术(北京)有限公司 Processing method and device for message containing URL (uniform resource locator) address in IM (instant messaging)
CN106815297A (en) * 2016-12-09 2017-06-09 宁波大学 A kind of academic resources recommendation service system and method
CN107220352A (en) * 2017-05-31 2017-09-29 北京百度网讯科技有限公司 The method and apparatus that comment collection of illustrative plates is built based on artificial intelligence

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170139899A1 (en) * 2015-11-18 2017-05-18 Le Holdings (Beijing) Co., Ltd. Keyword extraction method and electronic device

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101432714A (en) * 2004-09-14 2009-05-13 A9.Com, Inc. Methods and apparatus for automatic generation of recommended links
CN104239298A (en) * 2013-06-06 2014-12-24 Tencent Technology (Shenzhen) Co., Ltd. Text message recommendation method, server, browser and system
CN104965902A (en) * 2015-06-30 2015-10-07 北京奇虎科技有限公司 Enriched URL (uniform resource locator) recognition method and apparatus
CN105808636A (en) * 2016-02-03 2016-07-27 北京中搜云商网络技术有限公司 APP information data based hypertext link pushing system
CN105740460A (en) * 2016-02-24 2016-07-06 中国科学技术信息研究所 Webpage collection recommendation method and device
CN105653737A (en) * 2016-03-01 2016-06-08 广州神马移动信息科技有限公司 Method, equipment and electronic equipment for content document sorting
CN106339407A (en) * 2016-08-09 2017-01-18 百度在线网络技术(北京)有限公司 Processing method and device for message containing URL (uniform resource locator) address in IM (instant messaging)
CN106815297A (en) * 2016-12-09 2017-06-09 宁波大学 A kind of academic resources recommendation service system and method
CN107220352A (en) * 2017-05-31 2017-09-29 北京百度网讯科技有限公司 The method and apparatus that comment collection of illustrative plates is built based on artificial intelligence

Also Published As

Publication number Publication date
CN109672706A (en) 2019-04-23

Similar Documents

Publication Publication Date Title
CN107220352B (en) Method and device for constructing comment map based on artificial intelligence
US9645999B1 (en) Adjustment of document relationship graphs
CN108287864B (en) Interest group dividing method, device, medium and computing equipment
Ding et al. Entity discovery and assignment for opinion mining applications
US8620934B2 (en) Systems and methods for selecting data elements, such as population members, from a data source
CN108140018A (en) Creation is used for the visual representation of text based document
CN110390052B (en) Search recommendation method, training method, device and equipment of CTR (China train redundancy report) estimation model
CN109672706B (en) Information recommendation method and device, server and storage medium
CN111522886B (en) Information recommendation method, terminal and storage medium
CN110546633A (en) Named entity based category tag addition for documents
US20150169740A1 (en) Similar image retrieval
Weisser et al. Pseudo-document simulation for comparing LDA, GSDMM and GPM topic models on short and sparse text using Twitter data
KR101494795B1 (en) Method for representing document as matrix
CN110008396B (en) Object information pushing method, device, equipment and computer readable storage medium
US10296624B2 (en) Document curation
Becheru et al. Tourist review analytics using complex networks
Hosseinabadi et al. ISSE: a new iterative sentence scoring and extraction scheme for automatic text summarization
Lamothe Comparing usage patterns recorded between an electronic reference and an electronic monograph collection: The differences in searches and full-text content viewings
CN111222918B (en) Keyword mining method and device, electronic equipment and storage medium
CN109408725B (en) Method and apparatus for determining user interest
CN114090891A (en) Personalized content recommendation method, device, equipment and storage medium
CN113722593A (en) Event data processing method and device, electronic equipment and medium
CN112231444A (en) Processing method and device for corpus data combining RPA and AI and electronic equipment
CN111310016B (en) Label mining method, device, server and storage medium
CN104809165A (en) Determination method and equipment for relevancy of multi-media document

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant