CN109672706A

CN109672706A - Information recommendation method, device, server and storage medium

Info

Publication number: CN109672706A
Application number: CN201710960175.0A
Authority: CN
Inventors: 许瑾
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2017-10-16
Filing date: 2017-10-16
Publication date: 2019-04-23
Anticipated expiration: 2037-10-16
Also published as: CN109672706B

Abstract

Embodiments of the invention disclose an information recommendation method, device, server and storage medium. The method comprises: determining the relevance weight between a target document and each text chain knowledge point according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents, the inverse document frequency (IDF) of each text chain knowledge point in all documents, and the text chain knowledge points contained in the target document; determining the text chain knowledge points to be recommended for the target document according to those relevance weights; and pushing information according to the text chain knowledge points to be recommended. The technical solution improves the accuracy of information recommendation and the stickiness between users and pushed information, and thereby improves the user experience.

Description

Information recommendation method, device, server and storage medium

Technical field

The present invention relates to the technical field of Internet applications, and in particular to an information recommendation method, device, server and storage medium.

Background technique

In recent years, with the growth of the Internet, massive amounts of content have flooded the network. How to present better content to users and let them find what they are looking for is a problem in urgent need of a solution.

Currently, common recommendation methods include the following. Collaborative filtering is a classic recommendation algorithm that uses data on users' preferences for items to decide which items to recommend to a user. Latent factor analysis is another classic school of recommender systems. Both methods have achieved great success in the field of Internet advertising.

However, collaborative filtering often suffers from the cold-start problem caused by a lack of user interaction data. For example, a new user with no browsing or purchase records cannot be characterized, so recommended items cannot be matched to that user. Latent factor models are complex and computationally demanding, and the factors they produce are abstract vectors that cannot be read by humans, so they cannot keep up with the rapid iteration of Internet products. In addition, constructing text chains with methods based on entity mining is inefficient.

Summary of the invention

Embodiments of the present invention provide an information recommendation method, device, server and storage medium that can improve the accuracy of information recommendation, increase the stickiness between users and pushed information, and improve the user experience.

In a first aspect, an embodiment of the present invention provides an information recommendation method, comprising:

determining the relevance weight between a target document and each text chain knowledge point according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents, the inverse document frequency of each text chain knowledge point in all documents, and the text chain knowledge points contained in the target document;

determining the text chain knowledge points to be recommended for the target document according to the relevance weight between the target document and each text chain knowledge point;

pushing information according to the text chain knowledge points to be recommended.

In a second aspect, an embodiment of the present invention further provides an information recommendation device, comprising:

a relevance weight determining module, configured to determine the relevance weight between a target document and each text chain knowledge point according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents, the inverse document frequency of each text chain knowledge point in all documents, and the text chain knowledge points contained in the target document;

a to-be-recommended text chain determining module, configured to determine the text chain knowledge points to be recommended for the target document according to the relevance weight between the target document and each text chain knowledge point;

an information push module, configured to push information according to the text chain knowledge points to be recommended.

In a third aspect, an embodiment of the present invention further provides a server, comprising:

one or more processors;

a storage device for storing one or more programs;

wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement any information recommendation method of the first aspect.

In a fourth aspect, an embodiment of the present invention further provides a storage medium storing a computer program which, when executed by a processor, implements any information recommendation method of the first aspect.

The information recommendation method, device, server and storage medium provided by embodiments of the present invention determine the relevance weight between the target document and each text chain knowledge point according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents, the inverse document frequency of each text chain knowledge point in all documents, and the text chain knowledge points contained in the target document. The text chain knowledge points to be recommended for the target document are then determined from the relevance weights and pushed to the user. This addresses the cold-start problem, the complexity of weight determination models and the inefficiency of text chain construction in existing recommendation methods, improves the accuracy of information recommendation and the stickiness between users and pushed information, and thereby improves the user experience.

Brief description of the drawings

Other features, objects and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:

Fig. 1 is a flowchart of an information recommendation method provided in Embodiment one of the present invention;

Fig. 2 is a flowchart of an information recommendation method provided in Embodiment two of the present invention;

Fig. 3 is a flowchart of a method, provided in Embodiment three of the present invention, for determining the text chain knowledge points contained in a document based on NLP and a knowledge graph;

Fig. 4 is a structural block diagram of an information recommendation device provided in Embodiment four of the present invention;

Fig. 5 is a schematic structural diagram of a server provided in Embodiment five of the present invention.

Specific embodiment

The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than all of its content.

Embodiment one

Fig. 1 is a flowchart of an information recommendation method provided by Embodiment one of the present invention. This embodiment applies to the case where, on a library-type product line, text chains based on NLP and a knowledge graph are used to help users find the documents and content they want. The method can be executed by the information recommendation device, server or computer-readable storage medium provided by embodiments of the present invention, each of which can be implemented in software and/or hardware. Referring to Fig. 1, the method specifically includes:

S110, determining the relevance weight between the target document and each text chain knowledge point according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents, the inverse document frequency of each text chain knowledge point in all documents, and the text chain knowledge points contained in the target document.

Here, NLP (Natural Language Processing) is a subfield of artificial intelligence (AI) that studies the theories and methods enabling effective communication between people and computers in natural language. A knowledge graph is a semantic network whose nodes are entities and concepts and whose edges are semantic relations; in essence, it is a semantic network of interconnected knowledge points. Put simply, a knowledge graph is the relational network obtained by linking all kinds of different information together. It provides the ability to analyze problems from the perspective of relationships and makes knowledge acquisition more direct.

A text chain knowledge point is one of the subset of knowledge points, selected from all the knowledge points contained in a document, that serve as text chains. A text chain addresses the user's need for further explanation of knowledge in the document content that the user wants to understand. Text chains improve the efficiency of content distribution: clicking a text chain takes the user to another web page, which contains documents related to the text chain knowledge point and an explanation of that knowledge. Text chains highlighted in green can be drawn from entities in an educational knowledge graph. If one text chain knowledge point appears in the same document as another, the two are co-occurrence text chain knowledge points of each other. Inverse document frequency (IDF) measures the general importance of a word: the IDF of a particular word is obtained by dividing the total number of documents by the number of documents containing the word and taking the logarithm of the quotient. The target document can be the K-th document among all documents, for example one document selected from 10 million documents.

S120, determining the text chain knowledge points to be recommended for the target document according to the relevance weight between the target document and each text chain knowledge point.

Here, a text chain knowledge point to be recommended is one that expresses what the user may want to learn more about, that is, the user's latent intent. The relevance weights obtained between the target document and each text chain knowledge point are processed, for example sorted, and the text chain knowledge points that express the user's intent are selected from the sorted result as the text chain knowledge points to be recommended and pushed to the user.

Illustratively, S120 may specifically include: sorting the text chain knowledge points by their relevance weight with the target document; filtering out of the sorted result the text chain knowledge points already contained in the target document; and, from the filtered result, selecting a preset number of text chain knowledge points with the highest relevance weights as the text chain knowledge points to be recommended for the target document. The preset number can be set as required, for example 6. Because the aim is to surface what the user may still want to learn, the text chain knowledge points already contained in the target document are removed, and then, for example, the top 6 of the remaining text chain knowledge points in descending order of weight are selected as the text chain knowledge points to be recommended.
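The sort, filter and select steps of S120 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `select_recommendations`, the dictionary layout and the sample weights are assumptions made for the example and do not come from the patent.

```python
def select_recommendations(weights, in_target_document, preset_number=6):
    """Return the knowledge points to recommend for one target document.

    weights: dict mapping knowledge point -> relevance weight (hypothetical layout).
    in_target_document: set of knowledge points the document already contains.
    """
    # Step 1: sort all knowledge points by relevance weight, highest first.
    ranked = sorted(weights, key=weights.get, reverse=True)
    # Step 2: filter out knowledge points already present in the target document.
    remaining = [kp for kp in ranked if kp not in in_target_document]
    # Step 3: keep the preset number of highest-weighted remaining points.
    return remaining[:preset_number]


# Made-up sample weights, for illustration only.
weights = {"matrix": 2.1, "vector": 1.7, "determinant": 0.9, "eigenvalue": 1.2}
print(select_recommendations(weights, {"matrix"}, preset_number=2))
# prints ['vector', 'eigenvalue']
```

Filtering after sorting mirrors the order of the steps in the text; filtering first would give the same result.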

S130, pushing information according to the text chain knowledge points to be recommended.

The information can be pushed, for example, by displaying it on the right side of the search result display area. Specifically, content associated with the text chain knowledge points to be recommended, such as pictures, news or articles, can be looked up in the knowledge graph or the library and displayed on the right side of the search result display area as recommendations to the user.

The information recommendation method provided by this embodiment determines the relevance weight between the target document and each text chain knowledge point according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents, the inverse document frequency of each text chain knowledge point in all documents, and the text chain knowledge points contained in the target document. It then determines the text chain knowledge points to be recommended for the target document from the relevance weights and pushes them to the user. This addresses the cold-start problem, the complexity of weight determination models and the inefficiency of text chain construction in existing recommendation methods, improves the accuracy of information recommendation and the stickiness between users and pushed information, and thereby improves the user experience.

Embodiment two

Fig. 2 is a flowchart of an information recommendation method provided by Embodiment two of the present invention. Building on Embodiment one, it further describes how to determine the relevance weight between the target document and each text chain knowledge point according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents, the inverse document frequency of each text chain knowledge point in all documents, and the text chain knowledge points contained in the target document. Accordingly, the method includes:

S210, constructing a co-occurrence matrix according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents.

Here, if one text chain knowledge point appears together with another in one document, the two are co-occurrence text chain knowledge points of each other. Taking 10 million existing documents as an example, compiling co-occurrence statistics for each text chain knowledge point over the 10 million documents yields the co-occurrence matrix G.

Illustratively, constructing the co-occurrence matrix according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents may include: counting over all documents to determine the co-occurrence text chain knowledge points of each text chain knowledge point; if the j-th text chain knowledge point and the k-th text chain knowledge point appear together in one document, G_jk in the co-occurrence matrix takes 1; otherwise, G_jk takes 0, where j = 1, ..., N, k = 1, ..., N, and N is the number of all text chain knowledge points.

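For example, taking N = 1.3 million and applying the rule above yields a 1,300,000 × 1,300,000 co-occurrence matrix G, which can be expressed as:

$$G = \begin{bmatrix} G_{11} & G_{12} & \cdots & G_{1N} \\ G_{21} & G_{22} & \cdots & G_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ G_{N1} & G_{N2} & \cdots & G_{NN} \end{bmatrix}, \quad G_{jk} \in \{0, 1\}$$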
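The rule for filling G can be sketched as follows. This is a minimal sketch under stated assumptions: each document is represented as a set of 0-based knowledge point indices, the function name `build_cooccurrence_matrix` is hypothetical, and the diagonal is left at 0 because the text does not say how a knowledge point relates to itself.

```python
from itertools import combinations


def build_cooccurrence_matrix(documents, n_points):
    """Build the N x N 0/1 co-occurrence matrix G.

    documents: list of sets of 0-based knowledge point indices (assumed layout).
    """
    G = [[0] * n_points for _ in range(n_points)]
    for points in documents:
        # Every unordered pair found in the same document co-occurs.
        for j, k in combinations(sorted(points), 2):
            G[j][k] = 1
            G[k][j] = 1  # co-occurrence is mutual, so G is symmetric
    return G


# Three toy documents over four knowledge points (made-up data).
documents = [{0, 1}, {1, 2}, {3}]
G = build_cooccurrence_matrix(documents, n_points=4)
for row in G:
    print(row)
```

A dense list-of-lists is only practical for toy sizes; at the 1.3 million knowledge points mentioned in the text, a sparse representation would be needed.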

S220, determining the inverse document frequency (IDF) vector according to the inverse document frequency of each text chain knowledge point in all documents.

Here, the inverse document frequency (IDF) of a text chain knowledge point can be obtained by dividing the total number of documents by the number of documents containing that text chain knowledge point and taking the logarithm of the quotient. The IDF vector is the vector formed by the IDFs of all text chain knowledge points and can be expressed as $\overrightarrow{IDF} = [IDF_{11}, IDF_{21}, \ldots, IDF_{N1}]^{T}$.

Illustratively, each entry IDF_n1 of the IDF vector can be determined by the following formula:

$$IDF_{n1} = \log\frac{M}{E_n}$$

where M is the total number of documents, E_n is the number of documents containing the n-th text chain knowledge point, n = 1, ..., N, and N is the number of all text chain knowledge points. For N = 1.3 million, this gives a 1,300,000 × 1 vector.
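Under the definition given earlier (the total number of documents divided by the number of documents containing the knowledge point, then the logarithm of the quotient), the IDF vector can be sketched as below. The natural logarithm, the value 0.0 for a knowledge point that appears in no document, and the name `idf_vector` are assumptions of this sketch; the text does not specify them.

```python
import math


def idf_vector(documents, n_points):
    """Return [IDF_1, ..., IDF_N] with IDF_n = log(M / E_n).

    documents: list of sets of 0-based knowledge point indices (assumed layout).
    """
    total_documents = len(documents)  # M
    containing = [0] * n_points       # E_n for each knowledge point
    for points in documents:
        for n in points:
            containing[n] += 1
    # Assumption: a point that occurs nowhere gets IDF 0.0 to avoid division by zero.
    return [math.log(total_documents / e) if e else 0.0 for e in containing]


# Made-up data: point 1 occurs in two of three documents, the others in one.
idf = idf_vector([{0, 1}, {1, 2}, {3}], n_points=4)
print(idf)
```

Knowledge points that occur in fewer documents receive a larger IDF, which is what lets rare knowledge points outweigh ubiquitous ones in the later weighting step.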

S230, determining the text chain correlation matrix of the target document according to the text chain knowledge points contained in the target document.

Illustratively, the text chain correlation matrix of the target document can be determined as follows: if the target document contains the i-th text chain knowledge point, X_1i in the text chain correlation matrix of the target document takes 1; otherwise, X_1i takes 0, where i = 1, ..., N and N is the number of all text chain knowledge points. For example, with 10 million documents (M) and 1.3 million text chain knowledge points (N) in total, the text chain correlation matrix X over all documents is a 10,000,000 × 1,300,000 matrix:

$$X = \begin{bmatrix} X_{11} & X_{12} & \cdots & X_{1N} \\ \vdots & \vdots & \ddots & \vdots \\ X_{M1} & X_{M2} & \cdots & X_{MN} \end{bmatrix}$$

If the K-th of the 10 million documents is taken as the target document, its text chain correlation matrix X_K can be expressed as:

X_K = [0 0 1 0 0 0 0 ... 0 1]
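The 0/1 row for a single target document follows directly from the rule above. The sketch below assumes the document is given as a set of 0-based knowledge point indices; the name `indicator_row` is hypothetical.

```python
def indicator_row(document_points, n_points):
    """Return the 1 x N row X_K: 1 where the document contains point i, else 0."""
    return [1 if i in document_points else 0 for i in range(n_points)]


# A toy document containing the third and the last of seven knowledge points.
print(indicator_row({2, 6}, n_points=7))
# prints [0, 0, 1, 0, 0, 0, 1]
```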

S240, determining the correlation vector of the target document and each text chain knowledge point according to the text chain correlation matrix of the target document, the co-occurrence matrix and the IDF vector.

Illustratively, the correlation vector of the target document and each text chain knowledge point can be determined by the following formula:

$$\text{correlation vector} = (X \cdot G) \times \overrightarrow{IDF}$$

where X is the text chain correlation matrix of the target document, G is the co-occurrence matrix, and $\overrightarrow{IDF}$ is the IDF vector. Here · denotes the product of two matrices and × denotes the product of two vectors (cross multiplication).

When the K-th document is taken as the target document, executing steps S210, S220 and S230 yields the co-occurrence matrix G, the IDF vector and the correlation matrix X_K; substituting them into the formula above yields the correlation vector.

Taking N = 1.3 million, the matrix X_K is multiplied by the matrix G to give the matrix X'_K, which is then converted into a vector and multiplied by the IDF vector. The specific procedure is as follows:

First, the product of matrix X_K and matrix G gives the matrix X'_K, that is: X'_K = X_K · G

Then, to make the subsequent vector multiplication easier, the 1 × N matrix X'_K is converted into an N × 1 column vector.

Finally, the product of this column vector and the IDF vector is taken, which gives the correlation vector.
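The three steps can be sketched in plain Python. One interpretation is assumed here and should be read as such: because the result holds one relevance weight per knowledge point, the final "product of two vectors" is taken entry by entry (an element-wise product). The function name `correlation_vector` and the sample numbers are hypothetical.

```python
def correlation_vector(x_row, G, idf):
    """Compute (X_K . G) followed by an entry-by-entry product with the IDF vector.

    x_row: 1 x N 0/1 row of the target document.
    G: N x N co-occurrence matrix as a list of lists.
    idf: list of N IDF values.
    """
    n = len(idf)
    # Step 1: X'_K = X_K . G, a 1 x N row times an N x N matrix.
    # Entry k counts how many of the document's points co-occur with point k.
    xg = [sum(x_row[j] * G[j][k] for j in range(n)) for k in range(n)]
    # Steps 2 and 3: treat X'_K as a column vector and scale each entry by IDF.
    # Assumption: the vector product is element-wise.
    return [xg[k] * idf[k] for k in range(n)]


# Toy inputs (made-up): the document contains points 0 and 1.
G = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
x_row = [1, 1, 0, 0]
idf = [1.0, 0.5, 2.0, 1.0]
print(correlation_vector(x_row, G, idf))
# prints [1.0, 0.5, 2.0, 0.0]
```

In this toy case point 2 is not in the document but co-occurs with point 1 and has a high IDF, so it receives the largest weight and would be the first candidate for recommendation.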

S250, determining the relevance weight between the target document and each text chain knowledge point according to the correlation vector of the target document and each text chain knowledge point.

The correlation vector of the target document and each text chain knowledge point is obtained in step S240; each value in it represents the relevance weight between the target document and the corresponding text chain knowledge point.

S260, determining the text chain knowledge points to be recommended for the target document according to the relevance weight between the target document and each text chain knowledge point.

S270, pushing information according to the text chain knowledge points to be recommended.

The information recommendation method provided by this embodiment determines the correlation vector of the target document and each text chain knowledge point according to the text chain correlation matrix of the target document, the co-occurrence matrix and the IDF vector, and thereby obtains the relevance weight between the target document and each text chain knowledge point. The text chain knowledge points to be recommended for the target document are then determined from the relevance weights and pushed to the user. This addresses the complexity of existing weight determination models, improves the accuracy of information recommendation and the stickiness between users and pushed information, and thereby improves the user experience.

Embodiment three

Fig. 3 is a flowchart of a method, provided by Embodiment three of the present invention, for determining the text chain knowledge points contained in a document based on NLP and a knowledge graph. Building on the above embodiments, the method may specifically include:

S310, taking the product of the term frequency of each knowledge point in the document that contains it and the inverse document frequency of that knowledge point as the degree of correlation between the knowledge point and the document.

Here, term frequency is the number of times a given knowledge point occurs in the document and can be written C(d, e), where d denotes the document and e denotes a knowledge point in the knowledge graph; the inverse document frequency of each knowledge point can be written IDF_e. The degree of correlation expresses how strongly two things are connected and is commonly used to assess how important a word is to one document in a document set or corpus. Here it measures the association between a given knowledge point and the document containing it, and is expressed as the product of term frequency and inverse document frequency: C(d, e) × IDF_e.
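The product C(d, e) × IDF_e can be illustrated with a short sketch. Counting occurrences with `str.count`, using the natural logarithm for IDF, and the function names are assumptions of this example; a real system would more likely match knowledge points after word segmentation or entity linking.

```python
import math


def term_frequency(document_text, knowledge_point):
    """C(d, e): number of non-overlapping occurrences of e in d (assumed counting rule)."""
    return document_text.count(knowledge_point)


def degree_of_correlation(document_text, knowledge_point, total_docs, docs_with_point):
    """C(d, e) x IDF_e, with IDF_e = log(total_docs / docs_with_point)."""
    idf = math.log(total_docs / docs_with_point)
    return term_frequency(document_text, knowledge_point) * idf


# Made-up document and corpus statistics.
text = "A matrix times a matrix is a matrix."
score = degree_of_correlation(text, "matrix", total_docs=1000, docs_with_point=10)
print(score)
```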

S320, determining the weight of each knowledge point according to the degree of correlation between the knowledge point and the document, the information content of the knowledge point, and the similarity between the knowledge point and the document title.

Here, information content is a measure of the amount of information. The information content of each knowledge point can be determined by the following formula:

I_e=log₂(len(e))

where e is the knowledge point, I_e is the information content of e, and len(e) is the length of e.

The similarity between each knowledge point and the document title is determined as follows: the topic words in the document title are extracted, and the semantic similarity between the topic words and each knowledge point is calculated. In mathematical notation this can be written:

sim(t, e), where t denotes the title of document d.

The weight of each knowledge point is the proportion that the knowledge point accounts for in the document, denoted Q(d, e). Based on the degree of correlation between each knowledge point and the document C(d, e) × IDF_e, the information content of each knowledge point I_e = log₂(len(e)), and the similarity between each knowledge point and the document title sim(t, e), the weight Q(d, e) of each knowledge point can be expressed as:

Q(d, e) = α(C(d, e) × IDF_e) + β(log₂(len(e))) + θ sim(t, e)

where α, β and θ are adjustment parameters whose values differ slightly for documents in different fields.
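The weighted sum for Q(d, e) can be sketched as follows. Several points are assumptions of this sketch: len(e) is measured in characters, the title similarity sim(t, e) is passed in as a precomputed number because the text does not fix a similarity model, and the default parameter values of 1.0 are placeholders rather than values from the patent.

```python
import math


def knowledge_point_weight(term_freq, idf, knowledge_point, title_similarity,
                           alpha=1.0, beta=1.0, theta=1.0):
    """Q(d, e) = alpha * (C(d, e) * IDF_e) + beta * log2(len(e)) + theta * sim(t, e)."""
    correlation = term_freq * idf                  # C(d, e) x IDF_e
    information = math.log2(len(knowledge_point))  # I_e, length in characters (assumed)
    return alpha * correlation + beta * information + theta * title_similarity


# Made-up inputs: a 4-character knowledge point seen twice in the document.
q = knowledge_point_weight(term_freq=2, idf=1.5, knowledge_point="abcd",
                           title_similarity=0.5, alpha=1.0, beta=0.5, theta=2.0)
print(q)
# prints 5.0
```

With these inputs the three terms contribute 3.0, 1.0 and 1.0 respectively, which shows how α, β and θ trade off frequency, length and title match.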

S330, screening the knowledge points contained in the document according to their weights, and taking the knowledge points that pass the screening as text chain knowledge points.

Using the weights calculated in step S320, the knowledge points are arranged in descending order of weight, and those within a preset range are selected as text chain knowledge points. The preset range is defined by the user according to the actual situation, for example 10 or 15; with a range of 10, the 10 knowledge points with the highest weights are selected as text chain knowledge points.
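The screening step amounts to a top-N selection by weight. The sketch below uses `heapq.nlargest` from the Python standard library, which returns the selected items in descending order; the function name `screen_knowledge_points` and the sample weights are assumptions for illustration.

```python
import heapq


def screen_knowledge_points(weights, preset_range=10):
    """Return the preset_range knowledge points with the highest weights.

    weights: dict mapping knowledge point -> weight Q(d, e) (hypothetical layout).
    """
    # nlargest avoids sorting the whole dictionary when preset_range is small.
    return heapq.nlargest(preset_range, weights, key=weights.get)


# Made-up weights for three knowledge points in one document.
weights = {"gradient": 0.2, "tensor": 0.9, "kernel": 0.5}
print(screen_knowledge_points(weights, preset_range=2))
# prints ['tensor', 'kernel']
```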

The method provided by this embodiment for determining the text chain knowledge points contained in a document based on NLP and a knowledge graph determines the weight of each knowledge point according to the degree of correlation between the knowledge point and the document, the information content of the knowledge point, and the similarity between the knowledge point and the document title. It then screens the knowledge points contained in the document by weight to determine the text chain knowledge points of the document. This addresses the inefficiency of text chain construction in existing recommendation methods.

Embodiment four

Fig. 4 is a structural block diagram of an information recommendation device provided by Embodiment four of the present invention. The device can execute the information recommendation method provided by any embodiment of the present invention and has the functional modules and beneficial effects corresponding to the method. As shown in Fig. 4, the device may include:

a relevance weight determining module 401, configured to determine the relevance weight between the target document and each text chain knowledge point according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents, the inverse document frequency of each text chain knowledge point in all documents, and the text chain knowledge points contained in the target document;

a to-be-recommended text chain determining module 402, configured to determine the text chain knowledge points to be recommended for the target document according to the relevance weight between the target document and each text chain knowledge point;

an information push module 403, configured to push information according to the text chain knowledge points to be recommended.

The information recommendation device provided by this embodiment determines the relevance weight between the target document and each text chain knowledge point according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents, the inverse document frequency of each text chain knowledge point in all documents, and the text chain knowledge points contained in the target document. It then determines the text chain knowledge points to be recommended for the target document from the relevance weights and pushes them to the user. This addresses the cold-start problem, the complexity of weight determination models and the inefficiency of text chain construction in existing recommendation methods, improves the accuracy of information recommendation and the stickiness between users and pushed information, and thereby improves the user experience.

Illustratively, the relevance weight determining module 401 includes:

a co-occurrence matrix construction unit, configured to construct the co-occurrence matrix according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents;

an IDF vector determination unit, configured to determine the IDF vector according to the inverse document frequency of each text chain knowledge point in all documents;

a correlation matrix determination unit, configured to determine the text chain correlation matrix of the target document according to the text chain knowledge points contained in the target document;

a correlation vector determination unit, configured to determine the correlation vector of the target document and each text chain knowledge point according to the text chain correlation matrix of the target document, the co-occurrence matrix and the IDF vector;

a relevance weight determination unit, configured to determine the relevance weight between the target document and each text chain knowledge point according to the correlation vector of the target document and each text chain knowledge point.

Optionally, the correlation matrix determination unit may specifically be configured so that:

if the target document K contains the i-th text chain knowledge point, the entry for the i-th text chain knowledge point in the text chain correlation matrix of the target document takes 1; otherwise, that entry takes 0, where i = 1, ..., N and N is the number of all text chain knowledge points.

Optionally, the co-occurrence matrix construction unit may specifically be configured to:

count over all documents to determine the co-occurrence text chain knowledge points of each text chain knowledge point;

if the j-th text chain knowledge point and the k-th text chain knowledge point appear together in one document, set G_jk in the co-occurrence matrix to 1; otherwise, set G_jk to 0, where j = 1, ..., N, k = 1, ..., N, and N is the number of all text chain knowledge points.

Optionally, the IDF vector determination unit may specifically be configured to:

determine each entry IDF_n1 of the IDF vector by the following formula:

$$IDF_{n1} = \log\frac{M}{E_n}$$

where M is the total number of documents, E_n is the number of documents containing the n-th text chain knowledge point, n = 1, ..., N, and N is the number of all text chain knowledge points.

Optionally, the correlation vector determination unit may specifically be configured to:

determine the correlation vector of the target document and each text chain knowledge point according to the text chain correlation matrix of the target document, the co-occurrence matrix and the IDF vector, by the following formula:

$$\text{correlation vector} = (X_K \cdot G) \times \overrightarrow{IDF}$$

where the result is the correlation vector of the target document K and each text chain knowledge point, X_K is the text chain correlation matrix of the target document K, G is the co-occurrence matrix, and $\overrightarrow{IDF}$ is the IDF vector.

Illustratively, the to-be-recommended text chain determining module 402 may include:

a text chain knowledge point sorting unit, configured to sort the text chain knowledge points according to the relevance weight between the target document and each text chain knowledge point;

a text chain knowledge point filtering unit, configured to filter out of the sorted result the text chain knowledge points contained in the target document;

a to-be-recommended text chain determination unit, configured to select from the filtered result a preset number of text chain knowledge points with the highest relevance weights as the text chain knowledge points to be recommended for the target document.

Optionally, the above device may further include a text chain knowledge point determining module, which may specifically be configured to:

take the product of the term frequency of each knowledge point in the document that contains it and the inverse document frequency of that knowledge point as the degree of correlation between the knowledge point and the document;

determine the weight of each knowledge point according to the degree of correlation between the knowledge point and the document, the information content of the knowledge point, and the similarity between the knowledge point and the document title;

screen the knowledge points contained in the document according to their weights, and take the knowledge points that pass the screening as text chain knowledge points.

Optionally, the information content of each knowledge point is determined by the following formula:

I_e=log₂(len(e))

Embodiment five

Fig. 5 is a schematic structural diagram of a server provided by Embodiment five of the present invention. Fig. 5 shows a block diagram of an exemplary server 12 suitable for implementing embodiments of the present invention. The server 12 shown in Fig. 5 is only an example and should not impose any limitation on the functions or scope of use of embodiments of the present invention.

As shown in figure 5, the server 12 is showed in the form of universal computing device.The component of the server 12 may include But be not limited to: one or more processor or processing unit 16, system storage 28, connect different system components (including System storage 28 and processing unit 16) bus 18.

The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.

The server 12 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the server 12, including volatile and non-volatile media, and removable and non-removable media.

The system memory 28 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 30 and/or a cache memory 32. The server 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in Fig. 5, commonly referred to as a "hard disk drive"). Although not shown in Fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.

A program/utility 40 having a set of (at least one) program modules 42 may be stored in, for example, the memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present invention.

The server 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the server 12, and/or with any device (such as a network card, a modem, etc.) that enables the server 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. The server 12 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 20. As shown in the figure, the network adapter 20 communicates with the other modules of the server 12 through the bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the server 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.

The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example to implement the information recommendation method provided by the embodiments of the present invention.

Embodiment six

Embodiment 6 of the present invention further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the information recommendation method of any of the above embodiments.

The computer storage medium of the embodiments of the present invention may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, which may be used by or in connection with an instruction execution system, apparatus, or device.

A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.

The program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to: wireless, wire, optical cable, RF, etc., or any suitable combination of the above.

Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The serial numbers of the above embodiments are for description only and do not represent the relative merits of the embodiments.

Those of ordinary skill in the art will appreciate that the modules or steps of the present invention described above may be implemented with a general-purpose computing device. They may be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they may each be fabricated as an individual integrated circuit module, or several of the modules or steps may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

The embodiments in this specification are described in a progressive manner; each embodiment highlights its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another.

The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims

1. An information recommendation method, characterized by comprising:

determining a relevance weight between a destination document and each text chain knowledge point according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents, the reverse document-frequency of each text chain knowledge point in all documents, and the text chain knowledge points included in the destination document;

determining the to-be-recommended text chain knowledge points of the destination document according to the relevance weight between the destination document and each text chain knowledge point;

2. The method according to claim 1, characterized in that determining the relevance weight between the destination document and each text chain knowledge point according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents, the reverse document-frequency of each text chain knowledge point in all documents, and the text chain knowledge points included in the destination document comprises:

constructing a co-occurrence matrix according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents;

determining a reverse file vector according to the reverse document-frequency of each text chain knowledge point in all documents;

determining a text chain correlation matrix of the destination document according to the text chain knowledge points included in the destination document;

determining a correlation vector between the destination document and each text chain knowledge point according to the text chain correlation matrix of the destination document, the co-occurrence matrix, and the reverse file vector;

determining the relevance weight between the destination document and each text chain knowledge point according to the correlation vector between the destination document and each text chain knowledge point.

3. The method according to claim 2, characterized in that determining the text chain correlation matrix of the destination document according to the text chain knowledge points included in the destination document comprises:

if the destination document includes the i-th text chain knowledge point, X_1i in the text chain correlation matrix of the destination document takes 1; otherwise, X_1i takes 0, where i = 1, ..., N, and N is the number of all text chain knowledge points.

4. The method according to claim 2, characterized in that constructing the co-occurrence matrix according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents comprises:

if the j-th text chain knowledge point and the k-th text chain knowledge point appear together in a document, G_jk in the co-occurrence matrix takes 1; otherwise, G_jk takes 0, where j = 1, ..., N, k = 1, ..., N, and N is the number of all text chain knowledge points.

5. The method according to claim 2, characterized in that determining the reverse file vector according to the reverse document-frequency of each text chain knowledge point in all documents comprises:

wherein M is the total number of all documents, E_n is the number of documents containing the n-th text chain knowledge point, n = 1, ..., N, and N is the number of all text chain knowledge points.

6. The method according to claim 2, characterized in that determining the correlation vector between the destination document and each text chain knowledge point according to the text chain correlation matrix of the destination document, the co-occurrence matrix, and the reverse file vector comprises:

wherein the result is the correlation vector between the destination document and each text chain knowledge point, X is the text chain correlation matrix of the destination document, G is the co-occurrence matrix, and the remaining operand is the reverse file vector.

7. The method according to claim 1, characterized in that determining the to-be-recommended text chain knowledge points of the destination document according to the relevance weight between the destination document and each text chain knowledge point comprises:

sorting the text chain knowledge points according to the relevance weight between the destination document and each text chain knowledge point;

filtering out of the sorting result the text chain knowledge points included in the destination document;

selecting from the filtered result a preset number of text chain knowledge points with the highest relevance weights as the to-be-recommended text chain knowledge points of the destination document.

8. The method according to claim 1, characterized in that determining the text chain knowledge points included in a document comprises:

taking the product of the word frequency of each knowledge point in the document and the reverse document-frequency of that knowledge point as the degree of correlation between the knowledge point and the document;

determining the weight of each knowledge point according to the degree of correlation between the knowledge point and the document, the information content of the knowledge point, and the similarity between the knowledge point and the document title;

screening the knowledge points included in the document according to their weights, and taking the knowledge points remaining after screening as the text chain knowledge points.

9. The method according to claim 8, characterized by comprising:

determining the information content of each knowledge point according to the following formula:

I_e=log₂(len(e))

10. An information recommendation device, characterized by comprising:

a relevance weight determining module, configured to determine a relevance weight between a destination document and each text chain knowledge point according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents, the reverse document-frequency of each text chain knowledge point in all documents, and the text chain knowledge points included in the destination document;

a to-be-recommended text chain determining module, configured to determine the to-be-recommended text chain knowledge points of the destination document according to the relevance weight between the destination document and each text chain knowledge point;

11. The device according to claim 10, characterized in that the relevance weight determining module comprises:

a co-occurrence matrix construction unit, configured to construct a co-occurrence matrix according to the co-occurrence text chain knowledge points of each text chain knowledge point in all documents;

a reverse file vector determination unit, configured to determine a reverse file vector according to the reverse document-frequency of each text chain knowledge point in all documents;

a correlation matrix determination unit, configured to determine a text chain correlation matrix of the destination document according to the text chain knowledge points included in the destination document;

a correlation vector determination unit, configured to determine a correlation vector between the destination document and each text chain knowledge point according to the text chain correlation matrix of the destination document, the co-occurrence matrix, and the reverse file vector;

a relevance weight determination unit, configured to determine the relevance weight between the destination document and each text chain knowledge point according to the correlation vector between the destination document and each text chain knowledge point.

12. The device according to claim 10, characterized in that the to-be-recommended text chain determining module comprises:

a text chain knowledge point sorting unit, configured to sort the text chain knowledge points according to the relevance weight between the destination document and each text chain knowledge point;

a text chain knowledge point filtering unit, configured to filter out of the sorting result the text chain knowledge points included in the destination document;

a to-be-recommended text chain determination unit, configured to select from the filtered result a preset number of text chain knowledge points with the highest relevance weights as the to-be-recommended text chain knowledge points of the destination document.

13. The device according to claim 10, characterized in that the device further comprises a text chain knowledge point determining module; the text chain knowledge point determining module is specifically configured to:

14. A server, characterized in that the server comprises:

one or more processors;

a storage device, configured to store one or more programs;

wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the information recommendation method according to any one of claims 1-9.

15. A storage medium on which a computer program is stored, characterized in that, when executed by a processor, the program implements the information recommendation method according to any one of claims 1-9.