CN110737774B - Book knowledge graph construction method, book recommendation method, device, equipment and medium - Google Patents

Book knowledge graph construction method, book recommendation method, device, equipment and medium

Info

Publication number
CN110737774B
Authority
CN
China
Prior art keywords
entity
book
entities
books
knowledge graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810719673.0A
Other languages
Chinese (zh)
Other versions
CN110737774A (en
Inventor
许瑾
刘文昱
郝萌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810719673.0A
Publication of CN110737774A
Application granted
Publication of CN110737774B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/06: Buying, selling or leasing transactions
    • G06Q30/0601: Electronic shopping [e-shopping]
    • G06Q30/0631: Item recommendations

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the invention disclose a book knowledge graph construction method, a book recommendation method, a book knowledge graph construction device, a book recommendation device, computer equipment and a storage medium. The book knowledge graph construction method comprises the following steps: acquiring abstract corpora corresponding to at least two books respectively; performing entity recognition according to the abstract corpus corresponding to each book; establishing an entity and book association table according to the recognized entities; and constructing the book knowledge graph according to the entity and book association table. The technical solution of the embodiments provides a new way of constructing a book knowledge graph according to the degree of association between each entity and each book, and of recommending books based on that graph; it improves on existing book recommendation technology and meets people's increasingly personalized and convenient reading needs.

Description

Book knowledge graph construction method, book recommendation method, device, equipment and medium
Technical Field
The embodiments of the invention relate to data processing technology, and in particular to a book knowledge graph construction method, a book recommendation method, corresponding devices, computer equipment and a storage medium.
Background
With the popularization of mobile terminals such as mobile phones and the development of e-book readers, electronic books are increasingly favored by readers. At present, a reading app (application) typically recommends popular or highly rated books on its main interface for users to select and read. However, a best-selling or highly rated book is not necessarily one that a given user will like.
As technology continues to improve, people's expectations of book recommendation technology rise as well, and existing book recommendation technology cannot meet people's increasingly personalized and convenient reading needs.
Disclosure of Invention
The embodiments of the invention provide a book knowledge graph construction method, a book recommendation method, corresponding devices, computer equipment and a storage medium, so as to provide a new way of constructing a book knowledge graph and a new way of recommending books based on the book knowledge graph.
In a first aspect, an embodiment of the present invention provides a method for constructing a book knowledge graph, including:
acquiring abstract corpora corresponding to at least two books respectively;
performing entity recognition according to the abstract corpus corresponding to each book;
establishing an entity and book association table according to the recognized entities, wherein the entity and book association table comprises: a first association weight between the entity and the book;
and constructing the book knowledge graph according to the entity and book association table.
In a second aspect, an embodiment of the present invention further provides a book recommendation method, including:
acquiring at least one historical attention book of a user, and querying a book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises an entity and book association table, and the entity and book association table comprises: a first association weight between the entity and the book;
searching the entity and book association table for at least one target entity associated with the historical attention book;
according to the target entity, reversely searching the entity and book association table for at least one book corresponding to the target entity to serve as a book to be recommended;
and providing the book to be recommended to the user.
In a third aspect, an embodiment of the present invention further provides a device for constructing a book knowledge graph, including:
an abstract corpus acquisition module, used for acquiring abstract corpora corresponding to at least two books respectively;
an entity recognition module, used for performing entity recognition according to the abstract corpus corresponding to each book;
an entity and book association table establishing module, used for establishing an entity and book association table according to the recognized entities, wherein the entity and book association table comprises: a first association weight between the entity and the book;
and a book knowledge graph construction module, used for constructing the book knowledge graph according to the entity and book association table.
In a fourth aspect, an embodiment of the present invention further provides a book recommendation apparatus, including:
a book knowledge graph querying module, used for acquiring at least one historical attention book of a user and querying a book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises an entity and book association table, and the entity and book association table comprises: a first association weight between the entity and the book;
a target entity searching module, used for searching the entity and book association table for at least one target entity associated with the historical attention book;
a to-be-recommended book searching module, used for reversely searching, according to the target entity, the entity and book association table for at least one book corresponding to the target entity to serve as a book to be recommended;
and a to-be-recommended book providing module, used for providing the book to be recommended to the user.
In a fifth aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the method for building a book knowledge graph according to any one of the embodiments of the present invention or implements the method for recommending a book according to any one of the embodiments of the present invention when executing the program.
In a sixth aspect, an embodiment of the present invention further provides a computer readable storage medium, where a computer program is stored, where the program when executed by a processor implements a method for building a book knowledge graph according to any one of the embodiments of the present invention, or implements a book recommendation method according to any one of the embodiments of the present invention.
In the technical solution of the embodiments of the invention, entity recognition is performed according to the abstract corpus corresponding to each book; an entity and book association table is established according to the recognized entities; and the book knowledge graph is constructed based on the entity and book association table. For recommendation, the book knowledge graph is queried according to the user's historical attention books, at least one target entity associated with those books is found in the entity and book association table, and at least one book corresponding to the target entity is found by reverse lookup in the same table to serve as a book to be recommended to the user.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for constructing a book knowledge graph in accordance with a first embodiment of the present invention;
FIG. 2 is a flowchart of a method for constructing a book knowledge graph in a second embodiment of the invention;
FIG. 3 is a flowchart of a method for constructing a book knowledge graph in a third embodiment of the invention;
FIG. 4a is a flowchart of a method for constructing a book knowledge graph in a fourth embodiment of the invention;
FIG. 4b is a schematic diagram of a book knowledge graph constructed by the method in the fourth embodiment of the invention;
FIG. 5 is a flowchart of a book recommendation method in a fifth embodiment of the invention;
FIG. 6a is a flowchart of a book recommendation method in a sixth embodiment of the invention;
FIG. 6b is a schematic diagram of an application scenario to which the method of the embodiment of the present invention is applied;
FIG. 7 is a schematic diagram of a construction apparatus for a book knowledge graph in a seventh embodiment of the invention;
FIG. 8 is a schematic view of a book recommendation device according to an eighth embodiment of the invention;
FIG. 9 is a schematic structural diagram of a computer device according to a ninth embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Before discussing exemplary embodiments in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart depicts operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently, or at the same time. Furthermore, the order of the operations may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figures. The processes may correspond to methods, functions, procedures, subroutines, and the like.
For convenience of reading, the correspondence between the figures and the main content of the application is first briefly described: for the embodiments of the book knowledge graph construction method, the main related drawings are FIG. 1, FIG. 2, FIG. 3, FIG. 4a and FIG. 4b; for the embodiments of the book recommendation method, the main related drawings are FIG. 5, FIG. 6a and FIG. 6b; for the embodiments of the book knowledge graph construction device and the book recommendation device, the main related drawings are FIG. 7 and FIG. 8; for the embodiment of the computer device, the main related drawing is FIG. 9.
Example 1
FIG. 1 is a flowchart of a method for constructing a book knowledge graph according to the first embodiment of the present invention. This embodiment is applicable to constructing a book knowledge graph for recommending books to users. The method may be performed by the book knowledge graph construction device provided by the embodiments of the present invention; the device may be implemented in software and/or hardware and may generally be integrated in equipment with a certain computing capability, such as a terminal or a server. As shown in FIG. 1, the method specifically includes the following operations:
S110, acquiring abstract corpus corresponding to at least two books respectively.
The abstract corpus is a short, semantically coherent text that concisely and accurately describes the main content of a book. Generally, when an electronic book is catalogued on the internet, the abstract corpus corresponding to the book is recorded along with it so that users can search for the book.
Typically, the abstract corpus corresponding to each electronic book may be obtained from one or more data sources (e.g., Baidu Wenku or another electronic document library). It should be noted that if the abstract corpora of the books are obtained from at least two data sources, book name mapping needs to be performed across the data sources (typically by calculating the similarity of book names), so as to obtain the one or more abstract corpora that match the same book.
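The book name mapping described above can be sketched as follows. This is only an illustrative sketch: the function names `title_similarity` and `merge_sources`, the 0.9 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are assumptions, since the text does not specify how book name similarity is calculated.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def title_similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two book names."""
    # Case-folding and stripping whitespace are assumed normalisation steps.
    return SequenceMatcher(None, a.strip().casefold(), b.strip().casefold()).ratio()


def merge_sources(source_a: Dict[str, str], source_b: Dict[str, str],
                  threshold: float = 0.9) -> Dict[str, List[str]]:
    """Group abstracts from two sources under one book name per book."""
    merged = {title: [abstract] for title, abstract in source_a.items()}
    for title_b, abstract_b in source_b.items():
        # Find the most similar book name already known.
        best = max(merged, key=lambda t: title_similarity(t, title_b), default=None)
        if best is not None and title_similarity(best, title_b) >= threshold:
            merged[best].append(abstract_b)   # same book, second abstract
        else:
            merged[title_b] = [abstract_b]    # a book seen only in source_b
    return merged
```

With this sketch, a book listed under slightly different spellings in two sources ends up with both abstracts under a single key, while unrelated book names stay separate.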
Optionally, after the abstract corpora corresponding to at least two books are obtained, they may first be filtered and cleaned, for example by removing special symbols or resolving encoding-format problems, so as to obtain abstract corpora that conform to a set data format.
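A minimal cleaning sketch is given below. The choices it makes are assumptions rather than part of the described method: "special symbols" is interpreted as control characters and undecodable bytes, the set data format is taken to be NFKC-normalised, single-spaced text, and the function name `clean_abstract` is hypothetical.

```python
import re
import unicodedata


def clean_abstract(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode an abstract and normalise it to plain, single-spaced text."""
    # Undecodable bytes become U+FFFD here and are stripped below.
    text = raw.decode(encoding, errors="replace")
    # NFKC folds full-width and other compatibility characters.
    text = unicodedata.normalize("NFKC", text)
    # Drop the replacement character and control/format characters,
    # keeping newlines and tabs so they can be collapsed as whitespace.
    text = "".join(
        ch for ch in text
        if ch != "\ufffd" and (unicodedata.category(ch)[0] != "C" or ch in "\n\t")
    )
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()
```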
S120, entity recognition is carried out according to abstract corpus corresponding to books.
In this embodiment, an entity is a term or abstract concept that describes the content of a book from a set perspective. Examples include place names contained in the book (e.g., in a river), the historical period corresponding to the content described by the book (e.g., the Northern and Southern Dynasties period), and the main historical figures appearing in the book (e.g., Zhao Yun).
In an optional implementation of this embodiment, entity recognition according to the abstract corpus corresponding to a book may be performed as follows: the abstract corpus is segmented into words, the segmentation results are matched against a set entity lexicon, and the entities included in the abstract corpus are determined according to the matching results.
In another optional implementation of this embodiment, entity recognition according to the abstract corpus corresponding to a book may be performed as follows: the abstract corpus is input into an entity recognition model for entity recognition, and the entities matched with the entity recognition model are obtained.
The entity recognition model may be trained in advance on training samples labeled with entities.
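The first, lexicon-based implementation can be illustrated with the sketch below. The `segment` function is only a stand-in that splits on non-word characters; real Chinese abstracts would need a proper word segmenter, which the text does not name. The function names and the sample lexicon are hypothetical.

```python
import re
from typing import List, Set


def segment(text: str) -> List[str]:
    """Stand-in segmenter (assumption): split on non-word characters."""
    return [tok for tok in re.split(r"\W+", text) if tok]


def match_entities(abstract: str, lexicon: Set[str]) -> List[str]:
    """Return lexicon entries found among the tokens, in order of first appearance."""
    found: List[str] = []
    for token in segment(abstract):
        if token in lexicon and token not in found:
            found.append(token)
    return found
```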
S130, establishing an entity and book association table according to the identified entity, wherein the entity and book association table comprises: a first association weight between the entity and the book.
The first association weight measures the degree of association between an entity and a book: the higher the degree of association, the larger the value of the first association weight. Typically, the first association weight takes values in the range [0, 1].
Optionally, only entities and books whose first association weight is greater than a set threshold (for example, 0.8) may be used to construct the entity and book association table, which ensures that there is a clear association between each entity and book recorded in the table.
In this embodiment, the first association weight between an entity and a book may be determined according to the frequency with which the entity occurs in the abstract corpus of the book.
Furthermore, because the abstract corpus of a book reflects the main content of the book to a certain extent, the association weight between an entity and the abstract corpus can simply be used as the first association weight between the entity and the book. Accordingly, when the entity recognition model in S120 outputs the entities corresponding to an abstract corpus, it can simultaneously output a recognition weight between the abstract corpus and each entity, and that recognition weight is used as the first association weight between the entity and the book.
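The frequency-based option can be sketched as follows. The text does not say how frequency is turned into a weight in [0, 1]; here, as an assumption, each entity's weight is its share of all entity mentions in the abstract. The function name `frequency_weights` is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def frequency_weights(entity_mentions: List[str]) -> Dict[str, float]:
    """Map each entity to its share of all entity mentions in one abstract."""
    counts = Counter(entity_mentions)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Shares are non-negative and sum to 1, so every weight lies in [0, 1].
    return {entity: n / total for entity, n in counts.items()}
```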
By way of example and not limitation, the data in the entity and book association table may take the following form: entity name: Ming and Qing history; book name: "Those Things of the Ming Dynasty"; first association weight: 0.85. Entity name: Zurich; book name: "Bie ai Zu Li Shi"; first association weight: 0.92.
It will be appreciated that, in the entity and book association table, entity names and book names may be represented by Chinese or English names, or by entity numbers (IDs) or book numbers, as long as each number uniquely determines one entity name or one book name; this is not limited herein.
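One possible in-memory representation of such a table, together with the optional threshold filter, is sketched below. The `Association` record and `build_table` function are hypothetical names, and 0.8 is the example threshold mentioned in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Association:
    entity: str    # entity name or ID
    book: str      # book name or ID
    weight: float  # first association weight, expected in [0, 1]


def build_table(candidates: List[Association], threshold: float = 0.8) -> List[Association]:
    """Keep only rows whose first association weight exceeds the threshold."""
    return [row for row in candidates if row.weight > threshold]
```

A row such as `Association("Zurich", "Bie ai Zu Li Shi", 0.92)` would be kept at the 0.8 threshold, while a row with weight 0.40 would be dropped.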
S140, constructing the book knowledge graph according to the entity and book association table.
In this embodiment, a book knowledge graph may be constructed based on the entity and book association table covering a plurality of books and a plurality of entities; the book knowledge graph records the association relationships between books and entities.
Further, when books need to be recommended to a user, the book knowledge graph may first be queried according to the user's historical attention books (e.g., books the user has browsed or searched for) to determine the entities included in those books. After the entities are obtained, the book knowledge graph is searched in reverse to determine other books corresponding to those entities, thereby realizing a new, entity-based way of recommending books.
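The forward and reverse lookups described above can be sketched with two indexes built from the association table. This is an illustrative sketch only: the row layout, the function names, and in particular the ranking score (the sum of products of the two association weights) are assumptions, as the text does not specify how candidate books are ordered.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, str, float]  # (entity, book, first association weight)


def build_graph(rows: Iterable[Row]):
    """Index the association table in both directions."""
    book_to_entities: Dict[str, Dict[str, float]] = defaultdict(dict)
    entity_to_books: Dict[str, Dict[str, float]] = defaultdict(dict)
    for entity, book, weight in rows:
        book_to_entities[book][entity] = weight
        entity_to_books[entity][book] = weight
    return book_to_entities, entity_to_books


def recommend(history: List[str], book_to_entities, entity_to_books) -> List[str]:
    """Return books sharing entities with the history, best match first."""
    scores: Dict[str, float] = defaultdict(float)
    for seen in history:
        # Forward lookup: entities of a historical attention book.
        for entity, w_seen in book_to_entities.get(seen, {}).items():
            # Reverse lookup: other books associated with the same entity.
            for book, w_other in entity_to_books[entity].items():
                if book not in history:
                    scores[book] += w_seen * w_other  # assumed scoring rule
    return sorted(scores, key=lambda b: (-scores[b], b))
```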
In the technical solution of this embodiment, entity recognition is performed according to the abstract corpus corresponding to each book, an entity and book association table is established according to the recognized entities, and the book knowledge graph is constructed based on that table. The resulting graph represents the degree of association between entities and books and relates books to one another through the entities they contain. This redefines the content of a book knowledge graph and makes possible a new technique for recommending books based on it.
On the basis of the above embodiments, after entity recognition is performed according to the abstract corpus corresponding to each book, the method may further include:
calculating, according to the abstract corpora corresponding to the at least two books, an inverse text frequency index for each recognized entity, and filtering out first universal entities from the entities according to the calculated inverse text frequency index; and/or
counting, according to the books whose abstract corpora yielded each recognized entity, the number of books associated with each entity, and filtering out second universal entities from the entities according to the counted number of books.
The advantage of this arrangement is that only distinctive, discriminative non-universal entities are retained for constructing the book knowledge graph, so that the entities in the resulting graph are more representative of their books and the final book recommendation results are better.
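The book-count filter for second universal entities can be sketched as follows. The cut-off `max_books` and the function names are hypothetical; the text only states that filtering is based on the counted number of books.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def books_per_entity(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Collect, for each entity, the set of books it was recognized in."""
    books: Dict[str, Set[str]] = defaultdict(set)
    for entity, book in pairs:
        books[entity].add(book)
    return dict(books)


def drop_universal(pairs: Iterable[Tuple[str, str]], max_books: int) -> Set[str]:
    """Keep entities associated with at most max_books books (assumed rule)."""
    return {e for e, b in books_per_entity(pairs).items() if len(b) <= max_books}
```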
Example 2
FIG. 2 is a flowchart of a method for constructing a book knowledge graph in a second embodiment of the present invention. This embodiment builds on the foregoing embodiment. In this embodiment, performing entity recognition according to the abstract corpus corresponding to a book is specifically: inputting the abstract corpus corresponding to the book into at least one entity recognition model for entity recognition, to obtain the entities matched with the entity recognition model.
In addition, after the abstract corpus corresponding to the book is input into the at least one entity recognition model and the matched entities are obtained, the method further comprises: if the recognized entities include at least two data forms, converting the entities of the at least two data forms into entities of the same data form according to the mapping relationship between entities of different data forms; and performing deduplication on the entities of the same data form. Correspondingly, the method of this embodiment specifically comprises the following operations:
S210, acquiring abstract corpus corresponding to at least two books respectively.
S220, inputting abstract corpus corresponding to the books into at least one entity recognition model for entity recognition to obtain an entity matched with the entity recognition model. The entity recognition models of different types are obtained through training of training data of different types.
In this embodiment, the types of entity recognition model may include: a conceptual entity recognition model, an attention point entity recognition model, and a special name entity recognition model.
The conceptual entity recognition model is used for recognizing conceptual entities, which are topic labels related to the book; the attention point entity recognition model is used for recognizing attention point entities, which are topic labels associated with users' points of interest; the special name entity recognition model is used for recognizing special name entities, which are proper nouns included in the abstract corpus.
In a specific example, a conceptual entity is an entity directly related to the subject matter or content of the book, for example "words" or "tactics". An attention point entity is an entity that is determined from the keywords many users enter when searching for books and that relates to users' points of interest, for example "psychology" or "thick black". A special name entity is a proper noun included in the book, such as a person's name, a place name, a literary work or a technical term.
In an optional implementation manner of this embodiment, different types of entity recognition models may be pre-constructed, and each entity recognition model selects a corresponding training sample labeled with an entity for training according to the entity characteristics that can be recognized by the entity recognition model.
S230, judging whether the identified entity comprises at least two data forms, if so, executing S240; otherwise, S250 is performed.
Because attention point entities and special name entities have different characteristics, the entities recognized by the attention point entity recognition model constructed by the inventors differ in data form from those recognized by the special name entity recognition model. Typically, special name entities are identified by codes, whereas attention point entities are identified by tag values, so the two data forms are not uniform. To facilitate subsequent operations, when the recognized entities include both special name entities and attention point entities, the data forms of the two need to be unified.
S240, converting the obtained at least two data forms of entities into the same data form of entities according to the mapping relation among the different data forms of entities, and executing S250.
Typically, a mapping relationship between special name entities and attention point entities may be established. Assuming that attention point entities and conceptual entities already share the same data form, the special name entities among the recognized entities can be converted into the data form of attention point entities based on this mapping, which ensures that all resulting entities have the same data form and facilitates subsequent processing.
S250, performing de-duplication processing on the entities in the same data form.
When the data forms of the entities obtained from the entity identification models are consistent, the repeated entities can be conveniently removed.
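Steps S240 and S250 can be sketched together as below. The representation is hypothetical: each recognized entity is a `(form, value)` pair whose form is either `"code"` (special name entities) or `"tag"` (attention point and conceptual entities), and `code_to_tag` stands for the mapping relationship. Skipping codes that have no mapping is an assumption, not something the text prescribes.

```python
from typing import Dict, List, Tuple


def unify_and_dedupe(entities: List[Tuple[str, str]],
                     code_to_tag: Dict[str, str]) -> List[str]:
    """Convert code-form entities to tag form, then remove duplicates."""
    unified: List[str] = []
    for form, value in entities:
        # S240: map code-form entities into the tag data form.
        tag = code_to_tag.get(value) if form == "code" else value
        if tag is None:
            continue  # unmapped code: skipped here (assumption)
        # S250: keep the first occurrence of each entity only.
        if tag not in unified:
            unified.append(tag)
    return unified
```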
S260, acquiring an identification weight corresponding to the identified entity, which is output after the abstract corpus is input into the entity identification model, and taking the identification weight as a first association weight between the entity and the book corresponding to the abstract corpus.
In this embodiment, the entity recognition model is constructed so that, in addition to each recognized entity, it outputs a recognition weight between that entity and the input abstract corpus. The recognition weight reflects the degree of association between the output entity and the input abstract corpus, and may be represented by a value in the range [0, 1].
S270, adjusting the first association weight according to a weight adjustment coefficient corresponding to the entity identification model of the identified entity. Wherein different types of entity recognition models have different weight adjustment coefficients.
In this embodiment, three different types of entity recognition model are constructed, namely the conceptual entity recognition model, the attention point entity recognition model and the special name entity recognition model, which recognize three different types of entities. So that the entities used better match users' actual needs, weights (degrees of importance) may be preset for the different entity types, and entities with higher weights may be used preferentially when recommending books to users.
Optionally, it may be predefined that: weight of conceptual entities > weight of attention point entities > weight of special name entities. Correspondingly, the weight adjustment coefficient of the conceptual entity recognition model may be set larger than that of the attention point entity recognition model, which in turn may be set larger than that of the special name entity recognition model.
Correspondingly, the first association weight is adjusted according to the weight adjustment coefficient corresponding to the entity identification model of the identified entity, so that the aim of adjusting the weights of different types of entities can be fulfilled.
In one specific example: after an abstract corpus is input into the conceptual entity recognition model, entity A is recognized with a first association weight of 0.9; after the same abstract corpus is input into the attention point entity recognition model, entity B is recognized with a first association weight of 0.9; after the abstract corpus is input into the special name entity recognition model, entity C is recognized, also with a first association weight of 0.9.
Although entities A, B and C obtain the same first association weight from the different entity recognition models, the three entities differ in importance. The weight adjustment coefficient of the conceptual entity recognition model may therefore be set to 1, that of the attention point entity recognition model to 0.95, and that of the special name entity recognition model to 0.92. Accordingly, the first association weight of entity A is adjusted to 0.9 × 1 = 0.9, that of entity B to 0.9 × 0.95 = 0.855, and that of entity C to 0.9 × 0.92 = 0.828.
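The adjustment in this example amounts to multiplying the recognition weight by a per-model coefficient, as the short sketch below shows. The coefficients are the example values from the text; the dictionary keys and the function name are hypothetical.

```python
# Example coefficients from the text, keyed by hypothetical model-type names.
COEFFICIENTS = {"concept": 1.0, "attention_point": 0.95, "special_name": 0.92}


def adjust_weight(recognition_weight: float, model_type: str) -> float:
    """Scale the first association weight by the recognizing model's coefficient."""
    return recognition_weight * COEFFICIENTS[model_type]


# Entities A, B and C all start at 0.9 and end up ordered by entity type.
adjusted = {name: round(adjust_weight(0.9, kind), 3)
            for name, kind in [("A", "concept"), ("B", "attention_point"), ("C", "special_name")]}
```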
S280, establishing the association table of the entity and the book according to the first association weight.
In this embodiment, a book with a value of the first association weight greater than a set threshold and a corresponding entity may be selected to establish the association table between the entity and the book.
S290, constructing the book knowledge graph according to the entity and book association table.
Starting from actual application scenarios, the technical solution of this embodiment constructs different types of entity recognition models. Based on these models, various types of entities that match users' actual needs can be recognized, and a book knowledge graph built on these entities allows the books users actually want to be recommended more accurately, further improving the user experience.
Meanwhile, by adjusting the weights of the entities recognized by the different entity recognition models, entities of high importance can be selected preferentially from the book knowledge graph, so that recommendations made with the graph match users' actual needs more often.
Example III
Fig. 3 is a flowchart of a method for constructing a book knowledge graph according to a third embodiment of the present invention. This embodiment is refined on the basis of the above embodiments: before the entity and book association table is established according to the recognized entities, an operation of screening the recognized entities according to the abstract corpora corresponding to the books is added. Correspondingly, the method of this embodiment specifically comprises the following operations:
S310, acquiring abstract corpus corresponding to at least two books respectively.
S320, inputting abstract corpus corresponding to books into at least one entity recognition model for entity recognition to obtain an entity matched with the entity recognition model.
S330, calculating an inverse text frequency index corresponding to the identified entity according to abstract corpus corresponding to at least two books respectively; and filtering out a first universal entity included in the entity according to the calculated inverse text frequency index.
The inverse document frequency (IDF) index, referred to above as the inverse text frequency index, is a measure of the general importance of a word. The IDF index of a particular word (entity) can be obtained by dividing the total number of documents (abstract corpora) in the corpus by the number of documents (abstract corpora) containing the word and taking the logarithm of the quotient.
That is, the lower the IDF index of an entity, the more abstract corpora the entity appears in, and the less precisely the entity can characterize a book. Therefore, after the IDF index of each entity is calculated, the entities whose IDF index is smaller than a set threshold (i.e., the first universal entities) may be filtered out. Alternatively, the entities may be ranked by IDF index from large to small, a set number of top-ranked entities retained, and the remaining entities filtered out.
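The IDF computation and the two filtering options can be illustrated with a short sketch. It assumes that an entity "appears in" an abstract when it occurs as a substring, and the function names, the natural logarithm, and the sample abstracts are all illustrative choices rather than details given in the text.

```python
import math


def idf_scores(abstracts, entities):
    """Return {entity: IDF}, where IDF = log(total abstracts / abstracts containing entity).

    abstracts: dict mapping a book id to its abstract text.
    Substring matching is an assumption made for this sketch.
    """
    total = len(abstracts)
    scores = {}
    for entity in entities:
        containing = sum(1 for text in abstracts.values() if entity in text)
        if containing:  # skip entities that never occur, avoiding division by zero
            scores[entity] = math.log(total / containing)
    return scores


def filter_universal_entities(scores, min_idf=None, keep_top=None):
    """Drop entities with IDF below min_idf, then optionally keep only the top-ranked ones."""
    kept = {e: s for e, s in scores.items() if min_idf is None or s >= min_idf}
    ranked = sorted(kept, key=kept.get, reverse=True)  # largest IDF first
    return ranked[:keep_top] if keep_top is not None else ranked


abstracts = {
    "b1": "a history of the Ming court",
    "b2": "a history of psychology",
    "b3": "a history of Suzhou gardens",
}
scores = idf_scores(abstracts, ["history", "psychology", "Suzhou"])
kept = filter_universal_entities(scores, min_idf=0.5)
```

Here "history" occurs in every abstract, so its IDF is log(3/3) = 0 and it is filtered out as a universal entity.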
S340, counting the book quantity value associated with each entity according to books corresponding to the abstract corpus of the identified entity; and filtering out a second universal entity included in the entities according to the book quantity value obtained through statistics.
In this embodiment, if the first association weight between the abstract corpus of a book and an entity is greater than a set threshold (for example, 0.8), the book is associated with the entity. Accordingly, the total number of books associated with each entity can be counted. In this embodiment, it is considered that if the number of books associated with an entity is large, the entity cannot distinguish different books well, so the entities whose associated book count exceeds a set number threshold (for example, 20 or 30 books), i.e., the second universal entities, can be filtered out.
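A minimal sketch of this book-count filter follows. The thresholds 0.8 and 20 are the example values from the text; the function name and the dictionary layout are assumptions.

```python
from collections import defaultdict


def filter_by_book_count(weights, weight_threshold=0.8, max_books=20):
    """Return {entity: set of associated books}, dropping entities tied to too many books.

    weights: dict mapping (entity, book) to the first association weight.
    """
    books_per_entity = defaultdict(set)
    for (entity, book), weight in weights.items():
        # A book counts as associated only when the weight is greater than the threshold.
        if weight > weight_threshold:
            books_per_entity[entity].add(book)
    # Entities whose book count exceeds max_books are treated as universal and removed.
    return {e: b for e, b in books_per_entity.items() if len(b) <= max_books}


weights = {
    ("novel", "b1"): 0.9, ("novel", "b2"): 0.9, ("novel", "b3"): 0.9,
    ("Galileo", "b1"): 0.95, ("Galileo", "b2"): 0.5,
}
kept = filter_by_book_count(weights, max_books=2)
```

In this toy data "novel" is associated with three books and is removed, while "Galileo" remains with the single book whose weight passes the threshold.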
S350, establishing an entity and book association table according to the identified entity, wherein the entity and book association table comprises: a first association weight between the entity and the book.
The entities here are specifically those remaining after the first universal entities and the second universal entities have been filtered out, which can represent books accurately.
S360, constructing the book knowledge graph according to the entity and book association table.
According to the technical scheme provided by the embodiment of the invention, the entity in the book knowledge graph can reflect the characteristics of different types of books better by filtering the universal entity from the entities identified by the abstract corpus, so that the description capability of the entity on the books is stronger.
Example IV
Fig. 4a is a flowchart of a method for constructing a book knowledge graph according to a fourth embodiment of the present invention, which is implemented based on the foregoing embodiment, and in this embodiment, an entity table, a book table, and an entity-entity association table are further established and added to the book knowledge graph. Correspondingly, the method of the embodiment specifically comprises the following operations:
S410, acquiring abstract corpus corresponding to at least two books respectively.
S420, establishing a book table corresponding to the at least two books, wherein the book table comprises: a correspondence among books, abstract corpora, and book attribute information.
The book here specifically refers to the name of the book or a book number that uniquely identifies the book, and the book attribute information specifically refers to information expressing related attributes or characteristics of the book, such as the author, publication time, book classification, user evaluation, or network score.
S430, entity recognition is carried out according to abstract corpus corresponding to the books.
S440, establishing an entity and book association table according to the identified entity, wherein the entity and book association table comprises: a first association weight between the entity and the book.
S450, establishing an entity table according to the entity identified by the abstract corpus; the entity table comprises: correspondence between entities and entity types.
The entity type is a generic concept of the entity and is used for representing categories corresponding to a plurality of similar entities. Typically, the entity types may include: literature, film and television works, place names, songs, cartoons, figures, institutions, books, creatures, foods, phenomena, activities, prizes, chemicals, celestial bodies, and the like.
In an optional implementation manner of this embodiment, the entity recognition model may be constructed so that, after an abstract corpus is input to it, the model outputs not only the recognized entities but also the entity type corresponding to each recognized entity; an entity table can then be constructed from the output of the entity recognition model.
In a specific example, the data form in the entity table may be: entity name: Inuyasha, entity type: cartoon character; entity name: Beijing, entity type: place name.
Alternatively, S450 may be performed after S440 or may be performed before S440, which is not limited in this embodiment.
S460, establishing an entity and entity association table according to the semantic similarity among the recognized entities, wherein the entity and entity association table comprises: a second association weight between every two entities.
Wherein the second association weight is used for representing the association degree between two entities.
In an optional implementation manner of this embodiment, establishing the entity and entity association table according to the semantic similarity among the recognized entities may specifically include:
inputting every two entities into a semantic similarity recognition model to obtain the second association weight between the two entities; establishing the entity and entity association table according to the second association weight between every two entities;
the semantic similarity recognition model is obtained by splitting the abstract corpora corresponding to the at least two books into sentences, segmenting the sentences with the recognized entities as the word segmentation dictionary, and training on the word vectors obtained from the segmented sentences.
In a specific example, the data form in the entity and entity association table may be: entity one: psychology, entity two: Weber, second association weight: 0.86; entity one: Ming history, entity two: Wang Yangming, second association weight: 0.82.
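As a rough illustration of how such a table could be filled, the sketch below computes a pairwise weight from entity vectors using cosine similarity, the measure named later in the application scenario. The vectors are toy placeholders rather than trained word vectors, and the function names and the `min_weight` parameter are assumptions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_entity_entity_table(vectors, min_weight=0.0):
    """Return {(entity one, entity two): second association weight} for every pair.

    vectors: dict mapping an entity to its vector (hypothetical values here).
    """
    table = {}
    for e1, e2 in combinations(sorted(vectors), 2):
        weight = cosine(vectors[e1], vectors[e2])
        if weight >= min_weight:
            table[(e1, e2)] = weight
    return table


# Toy two-dimensional vectors, chosen only to make the arithmetic easy to follow.
vectors = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
table = build_entity_entity_table(vectors)
```

Each unordered pair is stored once, so a lookup for two entities should try both key orders or sort the pair first.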
S470, constructing the book knowledge graph according to the entity and book association table, the entity table, the book table and the entity and entity association table.
Specifically, the book knowledge graph may be formed by the entity and book association table, the entity table, the book table, and the entity and entity association table.
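One way to picture a graph made of these four tables is a container holding four dictionaries. The class and field names below are hypothetical and the layout is only a sketch, not a storage format described in the text.

```python
from dataclasses import dataclass, field


@dataclass
class BookRecord:
    """One row of the book table."""
    name: str
    abstract: str
    attributes: dict = field(default_factory=dict)  # e.g. author, score, category


@dataclass
class BookKnowledgeGraph:
    entity_book: dict = field(default_factory=dict)    # (entity, book id) -> first association weight
    entity_types: dict = field(default_factory=dict)   # entity -> entity type
    books: dict = field(default_factory=dict)          # book id -> BookRecord
    entity_entity: dict = field(default_factory=dict)  # (entity, entity) -> second association weight

    def entities_of(self, book_id):
        """Entities associated with a book."""
        return [e for (e, b) in self.entity_book if b == book_id]

    def books_of(self, entity):
        """Books associated with an entity (the reverse lookup)."""
        return [b for (e, b) in self.entity_book if e == entity]


graph = BookKnowledgeGraph()
graph.books["b1"] = BookRecord("Example Title", "an abstract mentioning Suzhou", {"score": 8.6})
graph.entity_types["Suzhou"] = "place name"
graph.entity_book[("Suzhou", "b1")] = 0.9
```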
The technical scheme of the embodiment of the invention further enriches the contents in the book knowledge graph, and further can accurately and efficiently recommend books to users by adopting an effective recommendation strategy based on various contents in the book knowledge graph.
Fig. 4b is a schematic diagram of a book knowledge graph constructed by the method of the fourth embodiment of the invention. As shown in Fig. 4b, the book knowledge graph comprises: entities, books, entity-to-entity relationships, entity-to-book relationships, attribute information of the books, and the categories to which the books belong. The entities are conceptual entities, point of interest entities, and proper-name entities extracted from the abstracts of books, such as the person "Galileo", the place name "Suzhou", and the period "Ming dynasty". The entity-to-entity relationship is a similarity weight in semantic space. The entity-to-book relationship is the correlation weight between the entity and the book.
The books, the attribute information of the books and the classification of the books can be obtained through a pre-constructed book table, the relation between the entity and the entity can be obtained through an entity-entity association table, and the relation between the entity and the books can be obtained through the entity-book association table.
Example V
Fig. 5 is a flowchart of a book recommendation method provided in a fifth embodiment of the present invention. This embodiment is applicable to recommending books to users. The method may be performed by the book recommendation device in the embodiments of the present invention; the device may be implemented in software and/or hardware and may generally be integrated in various terminal devices, for example a mobile phone or a tablet computer. As shown in Fig. 5, the method specifically includes the following operations:
S510, acquiring at least one historical attention book of a user, and querying a book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises an entity and book association table, and the entity and book association table comprises: a first association weight between the entity and the book.
The historical attention books are specifically the books that the user has chosen to view or search for in the most recent period of time, and they represent the user's points of interest.
S520, searching at least one target entity associated with the historical attention book in the entity and book association table.
In an optional implementation manner of this embodiment, if it is determined that the number of the target entities found is greater than a set number threshold (for example, 20 or 30, etc.), the priority order of the historical attention books is determined according to the attention time of the user to the at least one historical attention book, and the target entities are screened according to the priority order. For example, only one or two target entities corresponding to the historical focus books that the user recently focused on are selected.
S530, reversely searching at least one book corresponding to the target entity in the entity and book association table according to the target entity, and taking the book as a book to be recommended.
In this embodiment, since the association relationship between the entity and the book is recorded in the association table of the entity and the book, after determining one or more target entities, the book associated with the target entities may be further obtained as the book to be recommended.
Optionally, before providing the book to be recommended to the user, the method further includes: and carrying out de-duplication treatment on the books to be recommended.
S540, providing the books to be recommended to the user.
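Steps S510 to S540 can be sketched as a lookup followed by a reverse lookup. The function name, the parameters, and the choice to skip books the user has already followed are assumptions made for illustration; the text does not specify them.

```python
def recommend(history, entity_book, max_entities=20, recent_books=2):
    """Return de-duplicated books to recommend.

    history: list of book ids, most recently followed first.
    entity_book: dict mapping (entity, book) to the first association weight.
    """
    def entities_of(books):
        found = []
        for book in books:
            for entity, b in entity_book:
                if b == book and entity not in found:
                    found.append(entity)
        return found

    # S520: target entities associated with the historical attention books.
    targets = entities_of(history)
    # Optional screening: with too many targets, keep those of the most recent books only.
    if len(targets) > max_entities:
        targets = entities_of(history[:recent_books])

    # S530: reverse lookup, with de-duplication; already-followed books are skipped (assumption).
    recommended = []
    for entity, book in entity_book:
        if entity in targets and book not in history and book not in recommended:
            recommended.append(book)
    return recommended


entity_book = {
    ("Weber", "b1"): 0.9, ("Weber", "b2"): 0.85,
    ("psychology", "b1"): 0.9, ("psychology", "b3"): 0.9,
    ("Suzhou", "b4"): 0.9,
}
books = recommend(["b1"], entity_book)
```

Book b4 is not recommended because its only entity, "Suzhou", is not associated with the followed book b1.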
According to the technical scheme of this embodiment, a historical attention book of a user is acquired, the book knowledge graph is queried according to the historical attention book, and at least one target entity associated with the historical attention book is looked up in the entity and book association table; then, according to the target entity, at least one book corresponding to the target entity is looked up in reverse in the entity and book association table and provided to the user as a book to be recommended. This realizes a new mode of book recommendation based on a book knowledge graph built from the degree of association between entities and books, optimizes existing book recommendation technology, and meets people's increasingly personalized and convenient book reading needs.
Example VI
Fig. 6a is a flowchart of a book recommendation method according to a sixth embodiment of the present invention, which is implemented based on the foregoing embodiments, and in this embodiment, the book recommendation method is further refined based on an entity table, a book table, and an entity-entity association table included in a book knowledge graph. Correspondingly, the method of the embodiment of the invention specifically comprises the following operations:
S610, at least one historical attention book of the user is obtained, and a book knowledge graph is inquired according to the historical attention book.
The book knowledge graph comprises: an entity and book association table, an entity table, a book table, and an entity and entity association table, wherein:
the entity and book association table comprises: a first association weight between the entity and the book; the entity table comprises: a correspondence between entities and entity types; the book table comprises: a correspondence among books, abstract corpora, and book attribute information; and the entity and entity association table comprises: a second association weight between every two entities.
S620, searching at least one target entity associated with the historical attention book in the entity and book association table.
S630, judging whether the number of the searched target entities is smaller than a first number threshold value: if yes, executing S640; otherwise, S650 is performed.
S640, acquiring an extended entity corresponding to the target entity from the entity-entity association table, adding the extended entity to the target entity, and executing S670.
S650, judging whether the number of the searched target entities is larger than a second number threshold: if yes, executing S660; otherwise, S670 is performed.
S660, determining the priority order of the historical attention books according to the attention time of the user to the at least one historical attention book, screening the target entity according to the priority order, and executing S670.
S670, reversely searching at least one book corresponding to the target entity in the entity and book association table according to the target entity to serve as a book to be recommended.
S680, screening the books to be recommended according to the book attribute information of the books to be recommended in the book table.
Typically, the book attribute information specifically refers to the score of a book. For example, books with a score of less than 8 points may be screened out of all the acquired books to be recommended, to ensure that the books finally recommended are of higher quality and have better user feedback.
S690, performing de-duplication processing on the books to be recommended.
S6100, inquiring the entity type corresponding to the at least one target entity in the entity table, and constructing a recommendation reason item corresponding to the book to be recommended according to the target entity and the entity type corresponding to the target entity.
For example, a recommendation reason item such as "place name - river, recommended by the knowledge graph" may be constructed, where "river" is the target entity and "place name" is the entity type corresponding to the target entity.
Constructing the recommendation reason item makes the books recommended to users interpretable, a capability that no book recommendation technology in the prior art has. Existing book recommendation technology can only recommend related books to users and cannot describe the reason for a recommendation qualitatively. With the entity-based recommendation method of this embodiment of the invention, the reason for recommending a book can be given to the user qualitatively through the combination of entity type and entity name, which greatly helps to meet users' book reading needs.
And S6110, providing the books to be recommended and recommendation reason items corresponding to the books to be recommended for the user.
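The branching in S630 to S660 and the later screening and reason-building steps can be sketched as two small functions. All names, the default thresholds of 5 and 20, the "type: entity" reason format, and the `exclude` parameter are assumptions; screening by "the most recent book only" is one simple reading of the priority order.

```python
def select_targets(history, entity_book, entity_entity, min_count=5, max_count=20):
    """Pick target entities. history lists book ids, most recently followed first."""
    def entities_of(books):
        found = []
        for book in books:
            for entity, b in entity_book:
                if b == book and entity not in found:
                    found.append(entity)
        return found

    targets = entities_of(history)
    if len(targets) < min_count:
        # S640: add extended entities taken from the entity and entity association table.
        original = list(targets)
        for e1, e2 in entity_entity:
            for src, ext in ((e1, e2), (e2, e1)):
                if src in original and ext not in targets:
                    targets.append(ext)
    elif len(targets) > max_count:
        # S660: screen by priority; here only the most recent book's entities are kept.
        targets = entities_of(history[:1])
    return targets


def recommend_with_reasons(targets, entity_book, entity_types, scores,
                           min_score=8.0, exclude=()):
    """Return {book: reason}; the dict keys de-duplicate books (S690)."""
    results = {}
    for entity, book in entity_book:
        if (entity in targets and book not in exclude and book not in results
                and scores.get(book, 0.0) >= min_score):  # S680: drop low-scoring books
            results[book] = f"{entity_types.get(entity, 'entity')}: {entity}"
    return results


entity_book = {("Suzhou", "b1"): 0.9, ("Suzhou", "b2"): 0.9,
               ("Suzhou", "b3"): 0.9, ("gardens", "b4"): 0.9}
entity_entity = {("Suzhou", "gardens"): 0.8}
entity_types = {"Suzhou": "place name", "gardens": "concept"}
scores = {"b1": 9.0, "b2": 8.5, "b3": 6.0, "b4": 9.0}

targets = select_targets(["b1"], entity_book, entity_entity, min_count=2)
reasons = recommend_with_reasons(targets, entity_book, entity_types, scores, exclude=["b1"])
```

In this toy run the single target "Suzhou" is below the assumed minimum of 2, so "gardens" is added as an extended entity, and book b3 is dropped for its score of 6.0.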
In the sixth embodiment of the invention, in the recommended book display interface, the books recommended in the interface box are the books recommended through the book knowledge graph.
According to the technical scheme of this embodiment, a historical attention book of a user is acquired, the book knowledge graph is queried according to the historical attention book, and at least one target entity associated with the historical attention book is looked up in the entity and book association table; then, according to the target entity, at least one book corresponding to the target entity is looked up in reverse in the entity and book association table and provided to the user as a book to be recommended. This realizes a new mode of book recommendation based on a book knowledge graph built from the degree of association between entities and books, optimizes existing book recommendation technology, and meets people's increasingly personalized and convenient book reading needs.
Fig. 6b is a schematic diagram of an application scenario to which the method according to the embodiment of the present invention is applied. The application scenario combines the construction of the book knowledge graph with the book recommendation process. As shown in Fig. 6b, the application scenario is divided into four parts: 1. multi-source heterogeneous data processing; 2. entity extraction; 3. entity association computation; 4. online computing.
1. Multi-source heterogeneous data processing: the processing can be divided into two procedures. One is book name mapping, which maps the book names of books whose abstract corpora come from multiple sources; typically, this mapping can be achieved by calculating the Jaccard similarity between book names. The other is abstract cleaning, which filters and cleans the abstracts to remove special symbols, resolve encoding format problems, and so on.
Through multi-source heterogeneous data processing, a book table corresponding to a plurality of books can be established, and the book table specifically comprises the following data items: book name, book ID, book abstract, book score, book category, etc.
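The book name mapping step can be illustrated with a small Jaccard similarity sketch. Comparing names as sets of characters, the 0.8 threshold, and the function names are assumptions; the text only states that Jaccard similarity between book names is used.

```python
def jaccard(a, b):
    """Jaccard similarity of two strings viewed as sets of characters."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def map_book_name(name, known_names, threshold=0.8):
    """Return the most similar known name if it reaches the threshold, else None."""
    best = max(known_names, key=lambda other: jaccard(name, other), default=None)
    if best is not None and jaccard(name, best) >= threshold:
        return best
    return None


match = map_book_name("War and Peace ", ["War and Peace", "Peace"])
```

Character sets ignore order and repetition, so a production system would likely compare tokens or n-grams instead; the character-level version is kept here only for brevity.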
2. Entity extraction
Knowledge entities are extracted from the abstract corpora. One question to consider is which entities are suitable for book recommendation. Starting from the application scenario, two scenarios that the book knowledge graph is expected to support in recommendation are considered:
(1) Implementing a finer-grained content tag recommendation system. For example, for a user interested in psychology, other psychology books may be recommended based on the person-type entity "Weber".
(2) Cross-domain recommendation, which addresses users' need to discover more books. For example, based on the entity "Wei, Jin, and Southern and Northern Dynasties" associated with the conceptual entity "late Qing Ming History" in books the user has browsed, books containing the entity "Wei, Jin, and Southern and Northern Dynasties" are recommended, exploring more points of interest for the user.
Based on the two application scenarios, three types of entity can be summarized as knowledge entities suited to the book application scenario: conceptual entities, point of interest entities, and proper-name entities. A conceptual entity is an abstract topic tag that has a topical association with the book; a point of interest entity is a topic tag of interest to users, with finer granularity than a conceptual entity; a proper-name entity is a proper noun contained in the abstract, such as a person, a place name, a literary work, or a professional term. Accordingly, the corresponding entities may be extracted by calling set service interfaces (each interface being a different type of entity recognition model obtained from different training samples). Further, the entity type corresponding to each entity can be obtained from the result returned by the service.
After all the entities in the abstract corpora are extracted, an entity dictionary and an entity type mapping table can be further established. The entity type mapping table is needed because proper-name entities are represented by codes while point of interest entities are represented by tag values; a mapping table that records the relation between the two allows the entity types to be unified. Then, the IDF index of each entity is calculated based on the abstract corpora, and overly common entities are filtered out based on the IDF index to remove some unnecessary entities.
Through the entity extraction process, an entity table can be correspondingly established, and the data items specifically included in the entity table are as follows: entity name, entity ID, entity type, etc.
3. Entity association computation
In the entity association computation, the relationship between entities and books may be calculated first. First, the weights of the entities recognized by the three types of entity recognition models are fused, with the weights satisfying the relation: conceptual entity > point of interest entity > proper-name entity. Then the number of books associated with each entity is counted, the entities are ranked from largest to smallest by that number, and some of the entities are filtered out.
Through the relation calculation process of the entity and the book, an entity and book association table can be correspondingly established, and the entity and book association table specifically comprises the following data items: entity ID, book ID, and their associated weights.
The relationship between entities can then be calculated. The specific process may be: first, the abstracts of the books are split into sentences, and the input entities are used as the word segmentation dictionary to segment the sentences into words; then, word vectors are calculated for the segmented words to produce a word2vec word-vector model (an unsupervised learning model); finally, based on the word2vec model, the cosine similarity between entity words is calculated as the weight of the relationship between the entities.
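The first stage of this process, sentence splitting and dictionary-based segmentation, can be sketched as below. Forward maximum matching is a swapped-in, simple segmentation technique chosen for illustration; the text does not say which segmenter is used. The punctuation set and function names are likewise assumptions, and training the word-vector model on the resulting token lists is not shown.

```python
import re


def split_sentences(abstract):
    """Split an abstract on common Chinese and Western sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"[。！？.!?]+", abstract) if s.strip()]


def segment(sentence, entity_dict):
    """Forward maximum matching: take the longest dictionary entity at each position.

    Characters not covered by any entity become single-character tokens.
    """
    max_len = max((len(e) for e in entity_dict), default=1)
    tokens, i = [], 0
    while i < len(sentence):
        # Try the longest candidate first and shrink down to one character.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in entity_dict:
                tokens.append(piece)
                i += size
                break
    return tokens


entities = {"明史", "王阳明"}
sentences = split_sentences("明史王阳明传。心理学！")
tokens = [segment(s, entities) for s in sentences]
```

Because the entities act as the dictionary, each recognized entity survives as a single token, which is what lets a word-vector model learn one vector per entity.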
Through the above-mentioned relation calculation process of entity and entity, an entity and entity association table can be correspondingly established, and the data items specifically included in the entity and entity association table include: entity 1ID, entity 2ID and the association weight among the entities.
4. On-line computing
In one specific example, based on the three books most recently purchased and opened by a user, a set of entities is read from the entity and book association table (if the three books together have more than 20 entities, the entities of the first book are taken preferentially; if they have fewer than 5 entities, new entities are read from the entity and entity association table as a supplement). Based on the set of entities read, a set of books is looked up in reverse from the entity and book association table. The books are output in descending order of book score, with the entity type and entity name output as the recommendation reason. As a de-duplication strategy, if the output book set contains books with the same book name, the duplicates are removed.
Example VII
Fig. 7 is a schematic structural diagram of a device for constructing a book knowledge graph according to a seventh embodiment of the present invention. The embodiment may be suitable for the case of building a book knowledge graph for recommending books to users, and the device may be implemented in a software and/or hardware manner, and may be integrated in any device that provides a function of building a book knowledge graph, as shown in fig. 7, where the device for building a book knowledge graph specifically includes: the abstract corpus acquisition module 710, the entity identification module 720, the entity and book association table building module 730 and the book knowledge graph building module 740. Wherein:
The abstract corpus acquisition module 710 is configured to acquire abstract corpuses corresponding to at least two books respectively;
the entity recognition module 720 is configured to perform entity recognition according to abstract corpus corresponding to the book;
The entity and book association table establishing module 730 is configured to establish an entity and book association table according to the identified entity, where the entity and book association table includes: a first association weight between the entity and the book;
and the book knowledge graph construction module 740 is used for constructing the book knowledge graph according to the entity and book association table.
According to the technical scheme of this embodiment, entity recognition is performed according to the abstract corpora corresponding to the books, an entity and book association table is established according to the recognized entities, and the book knowledge graph is constructed based on the entity and book association table. In this way a new book knowledge graph representing the degree of association between entities and books is built; the association between books is established through the entities the books contain, and the content of the book knowledge graph is redefined, so that a new technique for recommending books based on the book knowledge graph can be realized.
Based on the above embodiments, the entity identification module 720 includes:
The model identification unit is used for inputting abstract corpus corresponding to books into at least one entity identification model to carry out entity identification, so as to obtain an entity matched with the entity identification model;
the entity recognition models of different types are obtained through training of training data of different types.
On the basis of the above embodiments, the types of the entity recognition model include: a conceptual entity recognition model, a point of interest entity recognition model, and a proper-name entity recognition model;
the conceptual entity recognition model is used for recognizing conceptual entities, wherein the conceptual entities are topic labels that have a topical association with the book;
the point of interest entity recognition model is used for recognizing point of interest entities, wherein the point of interest entities are topic labels associated with users' points of interest;
the proper-name entity recognition model is used for recognizing proper-name entities, wherein the proper-name entities are proper nouns included in the abstract corpus.
On the basis of the above embodiments, the device further comprises a data form unification module, which is used for:
after the abstract corpora corresponding to the books are input into at least one entity recognition model for entity recognition and the entities matching the entity recognition model are obtained, if the recognized entities comprise at least two data forms, converting the entities of the at least two data forms into entities of the same data form according to the mapping relation between entities of different data forms; and performing de-duplication processing on the entities of the same data form;
wherein the entities recognized by the point of interest entity recognition model and the entities recognized by the proper-name entity recognition model have different data forms.
On the basis of the above embodiments, the device further comprises an entity screening module, which is used for: after entity recognition is performed according to the abstract corpora corresponding to the books, calculating the inverse text frequency index corresponding to each recognized entity according to the abstract corpora corresponding to the at least two books, and filtering out the first universal entities included in the entities according to the calculated inverse text frequency index; and/or
counting the number of books associated with each entity according to the books corresponding to the abstract corpora from which the entity was recognized, and filtering out the second universal entities included in the entities according to the counted number of books.
Based on the above embodiments, the entity and book association table building module 730 specifically includes:
The first association weight determining unit is used for obtaining the identification weight corresponding to the identified entity, which is output after the abstract corpus is input into the entity identification model, and taking the identification weight as the first association weight between the entity and the book corresponding to the abstract corpus;
and the association table establishing unit is used for establishing the association table of the entity and the book according to the first association weight.
On the basis of the above embodiments, the device further comprises: a first association weight adjustment unit, used for adjusting the first association weight according to the weight adjustment coefficient corresponding to the entity recognition model that recognized the entity, before the entity and book association table is established according to the first association weight;
wherein different types of entity recognition models have different weight adjustment coefficients.
On the basis of the above embodiments, the device further comprises: an entity table establishing module, used for establishing an entity table according to the entities recognized from the abstract corpora after entity recognition is performed according to the abstract corpora corresponding to the books; the entity table comprises: a correspondence between entities and entity types;
Correspondingly, the book knowledge graph construction module 740 is further configured to: and constructing the book knowledge graph according to the entity and book association table and the entity table.
On the basis of the above embodiments, the device further comprises: a book table establishing module, used for establishing a book table corresponding to the at least two books after the abstract corpora corresponding to the at least two books are acquired, wherein the book table comprises: a correspondence among books, abstract corpora, and book attribute information;
Correspondingly, the book knowledge graph construction module 740 is further configured to: and constructing the book knowledge graph according to the entity and book association table, the entity table and the book table.
On the basis of the above embodiments, the device further comprises: an entity and entity association table establishing module, used for establishing an entity and entity association table according to the semantic similarity among the recognized entities after entity recognition is performed according to the abstract corpora corresponding to the books, wherein the entity and entity association table comprises: a second association weight between every two entities;
Correspondingly, the book knowledge graph construction module 740 is further configured to: and constructing the book knowledge graph according to the entity-book association table, the entity table, the book table and the entity-entity association table.
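The four tables named in the embodiments above can be pictured as one container. The sketch below is an assumed in-memory layout for illustration; the class name, field names and key shapes are hypothetical and are not prescribed by the text.

```python
from dataclasses import dataclass, field


@dataclass
class BookKnowledgeGraph:
    """Assumed in-memory layout of the four tables (illustrative only)."""

    # (entity, book_id) -> first association weight
    entity_book: dict = field(default_factory=dict)
    # entity -> entity type ("concept", "point_of_interest", "special_name")
    entity_table: dict = field(default_factory=dict)
    # book_id -> {"abstract": str, "attributes": dict}
    book_table: dict = field(default_factory=dict)
    # (entity_a, entity_b) -> second association weight
    entity_entity: dict = field(default_factory=dict)
```

Only `entity_book` is required by the basic construction method; the other three tables are filled in when the corresponding optional modules are present.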
On the basis of the above embodiments, the entity and entity association table establishing module is specifically configured to: input every two entities into a semantic similarity recognition model to obtain the second association weight between the two entities;
and establish the entity and entity association table according to the second association weight between every two entities;
wherein the semantic similarity recognition model is obtained by splitting the abstract corpus corresponding to the at least two books into sentences, segmenting the sentences into words with the recognized entities used as the segmentation dictionary, and training on the word vectors obtained from the segmented sentences.
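One way to picture the second association weight is as the cosine similarity between the word vectors of two entities. The sketch below makes that assumption explicitly; the text does not state which similarity measure the semantic similarity recognition model uses, and the vectors in the usage example are toy values rather than trained embeddings.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_entity_entity_table(entity_vectors):
    """Build {(entity_a, entity_b): second association weight} for every unordered pair.

    `entity_vectors` maps each entity to its (assumed, pre-trained) word vector.
    """
    return {
        (a, b): cosine_similarity(entity_vectors[a], entity_vectors[b])
        for a, b in combinations(sorted(entity_vectors), 2)
    }
```

For example, `build_entity_entity_table({"magic": [1, 0], "sorcery": [1, 0], "war": [0, 1]})` gives the pair `("magic", "sorcery")` a weight of 1.0 and the pair `("magic", "war")` a weight of 0.0.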
The device can execute the method provided by any embodiment of the invention, and has the functional modules and beneficial effects corresponding to the executed method.
Example eight
Fig. 8 is a schematic structural diagram of a book recommendation device according to an eighth embodiment of the present invention. The embodiment may be suitable for recommending books to users, and the device may be implemented in a software and/or hardware manner, and may be integrated in any device that provides a book recommending function, as shown in fig. 8, where the book recommending device specifically includes: the book knowledge graph query module 810, the target entity search module 820, the to-be-recommended book search module 830, and the to-be-recommended book providing module 840, wherein:
The book knowledge graph query module 810 is configured to obtain at least one historical attention book of a user, and query a book knowledge graph according to the historical attention book, where the book knowledge graph includes an entity and book association table, and the entity and book association table includes: a first association weight between the entity and the book;
a target entity searching module 820, configured to search at least one target entity associated with the historical attention book in the entity-book association table;
The to-be-recommended book searching module 830 is configured to reversely search, according to the target entity, at least one book corresponding to the target entity in the entity-book association table as a to-be-recommended book;
and the book to be recommended providing module 840 is configured to provide the book to be recommended to the user.
According to the technical scheme of this embodiment, a historical attention book of a user is obtained, a book knowledge graph is queried according to the historical attention book, and at least one target entity associated with the historical attention book is searched for in the entity and book association table; then, according to the target entity, at least one book corresponding to the target entity is reversely searched for in the entity and book association table and provided to the user as a book to be recommended. This realizes a new mode of book recommendation based on a book knowledge graph constructed from the degree of association between each entity and each book, optimizes existing book recommendation technology, and meets people's increasingly personalized and convenient book reading demands.
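The forward lookup (book to target entities) and reverse lookup (target entity to books) can be sketched as below. Two choices in this sketch are assumptions not stated in the text: books already in the history are excluded from the result, and candidates are ranked by the sum of their first association weights over the matching target entities.

```python
from collections import defaultdict


def index_entity_book_table(entity_book_table):
    """Split {(entity, book_id): weight} into book->entities and entity->books indexes."""
    by_book, by_entity = defaultdict(dict), defaultdict(dict)
    for (entity, book_id), weight in entity_book_table.items():
        by_book[book_id][entity] = weight
        by_entity[entity][book_id] = weight
    return by_book, by_entity


def recommend(history, entity_book_table):
    """Return books to be recommended for a list of historical attention books."""
    by_book, by_entity = index_entity_book_table(entity_book_table)
    # Forward lookup: every entity associated with a historical attention book.
    targets = {entity for book_id in history for entity in by_book.get(book_id, {})}
    # Reverse lookup: every other book associated with a target entity.
    scores = defaultdict(float)
    for entity in targets:
        for book_id, weight in by_entity[entity].items():
            if book_id not in history:
                scores[book_id] += weight
    # Highest summed weight first; ties broken by book id for a stable order.
    return sorted(scores, key=lambda book_id: (-scores[book_id], book_id))
```

With a table in which book `b1` shares the entity "magic" with `b2` and the entity "school" with `b3`, calling `recommend(["b1"], table)` returns `b2` and `b3` and leaves out books that share no entity with `b1`.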
On the basis of the above embodiments, the book knowledge graph further includes: an entity table, the entity table comprising: correspondence between entities and entity types;
The apparatus further comprises: a recommendation reason item construction module, configured to, after at least one book corresponding to the target entity is reversely searched for in the entity and book association table as a book to be recommended according to the target entity, query the entity table for the entity type corresponding to the at least one target entity, and construct a recommendation reason item corresponding to the book to be recommended according to the target entity and the entity type corresponding to the target entity;
the book to be recommended providing module 840 is further configured to: provide the books to be recommended, together with the recommendation reason items corresponding to the books to be recommended, to the user.
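A recommendation reason item can be built from a target entity and its entity type with one template per type. The wording of the templates and the type labels below are assumptions for illustration; the text specifies only that the reason is constructed from the target entity and its entity type.

```python
# Hypothetical reason templates, one per entity type.
REASON_TEMPLATES = {
    "concept": 'Recommended because it covers the topic "{entity}"',
    "point_of_interest": 'Recommended because you showed interest in "{entity}"',
    "special_name": 'Recommended because it mentions "{entity}"',
}
DEFAULT_TEMPLATE = 'Recommended because it is related to "{entity}"'


def build_reason_item(target_entity, entity_table):
    """Look up the entity type in the entity table and fill in the matching template."""
    entity_type = entity_table.get(target_entity)
    template = REASON_TEMPLATES.get(entity_type, DEFAULT_TEMPLATE)
    return template.format(entity=target_entity)
```

Keeping the entity type in a separate entity table lets the same target entity produce differently worded reasons without touching the entity and book association table.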
On the basis of the above embodiments, the book knowledge graph further includes: a book table, the book table comprising: book, abstract corpus and corresponding relation between book attribute information;
the apparatus further comprises: a book screening module, configured to, after at least one book corresponding to the target entity is reversely searched for in the entity and book association table as a book to be recommended according to the target entity, screen the books to be recommended according to the book attribute information of the books to be recommended in the book table.
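Screening by book attribute information can be sketched as a filter over the candidate list. The attribute names `rating` and `category`, and the shape of a book table record, are hypothetical; the text does not enumerate which attributes are used.

```python
def screen_books(candidates, book_table, min_rating=0.0, allowed_categories=None):
    """Keep candidates whose (assumed) attributes pass the rating and category checks.

    `book_table` maps book_id -> {"abstract": str, "attributes": dict}.
    Candidates missing from the book table are dropped.
    """
    kept = []
    for book_id in candidates:
        record = book_table.get(book_id)
        if record is None:
            continue
        attributes = record["attributes"]
        if attributes.get("rating", 0.0) < min_rating:
            continue
        if allowed_categories is not None and attributes.get("category") not in allowed_categories:
            continue
        kept.append(book_id)
    return kept
```

The filter preserves the order of `candidates`, so any ranking applied before screening is kept.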
On the basis of the above embodiments, the book knowledge graph further includes: an entity-to-entity association table, the entity-to-entity association table comprising: the second association weight between every two entities;
The apparatus further comprises: a first target entity screening module, configured to, after at least one target entity associated with the historical attention book is searched for in the entity and book association table, if the number of target entities found is less than a first number threshold, acquire an extended entity corresponding to the target entity from the entity and entity association table and add the extended entity to the target entities.
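The extension step can be sketched as follows. How many extended entities are taken per target entity, and the choice of the neighbours with the highest second association weight, are assumptions of this sketch; the text says only that extended entities are obtained from the entity and entity association table when too few target entities are found.

```python
def expand_targets(targets, entity_entity_table, first_threshold, per_entity=2):
    """Add the closest neighbours of each target when fewer than `first_threshold` were found.

    `entity_entity_table` maps an unordered pair (entity_a, entity_b) to a second
    association weight. `per_entity` (hypothetical) caps neighbours taken per target.
    """
    if len(targets) >= first_threshold:
        return list(targets)
    expanded, seen = list(targets), set(targets)
    for target in targets:
        neighbours = []
        for (a, b), weight in entity_entity_table.items():
            if a == target:
                neighbours.append((weight, b))
            elif b == target:
                neighbours.append((weight, a))
        # Strongest association first; ties broken alphabetically.
        neighbours.sort(key=lambda pair: (-pair[0], pair[1]))
        for _, other in neighbours[:per_entity]:
            if other not in seen:
                seen.add(other)
                expanded.append(other)
    return expanded
```

Because the original targets stay at the front of the returned list, a later step can still tell direct matches from extended ones by position.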
On the basis of the above embodiments, the device further includes: a second target entity screening module, configured to, after at least one target entity associated with the historical attention book is searched for in the entity and book association table, if the number of target entities found is greater than a second number threshold, determine the priority order of the historical attention books according to the time at which the user paid attention to each of the at least one historical attention book, and screen the target entities according to the priority order; and/or
a de-duplication module, configured to perform de-duplication processing on the books to be recommended before the books to be recommended are provided to the user.
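The two optional steps above can be sketched together. The sketch assumes that a more recent attention time means a higher priority and that screening keeps the first entities up to the second number threshold; the text states only that the priority order is derived from the attention time.

```python
def screen_targets(history, entities_by_book, second_threshold):
    """Trim target entities when more than `second_threshold` were found.

    `history` is a list of (book_id, attention_timestamp); `entities_by_book`
    maps book_id -> list of entities. Entities from the most recently attended
    books are kept first (an assumed priority rule).
    """
    ordered = sorted(history, key=lambda item: item[1], reverse=True)
    targets = []
    for book_id, _ in ordered:
        for entity in entities_by_book.get(book_id, []):
            if entity not in targets:
                targets.append(entity)
    if len(targets) > second_threshold:
        return targets[:second_threshold]
    return targets


def deduplicate(books):
    """Remove repeated books while keeping the first occurrence of each."""
    return list(dict.fromkeys(books))
```

De-duplication is needed because the same book can be reached through several target entities during the reverse lookup.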
The device can execute the method provided by any embodiment of the invention, and has the functional modules and beneficial effects corresponding to the executed method.
Example nine
Fig. 9 is a schematic structural diagram of a computer device according to a ninth embodiment of the present invention. Fig. 9 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in fig. 9 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in fig. 9, the computer device 12 is in the form of a general purpose computing device. Components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, a bus 18 that connects the various system components, including the system memory 28 and the processing units 16.
Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in FIG. 9, commonly referred to as a "hard disk drive"). Although not shown in fig. 9, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 18 through one or more data medium interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.
The computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), one or more devices that enable a user to interact with the computer device 12, and/or any devices (e.g., network card, modem, etc.) that enable the computer device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. In addition, in the computer device 12 of the present embodiment, the display 24 is not present as a separate body but is embedded in the mirror surface, and the display surface of the display 24 and the mirror surface are visually integrated when the display surface of the display 24 is not displayed. Moreover, computer device 12 may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through network adapter 20. As shown, network adapter 20 communicates with other modules of computer device 12 via bus 18. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with computer device 12, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, implementing the method for constructing a book knowledge graph provided by the embodiment of the present invention:
Acquiring abstract corpus corresponding to at least two books respectively; entity identification is carried out according to abstract corpus corresponding to books; according to the identified entity, establishing an entity and book association table, wherein the entity and book association table comprises: a first association weight between the entity and the book; and constructing the book knowledge graph according to the entity and book association table.
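The construction steps listed above can be sketched end to end. The recognizer here is only a stand-in: it matches a fixed lexicon by substring counting and uses each entity's share of the matches as its weight, whereas the embodiments describe trained entity recognition models that output a recognition weight. All names in the sketch are hypothetical.

```python
def recognize_entities(abstract, lexicon):
    """Stand-in recognizer: count case-insensitive occurrences of each lexicon term.

    Returns {entity: weight}, where weight is the entity's share of all matches.
    """
    text = abstract.lower()
    counts = {term: text.count(term.lower()) for term in lexicon}
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items() if n > 0}


def build_table_from_abstracts(abstracts, lexicon):
    """Build {(entity, book_id): first association weight} from book abstracts.

    `abstracts` maps book_id -> abstract text for at least two books.
    """
    table = {}
    for book_id, abstract in abstracts.items():
        for entity, weight in recognize_entities(abstract, lexicon).items():
            table[(entity, book_id)] = weight
    return table
```

The resulting table is the minimal book knowledge graph of the method: it records which entities are associated with which books and how strongly.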
For another example, the book recommendation method provided by the embodiment of the invention is implemented: acquiring at least one historical attention book of a user, and querying a book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises an entity and book association table, and the entity and book association table comprises: a first association weight between the entity and the book; searching for at least one target entity associated with the historical attention book in the entity and book association table; according to the target entity, reversely searching for at least one book corresponding to the target entity in the entity and book association table to serve as a book to be recommended; and providing the books to be recommended to the user.
Example ten
The tenth embodiment of the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method for constructing a book knowledge graph provided by any embodiment of the present application:
Acquiring abstract corpus corresponding to at least two books respectively; performing entity recognition according to the abstract corpus corresponding to the books; according to the identified entities, establishing an entity and book association table, wherein the entity and book association table comprises: a first association weight between the entity and the book; and constructing the book knowledge graph according to the entity and book association table.
Alternatively, the program, when executed by a processor, implements the book recommendation method provided by any embodiment of the present application: acquiring at least one historical attention book of a user, and querying a book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises an entity and book association table, and the entity and book association table comprises: a first association weight between the entity and the book; searching for at least one target entity associated with the historical attention book in the entity and book association table; according to the target entity, reversely searching for at least one book corresponding to the target entity in the entity and book association table to serve as a book to be recommended; and providing the books to be recommended to the user.
Any combination of one or more computer readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims (17)

1. The construction method of the book knowledge graph is characterized by comprising the following steps:
acquiring abstract corpus corresponding to at least two books respectively;
Entity identification is carried out according to abstract corpus corresponding to books;
According to the abstract corpus corresponding to at least two books, calculating an inverse text frequency index corresponding to the identified entity; according to the calculated inverse text frequency index, filtering out a first universal entity included in the entity; according to books corresponding to the abstract corpus of the identified entity, counting book quantity values associated with each entity; filtering out a second universal entity included in the entity according to the book quantity value obtained through statistics; wherein the first universal entity refers to an entity with an inverse text frequency index smaller than a set threshold; the second universal entity refers to an entity with the associated book quantity value exceeding a set quantity threshold;
Establishing an entity table according to the entity identified by the abstract corpus; the entity table comprises: correspondence between entities and entity types; wherein, the entity refers to definition or abstract concept for describing the content of the book at a set angle; the entity types comprise conceptual entities, point of interest entities and special name entities;
According to the identified entity, establishing an entity and book association table, wherein the entity and book association table comprises: a first association weight between the entity and the book;
Constructing the book knowledge graph according to the entity and book association table and the entity table; the book knowledge graph comprises a plurality of books and a book association table of entities corresponding to the entities; the book knowledge graph records the association relationship between books and entities.
2. The method of claim 1, wherein performing entity recognition based on the abstract corpus corresponding to the book comprises:
inputting abstract corpus corresponding to books into at least one entity recognition model for entity recognition to obtain an entity matched with the entity recognition model;
the entity recognition models of different types are obtained through training of training data of different types.
3. The method of claim 2, wherein the entity recognition model types comprise: a conceptual entity recognition model, a point of interest entity recognition model, and a special name entity recognition model;
the conceptual entity recognition model is used for identifying conceptual entities, wherein the conceptual entities are topic labels associated with the book;
the point of interest entity recognition model is used for identifying point of interest entities, wherein the point of interest entities are topic labels associated with interest points of users;
The special name entity recognition model is used for recognizing special name entities, and the special name entities are proper nouns included in the abstract corpus.
4. The method of claim 3, further comprising, after inputting the abstract corpus corresponding to the book into at least one entity recognition model to perform entity recognition to obtain an entity matched with the entity recognition model:
If the identified entity comprises at least two data forms, converting the obtained at least two data forms into the entity of the same data form according to the mapping relation between the entities of different data forms;
performing de-duplication processing on the entities in the same data form;
wherein the entity identified by the point of interest entity recognition model and the entity identified by the special name entity recognition model have different data forms.
5. The method of claim 3, wherein establishing an entity-book association table based on the identified entity comprises:
Acquiring an identification weight corresponding to an identified entity, which is output after the abstract corpus is input into the entity identification model, and taking the identification weight as a first association weight between the entity and a book corresponding to the abstract corpus;
and establishing an association table of the entity and the book according to the first association weight.
6. The method of claim 5, further comprising, prior to establishing the entity-book association table based on the first association weight:
Adjusting the first association weight according to a weight adjustment coefficient corresponding to the entity identification model of the identified entity;
wherein different types of entity recognition models have different weight adjustment coefficients.
7. The method of claim 1, further comprising, after obtaining the corpus of abstracts corresponding to the at least two books, respectively:
establishing a book table corresponding to the at least two books, wherein the book table comprises: book, abstract corpus and corresponding relation between book attribute information;
according to the entity and book association table and the entity table, constructing the book knowledge graph, further comprising:
and constructing the book knowledge graph according to the entity and book association table, the entity table and the book table.
8. The method of claim 7, further comprising, after entity recognition based on the abstract corpus corresponding to the book:
According to the identified semantic similarity between the entities, establishing an entity-entity association table, wherein the entity-entity association table comprises: the second association weight between every two entities;
According to the entity and book association table, the entity table and the book table, the book knowledge graph is constructed, and the method further comprises the steps of:
and constructing the book knowledge graph according to the entity-book association table, the entity table, the book table and the entity-entity association table.
9. The method of claim 8, wherein establishing the entity-to-entity association table based on the identified semantic similarity between the entities comprises:
Inputting the two entities into a semantic similarity recognition model respectively to obtain a second association weight between the two entities;
establishing the entity and entity association table according to the second association weight between every two entities;
The semantic similarity recognition model is obtained by dividing abstract corpus corresponding to at least two books into sentences and training word vectors obtained after the sentences are segmented by using each recognized entity as a segmentation dictionary.
10. A book recommendation method, comprising:
Acquiring at least one historical attention book of a user, and querying a book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises: an entity and book association table and an entity table, wherein the entity table comprises: a correspondence between entities and entity types, and the entity and book association table comprises: a first association weight between the entity and the book;
Searching at least one target entity associated with the historical attention book in the entity and book association table;
according to the target entity, reversely searching at least one book corresponding to the target entity in the entity and book association table to serve as a book to be recommended;
Inquiring an entity type corresponding to the at least one target entity in the entity table, and constructing a recommendation reason item corresponding to the book to be recommended according to the target entity and the entity type corresponding to the target entity;
Providing the books to be recommended and recommendation reason items corresponding to the books to be recommended to the user;
The book knowledge graph comprises a plurality of books and a book association table of entities corresponding to the entities; the book knowledge graph records the association relationship between books and entities; the book knowledge graph is constructed by adopting the construction method of the book knowledge graph in any one of claims 1-9.
11. The method of claim 10, wherein the book knowledge graph further comprises: a book table, the book table comprising: book, abstract corpus and corresponding relation between book attribute information;
After reversely searching at least one book corresponding to the target entity in the entity and book association table as a book to be recommended according to the target entity, the method further comprises the following steps:
And screening the books to be recommended according to the book attribute information of the books to be recommended in the book table.
12. The method of claim 11, wherein the book knowledge graph further comprises: an entity-to-entity association table, the entity-to-entity association table comprising: the second association weight between every two entities;
after searching at least one target entity associated with the historical attention book in the entity and book association table, the method further comprises the following steps:
And if the number of the searched target entities is smaller than a first number threshold, acquiring an extended entity corresponding to the target entity from the entity-entity association table, and adding the extended entity into the target entity.
13. The method of any of claims 10-12, further comprising, after locating at least one target entity associated with the historical focus book in the entity-book association table:
if the number of the searched target entities is determined to be greater than a second number threshold, determining the priority order of the historical attention books according to the attention time of the user to the at least one historical attention book, and screening the target entities according to the priority order; and/or
Before providing the book to be recommended to the user, the method further comprises: and carrying out de-duplication treatment on the books to be recommended.
14. The device for constructing the book knowledge graph is characterized by comprising the following components:
the abstract corpus acquisition module is used for acquiring abstract corpuses corresponding to at least two books respectively;
The entity recognition module is used for carrying out entity recognition according to abstract corpus corresponding to the book;
The entity screening module is used for calculating an inverse text frequency index corresponding to the identified entity according to abstract corpus corresponding to at least two books respectively; according to the calculated inverse text frequency index, filtering out a first universal entity included in the entity; according to books corresponding to the abstract corpus of the identified entity, counting book quantity values associated with each entity; filtering out a second universal entity included in the entity according to the book quantity value obtained through statistics; wherein the first universal entity refers to an entity with an inverse text frequency index smaller than a set threshold; the second universal entity refers to an entity with the associated book quantity value exceeding a set quantity threshold;
the entity table establishing module is used for establishing an entity table according to the entity identified by the abstract corpus; the entity table comprises: correspondence between entities and entity types; wherein, the entity refers to definition or abstract concept for describing the content of the book at a set angle; the entity types comprise conceptual entities, point of interest entities and special name entities;
the entity and book association table establishing module is used for establishing an entity and book association table according to the identified entity, wherein the entity and book association table comprises the following components: a first association weight between the entity and the book;
The book knowledge graph construction module is used for constructing the book knowledge graph according to the entity and book association table and the entity table;
the book knowledge graph comprises a plurality of books and a book association table of entities corresponding to the entities; the book knowledge graph records the association relationship between books and entities.
15. A book recommendation device, characterized by comprising:
The book knowledge graph query module is used for acquiring at least one historical attention book of a user and querying a book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises: an entity and book association table and an entity table, wherein the entity table comprises: a correspondence between entities and entity types, and the entity and book association table comprises: a first association weight between the entity and the book;
The target entity searching module is used for searching at least one target entity associated with the historical attention book in the entity and book association table;
The book searching module to be recommended is used for reversely searching at least one book corresponding to the target entity in the entity and book association table according to the target entity to serve as a book to be recommended;
The recommending reason item construction module is used for inquiring the entity type corresponding to the at least one target entity in the entity table, and constructing a recommending reason item corresponding to the book to be recommended according to the target entity and the entity type corresponding to the target entity;
The book to be recommended providing module is used for providing the books to be recommended and recommendation reason items corresponding to the books to be recommended to the user;
The book knowledge graph comprises a plurality of books and a book association table of entities corresponding to the entities; the book knowledge graph records the association relationship between books and entities; the book knowledge graph is constructed by adopting the construction method of the book knowledge graph in any one of claims 1-9.
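The lookup flow of the modules above (historical book to target entities, target entities back to books, entity type to recommendation reason) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the sample data, the "type: entity" reason format, and the ranking of candidates by first association weight are hypothetical and are not specified by this claim.

```python
def recommend_books(history, book_to_entities, entity_to_books, entity_table):
    """Return (book, recommendation_reason) pairs for a user's history.

    history:          list of historical attention books.
    book_to_entities: dict book -> {entity: first association weight}.
    entity_to_books:  dict entity -> {book: first association weight}.
    entity_table:     dict entity -> entity type.
    """
    seen = set(history)
    best = {}  # candidate book -> (weight, reason)
    for book in history:
        # Step 1: find the target entities associated with the historical book.
        for entity in book_to_entities.get(book, {}):
            entity_type = entity_table.get(entity, "entity")
            # Step 2: reverse lookup of the books associated with the target entity.
            for candidate, weight in entity_to_books.get(entity, {}).items():
                if candidate in seen:
                    continue  # do not recommend books the user already follows
                # Step 3: build the reason item from the entity and its type.
                reason = f"{entity_type}: {entity}"
                if candidate not in best or weight > best[candidate][0]:
                    best[candidate] = (weight, reason)
    # Assumption: rank by weight (descending), then by title for stable output.
    ranked = sorted(best.items(), key=lambda item: (-item[1][0], item[0]))
    return [(book, reason) for book, (_, reason) in ranked]


# Hypothetical sample graph.
book_to_entities = {
    "Book A": {"Author X": 0.9, "Topic Y": 0.6},
    "Book B": {"Author X": 0.8},
    "Book C": {"Topic Y": 0.7},
}
entity_to_books = {
    "Author X": {"Book A": 0.9, "Book B": 0.8},
    "Topic Y": {"Book A": 0.6, "Book C": 0.7},
}
entity_table = {"Author X": "author", "Topic Y": "topic"}

recommendations = recommend_books(
    ["Book A"], book_to_entities, entity_to_books, entity_table
)
for book, reason in recommendations:
    print(book, "|", reason)
```

Because each recommended book is reached through a named entity, the reason item falls out of the lookup itself, which is the property the recommendation reason item construction module relies on.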
16. A computer device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the book knowledge graph construction method of any one of claims 1 to 9, or the book recommendation method of any one of claims 10 to 13.
17. A computer storage medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the book knowledge graph construction method of any one of claims 1 to 9 or implements the book recommendation method of any one of claims 10 to 13.
CN201810719673.0A 2018-07-03 2018-07-03 Book knowledge graph construction method, book recommendation method, device, equipment and medium Active CN110737774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810719673.0A CN110737774B (en) 2018-07-03 2018-07-03 Book knowledge graph construction method, book recommendation method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810719673.0A CN110737774B (en) 2018-07-03 2018-07-03 Book knowledge graph construction method, book recommendation method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN110737774A (en) 2020-01-31
CN110737774B (en) 2024-05-24

Family

ID=69234271

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810719673.0A Active CN110737774B (en) 2018-07-03 2018-07-03 Book knowledge graph construction method, book recommendation method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN110737774B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325033B (en) * 2020-03-20 2023-07-11 中国建设银行股份有限公司 Entity identification method, entity identification device, electronic equipment and computer readable storage medium
CN112434168A (en) * 2020-11-09 2021-03-02 广西壮族自治区图书馆 Knowledge graph construction method and fragmentized knowledge generation method based on library
CN112182424B (en) * 2020-11-11 2023-01-31 重庆邮电大学 Social recommendation method based on integration of heterogeneous information and isomorphic information networks
CN112818261A (en) * 2021-01-27 2021-05-18 沈阳美行科技有限公司 Navigation method and device based on POI (Point of interest) knowledge graph and electronic equipment
CN113076428A (en) * 2021-03-19 2021-07-06 北京沃东天骏信息技术有限公司 Method and device for generating book list
CN113688269B (en) * 2021-07-21 2023-05-02 北京三快在线科技有限公司 Image-text matching result determining method and device, electronic equipment and readable storage medium
CN114428864A (en) * 2022-04-01 2022-05-03 杭州未名信科科技有限公司 Knowledge graph construction method and device, electronic equipment and medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101826102A (en) * 2010-03-26 2010-09-08 浙江大学 Automatic book keyword generation method
CN103164405A (en) * 2011-12-08 2013-06-19 盛乐信息技术(上海)有限公司 Generation method for relevant video data bank, recommendation method and recommendation system for relevant videos
CN103914543A (en) * 2014-04-03 2014-07-09 北京百度网讯科技有限公司 Search result displaying method and device
CN104346446A (en) * 2014-10-27 2015-02-11 百度在线网络技术(北京)有限公司 Paper associated information recommendation method and device based on mapping knowledge domain
CN105653706A (en) * 2015-12-31 2016-06-08 北京理工大学 Multilayer quotation recommendation method based on literature content mapping knowledge domain
CN106202184A (en) * 2016-06-27 2016-12-07 华中科技大学 Personalized book recommendation method and system for university libraries
CN107122444A (en) * 2017-04-24 2017-09-01 北京科技大学 Method for automatically constructing a legal knowledge graph
CN107729444A (en) * 2017-09-30 2018-02-23 桂林电子科技大学 Knowledge-graph-based personalized tourist attraction recommendation method
CN107943910A (en) * 2017-11-18 2018-04-20 电子科技大学 Personalized book recommendation method based on a combined algorithm
CN108009194A (en) * 2017-10-23 2018-05-08 广州星耀悦教育科技有限公司 Book pushing method, electronic device, storage medium and apparatus
CN108090074A (en) * 2016-11-22 2018-05-29 上海阿法迪智能标签系统技术有限公司 Book recommendation system and method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101306667B1 (en) * 2009-12-09 2013-09-10 한국전자통신연구원 Apparatus and method for knowledge graph stabilization
CN104102713B (en) * 2014-07-16 2018-01-19 百度在线网络技术(北京)有限公司 Recommendation results show method and apparatus
US10032208B2 (en) * 2015-12-15 2018-07-24 International Business Machines Corporation Identifying recommended electronic books with detailed comparisons

Also Published As

Publication number Publication date
CN110737774A (en) 2020-01-31

Similar Documents

Publication Publication Date Title
CN110737774B (en) Book knowledge graph construction method, book recommendation method, device, equipment and medium
AU2018383346B2 (en) Domain-specific natural language understanding of customer intent in self-help
CN108829822B (en) Media content recommendation method and device, storage medium and electronic device
CN107436922B (en) Text label generation method and device
CN108829893B (en) Method and device for determining video label, storage medium and terminal equipment
CN106649818B (en) Application search intention identification method and device, application search method and server
CN112131350B (en) Text label determining method, device, terminal and readable storage medium
CN109190049B (en) Keyword recommendation method, system, electronic device and computer readable medium
CN109388743B (en) Language model determining method and device
CN110162768B (en) Method and device for acquiring entity relationship, computer readable medium and electronic equipment
US10915756B2 (en) Method and apparatus for determining (raw) video materials for news
CN113806588B (en) Method and device for searching video
CN113342958B (en) Question-answer matching method, text matching model training method and related equipment
CN112182145A (en) Text similarity determination method, device, equipment and storage medium
CN113806660A (en) Data evaluation method, training method, device, electronic device and storage medium
CN113314207A (en) Object recommendation method and device, storage medium and electronic equipment
CN111460224B (en) Comment data quality labeling method, comment data quality labeling device, comment data quality labeling equipment and storage medium
CN113077312A (en) Hotel recommendation method, system, equipment and storage medium
CN117131155A (en) Multi-category identification method, device, electronic equipment and storage medium
CN112231444A (en) Processing method and device for corpus data combining RPA and AI and electronic equipment
CN116796730A (en) Text error correction method, device, equipment and storage medium based on artificial intelligence
CN116662495A (en) Question-answering processing method, and method and device for training question-answering processing model
CN114201622B (en) Method and device for acquiring event information, electronic equipment and storage medium
CN116151258A (en) Text disambiguation method, electronic device and storage medium
CN112802454B (en) Method and device for recommending awakening words, terminal equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant