CN110737774A - Book knowledge graph construction method, book recommendation method, device, equipment and medium - Google Patents


Info

Publication number
CN110737774A
Authority
CN
China
Prior art keywords
entity
book
books
entities
association table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810719673.0A
Other languages
Chinese (zh)
Other versions
CN110737774B (en
Inventor
许瑾
刘文昱
郝萌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810719673.0A priority Critical patent/CN110737774B/en
Priority claimed from CN201810719673.0A external-priority patent/CN110737774B/en
Publication of CN110737774A publication Critical patent/CN110737774A/en
Application granted granted Critical
Publication of CN110737774B publication Critical patent/CN110737774B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses book knowledge graph construction and book recommendation methods, devices, computer equipment and storage media.

Description

Book knowledge graph construction method, book recommendation method, device, equipment and medium
Technical Field
The embodiments of the invention relate to data processing technology, and in particular to a book knowledge graph construction method, a book recommendation method, corresponding devices, a computer device and a storage medium.
Background
At present, with the popularization of mobile terminals such as mobile phones and the development of e-book readers, electronic books are increasingly favored by readers. Current reading apps (application programs) typically select certain popular or highly rated books and recommend them on the main interface for users to read, but such popular or highly rated books are not necessarily the ones a given user likes.
With the continuous advancement of technology, people's requirements for book recommendation technology are also rising, and existing book recommendation technology cannot meet people's ever-increasing demand for personalized and convenient book reading.
Disclosure of Invention
The embodiments of the invention provide a book knowledge graph construction method and device, a book recommendation method and device, a computer device and a storage medium, so as to provide a new way of constructing a book knowledge graph and a new way of recommending books based on the book knowledge graph.
In a first aspect, an embodiment of the invention provides a method for constructing a book knowledge graph, which comprises the following steps:
acquiring abstract corpora corresponding to at least two books respectively;
performing entity recognition according to the abstract corpus corresponding to the book;
establishing an entity-book association table according to the identified entities, wherein the entity-book association table comprises a first association weight between the entities and the books;
and constructing the book knowledge graph according to the entity and book association table.
In a second aspect, an embodiment of the present invention further provides a book recommendation method, including:
acquiring at least one historical attention book of a user, and querying a book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises an entity-book association table, and the entity-book association table comprises an association weight between the entities and the books;
searching the entity-book association table for at least one target entity associated with the historical attention book;
reversely searching the entity-book association table, according to the target entity, for at least one book corresponding to the target entity as a book to be recommended;
and providing the book to be recommended to the user.
In a third aspect, an embodiment of the present invention further provides a device for constructing a book knowledge graph, including:
the abstract corpus acquiring module is used for acquiring abstract corpuses corresponding to at least two books respectively;
the entity identification module is used for performing entity recognition according to the abstract corpus corresponding to the book;
an entity-book association table establishing module, configured to establish an entity-book association table according to the identified entities, where the entity-book association table includes a first association weight between the entities and the books;
and the book knowledge graph building module is used for building the book knowledge graph according to the entity and book association table.
In a fourth aspect, an embodiment of the present invention further provides a book recommendation device, including:
the book knowledge graph query module is used for acquiring at least one historical attention book of a user and querying the book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises an entity-book association table, and the entity-book association table comprises an association weight between the entities and the books;
the target entity searching module is used for searching the entity-book association table for at least one target entity associated with the historical attention book;
the book to be recommended searching module is used for reversely searching the entity-book association table, according to the target entity, for at least one book corresponding to the target entity as a book to be recommended;
and the book to be recommended providing module is used for providing the book to be recommended to the user.
In a fifth aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the book knowledge graph construction method according to any embodiment of the present invention, or implements the book recommendation method according to any embodiment of the present invention.
In a sixth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the book knowledge graph construction method according to any embodiment of the present invention, or implements the book recommendation method according to any embodiment of the present invention.
In the technical scheme of the embodiments of the invention, entity recognition is performed on the abstract corpus corresponding to each book, an entity-book association table is established from the recognized entities, and a book knowledge graph is constructed based on that table. The book knowledge graph is then queried according to a user's historical attention books: at least one target entity associated with the historical attention books is found in the entity-book association table, and at least one book corresponding to the target entity is reversely looked up in the same table and recommended to the user. This provides a new way of constructing a book knowledge graph from the degree of association between entities and books and of recommending books based on that graph, optimizes existing book recommendation technology, and meets people's growing demand for personalized and convenient book reading.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a flowchart of a method for constructing a book knowledge graph in the first embodiment of the present invention;
FIG. 2 is a flowchart of a method for constructing a book knowledge graph in the second embodiment of the present invention;
FIG. 3 is a flowchart of a method for constructing a book knowledge graph in the third embodiment of the present invention;
FIG. 4a is a flowchart of a method for constructing a book knowledge graph in the fourth embodiment of the present invention;
FIG. 4b is a schematic structural diagram of a book knowledge graph constructed by the method of the fourth embodiment of the present invention;
FIG. 5 is a flowchart of a book recommendation method in the fifth embodiment of the present invention;
FIG. 6a is a flowchart of a book recommendation method in the sixth embodiment of the present invention;
FIG. 6b is a diagram of a display interface for recommended books in the sixth embodiment of the present invention;
FIG. 6c is a diagram of an application scenario to which the method of the embodiments of the present invention is applicable;
FIG. 7 is a schematic structural diagram of a book knowledge graph construction device in the seventh embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a book recommendation device in the eighth embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a computer device in the ninth embodiment of the present invention.
Detailed Description
The present invention will now be described in further detail with reference to the drawings and examples, it being understood that the specific embodiments herein described are merely illustrative of and not restrictive on the broad invention, and it should be further noted that for the purposes of description, only some, but not all, of the structures associated with the present invention are shown in the drawings.
Before discussing exemplary embodiments in greater detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts.
For ease of reading, the correspondence between the drawings and the main content of the application is briefly described as follows. For the embodiments of the book knowledge graph construction method, the drawings mainly involved are FIG. 1, FIG. 2, FIG. 3, FIG. 4a and FIG. 4b. For the embodiments of the book recommendation method, the drawings mainly involved are FIG. 5, FIG. 6a, FIG. 6b and FIG. 6c. For the embodiments of the book knowledge graph construction device and the book recommendation device, the drawings mainly involved are FIG. 7 and FIG. 8. For the embodiment of the computer device, the drawing mainly involved is FIG. 9.
Example one
FIG. 1 is a flowchart of a method for constructing a book knowledge graph provided in the first embodiment of the present invention. This embodiment is applicable to the case of constructing a book knowledge graph that is used for recommending books to a user. The method can be executed by the book knowledge graph construction device of the embodiments of the present invention; the device can be implemented in software and/or hardware and can generally be integrated into a device with a certain computing capability, such as a terminal or a server. As shown in FIG. 1, the method specifically includes the following operations:
S110, acquiring abstract corpora corresponding to at least two books respectively.
The abstract corpus is a short, semantically coherent text that concisely and accurately describes the main content of a book. When an electronic book is recorded on the Internet, the abstract corpus corresponding to the book is generally recorded along with it, for users to consult when searching for books.
Typically, the abstract corpus corresponding to each e-book can be obtained from one or more data sources (e.g., Baidu Wenku or other electronic document libraries). It should be noted that, if the abstract corpora of the books are obtained from at least two data sources, the corpora from the different data sources first need to be aligned by book name (typically, this book name mapping can be implemented by computing the similarity between book names), so as to obtain one or more abstract corpora matching each book.
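The book name mapping described above can be sketched as follows. This is a minimal illustration only: the function names, the use of `difflib.SequenceMatcher` as the similarity measure and the 0.9 threshold are assumptions, not details given in the application.

```python
from difflib import SequenceMatcher


def title_similarity(a, b):
    """Similarity in [0, 1] between two book names (assumed measure)."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def merge_sources(sources, threshold=0.9):
    """Group abstracts from several data sources under one book name.

    `sources` is a list of dicts mapping book name -> abstract text.
    Book names whose similarity reaches `threshold` (a hypothetical
    value) are treated as the same book.
    """
    merged = {}  # canonical book name -> list of abstracts
    for source in sources:
        for title, abstract in source.items():
            match = next(
                (known for known in merged
                 if title_similarity(known, title) >= threshold),
                None,
            )
            key = match if match is not None else title
            merged.setdefault(key, []).append(abstract)
    return merged
```

Each book then ends up with one or more abstract corpora, one per data source in which it was found.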
Optionally, after the abstract corpora corresponding to the at least two books are obtained, the abstract corpora may first be filtered and cleaned, for example by removing special symbols or resolving encoding-format problems, so as to obtain abstract corpora that conform to a set data format.
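A cleaning step of this kind might look like the sketch below. The set of "special symbols" removed, the UTF-8 default and the function name `clean_abstract` are assumptions chosen for illustration; the application does not specify them.

```python
import re
import unicodedata
from typing import Union


def clean_abstract(raw: Union[bytes, str], encoding: str = "utf-8") -> str:
    """Normalize one abstract into plain, single-spaced text."""
    if isinstance(raw, bytes):
        # Resolve encoding problems: undecodable bytes become U+FFFD.
        raw = raw.decode(encoding, errors="replace")
    # Fold full-width and compatibility characters into a canonical form.
    text = unicodedata.normalize("NFKC", raw)
    # Replace control characters (tabs, newlines, ...) with spaces.
    text = re.sub(r"[\x00-\x1f\x7f]", " ", text)
    # Drop decorative symbols (an assumed, non-exhaustive list).
    text = re.sub(r"[★☆◆■□●○※]+", "", text)
    # Collapse runs of whitespace.
    return re.sub(r"\s+", " ", text).strip()
```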
S120, performing entity recognition according to the abstract corpus corresponding to the book.
In this embodiment, an entity refers to a concrete or abstract concept that describes the content of a book from a set angle, for example, a place name contained in the book (e.g., the river), the historical period corresponding to the content described in the book (e.g., the Northern and Southern Dynasties period), or a main historical figure appearing in the book (e.g., Zhao Yun).
In an optional implementation of this embodiment, performing entity recognition according to the abstract corpus corresponding to the book may be: segmenting the abstract corpus corresponding to the book into words, matching the word segmentation result against a set entity lexicon, and determining the entities included in the abstract corpus according to the matching result.
In another optional implementation of this embodiment, performing entity recognition according to the abstract corpus corresponding to the book may also be: inputting the abstract corpus corresponding to the book into an entity recognition model for entity recognition, so as to obtain the entities matching the entity recognition model.
The entity recognition model can be trained in advance on training samples labeled with entities.
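The lexicon-based variant can be sketched as below. The lexicon contents are hypothetical, and the regular-expression tokenizer is only a stand-in for a real word segmenter (Chinese text in particular would need a dedicated segmentation tool); neither is specified by the application.

```python
import re

# Hypothetical entity lexicon; a real one would be curated in advance.
ENTITY_LEXICON = {"zhao yun", "three kingdoms"}


def segment(text):
    """Stand-in word segmentation: lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def recognize_entities(abstract, lexicon=ENTITY_LEXICON, max_words=3):
    """Return lexicon entries found in the abstract, in order of appearance."""
    tokens = segment(abstract)
    found = []
    for start in range(len(tokens)):
        # Try every word sequence of up to `max_words` tokens.
        for length in range(1, max_words + 1):
            if start + length > len(tokens):
                break
            candidate = " ".join(tokens[start:start + length])
            if candidate in lexicon and candidate not in found:
                found.append(candidate)
    return found
```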
S130, establishing an entity-book association table according to the identified entities, wherein the entity-book association table comprises a first association weight between the entities and the books.
The first association weight measures the degree of association between an entity and a book: the higher the degree of association between the entity and the book, the larger the value of the first association weight. Typically, the first association weight takes a value in the range [0, 1].
Optionally, only the entities and books whose first association weight is greater than a set threshold (e.g., 0.8) may be used to construct the entity-book association table, so as to ensure that there is a clear association between the entities and the books included in the association table.
In this embodiment, the first association weight between an entity and a book can be determined according to the frequency with which the entity appears in the abstract corpus of the book.
Further, since the abstract corpus of a book reflects the main content of the book to a certain extent, the association weight between an entity and the abstract corpus can simply be used as the first association weight between the entity and the book. Therefore, when the entity recognition model outputs the entities corresponding to the abstract corpus in S120, it can simultaneously output a recognition weight between the abstract corpus and each entity, and this recognition weight can be used as the first association weight between the entity and the book.
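As one possible reading of the frequency-based option, the sketch below turns entity mention counts into weights in [0, 1]. Dividing by the count of the most frequent entity is an assumption made here for illustration; the application does not state how frequency is mapped to a weight.

```python
from collections import Counter


def frequency_weights(mentions):
    """Map each entity to a weight in (0, 1] based on mention frequency.

    `mentions` lists every entity occurrence found in one abstract.
    The most frequent entity gets weight 1.0 (assumed normalization).
    """
    counts = Counter(mentions)
    if not counts:
        return {}
    top = max(counts.values())
    return {entity: count / top for entity, count in counts.items()}
```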
By way of example and not limitation, the data in the entity-book association table may take the following form: entity name "Ming and Qing history", book name "Those Things of the Ming Dynasty", first association weight 0.85; entity name "Zurich", book name "Januse Otsuriensis", first association weight 0.92.
It is understood that, in the entity-book association table, entity names and book names may be expressed as Chinese or English names, or as entity numbers (IDs) or book numbers, as long as each number uniquely determines one entity name or book name; this is not limited herein.
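A minimal sketch of such a table, combined with the optional threshold from S130, is given below. The `Association` type, the function name and the sample rows are hypothetical; only the three fields and the example threshold of 0.8 come from the text.

```python
from typing import NamedTuple


class Association(NamedTuple):
    entity: str    # entity name or entity ID
    book: str      # book name or book ID
    weight: float  # first association weight in [0, 1]


def build_association_table(candidates, threshold=0.8):
    """Keep only (entity, book, weight) triples above the threshold."""
    return [
        Association(entity, book, weight)
        for entity, book, weight in candidates
        if weight > threshold
    ]


# Hypothetical input triples.
table = build_association_table([
    ("Ming and Qing history", "Book A", 0.85),
    ("war", "Book A", 0.40),
])
```

Storing IDs instead of names in the `entity` and `book` fields works the same way, provided each ID maps to exactly one name.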
S140, the book knowledge graph is constructed according to the entity and book association table.
In this embodiment, a book knowledge graph that records the association relationships between books and entities may be constructed based on the entity-book association table corresponding to a plurality of books and a plurality of entities.
Further, when a book needs to be recommended to a user, the book knowledge graph may first be queried according to the user's historical attention books (for example, books the user has browsed or searched for) to determine the entities associated with those books. Once the entities are obtained, the book knowledge graph is searched in reverse to determine other books corresponding to those entities, so that a new, entity-based book recommendation method can be implemented.
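The forward and reverse lookups described above can be sketched as follows. Ranking candidates by the sum of products of the two weights is an assumption made for illustration; the text only states that books sharing an entity with the history are found by reverse lookup.

```python
from collections import defaultdict


def recommend(history_books, association_table, top_n=5):
    """Recommend books that share entities with the user's history.

    `association_table` holds (entity, book, weight) triples.
    """
    book_to_entities = defaultdict(dict)
    entity_to_books = defaultdict(dict)
    for entity, book, weight in association_table:
        book_to_entities[book][entity] = weight
        entity_to_books[entity][book] = weight

    history = set(history_books)
    scores = defaultdict(float)
    for book in history:
        # Forward lookup: entities associated with a history book.
        for entity, w_hist in book_to_entities.get(book, {}).items():
            # Reverse lookup: other books associated with that entity.
            for candidate, w_cand in entity_to_books[entity].items():
                if candidate not in history:
                    scores[candidate] += w_hist * w_cand  # assumed scoring
    return sorted(scores, key=lambda b: (-scores[b], b))[:top_n]
```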
According to the technical scheme of this embodiment of the invention, entity recognition is performed according to the abstract corpus corresponding to each book, an entity-book association table is established from the identified entities, and a book knowledge graph is constructed based on the entity-book association table. This yields a new book knowledge graph that represents the degree of association between entities and books, establishes associations between books through the entities they contain, and redefines the content of the book knowledge graph, so that a new technique for recommending books based on the book knowledge graph can be realized.
On the basis of the foregoing embodiments, after performing entity recognition according to the abstract corpus corresponding to the book, the method may further include:
calculating an inverse document frequency (IDF) index for each identified entity according to the abstract corpora respectively corresponding to the at least two books, and filtering out first generic entities included in the entities according to the calculated IDF index; and/or
counting, according to the books corresponding to the abstract corpora from which each entity was identified, the number of books associated with each entity, and filtering out second generic entities included in the entities according to the counted number of books.
By filtering out generic entities, the book knowledge graph is constructed only from characteristic, distinguishing, non-generic entities, so that the entities in the book knowledge graph are more representative of the books and the book recommendation effect is better.
Example two
FIG. 2 is a flowchart of a method for constructing a book knowledge graph according to the second embodiment of the present invention. This embodiment refines the above embodiments: performing entity recognition according to the abstract corpus corresponding to the book is specified as inputting the abstract corpus corresponding to the book into at least one entity recognition model for entity recognition, so as to obtain the entities matching the entity recognition model.
In addition, after inputting the abstract corpus corresponding to the book into the at least one entity recognition model to obtain the entities matching the entity recognition model, the method further includes: if the identified entities include at least two data forms, converting the entities in the at least two data forms into entities in the same data form according to the mapping relationship between entities in different data forms, and performing deduplication on the entities in the same data form.
S210, acquiring abstract corpora corresponding to at least two books respectively.
S220, inputting the abstract corpus corresponding to the book into at least one entity recognition model for entity recognition to obtain the entities matching the entity recognition model, wherein entity recognition models of different types are obtained by training on different types of training data.
In this embodiment, the types of entity recognition model may include: a conceptual entity recognition model, a point-of-interest entity recognition model and a named entity recognition model.
The conceptual entity recognition model is used for recognizing conceptual entities, a conceptual entity being a topic tag thematically associated with the book. The point-of-interest entity recognition model is used for recognizing point-of-interest entities, a point-of-interest entity being a topic tag associated with a user's points of interest. The named entity recognition model is used for recognizing named entities, a named entity being a proper noun included in the abstract corpus.
In a specific example, a conceptual entity is an entity directly related to the subject or content of the book, such as "say novel" or "accomplishment novel"; a point-of-interest entity is an entity related to user points of interest, determined from the keywords that a plurality of users enter when searching for books, such as "psychology" or "blackness".
In an optional implementation of this embodiment, the different types of entity recognition models may be constructed in advance, each model being trained on entity-labeled training samples selected to match the characteristics of the entities that the model is meant to recognize.
S230, judging whether the identified entity comprises at least two data forms, if so, executing S240; otherwise, S250 is executed.
Typically, the entities output by named entity recognition take the form of codes, whereas the entities of the point-of-interest graph take the form of Tag values, so the data forms of the two are not uniform. For the convenience of subsequent operations, when the identified entities are determined to include both named entities and point-of-interest entities, the data forms of the two need to be unified.
S240, converting the obtained entities in at least two data forms into entities in the same data form according to the mapping relationship between entities in different data forms, and executing S250.
Typically, a mapping relationship between named entities and point-of-interest entities may be established. Assuming that point-of-interest entities and conceptual entities already share the same data form, the named entities among the identified entities may be converted into the data form of the point-of-interest entities based on this mapping relationship, so as to ensure that all the finally obtained entities have the same data form.
S250, performing deduplication on the entities in the same data form.
Once the entities obtained by the respective entity recognition models are determined to have a uniform data form, the duplicate entities among them can be conveniently removed.
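Steps S230 to S250 can be sketched as follows. The `(form, value)` representation, the code values and the policy of skipping codes without a known mapping are all assumptions made for illustration.

```python
def unify_and_deduplicate(entities, code_to_tag):
    """Convert coded entities to Tag form, then remove duplicates.

    `entities` is a list of (form, value) pairs, where form is either
    "code" or "tag"; `code_to_tag` is the mapping relationship between
    the two data forms.
    """
    seen = set()
    unified = []
    for form, value in entities:
        tag = code_to_tag.get(value) if form == "code" else value
        if tag is None:
            continue  # no mapping known for this code (assumed policy)
        if tag not in seen:
            seen.add(tag)
            unified.append(tag)
    return unified
```

For instance, with a hypothetical mapping `{"E1001": "zhao_yun"}`, the coded entity `E1001` and the Tag `zhao_yun` collapse into a single entity.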
S260, acquiring the recognition weight that corresponds to each identified entity and is output after the abstract corpus is input into the entity recognition model, and taking the recognition weight as the first association weight between the entity and the book corresponding to the abstract corpus.
In this embodiment, the entity recognition model may be constructed so that, when an abstract corpus is input, it outputs not only the recognized entities but also a recognition weight between each entity and the input abstract corpus. The recognition weight reflects the degree of association between the output entity and the input abstract corpus, and may be represented by a value in the range [0, 1].
S270, adjusting the first association weight according to the weight adjustment coefficient corresponding to the entity recognition model that identified the entity, wherein different types of entity recognition models have different weight adjustment coefficients.
In this embodiment, further, three different types of entity recognition models, namely a conceptual entity recognition model, a point-of-interest entity recognition model and a named entity recognition model, are configured to recognize three different types of entities. To make the entities better match the user's actual needs, weights or importance levels corresponding to the different entity types may be predefined, so that when books are recommended, entities with higher weights are used preferentially.
Optionally, it may be predefined that the weight of a conceptual entity > the weight of a point-of-interest entity > the weight of a named entity. Based on this setting, it can further be set that the weight adjustment coefficient of the conceptual entity recognition model is greater than that of the point-of-interest entity recognition model, and that the weight adjustment coefficient of the point-of-interest entity recognition model is greater than that of the named entity recognition model.
Correspondingly, adjusting the first association weight according to the weight adjustment coefficient of the entity recognition model that identified the entity achieves the aim of adjusting the weights of different types of entities.
In a specific example, entity A is identified after an abstract corpus is input into the conceptual entity recognition model, and the first association weight corresponding to entity A is 0.9; entity B is identified after the abstract corpus is input into the point-of-interest entity recognition model, and the first association weight corresponding to entity B is 0.9; entity C is identified after the abstract corpus is input into the named entity recognition model, and the first association weight corresponding to entity C is also 0.9.
Although the first association weights of entity A, entity B and entity C obtained from the different entity recognition models are the same, the three entities differ in importance. The weight adjustment coefficient of the conceptual entity recognition model may therefore be set to 1, that of the point-of-interest entity recognition model to 0.95, and that of the named entity recognition model to 0.92. Based on these coefficients, the first association weight of entity A is adjusted to 0.9 × 1 = 0.9, that of entity B to 0.9 × 0.95 = 0.855, and that of entity C to 0.9 × 0.92 = 0.828.
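The adjustment in this example amounts to a single multiplication per entity, as in the sketch below. The coefficient values are those of the example above; the dictionary keys and function name are hypothetical.

```python
# Weight adjustment coefficients from the example above, keyed by the
# type of entity recognition model (key names are illustrative).
ADJUSTMENT_COEFFICIENTS = {
    "conceptual": 1.0,
    "point_of_interest": 0.95,
    "named": 0.92,
}


def adjust_weight(first_association_weight, model_type):
    """Scale a first association weight by its model's coefficient."""
    return first_association_weight * ADJUSTMENT_COEFFICIENTS[model_type]


adjusted = {
    "A": adjust_weight(0.9, "conceptual"),
    "B": adjust_weight(0.9, "point_of_interest"),
    "C": adjust_weight(0.9, "named"),
}
```

After adjustment, the three entities are ordered A, B, C by weight, matching the predefined importance of the entity types.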
S280, establishing the entity-book association table according to the first association weight.
In this embodiment, the books and corresponding entities whose first association weight is greater than a set threshold may be selected to establish the entity-book association table.
S290, constructing the book knowledge graph according to the entity-book association table.
According to the technical scheme of this embodiment, different types of entity recognition models are constructed with actual application scenarios in mind, so that various types of entities meeting users' actual needs can be recognized. By establishing the book knowledge graph on the basis of these entities, the books that users actually want can be accurately identified for recommendation, which further improves the user experience.
Meanwhile, the weights of the entities identified by the different entity recognition models are adjusted, so that entities of high importance can be selected preferentially in the book knowledge graph, and book recommendations based on the book knowledge graph match users' actual needs more often.
Example three
FIG. 3 is a flowchart of a method for constructing a book knowledge graph according to the third embodiment of the present invention. This embodiment refines the above embodiments: the operation of establishing the entity-book association table according to the identified entities is specified in more detail, and an operation of screening the entities obtained by entity recognition on the abstract corpus corresponding to the book is added.
S310, acquiring abstract corpora corresponding to at least two books respectively.
S320, inputting the abstract corpus corresponding to the book into at least one entity recognition model for entity recognition to obtain the entities matching the entity recognition model.
S330, calculating an inverse document frequency (IDF) index for each identified entity according to the abstract corpora respectively corresponding to the at least two books, and filtering out first generic entities included in the entities according to the calculated IDF index.
The IDF index of a specific term (entity) can be obtained by dividing the total number of documents (corpus) by the number of documents (corpus) containing the term, and taking the obtained quotient logarithm.
Therefore, after calculating the IDF index of each entity, the entity with the IDF index smaller than the set threshold (i.e., the generalized entity) can be filtered out, and the obtained entities can be sorted in the order of the IDF indexes from large to small, and the remaining entities are filtered out after the set number of entities are reserved according to the sorting result.
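The IDF-based screening of S330 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the sample entities, and the threshold value 0.1 are assumptions chosen for the example, and the natural logarithm is assumed because the text does not fix the base.

```python
import math

def idf_scores(entities_by_book):
    """Map each entity to log(total books / books whose abstract contains it)."""
    total = len(entities_by_book)
    doc_freq = {}
    for entities in entities_by_book.values():
        for entity in set(entities):
            doc_freq[entity] = doc_freq.get(entity, 0) + 1
    return {entity: math.log(total / n) for entity, n in doc_freq.items()}

def filter_by_idf(scores, min_idf=None, keep_top=None):
    """Drop entities whose IDF is below min_idf, then optionally keep the top N."""
    kept = {e: s for e, s in scores.items() if min_idf is None or s >= min_idf}
    # Sort by IDF from large to small; ties are broken by name for determinism.
    ranked = sorted(kept, key=lambda e: (-kept[e], e))
    return ranked if keep_top is None else ranked[:keep_top]

# Hypothetical recognition results: book ID -> entities found in its abstract.
entities_by_book = {
    "book_a": {"psychology", "Weber", "book"},
    "book_b": {"Ming dynasty", "Wang Yangming", "book"},
    "book_c": {"psychology", "book"},
}
scores = idf_scores(entities_by_book)
kept = filter_by_idf(scores, min_idf=0.1)
```

In this sample the entity "book" occurs in every abstract, so its IDF is 0 and it is removed as a first universal entity, while the rarer entities are retained.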
S340, counting the number of books associated with each entity according to the books corresponding to the abstract corpora in which the entity was identified, and filtering out second universal entities included in the entities according to the counted number of books.
In this embodiment, it may be preset that if the first association weight between the abstract corpus of a book and an entity is greater than a set threshold, for example 0.8, the book is associated with the entity; accordingly, the total number of books associated with each entity can then be counted.
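A minimal sketch of the counting in S340 follows. The 0.8 weight threshold comes from the text above; the rule that an entity associated with more than `max_books` books is treated as a second universal entity, as well as all names and sample data, are assumptions made for illustration, since the text does not state the exact filtering criterion.

```python
def count_associated_books(weights, weight_threshold=0.8):
    """weights: dict (entity, book_id) -> first association weight.
    A book counts as associated with an entity only above the threshold."""
    counts = {}
    for (entity, _book_id), weight in weights.items():
        if weight > weight_threshold:
            counts[entity] = counts.get(entity, 0) + 1
    return counts

def drop_overly_common(counts, max_books):
    """Assumed rule: entities tied to more than max_books books are too generic."""
    return {entity for entity, n in counts.items() if n <= max_books}

# Hypothetical first association weights.
weights = {
    ("novel", "b1"): 0.90, ("novel", "b2"): 0.95, ("novel", "b3"): 0.85,
    ("Weber", "b1"): 0.90, ("Weber", "b2"): 0.50,
}
counts = count_associated_books(weights)
remaining = drop_overly_common(counts, max_books=2)
```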
S350, establishing an entity-book association table according to the identified entities, wherein the entity-book association table comprises a first association weight between each entity and book.
The entities here are specifically those that remain after the first universal entities and the second universal entities are filtered out and that can accurately characterize the books.
S360, constructing the book knowledge graph according to the entity-book association table.
According to the technical scheme of this embodiment, universal entities are filtered out of the entities identified from the abstract corpora, so that the entities in the book knowledge graph reflect the characteristics of different types of books and describe the books more precisely.
Example four
FIG. 4a is a flowchart of a method for constructing a book knowledge graph according to a fourth embodiment of the present invention. This embodiment is refined on the basis of the above embodiments: an entity table, a book table, and an entity-entity association table are further established and added to the book knowledge graph.
S410, obtaining abstract corpora respectively corresponding to at least two books.
S420, establishing a book table corresponding to the at least two books, wherein the book table comprises the correspondence among books, abstract corpora and book attribute information.
The book here specifically refers to the name of the book, or to a book number that uniquely identifies the book. The book attribute information specifically refers to information expressing attributes or characteristics of the book, such as the author, publication time, book category, user ratings or online scores.
S430, performing entity recognition according to the abstract corpus corresponding to each book.
S440, establishing an entity-book association table according to the identified entities, wherein the entity-book association table comprises a first association weight between each entity and book.
S450, establishing an entity table according to the entities identified from the abstract corpora, wherein the entity table comprises the correspondence between entities and entity types.
The entity type is the superordinate concept of an entity and denotes the category shared by a plurality of entities of the same kind. Typically, the entity types may include: literary works, movie works, place names, songs, cartoons, characters, organizations, books, creatures, foods, phenomena, activities, awards, chemicals, celestial objects, and the like.
In an optional implementation of this embodiment, the entity recognition models may be constructed so that, after an abstract corpus is input into an entity recognition model, the model outputs the entity type of each recognized entity in addition to the entity itself; the entity table can then be established from the outputs of the entity recognition models.
In a specific example, the data in the entity table may take the form: Inuyasha, cartoon character; Beijing, place name.
Optionally, S450 may be executed after S440 or before S440, which is not limited in this embodiment.
S460, establishing an entity-entity association table according to the semantic similarity between the identified entities, wherein the entity-entity association table comprises a second association weight between each two entities.
The second association weight is used to characterize the degree of association between two entities.
In an optional implementation of this embodiment, establishing the entity-entity association table according to the semantic similarity between the identified entities may include:
inputting each pair of entities into a semantic similarity recognition model to obtain the second association weight between the two entities, and establishing the entity-entity association table according to the second association weight between each pair of entities;
wherein the semantic similarity recognition model is obtained by splitting the abstract corpora corresponding to the at least two books into sentences, segmenting the sentences into words using the identified entities as a word segmentation dictionary, and training on the word vectors obtained from the segmented words.
In a specific example, the data in the entity-entity association table may take the form: entity one: psychology, entity two: Weber, second association weight: 0.86; entity one: Ming dynasty, entity two: Wang Yangming, second association weight: 0.82.
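The table construction in S460 can be illustrated with a small sketch. It assumes that word vectors for the entities have already been trained (training itself is outside this example) and uses cosine similarity as the second association weight. The two-dimensional vectors, the `min_weight` cut-off, and the function names are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_entity_entity_table(vectors, min_weight=0.0):
    """Return (entity one, entity two, second association weight) rows."""
    table = []
    for a, b in combinations(sorted(vectors), 2):
        weight = cosine(vectors[a], vectors[b])
        if weight >= min_weight:
            table.append((a, b, round(weight, 4)))
    return table

# Made-up entity vectors standing in for trained word vectors.
vectors = {
    "psychology": [1.0, 0.0],
    "Weber": [1.0, 1.0],
    "Ming dynasty": [0.0, 1.0],
}
table = build_entity_entity_table(vectors, min_weight=0.5)
```

Pairs whose similarity falls below `min_weight` are simply not stored, which keeps the table sparse; whether the original method applies such a cut-off is not stated.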
S470, constructing the book knowledge graph according to the entity-book association table, the entity table, the book table and the entity-entity association table.
Specifically, the book knowledge graph may be formed by the entity-book association table, the entity table, the book table, and the entity-entity association table.
The technical scheme of this embodiment further enriches the content of the book knowledge graph, so that an effective recommendation strategy can be adopted to recommend books to users accurately and efficiently based on that content.
FIG. 4b is a schematic structural diagram of a book knowledge graph constructed by the method of the fourth embodiment of the invention. As shown in FIG. 4b, the book knowledge graph comprises: entities, books, entity-entity relationships, entity-book relationships, book attribute information, and book categories. The entities are conceptual entities, point of interest entities, and proper name entities extracted from the abstracts of the books, such as the person "Galileo", the place name "Suzhou", and the time "Ming dynasty". An entity-entity relationship is a similarity weight in semantic space. An entity-book relationship is the relevance weight of the entity to the book.
The book, the attribute information of the book and the classification of the book can be obtained through a pre-constructed book table, the entity-entity relationship can be obtained through an entity-entity association table, and the entity-book relationship can be obtained through an entity-book association table.
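One way to picture how the four tables together make up the graph is the container below. It is a sketch under assumptions: the class, field, and method names, the dictionary layouts, and the sample rows are all hypothetical and are not prescribed by the embodiment.

```python
from dataclasses import dataclass, field

@dataclass
class BookKnowledgeGraph:
    # Entity-book association table: (entity, book_id) -> first association weight.
    entity_book: dict = field(default_factory=dict)
    # Entity table: entity -> entity type.
    entity_types: dict = field(default_factory=dict)
    # Book table: book_id -> name, abstract corpus and attribute information.
    books: dict = field(default_factory=dict)
    # Entity-entity association table: frozenset of two entities -> second weight.
    entity_entity: dict = field(default_factory=dict)

    def entities_of(self, book_id):
        return sorted(e for (e, b) in self.entity_book if b == book_id)

    def books_of(self, entity):
        return sorted(b for (e, b) in self.entity_book if e == entity)

    def related_entities(self, entity):
        related = []
        for pair, weight in self.entity_entity.items():
            if entity in pair:
                (other,) = pair - {entity}
                related.append((other, weight))
        return sorted(related, key=lambda item: -item[1])

# Illustrative data only.
graph = BookKnowledgeGraph()
graph.books["b1"] = {"name": "Example Book One", "abstract": "An abstract about psychology.",
                     "attributes": {"score": 8.6, "category": "social science"}}
graph.books["b2"] = {"name": "Example Book Two", "abstract": "Another abstract.",
                     "attributes": {"score": 7.9, "category": "social science"}}
graph.entity_types.update({"psychology": "professional term", "Weber": "person"})
graph.entity_book.update({("psychology", "b1"): 0.92, ("Weber", "b1"): 0.85,
                          ("psychology", "b2"): 0.88})
graph.entity_entity[frozenset({"psychology", "Weber"})] = 0.86
```

Storing each entity pair as a `frozenset` makes the entity-entity relationship symmetric, which matches a similarity weight; a directed relationship would need ordered keys instead.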
Example five
Fig. 5 is a flowchart of a book recommendation method according to a fifth embodiment of the present invention. This embodiment is applicable to recommending books to a user. The method may be executed by the book recommendation apparatus provided by the embodiments of the present invention; the apparatus may be implemented in software and/or hardware and may generally be integrated into various terminal devices, such as a mobile phone or a tablet computer. As shown in Fig. 5, the method specifically includes the following operations:
S510, obtaining at least one historical attention book of a user, and querying a book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises an entity-book association table, and the entity-book association table comprises a first association weight between each entity and book.
A historical attention book specifically refers to a book that the user has chosen to view or has searched for in the most recent period and that represents the user's points of interest.
S520, searching the entity-book association table for at least one target entity associated with the historical attention book.
In an optional implementation of this embodiment, if the number of target entities found is determined to be greater than a set number threshold (e.g., 20 or 30), a priority order of the historical attention books is determined according to the time at which the user paid attention to each of the at least one historical attention book, and the target entities are screened according to the priority order.
S530, performing a reverse lookup in the entity-book association table according to the target entity, to find at least one book corresponding to the target entity as a book to be recommended.
In this embodiment, since the association between entities and books is recorded in the entity-book association table, after one or more target entities are determined, the books associated with the target entities can be acquired as the books to be recommended.
Optionally, before the book to be recommended is provided to the user, the method further includes: performing deduplication on the books to be recommended.
S540, providing the book to be recommended to the user.
The technical scheme of the embodiment of the invention realizes a new book recommendation mode based on the book knowledge graph established by the association degree between each entity and each book, optimizes the existing book recommendation technology, and meets the increasing personalized and convenient book reading requirements of people.
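The flow S510 to S540 can be sketched roughly as below. Several points are assumptions made for the example rather than statements of the method: the history list is taken to be ordered from most to least recently followed, entities and books are ranked by first association weight, and books already in the history are skipped. All names and data are hypothetical.

```python
def recommend(history, entity_book, max_entities=20):
    """history: book IDs, most recently followed first.
    entity_book: dict (entity, book_id) -> first association weight."""
    # S520: collect target entities of the history books, most recent book first.
    targets = []
    for book_id in history:
        found = sorted((e for (e, b) in entity_book if b == book_id),
                       key=lambda e: (-entity_book[(e, book_id)], e))
        for entity in found:
            if entity not in targets:
                targets.append(entity)
    # Optional screening: keep only the highest-priority target entities.
    targets = targets[:max_entities]
    # S530 plus deduplication: reverse lookup from entities back to books.
    recommended = []
    for entity in targets:
        candidates = sorted((b for (e, b) in entity_book if e == entity),
                            key=lambda b: (-entity_book[(entity, b)], b))
        for book_id in candidates:
            if book_id not in history and book_id not in recommended:
                recommended.append(book_id)
    return recommended

# Hypothetical entity-book association table.
entity_book = {
    ("psychology", "b1"): 0.90, ("Weber", "b1"): 0.85,
    ("psychology", "b2"): 0.88, ("Weber", "b3"): 0.80,
    ("Ming dynasty", "b4"): 0.90,
}
result = recommend(["b1"], entity_book)
```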
Example six
Fig. 6 is a flowchart of a book recommendation method according to a sixth embodiment of the present invention, which is refined on the basis of the above embodiments. In this embodiment, the book recommendation method is further refined based on the entity table, the book table, and the entity-entity association table included in the book knowledge graph. Accordingly, the method of the sixth embodiment specifically includes the following operations:
S610, obtaining at least one historical attention book of the user, and querying the book knowledge graph according to the historical attention book.
The book knowledge graph comprises: an entity-book association table, an entity table, a book table and an entity-entity association table, wherein:
the entity-book association table comprises a first association weight between each entity and book; the entity table comprises the correspondence between entities and entity types; the book table comprises the correspondence among books, abstract corpora and book attribute information; and the entity-entity association table comprises a second association weight between each two entities.
S620, searching the entity-book association table for at least one target entity associated with the historical attention book.
S630, judging whether the number of target entities found is smaller than a first number threshold: if yes, executing S640; otherwise, executing S650.
S640, acquiring, from the entity-entity association table, extended entities corresponding to the target entities, adding the extended entities to the target entities, and executing S670.
S650, judging whether the number of the searched target entities is larger than a second number threshold: if yes, go to S660; otherwise, S670 is performed.
S660, determining a priority order of the historical attention books according to the time at which the user paid attention to each of the at least one historical attention book, screening the target entities according to the priority order, and executing S670.
S670, performing a reverse lookup in the entity-book association table according to the target entity, to find at least one book corresponding to the target entity as a book to be recommended.
S680, screening the books to be recommended according to the book attribute information of the books to be recommended in the book table.
Typically, the book attribute information specifically refers to the score of a book. For example, books with a score lower than 8 points may be screened out of all the obtained books to be recommended, so as to ensure that the books finally recommended are of high quality and have good user feedback.
S690, performing deduplication on the books to be recommended.
S6100, querying the entity table for the entity type corresponding to the at least one target entity, and constructing a recommendation reason item corresponding to each book to be recommended according to the target entity and the entity type corresponding to the target entity.
For example, a recommendation reason item of the following form may be constructed: "Recommended according to place name - Linjiang in the knowledge graph", where "Linjiang" is the target entity and "place name" is the entity type corresponding to the target entity.
Constructing recommendation reason items makes the books recommended to the user interpretable, a capability that none of the prior-art book recommendation technologies has. Existing book recommendation technologies can only recommend associated books to the user and cannot describe the reason for the recommendation qualitatively. With the entity-based recommendation method of the embodiment of the invention, the reason for recommending a book can be given to the user qualitatively through the combination of entity type and entity name, which greatly helps to meet the user's book reading needs.
S6110, providing the book to be recommended and the recommendation reason item corresponding to the book to be recommended to the user.
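The branching in S630 to S660, the score screening in S680, and the reason item of S6100 can be sketched as follows. The threshold values passed in, the one-hop expansion ordered by second association weight, the table layouts, and every name here are assumptions for illustration only.

```python
def adjust_targets(targets_by_book, entity_entity, low, high):
    """targets_by_book: list of (book_id, [entities]), most recent book first.
    entity_entity: dict (entity_a, entity_b) -> second association weight."""
    targets = []
    for _book_id, entities in targets_by_book:
        for entity in entities:
            if entity not in targets:
                targets.append(entity)
    if len(targets) < low:
        # S640: add extended entities, strongest association first (one hop only).
        extended = []
        for (a, b), _w in sorted(entity_entity.items(), key=lambda kv: -kv[1]):
            for src, other in ((a, b), (b, a)):
                if src in targets and other not in targets and other not in extended:
                    extended.append(other)
        targets = targets + extended
    elif len(targets) > high:
        # S660: entities of higher-priority (more recent) books come first.
        targets = targets[:high]
    return targets

def filter_by_score(book_ids, book_table, min_score=8.0):
    """S680: screen out books whose score is lower than min_score."""
    return [b for b in book_ids if book_table[b]["score"] >= min_score]

def reason_item(entity, entity_table):
    """S6100: combine the entity type and entity name into a readable reason."""
    return f"Recommended according to {entity_table[entity]} - {entity} in the knowledge graph"

# Hypothetical tables.
entity_entity = {("psychology", "Weber"): 0.86, ("Ming dynasty", "Wang Yangming"): 0.82}
expanded = adjust_targets([("b1", ["psychology"])], entity_entity, low=2, high=3)
trimmed = adjust_targets([("b1", ["a", "b"]), ("b2", ["c", "d"])], {}, low=1, high=3)
kept = filter_by_score(["b2", "b3"], {"b2": {"score": 8.6}, "b3": {"score": 7.2}})
reason = reason_item("Weber", {"Weber": "person"})
```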
Fig. 6b is a schematic diagram of a display interface of recommended books in the sixth embodiment of the invention, wherein the books in the boxes in Fig. 6b are the books recommended using the book knowledge graph.
The technical scheme of the embodiment of the invention realizes a new book recommendation mode based on the book knowledge graph established by the association degree between each entity and each book, optimizes the existing book recommendation technology, and meets the increasing personalized and convenient book reading requirements of people.
FIG. 6c is a schematic diagram of an application scenario to which the methods of the embodiments of the present invention are applicable. The application scenario combines the book knowledge graph construction process with the book recommendation process. As shown in FIG. 6c, the application scenario is divided into four parts: 1. multi-source heterogeneous data processing; 2. entity extraction; 3. entity association calculation; and 4. online computing.
1. Multi-source heterogeneous data processing
This processing is divided into two processes. One is the book name mapping process, which maps the names of books whose abstract corpora come from multiple sources onto one another; typically, book name mapping can be implemented by calculating the Jaccard similarity between book names. The other is the abstract cleaning process, which filters and cleans the book abstracts, removes special symbols, resolves encoding format problems, and so on.
Through the multi-source heterogeneous data processing, a book table corresponding to multiple books can be established. The data items specifically included in the book table are: book name, book ID, abstract corpus, book score, and book category.
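Book name mapping by Jaccard similarity might look like the sketch below. The choice to compare sets of characters after lower-casing and removing non-alphanumeric symbols, the 0.8 threshold, and the function names are assumptions; the text only states that Jaccard similarity between book names is calculated.

```python
def normalize(name):
    """Lower-case the name and drop symbols, keeping letters and digits only."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

def jaccard(a, b):
    """Jaccard similarity of the character sets of two normalized names."""
    set_a, set_b = set(normalize(a)), set(normalize(b))
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

def map_book_name(name, canonical_names, threshold=0.8):
    """Return the best-matching canonical name, or None if nothing is close enough."""
    best = max(canonical_names, key=lambda c: jaccard(name, c), default=None)
    if best is not None and jaccard(name, best) >= threshold:
        return best
    return None

canonical = ["War and Peace", "Walden"]
mapped = map_book_name("war and peace!", canonical)
unmapped = map_book_name("Moby Dick", canonical)
```

Character sets are a natural unit for Chinese book names; for space-delimited languages, word sets or character n-grams may discriminate better.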
2. Entity extraction
One question to consider is which entities are suitable for book recommendation. From the application perspective, two scenarios that the book knowledge graph is expected to support in recommendation are considered:
(1) Implementing a finer-grained content tag recommendation system. For example, if a user is interested in psychology, other psychology books can be recommended based on the person-type entity Weber (Weber, psychologist).
(2) Cross-domain recommendation, which addresses users' need to discover more books. For example, based on the conceptual entity [Wei, Jin, and Northern and Southern Dynasties] in a history-related book that the user has browsed, books containing the entity "Wei, Jin, and Northern and Southern Dynasties" are recommended, exploring more of the user's points of interest.
Based on these two application scenarios, three kinds of entities can be extracted as knowledge entities suited to book application scenarios: conceptual entities, point of interest entities, and proper name entities. A conceptual entity is an abstract topic tag that is thematically associated with the book title. A point of interest entity is a topic tag that interests the user, with finer granularity than a conceptual entity. A proper name entity is a proper noun, for example a character, place name, literary work, or professional term. Accordingly, the corresponding entities can be extracted by calling a set service interface program (the interface programs are the different types of entity recognition models obtained from different training samples), and the entity type corresponding to each entity can be obtained from the result returned by the service program.
An entity type mapping table is established because the type of a proper name entity is represented as a code, whereas the type of an entity from the point of interest graph is a Tag value; a mapping table relating the two is used to unify the entity types.
Through the entity extraction process, the entity table can be correspondingly established, and the data items specifically included in the entity table are as follows: entity name, entity ID, and entity type.
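The role of the entity type mapping table can be shown with a small sketch. The code values and Tag values below are entirely hypothetical, since the actual identifiers are not listed; only the idea of mapping two type vocabularies onto one unified entity type is taken from the text.

```python
# Hypothetical raw type identifiers: ("code", ...) for proper name entities,
# ("tag", ...) for point of interest entities.
TYPE_MAPPING = {
    ("code", "1001"): "person",
    ("code", "1002"): "place name",
    ("tag", "history_figure"): "person",
    ("tag", "city"): "place name",
}

def unify_entity_type(source, raw_type):
    """Translate a model-specific type identifier into the unified entity type."""
    return TYPE_MAPPING.get((source, raw_type), "unknown")

def build_entity_table(recognized):
    """recognized: list of (entity name, source, raw type) tuples.
    Returns entity ID -> {"name": ..., "type": ...}."""
    table = {}
    for entity_id, (name, source, raw_type) in enumerate(recognized, start=1):
        table[entity_id] = {"name": name, "type": unify_entity_type(source, raw_type)}
    return table

entity_table = build_entity_table([
    ("Weber", "code", "1001"),
    ("Suzhou", "tag", "city"),
])
```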
3. Entity association computation
In the entity association calculation, the relationship between entities and books may be calculated first. The weights of the entities identified by the three types of entity recognition models are fused, where the weights may follow the relationship: conceptual entity recognition > point of interest graph recognition > proper name recognition. Then the number of books containing each entity is counted, the entities are sorted in descending order of weight based on the number of books containing them, and some of the entities are filtered out.
Through the relation calculation process of the entity and the book, an entity and book association table can be correspondingly established, and the data items specifically included in the entity and book association table are as follows: an entity ID, a book ID, and an association weight of the two.
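The weight fusion step can be sketched as below. The coefficient values are assumptions that merely respect the stated ordering (conceptual > point of interest > proper name), and taking the maximum when several models recognize the same entity in the same book is likewise an assumed rule, since the text does not specify how such cases are merged.

```python
# Assumed weight adjustment coefficients per entity recognition model type.
ADJUSTMENT = {"concept": 1.0, "point_of_interest": 0.8, "proper_name": 0.6}

def fuse_weights(recognitions):
    """recognitions: iterable of (entity, book_id, model_type, recognition_weight).
    Returns the entity-book association table: (entity, book_id) -> weight."""
    table = {}
    for entity, book_id, model_type, weight in recognitions:
        adjusted = weight * ADJUSTMENT[model_type]
        key = (entity, book_id)
        # Assumed merge rule: keep the largest adjusted weight per pair.
        table[key] = max(table.get(key, 0.0), adjusted)
    return table

fused = fuse_weights([
    ("psychology", "b1", "concept", 0.90),
    ("psychology", "b1", "proper_name", 0.95),
    ("Weber", "b1", "proper_name", 0.90),
])
```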
Next, the relationships between entities are calculated. Specifically: first, the book abstracts are split into sentences, and the entities are supplied as the word segmentation dictionary to obtain the segmented words of each sentence; then, word vectors of all segmented words are computed to produce a word2vec word vector model (a type of unsupervised learning model); finally, based on the word2vec model, the cosine similarity between entity words is calculated as the weight of the relationship between entities.
Through the above calculation of entity-entity relationships, the entity-entity association table can be established. The data items specifically included in the entity-entity association table are: entity 1 ID, entity 2 ID, and the association weight between the two entities.
4. On-line computing
In a specific example, the three books most recently purchased by the user are taken as input, and a set of entities is read from the entity-book association table (if the three books have more than 20 entities in total, the entities of the first book are taken preferentially; if they have fewer than 5, new entities are read from the entity-entity association table as a supplement). Based on the set of entities read, the entity-book association table is searched in reverse to obtain a set of books, and the books are sorted by score from high to low. The entity type + entity name is output as the recommendation reason. As a deduplication strategy, books with the same book name in the output book set are removed.
Example seven
Fig. 7 is a schematic structural diagram of a book knowledge graph construction apparatus according to a seventh embodiment of the present invention. This embodiment is applicable to constructing a book knowledge graph used for recommending books to users. The apparatus may be implemented in software and/or hardware and may be integrated into any device that provides a book knowledge graph construction function. As shown in Fig. 7, the book knowledge graph construction apparatus specifically includes an abstract corpus acquiring module 710, an entity identification module 720, an entity-book association table establishing module 730, and a book knowledge graph building module 740, where:
an abstract corpus acquiring module 710, configured to acquire abstract corpora respectively corresponding to at least two books;
an entity identification module 720, configured to perform entity identification according to the abstract corpus corresponding to the book;
an entity-book association table establishing module 730, configured to establish an entity-book association table according to the identified entities, where the entity-book association table includes a first association weight between each entity and book;
and a book knowledge graph building module 740, configured to build the book knowledge graph according to the entity and book association table.
According to the technical scheme of this embodiment, entity identification is performed according to the abstract corpus corresponding to each book, an entity-book association table is established according to the identified entities, and the book knowledge graph is constructed based on the entity-book association table. This yields a new kind of book knowledge graph that represents the degree of association between entities and books and establishes associations between books according to the entities they contain, redefining the content of a book knowledge graph and enabling a new technique for book recommendation based on it.
On the basis of the above embodiments, the entity identification module 720 includes:
a model identification unit, configured to input the abstract corpus corresponding to the book into at least one entity identification model for entity identification, to obtain entities matched with the entity identification model;
wherein, different types of entity recognition models are obtained by training different types of training data.
On the basis of the above embodiments, the types of the entity recognition model include: a conceptual entity identification model, a point of interest entity identification model and a proper name entity identification model;
wherein the conceptual entity identification model is used for identifying a conceptual entity, and the conceptual entity is a topic tag which is associated with the book in a theme way;
the point of interest entity identification model is used for identifying a point of interest entity, and the point of interest entity is a topic tag associated with a user point of interest;
the proper name entity identification model is used for identifying a proper name entity, and the proper name entity is a proper noun included in the abstract corpus.
On the basis of the above embodiments, the apparatus further includes a data form unification module, configured to:
after the abstract corpus corresponding to the book is input into at least one entity identification model for entity identification to obtain the entities matched with the entity identification model, if the identified entities have at least two data forms, convert the entities in the at least two data forms into entities in the same data form according to the mapping relationship between entities in different data forms;
wherein the entities identified by the point of interest entity identification model and the entities identified by the proper name entity identification model have different data forms.
On the basis of the above embodiments, the apparatus further includes an entity screening module, configured to, after entity identification is performed according to the abstract corpus corresponding to the book: calculate the inverse document frequency index of each identified entity according to the abstract corpora respectively corresponding to the at least two books, and filter out first universal entities included in the entities according to the calculated inverse document frequency index; and/or
count the number of books associated with each entity according to the books corresponding to the abstract corpora in which the entity was identified, and filter out second universal entities included in the entities according to the counted number of books.
On the basis of the foregoing embodiments, the entity-book association table establishing module 730 specifically includes:
a first association weight determining unit, configured to obtain the identification weight, corresponding to the identified entity, that the entity identification model outputs after the abstract corpus is input into it, and to use the identification weight as the first association weight between the entity and the book corresponding to the abstract corpus;
and an association table establishing unit, configured to establish the entity-book association table according to the first association weight.
On the basis of the above embodiments, the apparatus further includes a first association weight adjusting unit, configured to adjust the first association weight according to the weight adjustment coefficient corresponding to the entity identification model that identified the entity, before the entity-book association table is established according to the first association weight;
wherein different types of entity recognition models have different weight adjustment coefficients.
On the basis of the above embodiments, the apparatus further includes: an entity table establishing module, configured to establish an entity table according to the entities identified from the abstract corpora after entity identification is performed according to the abstract corpus corresponding to the book, where the entity table includes the correspondence between entities and entity types;
correspondingly, the book knowledge graph building module 740 is further configured to build the book knowledge graph according to the entity-book association table and the entity table.
On the basis of the above embodiments, the apparatus further includes: a book table establishing module, configured to establish a book table corresponding to the at least two books after the abstract corpora respectively corresponding to the at least two books are acquired, where the book table includes the correspondence among books, abstract corpora and book attribute information;
correspondingly, the book knowledge graph building module 740 is further configured to build the book knowledge graph according to the entity-book association table, the entity table, and the book table.
On the basis of the above embodiments, the apparatus further includes: an entity-entity association table establishing module, configured to establish an entity-entity association table according to the semantic similarity between the identified entities after entity identification is performed according to the abstract corpus corresponding to the book, where the entity-entity association table includes a second association weight between each two entities;
correspondingly, the book knowledge graph building module 740 is further configured to build the book knowledge graph according to the entity-book association table, the entity table, the book table, and the entity-entity association table.
On the basis of the foregoing embodiments, the entity-entity association table establishing module is specifically configured to: input each pair of entities into a semantic similarity recognition model to obtain the second association weight between the two entities;
and establish the entity-entity association table according to the second association weight between each pair of entities;
wherein the semantic similarity recognition model is obtained by splitting the abstract corpora corresponding to the at least two books into sentences, segmenting the sentences into words using the identified entities as a word segmentation dictionary, and training on the word vectors obtained from the segmented words.
The apparatus can execute the method provided by any embodiment of the invention, and has the functional modules and beneficial effects corresponding to the executed method.
Example eight
Fig. 8 is a schematic structural diagram of a book recommendation apparatus according to an eighth embodiment of the present invention. This embodiment is applicable to recommending books to a user. The apparatus may be implemented in software and/or hardware and may be integrated into any device that provides a book recommendation function. As shown in Fig. 8, the book recommendation apparatus specifically includes a book knowledge graph query module 810, a target entity searching module 820, a book to be recommended searching module 830, and a book to be recommended providing module 840, where:
the book knowledge graph query module 810 is configured to acquire at least one historical attention book of a user and query the book knowledge graph according to the historical attention book, where the book knowledge graph includes an entity-book association table, and the entity-book association table includes a first association weight between each entity and book;
a target entity searching module 820, configured to search the entity-book association table for at least one target entity associated with the historical attention book;
a book to be recommended searching module 830, configured to perform a reverse lookup in the entity-book association table according to the target entity, to find at least one book corresponding to the target entity as a book to be recommended;
a book to be recommended providing module 840, configured to provide the book to be recommended to the user.
The technical scheme of the embodiment of the invention realizes a new book recommendation mode based on the book knowledge graph established by the association degree between each entity and each book, optimizes the existing book recommendation technology, and meets the increasing personalized and convenient book reading requirements of people.
On the basis of the above embodiments, the book knowledge graph further comprises an entity table, the entity table comprising the correspondence between entities and entity types;
the device further comprises a recommendation reason item construction module, configured to, after at least one book corresponding to the target entity has been reverse-looked up in the entity-book association table as a book to be recommended according to the target entity, query the entity table for the entity type corresponding to the at least one target entity, and construct a recommendation reason item corresponding to the book to be recommended according to the target entity and the entity type corresponding to the target entity;
the book to be recommended providing module 840 is further configured to provide the user with the book to be recommended and the recommendation reason item corresponding to the book to be recommended.
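As a rough illustration of how a recommendation reason item might be assembled from a target entity and its entity type, consider the sketch below. The entity names, the type labels, and the template wording are hypothetical; the text does not specify how the reason item is phrased.

```python
# Hypothetical entity table: entity -> entity type.
ENTITY_TYPES = {"Jules Verne": "proper name", "space travel": "concept"}

# Hypothetical wording per entity type; the source does not prescribe any phrasing.
REASON_TEMPLATES = {
    "proper name": "Related to {entity}",
    "concept": "Also about {entity}",
}

def recommendation_reason(entity, entity_types, templates):
    """Build a reason string from the target entity and its entity type."""
    entity_type = entity_types.get(entity)
    # Fall back to a neutral template when the entity type is unknown.
    template = templates.get(entity_type, "Related to {entity}")
    return template.format(entity=entity)
```

For example, `recommendation_reason("space travel", ENTITY_TYPES, REASON_TEMPLATES)` yields the string "Also about space travel".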
On the basis of the above embodiments, the book knowledge graph further comprises a book table, the book table comprising the correspondence among books, abstract corpora, and book attribute information;
the device further comprises a book screening module, configured to, after at least one book corresponding to the target entity has been reverse-looked up in the entity-book association table as a book to be recommended according to the target entity, screen the books to be recommended according to the book attribute information of the books to be recommended in the book table.
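A screening step of this kind could look like the sketch below. The attribute names (`available`, `rating`) and the rating threshold are assumptions chosen for illustration; the text does not say which book attributes are used or how.

```python
# Hypothetical book table: book -> book attribute information.
BOOK_TABLE = {
    "Book B": {"available": True, "rating": 4.5},
    "Book C": {"available": False, "rating": 4.8},
    "Book D": {"available": True, "rating": 2.1},
}

def screen_books(candidates, book_table, min_rating=3.0):
    """Keep candidates that are in the book table, available, and rated highly enough."""
    kept = []
    for book in candidates:
        attrs = book_table.get(book)
        if attrs and attrs["available"] and attrs["rating"] >= min_rating:
            kept.append(book)
    return kept
```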
On the basis of the above embodiments, the book knowledge graph further comprises an entity-entity association table, the entity-entity association table comprising a second association weight between every two entities;
the device further comprises a first target entity screening module, configured to, after at least one target entity associated with the historical attention book has been found in the entity-book association table, if the number of target entities found is determined to be less than a first number threshold, obtain an extended entity corresponding to the target entity from the entity-entity association table and add the extended entity to the target entities.
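The extension step can be sketched as follows, assuming the entity-entity association table is stored as unordered pairs with a weight and that the `top_k` most strongly associated neighbours of each target entity are added. Both the storage layout and the `top_k` parameter are assumptions made for this example.

```python
# Hypothetical entity-entity association table: (entity, entity) -> second association weight.
ENTITY_ENTITY = {
    ("space travel", "astronauts"): 0.8,
    ("space travel", "rockets"): 0.6,
    ("robots", "artificial intelligence"): 0.9,
}

def extend_targets(targets, entity_entity, min_count, top_k=1):
    """Add strongly associated entities when fewer than min_count targets were found."""
    targets = list(targets)
    if len(targets) >= min_count:
        return targets
    extended = list(targets)
    for entity in targets:
        neighbours = []
        for (a, b), weight in entity_entity.items():
            if a == entity:
                neighbours.append((weight, b))
            elif b == entity:
                neighbours.append((weight, a))
        # Strongest association first; ties broken alphabetically.
        neighbours.sort(key=lambda pair: (-pair[0], pair[1]))
        for _, other in neighbours[:top_k]:
            if other not in extended:
                extended.append(other)
    return extended
```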
On the basis of the above embodiments, the apparatus further includes a second target entity screening module, configured to, after at least one target entity associated with the historical attention book has been found in the entity-book association table, if the number of target entities found is determined to be greater than a second number threshold, determine a priority order of the historical attention books according to the time at which the user paid attention to the at least one historical attention book, and screen the target entities according to the priority order; and/or
a deduplication module, configured to deduplicate the books to be recommended before the books to be recommended are provided to the user.
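The two optional modules can be illustrated together. The sketch assumes that more recently attended books have higher priority and that the screen simply keeps the first `max_count` entities in that order; the text only says the priority order is derived from attention time, so both choices are assumptions.

```python
def screen_targets(history, entities_by_book, max_count):
    """history: list of (book, attention_timestamp); the most recent book is visited first."""
    ordered = sorted(history, key=lambda item: item[1], reverse=True)
    kept = []
    for book, _ in ordered:
        for entity in entities_by_book.get(book, []):
            if entity not in kept:
                kept.append(entity)
    # Truncating is a no-op when the number of entities does not exceed max_count.
    return kept[:max_count]

def deduplicate(books):
    """Remove repeated books while preserving their original order."""
    return list(dict.fromkeys(books))
```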
The above apparatus can execute the method provided by any embodiment of the invention, and has the functional modules and beneficial effects corresponding to the executed method.
Example nine
FIG. 9 is a schematic structural diagram of a computer device in the ninth embodiment of the present invention. FIG. 9 shows a block diagram of an exemplary computer device 12 suitable for implementing embodiments of the present invention. The computer device 12 shown in FIG. 9 is only one example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
As shown in FIG. 9, computer device 12 takes the form of a general-purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples the various system components, including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
System memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 9, and commonly referred to as a "hard disk drive"). Although not shown in FIG. 9, a magnetic disk drive may be provided for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive may be provided for reading from and writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media). In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to perform the functions of embodiments of the present invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
Computer device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any device (e.g., a network card, a modem, etc.) that enables computer device 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. In addition, in the computer device 12 of this embodiment, display 24 does not exist as a separate entity but is embedded in a mirror surface; when the display surface of display 24 is not displaying, the display surface of display 24 and the mirror surface visually blend into one. Computer device 12 may also communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) through network adapter 20. As shown in the figure, network adapter 20 communicates with the other modules of computer device 12 through bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with computer device 12, including, for example, disk drive arrays and data backup storage systems.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, implementing the book knowledge graph construction method provided by the embodiments of the present invention:
obtaining abstract corpora respectively corresponding to at least two books; performing entity recognition according to the abstract corpus corresponding to each book; establishing an entity-book association table according to the identified entities, wherein the entity-book association table comprises a first association weight between each entity and each book; and constructing the book knowledge graph according to the entity-book association table.
As another example, the book recommendation method provided by the embodiments of the present invention is implemented: acquiring at least one historical attention book of a user, and querying a book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises an entity-book association table, and the entity-book association table comprises a first association weight between each entity and each book; searching the entity-book association table for at least one target entity associated with the historical attention book; performing, according to the target entity, a reverse lookup in the entity-book association table for at least one book corresponding to the target entity, as a book to be recommended; and providing the book to be recommended to the user.
Example ten
An embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the book knowledge graph construction method provided by the embodiments of the present application:
obtaining a seed query formula, and obtaining an alternative query formula corresponding to the seed query formula by using a random walk technique; filling, according to the alternative query formula, at least one field synonym under the field categories in a to-be-filled field synonym table matched with the seed query formula, to obtain the field synonym table; obtaining a query formula generation template corresponding to the alternative query formula according to the alternative query formula and the field synonym table; and generating an expanded query formula corresponding to the alternative query formula according to the field synonym table and the query formula generation template.
Alternatively, the program, when executed by a processor, implements the book recommendation method provided by the embodiments of the invention: acquiring at least one historical attention book of a user, and querying a book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises an entity-book association table, and the entity-book association table comprises a first association weight between each entity and each book; searching the entity-book association table for at least one target entity associated with the historical attention book; performing, according to the target entity, a reverse lookup in the entity-book association table for at least one book corresponding to the target entity, as a book to be recommended; and providing the book to be recommended to the user.
More specific examples (a non-exhaustive list) of the computer readable storage medium include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages.
It is to be noted that the foregoing describes only the preferred embodiments of the present invention and the technical principles employed. Those skilled in the art will understand that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements, and substitutions can be made without departing from the scope of protection of the invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to those embodiments and may include other equivalent embodiments without departing from the spirit of the invention; the scope of the present invention is determined by the scope of the appended claims.

Claims (20)

1. A book knowledge graph construction method, characterized by comprising:
acquiring abstract corpora respectively corresponding to at least two books;
performing entity recognition according to the abstract corpus corresponding to each book;
establishing an entity-book association table according to the identified entities, wherein the entity-book association table comprises a first association weight between each entity and each book;
and constructing the book knowledge graph according to the entity-book association table.
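The construction steps of claim 1 can be sketched end to end as follows. The recognizer here is a simple lexicon match standing in for the trained entity recognition models described in the later claims, and the lexicon weights stand in for the recognition weights those models would output; both are assumptions made only to keep the example self-contained.

```python
def recognize_entities(abstract, lexicon):
    """Stand-in recognizer: return {entity: weight} for lexicon terms found in the abstract."""
    text = abstract.lower()
    return {term: weight for term, weight in lexicon.items() if term in text}

def build_entity_book_table(abstracts, lexicon):
    """Map (entity, book) to a first association weight for every recognized entity."""
    table = {}
    for book, abstract in abstracts.items():
        for entity, weight in recognize_entities(abstract, lexicon).items():
            table[(entity, book)] = weight
    return table

# Hypothetical abstract corpora and lexicon.
ABSTRACTS = {
    "Book A": "A crew attempts space travel with help from robots.",
    "Book B": "A history of space travel.",
}
LEXICON = {"space travel": 0.9, "robots": 0.6}

# The knowledge graph is represented here as a plain dict of tables.
graph = {"entity_book": build_entity_book_table(ABSTRACTS, LEXICON)}
```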
2. The method of claim 1, wherein performing entity recognition according to the abstract corpus corresponding to the book comprises:
inputting the abstract corpus corresponding to the book into at least one entity recognition model for entity recognition, to obtain entities matched with the entity recognition model;
wherein entity recognition models of different types are obtained by training on training data of different types.
3. The method of claim 2, wherein the types of entity recognition model comprise: a concept entity recognition model, a point-of-interest entity recognition model, and a proper-name entity recognition model;
wherein the concept entity recognition model is used for identifying concept entities, a concept entity being a topic tag thematically associated with the book;
the point-of-interest entity recognition model is used for identifying point-of-interest entities, a point-of-interest entity being a topic tag associated with a user point of interest;
the proper-name entity recognition model is used for identifying proper-name entities, a proper-name entity being a proper noun included in the abstract corpus.
4. The method of claim 3, wherein after inputting the abstract corpus corresponding to the book into at least one entity recognition model for entity recognition and obtaining entities matched with the entity recognition model, the method further comprises:
if the identified entities comprise at least two data forms, converting the entities in the at least two data forms into entities in the same data form according to the mapping relation between entities in different data forms;
de-duplicating the entities in the same data form;
wherein the entities identified by the point-of-interest entity recognition model and the entities identified by the proper-name entity recognition model have different data forms.
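A minimal sketch of the conversion and de-duplication in claim 4, assuming that one model emits tag identifiers while another emits plain strings and that a mapping between the two forms is available. The identifier format `tag:NNNN` and the mapping contents are hypothetical.

```python
# Hypothetical mapping from one data form (tag identifiers) to another (plain strings).
FORM_MAPPING = {"tag:1024": "space travel", "tag:2048": "robots"}

def unify_and_deduplicate(entities, mapping):
    """Convert every entity to the same data form, then drop repeats in order."""
    unified = [mapping.get(entity, entity) for entity in entities]
    return list(dict.fromkeys(unified))
```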
5. The method of claim 1, wherein after performing entity recognition according to the abstract corpus corresponding to the book, the method further comprises:
calculating an inverse text frequency index corresponding to each identified entity according to the abstract corpora respectively corresponding to the at least two books, and filtering out first generic entities included in the entities according to the calculated inverse text frequency index; and/or
counting, for each entity, the number of books associated with the entity according to the books corresponding to the abstract corpora in which the entity was identified, and filtering out second generic entities included in the entities according to the counted number of books.
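Both filters in claim 5 can be illustrated with the sketch below. It assumes the common smoothed form of the inverse document frequency, log(N / (1 + n)), where N is the number of books and n the number of books whose abstract contains the entity; the claim does not give a formula or thresholds, so the formula and the `min_idf` and `max_books` parameters are assumptions.

```python
import math

def book_count(entity, entities_by_book):
    """Number of books whose abstract corpus yielded the entity."""
    return sum(1 for entities in entities_by_book.values() if entity in entities)

def inverse_document_frequency(entity, entities_by_book):
    """Smoothed IDF: log(N / (1 + n)); low values indicate generic entities."""
    return math.log(len(entities_by_book) / (1 + book_count(entity, entities_by_book)))

def filter_generic(entities_by_book, min_idf=None, max_books=None):
    """Drop entities whose IDF is below min_idf or whose book count exceeds max_books."""
    all_entities = set().union(*entities_by_book.values())
    kept = set()
    for entity in all_entities:
        if min_idf is not None and inverse_document_frequency(entity, entities_by_book) < min_idf:
            continue
        if max_books is not None and book_count(entity, entities_by_book) > max_books:
            continue
        kept.add(entity)
    return kept
```

With four books in which "novel" appears in every abstract, its IDF is log(4/5), which is negative, so any non-negative `min_idf` removes it.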
6. The method of claim 3, wherein establishing an entity-book association table according to the identified entities comprises:
acquiring the recognition weight corresponding to each identified entity, which is output after the abstract corpus is input into the entity recognition model, and taking the recognition weight as the first association weight between the entity and the book corresponding to the abstract corpus;
and establishing the entity-book association table according to the first association weight.
7. The method of claim 6, further comprising, before establishing the entity-book association table according to the first association weight:
adjusting the first association weight according to a weight adjustment coefficient corresponding to the entity recognition model that identified the entity;
wherein entity recognition models of different types have different weight adjustment coefficients.
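Claims 6 and 7 can be sketched together as below. Multiplying the recognition weight by the coefficient, the coefficient values, and keeping the largest weight when several models find the same entity are all assumptions; the claims only state that a per-model-type coefficient adjusts the weight.

```python
# Hypothetical weight adjustment coefficients per entity recognition model type.
MODEL_COEFFICIENTS = {"concept": 1.0, "point_of_interest": 0.8, "proper_name": 1.2}

def build_adjusted_table(recognitions, coefficients):
    """recognitions: iterable of (entity, book, model_type, recognition_weight)."""
    table = {}
    for entity, book, model_type, weight in recognitions:
        # Assumed adjustment rule: scale the recognition weight by the coefficient.
        adjusted = weight * coefficients[model_type]
        key = (entity, book)
        # Assumption: if several models find the same entity, keep the largest weight.
        table[key] = max(table.get(key, 0.0), adjusted)
    return table

RECOGNITIONS = [
    ("space travel", "Book A", "concept", 0.5),
    ("space travel", "Book A", "point_of_interest", 0.5),
    ("Jules Verne", "Book A", "proper_name", 0.5),
]
adjusted_table = build_adjusted_table(RECOGNITIONS, MODEL_COEFFICIENTS)
```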
8. The method of any one of claims 1-7, wherein after performing entity recognition according to the abstract corpus corresponding to the book, the method further comprises:
establishing an entity table according to the entities identified from the abstract corpus, the entity table comprising the correspondence between entities and entity types;
wherein constructing the book knowledge graph according to the entity-book association table comprises:
constructing the book knowledge graph according to the entity-book association table and the entity table.
9. The method of claim 8, further comprising, after acquiring the abstract corpora respectively corresponding to the at least two books:
establishing a book table corresponding to the at least two books, the book table comprising the correspondence among books, abstract corpora, and book attribute information;
wherein constructing the book knowledge graph according to the entity-book association table and the entity table comprises:
constructing the book knowledge graph according to the entity-book association table, the entity table, and the book table.
10. The method of claim 9, further comprising, after performing entity recognition according to the abstract corpus corresponding to the book:
establishing an entity-entity association table according to the semantic similarity between the identified entities, the entity-entity association table comprising a second association weight between every two entities;
wherein constructing the book knowledge graph according to the entity-book association table, the entity table, and the book table comprises:
constructing the book knowledge graph according to the entity-book association table, the entity table, the book table, and the entity-entity association table.
11. The method of claim 10, wherein establishing an entity-entity association table according to the semantic similarity between the identified entities comprises:
inputting every two entities into a semantic similarity recognition model to obtain the second association weight between the two entities;
establishing the entity-entity association table according to the second association weight between every two entities;
wherein the semantic similarity recognition model is obtained by splitting the abstract corpora corresponding to the at least two books into sentences, segmenting the sentences into words using the identified entities as the word segmentation dictionary, and training with the word vectors obtained after segmentation.
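As a rough stand-in for the semantic similarity recognition model, the sketch below computes the second association weight as the cosine similarity between entity word vectors. The vectors are toy values supplied by hand; training them from the segmented abstract corpora, as the claim describes, is outside the scope of this example, and the use of cosine similarity is an assumption.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_entity_entity_table(vectors):
    """Map each unordered entity pair (in sorted order) to its similarity weight."""
    return {
        (a, b): cosine(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }

# Hypothetical two-dimensional word vectors for three entities.
VECTORS = {"robots": [1.0, 0.0], "androids": [1.0, 0.0], "cooking": [0.0, 1.0]}
entity_entity = build_entity_entity_table(VECTORS)
```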
12. A book recommendation method, characterized by comprising:
acquiring at least one historical attention book of a user, and querying a book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises an entity-book association table, and the entity-book association table comprises a first association weight between each entity and each book;
searching the entity-book association table for at least one target entity associated with the historical attention book;
performing, according to the target entity, a reverse lookup in the entity-book association table for at least one book corresponding to the target entity, as a book to be recommended;
and providing the book to be recommended to the user.
13. The method of claim 12, wherein the book knowledge graph further comprises an entity table, the entity table comprising the correspondence between entities and entity types;
after at least one book corresponding to the target entity has been reverse-looked up in the entity-book association table as a book to be recommended according to the target entity, the method further comprises:
querying the entity table for the entity type corresponding to the at least one target entity, and constructing a recommendation reason item corresponding to the book to be recommended according to the target entity and the entity type corresponding to the target entity;
wherein providing the book to be recommended to the user comprises:
providing the user with the book to be recommended and the recommendation reason item corresponding to the book to be recommended.
14. The method of claim 13, wherein the book knowledge graph further comprises a book table, the book table comprising the correspondence among books, abstract corpora, and book attribute information;
after at least one book corresponding to the target entity has been reverse-looked up in the entity-book association table as a book to be recommended according to the target entity, the method further comprises:
screening the books to be recommended according to the book attribute information of the books to be recommended in the book table.
15. The method of claim 14, wherein the book knowledge graph further comprises an entity-entity association table, the entity-entity association table comprising a second association weight between every two entities;
after at least one target entity associated with the historical attention book has been found in the entity-book association table, the method further comprises:
if the number of target entities found is determined to be smaller than a first number threshold, obtaining an extended entity corresponding to the target entity from the entity-entity association table, and adding the extended entity to the target entities.
16. The method of any one of claims 12-15, wherein after at least one target entity associated with the historical attention book has been found in the entity-book association table, the method further comprises:
if the number of target entities found is determined to be larger than a second number threshold, determining a priority order of the historical attention books according to the time at which the user paid attention to the at least one historical attention book, and screening the target entities according to the priority order; and/or
before providing the book to be recommended to the user, the method further comprises: deduplicating the books to be recommended.
17. A book knowledge graph construction device, characterized by comprising:
an abstract corpus acquiring module, configured to acquire abstract corpora respectively corresponding to at least two books;
an entity recognition module, configured to perform entity recognition according to the abstract corpus corresponding to each book;
an entity-book association table establishing module, configured to establish an entity-book association table according to the identified entities, wherein the entity-book association table comprises a first association weight between each entity and each book;
and a book knowledge graph construction module, configured to construct the book knowledge graph according to the entity-book association table.
18. A book recommendation device, characterized by comprising:
a book knowledge graph query module, configured to acquire at least one historical attention book of a user and to query the book knowledge graph according to the historical attention book, wherein the book knowledge graph comprises an entity-book association table, and the entity-book association table comprises a first association weight between each entity and each book;
a target entity searching module, configured to search the entity-book association table for at least one target entity associated with the historical attention book;
a book to be recommended searching module, configured to perform, according to the target entity, a reverse lookup in the entity-book association table for at least one book corresponding to the target entity, as a book to be recommended;
and a book to be recommended providing module, configured to provide the book to be recommended to the user.
19. A computer device, characterized in that the computer device comprises:
one or more processors;
a storage device for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the book knowledge graph construction method of any one of claims 1-11 or the book recommendation method of any one of claims 12-16.
20. A computer storage medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the book knowledge graph construction method of any one of claims 1-11 or the book recommendation method of any one of claims 12-16.
CN201810719673.0A 2018-07-03 Book knowledge graph construction method, book recommendation method, device, equipment and medium Active CN110737774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810719673.0A CN110737774B (en) 2018-07-03 Book knowledge graph construction method, book recommendation method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN110737774A true CN110737774A (en) 2020-01-31
CN110737774B CN110737774B (en) 2024-05-24

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110137919A1 (en) * 2009-12-09 2011-06-09 Electronics And Telecommunications Research Institute Apparatus and method for knowledge graph stabilization
CN101826102A (en) * 2010-03-26 2010-09-08 浙江大学 Automatic book keyword generation method
CN103164405A (en) * 2011-12-08 2013-06-19 盛乐信息技术(上海)有限公司 Generation method for relevant video data bank, recommendation method and recommendation system for relevant videos
CN103914543A (en) * 2014-04-03 2014-07-09 北京百度网讯科技有限公司 Search result displaying method and device
US20170249399A1 (en) * 2014-07-16 2017-08-31 Baidu Online Network Technology (Beijing) Co., Ltd Method And Apparatus For Displaying Recommendation Result
CN104346446A (en) * 2014-10-27 2015-02-11 百度在线网络技术(北京)有限公司 Paper associated information recommendation method and device based on mapping knowledge domain
US20170169498A1 (en) * 2015-12-15 2017-06-15 International Business Machines Corporation Identifying recommended electronic books with detailed comparisons
CN105653706A (en) * 2015-12-31 2016-06-08 北京理工大学 Multilayer quotation recommendation method based on literature content mapping knowledge domain
CN106202184A (en) * 2016-06-27 2016-12-07 华中科技大学 A kind of books personalized recommendation method towards libraries of the universities and system
CN108090074A (en) * 2016-11-22 2018-05-29 上海阿法迪智能标签系统技术有限公司 Book recommendation system and method
CN107122444A (en) * 2017-04-24 2017-09-01 北京科技大学 A kind of legal knowledge collection of illustrative plates method for auto constructing
CN107729444A (en) * 2017-09-30 2018-02-23 桂林电子科技大学 Recommend method in a kind of personalized tourist attractions of knowledge based collection of illustrative plates
CN108009194A (en) * 2017-10-23 2018-05-08 广州星耀悦教育科技有限公司 A kind of books method for pushing, electronic equipment, storage medium and device
CN107943910A (en) * 2017-11-18 2018-04-20 电子科技大学 A kind of Individual book based on combinational algorithm recommends method

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325033A (en) * 2020-03-20 2020-06-23 中国建设银行股份有限公司 Entity identification method, entity identification device, electronic equipment and computer readable storage medium
CN112434168A (en) * 2020-11-09 2021-03-02 广西壮族自治区图书馆 Knowledge graph construction method and fragmentized knowledge generation method based on library
CN112182424A (en) * 2020-11-11 2021-01-05 重庆邮电大学 Social recommendation method based on integration of heterogeneous information and isomorphic information networks
CN112182424B (en) * 2020-11-11 2023-01-31 重庆邮电大学 Social recommendation method based on integration of heterogeneous information and isomorphic information networks
CN112818261A (en) * 2021-01-27 2021-05-18 沈阳美行科技有限公司 Navigation method and device based on POI (Point of interest) knowledge graph and electronic equipment
CN113076428A (en) * 2021-03-19 2021-07-06 北京沃东天骏信息技术有限公司 Method and device for generating book list
CN113688269A (en) * 2021-07-21 2021-11-23 北京三快在线科技有限公司 Image-text matching result determining method and device, electronic equipment and readable storage medium
CN114428864A (en) * 2022-04-01 2022-05-03 杭州未名信科科技有限公司 Knowledge graph construction method and device, electronic equipment and medium

CN112163415A (en) User intention identification method and device for feedback content and electronic equipment
CN111737607A (en) Data processing method, data processing device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant