CN113204578A - Content association method, system, device, electronic equipment and storage medium

Info

Publication number
CN113204578A
Authority
CN
China
Prior art keywords
content
document
processed
target
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110472315.6A
Other languages
Chinese (zh)
Inventor
薛凌霄
李长亮
卢晓栋
郭馨泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Software Co Ltd
Beijing Kingsoft Digital Entertainment Co Ltd
Original Assignee
Beijing Kingsoft Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Software Co Ltd
Priority to CN202110472315.6A
Publication of CN113204578A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/248 Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a content association method, system, apparatus, electronic device, and storage medium. The method comprises: acquiring a title in a target document as content to be processed; determining associated words for the content to be processed in each specified document, wherein a specified document is a document containing the content to be processed, and an associated word is an entity word that belongs to the same content block as the content to be processed and does not appear in the target document; and displaying the determined associated words for the target document. In this way, a document can be effectively associated with related content in other documents.

Description

Content association method, system, device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of document processing technologies, and in particular, to a content association method, system, apparatus, electronic device, and storage medium.
Background
Users typically edit documents through a document processing client, with the document content edited according to the user's actual needs. For example, the document content may be notes taken while studying, in which case one document is one note.
Typically, the contents of some documents of the same user are related. For a document written by the user, knowing which contents of the user's other documents it relates to makes it easier for the user to gather and master the contents of those documents.
Accordingly, how to effectively associate a document with related content in other documents is a problem to be solved urgently.
Disclosure of Invention
An object of the embodiments of the invention is to provide a content association method, system, apparatus, electronic device, and storage medium, so as to effectively associate a document with related content in other documents. The specific technical solutions are as follows:
In a first aspect, an embodiment of the present invention provides a content association method, applied to an electronic device, the method including:
acquiring a title in a target document as content to be processed;
determining associated words for the content to be processed in each specified document, wherein a specified document is a document containing the content to be processed, and an associated word is an entity word that belongs to the same content block as the content to be processed and does not appear in the target document;
and displaying the determined associated words for the target document.
Optionally, after the step of displaying the determined associated words for the target document, the method further includes:
acquiring target content as new content to be processed, and returning to the step of determining associated words for the content to be processed in each specified document;
wherein the target content is a displayed associated word.
Optionally, each content to be processed that is a target content is displayed in association with its corresponding associated words according to a predetermined multi-level display mode.
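As an illustration only, the multi-level display mode can be pictured as an indented outline in which each explored content item is followed by its associated words one level deeper. The Python sketch below is not taken from the embodiment: the function name `render_levels`, the `children_of` mapping, the two-space indent, and the sample words are all assumptions made for this example.

```python
def render_levels(content, children_of, depth=0, path=()):
    """Return indented outline lines for a content item and its associated words.

    children_of is a hypothetical mapping from each explored content item
    to the associated words displayed for it.
    """
    lines = ["  " * depth + content]
    if content in path:
        # Assumption: stop when a word reappears on its own branch,
        # so mutually associated words cannot recurse forever.
        return lines
    for word in children_of.get(content, []):
        lines.extend(render_levels(word, children_of, depth + 1, path + (content,)))
    return lines


# Hypothetical exploration result: the title, then one explored associated word.
explored = {
    "Calculus": ["Newton", "Leibniz"],
    "Newton": ["Cambridge"],
}
outline = render_levels("Calculus", explored)
```

Joining `outline` with newlines gives a four-line outline with "Calculus" at the top level, "Newton" and "Leibniz" beneath it, and "Cambridge" beneath "Newton". Any other multi-level layout (a tree view, nested tabs) would serve equally well.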
Optionally, the associated words for the content to be processed in each specified document are searched for by:
determining each specified document from a plurality of documents;
for each specified document, extracting entity words from the content blocks of that document which contain the content to be processed, and establishing a correspondence between the extracted entity words and the document identifier of that document;
and determining, based on the correspondence, the entity words among the extracted entity words that correspond to the document identifier of the target document, and eliminating the determined entity words to obtain the associated words for the content to be processed.
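The three search steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the embodiment's implementation: the entity extractor is a hypothetical stand-in that matches a fixed vocabulary (a real system would use an entity recognizer), documents are modelled as a mapping from document identifier to a list of content blocks, and excluding the content to be processed from its own results is an added assumption.

```python
from collections import defaultdict

# Hypothetical stand-in for a real entity recognizer: a fixed vocabulary.
ENTITY_VOCAB = {"Newton", "Leibniz", "calculus", "Cambridge", "gravity"}


def extract_entities(block):
    """Return the vocabulary entries that occur in a content block."""
    return {word for word in ENTITY_VOCAB if word in block}


def find_associated_words(documents, target_id, content):
    """documents maps a document identifier to its list of content blocks."""
    # Step 1: the specified documents are those containing the content.
    specified = {
        doc_id: blocks
        for doc_id, blocks in documents.items()
        if any(content in block for block in blocks)
    }
    # Step 2: map each entity word found in a matching block to document ids.
    word_to_docs = defaultdict(set)
    for doc_id, blocks in specified.items():
        for block in blocks:
            if content in block:
                for word in extract_entities(block):
                    word_to_docs[word].add(doc_id)
    # Step 3: eliminate words tied to the target document's identifier.
    # Assumption: the content itself is also dropped from its own results.
    return sorted(
        word
        for word, doc_ids in word_to_docs.items()
        if target_id not in doc_ids and word != content
    )


# Hypothetical sample data; "doc-1" is the target document.
documents = {
    "doc-1": ["calculus", "calculus studies change and gravity problems"],
    "doc-2": ["Newton developed calculus at Cambridge", "Leibniz is unrelated here"],
    "doc-3": ["gravity and calculus by Newton"],
}
associated = find_associated_words(documents, "doc-1", "calculus")
```

With this sample data, "gravity" is eliminated because it is also tied to the target document, and "Leibniz" is never extracted because its block does not contain the content to be processed.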
Optionally, acquiring a title in a target document of a user as content to be processed includes:
when a first content association operation directed at the title of the user's target document is detected, acquiring the title in the target document as the content to be processed.
Optionally, acquiring target content as new content to be processed includes:
when a second content association operation directed at the target content is detected, acquiring the target content as the new content to be processed.
In a second aspect, an embodiment of the present invention provides a content association system, the system including a document processing client and a predetermined server;
the document processing client is configured to acquire a title in a target document as content to be processed, construct a content association request carrying the content to be processed, and send the content association request to the predetermined server;
the predetermined server is configured to, after receiving the content association request, search each specified document for associated words for the content to be processed and feed the found associated words back to the document processing client, wherein a specified document is a document containing the content to be processed, and an associated word is an entity word that belongs to the same content block as the content to be processed and does not appear in the target document;
the document processing client is further configured to receive the associated words fed back by the predetermined server and display them for the target document.
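The exchange between the document processing client and the predetermined server might look like the sketch below. The source does not specify a wire format, so the JSON encoding, the field names `target_document_id` and `content`, the response key `associated_words`, and all function names are assumptions; the server-side search is passed in as a callable so the sketch stays independent of how the search is performed.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ContentAssociationRequest:
    # Hypothetical fields; the source only says the request carries
    # the content to be processed.
    target_document_id: str
    content: str


def build_request(target_document_id, title):
    """Client side: wrap the title as content to be processed."""
    return json.dumps(asdict(ContentAssociationRequest(target_document_id, title)))


def handle_request(raw_request, lookup):
    """Server side: run the search and feed the associated words back."""
    request = ContentAssociationRequest(**json.loads(raw_request))
    words = lookup(request.target_document_id, request.content)
    return json.dumps({"associated_words": words})


def read_response(raw_response):
    """Client side: extract the associated words to be displayed."""
    return json.loads(raw_response)["associated_words"]
```

A client would call `build_request`, transmit the string by whatever transport the deployment uses, and pass the reply to `read_response` before displaying the words for the target document.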
Optionally, the document processing client is further configured to, after displaying the determined associated words for the target document, acquire target content as new content to be processed and return to the step of constructing a content association request carrying the content to be processed;
wherein the target content is a displayed associated word.
Optionally, each content to be processed that is a target content is displayed in association with its corresponding associated words according to a predetermined multi-level display mode.
Optionally, the predetermined server searches each specified document for the associated words for the content to be processed by:
determining each specified document from a plurality of documents;
for each specified document, extracting entity words from the content blocks of that document which contain the content to be processed, and establishing a correspondence between the extracted entity words and the document identifier of that document;
and determining, based on the correspondence, the entity words among the extracted entity words that correspond to the document identifier of the target document, and eliminating the determined entity words to obtain the associated words for the content to be processed.
In a third aspect, an embodiment of the present invention provides a content association apparatus, applied to an electronic device, the apparatus including:
an acquisition module, configured to acquire a title in a target document as content to be processed;
a determining module, configured to determine associated words for the content to be processed in each specified document, wherein a specified document is a document containing the content to be processed, and an associated word is an entity word that belongs to the same content block as the content to be processed and does not appear in the target document;
and a display module, configured to display the determined associated words for the target document.
Optionally, the acquisition module is further configured to, after the determined associated words are displayed for the target document, acquire target content as new content to be processed and trigger the determining module;
wherein the target content is a displayed associated word.
Optionally, each content to be processed that is a target content is displayed in association with its corresponding associated words according to a predetermined multi-level display mode.
Optionally, the associated words for the content to be processed in each specified document are searched for by:
determining each specified document from a plurality of documents;
for each specified document, extracting entity words from the content blocks of that document which contain the content to be processed, and establishing a correspondence between the extracted entity words and the document identifier of that document;
and determining, based on the correspondence, the entity words among the extracted entity words that correspond to the document identifier of the target document, and eliminating the determined entity words to obtain the associated words for the content to be processed.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps provided by the first aspect when executing the program stored in the memory.
In a fifth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method steps provided in the first aspect.
Embodiments of the present invention also provide a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps of any of the above content association methods.
In the solution provided by the embodiments of the invention, the title of the target document is used as the association basis, since the title can well represent the content recorded in the target document. Moreover, since entity words are words with specific meanings and the contents of the same content block are strongly related, once the content to be processed serving as the association basis is determined, the associated words for the content to be processed in each specified document can be determined; these are the contents of other documents that relate to the target document. The determined associated words are then displayed for the target document. In this way, a document can be effectively associated with related content in other documents.
Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other embodiments from these drawings.
FIG. 1 is a flowchart of a content association method according to an embodiment of the present invention;
FIG. 2 is another flowchart of a content association method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a content association method according to an embodiment of the present invention;
FIG. 4(a) is a schematic diagram of an interface in which dots serve as the predetermined identifier;
FIG. 4(b) is a schematic diagram of an exemplary graph menu in a document processing client;
FIGS. 4(c) to 4(f) are schematic diagrams of exemplary tab interfaces presented through the graph menu;
FIG. 4(g) is a schematic diagram of the interaction among the terminal device, the predetermined server, and the intelligent processing terminal;
FIG. 5 is a schematic structural diagram of a content association system according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a content association apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived from the embodiments given herein by one of ordinary skill in the art, are within the scope of the invention.
In order to achieve the purpose of effectively associating a document with related content in other documents, embodiments of the present invention provide a content association method, system, apparatus, electronic device, and storage medium.
The following first describes a content association method provided in an embodiment of the present invention.
The content association method provided by the embodiments of the invention is applied to an electronic device. In a specific application, the electronic device may be a terminal device, for example a smartphone, tablet computer, or notebook computer; the electronic device is not limited to a terminal device, however, and may equally be a server.
Moreover, the functional software implementing the content association method may be a content association apparatus running in the electronic device. If the electronic device is a terminal device, the content association apparatus may be a function module of a document processing client running in the terminal device; if the electronic device is a server, the content association apparatus may be a function module of a predetermined server running on that server, the predetermined server being the server side corresponding to the document processing client on the user side.
A user typically edits document content through a document processing client according to the user's actual needs. Illustratively, the document processing client may be a dedicated client for recording notes or logs edited by a user. It should be emphasized that any document editing client that needs content association may serve as the document processing client in the embodiments of the present invention. In addition, the document processing client may take the form of an APP (Application) or of a web page; the embodiments of the present invention do not limit its specific form. Typically, the contents of some documents of the same user are related. For a document written by a user, knowing which contents of the user's other documents it relates to makes it easier for the user to gather and master knowledge points.
In order to effectively associate a document with related content in other documents, and thereby improve the user experience and the user stickiness of the document processing client, a content association method provided by an embodiment of the present invention may include the following steps:
acquiring a title in a target document as content to be processed;
determining associated words for the content to be processed in each specified document, wherein a specified document is a document containing the content to be processed, and an associated word is an entity word that belongs to the same content block as the content to be processed and does not appear in the target document;
and displaying the determined associated words for the target document.
In the solution provided by the embodiments of the invention, the title of the target document is used as the association basis, since the title can well represent the content recorded in the target document. Moreover, since entity words are words with specific meanings and the contents of the same content block are strongly related, once the content to be processed serving as the association basis is determined, the associated words for the content to be processed in each specified document can be determined; these are the contents of other documents that relate to the target document. The determined associated words are then displayed for the target document. In this way, a document can be effectively associated with related content in other documents.
The content association method provided by the embodiment of the invention is described below with reference to the accompanying drawings.
As shown in FIG. 1, a content association method provided in an embodiment of the present invention may include the following steps:
S101, acquiring a title in a target document as content to be processed;
Since the title of the target document can well represent the content recorded in the target document, when the target document needs to be effectively associated with related content in the user's other documents, the title of the target document can be used as the association basis, and the subsequent association processing is then executed. The target document may be any document, of any user, for which content association is to be performed.
Optionally, in one implementation, acquiring a title in the target document as the content to be processed may include:
when a first content association operation directed at the title of the target document is detected, acquiring the title in the target document as the content to be processed;
wherein the first content association operation is issued by the user in a presentation interface of the target document.
In this implementation, if the electronic device detects a first content association operation directed at the title of the target document and issued by the user to whom the target document belongs, this indicates that the user wishes to associate the content of the target document with other documents. The title in the target document, that is, the title on which the content association operation was performed, may therefore be acquired as the content to be processed. The embodiments of the present invention do not limit how display of the presentation interface of the target document is triggered; the presentation interface of the target document may be a document interface of the target document, a tab interface of the target document, and so on.
For example, the first content association operation may be a selection operation on an exploration identifier corresponding to the title in the target document. The exploration identifier can be displayed in association with the title in the target document, for example on any one of the upper, lower, left, and right sides of the title. The exploration identifier may be displayed continuously from the moment the presentation interface is first shown, or may be displayed only when a specified operation on the title of the target document is detected. For example, if the target document is displayed on a device such as a computer, on which documents are operated with a mouse, the exploration identifier can be displayed in the presentation interface of the target document when the mouse is detected hovering over the title. If the target document is displayed on a device such as a smartphone, on which documents are operated by touch, the exploration identifier may be displayed in the presentation interface of the target document when a long-press operation, a double-click operation, or the like on the title of the target document is detected.
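The display rule for the exploration identifier can be summarized in a small decision function. This is a sketch only: the input-mode labels, the event names, and the function name are hypothetical and would be replaced by whatever events the client's user-interface framework reports.

```python
# Hypothetical event names for the specified operations named in the text.
SPECIFIED_OPERATIONS = {
    "mouse": {"hover"},
    "touch": {"long_press", "double_click"},
}


def exploration_identifier_visible(input_mode, event, always_shown=False):
    """Decide whether the exploration identifier is shown next to the title."""
    if always_shown:
        # Variant in which the identifier is shown for as long as the
        # presentation interface is displayed.
        return True
    return event in SPECIFIED_OPERATIONS.get(input_mode, set())
```

Selecting the identifier once it is visible would then count as the first content association operation.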
Illustratively, the first content association operation may also be a predetermined operation performed on the title itself in the target document. For example, if the target document is displayed on a device such as a smartphone, on which documents are operated by touch, the first content association operation may be a long-press operation, a double-click operation, or the like on the title of the target document.
The specific forms of the first content association operation given above are only examples, and should not be construed as limiting the embodiments of the present invention.
Optionally, in another implementation, acquiring a title in the target document as the content to be processed may include: acquiring the title in the target document when it is detected that the presentation interface of the target document is displayed, or that the user has finished editing the target document. In this implementation, the user need not perform any operation; the electronic device triggers content association for the target document by itself.
It should be emphasized that the specific implementations of acquiring the title in the user's target document described above are merely examples, and should not be construed as limiting the embodiments of the present invention.
S102, determining associated words for the content to be processed in each specified document;
A specified document is a document containing the content to be processed, and an associated word is an entity word that belongs to the same content block as the content to be processed and does not appear in the target document. That is, the entity words in each specified document that belong to the same content block as the content to be processed and do not appear in the target document are determined as the associated words for the content to be processed. It should be noted that each document contains at least one content block. Illustratively, each content block may be a paragraph; alternatively, each content block is a text block of the document content that begins with a predetermined identifier. For example, in the interface shown in FIG. 4(a), a dot serves as the predetermined identifier, each content block begins with a dot, and three content blocks are shown.
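Both ways of dividing a document into content blocks can be sketched in a few lines. The sketch below assumes plain-text input, treats blank lines as paragraph separators, and uses "•" as the predetermined identifier in the usage example; the function name `split_into_blocks` and the handling of text that precedes the first identifier are assumptions of this example, not details given in the source.

```python
def split_into_blocks(text, marker=None):
    """Split document text into content blocks.

    With marker=None, each paragraph (separated by a blank line) is a block.
    Otherwise a new block starts at every line that begins with the marker.
    """
    if marker is None:
        return [part.strip() for part in text.split("\n\n") if part.strip()]
    blocks = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith(marker) or not blocks:
            # A marker line starts a new block; text before the first
            # marker is kept as a block of its own (an assumption).
            blocks.append(stripped)
        else:
            blocks[-1] += "\n" + stripped  # continuation of the current block
    return blocks


note = "• first point\nmore on the first point\n• second point\n• third point"
dot_blocks = split_into_blocks(note, marker="•")
```

Here `dot_blocks` holds three content blocks, mirroring the three dot-led blocks of the FIG. 4(a) example.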
An entity word is a word representing an entity; entity words include nouns and pronouns, and may fall into a number of categories, for example names of people, place names, organization names, and other proper nouns. Since entity words are words with specific meanings and the contents of the same content block are strongly related, once the content to be processed serving as the association basis is determined, the associated words for the content to be processed in each specified document can be determined. There may be one or more entity words that belong to the same content block as the content to be processed and do not appear in the target document, and accordingly one or more associated words for the content to be processed.
In order to better reflect relevance, in an alternative implementation, an associated word may be an entity word of a specified category that belongs to the same content block as the content to be processed and does not appear in the target document. Illustratively, the specified categories may include one or more of names of people, place names, organization names, and other proper nouns. The specified categories may be set according to actual circumstances, which is not limited in the embodiments of the present invention.
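Restricting associated words to specified categories amounts to a filter over categorized entity words. In the sketch below, the category labels ("person", "place", "organization"), the `(word, category)` pair representation, and the sample entities are assumptions for illustration; a real entity recognizer would supply its own labels.

```python
def filter_by_category(entities, specified_categories):
    """Keep entity words whose category is one of the specified categories.

    entities is a list of (word, category) pairs, a hypothetical output
    format for an entity recognizer.
    """
    return [word for word, category in entities if category in specified_categories]


entities = [
    ("Newton", "person"),
    ("Cambridge", "place"),
    ("Royal Society", "organization"),
]
kept = filter_by_category(entities, {"person", "organization"})
```

With these inputs, `kept` contains "Newton" and "Royal Society" and drops the place name.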
In addition, if the electronic device is a terminal device, then considering that the predetermined server holds all of the user's documents, and considering device performance, the electronic device may determine the associated words for the content to be processed in each specified document with the assistance of the predetermined server. Specifically, it sends a content association request carrying the content to be processed to the predetermined server, so that the predetermined server, after receiving the request, searches each specified document for associated words for the content to be processed and feeds them back to the electronic device. Alternatively, if the electronic device is a terminal device with good processing performance, or it stores a plurality of the user's documents locally, or the network performance between it and the predetermined server is good, the electronic device may determine the associated words without the assistance of the predetermined server. Specifically, it searches each specified document itself to obtain the associated words for the content to be processed.
If the electronic device is a server, the electronic device may search each specified document for associated words for the content to be processed and send the associated words obtained to the document processing client, so that the document processing client displays the determined associated words for the target document.
It should be noted that, whether the electronic device is a terminal device or a server, the associated words for the content to be processed in each specified document may be searched for in the same manner. For clarity of presentation, that search manner is described by way of example later in this description.
S103, displaying the determined associated words for the target document.
Optionally, in one implementation, displaying the determined associated words for the target document may include displaying them in the presentation interface of the target document. In this implementation, the determined associated words may be displayed at a predetermined position in the presentation interface of the target document, for example after the document content, or between the title and the document content, without being limited to these positions. In addition, the associated words displayed in the presentation interface of the target document may be set in a predetermined color, in bold, in italics, or the like, so as to highlight them.
Optionally, in another implementation, displaying the determined associated words for the target document may include displaying them in a document newly created by the electronic device for the target document, or in any other document selected by the user. The title of the target document and the determined associated words are displayed in correspondence with each other in the newly created document or the other document, so that the content recorded there shows that the determined associated words are the content associated with the target document.
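The two display options can be sketched as plain-text rendering helpers. Everything concrete here is an assumption made for illustration: the function names, the "Associated words:" label, the position keywords, and the one-line "title: words" layout for a newly created document. Highlighting (color, bold, italics) is left out because it depends on the client's rendering layer.

```python
def present_in_target(title, content, words, position="after_content"):
    """Render the target document with associated words at a chosen position."""
    line = "Associated words: " + ", ".join(words)
    if position == "after_content":
        return "\n".join([title, content, line])
    if position == "between_title_and_content":
        return "\n".join([title, line, content])
    raise ValueError(f"unknown position: {position}")


def present_in_new_document(title, words):
    """Render a separate document pairing the title with its associated words."""
    return f"{title}: " + ", ".join(words)
```

For instance, `present_in_new_document("Calculus", ["Newton", "Cambridge"])` yields a single line that records the title together with its associated words.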
The specific implementations of displaying the determined associated words for the target document described above are only examples, and should not be construed as limiting the embodiments of the present invention.
In the solution provided by the embodiments of the invention, the title of the target document is used as the association basis, since the title can well represent the content recorded in the target document. Moreover, since entity words are words with specific meanings and the contents of the same content block are strongly related, once the content to be processed serving as the association basis is determined, the associated words for the content to be processed in each specified document can be determined; these are the contents of other documents that relate to the target document. The determined associated words are then displayed for the target document. In this way, a document can be effectively associated with related content in other documents.
Based on any embodiment, the relevance exploration can be continued on the exhibited relevant words, so that the deeper relevant content of the target document can be obtained. In another embodiment of the present invention, after presenting the determined related word for the target document as described above, as shown in fig. 2, the content association method may further include:
s104, acquiring target content as new content to be processed, and returning to the step of executing S102;
wherein, the target content is the displayed associated word.
In order to obtain deeper associated content of the target document, after the determined related words are displayed for the target document, the target content can be acquired and a new round of content association triggered. After the related words of that target content are presented for the target document, those related words can in turn be taken as target content, that is, as new content to be processed, and steps S102-S103 are executed again.
Optionally, in an implementation manner, the obtaining of the target content as the new content to be processed may include:
and when the second content association operation aiming at the target content is detected, acquiring the target content as new content to be processed.
In this implementation, if the electronic device detects that the user to which the target document belongs issues a second content association operation for the target content, this indicates that the user wishes to continue the association search, and therefore the target content may be acquired as new content to be processed.
For example, the second content association operation may be a selection operation on an exploration identifier corresponding to the target content. The exploration identifier may be displayed in association with the target content, for example on any one of the upper, lower, left, and right sides of the target content. The exploration identifier may be displayed continuously from the moment the target content is first presented, or may be displayed only when a specific operation on the target content is detected. For example: if the target document is displayed on a computer or other device on which documents are operated with a mouse, the exploration identifier may be displayed when the mouse is detected hovering over the target content; if the target document is displayed on a device such as a smartphone on which documents are operated by touch, the exploration identifier may be displayed when a long-press operation, a double-click operation, or the like on the target content is detected.
Illustratively, the second content association operation may also be a predetermined operation performed on the target content itself. For example, if the target document is displayed on a device such as a smartphone on which documents are operated by touch, the second content association operation may be a long-press operation, a double-click operation, or the like on the target content.
The specific content of the second content association operation described above is only an example, and should not be construed as a limitation to the embodiments of the present invention.
Optionally, in another implementation, acquiring the target content as the new content to be processed may include: acquiring the target content when it is detected that the target content has been displayed. In this implementation, the user does not need to perform any operation, and the electronic device triggers content association for the target content by itself. It is to be understood that, in this implementation, each time a new related word is presented, it may serve as new content to be processed, and the continued search for related words stops when no related word is found for the new content to be processed.
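The automatic exploration described above can be sketched as a queue-driven loop. This is only an illustrative sketch: the function name `explore`, the callback `find_related` (a stand-in for steps S102-S103), the `max_items` cap, and the rule of skipping content that has already been processed are assumptions that do not appear in the embodiment, which only states that the search stops when no related word is found.

```python
from collections import deque
from typing import Callable, Dict, List


def explore(
    title: str,
    find_related: Callable[[str], List[str]],
    max_items: int = 1000,
) -> Dict[str, List[str]]:
    """Treat every displayed related word as new content to be processed."""
    found: Dict[str, List[str]] = {}
    pending = deque([title])
    while pending and len(found) < max_items:
        content = pending.popleft()
        if content in found:
            # Assumption: content already processed is not explored twice,
            # which prevents endless loops between mutually related words.
            continue
        related = find_related(content)
        found[content] = related
        # An empty result adds nothing to the queue, so the search stops
        # on its own once no new related word is detected.
        pending.extend(word for word in related if word not in found)
    return found


# Hypothetical lookup table standing in for the real related-word search.
toy_index = {
    "Title 1": ["related word 1", "related word 2"],
    "related word 1": ["related word 11", "related word 12"],
}
tree = explore("Title 1", lambda content: toy_index.get(content, []))
```

In the user-triggered implementation, the same lookup would instead run once per second content association operation rather than in a loop.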
It should be emphasized that the above specific implementations of acquiring the target content as the new content to be processed are only examples, and should not be construed as a limitation to the embodiment of the present invention.
In addition, in order to intuitively represent the relationship between the target content and its related words, that is, the fact that the related words were determined based on the target content, optionally, any content to be processed that is target content and its corresponding related words are displayed in association according to a predetermined multi-level display manner. For example, the related words of the target content are displayed as the next-level content of the target content.
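One possible reading of the multi-level display is an indented outline in which each related word sits one level below the content it was derived from. The sketch below assumes this reading; `render_levels`, the two-space indent, and the `children` mapping are hypothetical names chosen for illustration.

```python
from typing import Dict, FrozenSet, List


def render_levels(
    root: str, children: Dict[str, List[str]], indent: str = "  "
) -> List[str]:
    """Return one display line per item, indented by its level."""
    lines: List[str] = []

    def visit(node: str, depth: int, path: FrozenSet[str]) -> None:
        lines.append(indent * depth + node)
        for child in children.get(node, []):
            # Assumption: a word already on the current path is not shown
            # again beneath itself, so cyclic associations terminate.
            if child not in path:
                visit(child, depth + 1, path | {child})

    visit(root, 0, frozenset([root]))
    return lines


children = {
    "Title 1": ["related word 1", "related word 2"],
    "related word 1": ["related word 11", "related word 12"],
}
outline = render_levels("Title 1", children)
```

Here `outline` places "related word 11" and "related word 12" one level below "related word 1", which itself sits one level below the title.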
In this embodiment, the document can be effectively associated with related content in other documents; in addition, since the association search can be continued on the related words, deeper associated content of the target document can be obtained, which further improves the user experience.
For clarity of the scheme and of the layout, the manner of searching for the related words about the content to be processed in each specified document is described below by way of example.
Optionally, in an implementation manner, each of the designated documents may include the target document;
correspondingly, the searching mode of the relevant words about the content to be processed in each specified document can comprise the following steps of A-C:
Step A, determining each specified document from a plurality of documents;
Whether any document contains the content to be processed may be identified in the same way that the prior art identifies whether text content contains specified content, and the details are not repeated in the embodiments of the present invention.
Step B, for each of the specified documents, extracting entity words from the content block of the document that contains the content to be processed, and establishing a correspondence between the extracted entity words and the document identifier of the document;
After each specified document is determined, the content block containing the content to be processed can be determined for each of the specified documents, entity words are extracted from that content block, and a correspondence between the extracted entity words and the document identifier of the document is established. The document identifier of any document uniquely identifies the document, and may be the title of the document, or a digital identifier generated by a predetermined server or a document processing client based on a preset digital identifier generation algorithm.
For example, the step of extracting entity words from the content blocks of the document containing the content to be processed may include:
inputting the content block of the document that contains the content to be processed into a pre-trained language model, so as to encode the content block at the character level, and classifying the encoded text information through a CRF (conditional random field) algorithm to obtain the entity category of each character in the content block; and splicing adjacent characters of the same category to obtain words and the entity category of each word, and obtaining the required entity words in the content block based on the entity category of each word.
The CRF is a sequence tagging algorithm and can be used for tasks such as part-of-speech tagging, word segmentation, named entity recognition and the like. It is to be understood that, if the related word is an entity word in a specified category, the entity word in the specified category may be selected from the multiple words based on the entity category of each word, so as to obtain the desired entity word in the content block.
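The splicing step can be illustrated in isolation. The sketch below does not implement the language model or the CRF; it assumes their output is already available as one category label per character, with the label "O" (an assumed convention) marking characters that belong to no entity. The function name `splice_entities` and the category names are hypothetical.

```python
from itertools import groupby
from typing import List, Optional, Set, Tuple


def splice_entities(
    chars: List[str],
    categories: List[str],
    wanted: Optional[Set[str]] = None,
    outside: str = "O",
) -> List[Tuple[str, str]]:
    """Join runs of adjacent characters that share an entity category."""
    if len(chars) != len(categories):
        raise ValueError("one category is required per character")
    words: List[Tuple[str, str]] = []
    position = 0
    for category, run in groupby(categories):
        length = len(list(run))
        # Keep the run only if it is an entity and, when a set of specified
        # categories is given, belongs to one of those categories.
        if category != outside and (wanted is None or category in wanted):
            words.append(("".join(chars[position:position + length]), category))
        position += length
    return words


chars = list("金山在北京")
categories = ["ORG", "ORG", "O", "LOC", "LOC"]  # assumed per-character output
all_words = splice_entities(chars, categories)
only_places = splice_entities(chars, categories, wanted={"LOC"})
```

One consequence of splicing purely by category is that two neighboring entities of the same category are merged into a single word; tagging schemes with explicit begin markers avoid this, but the embodiment does not say which scheme is used.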
Step C, determining, based on the correspondence, the entity words among the extracted entity words that correspond to the document identifier of the target document, and removing the determined entity words to obtain the related words about the content to be processed.
By the implementation mode, the relevant words about the content to be processed can be found from each specified document.
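Steps A-C can be sketched as follows. The sketch assumes that each document is held as a list of content blocks keyed by its document identifier, that containment is tested with a plain substring check, and that entity extraction is supplied as a callback; `related_words_abc`, `KNOWN_ENTITIES`, and `toy_extract` are hypothetical stand-ins, not part of the embodiment.

```python
from typing import Callable, Dict, List, Set


def related_words_abc(
    content: str,
    target_id: str,
    documents: Dict[str, List[str]],
    extract_entities: Callable[[str], List[str]],
) -> List[str]:
    # Step A: the specified documents are those containing the content
    # to be processed; the target document is among them.
    specified = {
        doc_id: blocks
        for doc_id, blocks in documents.items()
        if any(content in block for block in blocks)
    }
    # Step B: extract entity words from the blocks containing the content
    # and record which document identifiers each word corresponds to.
    word_to_docs: Dict[str, Set[str]] = {}
    for doc_id, blocks in specified.items():
        for block in blocks:
            if content in block:
                for word in extract_entities(block):
                    word_to_docs.setdefault(word, set()).add(doc_id)
    # Step C: remove the words that correspond to the target document.
    return sorted(
        word for word, doc_ids in word_to_docs.items() if target_id not in doc_ids
    )


KNOWN_ENTITIES = ("Beijing", "Hangzhou", "Shanghai")  # toy vocabulary


def toy_extract(block: str) -> List[str]:
    """Stand-in for the language model + CRF extraction."""
    return [word for word in KNOWN_ENTITIES if word in block]


documents = {
    "doc-1": ["Travel Notes", "Travel Notes mention Beijing."],
    "doc-2": ["Travel Notes inspired a trip to Shanghai and Beijing."],
    "doc-3": ["Nothing here except Hangzhou."],
}
result = related_words_abc("Travel Notes", "doc-1", documents, toy_extract)
```

With this toy data, "Beijing" is removed because it also corresponds to the target document "doc-1", and "Hangzhou" never enters the result because "doc-3" does not contain the content to be processed.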
Optionally, in an implementation manner, the target document may not be included in each of the designated documents;
correspondingly, the manner of searching for the related words about the content to be processed in each specified document may include steps D-F:
Step D, determining, from the plurality of documents, the documents other than the target document that contain the content to be processed, to obtain each specified document; the plurality of documents are all the documents of the user to which the target document belongs.
Step E, for each of the specified documents, extracting entity words from the content block of the document that contains the content to be processed;
Step F, for each extracted entity word, identifying whether the entity word appears in the target document and, if it does, deleting the entity word; and determining the remaining entity words as the related words about the content to be processed. Whether any entity word appears in the target document may be identified in the same way that the prior art identifies whether text content contains specified content, and the details are not repeated in the embodiments of the present invention.
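The variant in which the target document is excluded from the specified documents can be sketched in the same style. As before, the list-of-blocks document model, the substring checks, and the names `related_words_excluding_target` and `toy_extract` are assumptions made for illustration only.

```python
from typing import Callable, Dict, List


def related_words_excluding_target(
    content: str,
    target_id: str,
    documents: Dict[str, List[str]],
    extract_entities: Callable[[str], List[str]],
) -> List[str]:
    # Specified documents: every document except the target document
    # that contains the content to be processed.
    specified = [
        blocks
        for doc_id, blocks in documents.items()
        if doc_id != target_id and any(content in block for block in blocks)
    ]
    # Extract entity words from the blocks containing the content,
    # keeping first-seen order and dropping duplicates.
    extracted: List[str] = []
    for blocks in specified:
        for block in blocks:
            if content in block:
                for word in extract_entities(block):
                    if word not in extracted:
                        extracted.append(word)
    # Delete every entity word that appears anywhere in the target document.
    target_text = "\n".join(documents[target_id])
    return [word for word in extracted if word not in target_text]


def toy_extract(block: str) -> List[str]:
    """Stand-in for the real entity extraction."""
    return [word for word in ("Beijing", "Shanghai") if word in block]


documents = {
    "doc-1": ["Travel Notes", "A short stay in Beijing."],
    "doc-2": ["Travel Notes inspired a trip to Shanghai and Beijing."],
}
result = related_words_excluding_target(
    "Travel Notes", "doc-1", documents, toy_extract
)
```

Unlike the correspondence-based manner, this variant checks each word against the whole target document, so "Beijing" is deleted even though it appears in a target block that does not contain the title.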
It should be emphasized that the above manners of searching for the related words about the content to be processed in each specified document are only examples, and should not be construed as a limitation to the embodiment of the present invention.
The content association method provided by the embodiment of the present invention is described below with reference to a specific embodiment.
As shown in fig. 3, a content association method provided in an embodiment of the present invention may include the following steps:
S301, when detecting the selection operation of the exploration identification corresponding to the title of the target document, the terminal equipment acquires the title in the target document as the content to be processed;
wherein the selection operation is sent out by the user in the display interface of the target document.
For clarity of the scheme, the following describes an exemplary implementation manner of S301 in conjunction with the user interaction with the electronic device:
When a display instruction issued by the user for the graph menu is received, the user's graph menu is displayed. Each graph node in the graph menu may represent one of the user's documents and may be identified by the title of the document it represents; illustratively, FIG. 4(b) shows a graph menu whose graph nodes each represent a document and are identified by its title. In addition, the graph nodes of associated documents may be connected, such as the graph nodes identified by title 6, title 7 and title 8. Whether documents are associated can be analyzed in a preset association analysis manner, for example by analyzing whether their titles share the same content, or by analyzing whether the documents belong to the same domain, where the domain of each document may be specified by the user when creating the document, for example: literature, mathematics, and the like.
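The two example association analyses can be sketched as a pairwise check that yields the connections between graph nodes. The embodiment does not define what "the same content" in two titles means, so the sketch assumes it means sharing at least one whitespace-separated word, and it combines both analyses with a logical OR; the function names and the sample titles are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def title_tokens(title: str) -> Set[str]:
    """Lower-cased words of a title (assumed notion of 'same content')."""
    return {token.lower() for token in title.split()}


def associated(a: Dict[str, str], b: Dict[str, str]) -> bool:
    same_domain = a.get("domain") is not None and a.get("domain") == b.get("domain")
    shared_title_word = bool(title_tokens(a["title"]) & title_tokens(b["title"]))
    return same_domain or shared_title_word


def build_edges(docs: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return the pairs of titles whose graph nodes would be connected."""
    return [
        (a["title"], b["title"])
        for a, b in combinations(docs, 2)
        if associated(a, b)
    ]


docs = [
    {"title": "Linear Algebra Notes", "domain": "mathematics"},
    {"title": "Probability Exercises", "domain": "mathematics"},
    {"title": "Reading Diary", "domain": "literature"},
    {"title": "Travel Diary", "domain": "travel"},
]
edges = build_edges(docs)
```

With this sample, the two mathematics documents are connected through their domain, and the two diaries through the shared title word.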
When the selected operation given by a user for a graph node is detected, displaying the document content of a target document represented by the graph node through a tab interface; illustratively, FIG. 4(c) shows the tab interface of the document indicated by heading 1;
When the mouse is detected hovering over the title in the tab interface, an exploration identifier appears at the specified associated position of the title; if a selection operation on the exploration identifier is detected, the title in the target document is acquired as the content to be processed. Illustratively, fig. 4(d) shows an "exploration" identifier displayed on the right side of title 1; the user clicks the "exploration" identifier to trigger the content association processing for the document identified by title 1, where title 1 is the content to be processed.
S302, after acquiring the content to be processed, the terminal device constructs a content association request carrying the content to be processed and the document identifier of the target document, and sends the constructed content association request to a predetermined server;
correspondingly, when the predetermined server receives the content association request, the response process for the content association request is as follows:
determining, from the documents of the user to which the target document belongs, each specified document containing the content to be processed, where the specified documents include the target document;
for each of the specified documents, extracting entity words from the content block of the document that contains the content to be processed, and establishing a correspondence between the extracted entity words and the document identifier of the document;
and obtaining the document identifier of the target document, determining, based on the correspondence, the entity words among the extracted entity words that correspond to the document identifier of the target document, removing the determined entity words to obtain the related words about the content to be processed, and feeding the obtained related words back to the terminal device.
For example, as shown in fig. 4(g), the predetermined server may transmit a content block containing the content to be processed to the intelligent processing end through a request interface provided by the intelligent processing end; the intelligent processing end identifies the entity words in the content block and feeds them back to the predetermined server. The predetermined server then selects, from the received entity words, those that do not appear in the target document to obtain the related words, and feeds the related words back to the terminal device so that the terminal device can display them. Specifically, the predetermined server may send the intelligent processing end an entity word recognition request carrying a content block that contains the content to be processed; after receiving the request, the intelligent processing end extracts the content block from it, recognizes the entity words in the content block, and returns a response carrying those entity words, so that the recognized entity words are fed back to the predetermined server in batch.
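The message flow between the terminal device, the predetermined server, and the intelligent processing end can be sketched with plain JSON messages. The embodiment does not specify a wire format, so the JSON encoding, the field names, and the functions `build_request` and `handle_request` are assumptions; the intelligent processing end is modeled as a callback, and the lookup of content blocks is left out by passing them in directly.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List


@dataclass
class ContentAssociationRequest:
    content: str              # the content to be processed
    target_document_id: str   # document identifier of the target document


def build_request(content: str, target_document_id: str) -> str:
    """Terminal side: construct the content association request."""
    request = ContentAssociationRequest(content, target_document_id)
    return json.dumps(asdict(request), ensure_ascii=False)


def handle_request(
    raw_request: str,
    blocks_with_content: List[str],
    target_text: str,
    recognize_entities: Callable[[str], List[str]],
) -> str:
    """Server side: delegate recognition, then filter against the target."""
    request = ContentAssociationRequest(**json.loads(raw_request))
    related: List[str] = []
    for block in blocks_with_content:
        # Each call stands for one entity word recognition request sent
        # to the intelligent processing end.
        for word in recognize_entities(block):
            if word not in target_text and word not in related:
                related.append(word)
    response = {
        "target_document_id": request.target_document_id,
        "related_words": related,
    }
    return json.dumps(response, ensure_ascii=False)


raw = build_request("Title 1", "doc-1")
reply = json.loads(
    handle_request(
        raw,
        blocks_with_content=["Title 1 compares Alpha with Beta."],
        target_text="Title 1\nThis document introduces Alpha.",
        recognize_entities=lambda block: [
            word for word in ("Alpha", "Beta") if word in block
        ],
    )
)
```

Keeping entity recognition behind a callback mirrors the separation between the predetermined server and the intelligent processing end described above.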
And S303, the terminal equipment receives the relevant words fed back by the preset server and displays the received relevant words in a display interface of the target document.
If the user issues, in the tab interface of the target document, the selection operation on the exploration identifier corresponding to the title of the target document, the determined related words are displayed in the tab interface of the target document. Illustratively, fig. 4(e) is a schematic diagram in which the related words are displayed after the document content shown in the tab interface.
S304, when the terminal device detects the selection operation of the exploration identification corresponding to the target content, the terminal device obtains the target content as a new content to be processed, and the step returns to S302.
The target content is any related word in the display interface of the target document. For example, fig. 4(f) shows a manner of presenting the related words of a target content, where the target content is related word 1, its related words are related word 11 and related word 12, and related word 11 and related word 12 are presented as the next-level content of related word 1.
In this embodiment, the document can be effectively associated with related content in other documents; in addition, since the association search can be continued on the related words, deeper associated content of the target document can be obtained, which further improves the user experience.
Based on the above method embodiment, the embodiment of the invention also provides a content association system. As shown in fig. 5, a content association system may include: a document processing client 510 and a predetermined server 520;
the document processing client 510 is configured to obtain a title in a target document, serve as a content to be processed, construct a content association request carrying the content to be processed, and send the content association request to a predetermined server 520;
the predetermined server 520 is configured to, after receiving the content association request, search for associated words in each specified document about the content to be processed, and feed back the searched associated words to the document processing client; the specified document is a document containing the content to be processed, and the associated words are entity words which belong to the same content block as the content to be processed and do not appear in the target document;
the document processing client 510 is further configured to receive the related words fed back by the predetermined server, and display the determined related words for the target document.
Optionally, in another embodiment of the present invention, the document processing client 510 is further configured to, after displaying the determined associated word for the target document, obtain a target content as a new content to be processed, and execute the step of constructing a content association request carrying the content to be processed; wherein the target content is the displayed associated word.
Optionally, in another embodiment of the present invention, any content to be processed belonging to the target content and the corresponding related word are subjected to related display according to a predetermined multi-level display manner.
Optionally, in another embodiment of the present invention, the searching, by the predetermined server 520, for the relevant word about the content to be processed in each specified document includes:
determining each specified document from a plurality of documents;
for each document in the specified documents, extracting entity words from the content blocks of the document containing the content to be processed, and establishing the corresponding relation between the extracted entity words and the document identification of the document;
and determining entity words corresponding to the document identification of the target document from the extracted entity words based on the corresponding relation, and eliminating the determined entity words to obtain associated words related to the content to be processed.
Optionally, in another embodiment of the present invention, the acquiring, by the document processing client 510, a title in the target document as the content to be processed includes:
when a first content association operation aiming at a title of a target document is detected, the title in the target document is obtained and is used as a content to be processed.
Optionally, in another embodiment of the present invention, the acquiring, by the document processing client 510, the target content in the presentation interface as the new content to be processed includes:
and when the second content association operation aiming at the target content is detected, acquiring the target content as new content to be processed.
The specific implementation manner of each step executed by the document processing client 510 and the predetermined server 520 may refer to the corresponding step in the foregoing method embodiment, which is not described herein again.
Based on the above method embodiment, the embodiment of the invention also provides a content association apparatus. As shown in fig. 6, the content association apparatus may include:
an obtaining module 610, configured to obtain a title in a target document as a content to be processed;
a determining module 620, configured to determine relevant words in each specified document about the content to be processed; the specified document is a document containing the content to be processed, and the associated words are entity words which belong to the same content block as the content to be processed and do not appear in the target document;
a display module 630, configured to display the determined related word for the target document.
Optionally, in another embodiment of the present invention, the obtaining module 610 is further configured to obtain, after the determined related word is shown for the target document, a target content as a new content to be processed, and trigger the determining module;
wherein the target content is the displayed associated word.
Optionally, in another embodiment of the present invention, any content to be processed belonging to the target content and the corresponding related word are subjected to related display according to a predetermined multi-level display manner.
Optionally, in another embodiment of the present invention, a manner of searching for a relevant word about the content to be processed in each specified document includes:
determining each specified document from a plurality of documents;
for each document in the specified documents, extracting entity words from the content blocks of the document containing the content to be processed, and establishing the corresponding relation between the extracted entity words and the document identification of the document;
and determining entity words corresponding to the document identification of the target document from the extracted entity words based on the corresponding relation, and eliminating the determined entity words to obtain associated words related to the content to be processed.
Optionally, in another embodiment of the present invention, the acquiring module 610 acquires a title in a target document of a user as the content to be processed, and may include:
when a first content association operation aiming at a title of a target document of a user is detected, the title in the target document is obtained and is used as a content to be processed.
Optionally, in another embodiment of the present invention, the obtaining module 610 obtains the target content in the presentation interface as the new content to be processed, which may include: and when a second content association operation aiming at the target content is detected, acquiring the target content as new content to be processed.
An embodiment of the present invention further provides an electronic device which, as shown in fig. 7, includes a processor 701, a communication interface 702, a memory 703 and a communication bus 704, where the processor 701, the communication interface 702 and the memory 703 communicate with one another through the communication bus 704;
a memory 703 for storing a computer program;
the processor 701 is configured to implement the steps of the content association method provided in the embodiment of the present invention when executing the program stored in the memory 703.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components.
In a further embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any of the above-mentioned content association methods.
In a further embodiment provided by the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps of any of the above-described embodiments of the content association method.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, microwave, etc.) means. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (16)

1. A method for associating content, the method comprising:
acquiring a title in a target document as a content to be processed;
determining relevant words about the content to be processed in each specified document; the specified document is a document containing the content to be processed, and the associated words are entity words which belong to the same content block as the content to be processed and do not appear in the target document;
and displaying the determined associated words aiming at the target document.
2. The method according to claim 1, wherein after the step of presenting the determined relevant words for the target document, the method further comprises:
acquiring target content as new content to be processed, and executing the step of determining relevant words corresponding to the content to be processed in each specified document; wherein the target content is the displayed associated word.
3. The method according to claim 2, wherein any content to be processed belonging to the target content and the corresponding associated word are associated and displayed according to a predetermined multi-level display mode.
4. The method according to any one of claims 1 to 3, wherein the manner of searching for the relevant word about the content to be processed in each specified document comprises:
determining each specified document from a plurality of documents;
for each document in the specified documents, extracting entity words from the content blocks of the document containing the content to be processed, and establishing the corresponding relation between the extracted entity words and the document identification of the document;
and determining entity words corresponding to the document identification of the target document from the extracted entity words based on the corresponding relation, and eliminating the determined entity words to obtain associated words related to the content to be processed.
5. The method according to any one of claims 1 to 3, wherein the obtaining of the title in the target document as the content to be processed comprises:
when a first content association operation aiming at a title of a target document is detected, the title in the target document is obtained and is used as a content to be processed.
6. The method according to claim 2, wherein the acquiring the target content as the new content to be processed comprises:
and when a second content association operation aiming at the target content is detected, acquiring the target content as new content to be processed.
7. A content association system, the system comprising: a document processing client and a predetermined server;
the document processing client is used for acquiring a title in a target document, taking the title as a content to be processed, constructing a content association request carrying the content to be processed, and sending the content association request to a predetermined server;
the predetermined server is used for searching the associated words related to the content to be processed in each specified document after receiving the content association request, and feeding back the searched associated words to the document processing client; the specified document is a document containing the content to be processed, and the associated words are entity words which belong to the same content block as the content to be processed and do not appear in the target document;
the document processing client is further configured to receive the relevant words fed back by the predetermined server, and display the determined relevant words for the target document.
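The request and feedback exchange between the client and the server in claim 7 can be sketched as below. This is a hedged illustration only: the JSON message fields, the `ASSOCIATION_INDEX` lookup table (standing in for the server-side search of the specified documents), the `server_handle` function and the `DocumentClient` class are all hypothetical, and the transport is an in-process function call rather than a real network connection.

```python
import json
from typing import Callable

# Hypothetical precomputed result of the server-side search:
# content to be processed -> associated words.
ASSOCIATION_INDEX = {"solar panel": ["battery", "charge controller"]}


def server_handle(raw_request: str) -> str:
    """Predetermined server: parse the request, look up the associated
    words, and feed them back as a JSON response."""
    request = json.loads(raw_request)
    words = ASSOCIATION_INDEX.get(request["content"], [])
    return json.dumps({"content": request["content"],
                       "associated_words": words})


class DocumentClient:
    """Document processing client with a pluggable transport (assumption)."""

    def __init__(self, send: Callable[[str], str]):
        self._send = send
        self.displayed: list = []

    def associate(self, title: str) -> list:
        # Construct a content association request carrying the content.
        request = json.dumps({"content": title})
        # Send it and receive the associated words fed back by the server.
        response = json.loads(self._send(request))
        # "Display" the words; here they are simply stored and returned.
        self.displayed = response["associated_words"]
        return self.displayed


client = DocumentClient(send=server_handle)
print(client.associate("solar panel"))
```

Passing the transport as a callable keeps the sketch self-contained; in a deployed system the same client logic could wrap an HTTP or RPC call instead.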
8. The system according to claim 7, wherein the document processing client is further configured to, after the step of presenting the determined associated word for the target document, obtain a target content as a new content to be processed, and execute the step of constructing a content association request carrying the content to be processed;
wherein the target content is the displayed associated word.
9. The system according to claim 8, wherein each content to be processed that is a target content is displayed in association with its corresponding associated words in a predetermined multi-level display manner.
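The iterative expansion of claim 8 and the multi-level display of claim 9 can be pictured as a tree in which each displayed associated word may itself become the new content to be processed. The sketch below is one possible reading under stated assumptions: the `Node` type, the `expand` and `render` functions, the indentation-based rendering and the `SAMPLE` lookup table are hypothetical and are not taken from the application.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    word: str
    children: list[Node] = field(default_factory=list)


def expand(node: Node, lookup: Callable[[str], list]) -> None:
    """Treat a displayed word as the new content to be processed and
    attach its associated words as the next display level."""
    node.children = [Node(w) for w in lookup(node.word)]


def render(node: Node, depth: int = 0) -> list:
    """Render the tree as indented lines, one level per indentation step."""
    lines = ["  " * depth + node.word]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical association results used only for this illustration.
SAMPLE = {"solar panel": ["battery", "inverter"], "battery": ["lithium"]}


def sample_lookup(word: str) -> list:
    return SAMPLE.get(word, [])


root = Node("solar panel")          # title of the target document
expand(root, sample_lookup)         # first level of associated words
expand(root.children[0], sample_lookup)  # user selects "battery"
print("\n".join(render(root)))
```

Each call to `expand` corresponds to one further content association request, so the depth of the tree reflects how many times the user has selected a displayed word.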
10. The system according to any one of claims 7 to 9, wherein the predetermined server searches each specified document for the associated words related to the content to be processed by:
determining each specified document from a plurality of documents;
for each of the specified documents, extracting entity words from the content blocks of the document that contain the content to be processed, and establishing a correspondence between the extracted entity words and the document identifier of the document;
and determining, based on the correspondence, the extracted entity words that correspond to the document identifier of the target document, and removing the determined entity words to obtain the associated words related to the content to be processed.
11. A content association apparatus, applied to an electronic device, the apparatus comprising:
an acquisition module configured to acquire a title in a target document as the content to be processed;
a determining module configured to determine associated words related to the content to be processed in each specified document; wherein each specified document is a document containing the content to be processed, and the associated words are entity words that belong to the same content block as the content to be processed and do not appear in the target document;
and a display module configured to display the determined associated words for the target document.
12. The apparatus according to claim 11, wherein the acquisition module is further configured to, after the determined associated words are displayed for the target document, acquire target content as new content to be processed and trigger the determining module; wherein the target content is a displayed associated word.
13. The apparatus according to claim 12, wherein each content to be processed that is a target content is displayed in association with its corresponding associated words in a predetermined multi-level display manner.
14. The apparatus according to any one of claims 11 to 13, wherein determining the associated words related to the content to be processed in each specified document comprises:
determining each specified document from a plurality of documents;
for each of the specified documents, extracting entity words from the content blocks of the document that contain the content to be processed, and establishing a correspondence between the extracted entity words and the document identifier of the document;
and determining, based on the correspondence, the extracted entity words that correspond to the document identifier of the target document, and removing the determined entity words to obtain the associated words related to the content to be processed.
15. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1 to 6 when executing the program stored in the memory.
16. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method steps of any one of claims 1 to 6.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110472315.6A CN113204578A (en) 2021-04-29 2021-04-29 Content association method, system, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113204578A (en) 2021-08-03

Family

ID=77029353

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110472315.6A Pending CN113204578A (en) 2021-04-29 2021-04-29 Content association method, system, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113204578A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101814089A (en) * 2009-02-23 2010-08-25 富士胶片株式会社 Related content display device and system
CN104077011A (en) * 2013-03-26 2014-10-01 北京三星通信技术研究有限公司 Method for associating documents in same type and terminal equipment
CN103678263A (en) * 2013-12-31 2014-03-26 吕奇森 Graphical interface display method and system for incidence relations among document chapters
CN110321561A (en) * 2019-06-27 2019-10-11 腾讯科技(深圳)有限公司 Keyword extraction method and device
CN110990696A (en) * 2019-11-25 2020-04-10 三角兽(北京)科技有限公司 Method and device for recommending search intention
CN111045836A (en) * 2019-11-25 2020-04-21 三角兽(北京)科技有限公司 Search method, search device, electronic equipment and computer-readable storage medium
CN111368185A (en) * 2020-02-25 2020-07-03 北京字节跳动网络技术有限公司 Data display method and device, storage medium and electronic equipment
CN111475729A (en) * 2020-04-07 2020-07-31 腾讯科技(深圳)有限公司 Search content recommendation method and device

Similar Documents

Publication Publication Date Title
US10795939B2 (en) Query method and apparatus
CN107832433B (en) Information recommendation method, device, server and storage medium based on conversation interaction
US11023505B2 (en) Method and apparatus for pushing information
US11669579B2 (en) Method and apparatus for providing search results
CN109726274B (en) Question generation method, device and storage medium
CN109190049B (en) Keyword recommendation method, system, electronic device and computer readable medium
US11176453B2 (en) System and method for detangling of interleaved conversations in communication platforms
CN105518661B (en) Browsing images via mined hyperlinked text snippets
CN109918555B (en) Method, apparatus, device and medium for providing search suggestions
CN110888990A (en) Text recommendation method, device, equipment and medium
CN106959976B (en) Search processing method and device
US20140181099A1 (en) User management of electronic documents
JP7069802B2 (en) Systems and methods for user-oriented topic selection and browsing, how to display multiple content items, programs, and computing devices.
CN104462056A (en) Active knowledge guidance based on deep document analysis
CN112579937A (en) Character highlight display method and device
CN111126058A (en) Text information automatic extraction method and device, readable storage medium and electronic equipment
WO2024087821A1 (en) Information processing method and apparatus, and electronic device
US20170293683A1 (en) Method and system for providing contextual information
CN104240107A (en) Community data screening system and method thereof
CN107908792B (en) Information pushing method and device
CN113204578A (en) Content association method, system, device, electronic equipment and storage medium
CN113204579B (en) Content association method, system, device, electronic equipment and storage medium
US20160117352A1 (en) Apparatus and method for supporting visualization of connection relationship
CN113922979B (en) Network security equipment configuration system, configuration method and computer equipment
JP4722819B2 (en) Information disclosure system and information disclosure method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination