CN116663529A - Entry generation method and device and electronic equipment - Google Patents

Entry generation method and device and electronic equipment

Info

Publication number
CN116663529A
Authority
CN
China
Prior art keywords
document
entry
term
generating
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310645305.7A
Other languages
Chinese (zh)
Inventor
胡元瑞
赵立悦
陈爽
刘玉琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202310645305.7A priority Critical patent/CN116663529A/en
Publication of CN116663529A publication Critical patent/CN116663529A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Document Processing Apparatus (AREA)

Abstract

Embodiments of the present disclosure provide an entry generation method, an entry generation device, and electronic equipment. The method comprises: receiving, from a document server, information of a document to be used for generating an entry; extracting entry elements from the document; and generating an entry based on the entry elements. Because the entry is generated from a document held by the document server, the user does not need to create the entry manually, which reduces the workload of creating entries and the labor cost of maintaining them.

Description

Entry generation method and device and electronic equipment
Technical Field
Embodiments of the present disclosure relate to the technical field of the internet, and in particular to an entry generation method, an entry generation device, and electronic equipment.
Background
To facilitate the accumulation of knowledge, multiple documents may typically be created and maintained in a knowledge base. Different knowledge content may be recorded in the plurality of documents.
Further, to help users understand knowledge content, multiple entries may be created and maintained. Each entry may provide a corresponding paraphrase, example, related document, and so on. When a user browsing text content meets a word that is unfamiliar or hard to understand, the user can look up the corresponding entry to find its paraphrase, examples, and similar information, and thereby understand the knowledge content.
Disclosure of Invention
The embodiment of the disclosure provides a term generation method, a term generation device and electronic equipment.
In a first aspect, an embodiment of the present disclosure provides a method for generating an entry, the method including: receiving information of a document used for generating an entry and sent by a document server; extracting term elements from the document; generating a term based on the term element.
In a second aspect, an embodiment of the present disclosure provides a term generating method, including: determining a first document needing to generate an entry; the information of the first document is sent to an entry server, and the entry server executes the following entry generation operation: entry elements are extracted from the first document, and entries are generated based on the entry elements.
In a third aspect, an embodiment of the present disclosure provides a term generating device, including: the receiving unit is used for receiving the information of the document used for generating the entry and sent by the document server; an extraction unit for extracting entry elements from the document; and the generating unit is used for generating the entry based on the entry element.
In a fourth aspect, an embodiment of the present disclosure provides a term generating device, including: a determining unit, configured to obtain a first document in which an entry needs to be generated; the sending unit is used for sending the information of the first document to the vocabulary entry server, and the vocabulary entry server executes the following vocabulary entry generating operation: entry elements are extracted from the first document, and entries are generated based on the entry elements.
In a fifth aspect, embodiments of the present disclosure provide an electronic device, including: a processor and a memory; the memory stores computer-executable instructions; the processor executes the computer-executable instructions stored in the memory, such that the processor performs the entry generation method of the first aspect, the second aspect, or any of their possible implementations.
In a sixth aspect, embodiments of the present disclosure provide a computer-readable storage medium having stored therein computer-executable instructions that, when executed by a processor, implement the above first aspect, second aspect, and various possible entry generation methods of the first aspect, second aspect.
In a seventh aspect, embodiments of the present disclosure provide a computer program product comprising a computer program which, when executed by a processor, implements the above first aspect, second aspect and various possible entry generation methods of the first aspect, second aspect.
According to the entry generation method, the entry generation device, and the electronic equipment provided by these embodiments, information of a document to be used for generating an entry is received from a document server; entry elements are extracted from the document; and an entry is generated based on the entry elements. The entry is thus generated from a document held by the document server, which, compared with manual creation by a user, reduces the workload of creating entries and the labor cost of maintaining them.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the solutions in the prior art, a brief description will be given below of the drawings that are needed in the embodiments or the description of the prior art, it being obvious that the drawings in the following description are some embodiments of the present disclosure, and that other drawings may be obtained from these drawings without inventive effort to a person of ordinary skill in the art.
FIG. 1 is a schematic flow chart of a vocabulary entry generation method provided by the present disclosure;
FIG. 2 is another schematic flow chart diagram of a vocabulary entry generation method provided by the present disclosure;
FIG. 3 is a schematic flow chart of a vocabulary entry generation method provided by the present disclosure;
FIG. 4A is a schematic diagram of a document display interface;
FIG. 4B is a schematic diagram of authorizing a document to generate an entry;
FIG. 5 is a schematic block diagram of a term generation system provided by the present disclosure;
FIG. 6 is a schematic block diagram of an entry generating device provided by the present disclosure;
FIG. 7 is a schematic block diagram of an entry generating device provided by the present disclosure;
FIG. 8 is a schematic hardware structure of an electronic device according to an embodiment of the disclosure.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are some embodiments of the present disclosure, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without inventive effort, based on the embodiments in this disclosure are intended to be within the scope of this disclosure.
To accumulate knowledge information, multiple documents may be aggregated in a knowledge base. The plurality of documents may constitute a document collection. In the knowledge base, a catalog of documents may be generated from a collection of documents in order to facilitate the management and lookup of the documents. The user can find the corresponding document through the document directory.
In addition, in order to facilitate the user's understanding of text content (e.g., text content in a document or instant messaging), a dictionary for carrying entries is provided. When the term name is included in the text content, the term name may be highlighted in the text content, and the user may view term information through the term name in the text content. The use of entries in the dictionary is more flexible than the documents in the knowledge base.
Currently, documents in the knowledge base (e.g., document creation and document modification) and entries in the dictionary (e.g., entry creation and entry modification) must be maintained separately.
In some application scenarios, a document in the knowledge base may contain the same knowledge content as an entry. Even in these scenarios, the documents in the knowledge base and the entries in the dictionary still have to be maintained separately, and this maintenance is mainly done manually by users. Maintaining documents and entries separately therefore involves a large amount of repeated work and a high labor cost.
To address these problems, in the entry generation method provided by the present disclosure, the document server sends the information of the document for which an entry is to be generated to the entry server, and the entry server extracts entry elements from the document and generates the entry based on those elements. The entry is thus generated automatically from the document, the user does not need to manually create the entry corresponding to the document, the user's workload for creating and maintaining entries is reduced, and labor cost is saved.
Referring to fig. 1, a schematic flowchart of a term generation method provided in the present disclosure is shown.
As shown in fig. 1, the term generating method includes the steps of:
S101: Receive information of the document used for generating the entry, sent by the document server.
In this embodiment, the execution body of the entry generation method may be an entry server that provides an entry service, for example the server side of a dictionary application.
The document server may be a server that stores, or can acquire, a plurality of documents, for example a server that provides document storage services for a knowledge base. The knowledge base may correspond to a plurality of documents. The document server may obtain the plurality of documents, determine the document for which an entry is to be generated, and send the information of that document to the execution body.
The documents include newly created documents and/or existing documents.
In some application scenarios, the document includes an existing document. The document server can take a document that already exists on the server as the document for which an entry is to be generated, and send the information of the existing document to the entry server. The entry server then generates the entry according to the information of the existing document.
In some application scenarios, the document includes a newly created document. The document server sends the information of the newly created document to the entry server, and the entry server generates the entry according to that information.
The document service and the entry service are usually independent from each other. The user needs to maintain the document and the term in the document application and the term application, respectively. In this embodiment, the execution body may acquire a document from a document server and execute an operation of generating an entry according to the document.
The documents may be of various types, such as table documents, multidimensional tables, Excel documents, Word documents, and the like.
The information of the document may include the document content and/or a storage address (e.g., link) of the document.
S102: entry elements are extracted from the document.
After receiving the information of the document, the execution body may determine the document content of the document. If the information of the document includes the document content, the execution body may read the document content directly from the information. If the information of the document includes the storage address of the document, the execution body may access the document at that storage address.
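The two cases for obtaining the document content can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `DocumentInfo` shape, the `resolve_content` helper, and the `fetch` callback are hypothetical names introduced here, and an in-memory dictionary stands in for the document store.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DocumentInfo:
    """Information of a document as sent by the document server (hypothetical shape)."""
    content: Optional[str] = None          # inline document content, if provided
    storage_address: Optional[str] = None  # e.g. a link to the stored document


def resolve_content(info: DocumentInfo, fetch: Callable[[str], str]) -> str:
    """Return the document content, read inline or fetched by storage address."""
    if info.content is not None:
        return info.content
    if info.storage_address is not None:
        return fetch(info.storage_address)
    raise ValueError("document info carries neither content nor a storage address")


# Usage with an in-memory stand-in for the document store (assumption).
store = {"doc://a1": "Title\nSummary: short text"}
inline = resolve_content(DocumentInfo(content="hello"), store.__getitem__)
fetched = resolve_content(DocumentInfo(storage_address="doc://a1"), store.__getitem__)
```

Passing the fetch function as a parameter keeps the sketch independent of any particular storage or network API.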
The term elements herein may be different elements that make up the term.
The term elements include, but are not limited to, at least one of: term name, paraphrasing, associated document, related picture.
The term name is also known as the term. The term names may be used to distinguish between different terms.
Paraphrasing may be text content describing the meaning of the term or summarizing the content of the term.
The associated document may include, for example, a term source document, a document including a term name, or a document associated with a term, etc.
Related pictures may be, for example, pictures that help in understanding the entry.
The document may be analyzed using various methods to extract the term elements from the document.
In some embodiments, the step S102 includes: inputting the document (document content) into a pre-trained entry element extraction model, and outputting entry elements by the entry element extraction model; the term element extraction model is used for extracting term elements from an input document.
In these embodiments, the term element extraction model may be various machine learning models, such as a neural network model, a deep learning model, and the like.
The entry element extraction model may be trained using supervised methods. The training data may include a plurality of training data pairs, each comprising a document and the entry elements corresponding to that document. During training, the document is used as the input and the corresponding entry elements are used as the expected output, so that a trained entry element extraction model is obtained.
S103: the term is generated based on the term element.
The multiple entry elements corresponding to one entry can be stored in association with one another, thereby obtaining the entry.
As an implementation, different term elements to be generated into a term may be written into corresponding fields in the term template, thereby obtaining a term.
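Writing elements into template fields might look like the sketch below. The `Entry` template and its field names are assumptions chosen to mirror the element types listed above (entry name, paraphrase, associated document, related picture); the source does not specify a template layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entry:
    """A hypothetical entry template; field names mirror the elements in the text."""
    name: str
    paraphrase: Optional[str] = None
    associated_documents: List[str] = field(default_factory=list)
    related_pictures: List[str] = field(default_factory=list)


def generate_entry(elements: Dict[str, object]) -> Entry:
    """Write extracted entry elements into the matching template fields."""
    if not elements.get("name"):
        raise ValueError("an entry needs at least an entry name")
    return Entry(
        name=str(elements["name"]),
        paraphrase=elements.get("paraphrase"),
        associated_documents=list(elements.get("associated_documents", [])),
        related_pictures=list(elements.get("related_pictures", [])),
    )


entry = generate_entry({
    "name": "knowledge base",
    "paraphrase": "A collection of documents that records knowledge content.",
    "associated_documents": ["doc://a1"],
})
```

Elements that were not extracted simply leave their fields at the template defaults.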
In this embodiment, information of a document to be used for generating an entry is received from the document server; entry elements are then extracted from the document according to that information; and the entry is generated based on the entry elements. The entry is thus generated automatically from a document held by the document server, the user does not need to create the entry manually, and both the workload of creating entries and the labor cost of maintaining them are reduced.
In some embodiments, the step S102 includes: extracting the entry elements for generating the entry from the document based on extraction rules matched with the document.
In some application scenarios, the document records unstructured content, for example a doc-format or pdf-format document. In these application scenarios, the extraction rules may be determined from the features of the document. The extraction rules are used to extract the components of the document. The components include, but are not limited to, one or more of: a title, a brief introduction, an abstract, or a body text.
Taking doc format documents as an example, component recognition rules may be used to identify components of the document. Entry elements are then extracted from the constituent parts.
Illustratively, in a doc-format document the title may occupy a paragraph of its own, so the text content of the first paragraph is identified as the title. The content of the abstract or brief introduction may be identified based on keywords such as "summary" or "profile". The paragraphs following the abstract or brief introduction are identified as the body of the document. In addition, the document may also include pictures, associated document information, corresponding link information, and the like, which may be determined according to their respective keywords or display styles.
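A minimal sketch of such component recognition, assuming the document has already been split into plain-text paragraphs. The keyword list, the `split_components` name, and the simplification that every paragraph other than the title and summary counts as body text are assumptions made for illustration.

```python
from typing import Dict, List

SUMMARY_KEYWORDS = ("summary", "profile", "abstract")  # assumed keyword list


def split_components(paragraphs: List[str]) -> Dict[str, str]:
    """Identify title, summary and body from a list of paragraphs."""
    components = {"title": "", "summary": "", "body": ""}
    if not paragraphs:
        return components
    # The first paragraph is treated as the title.
    components["title"] = paragraphs[0].strip()
    body: List[str] = []
    for paragraph in paragraphs[1:]:
        text = paragraph.strip()
        lowered = text.lower()
        keyword = next((k for k in SUMMARY_KEYWORDS if lowered.startswith(k)), None)
        if keyword and not components["summary"]:
            # Drop the keyword and any separator that follows it.
            components["summary"] = text[len(keyword):].lstrip(" :").strip()
        else:
            body.append(text)
    components["body"] = "\n".join(body)
    return components


parts = split_components([
    "Entry generation",
    "Summary: Entries are generated from documents.",
    "First body paragraph.",
    "Second body paragraph.",
])
```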
After identifying the individual component parts of the document, entry elements may be intercepted from the different component parts using preset extraction rules. Illustratively, for a title, at least a portion of text may be intercepted from the title as an entry name according to a preset title intercept rule. At least some of the text is intercepted in the abstract or brief introduction as paraphrasing, etc.
As an exemplary illustration, the extraction rule may include, for example, intercepting the first N characters or words, intercepting the last N characters or words, or intercepting N characters or words starting from a designated middle position.
In some application scenarios, the document may include a plurality of parallel structures. In these application scenarios, the step S102 may further include: for each structure, extracting the entry elements corresponding to the structure from the structure. Accordingly, the step S103 may further include: generating the entry corresponding to each structure according to the entry elements corresponding to that structure.
Each structure may record one piece of information content. The structure may include a plurality of structure members, and the component of the information content that each member corresponds to may be set in advance. For example, structure member 1 is preset as the title of the information content, structure member 2 is preset as its brief introduction or summary, and structure member 3 is preset as its body text, and so on.
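The three interception rules can be sketched on whitespace-separated words. The `intercept` function, its `mode` values, and the zero-based `start` index are hypothetical; a character-based variant would slice the string directly instead of the word list.

```python
def intercept(text: str, n: int, mode: str = "first", start: int = 0) -> str:
    """Intercept N words from the start, the end, or a designated middle position."""
    if n < 0 or start < 0:
        raise ValueError("n and start must be non-negative")
    words = text.split()
    if mode == "first":
        return " ".join(words[:n])
    if mode == "last":
        # words[-0:] would return every word, so handle n == 0 explicitly.
        return " ".join(words[-n:]) if n else ""
    if mode == "middle":
        return " ".join(words[start:start + n])
    raise ValueError(f"unknown mode: {mode}")


title = "Entry generation method and device"
first_two = intercept(title, 2)                           # "Entry generation"
last_two = intercept(title, 2, mode="last")               # "and device"
middle_two = intercept(title, 2, mode="middle", start=1)  # "generation method"
```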
In one example, the document type may be a table, and each row of the table may be a structure. Accordingly, each structure may include a plurality of cells, each of which may be a member of the structure. The fields corresponding to the cells may be preset. For example, the field corresponding to cell 1 is a title describing the information content, and the field corresponding to cell 2 is a brief introduction or abstract describing the information content. The field corresponding to the cell 3 is a text content or the like describing the information content.
Thus, in the above example, the different constituent parts of the information content in a structure can be extracted from the fields corresponding to the respective members of the structure. Taking a two-dimensional table as an example, each row may be a structure, and the entry elements corresponding to each row are extracted from that row. Assume that the field corresponding to column 1 records a title (illustratively, the identification of column 1 may be set to "title"); the field corresponding to column 2 records a profile or abstract (illustratively, the identification of column 2 is set to "profile" or "abstract"); the field corresponding to column 3 records the text content describing the information content (illustratively, the identification of column 3 is set to "text"); the field corresponding to column 4 records the link of a picture (illustratively, the identification of column 4 is set to "picture link"); and the field corresponding to column 5 records the link of an associated document (illustratively, the identification of column 5 is set to "related content link"). For each row, the title is extracted from column 1, and the entry name is then intercepted from the title according to a preset extraction rule; the profile or abstract is extracted from column 2, and the paraphrase is extracted from it; the text content is extracted from column 3, and examples or related content are extracted from it according to preset extraction rules; the picture link is extracted from column 4; and the link corresponding to the related content is extracted from column 5. The entry elements extracted from each row are then associated with one another, thereby obtaining the entry corresponding to that row.
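The row-by-row extraction can be illustrated with a table held as a list of dictionaries keyed by column identification. This is a sketch under assumptions: the column identifications come from the example above, while the target field names and the choice to copy whole cells (rather than apply an interception rule) are simplifications introduced here.

```python
from typing import Dict, List

# Column identifications from the example, mapped to hypothetical entry fields.
COLUMN_TO_FIELD = {
    "title": "name",
    "profile": "paraphrase",
    "text": "related_content",
    "picture link": "picture_link",
    "related content link": "associated_document_link",
}


def row_to_entry(row: Dict[str, str]) -> Dict[str, str]:
    """Collect the entry elements of one row (one structure) into one entry."""
    entry: Dict[str, str] = {}
    for column, field_name in COLUMN_TO_FIELD.items():
        value = row.get(column, "").strip()
        if value:
            entry[field_name] = value
    return entry


def table_to_entries(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Generate one entry per row, skipping rows that have no title."""
    entries = [row_to_entry(row) for row in rows]
    return [entry for entry in entries if "name" in entry]


rows = [
    {"title": "knowledge base", "profile": "A set of documents.", "picture link": "img://1"},
    {"title": "", "profile": "Row without a title."},
]
entries = table_to_entries(rows)
```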
In some embodiments, the document is a newly created document. The step S101 includes: receiving information sent by the document server in response to the creation of the new document. The information indicates that the entry corresponding to the new document is to be generated.
In these embodiments, the document server may send the information to the entry server each time a new document is created on it. The entry server then generates the entry of the newly created document according to the information. In this way, the entry corresponding to the newly created document is generated in real time when the document server stores the newly created document.
In some embodiments, after the entry corresponding to a document is generated, marking information indicating that the entry has been generated may be set for the document. The document server can traverse the document set at preset time intervals to detect whether each document has the marking information, and determine each document without the marking information as a document for which an entry is to be generated. Such a document can be sent to the entry server, which generates the corresponding entry.
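One pass of the periodic traversal could be sketched as below. The `entry_generated` flag, the `scan_once` name, and the callback that returns `True` once the entry server has generated the entry are assumptions; scheduling the pass at a preset interval is left to whatever timer mechanism the deployment uses.

```python
from typing import Callable, Dict, List


def find_unmarked(documents: Dict[str, dict]) -> List[str]:
    """Return the ids of documents that carry no 'entry generated' marking."""
    return [doc_id for doc_id, doc in documents.items()
            if not doc.get("entry_generated")]


def scan_once(documents: Dict[str, dict],
              send_to_entry_server: Callable[[str, dict], bool]) -> List[str]:
    """One traversal: send unmarked documents and mark those that were processed."""
    sent: List[str] = []
    for doc_id in find_unmarked(documents):
        # The callback is assumed to return True once the entry has been generated.
        if send_to_entry_server(doc_id, documents[doc_id]):
            documents[doc_id]["entry_generated"] = True
            sent.append(doc_id)
    return sent


docs = {
    "a1": {"content": "already processed", "entry_generated": True},
    "a2": {"content": "new content"},
}
first_pass = scan_once(docs, lambda doc_id, doc: True)
second_pass = scan_once(docs, lambda doc_id, doc: True)
```

Marking a document only after the callback succeeds means a failed send is retried on the next pass.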
Referring to fig. 2, another schematic flow chart of the entry generation method provided by the present disclosure is shown. As shown in fig. 2, the term generation method includes the steps of:
S201: Receive information of the document used for generating the entry, sent by the document server.
S202: entry elements are extracted from the document.
S203: the term is generated based on the term element.
The specific implementation of the steps S201 to S203 may refer to the steps S101 to S103 in the embodiment shown in fig. 1, which is not described herein.
S204: Establish an association relation between the entry and the document.
S205: Update the document and the entry in a linked manner according to the association relation.
For a document and the entries generated from it, an association between the document and the corresponding entries may be established. The association relation may be stored in the document server or in the entry server. It may also be stored in another electronic device that is communicatively connected to the document server and/or the entry server.
In one example, one-way linkage updating from document to term may be implemented according to the above-described association. For example, when a document is updated, the update may be synchronized to the term according to the above-described association relationship to update the term. Or a one-way linkage update from entry to document is implemented, for example, when an entry is updated, the update may be synchronized to the document to update the document.
In another example, the two-way linkage update between the entry and the document may be implemented according to the association relationship described above. For example, when any one of the entry or document is updated, the update may be synchronized to the other of the entry or document to update the other.
As a schematic illustration, the association between the document and the entry is saved at the document server. Assume that an entry A2 is generated from the document A1. If the profile or abstract information in the document A1 is updated, the document server may send update request information to the entry server. The update request information includes association between the document A1 and the entry A2, and profile or abstract information after the document A1 is changed. After the entry server receives the update request information, determining an entry A2 associated with the A1 according to the association relation; and updating corresponding content (such as term paraphrasing) in the term A2 according to the abstract or profile information updated by the document A1. Similarly, if the user updates the content of the term A2, for example, updates a link of the associated document of the term A2, the term server may send update request information to the document server, where the update request information includes information of the term A2 and the link information. The document server side can search the document information with the association relation with the entry A2 according to the information of the entry A2, and determine the document corresponding to the entry as the document A1. The document server may then modify or add links to the associated document in document A1.
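The A1/A2 example can be sketched with a single in-process object standing in for both servers. This is an illustration only: the `LinkedStore` class, its field names, and the direct dictionary updates replace the update request messages that the two servers would exchange in the described scheme.

```python
from typing import Dict


class LinkedStore:
    """In-process stand-in for the document server and entry server (assumption)."""

    def __init__(self) -> None:
        self.documents: Dict[str, Dict[str, str]] = {}
        self.entries: Dict[str, Dict[str, str]] = {}
        self.doc_to_entry: Dict[str, str] = {}
        self.entry_to_doc: Dict[str, str] = {}

    def associate(self, doc_id: str, entry_id: str) -> None:
        """Record the association relation in both directions."""
        self.doc_to_entry[doc_id] = entry_id
        self.entry_to_doc[entry_id] = doc_id

    def update_document_summary(self, doc_id: str, summary: str) -> None:
        """Document-to-entry linkage: a new summary becomes the entry paraphrase."""
        self.documents[doc_id]["summary"] = summary
        entry_id = self.doc_to_entry.get(doc_id)
        if entry_id is not None:
            self.entries[entry_id]["paraphrase"] = summary

    def update_entry_link(self, entry_id: str, link: str) -> None:
        """Entry-to-document linkage: a new associated link is written to the document."""
        self.entries[entry_id]["associated_link"] = link
        doc_id = self.entry_to_doc.get(entry_id)
        if doc_id is not None:
            self.documents[doc_id]["associated_link"] = link


linked = LinkedStore()
linked.documents["A1"] = {"summary": "old summary"}
linked.entries["A2"] = {"paraphrase": "old summary"}
linked.associate("A1", "A2")
linked.update_document_summary("A1", "new summary")
linked.update_entry_link("A2", "doc://related")
```

Using only one of the two update methods corresponds to one-way linkage; using both gives the two-way linkage.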
In one example, a document includes a plurality of juxtaposed structures therein. Each structure may generate a corresponding entry. The establishing the association relation between the entry and the document comprises the following steps: and establishing an association relationship between the entry and a structural body generating the entry in the document.
In particular, a unique structure identifier for a structure in a document collection may be formed from a document identifier and an identifier for the structure in the document. For the term generated by the structure, an association between the term name and the structure identifier may be established.
Take a two-dimensional table as an example of a document that includes a plurality of parallel structures. The document has a corresponding document identifier, and each row may be a structure with its own row identifier. The structure identifier of the structure corresponding to each row, unique within the document set, can be determined from the document identifier and the row identifier. The structure identifier may then be associated with the entry name corresponding to the structure, and the association relation may be stored.
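A structure identifier built from a document identifier and a row identifier can be sketched as a tuple key. The `StructureId` type, the module-level `associations` mapping, and the sample identifiers are hypothetical.

```python
from typing import Dict, NamedTuple, Optional


class StructureId(NamedTuple):
    """Identifies a row uniquely within the document set."""
    document_id: str
    row_id: str


# Association relation: structure identifier -> entry name.
associations: Dict[StructureId, str] = {}


def associate(document_id: str, row_id: str, entry_name: str) -> StructureId:
    """Store the association between a structure and the entry generated from it."""
    key = StructureId(document_id, row_id)
    associations[key] = entry_name
    return key


def entry_for(document_id: str, row_id: str) -> Optional[str]:
    """Look up the entry name associated with a given row, if any."""
    return associations.get(StructureId(document_id, row_id))


associate("doc-1", "row-3", "knowledge base")
associate("doc-2", "row-3", "dictionary")
```

Because the key combines both identifiers, the same row identifier in two different documents maps to two different entries.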
The one-way linkage update or the two-way linkage update between the structural body and the entry can be realized according to the association relation.
Compared with the embodiment shown in fig. 1, the present embodiment increases the establishment of the association relationship between the document and the corresponding entry, and realizes the linkage update between the document and the entry according to the association relationship. Thereby further reducing the maintenance costs of documents and entries.
Furthermore, the dictionary application providing the vocabulary entries and the knowledge base application providing the documents may be integrated in the same integrated application. Since both the entry and the document bear knowledge points, the entry and the document with the association relationship bear descriptions of the same knowledge point. Under different scenes using the integrated application, the user may find the bearing form of the entry or the bearing form of the document for the same knowledge point. Through the two-way linkage updating of the two, a user can acquire the latest knowledge content aiming at the same knowledge point no matter which bearing form is searched, and the consistency of the knowledge content in the integrated application is ensured.
Referring to fig. 3, another schematic flow chart of the entry generation method provided by the present disclosure is shown. As shown in fig. 3, the term generating method includes the steps of:
s301: a first document that requires the generation of an entry is determined.
In this embodiment, the execution subject of the term generation method may be a document server. The document server may determine the first document currently requiring entry generation from a collection of documents stored therein or from an electronic device with which the collection of documents is communicatively coupled.
The first document is any document in the document set.
For a document in the document set, if the entry corresponding to the document has been generated, an association relation between the entry and the document may be established, or marking information indicating that the entry has been generated may be added to the document. The document server can determine the first document that needs an entry in the document set according to the association relation or the marking information. As an implementation, the document server may determine the first document in the document set once every preset period of time.
It is understood that the first document may include at least one document for which an entry is to be generated.
In some embodiments, the first document may be a newly created document of the document server. The document service end can determine the newly created document of the local end as the first document of the entry to be generated, so that the entry corresponding to the newly created document is generated in real time when the document is newly created.
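As a minimal, non-limiting sketch of step S301, the selection of first documents from tagging information might look as follows. The `Document` class, its field names, and the `entry_generated` flag are assumptions introduced only for illustration; the disclosure does not prescribe a data model.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Document:
    """Hypothetical document record held by the document server."""
    doc_id: str
    title: str
    body: str
    # Assumed tagging information: True once an entry has been generated.
    entry_generated: bool = False


def select_first_documents(documents: Iterable[Document]) -> List[Document]:
    """Return the documents that still need an entry to be generated."""
    return [doc for doc in documents if not doc.entry_generated]
```

A document server could call `select_first_documents` on its document set once every preset time period, or on a single newly created document.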
S302: the information of the first document is sent to an entry server, and the entry server executes the following entry generation operation: entry elements are extracted from the first document, and entries are generated based on the entry elements.
After determining the first document, the document server may send the document information of the first document to the entry server, and the entry server performs the entry generation operation.
For how the entry server generates entries, reference may be made to the relevant description of the embodiments shown in fig. 1 or fig. 2, which is not repeated here.
In some embodiments, the first document is a document authorized by a user with a preset authority to generate an entry.
The user having the preset authority may be a user who creates a document, or other user having the management authority for the document.
The user with the preset authority can perform the authorization operation on the document, so that the document server takes the document as a first document needing to generate the entry according to the authorization information corresponding to the authorization operation.
Illustratively, the user with the preset authority may authorize the document according to a preset interactive window provided by the document application.
Turning to fig. 4A and 4B, fig. 4A shows a schematic diagram of a document display interface. FIG. 4B illustrates a schematic diagram of authorizing the generation of terms for a document.
As shown in fig. 4A, a document is displayed in a display interface of the user terminal device. The document includes a document title 1, a brief introduction, a body text, and the like. An edit control 41 for editing the document can also be displayed in the display interface. The user may click on the edit control 41 to display a plurality of editing options for editing the document. The editing options include an "add document application" editing option 42, which is used to authorize the document to other applications.
With continued reference to FIG. 4B, the user may perform a selection operation on the "add document application" editing option 42, thereby displaying a document application list 43. The document application list may include at least one application to which the document may be authorized. The dictionary application 44 may be displayed in the document application list 43. The user performs a selection operation on the dictionary application 44, thereby completing the authorization to generate an entry from the document.
The terminal device may send the authorization information to a document server, and the document server may use a document with a title of "document title 1" as the first document of the entry to be generated.
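The authorization flow described above can be sketched as follows, assuming a simple in-memory record. The names `DocumentRecord`, `authorize`, and `needs_entry`, and the application identifier `"dictionary"`, are hypothetical and are used only to illustrate the check of the preset authority.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class DocumentRecord:
    """Hypothetical record kept by the document server."""
    title: str
    owner: str                                          # user who created the document
    managers: Set[str] = field(default_factory=set)     # other users with management authority
    authorized_apps: Set[str] = field(default_factory=set)


def authorize(doc: DocumentRecord, user: str, app: str) -> bool:
    """Record an authorization only if the user has the preset authority."""
    if user != doc.owner and user not in doc.managers:
        return False
    doc.authorized_apps.add(app)
    return True


def needs_entry(doc: DocumentRecord, app: str = "dictionary") -> bool:
    """A document authorized to the dictionary application is a first document."""
    return app in doc.authorized_apps
```

In this sketch the authority check happens on the server side, so an authorization request from a user without the preset authority leaves the document unchanged.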
In this embodiment, a first document for which an entry needs to be generated is determined, and the information of the first document is sent to an entry server, which performs the following entry generation operation: extracting entry elements from the first document and generating an entry based on the entry elements. Entries are thus generated automatically from the documents of the document server without requiring users to create them manually, which reduces both the workload of creating entries and the labor cost of maintaining them.
Referring to fig. 5, a schematic block diagram of an entry generation system is shown. As shown in fig. 5, the entry generation system 50 includes a document server 51 and an entry server 52.
The document server 51 and the entry server 52 are communicatively connected in a wired or wireless manner.
The document server 51 is configured to traverse a plurality of documents in the document set to determine a first document for which an entry needs to be generated, and to send the document information of the first document to the entry server.
The entry server 52 is configured to receive the information of the first document sent by the document server, extract entry elements from the first document, and generate an entry based on the entry elements.
For the operations performed by the entry server and the document server when generating the entry corresponding to a document, reference may be made to the relevant parts of the embodiments shown in fig. 1, fig. 2 and fig. 3, which are not repeated here.
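The interaction between the two servers can be illustrated with the following sketch, in which a direct method call stands in for the wired or wireless communication. The class names, the dictionary keys `doc_id`, `title` and `body`, and the extraction rule (title as entry name, first non-empty body line as paraphrase) are all assumptions made for illustration rather than the method of the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Entry:
    name: str
    paraphrase: str
    associated_document: str


class EntryServer:
    """Receives document information, extracts entry elements, generates entries."""

    def __init__(self) -> None:
        self.entries: Dict[str, Entry] = {}

    def receive(self, doc_info: Dict[str, str]) -> Entry:
        # Assumed rule: title -> entry name, first non-empty body line -> paraphrase.
        lines = [ln.strip() for ln in doc_info["body"].splitlines() if ln.strip()]
        paraphrase = lines[0] if lines else ""
        entry = Entry(doc_info["title"], paraphrase, doc_info["doc_id"])
        self.entries[entry.name] = entry
        return entry


class DocumentServer:
    """Traverses the document set and forwards documents that lack an entry."""

    def __init__(self, entry_server: EntryServer, documents: List[Dict[str, str]]) -> None:
        self.entry_server = entry_server
        self.documents = documents
        self.has_entry: Set[str] = set()   # assumed tagging information

    def run_once(self) -> List[Entry]:
        generated = []
        for doc in self.documents:
            if doc["doc_id"] in self.has_entry:
                continue
            generated.append(self.entry_server.receive(doc))
            self.has_entry.add(doc["doc_id"])
        return generated
```

A second call to `run_once` on an unchanged document set returns an empty list, because every document has already been tagged as having an entry.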
Corresponding to the entry generation method of the embodiment shown in fig. 1 above, fig. 6 is a block diagram of an entry generation apparatus provided in an embodiment of the present disclosure. For ease of illustration, only the portions relevant to the embodiments of the present disclosure are shown. Referring to fig. 6, the apparatus 60 includes a receiving unit 601, an extraction unit 602, and a generating unit 603, wherein:
a receiving unit 601, configured to receive information of a document for generating an entry sent by a document server;
an extraction unit 602, configured to extract term elements from a document;
the generating unit 603 is configured to generate an entry based on the entry element.
In an embodiment of the present disclosure, the apparatus 60 further comprises an updating unit (not shown in the figure) for:
establishing an association relationship between the entry and the document; and
performing linked updating of the document and the entry according to the association relationship.
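One way to picture the association relationship and the linked updating is the sketch below. It assumes, purely for illustration, that the shared knowledge content is a single text value mirrored between a document and its entry; the class `LinkedStore` and its methods are hypothetical.

```python
from typing import Dict


class LinkedStore:
    """Hypothetical store that keeps associated documents and entries in sync."""

    def __init__(self) -> None:
        self.documents: Dict[str, str] = {}      # doc_id -> knowledge text
        self.entries: Dict[str, str] = {}        # entry name -> knowledge text
        self.doc_to_entry: Dict[str, str] = {}   # association, document side
        self.entry_to_doc: Dict[str, str] = {}   # association, entry side

    def associate(self, doc_id: str, entry_name: str) -> None:
        self.doc_to_entry[doc_id] = entry_name
        self.entry_to_doc[entry_name] = doc_id

    def update_document(self, doc_id: str, text: str) -> None:
        """Update a document and propagate the change to its associated entry."""
        self.documents[doc_id] = text
        entry_name = self.doc_to_entry.get(doc_id)
        if entry_name is not None:
            self.entries[entry_name] = text

    def update_entry(self, entry_name: str, text: str) -> None:
        """Update an entry and propagate the change to its associated document."""
        self.entries[entry_name] = text
        doc_id = self.entry_to_doc.get(entry_name)
        if doc_id is not None:
            self.documents[doc_id] = text
```

Keeping the association in both directions lets either side look up its counterpart directly, which is what makes the update bidirectional in this sketch.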
In one embodiment of the present disclosure, the extraction unit 602 is further configured to: entry elements for generating entries are extracted from the document based on extraction rules that match the document.
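Rule-based extraction of this kind can be sketched as a lookup from a document type to an extraction function. The two document types, the rule names, and the regular expression below are assumptions chosen for illustration; the disclosure does not specify concrete rules.

```python
import re
from typing import Callable, Dict


def extract_title_intro(text: str) -> Dict[str, str]:
    """Assumed rule: first non-empty line is the name, second is the paraphrase."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    name = lines[0] if lines else ""
    paraphrase = lines[1] if len(lines) > 1 else ""
    return {"name": name, "paraphrase": paraphrase}


def extract_definition(text: str) -> Dict[str, str]:
    """Assumed rule: the first 'name: paraphrase' line defines the entry."""
    match = re.search(r"^(?P<name>[^:\n]+):\s*(?P<paraphrase>.+)$", text, re.MULTILINE)
    if match is None:
        return {"name": "", "paraphrase": ""}
    return {
        "name": match.group("name").strip(),
        "paraphrase": match.group("paraphrase").strip(),
    }


# Hypothetical mapping from document type to the matching extraction rule.
RULES: Dict[str, Callable[[str], Dict[str, str]]] = {
    "title_intro": extract_title_intro,
    "definition": extract_definition,
}


def extract_elements(doc_type: str, text: str) -> Dict[str, str]:
    """Apply the extraction rule that matches the document type."""
    rule = RULES.get(doc_type)
    if rule is None:
        raise ValueError(f"no extraction rule matches document type {doc_type!r}")
    return rule(text)
```

New document layouts can be supported by registering another function in `RULES` without changing `extract_elements`.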
In one embodiment of the present disclosure, the extraction unit 602 is further configured to: input the document into a pre-trained entry element extraction model, which outputs the entry elements; the entry element extraction model is used for extracting entry elements from an input document.
In one embodiment of the present disclosure, the receiving unit 601 is further configured to: receive information sent by the document server in response to a newly created document, the information indicating that an entry corresponding to the new document is to be generated.
In one embodiment of the present disclosure, the document includes a plurality of juxtaposed structures; the extraction unit 602 is further configured to: for each structure, extract the entry element corresponding to the structure from the structure; and the generating unit 603 is configured to: generate the entry corresponding to each structure according to the entry element corresponding to that structure.
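The per-structure generation can be illustrated as follows, assuming that the juxtaposed structures are blocks of text separated by blank lines, each with a heading line followed by body lines. That layout, and the function names, are assumptions for the sake of the example.

```python
from typing import Dict, List, Tuple


def split_structures(text: str) -> List[Tuple[str, str]]:
    """Split a document into (heading, body) pairs, one per blank-line-separated block."""
    structures = []
    for block in text.split("\n\n"):
        lines = [ln.strip() for ln in block.strip().splitlines() if ln.strip()]
        if not lines:
            continue
        structures.append((lines[0], " ".join(lines[1:])))
    return structures


def generate_entries(text: str) -> List[Dict[str, str]]:
    """Generate one entry per structure from that structure's own elements."""
    return [
        {"name": heading, "paraphrase": body}
        for heading, body in split_structures(text)
    ]
```

A document that lists several terms side by side therefore yields several entries, each built only from the structure it came from.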
In one embodiment of the present disclosure, the term elements include at least one of:
term name, paraphrasing, associated document, related picture.
Corresponding to the entry generation method of the embodiment shown in fig. 3 above, fig. 7 is a block diagram of an entry generation apparatus provided in an embodiment of the present disclosure. For ease of illustration, only the portions relevant to the embodiments of the present disclosure are shown. Referring to fig. 7, the apparatus 70 includes a determining unit 701 and a sending unit 702, wherein:
a determining unit 701, configured to determine a first document in which an entry needs to be generated;
a sending unit 702, configured to send information of the first document to an entry server, where the entry server performs the following entry generation operations: entry elements are extracted from the first document, and entries are generated based on the entry elements.
In one embodiment of the present disclosure, the first document is a document authorized by a user having a preset authority to generate an entry.
In order to achieve the above embodiments, the embodiments of the present disclosure further provide an electronic device.
Referring to fig. 8, there is shown a schematic structural diagram of an electronic device 800 suitable for use in implementing embodiments of the present disclosure, which electronic device 800 may be a terminal device or a server. The terminal device may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a personal digital assistant (Personal Digital Assistant, PDA for short), a tablet (Portable Android Device, PAD for short), a portable multimedia player (Portable Media Player, PMP for short), an in-vehicle terminal (e.g., an in-vehicle navigation terminal), and the like, and a fixed terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 8 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 8, the electronic device 800 may include a processing device (e.g., a central processor, a graphics processor, etc.) 801 that may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage device 808 into a random access memory (Random Access Memory, RAM) 803. In the RAM 803, various programs and data required for the operation of the electronic device 800 are also stored. The processing device 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
In general, the following devices may be connected to the I/O interface 805: input devices 806 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 807 including, for example, a liquid crystal display (Liquid Crystal Display, LCD for short), a speaker, a vibrator, and the like; storage 808 including, for example, magnetic tape, hard disk, etc.; communication means 809. The communication means 809 may allow the electronic device 800 to communicate wirelessly or by wire with other devices to exchange data. While fig. 8 shows an electronic device 800 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via communication device 809, or installed from storage device 808, or installed from ROM 802. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 801.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program (computer-executable instructions) for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the methods shown in the above-described embodiments.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (Local Area Network, LAN for short) or a wide area network (Wide Area Network, WAN for short), or it may be connected to an external computer (e.g., connected via the internet using an internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The name of the unit does not constitute a limitation of the unit itself in some cases, and for example, the receiving unit may also be described as "a unit that receives information of a document for generating an entry sent by a document server".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description is only of the preferred embodiments of the present disclosure and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the disclosure is not limited to technical solutions formed by the specific combinations of features described above, but also covers other technical solutions formed by any combination of the features described above or their equivalents without departing from the spirit of the disclosure, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (14)

1. A method of generating an entry, comprising:
receiving information of a document used for generating an entry and sent by a document server;
extracting term elements from the document;
generating a term based on the term element.
2. The method according to claim 1, wherein the method further comprises:
establishing an association relationship between the entry and the document;
and performing linked updating of the document and the entry according to the association relationship.
3. The method of claim 1, wherein the extracting term elements from the document comprises:
and extracting entry elements for generating entries from the document based on extraction rules matched with the document.
4. The method of claim 1, wherein the extracting term elements from the document comprises:
inputting the document into a pre-trained vocabulary entry element extraction model, and outputting vocabulary entry elements by the vocabulary entry element extraction model; the term element extraction model is used for extracting term elements from an input document.
5. The method of claim 1, wherein the receiving the information of the document for generating the term sent by the document server includes:
receiving information sent by the document server in response to a newly created document, wherein the information is used for indicating generation of the entry corresponding to the new document.
6. The method of claim 1, wherein the document comprises a plurality of juxtaposed structures; and said extracting term elements from said document, comprising:
for each structure, extracting an entry element corresponding to the structure from the structure; and
the generating the entry based on the entry element includes:
and generating the vocabulary entry corresponding to each structure according to the vocabulary entry element corresponding to the structure.
7. The method of any one of claims 1 to 6, wherein the term elements include at least one of:
term name, paraphrasing, associated document, related picture.
8. A method of generating an entry, comprising:
determining a first document needing to generate an entry;
the information of the first document is sent to an entry server, and the entry server executes the following entry generation operation: extracting term elements from the first document, and generating terms based on the term elements.
9. The method of claim 8, wherein the first document is a document authorized to generate terms by a user having a preset authority.
10. A term generation system, comprising: a document service end and a term service end; wherein
the document service end is used for traversing a plurality of documents in the document set to detect a first document needing to generate an entry; transmitting the information of the first document to an entry server;
the term server is used for receiving the information of the first document sent by the document server and extracting term elements from the first document; generating a term based on the term element.
11. A term generating device, comprising:
the receiving unit is used for receiving the information of the document used for generating the entry and sent by the document server;
an extraction unit for extracting entry elements from the document;
and the generating unit is used for generating the entry based on the entry element.
12. A term generating device, comprising:
a determining unit for determining a first document in which an entry is required to be generated;
the sending unit is used for sending the information of the first document to the vocabulary entry server, and the vocabulary entry server executes the following vocabulary entry generating operation: entry elements are extracted from the first document, and entries are generated based on the entry elements.
13. An electronic device, comprising: a processor and a memory;
the memory stores computer-executable instructions;
the processor executing computer-executable instructions stored in the memory, causing the processor to perform the entry generation method of any one of claims 1 to 9.
14. A computer readable storage medium having stored therein computer executable instructions which, when executed by a processor, implement the entry generation method of any one of claims 1 to 9.
CN202310645305.7A 2023-06-01 2023-06-01 Entry generation method and device and electronic equipment Pending CN116663529A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310645305.7A CN116663529A (en) 2023-06-01 2023-06-01 Entry generation method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN116663529A true CN116663529A (en) 2023-08-29

Family

ID=87722036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310645305.7A Pending CN116663529A (en) 2023-06-01 2023-06-01 Entry generation method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN116663529A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination