WO2022184048A1 - Method, apparatus, terminal and storage medium for generating document tags - Google Patents

Method, apparatus, terminal and storage medium for generating document tags

Info

Publication number
WO2022184048A1
WO2022184048A1 PCT/CN2022/078560 CN2022078560W WO2022184048A1 WO 2022184048 A1 WO2022184048 A1 WO 2022184048A1 CN 2022078560 W CN2022078560 W CN 2022078560W WO 2022184048 A1 WO2022184048 A1 WO 2022184048A1
Authority
WO
WIPO (PCT)
Prior art keywords
document
tag
content
generating
candidate
Prior art date
Application number
PCT/CN2022/078560
Other languages
English (en)
French (fr)
Inventor
林诗苑
蔡煜圳
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Priority to US18/279,378 (US20240184973A1)
Publication of WO2022184048A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/103: Formatting, i.e. changing of presentation of documents
    • G06F40/117: Tagging; Marking up; Designating a block; Setting of attributes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/166: Editing, e.g. inserting or deleting
    • G06F40/169: Annotation, e.g. comment data or footnotes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/907: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/907: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/908: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048: Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481: Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance

Definitions

  • The present disclosure relates to the field of information technology, and in particular to a method and apparatus, a terminal, and a storage medium for generating a document tag.
  • In the process of reading or editing a document, some keywords appear that can be used to classify and organize the document content, but this keyword information (tags) cannot be provided during the editing of the current document.
  • In addition, a tag is just a custom field value and cannot be extended in any way to serve more application scenarios.
  • To solve the existing problems, the present disclosure provides a method and apparatus, a terminal, and a storage medium for generating a document tag.
  • The present disclosure adopts the following technical solutions.
  • An embodiment of the present disclosure provides a method for generating a document tag, the method comprising: providing a candidate tag for a first document based on the content of the first document; and, in response to a first preset operation on the candidate tag, generating a tag corresponding to the first document; wherein the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag.
  • Another embodiment of the present disclosure provides an apparatus for generating a document tag.
  • The apparatus includes: a candidate tag providing module configured to provide a candidate tag for a first document based on the content of the first document; and a tag generating module configured to generate a tag corresponding to the first document in response to a first preset operation on the candidate tag; wherein the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag.
  • The present disclosure provides a terminal, comprising: at least one memory and at least one processor; wherein the memory is used for storing program code, and the processor is used for calling the program code stored in the memory to execute the above-described method for generating a document tag.
  • The present disclosure provides a storage medium for storing program code for executing the above-described method for generating a document tag.
  • In some embodiments of the present disclosure, the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag. Tags are thereby clustered in the second document associated with the tag, and the document content corresponding to the tag can be viewed in the second document, which achieves automatic classification and assisted organization, improves the aggregation of the knowledge graph, and increases the value of this document structure.
  • FIG. 1 is a flowchart of a method for generating a document tag according to an embodiment of the present disclosure.
  • FIGS. 2A and 2B illustrate schematic diagrams of a second document corresponding to a tag of some embodiments of the present disclosure.
  • FIG. 3 is a partial module diagram of an apparatus for generating a document tag according to another embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • The term "including" and variations thereof are open-ended inclusions, i.e., "including but not limited to".
  • The term "based on" means "based at least in part on".
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • documents may include, but are not limited to, online documents.
  • As shown in FIG. 1, an embodiment of the present disclosure provides a flowchart of a method for generating a document tag.
  • The method for generating a document tag of the present disclosure may include:
  • Step S101: providing candidate tags for the first document based on the content of the first document.
  • Candidate tags may be recommended for the first document.
  • Tags are keywords that indicate the content the document may involve.
  • The candidate tags of the present disclosure are offered to the user for selection.
  • Candidate tags can be provided for the first document either while the first document is being edited or after it has been edited (e.g., while the first document is being read).
  • The method for generating a document tag of the present disclosure may further include:
  • Step S102: in response to a first preset operation on the candidate tag, generating a tag corresponding to the first document.
  • The first preset operation includes a click operation, a voice command, and the like.
  • There may be one or more candidate tags. For example, after candidate tags are provided for the first document, if the user clicks one of them, the clicked candidate tag is considered adopted; that is, a tag corresponding to the first document is generated.
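
Steps S101 and S102 can be illustrated with a minimal sketch. This is not the disclosed implementation: the `Document` class, the function names, and the frequency-based candidate selection are assumptions introduced here purely for illustration.

```python
from __future__ import annotations

import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical document model: a title, text content, and adopted tags."""
    title: str
    content: str
    tags: set[str] = field(default_factory=set)


def provide_candidate_tags(doc: Document, limit: int = 3) -> list[str]:
    """Step S101 (sketch): derive candidate tags from the document content.

    Assumption: the most frequent words of four or more letters stand in
    for a real keyword extractor.
    """
    words = re.findall(r"[a-z]{4,}", doc.content.lower())
    return [word for word, _ in Counter(words).most_common(limit)]


def on_candidate_clicked(doc: Document, candidate: str) -> str:
    """Step S102 (sketch): a click on a candidate adopts it as a tag."""
    doc.tags.add(candidate)
    return candidate


doc = Document("Notes", "Reading notes on graphs. Graphs link notes to notes.")
candidates = provide_candidate_tags(doc)
tag = on_candidate_clicked(doc, candidates[0])
```

Here the click handler stands in for the "first preset operation"; a voice command or any other trigger could call the same function.
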
  • the tags are associated with a second document that includes specific content corresponding to the tags. Unlike keywords or tags of general documents, tags of the present disclosure may be associated with a second document, or the tags may be considered to correspond to links to the second document.
  • the content of the content block (or block) corresponding to the tag in the first document can be displayed, and the position where the tag appears in the block content can be highlighted.
  • a document may be composed of corresponding content blocks (or blocks), and a content block is a part of a document and may carry various data types, and a document may be considered to be composed of one or more content blocks. As shown in FIG.
  • The document to which the tag is applied and the corresponding content are included, and the block of the document in which the tag appears can be indicated.
  • Tags are clustered in the second document associated with the tag, and the document content corresponding to the tag can be viewed in the second document, which achieves automatic classification and assisted organization, improves the aggregation of the knowledge graph, and increases the value of this document structure.
  • Providing candidate tags for the first document based on the content of the first document includes: extracting keywords of the content, and providing candidate tags for the first document based on the extracted keywords or on network keywords related to the extracted keywords.
  • the keyword algorithm capabilities of artificial intelligence (AI) are utilized to extract document keywords.
  • External entity words may be used as a filter for word segmentation, and the results recommended to users as candidate tags. External entity words are public keywords commonly found in Baidu Encyclopedia and Wikipedia. Matching and filtering the segmented content against these public entries can greatly improve the accuracy of the candidate tags recommended for different types of document content, so as to cover a variety of scenarios.
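
The entity-word filter can be sketched as follows. The vocabulary `ENTITY_WORDS`, the n-gram segmentation, and the function names are hypothetical; a real system would presumably use a proper word segmenter (especially for Chinese text) and a vocabulary loaded from public encyclopedia entries.

```python
import re

# Hypothetical stand-in for a public entity vocabulary (for example,
# encyclopedia entry titles). A real system would load this externally.
ENTITY_WORDS = {"knowledge graph", "machine learning", "python"}


def segment(text: str, max_len: int = 2) -> list[str]:
    """Naive segmentation: all word n-grams up to max_len words long."""
    words = re.findall(r"[a-z]+", text.lower())
    grams = []
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            grams.append(" ".join(words[i:i + n]))
    return grams


def candidate_tags(text: str, entities: set[str] = ENTITY_WORDS) -> list[str]:
    """Keep only segments that match a known entity word, without duplicates."""
    seen: dict[str, None] = {}
    for gram in segment(text):
        if gram in entities:
            seen.setdefault(gram, None)
    return list(seen)


tags = candidate_tags("A knowledge graph built with Python and more Python.")
```

Filtering against a fixed vocabulary trades recall for precision: segments that are not known entities are never recommended, which keeps noise words out of the candidate list.
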
  • providing candidate tags for the first document based on the content of the first document includes providing candidate tags for the first document based on tags of the third document when the first document references the third document.
  • referencing the third document may mean presenting in the second document in any suitable form such as a link/preview in the second document. If the third document is referenced, and the third document has been tagged, the tag of the third document can also be recommended as a candidate tag of the first document. In this way, the range of recommended candidate tags is expanded, in addition, the correlation between the first document and the third document it refers to is enhanced, and the network of tag clustering is expanded.
  • Providing the candidate tag for the first document based on the content of the first document includes: when the first document references a content block of a fourth document, providing candidate tags for the first document based on the tag of the fourth document corresponding to that content block. If a content block of the fourth document is referenced, and the content of that block contains tags that the fourth document has marked, these tags can also be recommended as candidate tags of the first document. In this way, the range of recommended candidate tags is expanded; in addition, the correlation between the first document and the content blocks of the fourth document that it references is enhanced, and the network of tag clustering is expanded.
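
Collecting candidates from a referenced document and from a referenced content block can be sketched as below. The `Block` and `Doc` classes and their fields are assumptions made for this example; excluding tags the first document already uses is likewise a design choice of the sketch, not something the text specifies.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical content block: text plus the tags marked inside it."""
    text: str
    tags: list[str] = field(default_factory=list)


@dataclass
class Doc:
    """Hypothetical document: blocks plus references to other docs and blocks."""
    title: str
    blocks: list[Block] = field(default_factory=list)
    referenced_docs: list[Doc] = field(default_factory=list)
    referenced_blocks: list[Block] = field(default_factory=list)

    def tags(self) -> list[str]:
        """All tags adopted anywhere in this document, in order, deduplicated."""
        return list(dict.fromkeys(t for b in self.blocks for t in b.tags))


def candidates_from_references(doc: Doc) -> list[str]:
    """Recommend tags of referenced documents and of referenced blocks."""
    own = set(doc.tags())
    pool = [t for ref in doc.referenced_docs for t in ref.tags()]
    pool += [t for block in doc.referenced_blocks for t in block.tags]
    # Assumption: tags the document already uses are not recommended again.
    return [t for t in dict.fromkeys(pool) if t not in own]


third = Doc("Third", [Block("about okrs", ["okr"]), Block("plan", ["roadmap"])])
fourth_block = Block("quarterly review", ["review", "okr"])
first = Doc("First", [Block("my notes", ["roadmap"])], [third], [fourth_block])
result = candidates_from_references(first)
```

Note that a whole-document reference contributes every tag of that document, while a block reference contributes only the tags marked in that block.
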
  • the title of the second document is a tag, or the content of the second document is a tag.
  • For example, if the tag of the first document is "reading notes", the title displayed in the second document is "reading notes".
  • the content of the second document is the content of the tag. In this way, the association relationship between the tag and the second document can be better reflected, and the corresponding tag can be seen after entering the second document.
  • associating the tag with the second document includes creating the second document associated with the tag based on the tag when the second document does not exist. That is, for a new tag, an associated second document can be created.
  • When the second document exists, an association between the tag and the second document is established. For example, when the tag has already been used in other documents and is then used in the first document, the tag can be reused directly, and its existing association with the second document can be reused as well, so that the tag is associated with the existing second document. In this way, the efficiency of generating the tag and of creating its associated second document is also improved.
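
The create-or-reuse behavior can be sketched as a small registry. `TagPage` and `TagRegistry` are hypothetical names, and the in-memory dictionary is only a stand-in for whatever storage a real system would use.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TagPage:
    """Hypothetical 'second document': its title is the tag itself."""
    title: str
    source_titles: list[str] = field(default_factory=list)


class TagRegistry:
    """In-memory mapping from tag text to its associated tag page."""

    def __init__(self) -> None:
        self._pages: dict[str, TagPage] = {}

    def associate(self, tag: str, source_title: str) -> tuple[TagPage, bool]:
        """Return (page, created); the page is created only if it is missing."""
        created = tag not in self._pages
        page = self._pages.setdefault(tag, TagPage(title=tag))
        if source_title not in page.source_titles:
            page.source_titles.append(source_title)
        return page, created


registry = TagRegistry()
page_a, created_a = registry.associate("reading notes", "Document A")
page_b, created_b = registry.associate("reading notes", "Document B")
```

The second call reuses the page created by the first, so both source documents end up clustered under the same tag page.
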
  • the method for generating a document tag of the present disclosure may further include: displaying the document content of the second document in response to the second preset operation on the tag.
  • The second preset operation may include a click operation, a voice command, and the like. For example, clicking a tag opens the second document associated with the tag, thereby displaying the document content of the second document. In this way, the association between the tag and the second document is enhanced, which makes viewing easier for the user.
  • the second document may further include information of the fifth document corresponding to the tag.
  • some other documents also include the content of the tag, but may not employ the tag.
  • document information corresponding to the tag but not using the tag can be listed separately, which is convenient for the user's reference and expands the knowledge graph of the tag.
  • A plurality of fifth documents (e.g., document X and document Y) are shown that include the content of the tag but do not employ the tag.
  • the fifth document includes documents that have employed tags, and/or documents that have not employed tags.
  • the document content corresponding to the tag is included in the content of the document that does not employ the tag.
  • When the third document does not employ the tag, in response to a third preset operation on a preset button displayed in the second document, the third document is made to use the tag of the first document.
  • The third preset operation may include a click operation, a voice command, and the like.
  • Although FIG. 2B shows the preset button "connect", it should be understood that this is only exemplary, and other suitable buttons may also be used.
  • The third document can be made to adopt the tag, i.e., the tag can be added at the location in the third document where the tag is placed. In this way, the tags of the third document are processed from within the second document, which greatly facilitates the management and operation of documents.
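
The "connect" interaction can be sketched as follows: documents that mention the tag text are split into those that have adopted the tag and those that have not, and a click adopts the tag for one of the latter. The `Doc` class, the function names, and the case-insensitive substring match are assumptions of this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical document: title, text content, and adopted tags."""
    title: str
    content: str
    tags: set[str] = field(default_factory=set)


def split_by_adoption(tag: str, docs: list[Doc]) -> tuple[list[Doc], list[Doc]]:
    """Return (adopted, unlinked) documents for a tag.

    'Unlinked' documents mention the tag text but have not adopted the tag.
    """
    adopted = [d for d in docs if tag in d.tags]
    unlinked = [
        d for d in docs
        if tag not in d.tags and tag.lower() in d.content.lower()
    ]
    return adopted, unlinked


def on_connect_clicked(tag: str, doc: Doc) -> None:
    """Sketch of the third preset operation: make the document adopt the tag."""
    doc.tags.add(tag)


docs = [
    Doc("X", "Weekly OKR review"),
    Doc("Y", "OKR draft", {"OKR"}),
    Doc("Z", "Lunch menu"),
]
adopted, unlinked = split_by_adoption("OKR", docs)
on_connect_clicked("OKR", unlinked[0])
adopted_after, unlinked_after = split_by_adoption("OKR", docs)
```

After the click, document X moves from the unlinked list to the adopted list without the user ever leaving the tag page.
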
  • the title of the fifth document and the document content of the content block of the fifth document where the tag is located are displayed at a preset position in the second document.
  • the content block in which the tag of the first document is located and the corresponding document content are shown.
  • When the fifth document has adopted the tag, it can be displayed in the form of a list at a preset position in the second document, for example below the content block of the first document where the tag is located.
  • The displayed content may include the title of the fifth document and the document content of the content block of the fifth document where the tag is located, and the location where the tag appears may be emphasized (e.g., highlighted).
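
A tag page that lists each document title, the block containing the tag, and a highlighted occurrence can be sketched as below. The `docs` mapping shape, the `**` highlight marker, and the line layout are all hypothetical choices made for this example.

```python
import re


def highlight(text: str, tag: str, marker: str = "**") -> str:
    """Wrap every case-insensitive occurrence of the tag in the marker."""
    pattern = re.compile(re.escape(tag), re.IGNORECASE)
    return pattern.sub(lambda m: f"{marker}{m.group(0)}{marker}", text)


def render_tag_page(tag: str, docs: dict[str, list[str]]) -> list[str]:
    """List, per document title, each block that contains the tag.

    `docs` maps a document title to its list of block texts (assumed shape).
    """
    lines = [f"# {tag}"]
    for title, blocks in docs.items():
        for index, block in enumerate(blocks):
            if tag.lower() in block.lower():
                lines.append(f"{title} [block {index}]: {highlight(block, tag)}")
    return lines


page = render_tag_page(
    "reading notes",
    {
        "Document A": ["Intro", "My Reading Notes for May"],
        "Document B": ["Nothing relevant here"],
    },
)
```

The block index records in which block of the document the tag appears, and only blocks that contain the tag are listed.
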
  • Embodiments of the present disclosure also provide an apparatus 300 for generating a document tag.
  • The document tag generating apparatus 300 includes a candidate tag providing module 301 and a tag generating module 302.
  • The candidate tag providing module 301 is configured to provide candidate tags for the first document based on the content of the first document.
  • The tag generating module 302 is configured to generate a tag corresponding to the first document in response to the first preset operation on the candidate tag, wherein the tag is associated with the second document, and the second document includes the document content in the first document that corresponds to the tag.
  • Providing candidate tags for the first document based on the content of the first document includes: extracting keywords of the content; and providing candidate tags for the first document based on the extracted keywords or on network keywords related to the extracted keywords.
  • Providing candidate tags for the first document based on the content of the first document includes providing candidate tags for the first document based on tags of the third document when the first document references the third document.
  • Providing the candidate tag for the first document based on the content of the first document includes: when the first document references a content block of the fourth document, providing candidate tags for the first document based on the tag of the fourth document corresponding to that content block.
  • The title of the second document is the tag, or the content of the second document is the tag.
  • Associating the tag with the second document includes: when the second document does not exist, creating a second document associated with the tag based on the tag; or, when the second document exists, establishing an association between the tag and the second document.
  • The generating apparatus further includes a tag document display module configured to display the document content of the second document in response to the second preset operation on the tag.
  • The second document further includes information of the fifth document corresponding to the tag.
  • The fifth document includes a document that has adopted the tag, and/or a document that has not adopted the tag; wherein the content of the document that has not adopted the tag includes document content corresponding to the tag.
  • When the fifth document does not employ the tag, in response to a third preset operation on a preset button displayed in the second document, the fifth document is caused to employ the tag. In some embodiments, when the fifth document has adopted the tag, the title of the fifth document and the document content of the content block of the fifth document where the tag is located are displayed at a preset position in the second document.
  • The present disclosure also provides a terminal, comprising: at least one memory and at least one processor; wherein the memory is used for storing program code, and the processor is used for calling the program code stored in the memory to execute the above method for generating a document tag.
  • The present disclosure also provides a computer storage medium in which program code is stored, the program code being used to execute the above-mentioned method for generating a document tag.
  • the present disclosure also provides a terminal and a storage medium, which are described below.
  • Referring to FIG. 4, a schematic structural diagram of an electronic device (e.g., a terminal device or a server) 400 suitable for implementing an embodiment of the present disclosure is shown.
  • Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), and vehicle-mounted terminals (e.g., in-vehicle navigation terminals), as well as stationary terminals such as digital TVs and desktop computers.
  • The electronic device shown in FIG. 4 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • An electronic device 400 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 401, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage device 408 into a random access memory (RAM) 403. The RAM 403 also stores various programs and data necessary for the operation of the electronic device 400.
  • The processing device 401, the ROM 402, and the RAM 403 are connected to each other through a bus 404.
  • An input/output (I/O) interface 405 is also connected to the bus 404.
  • The following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 407 including, for example, a liquid crystal display (LCD), speakers, vibrators, etc.; storage devices 408 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 409. The communication device 409 may allow the electronic device 400 to communicate wirelessly or by wire with other devices to exchange data.
  • Although FIG. 4 shows the electronic device 400 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
  • Embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • The computer program may be downloaded and installed from the network via the communication device 409, or installed from the storage device 408 or from the ROM 402.
  • When the computer program is executed by the processing device 401, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • The computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the above. More specific examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fibers, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, electrical wire, optical fiber cable, RF (radio frequency), etc., or any suitable combination of the foregoing.
  • The client and server can communicate using any currently known or future developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • the aforementioned computer-readable medium carries one or more programs, which, when executed by the electronic device, cause the electronic device to execute the aforementioned method of the present disclosure.
  • Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • The remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • Each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or operations, or can be implemented in a combination of dedicated hardware and computer instructions.
  • The units involved in the embodiments of the present disclosure may be implemented in software or in hardware. The name of a unit does not, under certain circumstances, constitute a limitation of the unit itself.
  • Exemplary types of hardware logic components include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logical Devices (CPLDs), and the like.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • More specific examples of machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fibers, compact disk read-only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • A method for generating a document tag, comprising: providing a candidate tag for a first document based on the content of the first document; and, in response to a first preset operation on the candidate tag, generating a tag corresponding to the first document; wherein the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag.
  • Providing the candidate tags for the first document based on the content of the first document includes: extracting keywords of the content; and providing candidate tags for the first document based on the extracted keywords or on network keywords related to the extracted keywords.
  • Providing the candidate tag for the first document based on the content of the first document includes: when the first document references a third document, providing the candidate tag for the first document based on the tag of the third document.
  • Providing the candidate tag for the first document based on the content of the first document includes: when the first document references a content block of a fourth document, providing the candidate tags for the first document based on the tags of the fourth document corresponding to that content block.
  • The title of the second document is the tag, or the content of the second document is the tag.
  • Associating the tag with the second document includes: when the second document does not exist, creating a second document associated with the tag based on the tag; or, when the second document exists, establishing an association between the tag and the second document.
  • The second document further includes information of a fifth document corresponding to the tag.
  • The fifth document includes a document that has adopted the tag, and/or a document that has not adopted the tag; wherein the content of the document that has not adopted the tag includes the document content corresponding to the tag.
  • When the fifth document has not adopted the tag, in response to a third preset operation on a preset button displayed in the second document, the fifth document is made to adopt the tag.
  • When the fifth document has adopted the tag, the title of the fifth document and the document content of the content block of the fifth document where the tag is located are displayed at a preset position in the second document.
  • An apparatus for generating a document tag includes: a candidate tag providing module configured to provide candidate tags for a first document based on the content of the first document; and a tag generating module configured to generate a tag corresponding to the first document in response to a first preset operation on the candidate tag; wherein the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag.
  • A terminal, comprising: at least one memory and at least one processor; wherein the at least one memory is used to store program code, and the at least one processor is used to call the program code stored in the at least one memory to execute the method described in any one of the above.
  • A storage medium for storing program code for executing the above-described method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Document Processing Apparatus (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

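The present disclosure provides a method and apparatus for generating a document tag, a terminal, and a storage medium. The method for generating a document tag includes: providing candidate tags for a first document based on the content of the first document; and, in response to a first preset operation on a candidate tag, generating a tag corresponding to the first document; wherein the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag. In some embodiments of the present disclosure, because the tag is associated with a second document that includes the corresponding document content of the first document, tags are clustered in the second document and the document content corresponding to the tag can be viewed there. This achieves automatic classification and assisted organization, improves the aggregation of the knowledge graph, and increases the value of this document structure.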

Description

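Method and apparatus for generating document tag, and terminal and storage medium
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202110226955.9, filed on March 1, 2021 and entitled "Method and apparatus for generating document tag, and terminal and storage medium", the entire content of which is incorporated herein by reference.
Technical Field
The present disclosure relates to the field of information technology, and in particular to a method and apparatus for generating a document tag, a terminal, and a storage medium.
Background
While a document is being read or edited, certain keywords appear that could be used to classify and organize the document content, but this keyword information (tags) cannot be provided during the editing of the current document. In addition, a tag is only a user-defined field value; it cannot be extended in any way to serve a wider range of application scenarios.
Moreover, when users classify and organize documents themselves, the results are often scattered, inconsistent, and overlapping. Existing tag-based classification and organization can also only be used for search and filtering, so its function is rather limited.
Summary
To solve the existing problems, the present disclosure provides a method and apparatus for generating a document tag, a terminal, and a storage medium.
The present disclosure adopts the following technical solutions.
An embodiment of the present disclosure provides a method for generating a document tag, the method including: providing candidate tags for a first document based on the content of the first document; and, in response to a first preset operation on a candidate tag, generating a tag corresponding to the first document; wherein the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag.
Another embodiment of the present disclosure provides an apparatus for generating a document tag, the apparatus including: a candidate tag providing module configured to provide candidate tags for a first document based on the content of the first document; and a tag generating module configured to generate a tag corresponding to the first document in response to a first preset operation on a candidate tag; wherein the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag.
In some embodiments, the present disclosure provides a terminal including at least one memory and at least one processor, wherein the memory is used to store program code and the processor is used to call the program code stored in the memory to execute the above method for generating a document tag.
In some embodiments, the present disclosure provides a storage medium for storing program code, the program code being used to execute the above method for generating a document tag.
In some embodiments of the present disclosure, the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag. As a result, tags are clustered in the second document associated with the tag, and the document content corresponding to the tag can be viewed in the second document. This achieves automatic classification and assisted organization, improves the aggregation of the knowledge graph, and increases the value of this document structure.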
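Brief Description of the Drawings
The above and other features, advantages, and aspects of the embodiments of the present disclosure will become more apparent with reference to the following detailed description taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale.
FIG. 1 is a flowchart of a method for generating a document tag according to an embodiment of the present disclosure.
FIG. 2A and FIG. 2B are schematic diagrams of a second document corresponding to a tag according to some embodiments of the present disclosure.
FIG. 3 shows some of the modules of an apparatus for generating a document tag according to another embodiment of the present disclosure.
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.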
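Detailed Description
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.
It should be understood that the steps described in the method embodiments of the present disclosure may be performed in a different order and/or in parallel. In addition, method embodiments may include additional steps and/or omit some of the illustrated steps. The scope of the present disclosure is not limited in this respect.
As used herein, the term "including" and its variants are open-ended, meaning "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; and the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms are given in the description below.
Note that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different devices, modules, or units, and are not intended to limit the order of, or the interdependence between, the functions performed by these devices, modules, or units.
Note that the modifier "a" or "one" used in the present disclosure is illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, it should be read as "one or more".
The names of messages or information exchanged between multiple devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of these messages or information. In the embodiments of the present disclosure, documents may include, but are not limited to, online documents.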
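FIG. 1 shows a flowchart of a method for generating a document tag according to an embodiment of the present disclosure. The method may include:
Step S101: providing candidate tags for a first document based on the content of the first document. In some embodiments, candidate tags may be recommended for the first document, based on its content, while or after the user edits the document, or while the user reads it. In the embodiments of the present disclosure, a tag is a keyword that indicates content the document may involve. In some embodiments, the candidate tags are presented for the user to choose from. In some embodiments, candidate tags may be provided either while the first document is being edited or after editing has been completed (for example, when the first document is being read).
In some embodiments, the method may further include:
Step S102: in response to a first preset operation on a candidate tag, generating a tag corresponding to the first document. In some embodiments, the first preset operation includes a click operation, a voice command, or the like. In some embodiments, there may be one or more candidate tags. For example, after candidate tags have been provided for the first document, if the user clicks one of them, the clicked candidate tag is considered adopted; that is, a tag corresponding to the first document is generated.
In some embodiments, the tag is associated with a second document, and the second document includes the specific content corresponding to the tag. Unlike the keywords or tags of ordinary documents, a tag in the present disclosure can be associated with a second document; equivalently, the tag can be regarded as a link to the second document. The second document can display the content of the content block (or block) of the first document that corresponds to the tag, and the positions where the tag appears in that block content can be highlighted. It should be understood that a document can be composed of content blocks (or blocks): a content block is a part of a document and can carry multiple data types, and a document can be regarded as consisting of one or more content blocks. As shown in FIG. 2A, the second document includes the documents to which the tag has been applied and the corresponding content, and can indicate in which block of a document the tag appears. In this way, tags are clustered in the second document associated with the tag, and the document content corresponding to the tag can be viewed in the second document. This achieves automatic classification and assisted organization, improves the aggregation of the knowledge graph, and increases the value of this document structure.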
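The relationship between documents, content blocks, tags, and the tag's second document can be illustrated with a small data model. This is only a sketch under stated assumptions: the names `Block`, `Document`, `TagEntry`, and `build_tag_page` are hypothetical and do not come from the disclosure, and matching a tag to a block by substring search is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A content block: one part of a document (hypothetical model)."""
    block_id: str
    text: str


@dataclass
class Document:
    """A document made of one or more content blocks, plus its adopted tags."""
    title: str
    blocks: list[Block] = field(default_factory=list)
    tags: set[str] = field(default_factory=set)


@dataclass
class TagEntry:
    """One row of the tag's second document: where the tag appears."""
    doc_title: str
    block_id: str
    text: str


def build_tag_page(tag: str, documents: list[Document]) -> list[TagEntry]:
    """Collect the blocks that mention `tag` from documents that adopted it.

    Assumption: a block "corresponds to" a tag when the tag text occurs in it.
    """
    entries = []
    for doc in documents:
        if tag not in doc.tags:
            continue  # only documents that adopted the tag are aggregated here
        for block in doc.blocks:
            if tag in block.text:
                entries.append(TagEntry(doc.title, block.block_id, block.text))
    return entries
```

Each `TagEntry` records both the document and the block, which is what lets the second document indicate in which block a tag appears.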
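In some embodiments, providing candidate tags for the first document based on its content includes: extracting keywords from the content, and providing candidate tags for the first document based on the extracted keywords or on web keywords related to the extracted keywords. In some embodiments, the document keywords are extracted using the keyword-extraction capability of artificial intelligence (AI). In some embodiments, external entity words may be used to filter the word-segmentation results, and the filtered words are recommended to the user as candidate tags. External entity words are public keywords (entries) commonly found in Baidu Baike and Wikipedia. Matching and filtering the segmented content against these public entries can greatly improve the accuracy of the recommended candidate tags across different types of document content, so that a variety of scenarios are covered.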
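A minimal sketch of the filter-by-entity-words idea follows. It swaps in a plain regular-expression tokenizer and frequency ranking for the AI keyword algorithm, which the disclosure does not specify, and it assumes the public entries are already available as an in-memory set. The function name `candidate_tags` and its parameters are hypothetical.

```python
import re
from collections import Counter


def candidate_tags(content: str, entity_words: set[str], top_k: int = 3) -> list[str]:
    """Recommend up to `top_k` candidate tags for a document.

    Assumptions (not from the disclosure): tokens are lowercase ASCII words,
    `entity_words` stands in for public encyclopedia entries, and candidates
    are ranked by how often they occur in the content.
    """
    tokens = re.findall(r"[a-z]+", content.lower())
    # Keep only tokens that match a known public entry.
    counts = Counter(token for token in tokens if token in entity_words)
    return [word for word, _ in counts.most_common(top_k)]
```

Filtering against a known vocabulary is what keeps common words such as "the" out of the recommendations without a separate stop-word list.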
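In some embodiments, providing candidate tags for the first document based on its content includes: when the first document references a third document, providing candidate tags for the first document based on the tags of the third document. In some embodiments, referencing the third document may mean that the third document is presented in the first document in any suitable form, such as a link or a preview. If a third document is referenced and has already been tagged, the tags of the third document can also be recommended as candidate tags for the first document. This broadens the range of recommended candidate tags, strengthens the association between the first document and the third document it references, and extends the tag-clustering network.
In some embodiments, providing candidate tags for the first document based on its content includes: when the first document references a content block of a fourth document, providing candidate tags for the first document based on the tags of the fourth document that correspond to that content block. If a content block of a fourth document is referenced, and the content of that block contains tags that the fourth document has already adopted, those tags of the fourth document can also be recommended as candidate tags for the first document. This broadens the range of recommended candidate tags, strengthens the association between the first document and the referenced content block of the fourth document, and extends the tag-clustering network.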
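The two reference-based sources of candidate tags can be sketched together. The `Doc` structure, the `library` lookup, and the function `tags_from_references` are hypothetical names introduced for illustration; excluding tags the first document has already adopted is also an assumption, not something the disclosure states.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical document record with its references to other documents."""
    doc_id: str
    tags: set[str] = field(default_factory=set)
    blocks: dict[str, str] = field(default_factory=dict)             # block_id -> text
    doc_refs: list[str] = field(default_factory=list)                # referenced doc ids
    block_refs: list[tuple[str, str]] = field(default_factory=list)  # (doc_id, block_id)


def tags_from_references(doc: Doc, library: dict[str, Doc]) -> set[str]:
    """Gather candidate tags from referenced documents and referenced blocks."""
    candidates: set[str] = set()
    # Whole-document reference: every tag of the referenced document qualifies.
    for ref_id in doc.doc_refs:
        candidates |= library[ref_id].tags
    # Block reference: only tags that actually occur in the referenced block.
    for ref_id, block_id in doc.block_refs:
        source = library[ref_id]
        block_text = source.blocks[block_id]
        candidates |= {tag for tag in source.tags if tag in block_text}
    # Assumption: do not re-suggest tags the document already has.
    return candidates - doc.tags
```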
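In some embodiments, the title of the second document is the tag, or the content of the second document is the tag. For example, as shown in FIG. 2A, if the tag of the first document is "Reading Notes", the title displayed in the second document is "Reading Notes". In some embodiments, the content of the second document is the content of the tag. This better reflects the association between the tag and the second document: on entering the second document, the user can see the corresponding tag.
In some embodiments, associating the tag with the second document includes: when the second document does not exist, creating a second document associated with the tag based on the tag. That is, an associated second document can be created for a new tag. In some embodiments, when the second document already exists, an association between the tag and the second document is established. For example, if the tag has already been adopted in other documents, then when it is adopted in the first document the tag can simply be reused, and the second document associated with the tag is reused as well: an association is established so that the tag is associated with the existing second document. This also improves the efficiency of generating tags and creating their associated second documents.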
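The create-or-reuse behavior amounts to a get-or-create lookup keyed by the tag. The sketch below assumes an in-memory registry; `TagRegistry`, `adopt`, and the dictionary layout of a tag page are hypothetical, with the page title set to the tag as in the FIG. 2A example.

```python
class TagRegistry:
    """Maps each tag to its second document (hypothetical in-memory store)."""

    def __init__(self):
        self._pages = {}  # tag -> second document, represented as a dict

    def adopt(self, tag, doc_id):
        """Record that `doc_id` adopts `tag`.

        Returns (page, created): `created` is True only when the second
        document had to be created because it did not exist yet.
        """
        page = self._pages.get(tag)
        created = page is None
        if created:
            # New tag: create the second document, titled with the tag itself.
            page = {"title": tag, "documents": []}
            self._pages[tag] = page
        if doc_id not in page["documents"]:
            # Existing or new page: associate the adopting document with it.
            page["documents"].append(doc_id)
        return page, created
```

Adopting the same tag from a second document returns the same page object, so every document that shares the tag is aggregated in one place.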
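In some embodiments, the method may further include: in response to a second preset operation on the tag, displaying the document content of the second document. In some embodiments, the second preset operation may include a click operation, a voice command, or the like. For example, clicking the tag opens the second document associated with it, thereby displaying the document content of the second document. This strengthens the association between the tag and the second document and makes viewing more convenient for the user.
In some embodiments, the second document may further include information of a fifth document corresponding to the tag. In some embodiments, some other documents also include the content of the tag but may not have adopted the tag. The second document can separately list information about documents that correspond to the tag but have not adopted it, which gives the user a convenient reference and extends the knowledge graph of the tag. FIG. 2B shows multiple fifth documents (for example, document X and document Y) that include the content of the tag but have not adopted the tag.
In some embodiments, the fifth document includes a document that has adopted the tag and/or a document that has not adopted the tag. In some embodiments, the content of a document that has not adopted the tag includes document content corresponding to the tag.
In some embodiments, when the fifth document has not adopted the tag, in response to a third preset operation on a preset button displayed in the second document, the fifth document is made to adopt the tag of the first document. In some embodiments, the third preset operation may include a click operation, a voice command, or the like. FIG. 2B shows a preset button labeled "Connect"; it should be understood that this is only an example and that other suitable buttons may be used. For example, clicking the "Connect" button makes the fifth document adopt the tag, that is, the tag is added at the position in the fifth document where tags are placed. In this way, the tag of the fifth document is handled from within the second document, which greatly facilitates document management and operation.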
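Listing documents that mention a tag without having adopted it, and then connecting one of them, can be sketched as two small functions. The dictionary layout and the names `unlinked_mentions` and `connect` are hypothetical; substring matching again stands in for whatever matching the real system uses.

```python
def unlinked_mentions(tag, documents):
    """Return titles of documents whose text contains `tag` but that have not adopted it.

    `documents` maps a title to {"text": str, "tags": set}; this layout is an
    assumption made for the sketch.
    """
    return sorted(
        title
        for title, doc in documents.items()
        if tag in doc["text"] and tag not in doc["tags"]
    )


def connect(tag, title, documents):
    """Handle the "Connect" button: make the named document adopt the tag."""
    documents[title]["tags"].add(tag)
```

After `connect` runs, the document moves from the not-adopted list to the adopted list the next time the second document is rendered.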
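In some embodiments, when the fifth document has adopted the tag, the title of the fifth document and the document content of the content block of the fifth document in which the tag is located are displayed at a preset position in the second document. FIG. 2A shows the content block in which the tag of the first document is located and the corresponding document content. Likewise, when the fifth document has adopted the tag, it can be shown as a list at a preset position in the second document, for example below the content block of the first document in which the tag is located. The displayed content may include the title of the fifth document and the document content of the content block of the fifth document in which the tag is located, and the positions where the tag appears may be emphasized (for example, highlighted).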
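Emphasizing the tag inside a displayed block can be sketched with a regular-expression substitution. The `[[` and `]]` markers are placeholders for whatever highlight styling a real renderer would apply, and the case-insensitive match and the names `highlight` and `render_entry` are assumptions made for this example.

```python
import re


def highlight(text, tag, left="[[", right="]]"):
    """Wrap every occurrence of `tag` in `text` with highlight markers."""
    # re.escape keeps tags with punctuation from being read as regex syntax.
    pattern = re.compile(re.escape(tag), re.IGNORECASE)
    return pattern.sub(lambda match: f"{left}{match.group(0)}{right}", text)


def render_entry(doc_title, block_text, tag):
    """Render one list row: document title plus the highlighted block content."""
    return f"{doc_title}: {highlight(block_text, tag)}"
```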
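An embodiment of the present disclosure further provides an apparatus 300 for generating a document tag. The apparatus 300 includes a candidate tag providing module 301 and a tag generating module 302. In some embodiments, the candidate tag providing module 301 is configured to provide candidate tags for a first document based on the content of the first document. In some embodiments, the tag generating module 302 is configured to generate a tag corresponding to the first document in response to a first preset operation on a candidate tag, wherein the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag.
It should be understood that the description of the method for generating a document tag also applies to the apparatus 300 here; for brevity, it is not repeated in detail.
In some embodiments, providing candidate tags for the first document based on its content includes: extracting keywords from the content; and providing candidate tags for the first document based on the extracted keywords or on web keywords related to the extracted keywords. In some embodiments, providing candidate tags for the first document based on its content includes: when the first document references a third document, providing candidate tags for the first document based on the tags of the third document. In some embodiments, providing candidate tags for the first document based on its content includes: when the first document references a content block of a fourth document, providing candidate tags for the first document based on the tags of the fourth document that correspond to that content block. In some embodiments, the title of the second document is the tag, or the content of the second document is the tag. In some embodiments, associating the tag with the second document includes: when the second document does not exist, creating a second document associated with the tag based on the tag; or, when the second document exists, establishing an association between the tag and the second document. In some embodiments, the apparatus further includes a tag document display module configured to display the document content of the second document in response to a second preset operation on the tag. In some embodiments, the second document further includes information of a fifth document corresponding to the tag. In some embodiments, the fifth document includes a document that has adopted the tag and/or a document that has not adopted the tag, wherein the content of the document that has not adopted the tag includes document content corresponding to the tag. In some embodiments, when the fifth document has not adopted the tag, in response to a third preset operation on a preset button displayed in the second document, the fifth document is made to adopt the tag. In some embodiments, when the fifth document has adopted the tag, the title of the fifth document and the document content of the content block of the fifth document in which the tag is located are displayed at a preset position in the second document.
In addition, the present disclosure further provides a terminal including at least one memory and at least one processor, wherein the memory is used to store program code and the processor is used to call the program code stored in the memory to execute the above method for generating a document tag.
In addition, the present disclosure further provides a computer storage medium that stores program code, the program code being used to execute the above method for generating a document tag.
The method and apparatus for generating a document tag of the present disclosure have been described above based on embodiments and application examples. The present disclosure further provides a terminal and a storage medium, which are described below.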
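Referring now to FIG. 4, a schematic structural diagram of an electronic device (for example, a terminal device or a server) 400 suitable for implementing embodiments of the present disclosure is shown. Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (for example, vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 4 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in FIG. 4, the electronic device 400 may include a processing device (for example, a central processing unit or a graphics processing unit) 401, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage device 408 into a random access memory (RAM) 403. The RAM 403 also stores various programs and data required for the operation of the electronic device 400. The processing device 401, the ROM 402, and the RAM 403 are connected to one another through a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
Generally, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 407 including, for example, a liquid crystal display (LCD), speaker, and vibrator; storage devices 408 including, for example, magnetic tape and hard disk; and a communication device 409. The communication device 409 may allow the electronic device 400 to communicate with other devices, wirelessly or by wire, to exchange data. Although FIG. 4 shows the electronic device 400 with various devices, it should be understood that not all of the illustrated devices need to be implemented or provided; more or fewer devices may alternatively be implemented or provided.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 409, installed from the storage device 408, or installed from the ROM 402. When the computer program is executed by the processing device 401, the above functions defined in the methods of the embodiments of the present disclosure are performed.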
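It should be noted that the computer-readable medium described above may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to electrical wire, optical cable, RF (radio frequency), or any suitable combination of the above.
In some implementations, clients and servers may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (for example, a communication network). Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (for example, the Internet), and peer-to-peer networks (for example, ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
The computer-readable medium may be included in the electronic device described above, or it may exist separately without being assembled into the electronic device.
The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the method of the present disclosure described above.
Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).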
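The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The name of a unit does not, in some cases, constitute a limitation on the unit itself.
The functions described above may be performed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in combination with an instruction execution system, apparatus, or device. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the above. More specific examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.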
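According to one or more embodiments of the present disclosure, a method for generating a document tag is provided, the method including: providing candidate tags for a first document based on the content of the first document; and, in response to a first preset operation on a candidate tag, generating a tag corresponding to the first document; wherein the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag.
According to one or more embodiments of the present disclosure, providing the candidate tags for the first document based on the content of the first document includes: extracting keywords from the content; and providing candidate tags for the first document based on the extracted keywords or on web keywords related to the extracted keywords.
According to one or more embodiments of the present disclosure, providing the candidate tags for the first document based on the content of the first document includes: when the first document references a third document, providing the candidate tags for the first document based on the tags of the third document.
According to one or more embodiments of the present disclosure, providing the candidate tags for the first document based on the content of the first document includes: when the first document references a content block of a fourth document, providing the candidate tags for the first document based on the tags of the fourth document that correspond to the content block of the fourth document.
According to one or more embodiments of the present disclosure, the title of the second document is the tag, or the content of the second document is the tag.
According to one or more embodiments of the present disclosure, associating the tag with the second document includes: when the second document does not exist, creating a second document associated with the tag based on the tag; or, when the second document exists, establishing an association between the tag and the second document.
According to one or more embodiments of the present disclosure, the method further includes: in response to a second preset operation on the tag, displaying the document content of the second document.
According to one or more embodiments of the present disclosure, the second document further includes information of a fifth document corresponding to the tag.
According to one or more embodiments of the present disclosure, the fifth document includes a document that has adopted the tag and/or a document that has not adopted the tag, wherein the content of the document that has not adopted the tag includes document content corresponding to the tag.
According to one or more embodiments of the present disclosure, when the fifth document has not adopted the tag, in response to a third preset operation on a preset button displayed in the second document, the fifth document is made to adopt the tag.
According to one or more embodiments of the present disclosure, when the fifth document has adopted the tag, the title of the fifth document and the document content of the content block of the fifth document in which the tag is located are displayed at a preset position in the second document.
According to one or more embodiments of the present disclosure, an apparatus for generating a document tag is provided, the apparatus including: a candidate tag providing module configured to provide candidate tags for a first document based on the content of the first document; and a tag generating module configured to generate a tag corresponding to the first document in response to a first preset operation on a candidate tag; wherein the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag.
According to one or more embodiments of the present disclosure, a terminal is provided, including at least one memory and at least one processor, wherein the at least one memory is used to store program code, and the at least one processor is used to call the program code stored in the at least one memory to execute the method described in any one of the above.
According to one or more embodiments of the present disclosure, a storage medium is provided, the storage medium being used to store program code, and the program code being used to execute the above method.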
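The above description is merely a description of preferred embodiments of the present disclosure and of the technical principles employed. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above disclosed concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the present disclosure.
In addition, although operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.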

Claims (14)

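  1. A method for generating a document tag, characterized in that the method comprises:
    providing candidate tags for a first document based on the content of the first document;
    in response to a first preset operation on a candidate tag, generating a tag corresponding to the first document;
    wherein the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag.
  2. The method for generating a document tag according to claim 1, characterized in that providing the candidate tags for the first document based on the content of the first document comprises:
    extracting keywords from the content;
    providing candidate tags for the first document based on the extracted keywords or on web keywords related to the extracted keywords.
  3. The method for generating a document tag according to claim 1, characterized in that providing the candidate tags for the first document based on the content of the first document comprises:
    when the first document references a third document, providing the candidate tags for the first document based on the tags of the third document.
  4. The method for generating a document tag according to claim 1, characterized in that providing the candidate tags for the first document based on the content of the first document comprises:
    when the first document references a content block of a fourth document, providing the candidate tags for the first document based on the tags of the fourth document that correspond to the content block of the fourth document.
  5. The method for generating a document tag according to claim 1, characterized in that the title of the second document is the tag, or the content of the second document is the tag.
  6. The method for generating a document tag according to claim 1, characterized in that associating the tag with the second document comprises:
    when the second document does not exist, creating a second document associated with the tag based on the tag; or
    when the second document exists, establishing an association between the tag and the second document.
  7. The method for generating a document tag according to claim 1, characterized by further comprising:
    in response to a second preset operation on the tag, displaying the document content of the second document.
  8. The method for generating a document tag according to claim 1, characterized in that the second document further includes information of a fifth document corresponding to the tag.
  9. The method for generating a document tag according to claim 8, characterized in that the fifth document includes a document that has adopted the tag and/or a document that has not adopted the tag; wherein the content of the document that has not adopted the tag includes document content corresponding to the tag.
  10. The method for generating a document tag according to claim 9, characterized in that, when the fifth document has not adopted the tag, in response to a third preset operation on a preset button displayed in the second document, the fifth document is made to adopt the tag.
  11. The method for generating a document tag according to claim 9, characterized in that, when the fifth document has adopted the tag, the title of the fifth document and the document content of the content block of the fifth document in which the tag is located are displayed at a preset position in the second document.
  12. An apparatus for generating a document tag, characterized in that the apparatus comprises:
    a candidate tag providing module configured to provide candidate tags for a first document based on the content of the first document;
    a tag generating module configured to generate a tag corresponding to the first document in response to a first preset operation on a candidate tag;
    wherein the tag is associated with a second document, and the second document includes the document content in the first document that corresponds to the tag.
  13. A terminal, comprising:
    at least one memory and at least one processor;
    wherein the at least one memory is used to store program code, and the at least one processor is used to call the program code stored in the at least one memory to execute the method for generating a document tag according to any one of claims 1 to 11.
  14. A storage medium for storing program code, the program code being used to execute the method for generating a document tag according to any one of claims 1 to 11.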
PCT/CN2022/078560 2021-03-01 2022-03-01 文档标签的生成方法、装置、终端和存储介质 WO2022184048A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/279,378 US20240184973A1 (en) 2021-03-01 2022-03-01 Method and apparatus for generating document tag, and terminal and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110226955.9 2021-03-01
CN202110226955.9A CN114997120B (zh) 2021-03-01 2021-03-01 文档标签的生成方法、装置、终端和存储介质

Publications (1)

Publication Number Publication Date
WO2022184048A1 true WO2022184048A1 (zh) 2022-09-09

Family

ID=83018431

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/078560 WO2022184048A1 (zh) 2021-03-01 2022-03-01 文档标签的生成方法、装置、终端和存储介质

Country Status (3)

Country Link
US (1) US20240184973A1 (zh)
CN (1) CN114997120B (zh)
WO (1) WO2022184048A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101408886A (zh) * 2007-10-05 2009-04-15 富士通株式会社 通过分析文档的段落来选择该文档的标签
CN101571859A (zh) * 2008-04-28 2009-11-04 国际商业机器公司 用于对文档进行标注的方法和设备
US20120016885A1 (en) * 2010-07-16 2012-01-19 Ibm Corporation Adaptive and personalized tag recommendation
US20130104029A1 (en) * 2011-10-24 2013-04-25 Apollo Group, Inc. Automated addition of accessiblity features to documents
US20200301950A1 (en) * 2019-03-22 2020-09-24 Microsoft Technology Licensing, Llc Method and System for Intelligently Suggesting Tags for Documents

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040177322A1 (en) * 2003-03-03 2004-09-09 International Business Machines Corporation Apparatus, system and method of automatically placing embedded icons in their visual order in a displayed or printed bi-directionally formatted document
US20110106835A1 (en) * 2009-10-29 2011-05-05 International Business Machines Corporation User-Defined Profile Tags, Rules, and Recommendations for Portal
CN102713905A (zh) * 2010-01-08 2012-10-03 瑞典爱立信有限公司 用于媒体文件的社会标签的方法和设备
CN103729360A (zh) * 2012-10-12 2014-04-16 腾讯科技(深圳)有限公司 一种兴趣标签推荐方法及系统
US9529901B2 (en) * 2013-11-18 2016-12-27 Oracle International Corporation Hierarchical linguistic tags for documents
CN104461504B (zh) * 2014-11-06 2019-05-14 深圳市金立通信设备有限公司 一种终端应用程序的管理方法
US20170161269A1 (en) * 2015-12-04 2017-06-08 Ca, Inc. Document handling using triple identifier
US20200110839A1 (en) * 2018-10-05 2020-04-09 International Business Machines Corporation Determining tags to recommend for a document from multiple database sources
CN110334178B (zh) * 2019-03-28 2023-06-20 平安科技(深圳)有限公司 数据检索方法、装置、设备及可读存储介质
CN110806873B (zh) * 2019-10-31 2023-07-21 拉扎斯网络科技(上海)有限公司 目标控件确定方法、装置、电子设备及存储介质
CN111277572A (zh) * 2020-01-13 2020-06-12 深圳市赛为智能股份有限公司 云存储安全去重方法、装置、计算机设备及存储介质

Also Published As

Publication number Publication date
US20240184973A1 (en) 2024-06-06
CN114997120B (zh) 2023-09-26
CN114997120A (zh) 2022-09-02

Similar Documents

Publication Publication Date Title
CN111368185B (zh) 数据展示方法、装置、存储介质及电子设备
CN111522927B (zh) 基于知识图谱的实体查询方法和装置
WO2022037419A1 (zh) 音频内容识别方法、装置、设备和计算机可读介质
WO2020182123A1 (zh) 用于推送语句的方法和装置
WO2022247562A1 (zh) 多模态数据检索方法、装置、介质及电子设备
WO2022111591A1 (zh) 页面生成方法和装置、存储介质和电子设备
WO2022218034A1 (zh) 交互方法、装置和电子设备
WO2021190129A1 (zh) 页面处理方法、装置、电子设备及计算机可读存储介质
CN112287206A (zh) 信息处理方法、装置和电子设备
WO2023142914A1 (zh) 日期识别方法、装置、可读介质及电子设备
WO2022184034A1 (zh) 一种文档处理方法、装置、设备和介质
WO2020199659A1 (zh) 用于确定推送优先级信息的方法和装置
WO2024078297A1 (zh) 数据处理方法、装置、电子设备和存储介质
WO2024087821A1 (zh) 信息处理方法、装置和电子设备
WO2024032413A1 (zh) 书籍信息显示方法、装置、设备和存储介质
US20240021004A1 (en) Cross-region document content recognition method, apparatus and medium
WO2023088378A1 (zh) 信息处理方法、装置、终端和存储介质
WO2022184048A1 (zh) 文档标签的生成方法、装置、终端和存储介质
CN111382262A (zh) 用于输出信息的方法和装置
CN111382365A (zh) 用于输出信息的方法和装置
CN113807056B (zh) 一种文档名称序号纠错方法、装置和设备
WO2022184037A1 (zh) 文档处理方法、装置、设备和介质
WO2023000782A1 (zh) 获取视频热点的方法、装置、可读介质和电子设备
WO2022206413A1 (zh) 标注数据确定方法、装置、可读介质及电子设备
CN111399902B (zh) 客户端源文件处理方法、装置、可读介质与电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22762513

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18279378

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.01.2024)