CN117251538A - Document processing method, computer terminal and computer readable storage medium


Info

Publication number
CN117251538A
Authority
CN
China
Prior art keywords
document
dialogue
area
determining
result
Prior art date
Legal status
Pending
Application number
CN202310995388.2A
Other languages
Chinese (zh)
Inventor
石三川
Current Assignee
Hangzhou Alibaba Cloud Feitian Information Technology Co ltd
Original Assignee
Hangzhou Alibaba Cloud Feitian Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Alibaba Cloud Feitian Information Technology Co ltd filed Critical Hangzhou Alibaba Cloud Feitian Information Technology Co ltd
Priority to CN202310995388.2A
Publication of CN117251538A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/31 Indexing; Data structures therefor; Storage structures
    • G06F 16/316 Indexing structures
    • G06F 16/322 Trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/10 Text processing
    • G06F 40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/258 Heading extraction; Automatic titling; Numbering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a document processing method, a computer terminal and a computer readable storage medium. The method comprises the following steps: receiving a target document; converting the target document into a structured object; receiving a query sentence; searching a document area corresponding to the query sentence from the structured object; and determining, based on the document area, a dialogue result corresponding to the query sentence. The method and the device solve the technical problem in the related art that the dialogue results fed back when a dialogue is conducted based on a document are inaccurate.

Description

Document processing method, computer terminal and computer readable storage medium
Technical Field
The present application relates to the field of artificial intelligence, and in particular, to a document processing method, a computer terminal, and a computer readable storage medium.
Background
With the popularity of the Internet and artificial intelligence, the emergence of large-model dialogue services has laid a foundation for document information retrieval functions. In the related art, document retrieval is generally performed by extracting the text content from an input document, searching based on the extracted text content, and feeding back the corresponding search result, thereby realizing the function of searching the document. A large-model dialogue service predicts answers to received questions by means of an artificial intelligence model algorithm and provides a better dialogue experience. In the related art, when the large-model dialogue service is applied to document searching, text content is first extracted from the document, and an answer corresponding to the question is then searched out from the extracted text content; however, the answer fed back is a paragraph of the document corresponding to the question, and sometimes the fed-back paragraph cannot fully correspond to the question, so the fed-back dialogue result is inaccurate.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiment of the application provides a document processing method, a computer terminal and a computer readable storage medium, which at least solve the technical problem that feedback dialogue results are inaccurate when a dialogue is performed based on a document in the related technology.
According to an aspect of an embodiment of the present application, there is provided a document processing method including: receiving a target document; converting the target document into a structured object; receiving an inquiry sentence; searching a document area corresponding to the inquiry sentence from the structured object; and determining a dialogue result corresponding to the inquiry statement based on the document area.
Optionally, the searching the document area corresponding to the query sentence from the structured object includes: determining a plurality of search results corresponding to the query sentence, wherein the search results comprise candidate titles or candidate paragraph contents; determining nodes corresponding to the plurality of search results in the tree structure under the condition that the structure corresponding to the structured object is the tree structure; and determining a document area corresponding to the query statement based on the nodes corresponding to the search results in the tree structure.
Optionally, the determining, based on the nodes corresponding to the search results in the tree structure, a document area corresponding to the query sentence includes: acquiring a threshold value of the number of nodes; determining a tree structure area comprising a number of nodes exceeding the threshold number of nodes; and determining the document area corresponding to the query statement based on the tree structure area.
Optionally, the determining, based on the tree structure area, the document area corresponding to the query sentence includes: detecting whether the tree structure area is smaller than a preset area threshold value; and under the condition that the detection result is that the tree structure area is smaller than a preset area threshold value, determining the area corresponding to the tree structure area in the target document as the document area corresponding to the query sentence.
Optionally, the determining, based on the document area, a dialogue result corresponding to the query sentence includes: obtaining a prompt word corresponding to the document area, wherein the prompt word is used for prompting the output of a result corresponding to the inquiry sentence in a dialogue form; and determining the dialogue result corresponding to the query statement based on the document area and the prompt word.
Optionally, the determining the dialogue result corresponding to the query sentence based on the document region and the prompt word includes: inputting the prompt word and the document region into a large language model, wherein the large language model analyzes the document region based on the prompt of the prompt word to obtain an analysis result; and receiving the dialogue result fed back by the large language model, wherein the dialogue result is generated by the large language model based on dialogue words and the analysis result, and the dialogue words correspond to the prompt words.
Optionally, after determining the dialogue result corresponding to the query sentence based on the document area, the method further includes: and displaying the dialogue result in a dialogue box corresponding to the inquiry statement.
Optionally, the method further comprises: matching the dialogue result with the document area to obtain an original expression of the dialogue result in the document area; the location of the original expression in the target document is displayed in a highlighted manner.
According to another aspect of the embodiments of the present application, there is also provided a document processing method, including: displaying a document input box on a display interface; receiving a target document in response to an operation on the document input box, and converting the target document into a structured object; displaying a dialog box on the display interface; receiving an inquiry sentence in response to an input operation to the dialog box; and in response to the confirmation operation of the query sentence, displaying a dialogue result corresponding to the query sentence in the dialogue box, wherein the dialogue result is determined based on a document area corresponding to the query sentence, and the document area is searched from the structured object.
According to still another aspect of the embodiments of the present application, there is also provided a document processing method, including: receiving a target document; converting the target document into a structured object; receiving an inquiry sentence; searching a document area corresponding to the inquiry sentence from the structured object; and determining a dialogue result corresponding to the query sentence based on the document area by adopting a large language model.
Optionally, the determining, by using the large language model, a dialogue result corresponding to the query sentence based on the document area includes: acquiring a prompt word, wherein the prompt word is used for prompting the large language model to analyze the document area; analyzing the document area by adopting the large language model based on the prompt of the prompt word to obtain an analysis result; and receiving the dialogue result fed back by the large language model, wherein the dialogue result is generated by the large language model based on dialogue words and the analysis result, and the dialogue words correspond to the prompt words.
According to another aspect of the embodiments of the present application, there is also provided a document processing apparatus including: the first receiving module is used for receiving the target document; the first conversion module is used for converting the target document into a structured object; the second receiving module is used for receiving the inquiry statement; the first searching module is used for searching out a document area corresponding to the query sentence from the structured object; and the first determining module is used for determining a dialogue result corresponding to the query statement based on the document area.
According to another aspect of the embodiments of the present application, there is also provided a document processing apparatus including: the first display module is used for displaying a document input box on a display interface; a third receiving module for receiving a target document in response to an operation on the document input box and converting the target document into a structured object; the second display module is used for displaying a dialog box on the display interface; a fourth receiving module, configured to receive an inquiry sentence in response to an input operation to the dialog box; and the third display module is used for responding to the confirmation operation of the query statement and displaying a dialogue result corresponding to the query statement in the dialogue box, wherein the dialogue result is determined based on a document area corresponding to the query statement, and the document area is searched from the structured object.
According to still another aspect of the embodiments of the present application, there is also provided a document processing apparatus, including: a fifth receiving module for receiving the target document; the second conversion module is used for converting the target document into a structured object; a sixth receiving module for receiving the query sentence; the second searching module is used for searching out a document area corresponding to the query sentence from the structured object; and the second determining module is used for determining a dialogue result corresponding to the query sentence based on the document area by adopting a large language model.
According to still another aspect of the embodiments of the present application, there is also provided an electronic device, including: a memory storing an executable program; and the processor is used for running the program, wherein the program executes any one of the document processing methods.
According to still another aspect of the embodiments of the present application, there is also provided a computer-readable storage medium including: the computer readable storage medium includes a stored executable program, wherein the executable program, when run, controls a device in which the storage medium is located to perform any one of the document processing methods described above.
In the embodiment of the application, a mode of searching structured document information based on an inquiry sentence is adopted, a target document is converted into a structured object, a corresponding document area in the structured object is determined based on the inquiry sentence, and a dialogue result corresponding to the inquiry sentence is determined based on the document area.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a block diagram showing a hardware configuration of a computer terminal for implementing a document processing method;
FIG. 2 illustrates a block diagram of a computing environment for implementing a document processing method;
FIG. 3 shows a block diagram of a service grid for implementing a document processing method;
FIG. 4 is a flowchart of a first document processing method according to embodiment 1 of the present invention;
FIG. 5 is a flowchart of a second document processing method according to embodiment 1 of the present invention;
FIG. 6 is a flowchart of a third document processing method according to embodiment 1 of the present invention;
FIG. 7 is a framework diagram of an alternative document knowledge base question-answering retrieval system according to embodiment 1 of the present invention;
FIG. 8 is a schematic view of a first document processing apparatus according to embodiment 2 of the present invention;
FIG. 9 is a schematic view of a second document processing apparatus according to embodiment 3 of the present invention;
FIG. 10 is a schematic view of a third document processing apparatus according to embodiment 4 of the present invention;
FIG. 11 is a block diagram of the structure of a computer terminal according to embodiment 5 of the present invention.
Detailed Description
In order to make the present application solution better understood by those skilled in the art, the following description will be made in detail and with reference to the accompanying drawings in the embodiments of the present application, it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, partial terms or terminology appearing in describing embodiments of the present application are applicable to the following explanation:
intelligent document processing (Intelligent Document Processing, abbreviated IDP): refers to the capability of taking files of different types, obtaining a unified structured description of the file content through intelligent processing techniques, stripping the file information from the file object, and exporting it to a system.
Structuring a document: refers to a way of describing information about a document, and generally includes information about text paragraph content, text hierarchical relationships, text styles, text content, text coordinates, and the like of the document.
Searching: refers to a function of searching similar content text in text through text relevance or text content semantic relevance.
Question-answer dialogue communication: in the method, a user inputs a prompt word or a query sentence through a dialog box, and the system replies according to the sentence submitted by the user.
Key-Value (KV) information: refers to key-value pairs extracted from text content, such as "user name: xxx", where "user name" is the key and "xxx" is the value, i.e., it describes that "xxx" is the "user name".
Generative Pre-trained Transformer (GPT): a deep learning model for text generation trained on Internet data; it is a general-purpose pre-trained model, and its evolved variants such as GPT-2, GPT-3 and GPT-J may collectively be referred to as large language models in this application.
Example 1
In accordance with embodiments of the present application, a document processing method is provided in which steps shown in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions, and although a logical order is shown in the flowcharts, in some cases, the steps shown or described may be performed in an order other than that shown or described herein.
The method embodiment provided in the first embodiment of the present application may be executed in a mobile terminal, a computer terminal or a similar computing device. Fig. 1 shows a block diagram of a hardware structure of a computer terminal (or mobile device) for implementing a document processing method. As shown in fig. 1, the computer terminal 10 (or mobile device) may include one or more processors 102 (shown as 102a, 102b, ..., 102n), which may include, but are not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA), a memory 104 for storing data, and a transmission module 106 for communication functions. In addition, it may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the BUS), a network interface, a power supply, and/or a camera. It will be appreciated by those of ordinary skill in the art that the configuration shown in fig. 1 is merely illustrative and is not intended to limit the configuration of the electronic device described above. For example, the computer terminal 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
It should be noted that the one or more processors 102 and/or other data processing circuits described above may be referred to generally herein as "data processing circuits". The data processing circuit may be embodied in whole or in part in software, hardware, firmware, or any other combination. Furthermore, the data processing circuit may be a single stand-alone processing module, or incorporated, in whole or in part, into any of the other elements in the computer terminal 10 (or mobile device). As referred to in the embodiments of the present application, the data processing circuit acts as a kind of processor control (for example, the selection of a variable-resistance termination path connected to an interface).
The memory 104 may be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the document processing methods in the embodiments of the present application, and the processor 102 executes the software programs and modules stored in the memory 104, thereby performing various functional applications and data processing, that is, implementing the document processing methods described above. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission means 106 is arranged to receive or transmit data via a network. The specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module for communicating with the internet wirelessly.
The display may be, for example, a touch-screen liquid crystal display (Liquid Crystal Display, LCD) that enables a user to interact with a user interface of the computer terminal 10 (or mobile device).
The hardware block diagram shown in fig. 1 may be used not only as an exemplary block diagram of the computer terminal 10 (or mobile device) described above, but also as an exemplary block diagram of the server described above, and in an alternative embodiment, fig. 2 shows, in block diagram form, one embodiment of using the computer terminal 10 (or mobile device) shown in fig. 1 described above as a computing node in a computing environment 201. Fig. 2 illustrates a block diagram of a computing environment, as shown in fig. 2, where the computing environment 201 includes a plurality of computing nodes (e.g., servers) operating on a distributed network (shown as 210-1, 210-2, …). The computing nodes each contain local processing and memory resources and end user 202 may run applications or store data remotely in computing environment 201. An application may be provided as a plurality of services 220-1,220-2,220-3 and 220-4 in computing environment 201, representing services "A", "D", "E", and "H", respectively.
End user 202 may provide and access services through a web browser or other software application on a client, in some embodiments, provisioning and/or requests of end user 202 may be provided to portal gateway 230. Ingress gateway 230 may include a corresponding agent to handle provisioning and/or request for services (one or more services provided in computing environment 201).
Services are provided or deployed in accordance with various virtualization techniques supported by the computing environment 201. In some embodiments, services may be provided according to virtual machine (Virtual Machine, VM) based virtualization, container-based virtualization, and/or the like. With virtual machine-based virtualization, a real computer is emulated by initializing a virtual machine, and programs and applications are executed without directly touching any real hardware resources. Whereas the virtual machine virtualizes the machine, with container-based virtualization a container can be started to virtualize on top of the entire operating system (Operating System, OS), so that multiple workloads may run on a single operating system instance.
In one embodiment based on container virtualization, several containers of a service may be assembled into one Pod (e.g., a Kubernetes Pod). For example, as shown in FIG. 2, the service 220-2 may be equipped with one or more Pods 240-1, 240-2, ..., 240-N (collectively referred to as Pods). A Pod may include an agent 245 and one or more containers 242-1, 242-2, ..., 242-M (collectively referred to as containers). One or more containers in the Pod handle requests related to one or more corresponding functions of the service, and the agent 245 generally controls network functions related to the service, such as routing, load balancing, and so on. Other services may similarly be equipped with Pods.
In operation, executing a user request from end user 202 may require invoking one or more services in computing environment 201, and executing one or more functions of one service may require invoking one or more functions of another service. As shown in FIG. 2, service "A"220-1 receives a user request of end user 202 from ingress gateway 230, service "A"220-1 may invoke service "D"220-2, and service "D"220-2 may request service "E"220-3 to perform one or more functions.
The computing environment may be a cloud computing environment, and the allocation of resources is managed by a cloud service provider, allowing the development of functions without considering the implementation, adjustment or expansion of the server. The computing environment allows developers to execute code that responds to events without building or maintaining a complex infrastructure. Instead of expanding a single hardware device to handle the potential load, the service may be partitioned to a set of functions that can be automatically scaled independently.
In another alternative embodiment, FIG. 3 illustrates in block diagram form one embodiment of using the computer terminal 10 (or mobile device) illustrated in FIG. 1 described above as a service grid. Fig. 3 illustrates a block diagram of a service grid, as shown in fig. 3, that is a service grid 300 that is primarily used to facilitate secure and reliable communication between a plurality of micro services, i.e., applications that are broken down into a plurality of smaller services or instances and run on different clusters/machines.
As shown in fig. 3, the micro-services may include an application service instance a and an application service instance B, which form a functional application layer of the service grid 300. In one embodiment, application service instance A runs in the form of container/process 308 on machine/workload container set 314 (Pod) and application service instance B runs in the form of container/process 310 on machine/workload container set 316 (Pod).
In one embodiment, application service instance a may be a commodity query service and application service instance B may be a commodity ordering service.
As shown in fig. 3, application service instance a and grid agent (sidecar) 303 coexist in machine workload container set 314 and application service instance B and grid agent 305 coexist in machine workload container 316. Grid agent 303 and grid agent 305 form a data plane layer (dataplane) of service grid 300. Wherein the grid agent 303 and the grid agent 305 are running in the form of containers/processes 304, 306, respectively, which may receive requests 312 for goods inquiry services, and which may be in bi-directional communication between the grid agent 303 and the application service instance a, and which may be in bi-directional communication between the grid agent 305 and the application service instance B. In addition, two-way communication is also possible between the grid agent 303 and the grid agent 305.
In one embodiment, the network traffic of application service instance A is routed through grid agent 303 to the appropriate destination, and the network traffic of application service instance B is routed through grid agent 305 to the appropriate destination. It should be noted that the network traffic mentioned herein includes, but is not limited to, Hypertext Transfer Protocol (HTTP), Representational State Transfer (REST), the high-performance, general-purpose open-source framework gRPC (Google Remote Procedure Call), the open-source in-memory data structure store (Redis), and the like.
In one embodiment, the functionality of the data plane layer may be extended by writing custom filters for the agents (Envoy proxies) in the service grid 300, which may be configured to enable the service grid to properly proxy service traffic for service interworking and service remediation. Grid agent 303 and grid agent 305 may be configured to perform at least one of the following functions: service discovery, health checking, routing, load balancing, authentication and authorization, and observability.
As shown in fig. 3, the service grid 300 also includes a control plane layer. The control plane layer may be a set of services running in a dedicated namespace, hosted by the hosting control plane component 301 in the machine/workload container set (machine/Pod) 302. As shown in fig. 3, managed control plane component 301 is in bi-directional communication with grid agent 303 and grid agent 305. Managed control plane component 301 is configured to perform some control management functions. For example, managed control plane component 301 receives telemetry data transmitted by grid agent 303 and grid agent 305, which may be further aggregated. In addition to these services, hosting control plane component 301 may also provide a user-facing application programming interface (API) to more easily manipulate network behavior, provide configuration data to grid agents 303 and 305, and the like.
In the above-described operating environment, the present application provides a document processing method as shown in fig. 4. Fig. 4 is a flowchart of a document processing method one according to embodiment 1 of the present application.
Step S402, receiving a target document;
as an alternative embodiment, the execution body of the method of this embodiment may be a terminal or a server for processing documents. For example, when the method is applied to a terminal that performs document processing, document processing in simple document scenarios can be realized in a lightweight manner; for another example, when the method is applied to a server, the abundant computing resources of the server can be called upon, or a relatively large and accurate document model can be used, so that the document can be identified more accurately.
The types of the terminals may be various, for example, a mobile terminal having a certain computing power, a fixed computer device having a recognition capability, or the like. The types of the servers may be various, for example, a local server or a virtual cloud server. The server may be a single computer device according to its computing power, or may be a computer cluster in which a plurality of computer devices are integrated.
It should be noted that the target document herein may be a document in various formats, for example: PDF, HTML, Word, Excel, txt, and the like. The target document may be a file processed based on IDP technology, from which the file information can be stripped and exported to a processing system as a document with a certain format.
As an alternative embodiment, in the above step S402, the target document may come from various sources: it may be a private document uploaded by an individual user, or a document selected from a web page or an electronic journal, etc. The number and length of the received target documents may also vary: a single long document may be received, or multiple target documents to be searched may be received at the same time, so that the document question-answering service is realized for multiple target documents simultaneously.
Step S404, converting the target document into a structured object;
as an alternative embodiment, in the above step S404, document structuring may organize the document content according to certain levels and relationships, for example by order, logical relationship, importance, and so on. The structured objects of the document may also include a variety of types, such as: title, body, paragraph, list, table, chart, etc. Because the contents of the document are classified and sorted according to certain rules, the organization of the document becomes clear and unified. Converting the target document into a structured object therefore improves the readability and maintainability of the document, allows the document content to be searched more clearly, provides more convenient conditions for processing, searching and analyzing the document, and helps the user find the required information more quickly.
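As a purely illustrative sketch (the application does not prescribe a concrete data model; all names below are assumptions), such a structured object could be represented as a tree of typed nodes, so that the later steps of locating nodes, taking a covering subtree and measuring a region's size reduce to ordinary tree operations:

```python
# Hypothetical structured-document node; field names are illustrative, not the
# application's actual schema.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DocNode:
    node_type: str                                   # e.g. "title", "paragraph", "table"
    text: str = ""                                   # text content carried by this node
    level: int = 0                                   # heading depth in the hierarchy
    children: List["DocNode"] = field(default_factory=list)
    parent: Optional["DocNode"] = field(default=None, repr=False)

    def add_child(self, child: "DocNode") -> "DocNode":
        """Attach a child node and keep the parent pointer consistent."""
        child.parent = self
        self.children.append(child)
        return child


def iter_subtree(node: DocNode):
    """Yield every node in the subtree rooted at `node`, in pre-order."""
    yield node
    for child in node.children:
        yield from iter_subtree(child)
```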
Step S406, receiving an inquiry sentence;
as an alternative embodiment, the step S406 receives an inquiry sentence, which may be a question posed based on the received target document. For example, it may be a sentence input by the user by voice or text. After receiving the query sentence, the query sentence can be split to obtain text content corresponding to the query sentence, and then the obtained text content is converted to obtain a corresponding text vector, so that the subsequent search in the structured object of the target document based on the text vector is facilitated, and the structured object corresponding to the query sentence is found out.
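A minimal sketch of this step, assuming some text-embedding function is available (the embedding model, the vector store and the function names below are assumptions; the application does not specify them), scoring stored node vectors against the query vector and keeping the top N:

```python
# Illustrative only: `embed` stands in for whatever text-vectorization model is
# actually used; it is an assumed placeholder, not a real API.
import math
from typing import Dict, List


def embed(text: str) -> List[float]:
    raise NotImplementedError("plug in the actual text-embedding service here")


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def top_n_node_ids(query: str,
                   node_vectors: Dict[str, List[float]],
                   n: int = 5) -> List[str]:
    """Rank stored node vectors by similarity to the query and keep the top N ids."""
    q = embed(query)
    ranked = sorted(node_vectors,
                    key=lambda node_id: cosine(q, node_vectors[node_id]),
                    reverse=True)
    return ranked[:n]
```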
Step S408, searching out a document area corresponding to the inquiry sentence from the structured object;
as an alternative embodiment, the searching the document area corresponding to the query sentence from the structured object in step S408 may include: determining a plurality of search results corresponding to the query sentence, wherein the search results comprise candidate titles or candidate paragraph contents, and the candidate titles and the candidate paragraph contents are obtained based on the search based on text vectors after the document conversion; based on the structured object, a tree structure with clear organization information and clear content theme classification is established, and under the condition that the structure corresponding to the structured object is the tree structure, nodes corresponding to a plurality of search results in the tree structure are determined; and determining a document area corresponding to the query sentence based on the nodes corresponding to the plurality of search results in the tree structure. Because of the establishment of the tree structure, the structure of the structured object is clearer, the corresponding nodes in the structured object are obtained by searching the query statement, and the corresponding areas in the document are determined based on the nodes, so that the query result of the query statement can be accurately positioned. The structure of the document is characterized in a tree structure mode, and the tree structure is simpler compared with the whole document. After determining a plurality of search results corresponding to the query sentences, the nodes corresponding to the search results in the tree structure can be directly used for displaying the area covered by each query sentence more intuitively based on the corresponding nodes, and the target area, namely the document area corresponding to the query sentences, is determined based on the covered area. Determining the document area is more intuitive and efficient than determining the document area based on the original document.
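Continuing the DocNode sketch above (again purely illustrative, with an invented id scheme), the identifiers returned by the search service might be resolved back to tree nodes with a lookup table built when the document is structured:

```python
# Assumed convention: every node is indexed under a dotted-path string id when
# the structured object is stored; the id scheme is hypothetical.
from typing import Dict, List


def build_node_index(root: DocNode) -> Dict[str, DocNode]:
    """Assign a dotted-path id to every node, e.g. "root.0.2" is the third child
    of the root's first child."""
    index: Dict[str, DocNode] = {}

    def walk(node: DocNode, path: str) -> None:
        index[path] = node
        for i, child in enumerate(node.children):
            walk(child, f"{path}.{i}")

    walk(root, "root")
    return index


def resolve_hits(hit_ids: List[str], index: Dict[str, DocNode]) -> List[DocNode]:
    """Map search-result ids back to the corresponding tree nodes, skipping unknown ids."""
    return [index[h] for h in hit_ids if h in index]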
As an alternative embodiment, when determining the document area corresponding to the query sentence based on the nodes corresponding to the plurality of search results in the tree structure, various manners may be adopted. For example, the document area may be determined directly from a node number threshold required for the document area. The node number threshold may be determined according to a statistical value, where the statistical value may be the number of nodes with which a relatively accurate answer to a question can be obtained. The node number threshold is acquired, a tree structure area whose number of nodes exceeds the node number threshold is determined, and the document area corresponding to the query sentence is determined based on that tree structure area. Because a larger number of matched nodes means a higher degree of matching between the document area and the query sentence, and because answering questions over a long document or multiple documents yields many search results, the tree structure area containing more matched nodes is selected from the search results, which further improves the matching degree between the finally determined document area and the corresponding query sentence. Therefore, the document area corresponding to the query sentence is determined according to the number of nodes included in the tree structure area; as long as the number of nodes included in the tree structure area exceeds the node number threshold, the determined tree structure area can be considered to satisfy the requirement of a high matching degree between question and answer, so that when the query sentence is used for question answering, a result with a higher matching degree to the query sentence can be obtained.
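One hedged way to read this step, reusing the DocNode and iter_subtree sketch above, is to walk upward from the best-ranked hit until enough of the matched nodes fall inside a single subtree; the traversal strategy and the threshold semantics are assumptions, not the application's mandated algorithm:

```python
# Hedged sketch of selecting a tree structure area by a node number threshold.
from typing import List, Optional


def covering_subtree(hits: List[DocNode], node_threshold: int) -> Optional[DocNode]:
    """Return the first ancestor of the top hit whose subtree covers at least
    `node_threshold` of the hit nodes, falling back to the document root."""
    if not hits:
        return None
    hit_ids = {id(n) for n in hits}
    candidate = hits[0]                      # best-ranked hit
    while True:
        covered = sum(1 for n in iter_subtree(candidate) if id(n) in hit_ids)
        if covered >= node_threshold or candidate.parent is None:
            return candidate
        candidate = candidate.parent
```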
As an alternative embodiment, in principle, when determining the document area corresponding to the query sentence based on the tree structure, the more nodes the tree structure includes, the more accurate the document area determined based on the tree structure area can match the answer of the query sentence. However, conversely, the larger the number of nodes, the larger the covered area, and the larger the resources consumed for determining the answer corresponding to the query sentence from the covered area, and the efficiency of returning the dialogue result is affected to some extent. Therefore, the size of the tree structure area can be controlled through the preset area threshold value besides determining the tree structure based on the node number threshold value, and the problem of low efficiency caused by overlarge tree structure area is avoided. Therefore, when determining the document area corresponding to the query sentence based on the tree structure area, it is possible to detect whether the tree structure area is smaller than the predetermined area threshold, and if the tree structure area is smaller than the predetermined area threshold as a result of the detection, it is determined that the area corresponding to the tree structure area in the target document is the document area corresponding to the query sentence. The number of the nodes included in the tree structure area can be effectively ensured through the threshold value based on the number of the nodes, namely, the accuracy of a dialogue result is ensured; on the other hand, through the preset area threshold, the whole size of the tree structure area can be effectively ensured, namely, the efficiency of returning the result based on the tree structure area is ensured. Therefore, based on the node number threshold value and the preset area threshold value, the document area corresponding to the query statement is considered in terms of accuracy and efficiency, so that the balance between accuracy and efficiency of the subsequent return dialogue result is realized.
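A small sketch of this second check, under the assumption that the "area" of a tree structure region is measured by the total text length of its subtree; the metric and the threshold value are invented for illustration, since the application does not fix them:

```python
# Hypothetical size check; the threshold value and the size metric are assumptions.
from typing import Optional

PREDETERMINED_AREA_THRESHOLD = 4000   # e.g. a character budget the model can handle


def region_text_if_small_enough(subtree_root: DocNode,
                                max_chars: int = PREDETERMINED_AREA_THRESHOLD) -> Optional[str]:
    """Concatenate the subtree's text and accept it as the document area only if
    it stays below the predetermined area threshold."""
    text = "\n".join(n.text for n in iter_subtree(subtree_root) if n.text)
    return text if len(text) < max_chars else None
```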
Step S410, based on the document area, determining the dialogue result corresponding to the inquiry sentence.
As an alternative embodiment, in the above step S410, when determining the dialogue result corresponding to the query sentence based on the document area, the following processing may be adopted: first, a prompt word corresponding to the document area is acquired, where the prompt word is used for prompting the output of a result corresponding to the query sentence in a dialogue form; then, based on the document area and the prompt word, the dialogue result corresponding to the query sentence is determined. It should be noted that the above-mentioned prompt words may be some fixed questions, and besides normal questions and answers, the prompt words may also carry spoken-language style, tone, attitude, emotional expression, and the like.
As an alternative embodiment, in the above step S410, the determined document area and the prompt word may be input into a large language model, and the large language model may combine the prompt word with the document area to produce a spoken-style, broken-up output, thereby providing the user with a more targeted, realistic and accurate document question-answering service in the form of a document dialogue.
As an alternative embodiment, the prompt word is obtained based on the determined document area. If the document content in the determined document area needs to be summarized, the prompt word may be, for example: "What is described in the following paragraph of text?". That is, the prompt word may be an instruction, for example an instruction for analyzing the content included in the determined document area so as to obtain a question that may correspond to the content of the document area or a description of that content. In this way, the prompt word enables the large language model to combine with the document area and better produce spoken, anthropomorphic question-answer dialogue output.
As an alternative embodiment, when determining the dialogue result corresponding to the query sentence based on the document area and the prompt word, various manners may be adopted. For example, the dialogue result corresponding to the query sentence may be determined based on a large language model, that is, by the following processing: inputting the prompt word and the document area into the large language model, wherein the large language model analyzes the document area based on the prompt of the prompt word to obtain an analysis result; and receiving the dialogue result fed back by the large language model, wherein the dialogue result is generated by the large language model based on dialogue words and the analysis result, and the dialogue words correspond to the prompt words.
Through the above processing, the retrieved document area is processed by the large language model; since the large language model is trained on a large amount of training data and has the capabilities of context understanding, summarization and generalization, the obtained result is more accurate. In addition, the prompt word and the document area are input into the large language model together, so that the document area can be well analyzed under the prompt of the prompt word, and the dialogue result obtained from the large language model is more targeted. Furthermore, after the large language model obtains the analysis result under the prompt of the prompt word, the dialogue result is generated based on the dialogue word and the analysis result, and the dialogue word corresponds to the prompt word, so that the dialogue result output by the large language model is more anthropomorphic, more conversational and more intelligent.
When determining the dialogue result corresponding to the query sentence based on the document region and the prompt word, the document region and the prompt word may be input into the large language model, which processes the document region under the prompt word, for example by analyzing, summarizing and generalizing the answer corresponding to the query sentence, and then outputs that answer. In addition, the prompt word corresponds to the query sentence and can be input to the large language model in a targeted manner; that is, the prompt word not only conveys the meaning of the query sentence but also satisfies the input requirements of the large language model.
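A minimal sketch of this step, assuming only a generic text-in/text-out model call; the function call_large_language_model, the prompt wording and the parameter names are assumptions rather than the application's actual interface:

```python
# `call_large_language_model` is a stand-in for the actual large-model dialogue
# service; it is not a real API.
def call_large_language_model(prompt: str) -> str:
    raise NotImplementedError("plug in the actual large language model service here")


def answer_from_region(query: str, region_text: str, hint_word: str) -> str:
    """Combine the prompt word, the retrieved document region and the query, and
    ask the model to answer in a conversational form grounded in the region."""
    prompt = (
        f"{hint_word}\n\n"
        f"Document region:\n{region_text}\n\n"
        f"Question: {query}\n"
        "Answer conversationally, using only the document region above."
    )
    return call_large_language_model(prompt)
```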
As an alternative embodiment, to display the dialogue result intuitively, after the dialogue result corresponding to the query sentence is determined based on the document area, the dialogue result output by the large language model may be displayed in the dialogue box corresponding to the query sentence input by the user. The form in which the dialogue result is output may also vary; for example, the dialogue result may pop up in a pop-up window, or speech may be played while the dialogue result is output, and so on, so as to achieve a more realistic document dialogue effect.
As an alternative embodiment, in order to show the search and output process of the dialogue result more clearly, the dialogue result may be matched in content against the determined document area to obtain the original expression of the dialogue result as recorded in the document area, and the position of the original expression may be marked in the target document in a highlighted manner. This facilitates displaying the retrieval process of the dialogue result, and the output result can be viewed together with the highlighted position in the original document so as to check the accuracy and reliability of the output result. It should be noted that the display area of the target document may be any displayable area, for example a preview area on the display interface. With this processing, since the dialogue result corresponding to the query sentence is displayed in the dialogue box, the target document is displayed in the preview area, and the matched original expression is highlighted, the dialogue result displayed in the dialogue box can be compared directly against the highlighted passage, which to a certain extent provides a check on the output dialogue result and thereby improves the user's acceptance of it.
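As an illustration only (the matching method is not specified by the application; a production system would likely use fuzzier alignment), the original expression could be located with a longest-common-substring style match and then highlighted at the returned offsets:

```python
# Naive matching sketch based on the standard library's SequenceMatcher.
from difflib import SequenceMatcher
from typing import Optional, Tuple


def locate_original_expression(dialogue_result: str,
                               document_text: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of the longest span of the dialogue result
    that appears verbatim in the document, or None if nothing matches."""
    match = SequenceMatcher(None, document_text, dialogue_result).find_longest_match(
        0, len(document_text), 0, len(dialogue_result))
    return (match.a, match.a + match.size) if match.size > 0 else None
```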
According to the embodiment, the method of searching the structured document information based on the query sentence is adopted, the target document is converted into the structured object, the corresponding document area in the structured object is determined based on the query sentence, the dialogue result corresponding to the query sentence is determined based on the document area, and because the document area is determined based on the structured object and corresponds to the query sentence, compared with the paragraphs determined in the related art and corresponds to the query sentence, the document area can comprise a plurality of paragraphs, namely, the determined document area can more comprehensively correspond to the query sentence and answer the query sentence more accurately, so that the aim of determining the dialogue result accurately corresponding to the query sentence from the document can be fulfilled, the technical effect of accurately answering based on the document is achieved, and the technical problem that feedback dialogue result is inaccurate when the dialogue is performed based on the document in the related art is solved.
In the above-described operating environment, the present application provides a document processing method as shown in fig. 5. Fig. 5 is a flowchart of a document processing method two according to embodiment 1 of the present invention. As shown in fig. 5, the method may include the steps of:
Step S502, displaying a document input box on a display interface;
step S504, receiving a target document in response to the operation of the document input box, and converting the target document into a structured object;
as an alternative embodiment, after the target document is input into the document input box, the target document is received, and the target document is converted into a structured object, so that the structure of the target document is clear, and the content of the target document is conveniently searched.
Step S506, displaying a dialog box on the display interface;
step S508, receiving an inquiry sentence in response to the input operation of the dialog box;
in response to the confirmation operation of the query sentence, a dialogue result corresponding to the query sentence is displayed in the dialogue box, wherein the dialogue result is determined based on the document area corresponding to the query sentence, and the document area is searched for from the structured object.
In the embodiment of the application, based on the display interface and the interactive operation, a mode of searching the structured document information based on the query statement is adopted, the target document is converted into the structured object, the corresponding document area in the structured object is determined based on the query statement, the dialogue result corresponding to the query statement is determined based on the document area, and because the document area corresponding to the query statement is determined based on the structured object, compared with the paragraphs corresponding to the query statement determined in the related art, the document area can comprise a plurality of paragraphs, namely, the determined document area can more comprehensively correspond to the query statement and answer the query statement more accurately, so that the purposes of displaying based on the display interface and interacting and determining the dialogue result accurately corresponding to the query statement from the document are achieved, the accurate question and answer based on the document are achieved, and the technical effect of intuitiveness is achieved, and the technical problem that feedback dialogue result is inaccurate when the dialogue is performed based on the document in the related art is solved.
In the above-described operating environment, the present application provides a document processing method as shown in fig. 6. Fig. 6 is a flowchart of a document processing method three according to embodiment 1 of the present invention. As shown in fig. 6, the method may include the steps of:
step S602, receiving a target document;
step S604, converting the target document into a structured object;
step S606, receiving an inquiry sentence;
step S608, searching out a document area corresponding to the inquiry sentence from the structured object;
step S610, a large language model is adopted to determine a dialogue result corresponding to the query sentence based on the document area.
In the embodiment of the application, a mode of searching structured document information based on an inquiry sentence is adopted, a target document is converted into a structured object, a corresponding document area in the structured object is determined based on the inquiry sentence, and a large language model is adopted to determine a dialogue result corresponding to the inquiry sentence based on the document area. On the one hand, because the document area is determined based on the structured object and corresponds to the query sentence, the document area is the context content related to the query sentence, and compared with the paragraphs determined in the related art and corresponding to the query sentence, the document area can comprise a plurality of paragraphs, namely, the determined document area can more comprehensively correspond to the query sentence and answer the query sentence more accurately; on the other hand, the large language model is adopted to analyze the document area to obtain the dialogue result corresponding to the query sentence, and the large language model is trained based on a large amount of data and has the capabilities of analysis, summarization and summarization, so that the dialogue result obtained based on the large language model is accurate and is personified, and the dialogue experience is further provided. Therefore, the technical problem that feedback dialogue results are inaccurate when the dialogue is conducted based on the document in the related technology is solved, and interaction experience is improved.
As an alternative embodiment, when the large language model is adopted to determine the dialogue result corresponding to the query sentence based on the document area, the large language model can be prompted to perform the tendentious analysis based on a certain prompt word so as to further improve the accuracy of the dialogue result obtained based on the large language model, and further obtain the result which is prone to the requirement of the user. For example, the following processing may be employed: firstly, acquiring a prompt word, wherein the prompt word is used for prompting a large language model to analyze a document area; analyzing the document area by adopting a large language model based on the prompt of the prompt word to obtain an analysis result; and receiving a dialogue result fed back by the large language model, wherein the dialogue result is generated by the large language model based on dialogue words and analysis results, and the dialogue words correspond to the prompt words.
Since the above-mentioned document area is retrieved based on the query sentence, it contains the content associated with the query sentence as well as the related context, so the content involved may still be relatively large. However, searches based on query sentences often carry a more general tendency. For example, when the large language model analyzes the above document area, a more general tendency demand may be "what does this document area mainly say", "what are the key points of this document area", and the like. Based on these tendencies, adding the corresponding prompt words allows the large language model to take the tendency into account when analyzing the document region, so that the analysis result given by the large language model better meets the tendency requirement. After the analysis result meeting the tendency requirement is obtained, when it is fed back to the user, the dialogue result can be generated by combining the dialogue word corresponding to the prompt word with the analysis result, and the dialogue result is then fed back to the user. For example, the dialogue result may be "this document area mainly describes ...", "the key points of this document area are ...", and so on. Through this processing, the prompt word and the document area are input into the large language model together, and the document area can be well analyzed under the prompt of the prompt word, so that the dialogue result obtained from the large language model is more targeted. After the large language model obtains the analysis result under the prompt of the prompt word, the dialogue result is generated based on the dialogue word and the analysis result, and the dialogue word corresponds to the prompt word, so that the dialogue result output by the large language model is more anthropomorphic, more conversational and more intelligent.
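A short sketch of the prompt-word/dialogue-word correspondence described above; the templates are invented for illustration, and call_large_language_model is the same placeholder used earlier:

```python
# Hypothetical pairing of tendency prompt words with matching dialogue-word
# templates used to phrase the final reply.
HINT_TO_DIALOGUE = {
    "What does this document area mainly say?": "This document area mainly says: {analysis}",
    "What are the key points of this document area?": "The key points of this document area are: {analysis}",
}


def tendency_answer(region_text: str, hint_word: str) -> str:
    """Analyze the region under the given tendency prompt word, then wrap the
    analysis in the dialogue word that corresponds to that prompt word."""
    analysis = call_large_language_model(f"{hint_word}\n\n{region_text}")
    template = HINT_TO_DIALOGUE.get(hint_word, "{analysis}")
    return template.format(analysis=analysis)
```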
Based on the above embodiment and the optional embodiment, the present invention further provides an optional implementation manner.
In the related art, a method for retrieving a document generally extracts the content of the document, builds a database based on the extracted text information, stores the text information, and then builds a search tool on top of it. The stored document content still has to be searched and a corresponding search tool has to be built according to the content of the text information, which wastes a great deal of time. When questions about document content are answered with a large-language-model dialogue service such as ChatGPT, text exceeding the processable range of the model cannot be input, so in the related art only short documents can be searched, while long documents and cross-document searches are not supported. In addition, when the large-model dialogue service is applied to document searching, the related art extracts text content from the document and then searches the extracted text for an answer to the question; however, the answer fed back is a paragraph of the document corresponding to the question, and the fed-back paragraph sometimes cannot fully correspond to the question, so the feedback dialogue result is inaccurate.
Based on the above problems, this alternative embodiment provides a document knowledge base question-answer retrieval system based on a large language model and structured documents. Fig. 7 is a frame diagram of an alternative document knowledge base question-answer retrieval system according to an embodiment of the present invention. As shown in fig. 7, the document knowledge base question-answer retrieval system includes the following contents:
The services involved in the document knowledge base question-answer retrieval system include the IDP document structuring service, the document structured search service and the large model dialogue service. The three services function as follows:
IDP document structuring service: converts documents (PDF/HTML/Word/Excel/txt) into a structured object containing information such as text, styles, tables, picture layouts, association relations and KV pairs.
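For illustration only, one possible shape of such a structured object is sketched below; the class and field names (DocNode, StructuredDocument, bbox and so on) are assumptions chosen to mirror the information listed above, not the actual IDP output schema.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DocNode:
    node_id: int
    node_type: str                                # e.g. "title", "paragraph", "table", "figure", "kv"
    text: str = ""
    style: dict = field(default_factory=dict)     # font, size, emphasis, ...
    bbox: Optional[tuple] = None                  # (page, x0, y0, x1, y1) layout coordinates
    parent_id: Optional[int] = None               # association relation (tree edge)
    children: list = field(default_factory=list)  # ids of child nodes

@dataclass
class StructuredDocument:
    doc_id: str
    nodes: dict = field(default_factory=dict)     # node_id -> DocNode

    def subtree(self, root_id):
        # Collect a node and all of its descendants; used later when the
        # document area is formed from a subtree of the structured tree object.
        out, stack = [], [root_id]
        while stack:
            node = self.nodes[stack.pop()]
            out.append(node)
            stack.extend(node.children)
        return out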
Document structured search service: stores the document structured object and builds a search service over it, including but not limited to via SQL or an open-source search engine (Elasticsearch).
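A sketch of such a search service is given below, reusing the StructuredDocument sketch above and assuming the official elasticsearch Python client; the server address, index name and granularity (one search document per structured node) are illustrative assumptions.

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # address is an assumption

def index_structured_document(doc):
    # One search document per structured node, so that hits come back at node granularity.
    for node in doc.nodes.values():
        es.index(
            index="doc_nodes",
            id=f"{doc.doc_id}:{node.node_id}",
            document={
                "doc_id": doc.doc_id,
                "node_id": node.node_id,
                "node_type": node.node_type,
                "text": node.text,
                "parent_id": node.parent_id,
            },
        )

def search_nodes(query_text, top_n=5):
    # Full-text match over node text; a vector field could be added for semantic search.
    resp = es.search(index="doc_nodes", query={"match": {"text": query_text}}, size=top_n)
    return [hit["_source"] for hit in resp["hits"]["hits"]]

In practice the indexed fields could also carry the layout coordinates of each node so that hits can later be located on the page for highlighting.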
Large model dialogue service (the large language model mentioned above): a contextual question-and-answer service built on models including GPT and GPT-derivative models.
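The role this service plays in the flow below can be summarised by a minimal interface; the name DialogueService and its single ask method are assumptions, not an actual API.

from typing import Protocol

class DialogueService(Protocol):
    def ask(self, context: str, question: str) -> str:
        """Answer `question` using only the supplied `context` (the document area)."""
        ...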
The three services are combined to realize question-and-answer retrieval over the document knowledge base. The flow is as follows: users are distinguished by user ID; after a user uploads a document, the uploaded document is converted into a document structured object by the IDP service, the converted document structured object is placed into a storage database and a search database for storage, and the document structured search engine of the search database subsequently searches the document structured object; the user then asks about the document content through a query sentence, for example: "what does this document teach?".
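The flow can be pictured as the sketch below; the concrete service callables are injected as parameters, so this only shows the shape of the pipeline described above, and all function names are assumptions.

def ingest(user_id, raw_file, idp_structure, store, index):
    # Upload -> IDP structuring -> storage database -> search database.
    structured = idp_structure(raw_file)
    store(user_id, structured)
    index(structured)
    return structured.doc_id

def answer(query_sentence, search, build_document_area, dialogue_ask):
    hits = search(query_sentence)                       # TopN structured hits
    document_area = build_document_area(hits)           # minimal hit context (see below)
    return dialogue_ask(document_area, query_sentence)  # large model dialogue service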
After receiving a question input by the user, the system splits the question to obtain the text body content corresponding to the question, and then converts the text body content into a text vector. The text body content represents the main content of the question input by the user. The text vector corresponding to the text body content is input into the document structured search engine, which searches the stored document structured objects for answers to the input query sentence and obtains the TopN results returned by the search service. The TopN results are N results, corresponding to different nodes in the document structured object, that match the input question. In order to keep the context of the document content consistent, the system selects, according to the nodes of the document structured tree object covered by the TopN results, the parent title and parent paragraph involved by the first result (the result that best matches the input question), and takes this content as a subtree of the document structured tree object. If building the subtree fails, then in order to keep the document content continuous, text content of a fixed interval is intercepted around the position of the first result in the original text (because the obtained result is structured information, the text content can be selected from the context of the original document position where the structured information is located; the length of the fixed interval can be obtained empirically). Finally, the text content selected from the context of the original document position of the returned first result forms the text content of the targeted document area.
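The document-area selection just described can be sketched as follows, reusing the StructuredDocument sketch above; the fixed window length and the rule of walking up to the nearest ancestor title are illustrative assumptions.

FIXED_WINDOW = 1500  # characters; the interval length is obtained empirically per the description

def build_document_area(doc, top_hit_ids, full_text):
    # `top_hit_ids` are the TopN node ids returned by the search service, best first.
    node = doc.nodes[top_hit_ids[0]]

    # Walk up to the nearest ancestor title so that the parent title and its
    # paragraphs are included, keeping the context of the document consistent.
    anchor = node
    while anchor.parent_id is not None and anchor.node_type != "title":
        anchor = doc.nodes[anchor.parent_id]

    area_text = "\n".join(n.text for n in doc.subtree(anchor.node_id) if n.text)
    if area_text.strip():
        return area_text

    # Fallback: if no usable subtree can be formed, intercept a fixed interval of
    # the original text around the position of the first result.
    pos = full_text.find(node.text) if node.text else -1
    start = max(0, (pos if pos >= 0 else 0) - FIXED_WINDOW // 2)
    return full_text[start:start + FIXED_WINDOW]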
After the system obtains the regional document text content hit by the question-answer content, the prompt word is added to the question-answer content (that is, to the document text content). The prompt word can be derived from the determined content of the document area, as a question or description corresponding to that content, so that it guides the large language model to combine the document area with the question-answer content and produce a spoken, personified question-answer dialogue. The generation of prompt words can also be provided as an independent prompt-word injection service, and the system can choose not to inject a prompt word. For example, the prompt word may be "what is taught in the text of this paragraph below?"; the question-answer content and the prompt word content are then input into the large model dialogue service, for example: "what is taught in the text of this paragraph below? [original document content]".
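The optional prompt-word injection can be as simple as the sketch below; the example prompt wording follows the text above, and whether to inject is modelled as a switch (an assumption about how the injection service is exposed).

def compose_dialogue_input(query_sentence, document_area, inject_prompt=True):
    # When injection is enabled, the prompt word is placed between the question
    # and the original document content; otherwise only the two are sent.
    prompt_word = "What is taught in the text of this paragraph below?" if inject_prompt else ""
    parts = [query_sentence, prompt_word, document_area]
    return "\n\n".join(p for p in parts if p)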
After the result of the large model dialogue service is obtained, for example a return of "this document teaches xxx", the returned result is rule-matched against the originally provided context content, and based on the matching result the hit vocabulary or paragraphs in the original document content are obtained. For the hit vocabulary or paragraphs in the original document, the system returns the result of the large model dialogue service to the dialog box of the user, and visually highlights the document part at the coordinates of the hit vocabulary or paragraph in the window, so that the user can conveniently check whether the result of the question-answer dialogue is reliable and whether the provided result is reasonable.
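One possible rule-matching pass is sketched below; the sentence-overlap heuristic and the threshold are assumptions, and the returned character offsets stand in for the coordinates the front end would use for highlighting.

import re

def find_hits(answer, document_area, full_text, min_overlap=6):
    # Split the document area into rough sentences and report those whose
    # beginning reappears in the model's answer, together with their offsets
    # in the original text so that the front end can highlight them.
    hits = []
    answer_lower = answer.lower()
    for sentence in re.split(r"[。.!?\n]", document_area):
        sentence = sentence.strip()
        if len(sentence) < min_overlap:
            continue
        if sentence[:min_overlap].lower() in answer_lower:
            start = full_text.find(sentence)
            if start >= 0:
                hits.append((start, start + len(sentence), sentence))
    return hits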
It should be noted that, in the above alternative embodiment, the processing link based on the IDP technology can obtain a document structured object with finer granularity than an ordinary document processing flow, which may include, for example, the text, coordinates, style, paragraphs, tables, KV pairs and entity types of the document content;
based on the structured object and the document search service, the minimum hit context of the user question is obtained, which avoids problems such as hitting the model input limit when the whole document content is fed into the large language model at once, and enables long-document and cross-document question-answer retrieval;
a complete flow covering document structuring, document storage and search, and document question answering is constructed; based on the large language model and the prompt-word injection mode, solutions such as document question-answer understanding and document search are provided, which reduces the difficulty individual users face in reading documents, understanding them and grasping key document information, and realizes communication between the user and the document.
Thus, by the above alternative embodiments, the following effects are effectively achieved:
by combining the IDP service with the document structured content, a knowledge base question-answering system can be built over any document of the user;
through the document structured object and the minimum hit context of the search query, the problem that text exceeding the processable range of the model cannot be input into the large language model dialogue service is solved, and document dialogue processing for long documents and multiple documents is realized;
large language model technologies including GPT and GPT-J are introduced, achieving more natural, personified question answering; the user question and the hit context content are output by the large language model in a spoken form, and output of content outside the document by the large language model is restricted.
The system realizes question answering over a personal document knowledge base and can be extended with functions including but not limited to document search and document recommendation.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but it should be understood by those skilled in the art that the present application is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required in the present application.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus a necessary general hardware platform, but that it may also be implemented by means of hardware. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method of the embodiments of the present application.
Example 2
According to an embodiment of the present application, there is also provided a document processing apparatus for implementing the above document processing method. Fig. 8 is a block diagram of a first document processing apparatus provided according to embodiment 2 of the present invention. As shown in fig. 8, the apparatus includes: the first receiving module 80, the first converting module 82, the second receiving module 84, the first searching module 86, and the first determining module 88, which are described below.
A first receiving module 80 for receiving a target document; a first conversion module 82 connected to the first receiving module 80 for converting the target document into a structured object; a second receiving module 84 connected to the first converting module 82 for receiving the query sentence; a first search module 86, connected to the second receiving module 84, for searching out a document area corresponding to the query sentence from the structured object; the first determining module 88 is connected to the first searching module 86, and is configured to determine a dialogue result corresponding to the query sentence based on the document area.
Here, the first receiving module 80, the first converting module 82, the second receiving module 84, the first searching module 86, and the first determining module 88 correspond to steps S402 to S410 in embodiment 1; the above modules implement the same examples and application scenarios as the corresponding steps, but are not limited to what is disclosed in embodiment 1. It should be noted that the above modules or units may be hardware components, or software components stored in a memory (e.g., the memory 104) and processed by one or more processors (e.g., the processors 102a, 102b, ..., 102n), or may run as a part of the apparatus in the computer terminal 10 provided in embodiment 1.
It should be noted that the preferred implementation of this example is the same as that provided in embodiment 1 in terms of scheme, application scenario and implementation process, but is not limited to what is provided in embodiment 1.
Example 3
According to an embodiment of the present application, there is also provided a document processing apparatus for implementing the above document processing method. Fig. 9 is a block diagram of a second document processing apparatus provided according to embodiment 3 of the present invention. As shown in fig. 9, the apparatus includes: the first display module 90, the third receiving module 92, the second display module 94, the fourth receiving module 96, and the third display module 98, which are described below.
A first display module 90 for displaying a document input box on a display interface; a third receiving module 92 connected to the first display module 90 for receiving a target document and converting the target document into a structured object in response to an operation on the document input box; a second display module 94 connected to the third receiving module 92 for displaying a dialog box on the display interface; a fourth receiving module 96 connected to the second display module 94 for receiving an inquiry sentence in response to an input operation to the dialog box; and a third display module 98 connected to the fourth receiving module 96 for displaying a dialogue result corresponding to the query sentence in the dialogue box in response to the confirmation operation of the query sentence, wherein the dialogue result is determined based on the document area corresponding to the query sentence, and the document area is searched for from the structured object.
Here, the first display module 90, the third receiving module 92, the second display module 94, the fourth receiving module 96, and the third display module 98 correspond to steps S502 to S510 in embodiment 1; the above modules implement the same examples and application scenarios as the corresponding steps, but are not limited to what is disclosed in embodiment 1. It should be noted that the above modules or units may be hardware components, or software components stored in a memory (e.g., the memory 104) and processed by one or more processors (e.g., the processors 102a, 102b, ..., 102n), or may run as a part of the apparatus in the computer terminal 10 provided in embodiment 1.
It should be noted that the preferred implementation of this example is the same as that provided in embodiment 1 in terms of scheme, application scenario and implementation process, but is not limited to what is provided in embodiment 1.
Example 4
According to an embodiment of the present application, there is also provided a document processing apparatus for implementing the above document processing method. Fig. 10 is a block diagram of a third document processing apparatus provided according to embodiment 4 of the present invention. As shown in fig. 10, the apparatus includes: the fifth receiving module 100, the second converting module 102, the sixth receiving module 104, the second searching module 106, and the second determining module 108, which are described below.
A fifth receiving module 100 for receiving a target document; a second conversion module 102, connected to the fifth receiving module 100, for converting the target document into a structured object; a sixth receiving module 104, coupled to the second converting module 102, for receiving an inquiry sentence; a second search module 106, coupled to the sixth receiving module 104, for searching the document area corresponding to the query sentence from the structured object; and a second determining module 108, coupled to the second searching module 106, for determining a dialogue result corresponding to the query sentence based on the document area using the large language model.
It should be noted that the fifth receiving module 100, the second converting module 102, the sixth receiving module 104, the second searching module 106, and the second determining module 108 correspond to steps S602 to S610 in embodiment 1; the above modules implement the same examples and application scenarios as the corresponding steps, but are not limited to what is disclosed in embodiment 1. It should be noted that the above modules or units may be hardware components, or software components stored in a memory (e.g., the memory 104) and processed by one or more processors (e.g., the processors 102a, 102b, ..., 102n), or may run as a part of the apparatus in the computer terminal 10 provided in embodiment 1.
It should be noted that the preferred implementation of this example is the same as that provided in embodiment 1 in terms of scheme, application scenario and implementation process, but is not limited to what is provided in embodiment 1.
Example 5
Embodiments of the present application may provide a computer terminal, which may be any one of a group of computer terminals. Alternatively, in the present embodiment, the above-described computer terminal may be replaced with a terminal device such as a mobile terminal.
Alternatively, in this embodiment, the above-mentioned computer terminal may be located in at least one network device among a plurality of network devices of the computer network.
In this embodiment, the above-described computer terminal may execute the program code of the following steps in the document processing method: receiving a target document; converting the target document into a structured object; receiving an inquiry sentence; searching a document area corresponding to the inquiry sentence from the structured object; based on the document area, a dialogue result corresponding to the query sentence is determined.
Alternatively, fig. 11 is a block diagram of a computer terminal according to an embodiment of the present application. As shown in fig. 11, the computer terminal may include: one or more (only one is shown) processors 1102, memory 1104, a memory controller, and a peripheral interface, wherein the peripheral interface interfaces with the radio frequency module, the audio module, and the display.
The memory may be used to store software programs and modules, such as program instructions/modules corresponding to the document processing methods and apparatuses in the embodiments of the present application, and the processor executes the software programs and modules stored in the memory, thereby executing various functional applications and data processing, that is, implementing the document processing methods described above. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory remotely located relative to the processor, which may be connected to the computer terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor may call the information and the application program stored in the memory through the transmission device to perform the following steps: receiving a target document; converting the target document into a structured object; receiving an inquiry sentence; searching a document area corresponding to the inquiry sentence from the structured object; based on the document area, a dialogue result corresponding to the query sentence is determined.
Optionally, the above processor may further execute program code for: searching a document area corresponding to the inquiry sentence from the structured object comprises the following steps: determining a plurality of search results corresponding to the query sentence, wherein the search results include candidate titles or candidate paragraph content; under the condition that the structure corresponding to the structured object is a tree structure, determining nodes corresponding to a plurality of search results in the tree structure; and determining a document area corresponding to the query sentence based on the nodes corresponding to the plurality of search results in the tree structure.
Optionally, the above processor may further execute program code for: determining a document area corresponding to the query sentence based on nodes corresponding to the plurality of search results in the tree structure, including: acquiring a threshold value of the number of nodes; determining a tree structure area comprising a number of nodes exceeding a threshold number of nodes; based on the tree structure region, a document region corresponding to the query sentence is determined.
Optionally, the above processor may further execute program code for: determining a document area corresponding to the query sentence based on the tree structure area, including: detecting whether the tree structure area is smaller than a preset area threshold value; and determining the corresponding area of the tree structure area in the target document as the document area corresponding to the query sentence under the condition that the tree structure area is smaller than the preset area threshold value as the detection result.
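The node-count and area checks in the steps above can be sketched as follows, again reusing the StructuredDocument sketch given earlier; the threshold values and the use of character length as the area measure are assumptions.

from typing import Optional

NODE_COUNT_THRESHOLD = 1   # threshold value of the number of nodes (assumed)
AREA_CHAR_LIMIT = 4000     # preset area threshold, measured here in characters (assumed)

def choose_document_area(doc, matched_node_ids) -> Optional[str]:
    if not matched_node_ids:
        return None
    # Count how many matched search-result nodes fall under each top-level subtree.
    counts = {}
    for nid in matched_node_ids:
        anchor = doc.nodes[nid]
        while anchor.parent_id is not None:
            anchor = doc.nodes[anchor.parent_id]
        counts[anchor.node_id] = counts.get(anchor.node_id, 0) + 1

    best_root, best_count = max(counts.items(), key=lambda kv: kv[1])
    if best_count <= NODE_COUNT_THRESHOLD:
        return None  # the tree structure area does not exceed the node-count threshold

    region_text = "\n".join(n.text for n in doc.subtree(best_root) if n.text)
    # Map the tree structure area to a document area only if it is smaller
    # than the preset area threshold.
    return region_text if len(region_text) <= AREA_CHAR_LIMIT else None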
Optionally, the above processor may further execute program code for: based on the document area, determining a dialogue result corresponding to the query sentence, including: acquiring a prompt word corresponding to a document area, wherein the prompt word is used for prompting the output of a result corresponding to an inquiry sentence in a dialogue form; based on the document area and the prompt word, a dialogue result corresponding to the inquiry sentence is determined.
Optionally, the above processor may further execute program code for: inputting the prompt words and the document area into a large language model, wherein the large language model analyzes the document area based on the prompts of the prompt words to obtain an analysis result obtained after the document area is analyzed under the prompts; and receiving a dialogue result fed back by the large language model, wherein the dialogue result is generated by the large language model based on dialogue words and analysis results, and the dialogue words correspond to the prompt words.
Optionally, the above processor may further execute program code for: after determining the dialogue result corresponding to the query sentence based on the document area, the method further comprises: the dialog result is displayed in a dialog box corresponding to the query sentence.
Optionally, the above processor may further execute program code for: the method further comprises the following steps: matching the dialogue result with the document area to obtain an original expression of the dialogue result in the document area; the location of the original expression in the target document is displayed in a highlighted manner.
The processor may call the information and the application program stored in the memory through the transmission device to perform the following steps: displaying a document input box on a display interface; receiving a target document in response to an operation on the document input box, and converting the target document into a structured object; displaying a dialog box on a display interface; receiving an inquiry sentence in response to an input operation to the dialog box; in response to a confirmation operation of the query sentence, a dialogue result corresponding to the query sentence is displayed in the dialogue box, wherein the dialogue result is determined based on a document area corresponding to the query sentence, and the document area is searched from the structured object.
The processor may call the information and the application program stored in the memory through the transmission device to perform the following steps: receiving a target document; converting the target document into a structured object; receiving an inquiry sentence; searching a document area corresponding to the inquiry sentence from the structured object; and determining a dialogue result corresponding to the query sentence based on the document area by adopting the large language model.
Optionally, the above processor may further execute program code for: determining a dialogue result corresponding to the query sentence based on the document area by adopting a large language model, wherein the method comprises the following steps: acquiring a prompt word, wherein the prompt word is used for prompting the large language model to analyze the document area; analyzing the document area by adopting a large language model based on the prompt of the prompt word to obtain an analysis result; and receiving a dialogue result fed back by the large language model, wherein the dialogue result is generated by the large language model based on dialogue words and analysis results, and the dialogue words correspond to the prompt words.
By adopting the embodiment of the application, a scheme for searching structured document information based on the query sentence is provided: the target document is converted into a structured object, the document area corresponding to the query sentence is determined in the structured object, and the dialogue result corresponding to the query sentence is determined based on the document area.
It will be appreciated by those skilled in the art that the structure shown in fig. 11 is only illustrative, and the computer terminal may be a smart phone (such as an Android phone or an iOS phone), a tablet computer, a palm computer, a mobile internet device (Mobile Internet Device, MID), a PAD, or the like. Fig. 11 does not limit the structure of the above electronic device. For example, the computer terminal may also include more or fewer components (e.g., a network interface, a display device, etc.) than shown in fig. 11, or have a configuration different from that shown in fig. 11.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program for instructing a terminal device to execute in association with hardware, the program may be stored in a computer readable storage medium, and the storage medium may include: flash disk, read-Only Memory (ROM), random-access Memory (Random Access Memory, RAM), magnetic or optical disk, and the like.
Example 6
Embodiments of the present application also provide a computer-readable storage medium. Alternatively, in this embodiment, the computer-readable storage medium may be used to store the program code for executing the document processing method provided in embodiment 1.
Alternatively, in this embodiment, the above-mentioned computer-readable storage medium may be located in any one of the computer terminals in the computer terminal group in the computer network, or in any one of the mobile terminals in the mobile terminal group.
Optionally, in the present embodiment, the computer readable storage medium is configured to store program code for performing the steps of: receiving a target document; converting the target document into a structured object; receiving an inquiry sentence; searching a document area corresponding to the inquiry sentence from the structured object; based on the document area, a dialogue result corresponding to the query sentence is determined.
Optionally, in the present embodiment, the computer readable storage medium is further configured to store program code for performing the steps of: searching a document area corresponding to the inquiry sentence from the structured object comprises the following steps: determining a plurality of search results corresponding to the query sentence, wherein the search results include candidate titles or candidate paragraph content; under the condition that the structure corresponding to the structured object is a tree structure, determining nodes corresponding to a plurality of search results in the tree structure; and determining a document area corresponding to the query sentence based on the nodes corresponding to the plurality of search results in the tree structure.
Optionally, in the present embodiment, the computer readable storage medium is further configured to store program code for performing the steps of: determining a document area corresponding to the query sentence based on nodes corresponding to the plurality of search results in the tree structure, including: acquiring a threshold value of the number of nodes; determining a tree structure area comprising a number of nodes exceeding a threshold number of nodes; based on the tree structure region, a document region corresponding to the query sentence is determined.
Optionally, in the present embodiment, the computer readable storage medium is further configured to store program code for performing the steps of: determining a document area corresponding to the query sentence based on the tree structure area, including: detecting whether the tree structure area is smaller than a preset area threshold value; and determining the corresponding area of the tree structure area in the target document as the document area corresponding to the query sentence under the condition that the tree structure area is smaller than the preset area threshold value as the detection result.
Optionally, in the present embodiment, the computer readable storage medium is further configured to store program code for performing the steps of: based on the document area, determining a dialogue result corresponding to the query sentence, including: acquiring a prompt word corresponding to a document area, wherein the prompt word is used for prompting the output of a result corresponding to an inquiry sentence in a dialogue form; based on the document area and the prompt word, a dialogue result corresponding to the inquiry sentence is determined.
Optionally, in the present embodiment, the computer readable storage medium is further configured to store program code for performing the steps of: after determining the dialogue result corresponding to the query sentence based on the document area, the method further comprises: the dialog result is displayed in a dialog box corresponding to the query sentence.
Optionally, in the present embodiment, the computer readable storage medium is further configured to store program code for performing the steps of: matching the dialogue result with the document area to obtain an original expression of the dialogue result in the document area; the location of the original expression in the target document is displayed in a highlighted manner.
Optionally, in the present embodiment, the computer readable storage medium is further configured to store program code for performing the steps of: inputting the prompt words and the document area into a large language model, wherein the large language model analyzes the document area based on the prompts of the prompt words to obtain an analysis result obtained after the document area is analyzed under the prompts; and receiving a dialogue result fed back by the large language model, wherein the dialogue result is generated by the large language model based on dialogue words and analysis results, and the dialogue words correspond to the prompt words.
Optionally, in the present embodiment, the computer readable storage medium is configured to store program code for performing the steps of: receiving a target document; converting the target document into a structured object; receiving an inquiry sentence; searching a document area corresponding to the inquiry sentence from the structured object; and determining a dialogue result corresponding to the query sentence based on the document area by adopting the large language model.
Optionally, in the present embodiment, the computer readable storage medium is configured to store program code for performing the steps of: determining a dialogue result corresponding to the query sentence based on the document area by adopting a large language model, wherein the method comprises the following steps: acquiring a prompt word, wherein the prompt word is used for prompting the large language model to analyze the document area; analyzing the document area by adopting a large language model based on the prompt of the prompt word to obtain an analysis result; and receiving a dialogue result fed back by the large language model, wherein the dialogue result is generated by the large language model based on dialogue words and analysis results, and the dialogue words correspond to the prompt words.
The foregoing embodiment numbers of the present application are merely for describing, and do not represent advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present application, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other manners. The above-described apparatus embodiments are merely exemplary; the division of units is only a logical functional division, and there may be other division manners in actual implementation, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling or communication connection shown or discussed between the components may be indirect coupling or communication connection through some interfaces, units or modules, and may be in electrical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a removable hard disk, a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The foregoing is merely a preferred embodiment of the present application. It should be noted that those of ordinary skill in the art may make several improvements and modifications without departing from the principles of the present application, and such improvements and modifications shall also be regarded as falling within the protection scope of the present application.

Claims (13)

1. A document processing method, comprising:
receiving a target document;
converting the target document into a structured object;
receiving an inquiry sentence;
searching a document area corresponding to the inquiry sentence from the structured object;
and determining a dialogue result corresponding to the inquiry statement based on the document area.
2. The method of claim 1, wherein the searching for a document region from the structured object that corresponds to the query statement comprises:
determining a plurality of search results corresponding to the query sentence, wherein the search results comprise candidate titles or candidate paragraph contents;
determining nodes corresponding to the plurality of search results in the tree structure under the condition that the structure corresponding to the structured object is the tree structure;
and determining a document area corresponding to the query statement based on the nodes corresponding to the search results in the tree structure.
3. The method of claim 2, wherein the determining a document region corresponding to the query statement based on the nodes corresponding to the plurality of search results in the tree structure comprises:
acquiring a threshold value of the number of nodes;
determining a tree structure area comprising a number of nodes exceeding the threshold number of nodes;
and determining the document area corresponding to the query statement based on the tree structure area.
4. A method according to claim 3, wherein said determining the document area to which the query statement corresponds based on the tree structure area comprises:
detecting whether the tree structure area is smaller than a preset area threshold value;
and under the condition that the detection result is that the tree structure area is smaller than a preset area threshold value, determining the area corresponding to the tree structure area in the target document as the document area corresponding to the query sentence.
5. The method of claim 1, wherein the determining a dialog result corresponding to the query sentence based on the document area comprises:
obtaining a prompt word corresponding to the document area, wherein the prompt word is used for prompting the output of a result corresponding to the inquiry sentence in a dialogue form;
and determining the dialogue result corresponding to the query statement based on the document area and the prompt word.
6. The method of claim 5, wherein the determining the dialog result corresponding to the query sentence based on the document area and the prompt word comprises:
inputting the prompt words and the document area into a large language model, wherein the large language model analyzes the document area based on the prompt of the prompt words to obtain an analysis result obtained after the document area is analyzed under the prompt;
and receiving the dialogue result fed back by the large language model, wherein the dialogue result is generated by the large language model based on dialogue words and the analysis result, and the dialogue words correspond to the prompt words.
7. The method according to any one of claims 1 to 6, further comprising, after determining a dialogue result corresponding to the query sentence based on the document area:
and displaying the dialogue result in a dialogue box corresponding to the inquiry statement.
8. The method of claim 7, wherein the method further comprises:
matching the dialogue result with the document area to obtain an original expression of the dialogue result in the document area;
the location of the original expression in the target document is displayed in a highlighted manner.
9. A document processing method, comprising:
displaying a document input box on a display interface;
receiving a target document in response to an operation on the document input box, and converting the target document into a structured object;
displaying a dialog box on the display interface;
receiving an inquiry sentence in response to an input operation to the dialog box;
and in response to the confirmation operation of the query sentence, displaying a dialogue result corresponding to the query sentence in the dialogue box, wherein the dialogue result is determined based on a document area corresponding to the query sentence, and the document area is searched from the structured object.
10. A document processing method, comprising:
receiving a target document;
converting the target document into a structured object;
receiving an inquiry sentence;
searching a document area corresponding to the inquiry sentence from the structured object;
and determining a dialogue result corresponding to the query sentence based on the document area by adopting a large language model.
11. The method of claim 10, wherein the determining a dialog result corresponding to the query sentence based on the document region using a large language model comprises:
acquiring a prompt word, wherein the prompt word is used for prompting the large language model to analyze the document area;
analyzing the document area by adopting the large language model based on the prompt of the prompt word to obtain an analysis result;
and receiving the dialogue result fed back by the large language model, wherein the dialogue result is generated by the large language model based on dialogue words and the analysis result, and the dialogue words correspond to the prompt words.
12. A computer terminal, comprising:
a memory storing an executable program;
a processor for executing the program, wherein the program when run performs the method of any of claims 1 to 11.
13. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored executable program, wherein the executable program when run controls a device in which the storage medium is located to perform the method of any one of claims 1 to 11.
CN202310995388.2A 2023-08-08 2023-08-08 Document processing method, computer terminal and computer readable storage medium Pending CN117251538A (en)

Priority Application (1)
CN202310995388.2A, priority date 2023-08-08, filing date 2023-08-08: Document processing method, computer terminal and computer readable storage medium

Publication (1)
CN117251538A, published 2023-12-19

Family ID: 89134004

Country Status (1)
CN (1) CN117251538A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination