WO2023029230A1

WO2023029230A1 - AI and RPA-based file annotation method and apparatus, device, and medium

Info

Publication number: WO2023029230A1
Application number: PCT/CN2021/132175
Authority: WO
Inventors: 杨子杰; 汪冠春; 胡一川; 褚瑞; 李玮
Original assignee: 北京来也网络科技有限公司; 来也科技(北京)有限公司
Priority date: 2021-09-01
Filing date: 2021-11-22
Publication date: 2023-03-09
Also published as: CN113836090A

Abstract

The present invention relates to the fields of AI and RPA, and provides an AI and RPA-based file annotation method and apparatus. The method comprises: an RPA system obtains a file annotation request; the RPA system generates a response result corresponding to the file annotation request in response to the file annotation request; the RPA system draws, according to the response result, a target picture corresponding to a file to be annotated; the RPA system determines a regional range of text annotation in the target picture in response to a mouse event; and the RPA system determines a text annotation result in the regional range according to first text information obtained by performing optical character recognition (OCR) on the file to be annotated and position information corresponding to text fragments of the first text information.

Description

File annotation method, apparatus, device, and medium based on AI and RPA

Cross References to Related Applications

This application is based on a Chinese patent application with application number 202111021971.0 and a filing date of September 1, 2021, and claims the priority of this Chinese patent application. The entire content of this Chinese patent application is hereby incorporated by reference into this application.

Technical Field

The present disclosure relates to the fields of artificial intelligence (AI for short) and robotic process automation (RPA for short), and in particular to a file annotation method, apparatus, device, and medium based on AI and RPA.

Background

RPA uses specific "robot software" to simulate human operations on computers and automatically execute process tasks according to rules.

AI is a technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence.

With the growing adoption of RPA, more and more companies use RPA to help employees complete repetitive tasks. However, model training still requires a large amount of manual file annotation to obtain training data. For example, training data is obtained from a large number of manually annotated PDF files or pictures, and document structure information and visual information are modeled on that data. The general document pre-training model LayoutLM, for instance, lets the model perform multi-modal alignment in the pre-training stage.

However, the above file annotation approach can neither select non-contiguous text nor extract text from pictures, and it does not record the position information of the text in the document, so it cannot meet the needs of model training.

Summary of the Invention

The present disclosure aims to solve one of the technical problems in the related art at least to a certain extent.

To this end, the present disclosure proposes a file annotation method, apparatus, device, and medium based on AI and RPA, in which the RPA system determines the area range of a text annotation in the target picture and the text annotation result within that area range. This enables the extraction of text information from pictures and the selection of non-contiguous characters in the text. It also yields the text information within the annotated area range together with the position information of the text segments in that text information, which meets the needs of model training.

The embodiment of the first aspect of the present disclosure proposes a file annotation method based on AI and RPA, including: the RPA system obtains a file annotation request, wherein the file annotation request is used to annotate the file to be annotated; the RPA system, in response to the file annotation request, generates a response result corresponding to the file annotation request; the RPA system draws a target picture corresponding to the file to be annotated according to the response result; the RPA system determines, in response to a mouse event, the area range of the text annotation in the target picture; and the RPA system determines a text annotation result within the area range according to first text information obtained by performing optical character recognition (OCR) on the file to be annotated and position information corresponding to each text segment of the first text information.

The embodiment of the second aspect of the present disclosure proposes a file annotation apparatus based on AI and RPA. The file annotation apparatus is applied to the RPA system and includes: an acquisition module, used to obtain a file annotation request, wherein the file annotation request is used to annotate the file to be annotated; a generation module, used to generate, in response to the file annotation request, a response result corresponding to the file annotation request; a drawing module, used to draw a target picture corresponding to the file to be annotated according to the response result; a first determination module, used to determine, in response to a mouse event, the area range of the text annotation in the target picture; and a second determination module, used to determine a text annotation result within the area range according to first text information obtained by performing optical character recognition (OCR) on the file to be annotated and position information corresponding to each text segment of the first text information.

The embodiment of the third aspect of the present disclosure proposes an electronic device, including a memory, a processor, and a computer program stored in the memory and operable on the processor. When the processor executes the computer program, it implements the method described in the embodiment of the first aspect above.

The embodiment of the fourth aspect of the present disclosure provides a non-transitory computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the method as described in the above-mentioned embodiment of the first aspect of the present disclosure is implemented.

The embodiment of the fifth aspect of the present disclosure provides a computer program product, including a computer program. When the computer program is executed by a processor, the method as described in the above-mentioned embodiment of the first aspect of the present disclosure is implemented.

The technical solutions provided by the embodiments of the present disclosure include the following beneficial effects:

The RPA system obtains a file annotation request, wherein the file annotation request is used to annotate the file to be annotated; the RPA system, in response to the file annotation request, generates a response result corresponding to the file annotation request; the RPA system draws a target picture corresponding to the file to be annotated according to the response result; the RPA system determines, in response to a mouse event, the area range of the text annotation in the target picture; and the RPA system determines the text annotation result within the area range according to first text information obtained by performing optical character recognition (OCR) on the file to be annotated and position information corresponding to each text segment of the first text information. Therefore, by determining the area range of the text annotation in the target picture and the text annotation result within that area range, the RPA system realizes the extraction of text information from the picture and the selection of non-contiguous characters in the text. At the same time, it obtains the text information within the annotated area range and the position information of the text segments in that text information, which meets the needs of model training.

Additional aspects and advantages of the disclosure will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the disclosure.

Brief Description of the Drawings

The above and/or additional aspects and advantages of the present disclosure will become apparent and understandable from the following description of the embodiments in conjunction with the accompanying drawings, wherein:

FIG. 1 is a schematic flow diagram of an AI and RPA-based file labeling method provided by an embodiment of the present disclosure;

FIG. 2 is a schematic flow diagram of another AI and RPA-based file labeling method provided by an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of the position information of the area range relative to the target sub-picture corresponding to the sub-file to be annotated, provided by an embodiment of the present disclosure;

FIG. 4 is a schematic flow diagram of another AI and RPA-based file labeling method provided by an embodiment of the present disclosure;

FIG. 5 is a schematic flow diagram of another AI and RPA-based file labeling method provided by an embodiment of the present disclosure;

FIG. 6 is a schematic flow diagram of another AI and RPA-based file labeling method provided by an embodiment of the present disclosure;

FIG. 7 is a schematic structural diagram of an AI and RPA-based file tagging device provided by an embodiment of the present disclosure;

FIG. 8 shows a block diagram of an exemplary electronic device suitable for use in implementing embodiments of the present disclosure.

Detailed Description

Embodiments of the present disclosure are described in detail below, examples of which are illustrated in the drawings, in which the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present disclosure and should not be construed as limiting the present disclosure.

The following describes the AI and RPA-based file annotation method, apparatus, device, and medium of the embodiments of the present disclosure with reference to the accompanying drawings.

FIG. 1 is a schematic flowchart of an AI and RPA-based document labeling method provided by an embodiment of the present disclosure.

The file tagging method based on AI and RPA provided in the embodiment of the present disclosure can be applied to the device for tagging files based on AI and RPA in the embodiment of the present disclosure, and the device can be configured in an electronic device. Wherein, the electronic device may be a personal computer, a mobile terminal, etc., and the mobile terminal is, for example, a mobile phone, a tablet computer, a personal digital assistant, and other hardware devices with various operating systems.

As shown in Figure 1, the AI and RPA-based document annotation method may include the following steps:

In step 101, the RPA system obtains a file annotation request; wherein, the file annotation request is used to annotate the file to be annotated.

In the embodiment of the present disclosure, the user can send a file annotation request to the RPA system through an interactive interface, so that the RPA system annotates the file to be annotated according to the file annotation request. It should be noted that the file annotation request is used to annotate the file to be annotated.

In step 102, the RPA system generates a response result corresponding to the file annotation request in response to the file annotation request.

Further, according to the obtained file annotation request, the RPA system generates a response result corresponding to the request. The response result may include: the file to be annotated corresponding to the file annotation request; the converted picture corresponding to the file to be annotated; and the first text information corresponding to the file to be annotated, together with the position information corresponding to each text segment in the first text information, both obtained by optical character recognition (OCR for short). The position information corresponding to a text segment may include the position information of each character and word in the text. The position of each character or word may be expressed relative to the page, for example as the coordinates of the four vertices of the character or word within the page.
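By way of illustration only, the response result described above could be modeled as follows. This is a minimal sketch rather than the disclosed implementation: the class names TextSegment and ResponseResult, the field names, and the choice to store each segment's position as a left/top/right/bottom box per page are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextSegment:
    """One OCR text segment with its position on a page (hypothetical layout)."""
    text: str
    page_no: int   # 1-based index of the sub-file (page) the segment belongs to
    left: float    # box edges, expressed relative to the page
    top: float
    right: float
    bottom: float


@dataclass
class ResponseResult:
    """Hypothetical container for what the response result may include."""
    file_id: str                   # identifies the file to be annotated
    page_images: List[str]         # converted picture(s), one per sub-file
    segments: List[TextSegment] = field(default_factory=list)

    def first_text(self) -> str:
        # The "first text information" is taken here as all segment texts in order.
        return "".join(segment.text for segment in self.segments)


result = ResponseResult(
    file_id="demo.pdf",
    page_images=["page1.png", "page2.png"],
    segments=[
        TextSegment("Tender", 1, 10, 20, 70, 40),
        TextSegment("No.", 1, 75, 20, 100, 40),
    ],
)
```

Keeping the page number on every segment makes it straightforward to later pick out the text that belongs to a single sub-file.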

Step 103, the RPA system draws the target picture corresponding to the file to be marked according to the response result.

As an example, the to-be-labeled file in the response result may include one or more to-be-labeled sub-files, and the target picture corresponding to the to-be-labeled file may be drawn in different ways according to the number of to-be-labeled sub-files.

As an example, when the file to be annotated includes multiple sub-files to be annotated, the RPA system can draw the target sub-picture corresponding to each sub-file according to the text information corresponding to that sub-file and the position information corresponding to each text segment in the text information, and then splice the target sub-pictures to obtain the target picture corresponding to the file to be annotated.

As another example, when the file to be annotated includes a single sub-file to be annotated, the RPA system can draw the target picture corresponding to the file to be annotated according to the first text information corresponding to the file and the position information corresponding to each text segment in the first text information.

In step 104, the RPA system determines the area range of the text annotation in the target image in response to the mouse event.

In the embodiment of the present disclosure, the RPA system can determine the area range of the text annotation in the target picture according to the mouse event. For example, when the mouse event sequentially includes: a mouse click event, a mouse move event, and a mouse lift event, the area range of the text marked by the mouse event may be determined.

In step 105, the RPA system determines the text labeling result within the area according to the first text information obtained by performing optical character recognition (OCR) on the document to be marked and the position information corresponding to each text segment of the first text information.

In the embodiment of the present disclosure, the sub-file to be annotated to which the area range belongs can be determined according to the coordinate information of the area range. The second text information corresponding to that sub-file, and the position information corresponding to each text segment of the second text information, are then obtained from the first text information and from the position information corresponding to each text segment of the first text information. Finally, according to the position information of the area range relative to the sub-file to be annotated, the text annotation result within the area range is determined from the second text information and the position information of its text segments.

As an application scenario, for example, in tender announcements and red-headed documents, the document labeling method of the embodiment of the present disclosure can convert unstructured long text into structured data, and assist users to complete the intelligent extraction of key information of documents.

In summary, the RPA system obtains a file annotation request, wherein the file annotation request is used to annotate the file to be annotated; the RPA system, in response to the file annotation request, generates a response result corresponding to the file annotation request; the RPA system draws the target picture corresponding to the file to be annotated according to the response result; the RPA system determines, in response to a mouse event, the area range of the text annotation in the target picture; and the RPA system determines the text annotation result within the area range according to the first text information obtained by performing optical character recognition (OCR) on the file to be annotated and the position information corresponding to each text segment of the first text information. Therefore, by determining the area range of the text annotation in the target picture and the text annotation result within that area range, the RPA system realizes the extraction of text information from the picture and the selection of non-contiguous characters in the text. At the same time, it obtains the text information within the annotated area range and the position information of the text segments in that text information, which meets the needs of model training.

In order to obtain the text information within the annotated area range and the position information of the text segments in that text information, FIG. 2 shows a schematic flowchart of another AI and RPA-based file annotation method provided by an embodiment of the present disclosure. In this embodiment, the sub-file to be annotated to which the area range belongs is determined first. Then, according to the position information of the area range relative to that sub-file and the position of the sub-file within the file to be annotated, the text annotation result within the area range is determined from the first text information and the position information corresponding to each text segment of the first text information. The embodiment shown in FIG. 2 may include the following steps:

In step 201, the RPA system obtains a file annotation request; wherein, the file annotation request is used to annotate the file to be annotated.

In step 202, the RPA system generates a response result corresponding to the file annotation request in response to the file annotation request.

Step 203, the RPA system draws the target picture corresponding to the file to be marked according to the response result.

In step 204, the RPA system determines the area range of the text annotation in the target image in response to the mouse event.

Step 205, the RPA system determines the sub-file to be marked to which the area range belongs according to the vertex coordinate information of the area range and the height information of the sub-file to be marked in the file to be marked.

In the embodiment of the present disclosure, the RPA system can preset the height information of each sub-file to be annotated. According to the vertex coordinate information of the area range (for example, its upper-left vertex), the RPA system determines the height of that vertex relative to the origin of the target picture. By comparing this height with the height information of the target sub-picture corresponding to each sub-file to be annotated, the RPA system can determine the sub-file to be annotated to which the area range belongs. For example, if the height of the vertex of the area range relative to the origin of the target picture is greater than the height of one target sub-picture and smaller than the combined height of two target sub-pictures, it can be determined that the area range belongs to the second sub-file to be annotated.
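The comparison described above can be sketched as a lookup over cumulative page heights. This is an illustrative sketch under stated assumptions: the function name find_page_no is hypothetical, the vertical offset is assumed to be measured from the top of the first target sub-picture, and a vertex lying exactly on a page boundary is assigned to the following page.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import Sequence


def find_page_no(y_offset: float, page_heights: Sequence[float]) -> int:
    """Return the 1-based page whose vertical span contains y_offset.

    y_offset is the height of the region's vertex relative to the origin of
    the spliced target picture; page_heights are the preset heights of the
    target sub-pictures, in page order.
    """
    if y_offset < 0:
        raise ValueError("vertex lies above the target picture")
    # Cumulative bottoms: the first page ends at h1, the second at h1 + h2, ...
    bottoms = list(accumulate(page_heights))
    index = bisect_right(bottoms, y_offset)
    if index >= len(bottoms):
        raise ValueError("vertex lies below the last page")
    return index + 1


# A vertex 900 units down, with pages 800 units tall, falls on the second page.
page = find_page_no(900, [800, 800, 800])
```

When every page has the same height, this reduces to taking the integer part of y_offset / pageHeight and adding one.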

In step 206, the RPA system determines the location information of the region range relative to the target sub-picture corresponding to the to-be-marked sub-file to which the region range belongs.

For example, as shown in FIG. 3, take a PDF file as the file to be annotated. From outside to inside, the elements on the page are the window object window.document and the drawing object canvas on which the PDF file is drawn. The position of the canvas relative to the document is given by left and top, and there is no gap between a page and the canvas. The canvas holds the target sub-pictures (page1, page2, etc.) corresponding to the multiple sub-files to be annotated in the PDF file. Taking page2 as the target sub-picture corresponding to the sub-file to which the area range belongs, and (x, y) as the coordinates of the upper-left corner of the area range, the position information of the area range relative to page2 is:

relativeLeft = x - left
relativeRight = x - left + width
relativeTop = y - top - pageHeight * (PageNo - 1)
relativeBottom = relativeTop + height

Here, pageHeight is the height of one target sub-picture and PageNo is the number of the page to which the area range belongs (2 in this example). width and height are the width and height of the area range, which can be calculated from the start and end coordinates of the annotation. For example, if the start coordinates of the area range are (x, y) and the end coordinates are (x1, y1), the width of the area range can be |x1 - x| and the height can be |y1 - y|.
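The four relations above translate directly into code. The following is a minimal sketch: the function name to_page_relative and the RelativeBox type are hypothetical, and all pages are assumed to share the same height, as the pageHeight * (PageNo - 1) term implies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelativeBox:
    """Region position relative to the target sub-picture of its page."""
    left: float
    top: float
    right: float
    bottom: float


def to_page_relative(x: float, y: float, width: float, height: float,
                     canvas_left: float, canvas_top: float,
                     page_height: float, page_no: int) -> RelativeBox:
    """Convert a region given in document coordinates to page coordinates.

    (x, y) is the upper-left corner of the region; canvas_left and canvas_top
    are the canvas offsets relative to the document (left and top in the
    text); page_no is 1-based.
    """
    relative_left = x - canvas_left
    # Subtract the canvas offset and the heights of all preceding pages.
    relative_top = y - canvas_top - page_height * (page_no - 1)
    return RelativeBox(
        left=relative_left,
        top=relative_top,
        right=relative_left + width,
        bottom=relative_top + height,
    )


box = to_page_relative(x=150, y=1000, width=200, height=50,
                       canvas_left=50, canvas_top=100,
                       page_height=800, page_no=2)
```

With these sample values the region starts 100 units from the left edge and 100 units from the top of page2.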

In step 207, according to the position information, the RPA system determines the text annotation result within the area range from the first text information and the position information corresponding to each text segment of the first text information.

In some embodiments, the RPA system determines, according to the position of the sub-file to be annotated to which the area range belongs within the file to be annotated, the second text information in the first text information that corresponds to that sub-file. According to the correspondence between the second text information and the first text information, the RPA system determines the position information corresponding to each text segment of the second text information from the position information corresponding to each text segment of the first text information. According to the position information of the area range relative to the target sub-picture corresponding to the sub-file to be annotated, the RPA system determines the third text information, that is, the part of the second text information that lies within the area range. According to the correspondence between the third text information and the second text information, the RPA system determines the position information corresponding to each text segment of the third text information from the position information corresponding to each text segment of the second text information. The RPA system then uses the third text information and the position information corresponding to each of its text segments as the text annotation result within the area range.

That is to say, after the RPA system determines the sub-file to be annotated to which the area range belongs, it can determine the second text information corresponding to that sub-file from the first text information according to the position of the sub-file within the file to be annotated. For example, if the RPA system determines that the sub-file to which the area range belongs is the second page of the file to be annotated, the second text information corresponding to the second page can be determined from the first text information. Next, the RPA system obtains the position information of each text segment of the second text information from that of the first text information, selects as third text information the part of the second text information that falls within the area range, obtains the position information of each text segment of the third text information from that of the second text information, and takes the third text information together with this position information as the text annotation result within the area range.
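The narrowing from first to second to third text information can be sketched as two successive filters. This is an illustrative sketch only: the names Segment and annotate_region are hypothetical, and the rule that a segment counts as "within the area range" when its box lies entirely inside the region is an assumption, since the text does not fix the containment criterion.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Segment:
    """An OCR text segment with a page-relative bounding box (assumed layout)."""
    text: str
    page_no: int
    left: float
    top: float
    right: float
    bottom: float


def annotate_region(segments: Sequence[Segment], page_no: int,
                    region: Tuple[float, float, float, float]
                    ) -> Tuple[str, List[Segment]]:
    """Return the text and segments that fall inside a page-relative region.

    region is (left, top, right, bottom) relative to the page's sub-picture.
    """
    r_left, r_top, r_right, r_bottom = region
    # "Second text information": segments on the page the region belongs to.
    page_segments = [s for s in segments if s.page_no == page_no]
    # "Third text information": page segments fully inside the region
    # (full containment is an assumption of this sketch).
    selected = [
        s for s in page_segments
        if s.left >= r_left and s.right <= r_right
        and s.top >= r_top and s.bottom <= r_bottom
    ]
    return "".join(s.text for s in selected), selected


segments = [
    Segment("A", 1, 10, 10, 30, 30),
    Segment("B", 2, 10, 10, 30, 30),
    Segment("C", 2, 40, 10, 60, 30),
    Segment("D", 2, 10, 200, 30, 220),
]
text, picked = annotate_region(segments, page_no=2, region=(0, 0, 100, 50))
```

The returned pair carries both the selected text and the position information of each selected segment, which is the shape of annotation result the embodiment describes.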

In the embodiment of the present disclosure, the RPA system can label and save the position information of the region range relative to the target sub-picture of the sub-file to be marked to which the region range belongs, and the text labeling results of the region range, as the training data of the model , to meet the needs of model training. For example, it can be used as training data for the general document pre-training model.

As an application scenario, the general document pre-training model can combine document structure information and visual information for multi-modal alignment. This model can be applied to tasks such as form understanding, bill understanding, and document image classification.

In the embodiment of the present disclosure, steps 201-204 may be implemented in any one of the embodiments of the present disclosure, which is not limited in the embodiment of the present disclosure, and will not be repeated here.

In summary, the RPA system determines the sub-file to be annotated to which the area range belongs according to the vertex coordinate information of the area range and the height information of the sub-files to be annotated in the file to be annotated; the RPA system determines the position information of the area range relative to the target sub-picture corresponding to that sub-file; and, according to this position information, the RPA system determines the text annotation result within the area range from the first text information and the position information corresponding to each text segment of the first text information. In this way, the text annotation result within the area range can be determined accurately, so that the text information within the annotated area range and the position information of the text segments in that text information can be obtained.

In order to accurately determine the area range of the text annotation in the target picture, and thereby realize the extraction of text information from the picture and the selection of non-contiguous characters in the text, FIG. 4 shows a schematic flowchart of another AI and RPA-based file annotation method provided by an embodiment of the present disclosure. In this embodiment, the area range of the text annotation in the target picture can be determined through a mouse click event, a mouse move event, and a mouse lift event. The embodiment shown in FIG. 4 may include the following steps:

In step 401, the RPA system acquires a file annotation request; wherein, the file annotation request is used to annotate the file to be annotated.

In step 402, the RPA system generates a response result corresponding to the file annotation request in response to the file annotation request.

Step 403, the RPA system draws the target picture corresponding to the file to be marked according to the response result.

In step 404, the RPA system monitors the mouse event of the target image; wherein, the mouse event sequentially includes: a mouse click event, a mouse move event, and a mouse lift event.

In the embodiment of the present disclosure, the RPA system can monitor the mouse events of the target picture through a listener function. When the mouse events include, in sequence, a mouse click event (mousedown event), a mouse move event (mousemove event), and a mouse lift event (mouseup event), the RPA system can determine that an area range for text annotation has been selected in the target picture.
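As a rough sketch of the sequencing rule above, the following function scans a stream of events and reports a selection only when mousedown, mousemove, and mouseup occur in that order. The event representation (a tuple of event name and coordinates) and the function name track_selection are assumptions for illustration; in a browser the same logic would live in listeners registered for these three events.

```python
from typing import Iterable, Optional, Tuple

Point = Tuple[float, float]
Event = Tuple[str, float, float]  # (event name, x, y): hypothetical representation


def track_selection(events: Iterable[Event]) -> Optional[Tuple[Point, Point]]:
    """Return (start, end) of the first complete drag, or None if there is none."""
    start: Optional[Point] = None
    moved = False
    for kind, x, y in events:
        if kind == "mousedown":
            start, moved = (x, y), False      # a press begins a new candidate
        elif kind == "mousemove" and start is not None:
            moved = True                      # movement while the button is held
        elif kind == "mouseup":
            if start is not None and moved:
                return start, (x, y)          # full down -> move -> up sequence
            start, moved = None, False        # a plain click selects nothing
    return None


drag = track_selection([
    ("mousedown", 10, 20),
    ("mousemove", 50, 60),
    ("mouseup", 110, 80),
])
```

A click without movement returns None, so it is not mistaken for an area selection.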

Step 405, the RPA system determines the first coordinate of the area range according to the mouse click event.

Furthermore, the RPA system can use the coordinates of the mouse click event as the starting coordinate of the area range, that is, the first coordinate, by monitoring the mouse click event.

Step 406, the RPA system determines the second coordinates of the area range according to the mouse moving event and the mouse lifting event.

Further, the RPA system can determine the end coordinate of the area range, that is, the second coordinate, by monitoring the mouse movement event and the mouse lift event.

Step 407, the RPA system determines the height value and width value of the area range according to the first coordinate and the second coordinate.

For example, the abscissa in the first coordinate may be subtracted from the abscissa in the second coordinate, and the absolute value of the subtraction result may be used as the width value of the area range. The ordinate in the first coordinate is subtracted from the ordinate in the second coordinate, and the absolute value of the subtraction result is used as the height value of the area range.

In step 408, the RPA system uses the closed area defined by the first coordinate, the second coordinate, and the height value and width value of the area range as the area range of the text annotation in the target picture.

In the embodiment of the present disclosure, the RPA system adds the width value of the area range to the abscissa of the first coordinate to obtain the third coordinate, and adds the height value of the area range to the ordinate of the first coordinate to obtain the fourth coordinate. The closed area enclosed by the first coordinate, the second coordinate, the third coordinate, and the fourth coordinate is used as the area range of the text annotation in the target picture. It should be noted that, in order to determine the text annotation result within the area range more accurately, only one text annotation area range exists in the target picture at any one time, and the area range can be rendered with a <div> tag.
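The width and height computation, and the rectangle it defines, can be sketched as below. The function name region_from_drag is hypothetical. Taking the minimum of the two abscissas and ordinates as the upper-left corner is an assumption added here so that a drag in any direction yields the same rectangle; the text itself only describes adding the width and height to the first coordinate.

```python
from typing import Tuple

Point = Tuple[float, float]


def region_from_drag(first: Point, second: Point
                     ) -> Tuple[float, float, float, float]:
    """Return (left, top, width, height) of the rectangle spanned by a drag.

    first is the mousedown coordinate and second is the mouseup coordinate.
    """
    (x0, y0), (x1, y1) = first, second
    width = abs(x1 - x0)    # |x1 - x|, as in the description
    height = abs(y1 - y0)   # |y1 - y|
    # Normalizing the corner is an assumption of this sketch, so that dragging
    # up or to the left still produces a rectangle with positive extent.
    left, top = min(x0, x1), min(y0, y1)
    return left, top, width, height


region = region_from_drag((10, 20), (110, 80))
```

The resulting left, top, width, and height map naturally onto the style properties of an absolutely positioned element such as the <div> mentioned above.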

In step 409, the RPA system determines the text labeling result within the region according to the first text information obtained by performing optical character recognition (OCR) on the document to be marked and the position information corresponding to each text segment of the first text information.

In the embodiment of the present disclosure, steps 401-403, 409 may be implemented in any one of the embodiments of the present disclosure, which is not limited in the embodiment of the present disclosure, and will not be repeated here.

In summary, the RPA system monitors the mouse events of the target picture, wherein the mouse events include, in sequence, a mouse click event, a mouse move event, and a mouse lift event; the RPA system determines the first coordinate of the area range according to the mouse click event; the RPA system determines the second coordinate of the area range according to the mouse move event and the mouse lift event; and the RPA system uses the closed area defined by the first coordinate, the second coordinate, and the height value and width value of the area range as the area range of the text annotation in the target picture. Therefore, in response to the mouse events, the RPA system can accurately determine the area range of the text annotation in the target picture, realizing the extraction of text information from the picture and the selection of non-contiguous characters in the text.

In order to obtain the response result corresponding to the file annotation request, FIG. 5 shows a schematic flowchart of another AI and RPA-based file annotation method provided by an embodiment of the present disclosure. In this embodiment, when the file to be annotated is not a picture, the file can first be converted into a converted picture. Character recognition is then performed on the converted picture by optical character recognition to obtain the first text information corresponding to the file to be annotated and the position information corresponding to each text segment of the first text information. The embodiment shown in FIG. 5 may include the following steps:

In step 501, the RPA system obtains a file annotation request; wherein, the file annotation request is used to annotate the file to be annotated.

Step 502, the RPA system obtains the file to be marked corresponding to the file marking request according to the file marking request.

In the embodiment of the present disclosure, the RPA system may acquire the file to be marked corresponding to the file marking request according to the identification of the file to be marked in the file marking request. Wherein, the file marking request may include the identification of the file to be marked.

In step 503, the RPA system performs image conversion on the file to be marked, and obtains the converted picture corresponding to the file to be marked.

In the embodiment of the present disclosure, if the file to be marked is not a picture, the file to be marked may be converted into a picture.

As an example, the file to be marked may be converted into a picture through a document-to-picture conversion technique, and the resulting picture is used as the converted picture. For example, a PDF file can be converted into a picture through the pdf.js plug-in.

In step 504, the RPA system performs character recognition on the converted image based on optical character recognition, so as to obtain the first text information corresponding to the document to be marked and the position information corresponding to each text segment of the first text information.

Furthermore, the RPA system performs character recognition on the converted picture based on optical character recognition, uses the recognized text information as the first text information corresponding to the file to be marked, and uses the recognized position information (for example, the x-axis and y-axis coordinates of the top, bottom, left, and right vertices of each character or word in the page) as the position information corresponding to each text segment of the first text information. It should be noted that, to prevent the picture from being unclear, the converted picture may be enlarged by a preset multiple, and the enlarged picture may be sent to the optical character recognition interface for character recognition.
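The disclosure does not state how coordinates recognized on the enlarged picture relate to the original picture. One plausible approach, assumed here purely for illustration, is to divide each coordinate by the preset multiple. The `Segment` type and the `to_page_coordinates` function below are hypothetical names, and the sample OCR output is invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    text: str
    left: float
    top: float
    right: float
    bottom: float


def to_page_coordinates(segments, scale):
    """Map boxes recognized on the enlarged picture back to the original picture."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return [
        Segment(s.text, s.left / scale, s.top / scale, s.right / scale, s.bottom / scale)
        for s in segments
    ]


# Hypothetical OCR output for a picture enlarged by a preset multiple of 2.
enlarged = [Segment("Invoice", 200, 80, 480, 120)]
restored = to_page_coordinates(enlarged, scale=2)
```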

Step 505, the RPA system takes the file to be marked, the first text information corresponding to the file to be marked, and the location information corresponding to each text segment of the first text information as a response result corresponding to the file marking request.

In the embodiment of the present disclosure, the RPA system may use the file to be marked corresponding to the file marking request, the first text information corresponding to the file to be marked, and the position information corresponding to each text segment of the first text information as the response result corresponding to the file marking request.
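Steps 502 to 505 can be read as a small pipeline. The Python sketch below shows one way the response result might be assembled; `ResponseResult`, `build_response`, the `file_id` key, and the three injected callables are all hypothetical stand-ins for the file lookup, the picture conversion, and the OCR interface, none of which the disclosure specifies. Joining segments with a space is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # left, top, right, bottom
Segment = Tuple[str, Box]                # recognized text and its position


@dataclass
class ResponseResult:
    file: bytes
    first_text: str
    positions: List[Segment]


def build_response(
    request: dict,
    load_file: Callable[[str], bytes],
    to_picture: Callable[[bytes], bytes],
    run_ocr: Callable[[bytes], List[Segment]],
) -> ResponseResult:
    file = load_file(request["file_id"])  # step 502: fetch the file by its identification
    picture = to_picture(file)            # step 503: convert the file into a picture
    segments = run_ocr(picture)           # step 504: recognize text and positions
    first_text = " ".join(text for text, _ in segments)
    return ResponseResult(file, first_text, segments)  # step 505


# Stub callables stand in for real storage, conversion, and OCR services.
result = build_response(
    {"file_id": "contract.pdf"},
    load_file=lambda file_id: b"%PDF-stub",
    to_picture=lambda data: b"picture-stub",
    run_ocr=lambda picture: [("Total", (0, 0, 50, 10)), ("42", (60, 0, 80, 10))],
)
```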

Step 506, the RPA system draws the target picture corresponding to the document to be marked according to the response result.

In step 507, the RPA system determines the area range of the text annotation in the target picture in response to the mouse event.

In step 508, the RPA system determines the text labeling result within the region according to the first text information obtained by performing optical character recognition (OCR) on the document to be marked and the position information corresponding to each text segment of the first text information.

In the embodiment of the present disclosure, steps 501 and 506-508 may each be implemented in the manner of any embodiment of the present disclosure; this is not limited in the embodiment of the present disclosure and is not repeated here.

To sum up, the RPA system obtains the file to be marked corresponding to the file marking request; performs image conversion on the file to be marked to obtain the converted picture corresponding to the file to be marked; performs character recognition on the converted picture based on optical character recognition to obtain the first text information corresponding to the file to be marked and the position information corresponding to each text segment of the first text information; and takes the file to be marked, the first text information corresponding to the file to be marked, and the position information corresponding to each text segment of the first text information as the response result corresponding to the file marking request. Therefore, the RPA system can accurately obtain the response result corresponding to the file marking request.

To accurately draw the target picture corresponding to the file to be marked, FIG. 6 shows a schematic flowchart of another AI and RPA-based file marking method provided by an embodiment of the present disclosure. In this embodiment, the target sub-picture corresponding to each sub-file to be marked can be determined, and the target picture corresponding to the file to be marked can be determined according to the multiple target sub-pictures. The embodiment shown in FIG. 6 may include the following steps:

In step 601, the RPA system acquires a file annotation request; wherein, the file annotation request is used to annotate the file to be annotated.

In step 602, the RPA system generates a response result corresponding to the file annotation request in response to the file annotation request.

Step 603, the RPA system acquires multiple sub-files to be marked of the file to be marked in the response result.

In the embodiment of the present disclosure, the file to be marked may include one sub-file to be marked or multiple sub-files to be marked. For example, the file to be marked is a PDF file with one or more pages. When the PDF file has one page, the file to be marked includes only one sub-file to be marked. When the PDF file has multiple pages, the RPA system can treat each page as a sub-file to be marked, so a multi-page PDF file includes multiple sub-files to be marked. The RPA system can assign an identifier to each sub-file to be marked in the multi-page PDF file, and the identifier can be used to indicate the position of that sub-file in the file to be marked.

Step 604, for each sub-file to be marked, the RPA system creates a drawing object corresponding to the sub-file to be marked.

In the embodiment of the present disclosure, for each sub-file to be marked, a drawing object corresponding to the sub-file to be marked can be created, and a page object can be created according to the attribute information of the sub-file to be marked. For example, the drawing object is a canvas object, and the page object is a page object that includes the height information and width information of the sub-file to be marked.

It can be understood that when a drawing object is created, the RPA system sets a default width value and height value for it. So that the drawn target picture corresponds in size to the file to be marked (for example, the sizes are the same), in the embodiment of the present disclosure the RPA system can adjust the size information of the drawing object according to the attribute information of the sub-file to be marked in the page object.

Step 605, the RPA system determines the text information corresponding to the sub-file to be marked in the first text information according to the position information of the sub-file to be marked in the file to be marked.

As an example, the RPA system can assign an identifier to each sub-file to be marked, which can be used to indicate the position of that sub-file in the file to be marked. According to the position information of the sub-file to be marked in the file to be marked, the RPA system can determine the text information corresponding to the sub-file to be marked in the first text information. For example, if the sub-file to be marked is the second page of the file to be marked, the text information corresponding to the second page can be obtained from the first text information.

Step 606, according to the corresponding relationship between the text information and the first text information, determine the position information corresponding to each text segment of the text information in the position information corresponding to each text segment of the first text information.

Further, after the text information corresponding to the sub-file to be marked is determined in the first text information, the position information corresponding to each text segment of the text information can be determined, according to the correspondence between the text information and the first text information, from the position information corresponding to each text segment of the first text information.
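Steps 605 and 606 amount to filtering the document-wide OCR result down to one sub-file. The Python sketch below assumes, for illustration only, that each text segment of the first text information records the page it came from; the `Segment` type and the `text_for_sub_file` function are hypothetical names, and the sample data is invented.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    page: int                               # position of the sub-file in the file
    text: str
    box: Tuple[float, float, float, float]  # left, top, right, bottom


def text_for_sub_file(first_text: List[Segment], page: int):
    """Return the text and per-segment positions that belong to one sub-file."""
    selected = [s for s in first_text if s.page == page]
    text = " ".join(s.text for s in selected)
    positions = [s.box for s in selected]
    return text, positions


first_text = [
    Segment(1, "Cover", (10, 10, 90, 30)),
    Segment(2, "Amount", (10, 10, 80, 30)),
    Segment(2, "due", (90, 10, 120, 30)),
]
text, positions = text_for_sub_file(first_text, page=2)
```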

Step 607, the RPA system draws the target sub-picture corresponding to the sub-file to be marked according to the size information of the drawing object, the text information corresponding to the sub-file to be marked and the position information corresponding to each text fragment of the text information.

Furthermore, the RPA system draws the target sub-picture corresponding to the sub-file to be marked according to the size information of the drawing object, combined with the text information corresponding to the sub-file to be marked and the position information corresponding to each text segment of the text information. Wherein, it should be noted that the size of the target sub-picture is the same as that of the sub-file to be marked, and the target sub-picture may include text information of the sub-file to be marked and position information corresponding to each text segment of the text information.

In step 608, the RPA system stitches together the target sub-pictures corresponding to the multiple sub-files to be marked to obtain the target picture.

Further, the RPA system splices the target sub-pictures corresponding to the multiple sub-files to be marked, and takes the splicing result of the multiple target sub-pictures as the target picture.
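The disclosure does not fix a splicing layout. Assuming the target sub-pictures are stacked vertically in page order, the layout can be sketched in Python as follows; `stitch_layout` is a hypothetical helper that only computes where each sub-picture would be placed and how large the spliced target picture would be.

```python
from typing import List, Tuple


def stitch_layout(page_sizes: List[Tuple[int, int]]):
    """Return the top offset of each target sub-picture and the spliced size."""
    offsets = []
    y = 0
    for _width, height in page_sizes:
        offsets.append(y)  # each sub-picture starts where the previous one ended
        y += height
    total_width = max((width for width, _ in page_sizes), default=0)
    return offsets, (total_width, y)


# Three sub-pictures given as (width, height) pairs.
offsets, size = stitch_layout([(600, 800), (600, 800), (600, 400)])
```

Keeping the per-page offsets makes it possible to map a coordinate in the target picture back to a specific sub-file later.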

In step 609, the RPA system determines the area range of the text annotation in the target picture in response to the mouse event.

In step 610, the RPA system determines the text labeling result within the area according to the first text information obtained by performing optical character recognition (OCR) on the document to be marked and the position information corresponding to each text segment of the first text information.

In the embodiment of the present disclosure, steps 601-602 and 609-610 may be implemented in any one of the embodiments of the present disclosure respectively, which is not limited in the embodiment of the present disclosure, and will not be repeated here.

To sum up, the RPA system acquires the multiple sub-files to be marked of the file to be marked in the response result; for each sub-file to be marked, the RPA system creates a drawing object corresponding to the sub-file to be marked; according to the correspondence between the text information and the first text information, the position information corresponding to each text segment of the text information is determined from the position information corresponding to each text segment of the first text information; the RPA system draws the target sub-picture corresponding to the sub-file to be marked according to the size information of the drawing object, the text information corresponding to the sub-file to be marked, and the position information corresponding to each text segment of the text information; and the RPA system splices the target sub-pictures corresponding to the multiple sub-files to be marked to obtain the target picture. Therefore, the RPA system can accurately draw the target picture corresponding to the file to be marked according to the multiple sub-files to be marked, the text information corresponding to the sub-files to be marked, and the position information corresponding to each text segment of the text information.

In the AI and RPA-based file tagging method of the embodiment of the present disclosure, the RPA system obtains a file tagging request, where the file tagging request is used to tag the file to be tagged; the RPA system generates, in response to the file tagging request, a response result corresponding to the file tagging request; the RPA system draws the target picture corresponding to the file to be tagged according to the response result; the RPA system determines the area range of the text annotation in the target picture in response to a mouse event; and the RPA system determines the text annotation result within the area range according to the first text information obtained by performing optical character recognition (OCR) on the file to be tagged and the position information corresponding to each text segment of the first text information. Therefore, by determining the area range of the text annotation in the target picture and the text annotation result within that area range, the RPA system enables the extraction of text information from the picture and the selection of discontinuous text; at the same time, the text information within the annotated area range and the position information of the text segments in that text information can be obtained, which meets the needs of model training.

Corresponding to the AI and RPA-based file tagging method provided by the above-mentioned embodiments, the present disclosure also provides an AI and RPA-based file tagging device. Because the device corresponds to the method provided by the above-mentioned embodiments, the implementations of the AI and RPA-based file tagging method are also applicable to the AI and RPA-based file tagging device and are not described in detail again in the embodiments of the present disclosure.

FIG. 7 is a schematic structural diagram of an AI and RPA-based file tagging device provided by an embodiment of the present disclosure.

As shown in FIG. 7, the AI and RPA-based document tagging device 700 may include: an acquisition module 710, a generation module 720, a drawing module 730, a first determination module 740, and a second determination module 750.

Wherein, the acquisition module 710 is used to obtain the file annotation request; wherein the file annotation request is used to annotate the file to be annotated; the generation module 720 is used to generate a response result corresponding to the file annotation request in response to the file annotation request; the drawing module 730, for drawing the target picture corresponding to the file to be marked according to the response result; the first determining module 740, for responding to the mouse event, determining the range of the text label in the target picture; the second determining module 750, for according to The first text information obtained by performing optical character recognition (OCR) on the document to be marked and the position information corresponding to each text segment of the first text information determine the text marking result within the area.

As a possible implementation of the embodiment of the present disclosure, the second determination module 750 is configured to: determine the sub-file to be marked to which the area range belongs according to the vertex coordinate information of the area range and the height information of the sub-files to be marked in the file to be marked; determine the position information of the area range relative to the target sub-picture corresponding to the sub-file to be marked to which the area range belongs; and determine, according to the position information, the text labeling result within the area range from the first text information and the position information corresponding to each text segment of the first text information.
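One way to resolve a vertex coordinate to a sub-file using height information, assuming the target sub-pictures are stacked vertically, is a binary search over cumulative heights. In the Python sketch below, `locate_sub_file` is a hypothetical helper that returns a zero-based sub-file index together with the y-coordinate relative to the top of that sub-file.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple


def locate_sub_file(y: float, page_heights: List[float]) -> Tuple[int, float]:
    """Find which sub-file contains y and convert y to sub-file coordinates."""
    bottoms = list(accumulate(page_heights))  # bottom edge of each sub-file
    if not bottoms or not 0 <= y < bottoms[-1]:
        raise ValueError("y lies outside the target picture")
    index = bisect_right(bottoms, y)          # first sub-file whose bottom edge exceeds y
    top = bottoms[index - 1] if index else 0
    return index, y - top


# A vertex at y=950 in a picture spliced from sub-files of height 800, 800, 400.
page, offset = locate_sub_file(950, [800, 800, 400])
```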

As a possible implementation of the embodiment of the present disclosure, the second determination module 750 is further configured to: determine, according to the position information of the sub-file to be marked to which the area range belongs, the second text information corresponding to that sub-file in the first text information; determine, according to the correspondence between the second text information and the first text information, the position information corresponding to each text segment of the second text information from the position information corresponding to each text segment of the first text information; determine, according to the position information of the area range relative to the sub-file to be marked to which the area range belongs, the third text information within the area range in the second text information; determine, according to the correspondence between the third text information and the second text information, the position information corresponding to each text segment of the third text information from the position information corresponding to each text segment of the second text information; and use the third text information and the position information corresponding to each text segment of the third text information as the text labeling result within the area range.
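The selection of the third text information can be illustrated with a simple geometric filter. The Python sketch below assumes that a text segment belongs to the area range only when its box lies entirely inside the range; the disclosure does not state the exact criterion, and `inside` and `annotate` are hypothetical names.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # left, top, right, bottom


def inside(box: Box, region: Box) -> bool:
    """True when the segment box lies entirely within the area range."""
    left, top, right, bottom = box
    r_left, r_top, r_right, r_bottom = region
    return left >= r_left and top >= r_top and right <= r_right and bottom <= r_bottom


def annotate(second_text: List[Tuple[str, Box]], region: Box):
    """Return the third text information and the positions of its segments."""
    third_text = [(text, box) for text, box in second_text if inside(box, region)]
    return " ".join(text for text, _ in third_text), third_text


second_text = [
    ("Name:", (10, 10, 60, 30)),
    ("Alice", (70, 10, 120, 30)),
    ("Date:", (10, 50, 60, 70)),
]
text, positions = annotate(second_text, region=(0, 0, 130, 40))
```

A looser criterion, such as partial overlap, could be substituted in `inside` without changing the rest of the flow.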

As a possible implementation of the embodiment of the present disclosure, the first determination module 740 is configured to: monitor mouse events on the target picture, where the mouse events include, in sequence, a mouse click event, a mouse move event, and a mouse lift event; determine the first coordinate of the area range according to the mouse click event; determine the second coordinate of the area range according to the mouse move event and the mouse lift event; determine the height value and width value of the area range according to the first coordinate and the second coordinate; and take the area enclosed by the first coordinate, the second coordinate, and the height value and width value of the area range as the area range of the text annotation in the target picture.

As a possible implementation of the embodiment of the present disclosure, the generation module 720 is configured to: acquire the file to be marked corresponding to the file marking request according to the file marking request; perform image conversion on the file to be marked to obtain the converted picture corresponding to the file to be marked; perform character recognition on the converted picture based on optical character recognition (OCR) to obtain the first text information corresponding to the file to be marked and the position information corresponding to each text segment of the first text information; and use the file to be marked, the first text information corresponding to the file to be marked, and the position information corresponding to each text segment of the first text information as the response result corresponding to the file marking request.

As a possible implementation of the embodiment of the present disclosure, the drawing module 730 is configured to: acquire the multiple sub-files to be marked of the file to be marked in the response result; for each sub-file to be marked in the file to be marked, create a drawing object corresponding to the sub-file to be marked; determine, according to the position information of the sub-file to be marked in the file to be marked, the text information corresponding to the sub-file to be marked in the first text information; determine, according to the correspondence between the text information and the first text information, the position information corresponding to each text segment of the text information from the position information corresponding to each text segment of the first text information; draw the target sub-picture corresponding to the sub-file to be marked according to the size information of the drawing object, the text information corresponding to the sub-file to be marked, and the position information corresponding to each text segment of the text information; and splice the target sub-pictures corresponding to the multiple sub-files to be marked to obtain the target picture.

As a possible implementation of the embodiment of the present disclosure, the AI and RPA-based document tagging apparatus 700 further includes a processing module. The processing module is used to annotate the position information of the area range relative to the sub-file to be marked to which it belongs, the third text information within the area range, and the position information corresponding to each text segment of the third text information, and to save them as training data for the model.
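The disclosure does not specify a storage format for the training data. As a sketch under that caveat, one annotation could be serialized as a single JSON line; the `training_record` function and the field names `page`, `region`, `text`, and `segments` are assumptions made for this example.

```python
import json


def training_record(page_index, region, segments):
    """Serialize one annotation (area range, text, segment positions) as JSON."""
    return json.dumps(
        {
            "page": page_index,      # sub-file to which the area range belongs
            "region": list(region),  # area range relative to that sub-file
            "text": " ".join(text for text, _ in segments),
            "segments": [{"text": text, "box": list(box)} for text, box in segments],
        },
        ensure_ascii=False,          # keep non-ASCII text readable in the file
    )


line = training_record(1, (40, 150, 80, 40), [("Amount", (45, 155, 100, 180))])
```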

In the AI and RPA-based file tagging device of the embodiment of the present disclosure, the RPA system obtains a file tagging request, where the file tagging request is used to tag the file to be tagged; the RPA system generates, in response to the file tagging request, a response result corresponding to the file tagging request; the RPA system draws the target picture corresponding to the file to be tagged according to the response result; the RPA system determines the area range of the text annotation in the target picture in response to a mouse event; and the RPA system determines the text annotation result within the area range according to the first text information obtained by performing optical character recognition (OCR) on the file to be tagged and the position information corresponding to each text segment of the first text information. Therefore, by determining the area range of the text annotation in the target picture and the text annotation result within that area range, the RPA system enables the extraction of text information from the picture and the selection of discontinuous text; at the same time, the text information within the annotated area range and the position information of the text segments in that text information can be obtained, which meets the needs of model training.

In order to realize the above-mentioned embodiments, an embodiment of the present disclosure also proposes an electronic device, including a memory, a processor, and a computer program stored in the memory and operable on the processor. When the processor executes the computer program, the AI and RPA-based document labeling method described in any of the foregoing method embodiments is implemented.

In order to realize the above-mentioned embodiments, the embodiments of the present disclosure also propose a non-transitory computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the AI and RPA-based document labeling method described in any of the foregoing method embodiments is implemented.

In order to realize the above-mentioned embodiments, the embodiments of the present disclosure also propose a computer program product. When the instructions in the computer program product are executed by a processor, the AI and RPA-based document labeling method described in any of the foregoing method embodiments is implemented.

FIG. 8 is a block diagram of an electronic device for implementing the AI and RPA-based file tagging method provided by an embodiment of the present disclosure. The electronic device is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are by way of example only, and are not intended to limit implementations of the disclosure described and/or claimed herein.

As shown in FIG. 8, the electronic device includes: one or more processors 801, a memory 802, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and can be mounted on a common motherboard or otherwise as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory, to display graphical information of a GUI on an external input/output device such as a display device coupled to an interface. In other implementations, multiple processors and/or multiple buses may be used together with multiple memories, as desired. Likewise, multiple electronic devices may be connected, with each device providing some of the necessary operations (e.g., as a server array, a set of blade servers, or a multi-processor system). In FIG. 8, one processor 801 is taken as an example.

The memory 802 is a non-transitory computer-readable storage medium provided in the present disclosure. Wherein, the memory stores instructions executable by at least one processor, so that the at least one processor executes the AI and RPA-based file tagging method provided by the present disclosure. The non-transitory computer-readable storage medium of the present disclosure stores computer instructions, and the computer instructions are used to enable a computer to execute the AI and RPA-based file tagging method provided in the present disclosure.

The memory 802, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs, non-transitory computer-executable programs and modules, such as program instructions/modules (for example, the acquisition module 710, the generation module 720, the drawing module 730, the first determination module 740 and the second determination module 750 shown in FIG. 7). The processor 801 executes various functional applications and data processing of the server by running the non-transitory software programs, instructions and modules stored in the memory 802, that is, implements the AI and RPA-based file marking method in the above method embodiments.

The memory 802 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data and the like. In addition, the memory 802 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage devices. In some embodiments, the memory 802 may optionally include memories that are set remotely relative to the processor 801, and these remote memories may be connected through a network to the electronic device for AI and RPA-based document annotation. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The electronic device of the AI and RPA-based document tagging method may further include: an input device 803 and an output device 804. The processor 801, the memory 802, the input device 803, and the output device 804 may be connected through a bus or in other ways. In FIG. 8, connection through a bus is taken as an example.

The input device 803 can receive input numbers or character information, and generate key signal inputs related to the user settings and function control of the electronic device for AI and RPA-based file annotation. Examples of input devices include a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, and a joystick. The output device 804 may include a display device, an auxiliary lighting device (e.g., LED), a tactile feedback device (e.g., a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.

Various implementations of the systems and techniques described herein can be implemented in digital electronic circuitry, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include being implemented in one or more computer programs executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor can be a special-purpose or general-purpose programmable processor, can receive data and instructions from a storage system, at least one input device, and at least one output device, and can transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (for example, magnetic disks, optical disks, memories, programmable logic devices (PLDs)) for providing machine instructions and/or data to a programmable processor, including machine-readable media that receive machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with the user, the systems and techniques described herein can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and pointing device (e.g., a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, speech input, or tactile input).

The systems and techniques described herein can be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user computer having a graphical user interface or web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end components, middleware components, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: a local area network (LAN), a wide area network (WAN), and the Internet.

A computer system may include clients and servers. Clients and servers are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.

In addition, the acquisition, storage, and application of information involved in the technical solutions of the present disclosure comply with relevant laws and regulations, and do not violate public order and good customs.

It should be understood that steps may be reordered, added or deleted using the various forms of flow shown above. For example, each step described in the present disclosure may be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution proposed in the present disclosure can be achieved, no limitation is imposed herein.

The specific implementation manners described above do not limit the protection scope of the present disclosure. It should be apparent to those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made depending on design requirements and other factors. Any modifications, equivalent replacements and improvements made within the spirit and principles of the present disclosure shall be included within the protection scope of the present disclosure.

Claims

A document labeling method based on artificial intelligence (AI) and robotic process automation (RPA), characterized in that it includes:

The RPA system obtains a file annotation request; wherein, the file annotation request is used to annotate the file to be annotated;

The RPA system generates a response result corresponding to the file annotation request in response to the file annotation request;

The RPA system draws the target picture corresponding to the file to be marked according to the response result;

The RPA system determines the area range of the text label in the target picture in response to the mouse event;

The RPA system determines the text labeling result within the area range according to the first text information obtained by performing optical character recognition (OCR) on the document to be marked and the position information corresponding to each text segment of the first text information.
The method according to claim 1, wherein the RPA system determining the text labeling result within the area range according to the first text information obtained by performing optical character recognition (OCR) on the document to be marked and the position information corresponding to each text segment of the first text information includes:

The RPA system determines the sub-file to be marked to which the area range belongs according to the vertex coordinate information of the area range and the height information of the sub-file to be marked in the file to be marked;

The RPA system determines the location information of the region range relative to the target sub-picture corresponding to the sub-file to be marked to which the region range belongs;

According to the position information of the area range relative to the target sub-picture corresponding to the sub-file to be marked to which the area range belongs, the RPA system determines the text labeling result within the area range from the first text information and the position information corresponding to each text segment of the first text information.
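
The page lookup described in this claim can be sketched as follows. This is a minimal illustration, assuming that the target sub-pictures are stacked vertically in the target picture and that the region is identified by its top-left vertex; the function name `locate_region` and the `page_heights` list are hypothetical and do not appear in the disclosure.

```python
from bisect import bisect_right
from itertools import accumulate


def locate_region(top_left, page_heights):
    """Map a vertex in the stitched picture to (page index, page-relative vertex).

    top_left     -- (x, y) of the region's top-left vertex in the stitched picture
    page_heights -- heights of the sub-pictures, in stacking order (assumed vertical)
    """
    if not page_heights:
        raise ValueError("at least one page height is required")
    x, y = top_left
    # Bottom edge of every page, measured from the top of the stitched picture.
    bottoms = list(accumulate(page_heights))
    if y < 0 or y >= bottoms[-1]:
        raise ValueError("vertex lies outside the stitched picture")
    # First page whose bottom edge is strictly below y.
    page_index = bisect_right(bottoms, y)
    page_top = bottoms[page_index] - page_heights[page_index]
    return page_index, (x, y - page_top)
```

For example, with page heights of 100, 200 and 150, a vertex at y = 250 falls on the second page (index 1) at a page-relative y of 150.
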
The method according to claim 1 or 2, wherein the RPA system determining, according to the position information of the area range relative to the target sub-picture corresponding to the sub-file to be marked to which the area range belongs, the text labeling result within the area range from the first text information and the position information corresponding to each text segment of the first text information includes:

The RPA system determines, in the first text information, the second text information corresponding to the sub-file to be marked to which the area range belongs, according to the position information of that sub-file in the file to be marked;

According to the correspondence between the second text information and the first text information, the RPA system determines the position information corresponding to each text segment of the second text information from the position information corresponding to each text segment of the first text information;

The RPA system determines the third text information in the second text information within the area range according to the position information of the area range relative to the subfile to be marked to which the area range belongs;

According to the correspondence between the third text information and the second text information, the RPA system determines the position information corresponding to each text segment of the third text information from the position information corresponding to each text segment of the second text information;

The RPA system uses the third text information and position information corresponding to each text segment of the third text information as a text labeling result within the range of the area.
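
The narrowing from first to second to third text information can be sketched as two successive filters. This is an illustrative sketch only: the `Segment` record, the `select_segments` function, and the rule that a segment counts as inside the region only when its box is fully contained are assumptions, since the claim does not state the containment test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One OCR text segment with its page-relative bounding box (hypothetical layout)."""
    text: str
    page: int
    x: float
    y: float
    width: float
    height: float


def select_segments(segments, page, region):
    """Return the segments on `page` whose boxes lie fully inside `region`.

    region -- (x, y, width, height), relative to the page's sub-picture
    """
    rx, ry, rw, rh = region
    # "Second text information": segments belonging to the region's page.
    on_page = [s for s in segments if s.page == page]
    # "Third text information": segments of that page inside the region.
    return [
        s for s in on_page
        if s.x >= rx and s.y >= ry
        and s.x + s.width <= rx + rw
        and s.y + s.height <= ry + rh
    ]
```

Because each returned `Segment` carries its own box, the text and its position information stay paired in the labeling result. An overlap-based test could be substituted for full containment if partially covered segments should also be selected.
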
The method according to any one of claims 1 to 3, wherein the RPA system determines the area range of the text label in the target picture in response to the mouse event, including:

The RPA system monitors the mouse event of the target picture; wherein, the mouse event includes successively: a mouse click event, a mouse movement event and a mouse lift event;

The RPA system determines the first coordinates of the area range according to the mouse click event;

The RPA system determines the second coordinates of the area range according to the mouse movement event and the mouse lift event;

The RPA system determines the height value and width value of the area range according to the first coordinate and the second coordinate;

The RPA system uses the area enclosed by the first coordinates, the second coordinates, and the height value and width value of the area range as the area range of the text label in the target picture.
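
The click, move and lift sequence can be sketched as a small state holder that turns the three events into a rectangle. The names `Region` and `RegionSelector` are hypothetical, and normalizing the rectangle so that dragging up or to the left still yields a positive width and height is an assumption not stated in the claim.

```python
from typing import NamedTuple, Optional, Tuple


class Region(NamedTuple):
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


class RegionSelector:
    """Collects mouse-down, mouse-move and mouse-up events into a Region."""

    def __init__(self) -> None:
        self._start: Optional[Tuple[float, float]] = None
        self._current: Optional[Tuple[float, float]] = None

    def on_mouse_down(self, x: float, y: float) -> None:
        # First coordinates: where the drag begins.
        self._start = (x, y)
        self._current = (x, y)

    def on_mouse_move(self, x: float, y: float) -> None:
        # Track the pointer only while a drag is in progress.
        if self._start is not None:
            self._current = (x, y)

    def on_mouse_up(self, x: float, y: float) -> Optional[Region]:
        # Second coordinates: where the button is released.
        if self._start is None:
            return None
        x1, y1 = self._start
        self._start = self._current = None
        return Region(min(x1, x), min(y1, y), abs(x - x1), abs(y - y1))
```

A drag from (50, 40) to (20, 100) yields a region with its top-left corner at (20, 40), a width of 30 and a height of 60.
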
The method according to any one of claims 1 to 4, wherein the RPA system generates a response result corresponding to the file labeling request in response to the file labeling request, including:

The RPA system obtains the file to be marked corresponding to the file mark request according to the file mark request;

The RPA system performs image conversion on the file to be marked, and obtains a converted picture corresponding to the file to be marked;

The RPA system performs character recognition on the converted picture based on optical character recognition (OCR), so as to obtain the first text information corresponding to the document to be marked and the position information corresponding to each text segment of the first text information;

The RPA system uses the file to be marked, the first text information corresponding to the file to be marked, and the position information corresponding to each text segment of the first text information as a response result corresponding to the file marking request.
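
The assembly of the response result can be sketched as below. The disclosure does not name an image converter or OCR engine, so both are passed in as callables; `ResponseResult`, `build_response`, `convert` and `recognize` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class ResponseResult:
    file_path: str
    texts: List[str]   # first text information, one entry per text segment
    boxes: List[Box]   # position information, parallel to `texts`


def build_response(
    file_path: str,
    convert: Callable[[str], Iterable[Any]],
    recognize: Callable[[Any], Iterable[Tuple[str, Box]]],
) -> ResponseResult:
    """Convert the file to pictures, run OCR on each, and bundle the result.

    convert   -- assumed to yield one picture per page of the file
    recognize -- assumed to yield (text, box) pairs for one picture
    """
    texts: List[str] = []
    boxes: List[Box] = []
    for picture in convert(file_path):
        for text, box in recognize(picture):
            texts.append(text)
            boxes.append(box)
    return ResponseResult(file_path, texts, boxes)
```

Keeping `texts` and `boxes` as parallel lists preserves the one-to-one correspondence between each text segment and its position information.
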
The method according to any one of claims 1 to 5, wherein the RPA system draws the target picture corresponding to the file to be marked according to the response result, including:

The RPA system obtains a plurality of subfiles to be marked of the document to be marked in the response result;

The RPA system creates a drawing object corresponding to the sub-file to be marked for each sub-file to be marked;

The RPA system determines the text information corresponding to the subfile to be marked in the first text information according to the position information of the subfile to be marked in the file to be marked;

According to the corresponding relationship between the text information and the first text information, determine the position information corresponding to each text segment of the text information in the position information corresponding to each text segment of the first text information;

The RPA system draws the target sub-picture corresponding to the sub-file to be marked according to the size information of the drawing object, the text information corresponding to the sub-file to be marked and the position information corresponding to each text segment of the text information;

The RPA system stitches the target sub-pictures corresponding to the plurality of sub-files to be marked to obtain the target picture.
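
The stitching step can be sketched as a layout computation that decides where each target sub-picture is pasted. This sketch assumes the sub-pictures are stacked top to bottom and left-aligned, which the claim does not specify; `stack_layout` is a hypothetical name, and the actual pixel copying is left to whichever imaging library is used.

```python
def stack_layout(sizes):
    """Compute the canvas size and paste offsets for vertically stacked pictures.

    sizes -- list of (width, height), one per target sub-picture, in page order
    Returns ((canvas_width, canvas_height), [(x_offset, y_offset), ...]).
    """
    offsets = []
    top = 0
    for _width, height in sizes:
        offsets.append((0, top))  # left-aligned, directly below the previous page
        top += height
    canvas_width = max((width for width, _ in sizes), default=0)
    return (canvas_width, top), offsets
```

Two sub-pictures of 600 by 800 and 600 by 400 give a 600 by 1200 canvas with paste offsets (0, 0) and (0, 800). The same cumulative heights can later be used to map a selected region back to the page it belongs to.
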
The method according to any one of claims 1-6, wherein the method further comprises:

The RPA system tags the position information of the area range relative to the target sub-picture corresponding to the sub-file to be marked to which the area range belongs, the third text information within the area range, and the position information corresponding to each text segment of the third text information, and saves them as training data for a model.
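
One possible way to save a labeling result as training data is sketched below. The claim does not specify a storage format; the JSON layout, the field names, and the functions `make_training_record` and `append_record` are assumptions made for illustration.

```python
import json


def make_training_record(page_index, region, segments):
    """Serialize one labeling result as a single JSON line.

    page_index -- index of the sub-file to which the region belongs
    region     -- (x, y, width, height), relative to that sub-file's picture
    segments   -- iterable of (text, (x, y, width, height)) inside the region
    """
    record = {
        "page": page_index,
        "region": dict(zip(("x", "y", "width", "height"), region)),
        "segments": [{"text": text, "box": list(box)} for text, box in segments],
    }
    # ensure_ascii=False keeps non-ASCII text (e.g. Chinese) readable in the file.
    return json.dumps(record, ensure_ascii=False)


def append_record(path, line):
    """Append one serialized record to a line-delimited training file."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
```

Writing one record per line lets new labeling results be appended without rewriting the records already saved.
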
A file labeling device based on artificial intelligence (AI) and robotic process automation (RPA), characterized in that the file labeling device is applied to an RPA system and includes:

An acquisition module, configured to acquire a file annotation request; wherein, the file annotation request is used to annotate the file to be annotated;

A generation module, configured to generate a response result corresponding to the file annotation request in response to the file annotation request;

A drawing module, configured to draw a target picture corresponding to the file to be marked according to the response result;

The first determining module is used to determine the area range of the text label in the target picture in response to the mouse event;

The second determination module is configured to determine the text labeling result within the area range according to the first text information obtained by performing optical character recognition (OCR) on the document to be marked and the position information corresponding to each text segment of the first text information.
The device according to claim 8, wherein the second determining module is configured to:

According to the vertex coordinate information of the area range and the height information of the sub-file to be marked in the file to be marked, determine the sub-file to be marked to which the area range belongs;

Determine the location information of the region range relative to the target sub-picture corresponding to the to-be-marked sub-file to which the region range belongs;

According to the position information of the area range relative to the target sub-picture corresponding to the sub-file to be marked to which the area range belongs, determine the text labeling result within the area range from the first text information and the position information corresponding to each text segment of the first text information.
The device according to claim 8 or 9, wherein the second determination module is further configured to:

Determining second text information corresponding to the sub-file to be marked to which the area range belongs in the first text information according to the position information of the sub-file to be marked to which the area range belongs in the file to be marked;

According to the corresponding relationship between the second text information and the first text information, determine the position information corresponding to each text segment of the second text information in the position information corresponding to each text segment of the first text information;

determining third text information in the second text information within the area range according to the position information of the area range relative to the sub-file to be marked to which the area range belongs;

According to the corresponding relationship between the third text information and the second text information, determine the position information corresponding to each text segment of the third text information in the position information corresponding to each text segment of the second text information;

The third text information and the position information corresponding to each text segment of the third text information are used as a text labeling result within the range of the area.
The device according to any one of claims 8 to 10, wherein the first determining module is configured to:

Listening to the mouse event of the target picture; wherein, the mouse event includes in turn: a mouse click event, a mouse movement event and a mouse lift event;

determining the first coordinates of the area range according to the mouse click event;

determining the second coordinates of the area range according to the mouse movement event and the mouse lift event;

determining a height value and a width value of the area range according to the first coordinates and the second coordinates;

An area enclosed by the first coordinates, the second coordinates, and the height value and width value of the area range is used as the area range marked by the text in the target picture.
The device according to any one of claims 8 to 11, wherein the generating module is configured to:

Acquiring, according to the file labeling request, a file to be marked corresponding to the file labeling request;

performing image conversion on the file to be marked, and obtaining a converted picture corresponding to the file to be marked;

performing character recognition on the converted image based on optical character recognition (OCR), to obtain the first text information corresponding to the file to be marked and the position information corresponding to each text segment of the first text information;

The file to be marked, the first text information corresponding to the file to be marked, and the position information corresponding to each text segment of the first text information are used as a response result corresponding to the file mark request.
The device according to any one of claims 8 to 12, wherein the drawing module is configured to:

Obtain multiple sub-files to be marked of the file to be marked in the response result;

For each sub-file to be marked in the file to be marked, create a drawing object corresponding to the sub-file to be marked;

According to the position information of the subfile to be marked in the file to be marked, determine the text information corresponding to the subfile to be marked in the first text information;

According to the corresponding relationship between the text information and the first text information, determine the position information corresponding to each text segment of the text information in the position information corresponding to each text segment in the first text information;

Draw the target sub-picture corresponding to the sub-file to be marked according to the size information of the drawing object, the text information corresponding to the sub-file to be marked and the position information corresponding to each text segment of the text information;

Image splicing is performed on the target sub-pictures corresponding to the plurality of sub-files to be marked to obtain the target picture.
The device according to any one of claims 8-13, wherein the device further comprises:

A processing module, configured to tag the position information of the area range relative to the sub-file to be marked to which the area range belongs, the third text information within the area range, and the position information corresponding to each text segment of the third text information, and to save them as training data for a model.
An electronic device, characterized in that it includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method according to any one of claims 1-7 when executing the computer program.
A non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program implements the method according to any one of claims 1-7 when executed by a processor.
A computer program product, characterized by comprising a computer program, the computer program implementing the method according to any one of claims 1-7 when executed by a processor.