CN113407745A

CN113407745A - Data annotation method and device, electronic equipment and computer readable storage medium

Info

Publication number: CN113407745A
Application number: CN202110733929.5A
Authority: CN
Inventors: 李晨辉; 胡腾; 陈永锋
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2021-06-30
Filing date: 2021-06-30
Publication date: 2021-09-17

Abstract

The present disclosure provides a data annotation method and apparatus, an electronic device, and a computer-readable storage medium, relating to artificial intelligence fields such as computer vision and natural language processing. The method may comprise: parsing a rich document to be processed and generating, according to the parsing result, a labeling interface corresponding to the content type in the rich document; and generating key-value pairs according to operations performed by a user on the labeling interface, the obtained key-value pairs serving as the data labeling result. With this scheme, various types of data in a rich document can be labeled conveniently and efficiently.

Description

Data annotation method and device, electronic equipment and computer readable storage medium

Technical Field

The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a data labeling method and apparatus, an electronic device, and a computer-readable storage medium in the fields of computer vision and natural language processing.

Background

A rich document contains a variety of data, such as text content, table content and picture content, that artificial intelligence algorithms require.

However, there is currently no good way to label the data in a rich document so that artificial intelligence algorithms can learn from it.

Disclosure of Invention

The disclosure provides a data annotation method, a data annotation device, an electronic device and a computer-readable storage medium.

A method of data annotation, comprising:

analyzing a rich document to be processed, and generating a labeling interface corresponding to the content type in the rich document according to an analysis result;

and generating a key value pair according to the operation executed by the user aiming at the labeling interface, and taking the obtained key value pair as a data labeling result.

A data annotation device, comprising: a parsing module and a labeling module;

the parsing module is configured to parse the rich document to be processed and to generate, according to the parsing result, a labeling interface corresponding to the content type in the rich document;

and the labeling module is configured to generate key-value pairs according to operations performed by the user on the labeling interface and to take the obtained key-value pairs as the data labeling result.

An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method as described above.

A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method as described above.

A computer program product comprising a computer program which, when executed by a processor, implements a method as described above.

One embodiment in the above disclosure has the following advantages or benefits: the annotation interface can be generated by analyzing the rich document, and the key value pair can be generated based on the operation executed by the user on the annotation interface, so that the annotation of various types of data in the rich document is conveniently and efficiently realized.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is a flow chart of an embodiment of a data annotation method according to the present disclosure;

FIG. 2 is a schematic illustration of the text content of the present disclosure;

FIG. 3 is a schematic illustration of the contents of a table according to the present disclosure;

FIG. 4 is a schematic diagram of textual information in the displayed picture content according to the present disclosure;

FIG. 5 is a schematic diagram illustrating a structure of an embodiment 500 of the data annotation device according to the present disclosure;

FIG. 6 illustrates a schematic block diagram of an example electronic device 600 that can be used to implement embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

In addition, it should be understood that the term "and/or" herein merely describes an association relationship between associated objects and means that three relationships may exist; e.g., A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship.

Fig. 1 is a flowchart of an embodiment of a data annotation method according to the present disclosure. As shown in fig. 1, the following detailed implementation is included.

In step 101, the rich document to be processed is parsed, and a labeling interface corresponding to the content type included in the rich document is generated according to the parsing result.

In step 102, a key-value pair is generated according to an operation performed by a user on a labeling interface, and the obtained key-value pair is used as a data labeling result.

In the scheme of the embodiment of the method, the rich document can be analyzed to generate the labeling interface, and the key value pair can be generated based on the operation executed by the user on the labeling interface, so that the labeling of various types of data in the rich document is conveniently and efficiently realized.

The content types included in the rich document may include one or any combination of the following: text content, table content, picture content, and the like.

Correspondingly, after the rich document is analyzed, the labeling interfaces respectively corresponding to different content types in the rich document can be generated according to the analysis result. How the parsing is performed is not limited.

When the rich document comprises the text content, a corresponding text labeling interface can be generated, and the text labeling interface comprises the text content. When the table content is included in the rich document, a corresponding table labeling interface can be generated, and the table labeling interface includes the table content. When the rich document comprises the picture content, a corresponding picture marking interface can be generated, and the picture marking interface comprises the picture content.

It can be seen from the above description that, for different content types, corresponding annotation interfaces can be generated respectively, so that differentiated processing for different content types is realized, and processing is more targeted.
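The per-type generation of labeling interfaces can be illustrated with a minimal Python sketch. The disclosure does not limit how parsing is performed, so the `Block`, `ContentType`, and `LabelingInterface` types and the `build_interfaces` function below are hypothetical names introduced only for illustration; the "interface" here is a plain data object rather than a real user interface.

```python
from dataclasses import dataclass
from enum import Enum


class ContentType(Enum):
    TEXT = "text"
    TABLE = "table"
    PICTURE = "picture"


@dataclass
class Block:
    """One parsed piece of a rich document (hypothetical parse result)."""
    content_type: ContentType
    payload: object  # e.g. a string for text, a list of rows for a table


@dataclass
class LabelingInterface:
    """Stand-in for a UI view; it only carries the content it displays."""
    kind: str
    content: object


def build_interfaces(blocks):
    """Return one labeling interface per parsed block, chosen by content type."""
    kinds = {
        ContentType.TEXT: "text_labeling",
        ContentType.TABLE: "table_labeling",
        ContentType.PICTURE: "picture_labeling",
    }
    return [LabelingInterface(kinds[b.content_type], b.payload) for b in blocks]


# Assumed parse result for a document containing text and a table.
parsed = [
    Block(ContentType.TEXT, "Party A: Example Co."),
    Block(ContentType.TABLE, [["Name", "Alice"]]),
]
interfaces = build_interfaces(parsed)
```

Each interface carries the content it was generated for, which matches the statement that the text, table, and picture labeling interfaces include the corresponding content.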

The following describes the processing method for different content types.

1) Text content

As previously described, for textual content, a corresponding textual annotation interface may be generated.

According to the text content displayed in the text labeling interface, the content which is selected by a user and serves as a key and a corresponding value can be obtained, and a key value pair is generated according to the obtained content.

Fig. 2 is a schematic diagram of the text content of the present disclosure. As shown in fig. 2, "Party A" is a key and "a second three building engineering limited company" is the value corresponding to "Party A"; similarly, "Party B" is a key and "ABC building materials sales limited company" is the value corresponding to "Party B".

How the user selects content as keys and corresponding values may be determined according to actual needs. As one possible implementation, two buttons, referred to as button a and button b for convenience of description, may be displayed on the text labeling interface. After the user clicks button a, the content that the user subsequently selects from the text content displayed in the text labeling interface is used as a key. After the user clicks button b, the content subsequently selected from the displayed text content is used as a value. A key-value pair is then generated from the obtained key and value.

In practical applications, a key may correspond to multiple values. In that case the user repeats the operation of clicking button b and selecting content as a value, and one value is obtained per repetition. When the user clicks button a again, this indicates that a new key-value pair is to be generated and a new key is selected.
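The button-driven selection described above can be sketched as a small state machine. This is only an illustrative sketch: the class `TextAnnotator` and its method names are assumptions, not part of the disclosure, and text selection is modeled as a plain string passed to `select`.

```python
class TextAnnotator:
    """Hypothetical model of the two-button text labeling flow."""

    def __init__(self):
        self.pairs = []    # key-value pairs in the order they were created
        self._mode = None  # "key" or "value": meaning of the next selection

    def click_key_button(self):
        """Button a: the next selection is a new key."""
        self._mode = "key"

    def click_value_button(self):
        """Button b: the next selection is a value for the latest key."""
        self._mode = "value"

    def select(self, text):
        if self._mode == "key":
            # A new key starts a new key-value pair.
            self.pairs.append({"key": text, "values": []})
        elif self._mode == "value":
            if not self.pairs:
                raise ValueError("select a key before selecting a value")
            self.pairs[-1]["values"].append(text)
        else:
            raise ValueError("click a button before selecting text")
        self._mode = None  # each click arms exactly one selection


ui = TextAnnotator()
ui.click_key_button()
ui.select("Party A")
ui.click_value_button()
ui.select("Example Construction Co.")
ui.click_value_button()
ui.select("Example Holdings")
ui.click_key_button()
ui.select("Party B")
ui.click_value_button()
ui.select("ABC Building Materials Sales Co.")
```

Resetting the mode after every selection mirrors the description that clicking button b and selecting a value is repeated once per value; an implementation that keeps the mode armed until another button is clicked would be equally consistent with the text.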

The above describes how a key-value pair is generated when both a key and a corresponding value appear in the text content. In some cases, only a value appears in the text content, without a key. The following processing may be employed for this case.

For the text content displayed in the text labeling interface, the content selected by the user as a value is obtained, a key corresponding to the obtained value is determined, and a key-value pair is generated from the determined key and the obtained value.

As a possible implementation, the user may define one or more keys in advance, for example "Party A". The user may then click a button corresponding to that key to indicate that a value for the key is about to be selected. The content that the user subsequently selects from the text content displayed in the text labeling interface is used as the value corresponding to the key, and a key-value pair is generated.
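A minimal sketch of the predefined-key case follows. The class `PredefinedKeyAnnotator` and its methods are hypothetical names; the only behavior taken from the description is that the user clicks the button of a predefined key and the next selected content becomes a value of that key.

```python
class PredefinedKeyAnnotator:
    """Hypothetical model: keys are defined up front, only values are selected."""

    def __init__(self, keys):
        self._values = {key: [] for key in keys}  # preserves definition order
        self._active = None

    def click_key(self, key):
        """Click the button of a predefined key to arm value selection."""
        if key not in self._values:
            raise KeyError(key)
        self._active = key

    def select(self, text):
        """Use the selected text as a value of the currently active key."""
        if self._active is None:
            raise ValueError("click a key button before selecting text")
        self._values[self._active].append(text)

    def pairs(self):
        """Return (key, value) tuples for every value selected so far."""
        return [(k, v) for k, vs in self._values.items() for v in vs]


annotator = PredefinedKeyAnnotator(["Party A", "Party B"])
annotator.click_key("Party A")
annotator.select("Example Construction Co.")
annotator.click_key("Party B")
annotator.select("ABC Building Materials Sales Co.")
```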

It can be seen from the above description that, when key-value pairs are generated based on the text labeling interface, the method supports both the case where a key and its corresponding value appear in the text content and the case where the user defines the key and selects only the corresponding value in the text content. The method is therefore applicable to a variety of situations.

2) Table content

For the table content, a corresponding table annotation interface may be generated.

For each cell in the table content displayed in the table labeling interface, the category set by the user may be obtained, where the categories may include key and value. For each cell whose category is key, the direction information set by the user, indicating where the corresponding value is located, may also be obtained. Key-value pairs are then generated according to the settings made by the user.

Fig. 3 is a schematic diagram of the table content of the present disclosure. As shown in fig. 3, the category of each cell can be set separately. The categories may include key and value, and may also include others. For example, in fig. 3 the light gray cells are keys, the white cells are values, and the dark gray cells are "other" (neither key nor value). Each arrow indicates the direction in which the value corresponding to a key cell is located.

As a possible implementation, in the table labeling interface a button may be displayed at a predetermined position on each cell. After the user clicks any such button, the categories available for selection, such as key, value, and other, may be displayed, and the category the user then clicks is used as the category of the cell corresponding to the button. In addition, a button for selecting a direction may be displayed for each cell. Similarly, after the user clicks any such button, the available direction information, such as up, down, left, and right, may be displayed, and the direction the user then clicks is used as the direction in which the value corresponding to that cell (whose category is key) is located.

And according to the operation of the user, the key value pair can be correspondingly generated. In practical applications, a key value pair may include only one key and one value, may include one key and multiple values, may include multiple keys and one value, may include multiple keys and multiple values, and the like.

It can be seen that, by adopting the processing mode, a user can respectively process each cell, which is simple and convenient, so that the data marking efficiency is improved, and the method has universal applicability, such as applicability to simple one-dimensional tables, complex multi-dimensional tables and the like.
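The table case can be sketched as follows. The disclosure states only that each cell has a category and that each key cell has a direction pointing to its value; how far to walk in that direction is not specified. This sketch assumes, purely for illustration, that a key's values are the contiguous run of value cells starting next to the key in the chosen direction. The names `Category`, `STEP`, and `table_pairs` are hypothetical.

```python
from enum import Enum


class Category(Enum):
    KEY = "key"
    VALUE = "value"
    OTHER = "other"


# Row/column offsets for each direction the user can choose.
STEP = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def table_pairs(cells, categories, directions):
    """Build (key, [values]) pairs from per-cell user settings.

    cells:      2D list of cell texts
    categories: 2D list of Category, same shape as cells
    directions: dict mapping (row, col) of a key cell to a direction name
    """
    pairs = []
    for (r, c), direction in directions.items():
        if categories[r][c] is not Category.KEY:
            continue  # a direction only matters for key cells
        dr, dc = STEP[direction]
        rr, cc = r + dr, c + dc
        values = []
        # Assumption: collect adjacent value cells until a non-value cell
        # or the table border is reached.
        while (0 <= rr < len(cells) and 0 <= cc < len(cells[rr])
               and categories[rr][cc] is Category.VALUE):
            values.append(cells[rr][cc])
            rr, cc = rr + dr, cc + dc
        pairs.append((cells[r][c], values))
    return pairs


K, V, O = Category.KEY, Category.VALUE, Category.OTHER
cells = [["Name", "Alice", "Notes"],
         ["Age", "30", "n/a"]]
categories = [[K, V, O],
              [K, V, O]]
directions = {(0, 0): "right", (1, 0): "right"}
result = table_pairs(cells, categories, directions)
```

Because each key carries its own direction, the same function handles row-oriented and column-oriented layouts; the multi-key cases mentioned above would need an additional grouping rule that the disclosure leaves open.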

3) Picture content

For the picture content, a corresponding picture labeling interface may be generated.

For the text information in the picture content displayed in the picture labeling interface, the content selected by the user as a key and as a corresponding value may be obtained, and a key-value pair is generated from the obtained content.

Fig. 4 is a schematic diagram of text information in displayed picture content according to the present disclosure. As shown in fig. 4, for picture content, that is, data in the form of a picture, each character and its position in the picture may be recognized by Optical Character Recognition (OCR), and the position of each recognized character may be marked with a rectangular frame. Similar to the text labeling interface, two buttons may be displayed in the picture labeling interface; clicking them indicates that the user is about to select a key or a value, respectively. The user then frames the content serving as the key and the content serving as the value, from which a key-value pair is generated, for example with the key "import port" and the corresponding value "airport".

Through the processing, a user can directly select keys and values on the picture, flexibility and convenience are achieved, and therefore data labeling efficiency is improved.
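One way to turn a user-drawn frame into selected text is to collect the recognized characters whose rectangles lie inside the frame. The sketch below assumes the OCR step has already produced per-character boxes; no OCR library is called, the box data is made up, and `CharBox` and `chars_in_frame` are hypothetical names. Reading order is approximated by sorting on the top-left corner, which is an assumption that holds for simple horizontal text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharBox:
    """One recognized character and its rectangle (x, y = top-left corner)."""
    char: str
    x: float
    y: float
    w: float
    h: float


def chars_in_frame(boxes, frame):
    """Return the text of all character boxes fully inside the user's frame.

    frame is (x, y, width, height) of the rectangle drawn by the user.
    """
    fx, fy, fw, fh = frame
    inside = [
        b for b in boxes
        if fx <= b.x and fy <= b.y
        and b.x + b.w <= fx + fw and b.y + b.h <= fy + fh
    ]
    # Approximate reading order: top to bottom, then left to right.
    inside.sort(key=lambda b: (b.y, b.x))
    return "".join(b.char for b in inside)


# Made-up OCR output: "PORT" on one line and "AIR" on a lower line.
boxes = [CharBox(ch, 10 * i, 0, 8, 10) for i, ch in enumerate("PORT")]
boxes += [CharBox(ch, 10 * i, 20, 8, 10) for i, ch in enumerate("AIR")]

key_text = chars_in_frame(boxes, (0, 0, 40, 12))     # frame drawn after the key button
value_text = chars_in_frame(boxes, (0, 18, 30, 14))  # frame drawn after the value button
pair = (key_text, value_text)
```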

The key-value pairs obtained in any of the above-described manners may be stored in a predetermined format. The specific format is not limited; for example, it may be a data format that is convenient for artificial intelligence algorithms to learn from.

In addition, for any key-value pair, one or all of the following information may also be stored: position information, direction information. The position information is position information of a key and/or a value in the key value pair in the rich document, such as a page, a paragraph and the like, and the direction information is direction information of the value in the key value pair relative to the key.

For any key-value pair, if location information can be obtained, the obtained location information may be stored, otherwise, the location information may not be stored, and similarly, if direction information can be obtained, the obtained direction information may be stored, otherwise, the direction information may not be stored.

After the key value pairs are uniformly stored according to the mode, the key value pairs can be conveniently managed and used subsequently.
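As one illustration of uniform storage, key-value pairs could be serialized as JSON, with position and direction information written only when it is available. The disclosure does not fix a format, so JSON, the `KeyValueRecord` type, its field names, and the `to_json` function are all assumptions made for this sketch.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class KeyValueRecord:
    """Hypothetical stored form of one key-value pair."""
    key: str
    values: list
    position: Optional[dict] = None   # e.g. {"page": 1, "paragraph": 2}
    direction: Optional[str] = None   # direction of the value relative to the key


def to_json(records):
    """Serialize records, omitting position/direction when they are unknown."""
    rows = [
        {name: value for name, value in asdict(r).items() if value is not None}
        for r in records
    ]
    return json.dumps(rows, ensure_ascii=False)


records = [
    KeyValueRecord("Party A", ["Example Co."], position={"page": 1, "paragraph": 2}),
    KeyValueRecord("Name", ["Alice"], direction="right"),
]
stored = to_json(records)
```

Dropping absent fields rather than writing nulls follows the statement that information which cannot be obtained need not be stored.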

It is noted that while for simplicity of explanation, the foregoing method embodiments are described as a series of acts, those skilled in the art will appreciate that the present disclosure is not limited by the order of acts, as some steps may, in accordance with the present disclosure, occur in other orders and concurrently. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required for the disclosure.

The above is a description of embodiments of the method, and the embodiments of the apparatus are further described below.

Fig. 5 is a schematic structural diagram of an embodiment 500 of the data annotation device according to the present disclosure. As shown in fig. 5, the device includes a parsing module 501 and a labeling module 502.

The parsing module 501 is configured to parse the rich document to be processed, and to generate a labeling interface corresponding to the content type included in the rich document according to the parsing result.

The labeling module 502 is configured to generate a key-value pair according to an operation performed by a user on the labeling interface, and to use the obtained key-value pair as a data labeling result.

Correspondingly, after the rich document is parsed, the parsing module 501 may generate the labeling interfaces corresponding to different content types included in the rich document according to the parsing result.

The parsing module 501 may generate a corresponding text labeling interface when the rich document includes text content, where the text labeling interface includes the text content, and generate a corresponding table labeling interface when the rich document includes table content, where the table labeling interface includes the table content, and generate a corresponding picture labeling interface when the rich document includes picture content, where the picture labeling interface includes the picture content.

Further, the labeling module 502 may generate a key-value pair according to an operation performed by a user on the labeling interface, and use the obtained key-value pair as a data labeling result.

For example, the labeling module 502 may obtain, for the text content displayed in the text labeling interface, the content selected by the user as the key and the corresponding value, and generate the key value pair according to the obtained content, and/or obtain, for the text content displayed in the text labeling interface, the content selected by the user as the value, determine the key corresponding to the obtained value, and generate the key value pair according to the determined key and the obtained value.

For another example, the labeling module 502 may obtain, for each cell in the table content displayed in the table labeling interface, the category set by the user, where the categories include key and value; obtain, for each cell whose category is key, the direction information set by the user indicating where the corresponding value is located; and generate a key-value pair according to the settings made by the user.

For another example, the labeling module 502 may obtain, for text information in the picture content displayed in the picture labeling interface, content selected by the user as a key and a corresponding value, and generate a key value pair according to the obtained content.

The labeling module 502 may store the key-value pairs obtained in any of the above manners in a predetermined format. The specific format is not limited; for example, it may be a data format that is convenient for artificial intelligence algorithms to learn from.

In addition, for any key-value pair, the labeling module 502 may also store one or both of the following: position information and direction information. The position information is the position of the key and/or the value of the key-value pair in the rich document, such as the page or paragraph, and the direction information is the direction of the value relative to the key.

For a specific work flow of the apparatus embodiment shown in fig. 5, reference is made to the related description in the foregoing method embodiment, and details are not repeated.

In short, by adopting the scheme of the embodiment of the disclosure, the annotation interface can be generated by analyzing the rich document, and the key value pair can be generated based on the operation executed by the user on the annotation interface, so that the annotation of various types of data in the rich document and the like can be conveniently and efficiently realized.

The scheme disclosed by the disclosure can be applied to the field of artificial intelligence, in particular to the fields of computer vision, natural language processing and the like.

Artificial intelligence is a subject for studying a computer to simulate some thinking processes and intelligent behaviors (such as learning, reasoning, thinking, planning and the like) of a human, and has a hardware technology and a software technology, the artificial intelligence hardware technology generally comprises technologies such as a sensor, a special artificial intelligence chip, cloud computing, distributed storage, big data processing and the like, and the artificial intelligence software technology mainly comprises a computer vision technology, a voice recognition technology, a natural language processing technology, machine learning/deep learning, a big data processing technology, a knowledge graph technology and the like.

The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.

FIG. 6 illustrates a schematic block diagram of an example electronic device 600 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 6, the device 600 includes a computing unit 601, which can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 602 or a computer program loaded from a storage unit 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the device 600 can also be stored. The computing unit 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

A number of components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, a mouse, or the like; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.

The computing unit 601 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 601 performs the various methods and processes described above, such as the methods described in this disclosure. For example, in some embodiments, the methods described in this disclosure may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into RAM 603 and executed by the computing unit 601, one or more steps of the methods described in the present disclosure may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured by any other suitable means (e.g., by means of firmware) to perform the methods described in the present disclosure.

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that addresses the high management difficulty and weak service scalability of traditional physical hosts and Virtual Private Servers (VPS). The server may also be a server of a distributed system, or a server incorporating a blockchain. Cloud computing refers to a technical system in which an elastically scalable pool of shared physical or virtual resources is accessed through a network, and in which the resources, which may include servers, operating systems, networks, software, applications, storage devices and the like, are deployed and managed on demand in a self-service manner. Cloud computing technology can provide efficient and powerful data processing capability for artificial intelligence, blockchain and other technical applications and for model training.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.

The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims

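1. A data labeling method, comprising:

analyzing a rich document to be processed, and generating, according to the analysis result, a labeling interface corresponding to the content type included in the rich document; and

generating key-value pairs according to operations performed by a user on the labeling interface, and taking the obtained key-value pairs as the data labeling result.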

2. The method of claim 1, wherein,

the content type included in the rich document comprises one or any combination of the following: text content, table content, and picture content;

the generating of the labeling interface corresponding to the content type included in the rich document comprises:

when the rich document comprises text content, generating a corresponding text labeling interface, wherein the text labeling interface comprises the text content;

when the rich document comprises table content, generating a corresponding table labeling interface, wherein the table labeling interface comprises the table content;

and when the rich document comprises picture content, generating a corresponding picture labeling interface, wherein the picture labeling interface comprises the picture content.
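The mapping from content type to labeling interface recited in claim 2 could be sketched as follows. This is only an illustration under stated assumptions: the dictionary keys, the `LabelingInterface` class, and `build_interfaces` are hypothetical names, and the parser output is assumed to be a dictionary from content type to a list of parsed items.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical content-type names; the claim only lists text, table,
# and picture content.
INTERFACE_FOR_TYPE: Dict[str, str] = {
    "text": "text labeling interface",
    "table": "table labeling interface",
    "picture": "picture labeling interface",
}


@dataclass
class LabelingInterface:
    kind: str        # which labeling interface this is
    content: object  # the parsed content that the interface displays


def build_interfaces(parsed: Dict[str, List[object]]) -> List[LabelingInterface]:
    """Create one labeling interface per content item found by the parser."""
    interfaces = []
    for content_type, items in parsed.items():
        if content_type not in INTERFACE_FOR_TYPE:
            continue  # content types outside the claim are skipped in this sketch
        for item in items:
            interfaces.append(
                LabelingInterface(INTERFACE_FOR_TYPE[content_type], item)
            )
    return interfaces
```

Each generated interface carries the content it displays, which mirrors the claim wording that the text, table, or picture labeling interface "comprises" the corresponding content.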

3. The method of claim 2, wherein,

when the labeling interface is a text labeling interface, the generating of key-value pairs according to the operations performed by the user on the labeling interface comprises one or both of the following:

for the text content displayed in the text labeling interface, acquiring content selected by the user as a key and content selected as the corresponding value, and generating a key-value pair from the acquired content;

for the text content displayed in the text labeling interface, acquiring content selected by the user as a value, determining the key corresponding to the acquired value, and generating a key-value pair from the determined key and the acquired value.
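The two text labeling modes of claim 3 could look roughly like the sketch below. It assumes that user selections arrive as half-open character spans `(start, end)` into the displayed text, and that the key for a value-only selection comes from a caller-supplied function; both are assumptions for illustration, since the claim does not state how the key is determined.

```python
from typing import Callable, Dict, Tuple

Span = Tuple[int, int]  # hypothetical (start, end) character offsets, end exclusive


def pair_from_selection(text: str, key_span: Span, value_span: Span) -> Dict[str, str]:
    """Mode 1: the user selects both the key and the corresponding value."""
    key = text[key_span[0]:key_span[1]]
    value = text[value_span[0]:value_span[1]]
    return {"key": key, "value": value}


def pair_from_value(
    text: str, value_span: Span, determine_key: Callable[[str], str]
) -> Dict[str, str]:
    """Mode 2: the user selects only the value; the key is determined separately."""
    value = text[value_span[0]:value_span[1]]
    # determine_key is a placeholder for whatever mechanism supplies the key,
    # for example a choice from a preset key list.
    return {"key": determine_key(value), "value": value}
```

For the text "Name: Alice", selecting characters 0 to 4 as the key and 6 to 11 as the value yields the pair ("Name", "Alice") in the first mode.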

4. The method of claim 2, wherein,

when the labeling interface is a table labeling interface, the generating of key-value pairs according to the operations performed by the user on the labeling interface comprises:

for each cell in the table content displayed in the table labeling interface, acquiring the category set by the user, wherein the category is either key or value; for each cell whose category is set as key, acquiring the user-set direction information of the corresponding value; and generating key-value pairs according to the settings made by the user.
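A minimal sketch of the table labeling step in claim 4 follows. It assumes that the direction information names the adjacent cell holding the value ("right", "below", and so on) and that cells are addressed by `(row, column)`; the direction names, the `OFFSETS` table, and `pairs_from_table` are hypothetical.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column) index into the table

# Hypothetical direction names mapped to (row offset, column offset).
OFFSETS: Dict[str, Tuple[int, int]] = {
    "right": (0, 1),
    "below": (1, 0),
    "left": (0, -1),
    "above": (-1, 0),
}


def pairs_from_table(
    table: List[List[str]],
    categories: Dict[Cell, str],   # user-set category per cell: "key" or "value"
    directions: Dict[Cell, str],   # user-set direction of the value, per key cell
) -> List[Dict[str, str]]:
    """Pair every key cell with the value cell found in the given direction."""
    pairs = []
    for (row, col), category in categories.items():
        if category != "key":
            continue
        d_row, d_col = OFFSETS[directions[(row, col)]]
        v_row, v_col = row + d_row, col + d_col
        # Only pair with a cell that the user also marked as a value.
        if categories.get((v_row, v_col)) == "value":
            pairs.append({"key": table[row][col], "value": table[v_row][v_col]})
    return pairs
```

In a two-column table whose left column holds field names, marking the left cells as keys with direction "right" pairs each field name with the cell beside it.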

5. The method of claim 2, wherein,

when the labeling interface is a picture labeling interface, the generating of key-value pairs according to the operations performed by the user on the labeling interface comprises:

for the text information in the picture content displayed in the picture labeling interface, acquiring content selected by the user as a key and content selected as the corresponding value, and generating a key-value pair from the acquired content.

6. The method of any one of claims 1-5, further comprising: storing the obtained key-value pairs in a preset format.

7. The method of claim 6, further comprising:

for any key-value pair, storing one or both of the following: position information and direction information;

wherein the position information is the position, in the rich document, of the key and/or the value of the key-value pair, and the direction information is the direction of the value relative to the key.
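The storage described in claims 6 and 7 could be sketched as below. The claims only require "a preset format", so the use of JSON, the field names, and the helpers `make_record` and `serialize` are assumptions made for illustration.

```python
import json
from typing import Dict, List, Optional


def make_record(
    key: str,
    value: str,
    key_position: Optional[List[int]] = None,    # position of the key in the rich document
    value_position: Optional[List[int]] = None,  # position of the value in the rich document
    direction: Optional[str] = None,             # direction of the value relative to the key
) -> Dict[str, object]:
    """Build one stored record; position and direction are optional extras."""
    record: Dict[str, object] = {"key": key, "value": value}
    position = {}
    if key_position is not None:
        position["key"] = key_position
    if value_position is not None:
        position["value"] = value_position
    if position:
        record["position"] = position
    if direction is not None:
        record["direction"] = direction
    return record


def serialize(records: List[Dict[str, object]]) -> str:
    """Serialize records to JSON, standing in for the preset format."""
    return json.dumps(records, ensure_ascii=False, sort_keys=True)
```

Keeping position and direction as optional fields matches the claim language, which allows either one or both to be stored for any key-value pair.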

8. A data labeling apparatus, comprising: an analysis module and a labeling module;

9. The apparatus of claim 8, wherein,

the analysis module generates a corresponding text labeling interface when the rich document comprises text content, wherein the text labeling interface comprises the text content; generates a corresponding table labeling interface when the rich document comprises table content, wherein the table labeling interface comprises the table content; and generates a corresponding picture labeling interface when the rich document comprises picture content, wherein the picture labeling interface comprises the picture content.

10. The apparatus of claim 9, wherein,

the labeling module, for the text content displayed in the text labeling interface, acquires content selected by the user as a key and content selected as the corresponding value, and generates a key-value pair from the acquired content;

and/or the labeling module, for the text content displayed in the text labeling interface, acquires content selected by the user as a value, determines the key corresponding to the acquired value, and generates a key-value pair from the determined key and the acquired value.

11. The apparatus of claim 9, wherein,

the labeling module, for each cell in the table content displayed in the table labeling interface, acquires the category set by the user, wherein the category is either key or value; acquires, for each cell whose category is set as key, the user-set direction information of the corresponding value; and generates key-value pairs according to the settings made by the user.

12. The apparatus of claim 9, wherein,

the labeling module, for the text information in the picture content displayed in the picture labeling interface, acquires content selected by the user as a key and content selected as the corresponding value, and generates a key-value pair from the acquired content.

13. The apparatus of any one of claims 8 to 12,

the labeling module is further configured to store the obtained key-value pairs in a preset format.

14. The apparatus of claim 13, wherein,

the labeling module is further configured to store, for any key-value pair, one or both of the following: position information and direction information; wherein the position information is the position, in the rich document, of the key and/or the value of the key-value pair, and the direction information is the direction of the value relative to the key.

15. An electronic device, comprising:

at least one processor; and

a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.

16. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-7.

17. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-7.