CN111310750B - Information processing method, device, computing equipment and medium - Google Patents

Information processing method, device, computing equipment and medium

Info

Publication number
CN111310750B
Authority
CN
China
Prior art keywords
image, information, text, area, key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811513713.2A
Other languages
Chinese (zh)
Other versions
CN111310750A (en)
Inventor
陆忠芳
王涛
金磊豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201811513713.2A priority Critical patent/CN111310750B/en
Publication of CN111310750A publication Critical patent/CN111310750A/en
Application granted granted Critical
Publication of CN111310750B publication Critical patent/CN111310750B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Abstract

The invention discloses an information processing method, an information processing device, a computing device and a medium. The information processing method comprises the following steps: performing text information recognition on each image in an image set to acquire one or more pieces of key text information contained in the image and the position information of the key text information; and associating the key text information of the image with the corresponding position information to generate corresponding key data.

Description

Information processing method, device, computing equipment and medium
Technical Field
The present invention relates to the field of image processing and internet technologies, and in particular, to an information processing method, an information processing device, a computing device, and a medium.
Background
At present, document recognition services based on OCR (Optical Character Recognition, that is, the process of examining characters printed on paper with electronic devices such as scanners or digital cameras, determining their shapes by detecting dark and bright patterns, and then translating the shapes into computer text by character recognition methods) have been used in enterprises to effectively replace manual information entry, thereby greatly improving work efficiency.
However, conventional document recognition only converts paper documents in graphic form into text format, and problems remain when it is applied in more specialized fields. For example, in an actual business scenario, key information in a paper document usually needs to be identified according to the document type and turned into structured data before being stored; the existing scheme, however, only converts the content of the paper document into an electronic document through OCR technology, after which, more often than not, the useful information is manually extracted from the electronic document and submitted for storage, rather than being handled in a more intelligent and automatic way.
In addition, considering that errors exist in character recognition, further editing and verification are often needed in practical applications; this part inevitably involves manual operation, and without good auxiliary means the work efficiency is also affected. For example, the editing area may be highlighted to indicate the current operation position by marking the position with DOM (Document Object Model) elements, by introducing a third-party JS (JavaScript, an interpreted scripting language) class library, or the like, so as to assist the user in completing the editing. However, marking positions with DOM elements generates many additional DOM elements, performs poorly and is unsuitable for drawing irregular areas, while a third-party JS class library requires additionally introducing script files of which only a small part of the functionality is used, so the cost outweighs the benefit.
Based on this, correctly identifying the key information in a document and providing better auxiliary means to improve the accuracy and convenience of editing and checking become the key to solving the above problems. Accordingly, there is a need for a technical solution that optimizes the above process.
Disclosure of Invention
To this end, the present invention provides an information processing scheme in an effort to solve or at least alleviate the above-identified problems.
According to an aspect of the present invention, there is provided an information processing method including the steps of: firstly, recognizing text information of each image in an image set to acquire one or more pieces of key text information contained in the image and position information of the key text information; and correlating the key text information of the image with the corresponding position information to generate corresponding key data.
Optionally, in the information processing method according to the present invention, performing text information recognition on each image in the image set includes: performing optical character recognition on each image in the image set to acquire the text information contained in the image and the position information of the text information; performing title extraction on each piece of acquired text information to determine one or more titles in the text corresponding to the image set; classifying each piece of text information according to the titles to obtain a text set corresponding to a title and the category of the text set, wherein the text set includes one or more pieces of text information corresponding to the title; and inputting the text set into a named entity recognition model corresponding to the category of the text set so as to extract one or more pieces of key text information contained in the text set.
Optionally, in the information processing method according to the present invention, performing text information recognition on each image in the image set includes: and determining the position information of the key text information according to the position information of each text information in the text set to which the key text information belongs.
Optionally, the information processing method according to the present invention further includes acquiring the image set in advance, which includes: converting a document in a first format to generate a plurality of images corresponding to the document; if the edge intensity of an image is smaller than a preset intensity threshold, determining that the image is a blank image; and dividing the plurality of images corresponding to the document based on the blank images to form one or more image sets.
Optionally, in the information processing method according to the present invention, the document includes a legal document.
Optionally, in the information processing method according to the present invention, the position information of the key text information includes coordinate information and an identification of the image from which the key text information is derived.
Optionally, in the information processing method according to the present invention, further comprising: generating a set identifier of the image set; and storing the set identifier and the image set and key data corresponding to each image in the image set in an associated manner.
Optionally, in the information processing method according to the present invention, further comprising: responding to a request of a client, wherein the request comprises a set identifier to be searched; searching an image set associated with the set identifier to be searched according to the request; and sending the searched image set and the key data corresponding to each image in the image set to the client so that the client can display in the display interface.
According to still another aspect of the present invention, there is provided an information processing method including the steps of: firstly, detecting the current position of a cursor in a display interface, wherein the display interface comprises a first area and a second area, and the second area comprises one or more preset areas; if the current position is in the preset area, marking the display content corresponding to the preset area in the first area; the first area displays the images in the image set of the information processing method according to the invention, and the second area displays the key text information corresponding to the images.
Optionally, in the information processing method according to the present invention, before detecting the current position of the cursor in the display interface, the method further includes: responding to an operation of a user, and acquiring a set identifier to be searched according to the operation; and forming a request from the set identifier and sending the request to a server so as to acquire the image set associated with the set identifier to be searched and the key data corresponding to each image in the image set.
Optionally, in the information processing method according to the present invention, further comprising: receiving an image set issued by a server and key data corresponding to each image in the image set, wherein the key data comprises key text information and position information of the key text information contained in the image; the image is displayed in the first area, and the key text information contained in the image is displayed in the second area.
Optionally, in the information processing method according to the present invention, displaying the key text information contained in the image in the second area includes: correspondingly displaying the key text information contained in the image in a preset area of the second area.
Optionally, in the information processing method according to the present invention, marking the display content corresponding to the preset area in the first area includes: acquiring position information of key text information corresponding to a preset area; and determining display contents corresponding to the preset area in the first area according to the position information, and marking the display contents.
Optionally, in the information processing method according to the present invention, marking the display content includes: marking the display content by superimposing a canvas element in the first area.
Optionally, in the information processing method according to the present invention, the position information of the key text information includes coordinate information and an identification of the image from which the key text information is derived, and the method further includes: if the number of identifications of the images from which the key text information in the preset area where the cursor is located is derived is greater than 1, generating a corresponding switching icon in the first area.
According to still another aspect of the present invention, there is provided an information processing apparatus including an identification module and a generation module. The recognition module is suitable for recognizing the text information of each image in the image set to acquire one or more pieces of key text information contained in the image and position information of the key text information; the generation module is suitable for associating the key text information of the image with the corresponding position information to generate corresponding key data.
According to still another aspect of the present invention, there is provided an information processing apparatus including a detection module and a marking module. The detection module is suitable for detecting the current position of a cursor in the display interface, the display interface comprises a first area and a second area, the second area comprises one or more preset areas, the first area is displayed with images in the image set of the information processing device, and the second area is displayed with key text information corresponding to the images; the marking module is suitable for marking the display content corresponding to the preset area in the first area when the current position is in the preset area.
According to yet another aspect of the present invention, there is provided a computing device comprising one or more processors, memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing the information processing method according to the present invention.
According to yet another aspect of the present invention, there is also provided a computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform the information processing method according to the present invention.
According to the information processing scheme of the present invention, text in image format is recognized by optical character recognition technology so as to acquire one or more pieces of key text information contained in the text and the position information of that key text information; the key text information is then associated with the corresponding position information to generate structured key data, which is more intelligent and automatic. In the text recognition process, each page of the document converted into image format is recognized based on a specific recognition algorithm corresponding to the document type, which improves recognition accuracy.
Further, based on the generated structured key data, a function of online editing and checking is provided to the user: according to the position information of each piece of key text information, a highlighted region is drawn, in the first area where the corresponding image of the document is displayed, for the information associated with the position the user is currently operating on in the second area, so as to assist editing and checking. The highlighted region can be drawn with a canvas element, which, compared with DOM elements, third-party JS class libraries and the like, is more convenient and better suited to this scenario.
Drawings
To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the annexed drawings, which set forth the various ways in which the principles disclosed herein may be practiced, and all aspects and equivalents thereof are intended to fall within the scope of the claimed subject matter. The above, as well as additional objects, features, and advantages of the present disclosure will become more apparent from the following detailed description when read in conjunction with the accompanying drawings. Like reference numerals generally refer to like parts or elements throughout the present disclosure.
FIG. 1 shows a schematic diagram of an information handling system 100 according to one embodiment of the invention;
FIG. 2 illustrates a block diagram of a computing device 200, according to one embodiment of the invention;
FIG. 3 shows a flow chart of an information processing method 300 according to one embodiment of the invention;
FIG. 4 shows a schematic diagram of images in a collection of images according to one embodiment of the invention;
FIG. 5 shows a flow chart of an information processing method 500 according to one embodiment of the invention;
FIG. 6A shows a schematic diagram of information display in a first region and a second region according to one embodiment of the invention;
FIG. 6B shows a schematic diagram of information display in a first area and a second area after marking according to one embodiment of the invention;
FIG. 6C shows a schematic diagram of information display in a first region and a second region according to yet another embodiment of the invention;
fig. 7 shows a schematic diagram of an information processing apparatus 700 according to an embodiment of the present invention; and
fig. 8 shows a schematic diagram of an information processing apparatus 800 according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1 shows a schematic diagram of an information handling system 100 according to one embodiment of the invention. It should be noted that the information processing system 100 in fig. 1 is merely exemplary, and in a specific practical situation, there may be different numbers of terminal devices and servers in the information processing system 100, and the terminal devices may be electronic devices such as a PC, a smart phone, a tablet computer, etc., which is not a limitation of the present invention.
As shown in fig. 1, the information processing system 100 includes a terminal device 110 and a server 120. In which an information processing apparatus 111 (not shown) resides in the terminal device 110, and an information processing apparatus 121 (not shown) resides in the server 120. Next, the information processing system 100 will be described in a specific application scenario.
In this scenario, a paper document is scanned to generate a PDF (Portable Document Format) electronic document, and the obtained PDF electronic document is uploaded to a server 120 communicatively connected to the scanner. The server 120 converts each page of the PDF electronic document into an image with a conversion tool; for example, a 5-page PDF electronic document can be converted into 5 images. After the obtained images are formed into an image set, the server 120 performs text information recognition on each image in the image set through the information processing apparatus 121, so as to obtain one or more pieces of key text information contained in the images and the position information of the key text information, and associates the key text information of each image with the corresponding position information to generate corresponding key data. Thereafter, a set identifier of the image set is generated, and the set identifier is associated with the image set and the key data corresponding to each image in the image set and stored in the server 120, or alternatively in a database server (not shown) connected to the server 120.
For the terminal device 110, the information processing apparatus 111 may be understood as a plug-in deployed on a client (such as a platform system), which is typically a browser, a Web application or a hybrid application, and is capable of invoking HTML (HyperText Markup Language) pages to display information. After the user logs in to this client, the information processing apparatus 111, in response to an operation by the user, obtains the set identifier to be searched according to the operation, forms a request from the set identifier, and sends the request to the server 120 to obtain the image set associated with the set identifier to be searched and the key data corresponding to each image in the image set.
The server 120 will respond to the request of the client at this time, search the image set associated with the set identifier to be searched according to the request, and send the searched image set and the key data corresponding to each image in the image set to the client of the terminal device 110, where the key data includes the key text information and the position information of the key text information contained in the image.
In the terminal device 110, the information processing apparatus 111 receives the image set sent by the server 120 and the key data corresponding to each image in the image set. The current display interface includes a first area and a second area, so the image is displayed in the first area, and the key text information contained in the image is correspondingly displayed in the preset areas of the second area. For example, in the current interface, the first image in the image set is displayed in the first area, and the key text information contained in that image is correspondingly displayed in the preset areas of the second area, where the first image is the first page of the PDF electronic document and the key text information is the key text content of that first page. The current position of the cursor in the display interface is then detected, and if the current position is within a preset area, the display content corresponding to that preset area is marked in the first area by means of a canvas element, so that the user can conveniently edit and verify the key text information.
According to an embodiment of the present invention, the terminal device 110 and the server 120 in the information processing system 100 may be implemented by the computing device 200 as described below. FIG. 2 illustrates a block diagram of a computing device 200, according to one embodiment of the invention.
As shown in FIG. 2, in a basic configuration 202, computing device 200 typically includes a system memory 206 and one or more processors 204. A memory bus 208 may be used for communication between the processor 204 and the system memory 206.
Depending on the desired configuration, the processor 204 may be any type of processor, including but not limited to: a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. Processor 204 may include one or more levels of cache, such as a first level cache 210 and a second level cache 212, a processor core 214, and registers 216. The example processor core 214 may include an Arithmetic Logic Unit (ALU), a Floating Point Unit (FPU), a digital signal processing core (DSP core), or any combination thereof. The example memory controller 218 may be used with the processor 204, or in some implementations, the memory controller 218 may be an internal part of the processor 204.
Depending on the desired configuration, system memory 206 may be any type of memory including, but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. The system memory 206 may include an operating system 220, one or more programs 222, and program data 224. In some implementations, the program 222 may be arranged to be executed by the one or more processors 204 on the operating system, using the program data 224.
Computing device 200 may also include an interface bus 240 that facilitates communication from various interface devices (e.g., output devices 242, peripheral interfaces 244, and communication devices 246) to basic configuration 202 via bus/interface controller 230. The example output device 242 includes a graphics processing unit 248 and an audio processing unit 250. They may be configured to facilitate communication with various external devices, such as a display or speakers, via one or more a/V ports 252. The example peripheral interface 244 may include a serial interface controller 254 and a parallel interface controller 256, which may be configured to facilitate communication via one or more I/O ports 258 and external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.). The example communication device 246 may include a network controller 260 that may be arranged to facilitate communication with one or more other computing devices 262 over a network communication link via one or more communication ports 264.
The network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR) or other wireless media. The term computer readable media as used herein may include both storage media and communication media.
Computing device 200 may be implemented as a server, such as a file server, database server, application server, or WEB server, or as part of a small-sized portable (or mobile) electronic device, such as a cellular telephone, a Personal Digital Assistant (PDA), a personal media player device, a wireless web-watch device, a personal headset device, an application-specific device, or a hybrid device that includes any of the above functions. Computing device 200 may also be implemented as a personal computer including desktop and notebook computer configurations.
In some embodiments, computing device 200 is implemented as a terminal device 110 and/or a server 120 and is configured to perform information processing method 300 and/or information processing method 500 according to the present invention. The program 222 of the computing device 200 includes a plurality of program instructions for executing the information processing method 300 and/or the information processing method 500 according to the present invention, and the program data 224 may also store configuration information of the information processing system 100, etc.
Fig. 3 shows a flow chart of an information processing method 300 according to an embodiment of the invention. As shown in fig. 3, the method 300 begins at step S310. In step S310, text information recognition is performed on each image in the image set to obtain one or more pieces of key text information included in the image and location information of the key text information. In view of the above-described need for pre-acquisition of image sets, according to one embodiment of the present invention, the image sets may be pre-acquired before step S310 as follows.
In this embodiment, a document in the first format is first converted to generate a plurality of images corresponding to the document; if the edge intensity of an image is smaller than a preset intensity threshold, the image is determined to be a blank image; the plurality of images corresponding to the document are then divided based on the blank images to form one or more image sets. The document includes a legal document, and the first format includes PDF.
For example, PDF document A is subjected to conversion processing; since document A has 50 pages of content, 50 images corresponding to document A are generated, and the 1st to 50th images correspond, in order, to the content of the 1st to 50th pages of the document. At this point, edge detection may be performed on each obtained image by an image edge detection algorithm such as the Canny edge detection algorithm, and if the edge intensity of an image is smaller than the preset intensity threshold, the image is determined to be a blank image. After the edge detection is finished, the 27th image can be determined to be a blank image, which indicates that the 27th page is a blank page; the 50 images corresponding to document A are then divided based on this blank image, so that 2 image sets are formed, where the first image set, denoted P1, comprises the 1st to 26th images, and the second image set, denoted P2, comprises the 28th to 50th images.
The specific value of the intensity threshold may be adjusted according to the image edge detection algorithm, the type of the document, the performance requirements and so on, which is not limited by the present invention. In addition, blank images are used as division boundaries because, by common business convention, blank pages serve as separators between different documents; for legal documents in particular, it is agreed in practice that they are separated by blank pages.
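As a concrete illustration of the pre-processing just described, the sketch below converts a PDF into page images, flags blank pages by their mean edge intensity, and splits the pages into image sets. It is only a minimal sketch under stated assumptions: the patent does not prescribe pdf2image or OpenCV, and the threshold value is a placeholder.

```python
# Sketch of the pre-processing step: PDF -> page images -> blank-page
# detection by edge intensity -> split into image sets. Library choices
# (pdf2image, OpenCV) and the threshold are illustrative assumptions.
import cv2
import numpy as np
from pdf2image import convert_from_path

def is_blank(image_bgr, intensity_threshold=1.0):
    """Treat a page as blank if its mean Canny edge intensity is below the threshold."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)          # binary edge map (0 or 255)
    return edges.mean() < intensity_threshold  # preset intensity threshold

def build_image_sets(pdf_path):
    pages = convert_from_path(pdf_path)        # one PIL image per PDF page
    images = [cv2.cvtColor(np.array(p), cv2.COLOR_RGB2BGR) for p in pages]
    image_sets, current = [], []
    for img in images:
        if is_blank(img):                      # a blank page closes the current set
            if current:
                image_sets.append(current)
            current = []
        else:
            current.append(img)
    if current:
        image_sets.append(current)
    return image_sets                          # e.g. two sets P1 and P2 for document A
```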
After the image set to be processed is acquired, according to one embodiment of the present invention, text information recognition may be performed on each image in the image set as follows. First, optical character recognition is performed on each image in the image set to acquire the text information contained in the image and the position information of the text information. In this embodiment, for each image in image set P1, the text information contained in the image and the position information of that text information are acquired by optical character recognition.
FIG. 4 shows a schematic diagram of an image in an image set according to one embodiment of the invention. As shown in FIG. 4, the image is the first image in image set P1. After optical character recognition, text information such as "civil complaint", "plaintiff: C1 International Trade Co., Ltd." and "residence: Shanggang Sharp, Science Museum Road 1234" can be extracted from the image, together with the position information of each piece of text information.
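The per-image recognition step can be illustrated with pytesseract, used here purely as a stand-in OCR engine; the patent does not name a specific engine, and the language pack and word-level granularity below are assumptions.

```python
# Sketch of per-image text recognition with position information, using
# pytesseract as a stand-in OCR engine. The 'chi_sim' language pack is an
# assumption for Chinese legal documents.
import pytesseract
from pytesseract import Output

def recognize_text_with_positions(image_bgr, image_id):
    data = pytesseract.image_to_data(image_bgr, lang="chi_sim", output_type=Output.DICT)
    results = []
    for i, text in enumerate(data["text"]):
        if not text.strip():
            continue
        x, y = data["left"][i], data["top"][i]
        w, h = data["width"][i], data["height"][i]
        results.append({
            "text": text,
            "image_id": image_id,
            # four vertices of the rectangular region, as described further below
            "coords": [(x, y), (x + w, y), (x + w, y + h), (x, y + h)],
        })
    return results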
Next, title extraction is performed on each piece of acquired text information to determine one or more titles in the text corresponding to the image set. For example, after title extraction is performed on the text information acquired in FIG. 4, the obtained title is "civil complaint"; among the remaining 2nd to 26th images in image set P1, a title is extracted only from the text information corresponding to the 16th image, and that title is "C7 court judgment".
Each piece of text information is then classified according to the titles to obtain a text set corresponding to each title and the category of that text set, where a text set includes one or more pieces of text information corresponding to its title. In this embodiment, only 2 titles are associated with the images in image set P1, and "civil complaint" and "C7 court judgment" correspond to the two title categories of complaint and judgment. Of course, title categories are not limited to complaints and judgments, and also cover other categories of legal documents such as indictments, summonses, enforcement orders and the like. On this basis, the text information corresponding to image set P1 is classified: the text information contained in the 1st to 15th images forms a corresponding text set, denoted T1, whose category is complaint, and the text information contained in the 16th to 26th images forms a corresponding text set, denoted T2, whose category is judgment.
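A deliberately simplified sketch of this title-based grouping is given below; the title keywords and the rule that a title appears on the first line of a page are illustrative assumptions, not part of the patent.

```python
# Group per-page text by detected title and map titles to document categories.
# The keyword rules are illustrative assumptions.
TITLE_CATEGORIES = {
    "民事起诉状": "complaint",   # civil complaint
    "判决书": "judgment",        # court judgment
}

def classify_by_title(pages_text):
    """pages_text: list of per-page text strings, in page order."""
    text_sets, current = {}, None
    for page in pages_text:
        first_line = page.splitlines()[0] if page.strip() else ""
        for title, category in TITLE_CATEGORIES.items():
            if title in first_line:              # a new title opens a new text set
                current = (title, category)
        if current is not None:
            text_sets.setdefault(current, []).append(page)
    return text_sets   # {(title, category): [page texts belonging to that title]}
```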
Finally, each text set is input into the named entity recognition model corresponding to its category to extract one or more pieces of key text information contained in the text set. According to one embodiment of the invention, text set T1 is input into the named entity recognition model corresponding to complaints, and text set T2 is input into the named entity recognition model corresponding to judgments, so as to extract one or more pieces of key text information contained in text sets T1 and T2.
Taking text set T1 as an example, a text set of the complaint category contains the basic information that a complaint should include, such as the plaintiff (there may be more than one), the cause of action, the litigation request, and the facts and reasons. Accordingly, the named entity recognition model corresponding to the complaint category should be able to recognize key text information including the plaintiff, the cause of action, the litigation request, and the facts and reasons; this named entity recognition model is preferably a sequence labeling model. The sequence labeling model may be a CRF (Conditional Random Field) model or a BiLSTM-CRF (Bi-directional Long Short-Term Memory combined with a Conditional Random Field) model, and may be adjusted appropriately according to the actual application scenario, the network training situation, the system configuration, the performance requirements and so on. Such variations are readily conceivable to a skilled person who knows the scheme of the present invention, also fall within the protection scope of the present invention, and are not described in detail here.
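Whatever sequence labeling model is chosen, extracting the key text information comes down to decoding the tagger's label sequence into entity spans. The sketch below shows that decoding step for BIO-style labels; the tag names (e.g. PLAINTIFF) are illustrative assumptions, and the tagger itself is not shown.

```python
# Decode BIO labels produced by a sequence labeling model (CRF, BiLSTM-CRF, ...)
# into key text spans. Tag names such as "PLAINTIFF" are illustrative.
def decode_bio_spans(tokens, labels):
    """tokens: list of characters/words; labels: BIO tags, e.g. B-PLAINTIFF, I-PLAINTIFF, O."""
    spans, current_type, current_tokens = [], None, []
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            if current_type:
                spans.append((current_type, "".join(current_tokens)))
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_type == label[2:]:
            current_tokens.append(token)
        else:
            if current_type:
                spans.append((current_type, "".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_type:
        spans.append((current_type, "".join(current_tokens)))
    return spans   # e.g. [("PLAINTIFF", "C1 International Trade Co., Ltd."), ...]
```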
Of course, before the named entity recognition model corresponding to complaints is applied, it needs to be trained in advance so that its output indicates the key text information present in the input text information. According to one embodiment of the invention, the entity training data set comprises a plurality of pieces of entity training data; each piece of entity training data comprises a first training text and a second training text, the second training text being formed by entity-annotating the key text information in the first training text. Specifically, when training the named entity recognition model, for each piece of entity training data in the entity training data set, the first training text is first taken as input and fed into the named entity recognition model to obtain the annotated text output by the model, in which the key text information corresponding to the first training text is marked; the network parameters of the named entity recognition model are then adjusted based on this annotated text and the second training text corresponding to the first training text in that piece of entity training data.
In this embodiment, the network parameters of the named entity recognition model may be adjusted using a back-propagation algorithm. After model training has been carried out on a large amount of entity training data in the entity training data set, the trained named entity recognition model is obtained. The entity training data set used to train the named entity recognition model is composed of entity training data obtained by extracting, from legal document resources, a large number of complaint documents involving plaintiffs, defendants, causes of action, litigation requests, facts and reasons, and the like, and performing entity annotation on the extracted text information.
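A compressed training sketch in the spirit of the above is given below; for brevity a per-token cross-entropy loss stands in for the CRF layer mentioned in the text, and all vocabulary sizes, label sets and hyper-parameters are placeholders.

```python
# Minimal PyTorch sketch of training a BiLSTM tagger on entity-annotated
# training texts. A per-token cross-entropy loss stands in for the CRF layer;
# sizes and hyper-parameters are placeholders.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, num_labels, emb_dim=128, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, num_labels)

    def forward(self, token_ids):                  # (batch, seq_len)
        h, _ = self.lstm(self.emb(token_ids))
        return self.out(h)                         # (batch, seq_len, num_labels)

def train_step(model, optimizer, token_ids, gold_labels):
    """gold_labels: label ids derived from the entity-annotated second training text."""
    logits = model(token_ids)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), gold_labels.reshape(-1), ignore_index=-100)
    optimizer.zero_grad()
    loss.backward()                                # back-propagation, as described above
    optimizer.step()
    return loss.item()
```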
Based on this, after text information recognition is performed on each image in image set P1, the obtained key text information is as follows:
Plaintiff: C1 International Trade Co., Ltd.
Defendant 1: Yangzhou C3 Co., Ltd.
Defendant 2: Hangzhou C4 Advertising Co., Ltd.
Cause of action: infringement of the exclusive right to use a registered trademark
Litigation request:
1. Order the defendants to immediately cease the acts that infringe the plaintiff's exclusive right to use its registered trademark, and to publish a statement of apology on the home page of the C4 website (domain name www.C4.com);
2. Order the defendants to compensate the plaintiff for its losses and for the reasonable expenses paid to stop the infringement, totalling RMB 50,000;
3. Order the deletion of the infringing links under the C4 website.
Facts and reasons:
C5 Limited Liability Company is the owner of the "ZZZ" trademark and its copyright, and the plaintiff is the exclusive licensee of that registered trademark in the China region. Upon investigation, the plaintiff found that the two defendants had committed acts infringing the plaintiff's exclusive right to use the registered trademark, and therefore brought this lawsuit. The reasons are as follows:
1. The plaintiff enjoys the exclusive right to use the prior registered trademark
The plaintiff is the exclusive licensee of trademark No. 1928349797 in Class 20 and enjoys the exclusive right to use the registered trademark; the goods approved for use include cushions, pillows and the like.
(subsequent content omitted for reasons of space)
Further, after the key text information is determined, the position information of each piece of key text information needs to be acquired. According to one embodiment of the invention, the position information of a piece of key text information can be determined according to the position information of each piece of text information in the text set to which the key text information belongs. The position information of the key text information includes coordinate information and an identification of the image from which the key text information is derived. The coordinate information included in the position information of a piece of key text information is a coordinate set formed by the coordinates (the 4 vertices of a rectangular region) of each character or each line of the key text information.
In this embodiment, the identification of an image may be represented by its serial number in the image set; for example, the identifications of the images in image set P1 are, in order, 1 to 15. From this, it can be determined that the identifications of the images from which the plaintiff, defendant 1, defendant 2, the cause of action and the litigation request are derived are all 1, with coordinate information denoted L1, L2, L3, L4 and L5 respectively, while the facts and reasons are derived from the images identified 1 to 3, with coordinate information denoted L6.
Subsequently, step S320 is performed to associate the key text information of each image with the corresponding position information and generate the corresponding key data. According to an embodiment of the present invention, for each image in image set P1, such as the image shown in FIG. 4, the key text information of the image is associated with the corresponding position information: the plaintiff is associated with image identification 1 and coordinate information L1, defendant 1 with image identification 1 and coordinate information L2, defendant 2 with image identification 1 and coordinate information L3, the cause of action with image identification 1 and coordinate information L4, the litigation request with image identification 1 and coordinate information L5, and the facts and reasons with image identifications 1 to 3 and coordinate information L6.
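The resulting key data can be pictured as a simple mapping from each piece of key text information to its text, source image identifications and coordinate set. The field names below are assumptions; the patent only requires the association itself, not a particular schema.

```python
# Illustrative shape of the key data for the first image of image set P1.
# "L1".."L6" stand for the coordinate sets named in the text above.
key_data_p1_image1 = {
    "plaintiff":         {"text": "C1 International Trade Co., Ltd.",  "image_ids": [1], "coords": "L1"},
    "defendant_1":       {"text": "Yangzhou C3 Co., Ltd.",             "image_ids": [1], "coords": "L2"},
    "defendant_2":       {"text": "Hangzhou C4 Advertising Co., Ltd.", "image_ids": [1], "coords": "L3"},
    "cause_of_action":   {"text": "Infringement of the exclusive right to use a registered trademark",
                          "image_ids": [1], "coords": "L4"},
    "litigation_request":{"text": "...",                               "image_ids": [1], "coords": "L5"},
    "facts_and_reasons": {"text": "...",                               "image_ids": [1, 2, 3], "coords": "L6"},
}
```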
On this basis, a set identifier of the image set is generated, and the set identifier is stored in association with the image set and the key data corresponding to each image in the image set. In this embodiment, the set identifier of image set P1 is generated as ID1, and the set identifier ID1 is associated with image set P1 and the key data corresponding to each image in image set P1 and stored in the storage unit of the server 120, or in a database server connected to the server 120.
Thereafter, the server 120 may receive a request from the client and, in response to the request, perform the lookup and delivery of the key data. According to one embodiment of the present invention, in response to a request from a client residing in the terminal device 110 connected to the server 120, where the request includes a set identifier to be searched, the image set associated with the set identifier to be searched is looked up according to the request, and the found image set and the key data corresponding to each image in the image set are sent to the client so that the client can display them in its display interface.
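A minimal server-side sketch of this storage and lookup flow is given below; Flask and the in-memory dictionary are stand-ins for the server 120 and its database server, and the route and field names are assumptions.

```python
# Store an image set under a generated set identifier and look it up on request.
# Flask and the in-memory STORE are stand-ins; a real deployment would persist
# to the database server mentioned in the text.
import uuid
from flask import Flask, jsonify, request

app = Flask(__name__)
STORE = {}   # set_id -> {"images": [...image references...], "key_data": [...]}

def save_image_set(image_refs, key_data):
    set_id = uuid.uuid4().hex                  # generated set identifier
    STORE[set_id] = {"images": image_refs, "key_data": key_data}
    return set_id

@app.route("/image-set", methods=["GET"])
def lookup_image_set():
    set_id = request.args.get("set_id")        # set identifier to be searched
    record = STORE.get(set_id)
    if record is None:
        return jsonify({"error": "not found"}), 404
    # the client renders the images in the first area and the key data in the second area
    return jsonify({"set_id": set_id, **record})
```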
Fig. 5 shows a flow chart of an information processing method 500 according to an embodiment of the invention. As shown in fig. 5, the method 500 begins at step S510. In step S510, the current position of the cursor in the display interface is detected, where the display interface includes a first area and a second area, and the second area includes one or more preset areas. The first area displays an image in the image set according to the method 300, and the second area displays keyword information corresponding to the image.
Considering that the contents displayed in the first area and the second area need to be acquired in advance, according to one embodiment of the present invention, before step S510, in response to an operation of the user, the set identifier to be searched is acquired according to the operation, a request is formed from the set identifier, and the request is sent to the server 120 so as to acquire the image set associated with the set identifier to be searched and the key data corresponding to each image in the image set. In this embodiment, the user logs in to the client in the terminal device 110 and selects the document number of a legal document in the display interface of the client, for example by a click operation. In response to the user's operation, the set identifier to be searched is obtained as ID1 according to the operation, a request is formed from the set identifier ID1, and the request is sent to the server 120 to obtain the image set associated with the set identifier ID1 and the key data corresponding to each image in the image set.
In response to the request, the server 120 finds the image set P1 associated with the set identifier ID1 according to the request, and sends the image set P1 and the key data corresponding to each image in the image set P1 to the client. According to one embodiment of the present invention, the image set P1 issued by the server 120 and the key data corresponding to each image in the image set P1 are received, where the key data includes the key text information contained in an image and the position information of that key text information. The position information of the key text information includes coordinate information and an identification of the image from which the key text information is derived.
In this embodiment, in image set P1, the key data corresponding to the first image (whose image identification is 1) contains the key text information of the plaintiff, defendant 1, defendant 2, the cause of action, the litigation request, and the facts and reasons. The position information of the key text information "plaintiff" includes the coordinate information L1 and the image identification 1; that of "defendant 1" includes the coordinate information L2 and the image identification 1; that of "defendant 2" includes the coordinate information L3 and the image identification 1; that of "cause of action" includes the coordinate information L4 and the image identification 1; that of "litigation request" includes the coordinate information L5 and the image identification 1; and that of "facts and reasons" includes the coordinate information L6 and the image identifications 1 to 3. In the key data corresponding to the second image (image identification 2) and the third image (image identification 3), the key text information only includes the facts and reasons, and the position information of the key text information "facts and reasons" is the same as above.
After the image set and the key data issued by the server 120 are received, each image in the image set is displayed in the first area in response, and the key text information contained in the image is displayed in the second area. Preferably, the key text information contained in the image may be correspondingly displayed in the preset areas of the second area. FIG. 6A shows a schematic diagram of information display in a first area and a second area according to an embodiment of the invention. In the display interface shown in FIG. 6A, the left half is the first area and the right half is the second area; the first area currently displays the first image in image set P1, the second area is laid out with 6 preset areas in the style of text boxes, and the key text information of the first area is displayed in the corresponding preset areas. In the second area, the 6 preset areas from top to bottom display, in turn, the contents of the 6 pieces of key text information, namely the plaintiff (for ease of explanation the number of plaintiffs is 1, so a single plaintiff is shown in FIG. 6A), defendant 1, defendant 2, the cause of action, the litigation request, and the facts and reasons. It should be noted that, in view of display effect and resolution, FIG. 6A (as well as FIGS. 6B and 6C) in the drawings of the specification is shown rotated 90 degrees clockwise from its original orientation, and the above description of FIG. 6A (and FIGS. 6B and 6C) is based on the orientation in the original figure.
When step S510 is executed, it is detected that the current position of the cursor in the display interface is within the preset area corresponding to the key text information "litigation request". Further, step S520 is performed: if the current position is within the preset area, the display content corresponding to that preset area is marked in the first area. According to an embodiment of the present invention, the display content corresponding to the preset area may be marked in the first area as follows. First, the position information of the key text information corresponding to the preset area is acquired; then the display content corresponding to the preset area in the first area is determined according to the position information, and that display content is marked. When marking the display content, it is preferably marked by superimposing a canvas element in the first area.
In this embodiment, the key text information corresponding to the preset area where the cursor is currently located is the litigation request, and its position information includes the coordinate information L5 and the image identification 1. According to this position information, the display content corresponding to the preset area is marked in the first area by superimposing a canvas element. FIG. 6B shows a schematic diagram of information display in the first area and the second area after marking according to an embodiment of the invention. As shown in FIG. 6B, the text content of the litigation request portion in the first area has been highlighted for further editing or checking by the user. Of course, the marking is not limited to highlighting; it may also be done by drawing a triangle, a circle or the like, which is not limited by the present invention.
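In the patent the mark is drawn client-side by superimposing a canvas element over the first area; purely to keep the examples in one language, the sketch below uses OpenCV as a stand-in and shows the same geometry, i.e. how the stored coordinate set becomes a translucent highlight over the displayed image.

```python
# Stand-in illustration of the highlight mark: shade the rectangular region
# given by the stored coordinate set directly on the image. In the browser the
# same rectangle would be filled on an overlaid canvas element instead.
import cv2
import numpy as np

def highlight_region(image_bgr, vertices, alpha=0.35):
    """vertices: the 4 vertices of the rectangular region stored as coordinate information."""
    overlay = image_bgr.copy()
    pts = np.array(vertices, dtype=np.int32)
    cv2.fillPoly(overlay, [pts], color=(0, 255, 255))          # yellow fill
    return cv2.addWeighted(overlay, alpha, image_bgr, 1 - alpha, 0)
```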
In addition, considering that key text information such as the "facts and reasons" may contain a large amount of content or span a long passage, so that its position spans multiple images, according to another embodiment of the present invention, for the preset area where the cursor is located, if the number of identifications of the images from which the key text information in that preset area is derived is greater than 1, a corresponding switch icon is generated in the first area. In this embodiment, when the cursor is located in the preset area corresponding to the key text information "facts and reasons", the number of identifications of the images from which this key text information is derived is 3, which is greater than 1, and a corresponding switch icon is generated in the first area so that the user can perform a page-turning operation on the images displayed in the first area.
FIG. 6C shows a schematic diagram of information display in a first area and a second area according to a further embodiment of the invention. As shown in FIG. 6C, the text content of the facts-and-reasons portion in the first area has been highlighted, and a switch icon in the form of an arrow appears on the right side of the first area. By clicking the switch icon the user can turn to the next page, switching the image displayed in the first area from the first image to the second image in image set P1. Of course, a switch icon for turning to the previous page may also be generated on the left side of the first area so that the user can switch from the current image back to the previous image.
Fig. 7 shows a schematic diagram of an information processing apparatus 700 according to an embodiment of the present invention. As shown in fig. 7, the apparatus 700 includes an identification module 710 and a generation module 720.
The recognition module 710 is adapted to perform text information recognition on each image in the image set to obtain one or more pieces of key text information included in the image, and location information of the key text information.
According to one embodiment of the present invention, the recognition module 710 is further adapted to perform optical character recognition on each image in the image set to obtain the text information contained in the image and the position information of the text information; to perform title extraction on each piece of acquired text information to determine one or more titles in the text corresponding to the image set; to classify each piece of text information according to the titles to obtain a text set corresponding to a title and the category of the text set, wherein the text set includes one or more pieces of text information corresponding to the title; and to input the text set into the named entity recognition model corresponding to its category so as to extract one or more pieces of key text information contained in the text set.
In this embodiment, the recognition module 710 is further adapted to determine the location information of the keyword information according to the location information of each text information in the text set to which the keyword information belongs. The position information of the key word information comprises coordinate information and an identification of an image from which the key word information is derived.
According to an embodiment of the present invention, the identification module 710 is further adapted to acquire the image set in advance, and is further adapted to perform conversion processing on a document in the first format to generate a plurality of images corresponding to the document, to determine that an image is a blank image when its edge intensity is less than a preset intensity threshold, and to divide the plurality of images corresponding to the document based on the blank images to form one or more image sets. The document includes a legal document.
The generating module 720 is adapted to correlate the keyword information of the image with the corresponding position information, and generate corresponding keyword data.
According to one embodiment of the present invention, the generating module 720 is further adapted to generate a set identifier of the image set, and store the set identifier in association with the image set and key data corresponding to each image in the image set.
In this embodiment, the generating module 720 is further adapted to respond to a request of the client, where the request includes a set identifier to be searched, search an image set associated with the set identifier to be searched according to the request, and send the searched image set and key data corresponding to each image in the image set to the client, so that the client displays the key data in the display interface.
Specific steps and embodiments of the information processing are disclosed in detail in the descriptions based on fig. 3 to 4, and are not repeated here.
Fig. 8 shows a schematic diagram of an information processing apparatus 800 according to an embodiment of the present invention. As shown in fig. 8, the apparatus 800 includes a detection module 810 and a tagging module 820.
The detection module 810 is adapted to detect a current position of a cursor in a display interface, the display interface comprising a first area and a second area, the second area comprising one or more preset areas. The first area displays an image in the image set as described in the information processing apparatus 700, and the second area displays the keyword information corresponding to the image.
According to one embodiment of the present invention, the detection module 810 is further adapted to respond to the operation of the user, obtain the set identifier to be searched according to the operation, and send a request to the server 120 according to the set identifier forming request, so as to obtain the image set associated with the set identifier to be searched, and key data corresponding to each image in the image set.
In this embodiment, the detection module 810 is further adapted to receive the image set sent by the server 120 and the key data corresponding to each image in the image set, where the key data includes the key text information and the position information of the key text information included in the image, display the image in the first area, and display the key text information included in the image in the second area. The detection module 810 is further adapted to correspondingly display the keyword information contained in the image in a preset area in the second area.
The marking module 820 is adapted to mark the display content corresponding to the preset area in the first area when the current position is within the preset area.
According to one embodiment of the present invention, the marking module 820 is further adapted to obtain location information of the keyword information corresponding to the preset area, determine display content corresponding to the preset area in the first area according to the location information, and mark the display content.
In this embodiment, the tagging module 820 is further adapted to tag the display content by superimposing canvas elements in the first region.
According to an embodiment of the present invention, the location information of the keyword information includes coordinate information and an identifier of an image from which the keyword information is derived, and the marking module 820 is further adapted to generate, for a preset area where the cursor is located, a corresponding switching icon in the first area when the number of identifiers of the images from which the keyword information is derived in the preset area is greater than 1.
Specific steps and embodiments of the information processing are disclosed in detail in the descriptions based on fig. 5 to 6C, and are not repeated here.
At present, document recognition and related processing schemes generally only realize the conversion of paper documents in graphic form into text format by means of OCR technology; once they need to be applied in more specialized fields, a series of problems remain, such as insufficient intelligence, low recognition accuracy, tedious manual operation and poor auxiliary means. According to the information processing scheme provided by the embodiments of the present invention, text in image format is recognized by optical character recognition technology so as to acquire one or more pieces of key text information contained in the text and the position information of that key text information, and the key text information is associated with the corresponding position information to generate structured key data, which is more intelligent and automatic. In the text recognition process, each page of the document converted into image format is recognized based on a specific recognition algorithm corresponding to the document type, which improves recognition accuracy.
Further, based on the generated structured key data, a function for online editing and checking by the user is provided: according to the position information of each piece of key text information, a highlight region for the information associated with the user's current operation in the second area is drawn in the first area over the corresponding image of the displayed document, so as to assist editing and checking. The highlight region can be drawn using canvas elements, which is more convenient and better suited to this scenario than DOM elements, third-party JS class libraries and the like.
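A minimal sketch of drawing the highlight region with a canvas element overlaid on the image in the first area follows; the overlay positioning, coordinate scaling and highlight colour are assumptions and would need to match the actual image rendering. A typical flow would call drawHighlight from the cursor-detection handler sketched earlier, passing the position information of the key text information shown in the preset area under the cursor.

```typescript
// Lay a canvas over the image in the first area and draw a translucent
// rectangle over the region given by the position information.
function drawHighlight(
  firstArea: HTMLElement,
  img: HTMLImageElement,
  pos: KeyTextPosition,
): void {
  let overlay = firstArea.querySelector<HTMLCanvasElement>('canvas.highlight-layer');
  if (!overlay) {
    overlay = document.createElement('canvas');
    overlay.className = 'highlight-layer';
    overlay.style.position = 'absolute';   // assumes firstArea is a positioned container
    overlay.style.left = `${img.offsetLeft}px`;
    overlay.style.top = `${img.offsetTop}px`;
    overlay.style.pointerEvents = 'none';
    firstArea.appendChild(overlay);
  }
  overlay.width = img.clientWidth;
  overlay.height = img.clientHeight;

  // Scale coordinates from the original image size to the displayed size.
  const sx = img.clientWidth / img.naturalWidth;
  const sy = img.clientHeight / img.naturalHeight;

  const ctx = overlay.getContext('2d');
  if (!ctx) return;
  ctx.clearRect(0, 0, overlay.width, overlay.height);
  ctx.fillStyle = 'rgba(255, 230, 0, 0.35)';   // highlight colour is an assumption
  ctx.fillRect(pos.x * sx, pos.y * sy, pos.width * sx, pos.height * sy);
}
```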
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or groups of devices in the examples disclosed herein may be arranged in a device as described in this embodiment, or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may be further divided into a plurality of sub-modules.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or groups of embodiments may be combined into one module or unit or group, and furthermore they may be divided into a plurality of sub-modules or sub-units or groups. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments but not other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
Furthermore, some of the embodiments are described herein as methods or combinations of method elements that can be implemented by a processor of a computer system or by other means of carrying out the functions. Thus, a processor with the necessary instructions for implementing such a method or method element forms a means for implementing the method or method element. Furthermore, an element described herein of an apparatus embodiment is an example of a means for carrying out the function performed by the element for the purpose of carrying out the invention.
The various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions of the methods and apparatus of the present invention, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The memory is configured to store the program code, and the processor is configured to execute the information processing method of the present invention in accordance with instructions in the program code stored in the memory.
By way of example, and not limitation, computer readable media comprise computer storage media and communication media. Computer storage media store information such as computer readable instructions, data structures, program modules, or other data. Communication media typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and include any information delivery media. Combinations of any of the above are also included within the scope of computer readable media.
As used herein, unless otherwise specified, the use of the ordinal terms "first," "second," "third," etc., to describe a common object merely indicates that different instances of like objects are being referred to, and is not intended to imply that the objects so described must be in a given order, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having the benefit of the above description, will appreciate that other embodiments are contemplated within the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is defined by the appended claims.

Claims (18)

1. An information processing method, comprising:
identifying text information of each image in an image set to acquire one or more pieces of key text information contained in the image and position information of the key text information;
and associating the key text information of the image with the corresponding position information to generate corresponding key data, wherein the text information identification of each image in the image set comprises the following steps:
performing optical character recognition on each image in the image set to acquire text information contained in the image and position information of the text information;
performing title extraction on each piece of acquired text information to determine one or more titles in the text corresponding to the image set;
classifying each piece of text information according to the titles to obtain a text set corresponding to a title and a category of the text set, wherein the text set comprises one or more pieces of text information corresponding to the title;
and inputting the text set into a named entity recognition model corresponding to the category of the text set to extract one or more pieces of key text information contained in the text set.
2. The method of claim 1, wherein said identifying text information for each image in the set of images comprises:
and determining the position information of the key text information according to the position information of each piece of text information in the text set to which the key text information belongs.
3. The method of claim 1, further comprising pre-acquiring a set of images, the pre-acquiring a set of images comprising:
converting a document in a first format to generate a plurality of images corresponding to the document;
if the edge intensity of an image is smaller than a preset intensity threshold, determining that the image is a blank image;
and dividing a plurality of images corresponding to the document based on the blank images to form one or more image sets.
4. The method of claim 3, wherein the document comprises a legal document.
5. The method of claim 1, wherein the position information of the key text information includes coordinate information and an identification of the image from which the key text information is derived.
6. The method of claim 1, further comprising:
generating a set identifier of the image set;
and storing the set identifier and the image set and key data corresponding to each image in the image set in an associated manner.
7. The method of claim 6, further comprising:
responding to a request of a client, wherein the request comprises a set identifier to be searched;
searching an image set associated with the set identifier to be searched according to the request;
and sending the searched image set and key data corresponding to each image in the image set to the client so that the client can display in a display interface.
8. An information processing method, comprising:
detecting the current position of a cursor in a display interface, wherein the display interface comprises a first area and a second area, and the second area comprises one or more preset areas;
if the current position is in the preset area, marking the display content corresponding to the preset area in the first area;
wherein the first area displays the images in the image set according to any one of claims 1-7, and the second area displays the key text information corresponding to the images.
9. The method of claim 8, wherein prior to detecting the current position of the cursor in the display interface, further comprising:
responding to the operation of a user, and acquiring a set identifier to be searched according to the operation;
and forming a request according to the set identifier and sending the request to a server, so as to acquire an image set associated with the set identifier to be searched and key data corresponding to each image in the image set.
10. The method of claim 9, further comprising:
receiving an image set issued by the server and key data corresponding to each image in the image set, wherein the key data comprises key text information contained in the image and position information of the key text information;
and displaying the image in the first area, and displaying the key text information contained in the image in the second area.
11. The method of claim 10, wherein displaying the key text information contained in the image in the second area comprises:
and correspondingly displaying the key text information contained in the image in a preset area in the second area.
12. The method of claim 11, wherein the marking the display content in the first region corresponding to the preset region comprises:
acquiring position information of the key text information corresponding to the preset area;
and determining display contents corresponding to the preset area in the first area according to the position information, and marking the display contents.
13. The method of claim 12, wherein the marking the display content comprises:
and marking the display content by overlaying canvas elements in the first area.
14. The method of claim 12, wherein the position information of the key text information includes coordinate information and an identification of the image from which the key text information is derived, the method further comprising:
for the preset area where the cursor is located, if the number of identifications of images from which the key text information in the preset area is derived is greater than 1, generating a corresponding switching icon in the first area.
15. An information processing apparatus comprising:
the recognition module is suitable for carrying out optical character recognition on each image in the image set to obtain text information contained in the image and position information of the text information, performing title extraction on each piece of obtained text information to determine one or more titles in the text corresponding to the image set, classifying each piece of text information according to the titles to obtain a text set corresponding to a title and a category of the text set, wherein the text set comprises one or more pieces of text information corresponding to the title, and inputting the text set into a named entity recognition model corresponding to the category of the text set to extract one or more pieces of key text information contained in the text set;
and the generation module is suitable for associating the key text information of the image with the corresponding position information to generate corresponding key data.
16. An information processing apparatus comprising:
the detection module is suitable for detecting the current position of a cursor in a display interface, wherein the display interface comprises a first area and a second area, and the second area comprises one or more preset areas;
the marking module is suitable for marking the display content corresponding to the preset area in the first area when the current position is in the preset area;
wherein the first area displays the images in the image set according to claim 15, and the second area displays the key text information corresponding to the images.
17. A computing device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 1-14.
18. A computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the methods of claims 1-14.
CN201811513713.2A 2018-12-11 2018-12-11 Information processing method, device, computing equipment and medium Active CN111310750B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811513713.2A CN111310750B (en) 2018-12-11 2018-12-11 Information processing method, device, computing equipment and medium

Publications (2)

Publication Number Publication Date
CN111310750A CN111310750A (en) 2020-06-19
CN111310750B true CN111310750B (en) 2023-04-25

Family

ID=71144692

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811513713.2A Active CN111310750B (en) 2018-12-11 2018-12-11 Information processing method, device, computing equipment and medium

Country Status (1)

Country Link
CN (1) CN111310750B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111723816B (en) * 2020-06-28 2023-10-27 北京联想软件有限公司 Acquisition method of teaching notes and electronic equipment
CN111858476A (en) * 2020-07-20 2020-10-30 上海闻泰电子科技有限公司 File processing method and device, electronic equipment and computer readable storage medium
CN112200185A (en) * 2020-10-10 2021-01-08 航天科工智慧产业发展有限公司 Method and device for reversely positioning picture by characters and computer storage medium
CN112989786B (en) * 2021-01-18 2023-08-18 平安国际智慧城市科技股份有限公司 Document analysis method, system, device and storage medium based on image recognition

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI536798B (en) * 2014-08-11 2016-06-01 虹光精密工業股份有限公司 Image filing method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6516097B1 (en) * 1999-07-16 2003-02-04 Lockheed Martin Corporation Image segmentation system
CN107977665A (en) * 2017-12-15 2018-05-01 北京科摩仕捷科技有限公司 The recognition methods of key message and computing device in a kind of invoice
CN108845993A (en) * 2018-06-06 2018-11-20 中国科学技术信息研究所 Interpretation method, device and the terminal device of text information
CN108874992A (en) * 2018-06-12 2018-11-23 深圳华讯网络科技有限公司 The analysis of public opinion method, system, computer equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Sami-Ur-Rehman et al. A Multi-faceted OCR Framework for Artificial Urdu News Ticker Text Recognition. 2018 13th IAPR International Workshop on Document Analysis Systems. 2018, full text. *
刘稳; 王锦; 李锐; 游景扬; 陈建峡. Design and Implementation of a Key Information Extraction System for Court Judgment Documents. Journal of Hubei University of Technology. 2018, (01), full text. *
陈云榕; 刘立柱; 丁志鸿. Research on Extraction and Organization of Key Information in PDF Files. Computer Engineering and Design. 2007, (07), full text. *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40031971; Country of ref document: HK)
GR01 Patent grant