CN117409430A

CN117409430A - Medical bill information extraction method, device, equipment and storage medium thereof

Info

Publication number: CN117409430A
Application number: CN202311303352.XA
Authority: CN
Inventors: 殷悦迪
Original assignee: Ping An Health Insurance Company of China Ltd
Current assignee: Ping An Health Insurance Company of China Ltd
Priority date: 2023-10-09
Filing date: 2023-10-09
Publication date: 2024-01-16

Abstract

The embodiments of the application belong to the technical field of digital healthcare, apply to scenarios involving structured extraction of medical bill information, and relate to a medical bill information extraction method, device, equipment and storage medium. The method comprises: obtaining a bill image of a target medical bill; identifying text blocks in the target medical bill; constructing expected text boxes and actual text boxes; determining the expected text box corresponding to each actual text box; obtaining a structural arrangement result; inputting the bill image and the text blocks into a pre-trained text block classification model; and performing structured extraction on the medical bill information according to the classification relation among the text blocks, the structural arrangement result and a preset priority strategy. Row and column information for the text blocks is obtained in two ways: by analyzing the identified text blocks on their own, and by model prediction that combines the bill image with the text blocks. The row and column information of the data in the medical bill is thereby identified more accurately, which facilitates structured extraction of the medical bill information.

Description

Medical bill information extraction method, device, equipment and storage medium thereof

Technical Field

The application relates to the technical field of digital medical treatment, and is applied to a medical bill information structured extraction scene, in particular to a medical bill information extraction method, a device, equipment and a storage medium thereof.

Background

With the development of the computer industry and artificial intelligence and the arrival of the big data era, the traditional medical model is gradually shifting to a digital one. To build out the digital medical service system, both online platforms and offline hospital service providers are trying to standardize medical services. In the printing of medical receipts, for example, they try to keep the printed content aligned as far as possible, so that patients can check their receipts easily and online platforms can compile statistics and extract information automatically.

At present, most approaches use IoU (intersection over union) detection to find text whose position has shifted during printing and then correct it; alternatively, the YOLO algorithm is used for detection, after which the shifted text is corrected so that the online platform can compile statistics and extract information. In actual business, however, if one printed value is shifted, all printed values are shifted by the same amount. A simpler and more convenient method for structured extraction of medical bill information that fits this actual business scenario is therefore still lacking.

Disclosure of Invention

The embodiments of the application aim to provide a medical bill information extraction method, device, equipment and storage medium, so as to solve the problem that the prior art lacks a simpler and more convenient structured extraction method for medical bill information that fits actual business scenarios.

In order to solve the above technical problems, the embodiment of the present application provides a medical bill information extraction method, which adopts the following technical scheme:

a medical bill information extraction method comprises the following steps:

step 201, acquiring a bill image corresponding to a target medical bill;

step 202, recognizing text blocks in a target medical bill by adopting a text recognition technology;

step 203, constructing expected text boxes and actual text boxes according to the text blocks and a preset frame construction model;

step 204, adopting a contrast detection algorithm to detect the coincidence relation between each actual text box and all expected text boxes, and determining the expected text boxes corresponding to all the actual text boxes according to the coincidence relation;

step 205, carrying out structural arrangement on the text blocks through expected text boxes corresponding to all the actual text boxes respectively to obtain a structural arrangement result;

step 206, inputting the bill image and the text blocks into a pre-trained text block classification model, and predicting the classification relation among the text blocks according to the model output result, wherein the text block classification model comprises a LayoutLMv3-based text block classification model;

and step 207, performing structured extraction on the medical bill information according to the classification relation among the text blocks, the structural arrangement result and a preset priority strategy.

Further, the text blocks include a first type text block and a second type text block, the first type text block includes text blocks corresponding to all template fields, the second type text block includes text blocks corresponding to all non-template fields, and the step of identifying text blocks in the target medical bill by using a text identification technology specifically includes:

and identifying the first type text blocks and the second type text blocks contained in the target medical bill by adopting an OCR text identification technology.

Further, the step of constructing the expected text boxes and the actual text boxes according to the text blocks and the preset frame construction model specifically includes:

according to a preset template field table, identifying the positions of text blocks corresponding to all template fields in the target medical bill as first-class positions;

Identifying the positions of text blocks corresponding to all non-template fields in the target medical bill as second-class positions;

constructing, according to the first type positions and the frame construction model, template text boxes corresponding to the first type text blocks;

constructing, according to the template text boxes corresponding to the first type text blocks and the frame construction model, expected text boxes corresponding to the second type text blocks, and setting row and column information for all the expected text boxes;

and constructing, according to the second type positions and the frame construction model, actual text boxes corresponding to the second type text blocks.

Further, the step of detecting the coincidence relation between each actual text box and all the expected text boxes by adopting a contrast detection algorithm, and determining the expected text boxes corresponding to all the actual text boxes according to the coincidence relation specifically includes:

step 401, arbitrarily selecting a point in the target medical bill as a coordinate origin;

step 402, obtaining vertex coordinates of all actual text boxes and vertex coordinates of all expected text boxes based on the origin of coordinates;

step 403, calculating Y-axis interval sections and X-axis interval sections respectively corresponding to all the actual text boxes according to the vertex coordinates of all the actual text boxes;

Step 404, calculating Y-axis interval sections and X-axis interval sections respectively corresponding to all the expected text boxes according to the vertex coordinates of all the expected text boxes;

step 405, taking each actual text box as a current detection box in turn;

step 406, comparing the Y-axis section and the X-axis section of the current detection frame with the Y-axis section and the X-axis section corresponding to all the expected text frames one by one, identifying the expected text frames which have the overlapping relation of the Y-axis section and the X-axis section with the current detection frame as target text frames, and counting the number of the target text frames corresponding to the current detection frame;

step 407, repeatedly executing steps 405 to 406, and counting the number of target text boxes corresponding to all the actual text boxes respectively;

step 408, if the number of target text boxes corresponding to any actual text box is 0, shifting the Y-axis interval sections of all the actual text boxes as a whole, or shifting the X-axis interval sections of all the actual text boxes as a whole, obtaining the adjusted Y-axis interval sections and X-axis interval sections of all the actual text boxes, and repeating steps 405 to 407 until the number of target text boxes corresponding to every actual text box is not 0, at which point the repetition stops and the expected text box corresponding to each actual text box is identified.

Further, the step of performing structural arrangement on the text blocks through the expected text boxes respectively corresponding to all the actual text boxes to obtain a structural arrangement result specifically includes:

according to the row and column information respectively corresponding to all the expected text boxes, identifying row and column information respectively corresponding to all the actual text boxes;

the row and column information corresponding to all the actual text boxes respectively and the row and column information corresponding to all the template text boxes respectively are used as text block identifiers to be assigned to the corresponding text blocks;

and sorting the same-row text blocks and the same-column text blocks by the text block identifiers to obtain the structural arrangement result.

Further, before executing the step of inputting the ticket image and the text block into the text block classification model after the pre-training, predicting the classification relation between the text blocks according to the model output result, the method further comprises:

obtaining, in batches, medical bills of the same kind as the target medical bill, and constructing a medical bill sample set;

taking each medical bill in the medical bill sample set in turn as the target medical bill, and executing steps 201 to 202 to obtain the bill image and text blocks corresponding to each medical bill in the sample set;

taking the bill image and text blocks corresponding to the same medical bill as a training combination sample, and obtaining the training combination samples corresponding to all the medical bills in the sample set;

inputting the training combination samples corresponding to all the medical bills in the sample set into a LayoutLMv3-based text block classification model to be trained, and performing model pre-training to obtain the pre-trained text block classification model, wherein the pre-trained text block classification model can identify, from the bill image, the row and column information of all the text blocks in the corresponding training combination sample and output that row and column information as the model output result;

the step of inputting the bill image and the text block into a pre-trained text block classification model and predicting the classification relation between the text blocks according to a model output result specifically comprises the following steps:

inputting the bill image and the text block into a text block classification model which is pre-trained and completed by taking the bill image and the text block as prediction combination samples;

obtaining an output result of the text block classification model after the pre-training, and obtaining row and column information of the text block by analyzing the output result;

And dividing the text blocks in the same row and the same column according to the row and column information of the text blocks, and taking the text blocks in the same row and the same column as classification relations among the text blocks.

Further, the step of performing structured extraction on the medical bill information according to the classification relation among the text blocks, the structural arrangement result and the preset priority strategy specifically includes:

obtaining the same-row text blocks and the same-column text blocks in the target medical bill from the classification relation among the text blocks, as first knowledge;

obtaining the same-row text blocks and the same-column text blocks in the target medical bill from the structural arrangement result, as second knowledge;

comparing the first knowledge with the second knowledge, and identifying from the comparison result whether the two are consistent;

if the first knowledge is consistent with the second knowledge, using the row and column information of the text blocks corresponding to either the first knowledge or the second knowledge as the row and column information of the text blocks in the target medical bill;

if the first knowledge is inconsistent with the second knowledge, using, according to the priority strategy, the row and column information of the text blocks corresponding to the second knowledge as the row and column information of the text blocks in the target medical bill;

And carrying out structural extraction on the medical bill information according to the row and column information of the text block in the target medical bill.
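The comparison and priority steps above can be sketched as follows. This is only an illustration under stated assumptions, not the applicant's implementation: the function name `reconcile_row_col`, the dictionary layout (a text block identifier mapped to a `(row, column)` pair), the per-block comparison, and the sample identifiers are all hypothetical.

```python
from typing import Dict, List, Tuple

RowCol = Tuple[int, int]  # (row index, column index); hypothetical representation


def reconcile_row_col(
    first_knowledge: Dict[str, RowCol],
    second_knowledge: Dict[str, RowCol],
) -> Tuple[Dict[str, RowCol], List[str]]:
    """Merge model-predicted (first) and box-matching (second) row/column info.

    Following the priority strategy described in the text, the second
    knowledge is used whenever the two sources disagree. When they agree,
    either value is the same, so the second one is used as well.
    Assumption: a block missing from the first knowledge counts as a
    disagreement, and blocks present only in the first knowledge are ignored.
    """
    merged: Dict[str, RowCol] = {}
    conflicts: List[str] = []
    for block_id, second_value in second_knowledge.items():
        if first_knowledge.get(block_id) != second_value:
            conflicts.append(block_id)  # record where the sources disagree
        merged[block_id] = second_value
    return merged, conflicts


# Made-up example: the two sources disagree on the row of block "b".
model_prediction = {"a": (0, 1), "b": (1, 1)}
box_matching = {"a": (0, 1), "b": (2, 1)}
merged, conflicts = reconcile_row_col(model_prediction, box_matching)
print(merged, conflicts)
```

Returning the list of conflicting blocks is a design choice of this sketch; it makes it possible to inspect how often the model prediction and the box-matching result diverge.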

In order to solve the above technical problems, the embodiment of the present application further provides a medical bill information extraction device, which adopts the following technical scheme:

a medical ticket information extraction apparatus comprising:

the bill image acquisition module is used for acquiring a bill image corresponding to the target medical bill;

the text block identification module is used for identifying text blocks in the target medical bill by adopting a text identification technology;

the text box construction module is used for constructing expected text boxes and actual text boxes according to the text blocks and a preset frame construction model;

the coincidence relation detection module is used for detecting the coincidence relation between each actual text box and all expected text boxes by adopting a comparison detection algorithm, and determining the expected text boxes corresponding to all the actual text boxes according to the coincidence relation;

the structural finishing module is used for carrying out structural finishing on the text blocks through expected text boxes corresponding to all the actual text boxes respectively to obtain a structural finishing result;

the model classification prediction module is used for inputting the bill image and the text blocks into a pre-trained text block classification model and predicting the classification relation among the text blocks according to the model output result, wherein the text block classification model comprises a LayoutLMv3-based text block classification model;

and the structured extraction module is used for performing structured extraction on the medical bill information according to the classification relation among the text blocks, the structural arrangement result and the preset priority strategy.

In order to solve the above technical problems, the embodiments of the present application further provide a computer device, which adopts the following technical schemes:

a computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which when executed by the processor implement the steps of the medical ticket information extraction method described above.

In order to solve the above technical problems, embodiments of the present application further provide a computer readable storage medium, which adopts the following technical solutions:

a computer readable storage medium having stored thereon computer readable instructions which when executed by a processor perform the steps of a medical ticket information extraction method as described above.

Compared with the prior art, the embodiment of the application has the following main beneficial effects:

According to the medical bill information extraction method of the embodiments, a bill image corresponding to the target medical bill is obtained; text blocks in the target medical bill are identified; expected text boxes and actual text boxes are constructed; the coincidence relation between each actual text box and all the expected text boxes is detected, and the expected text box corresponding to each actual text box is determined; a structural arrangement result is obtained through the expected text boxes corresponding to the actual text boxes; the bill image and the text blocks are input into a pre-trained text block classification model, and the classification relation among the text blocks is predicted from the model output result; and the medical bill information is structurally extracted according to the classification relation among the text blocks, the structural arrangement result and a preset priority strategy. Row and column information for the text blocks is obtained in two ways: by analyzing the identified text blocks on their own, and by model prediction that combines the bill image with the text blocks. The row and column information of the data in the medical bill is thereby identified more accurately, which facilitates structured extraction of the medical bill information.

Drawings

For a clearer description of the solution in the present application, a brief description will be given below of the drawings that are needed in the description of the embodiments of the present application, it being obvious that the drawings in the following description are some embodiments of the present application, and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.

FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;

FIG. 2 is a flow chart of one embodiment of a medical ticket information extraction method according to the present application;

FIG. 3 is a flow chart of one embodiment of step 203 shown in FIG. 2;

FIG. 4 is a flow chart of one embodiment of step 204 shown in FIG. 2;

FIG. 5 is a flow chart of one embodiment of step 205 of FIG. 2;

FIG. 6 is a flow chart of one embodiment of step 207 shown in FIG. 2;

FIG. 7 is a schematic structural view of one embodiment of a medical ticket information extraction apparatus according to the present application;

FIG. 8 is a schematic structural diagram of one embodiment of a computer device according to the present application.

Detailed Description

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description and claims of the present application and in the description of the figures above are intended to cover non-exclusive inclusions. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.

In order to better understand the technical solutions of the present application, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings.

As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a web browser application, a shopping class application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.

The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.

The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.

It should be noted that, the medical bill information extraction method provided in the embodiments of the present application is generally executed by a server/terminal device, and accordingly, the medical bill information extraction device is generally disposed in the server/terminal device.

It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

With continued reference to fig. 2, a flow chart of one embodiment of a medical ticket information extraction method according to the present application is shown. The medical bill information extraction method comprises the following steps:

Step 201, acquiring a bill image corresponding to a target medical bill.

In this embodiment, the bill image corresponding to the target medical bill may be obtained by using a bill scanning or bill photographing mode.

Step 202, recognizing text blocks in the target medical bill by using a text recognition technology.

In this embodiment, the text blocks include first type text blocks and second type text blocks, where the first type text blocks are the text blocks corresponding to all template fields and the second type text blocks are the text blocks corresponding to all non-template fields. In a medical fee receipt, for example, the first type text blocks are the text blocks of the fee items that are already on the receipt before any specific fee is printed, and the second type text blocks are the text blocks of the fee values printed for those items. In general, the first type text blocks correspond to data blocks preset in the medical fee receipt, and the second type text blocks are the specific fee values, one per fee item, printed at medical fee settlement.

In this embodiment, the step of identifying the text block in the target medical ticket by using the text identification technology specifically includes: and identifying the first type text blocks and the second type text blocks contained in the target medical bill by adopting an OCR text identification technology.

Step 203, constructing expected text boxes and actual text boxes according to the text blocks and a preset frame construction model.

With continued reference to fig. 3, fig. 3 is a flow chart of one embodiment of step 203 shown in fig. 2, comprising:

step 301, identifying the positions of text blocks corresponding to all template fields in the target medical bill according to a preset template field table, and taking the positions as first-class positions;

in this embodiment, the preset template field table contains the names of all the fee items in the target medical bill, and the template fields are those fee item names.

In this embodiment, the positions of the text blocks corresponding to all the template fields are identified by comparing the recognized text blocks against the preset template field table, and the identified positions are then marked, using either region marking or center point marking.

Step 302, identifying the positions of text blocks corresponding to all non-template fields in the target medical bill as second-class positions;

Accordingly, the non-template fields refer to respective billing values in the target medical ticket. Similarly, the locations of text blocks corresponding to all non-template fields in the target medical ticket are identified, after which the second type of location may also be marked using either area marking or center point marking.

Step 303, constructing, according to the first type positions and the frame construction model, template text boxes corresponding to the first type text blocks;

in this embodiment, the frame construction model includes a rectangular labeling model, an edge detection model, and the like, and a rectangular frame corresponding to the text block at the first type position can be constructed by using the frame construction model.

Step 304, constructing, according to the template text boxes corresponding to the first type text blocks and the frame construction model, expected text boxes corresponding to the second type text blocks, and setting row and column information for all the expected text boxes;

in this embodiment, the frame construction model includes a rectangular labeling model, an edge detection model, and the like, and can be used to construct the expected rectangular box for the text block at each second type position. Typically, each expected text box should be aligned in a grid (matrix) with the corresponding template text boxes. In other words, an expected text box is the rectangular box at the position where a second type text block would sit in the target medical bill in the ideal state. Row and column information is set for all the expected text boxes to facilitate the subsequent structural arrangement.

Step 305, constructing, according to the second type positions and the frame construction model, actual text boxes corresponding to the second type text blocks.

In this embodiment, the actual text box corresponding to the second type text block is a rectangular text box generated according to the actual position corresponding to the second type text block in the target medical ticket.

Step 204, detecting the coincidence relation between each actual text box and all the expected text boxes by adopting a comparison detection algorithm, and determining the expected text box corresponding to each actual text box according to the coincidence relation.

With continued reference to fig. 4, fig. 4 is a flow chart of one embodiment of step 204 shown in fig. 2, comprising:

Step 405, taking each actual text box as a current detection box in turn;

In this embodiment, the contrast detection algorithm includes: acquiring the Y-axis interval section and X-axis interval section of the current detection box, and the Y-axis interval sections and X-axis interval sections respectively corresponding to all the expected text boxes; then selecting any expected text box and comparing its Y-axis interval section and X-axis interval section with those of the current detection box for line segment intersection. For example, if the Y-axis interval of the current detection box is [-1,6] and the Y-axis interval of the expected text box is [-5,3], the two share the intersecting segment [-1,3] on the Y axis, that is, they have a coincidence relation on the Y axis; a coincidence relation on the X axis is identified in the same way. A target text box is an expected text box that has a coincidence relation with the current detection box on both the Y axis and the X axis.
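The interval comparison described above can be sketched in a few lines. The names `Box`, `box_from_vertices`, `interval_intersection` and `is_target` are hypothetical, and treating intervals as closed (so that boxes touching at an endpoint count as coinciding) is an assumption; the text does not say how that edge case is handled.

```python
from typing import List, NamedTuple, Optional, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


class Box(NamedTuple):
    x: Interval  # X-axis interval section
    y: Interval  # Y-axis interval section


def box_from_vertices(vertices: List[Tuple[float, float]]) -> Box:
    """Derive the X and Y interval sections from a box's vertex coordinates."""
    xs = [vx for vx, _ in vertices]
    ys = [vy for _, vy in vertices]
    return Box(x=(min(xs), max(xs)), y=(min(ys), max(ys)))


def interval_intersection(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the intersecting segment of two closed intervals, or None."""
    lo = max(a[0], b[0])
    hi = min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def is_target(current: Box, expected: Box) -> bool:
    """An expected box is a target box if it coincides on both axes."""
    return (
        interval_intersection(current.x, expected.x) is not None
        and interval_intersection(current.y, expected.y) is not None
    )


# The Y-axis example from the text: [-1,6] and [-5,3] intersect on [-1,3].
print(interval_intersection((-1, 6), (-5, 3)))  # (-1, 3)
```

Checking the two axes independently is equivalent to testing whether two axis-aligned rectangles overlap, which is why no area computation is needed here.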

In this embodiment, adjusting the Y-axis interval sections of all the actual text boxes as a whole means shifting them all up or down together. For example, suppose there are 3 actual text boxes whose Y-axis intervals are [-1,4], [3,6] and [4,7], and the coordinate unit is 1. Shifting them up as a whole by 2 coordinate units gives the Y-axis intervals [1,6], [5,8] and [6,9], with the X-axis intervals unchanged. Correspondingly, adjusting the X-axis interval sections of all the actual text boxes as a whole means shifting them all left or right together: the same value is added to or subtracted from the endpoints of every X-axis interval, and the Y-axis intervals are unchanged.
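The whole-sheet adjustment and the retry loop of steps 405 to 408 can be sketched as below. The text does not specify how the shift amount or direction is chosen, so trying a caller-supplied list of candidate Y offsets is an assumption of this sketch, as are the names `shift_all`, `count_targets` and `find_y_shift` and the sample coordinates in the retry example.

```python
from typing import List, Optional, Sequence, Tuple

Interval = Tuple[float, float]
Box = Tuple[Interval, Interval]  # (X interval, Y interval)


def shift_all(intervals: Sequence[Interval], delta: float) -> List[Interval]:
    """Shift every interval by the same amount (the whole-sheet adjustment)."""
    return [(lo + delta, hi + delta) for lo, hi in intervals]


def overlaps(a: Interval, b: Interval) -> bool:
    """Closed-interval overlap test (touching endpoints count; an assumption)."""
    return max(a[0], b[0]) <= min(a[1], b[1])


def count_targets(actual: Box, expected: Sequence[Box]) -> int:
    """Count expected boxes that coincide with the actual box on both axes."""
    return sum(
        1 for e in expected
        if overlaps(actual[0], e[0]) and overlaps(actual[1], e[1])
    )


def find_y_shift(
    actual: Sequence[Box],
    expected: Sequence[Box],
    candidates: Sequence[float],
) -> Optional[float]:
    """Return the first candidate Y offset for which every actual box has
    at least one target box, or None if no candidate works."""
    for dy in candidates:
        shifted = [(x, (y[0] + dy, y[1] + dy)) for x, y in actual]
        if all(count_targets(box, expected) > 0 for box in shifted):
            return dy
    return None


# The example from the text: shifting up by 2 coordinate units.
print(shift_all([(-1, 4), (3, 6), (4, 7)], 2))  # [(1, 6), (5, 8), (6, 9)]

# Made-up boxes: both printed values sit too high by the same amount.
expected_boxes = [((0, 10), (0, 2)), ((0, 10), (4, 6))]
actual_boxes = [((0, 10), (2.5, 3.5)), ((0, 10), (6.5, 7.5))]
print(find_y_shift(actual_boxes, expected_boxes, [0, -1, 1]))  # -1
```

Because all printed values share the same offset, one shift applied to every actual box is enough; the same loop could be run over X offsets for a horizontal misprint.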

Step 205, performing structural arrangement on the text blocks through the expected text boxes corresponding to all the actual text boxes respectively, to obtain a structural arrangement result.

With continued reference to fig. 5, fig. 5 is a flow chart of one embodiment of step 205 shown in fig. 2, comprising:

step 501, identifying row and column information corresponding to all actual text boxes according to row and column information corresponding to all expected text boxes respectively;

step 502, the row and column information corresponding to all the actual text boxes respectively and the row and column information corresponding to all the template text boxes respectively are used as text block identifiers to be assigned to the corresponding text blocks;

step 503, sorting the same-row text blocks and the same-column text blocks by the text block identifiers to obtain the structural arrangement result.
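Steps 501 to 503 amount to tagging each text block with row and column identifiers and then grouping by them. A minimal sketch, assuming each block has already been reduced to a `(text, row, column)` tuple; the function name and the sample fee items and values are made up for illustration:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_same_row_and_column(
    blocks: Iterable[Tuple[str, int, int]],
) -> Tuple[Dict[int, List[str]], Dict[int, List[str]]]:
    """Group text blocks by their row identifier and by their column identifier."""
    rows: Dict[int, List[str]] = defaultdict(list)
    cols: Dict[int, List[str]] = defaultdict(list)
    for text, row, col in blocks:
        rows[row].append(text)
        cols[col].append(text)
    return dict(rows), dict(cols)


# Hypothetical receipt: column 0 holds template fields, column 1 printed values.
sample = [
    ("Registration fee", 0, 0), ("12.00", 0, 1),
    ("Medicine fee", 1, 0), ("85.50", 1, 1),
]
same_row, same_col = group_same_row_and_column(sample)
print(same_row[0])  # ['Registration fee', '12.00']
print(same_col[1])  # ['12.00', '85.50']
```

Each same-row group pairs a fee item with its printed value, which is the pairing the later structured extraction relies on.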

Step 206, inputting the bill image and the text blocks into a pre-trained text block classification model, and predicting the classification relation among the text blocks according to the model output result, wherein the text block classification model comprises a LayoutLMv3-based text block classification model.

In this embodiment, before the step of inputting the bill image and the text blocks into the pre-trained text block classification model and predicting the classification relation among the text blocks according to the model output result, the method further includes: obtaining, in batches, medical bills of the same kind as the target medical bill, and constructing a medical bill sample set; taking each medical bill in the sample set in turn as the target medical bill, and executing steps 201 to 202 to obtain the bill image and text blocks corresponding to each medical bill in the sample set; taking the bill image and text blocks corresponding to the same medical bill as a training combination sample, and obtaining the training combination samples corresponding to all the medical bills in the sample set; inputting the training combination samples corresponding to all the medical bills in the sample set into a LayoutLMv3-based text block classification model to be trained, and performing model pre-training to obtain the pre-trained text block classification model, wherein the pre-trained text block classification model can identify, from the bill image, the row and column information of all the text blocks in the corresponding training combination sample and output that row and column information as the model output result.

The LayoutLMv3 model directly uses image patches of the document image and is suitable for both text-centric and image-centric document AI tasks. By inputting the bill image obtained in advance together with the text blocks in the bill image, the model predicts the row and column classification relation among all text blocks.

In this embodiment, the step of inputting the bill image and the text blocks into the pre-trained text block classification model and predicting the classification relation between the text blocks according to the model output result specifically includes: taking the bill image and the text blocks as a prediction combination sample and inputting it into the pre-trained text block classification model; obtaining the output result of the pre-trained text block classification model, and obtaining the row and column information of the text blocks by parsing the output result; and dividing the text blocks into same-row text blocks and same-column text blocks according to their row and column information, and taking the same-row text blocks and the same-column text blocks as the classification relation among the text blocks.
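The last part of this step, turning per-block row and column information into a classification relation, can be sketched as below. The sketch assumes the model output has already been parsed into a mapping from a block identifier to a `(row, col)` pair; that representation, the block identifiers, and the function name `classification_relations` are assumptions made for illustration, not the format defined by the application.

```python
from itertools import combinations


def classification_relations(predictions):
    """Derive same-row and same-column block pairs from (row, col) predictions.

    predictions: dict mapping block id -> (row, col), an assumed parsed form
    of the model output.
    """
    same_row, same_col = set(), set()
    for a, b in combinations(sorted(predictions), 2):
        if predictions[a][0] == predictions[b][0]:
            same_row.add((a, b))
        if predictions[a][1] == predictions[b][1]:
            same_col.add((a, b))
    return same_row, same_col


# Illustrative predictions for a 2 x 2 layout.
predictions = {"b1": (0, 0), "b2": (0, 1), "b3": (1, 0), "b4": (1, 1)}
same_row, same_col = classification_relations(predictions)
```

Expressing the relation as pairs makes it independent of the concrete row and column numbers, which is convenient when it is later compared with a result obtained by another route.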

With continued reference to fig. 6, fig. 6 is a flow chart of one embodiment of step 207 of fig. 2, comprising:

Step 601, obtaining a same-row text block and a same-column text block in the target medical bill as first knowledge through the classification relation among the text blocks;

Step 602, obtaining the same-row text blocks and the same-column text blocks in the target medical bill as second knowledge according to the structured finishing result;

step 603, comparing the first knowledge with the second knowledge, and identifying whether the first knowledge and the second knowledge are consistent according to a comparison result;

Step 604, if the first knowledge is consistent with the second knowledge, using the row and column information of the text blocks corresponding to the first knowledge or the row and column information of the text blocks corresponding to the second knowledge as the row and column information of the text blocks in the target medical bill;

Step 605, if the first knowledge and the second knowledge are inconsistent, using, according to the priority policy, the row and column information of the text blocks corresponding to the first knowledge as the row and column information of the text blocks in the target medical bill;

Step 606, carrying out structured extraction on the medical bill information according to the row and column information of the text blocks in the target medical bill.

In this embodiment, the priority policy specifies that, in terms of confidence, the classification relation between the text blocks ranks higher than the structured finishing result. That is, the accuracy of the classification relation among the text blocks obtained through the text block classification model is preset to be higher than that of the structured finishing result obtained in step 205.
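The comparison of the two kinds of knowledge under a priority policy can be sketched as follows. The sketch assumes both results are available as mappings from a block identifier to a `(row, col)` pair and treats them as consistent when they produce the same row groups and the same column groups, regardless of numbering. The function names and the `prefer_first` parameter are hypothetical; its default follows the policy stated above, under which the model-derived classification relation carries higher confidence.

```python
def groups(assignments, axis):
    """Return the set of block groups sharing a row (axis 0) or column (axis 1)."""
    by_index = {}
    for block_id, position in assignments.items():
        by_index.setdefault(position[axis], set()).add(block_id)
    return {frozenset(members) for members in by_index.values()}


def is_consistent(first, second):
    """True when both results group the blocks into the same rows and columns."""
    return all(groups(first, axis) == groups(second, axis) for axis in (0, 1))


def reconcile(first, second, prefer_first=True):
    """Pick the row and column information to use for structured extraction.

    first:  result derived from the model's classification relation
    second: result derived from the structured finishing result
    """
    if is_consistent(first, second):
        return dict(first)  # either result may be used when they agree
    return dict(first if prefer_first else second)  # apply the priority policy


# Illustrative data: same grouping with different numbering, then a conflict.
model_result = {"a": (0, 0), "b": (0, 1), "c": (1, 0)}
template_result = {"a": (5, 2), "b": (5, 3), "c": (6, 2)}
conflicting_result = {"a": (0, 0), "b": (1, 1), "c": (1, 0)}

agreed = reconcile(model_result, template_result)
resolved = reconcile(model_result, conflicting_result)
```

Passing the preference as a parameter keeps the policy a preset configuration rather than something hard-coded into the comparison.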

The text blocks in the target medical bill are identified, and their row and column information is obtained in two ways: by analyzing the text blocks on their own, and by model prediction that also uses the bill image. As a result, the row and column information of the relevant data in the medical bill is identified more accurately, which facilitates structured extraction of the medical bill information.

In this application, the bill image corresponding to the target medical bill is acquired; the text blocks in the target medical bill are identified; expected text boxes and actual text boxes are constructed; the coincidence relation between each actual text box and all expected text boxes is detected, and the expected text box corresponding to each actual text box is determined; a structured finishing result is obtained through the expected text boxes corresponding to the actual text boxes; the bill image and the text blocks are input into a pre-trained text block classification model, and the classification relation among the text blocks is predicted according to the model output result; and the medical bill information is structurally extracted according to the classification relation among the text blocks, the structured finishing result and a preset priority policy. Because row and column information is obtained both by analyzing the text blocks on their own and by model prediction that also uses the bill image, the row and column information of the relevant data in the medical bill is identified more accurately, which facilitates structured extraction of the medical bill information.

The embodiments of the application can acquire and process the relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain optimal results.

Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operating/interaction systems, mechatronics and the like. Artificial intelligence software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.

In the method and the device of this application, the text blocks in the target medical bill are identified, their row and column information is obtained by analyzing the text blocks on their own, and their row and column information is also predicted by a model in combination with the bill image, so that the row and column information of the relevant data in the medical bill is identified more accurately and structured extraction of the medical bill information is facilitated.

With further reference to fig. 7, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a medical bill information extraction device. The device embodiment corresponds to the method embodiment shown in fig. 2, and the device may be applied to various electronic devices.

As shown in fig. 7, the medical bill information extraction device 700 according to the present embodiment includes: a bill image acquisition module 701, a text block identification module 702, a text box construction module 703, a coincidence relation detection module 704, a structured finishing module 705, a model classification prediction module 706 and a structured extraction module 707. Wherein:

a bill image acquisition module 701, configured to acquire a bill image corresponding to a target medical bill;

a text block identification module 702, configured to identify text blocks in the target medical bill using a text recognition technique;

a text box construction module 703, configured to construct expected text boxes and actual text boxes according to the text blocks and a preset frame construction model;

a coincidence relation detection module 704, configured to detect the coincidence relation between each actual text box and all the expected text boxes by using a comparison detection algorithm, and to determine the expected text box corresponding to each actual text box according to the coincidence relation;

a structured finishing module 705, configured to structurally organize the text blocks through the expected text boxes corresponding to the actual text boxes, so as to obtain a structured finishing result;

a model classification prediction module 706, configured to input the bill image and the text blocks into a pre-trained text block classification model, and to predict the classification relation between the text blocks according to the model output result, where the text block classification model includes a text block classification model based on LayoutLMv3;

and a structured extraction module 707, configured to perform structured extraction on the medical bill information according to the classification relation between the text blocks, the structured finishing result and a preset priority policy.
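The way the seven modules hand data to one another can be sketched as a simple pipeline. This is an architectural illustration only: the class name `MedicalBillExtractor`, the attribute names, and the callable signatures are assumptions, and the lambdas at the end are placeholders rather than real recognition or model code.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class MedicalBillExtractor:
    acquire_image: Callable[[Any], Any]     # module 701
    recognize_blocks: Callable[[Any], Any]  # module 702
    build_boxes: Callable[[Any], Any]       # module 703, returns (expected, actual)
    match_boxes: Callable[[Any, Any], Any]  # module 704
    organize: Callable[[Any, Any], Any]     # module 705
    classify: Callable[[Any, Any], Any]     # module 706
    extract: Callable[[Any, Any], Any]      # module 707, applies the priority policy

    def run(self, bill: Any) -> Any:
        image = self.acquire_image(bill)
        blocks = self.recognize_blocks(image)
        expected, actual = self.build_boxes(blocks)
        matches = self.match_boxes(actual, expected)
        organized = self.organize(blocks, matches)  # structured finishing result
        relations = self.classify(image, blocks)    # classification relation
        return self.extract(relations, organized)


# Placeholder callables so that the sketch runs; they do no real recognition.
extractor = MedicalBillExtractor(
    acquire_image=lambda bill: f"image:{bill}",
    recognize_blocks=lambda image: ["Item", "35.00"],
    build_boxes=lambda blocks: (["e0", "e1"], ["a0", "a1"]),
    match_boxes=lambda actual, expected: dict(zip(actual, expected)),
    organize=lambda blocks, matches: {"rows": [blocks]},
    classify=lambda image, blocks: {"rows": [blocks]},
    extract=lambda relations, organized: relations,
)
result = extractor.run("bill-001")
```

Injecting each module as a callable mirrors the one-module-per-step division of the device and lets any single stage be replaced without touching the others.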

Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by computer readable instructions that direct the relevant hardware. The computer readable instructions may be stored on a computer readable storage medium and, when executed, may include the steps of the method embodiments described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk or a read-only memory (ROM), or may be a random access memory (RAM).

It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.

In order to solve the above technical problems, the embodiments of the application also provide a computer device. Referring specifically to fig. 8, fig. 8 is a basic structural block diagram of the computer device according to the present embodiment.

The computer device 8 comprises a memory 8a, a processor 8b and a network interface 8c, which are communicatively connected to each other via a system bus. It should be noted that the figure shows only a computer device 8 having components 8a-8c; it should be understood that not all of the illustrated components are required, and that more or fewer components may be implemented instead. Those skilled in the art will appreciate that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and that its hardware includes, but is not limited to, microprocessors, application specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices and the like.

The computer device may be a computing device such as a desktop computer, a notebook computer, a palmtop computer or a cloud server. The computer device can interact with a user through a keyboard, a mouse, a remote control, a touch pad, a voice control device or the like.

The memory 8a includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk and the like. In some embodiments, the memory 8a may be an internal storage unit of the computer device 8, such as a hard disk or internal memory of the computer device 8. In other embodiments, the memory 8a may be an external storage device of the computer device 8, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card provided on the computer device 8. Of course, the memory 8a may also comprise both an internal storage unit of the computer device 8 and an external storage device. In this embodiment, the memory 8a is generally used to store the operating system and the various application software installed on the computer device 8, such as the computer readable instructions of the medical bill information extraction method. The memory 8a may also be used to temporarily store various types of data that have been output or are to be output.

In some embodiments, the processor 8b may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor or another data processing chip. The processor 8b is typically used to control the overall operation of the computer device 8. In this embodiment, the processor 8b is configured to execute the computer readable instructions stored in the memory 8a or to process data, for example to execute the computer readable instructions of the medical bill information extraction method.

The network interface 8c may comprise a wireless network interface or a wired network interface, which network interface 8c is typically used to establish a communication connection between the computer device 8 and other electronic devices.

The computer device provided by this embodiment belongs to the technical field of digital medical treatment and is applied to a scenario of structured extraction of medical bill information. In this application, the bill image corresponding to the target medical bill is acquired; the text blocks in the target medical bill are identified; expected text boxes and actual text boxes are constructed; the coincidence relation between each actual text box and all expected text boxes is detected, and the expected text box corresponding to each actual text box is determined; a structured finishing result is obtained through the expected text boxes corresponding to the actual text boxes; the bill image and the text blocks are input into a pre-trained text block classification model, and the classification relation among the text blocks is predicted according to the model output result; and the medical bill information is structurally extracted according to the classification relation among the text blocks, the structured finishing result and a preset priority policy. Because row and column information is obtained both by analyzing the text blocks on their own and by model prediction that also uses the bill image, the row and column information of the relevant data in the medical bill is identified more accurately, which facilitates structured extraction of the medical bill information.

The present application also provides another embodiment, namely a computer readable storage medium storing computer readable instructions that are executable by a processor to cause the processor to perform the steps of the medical bill information extraction method described above.

The computer readable storage medium provided by this embodiment belongs to the technical field of digital medical treatment and is applied to a scenario of structured extraction of medical bill information. In this application, the bill image corresponding to the target medical bill is acquired; the text blocks in the target medical bill are identified; expected text boxes and actual text boxes are constructed; the coincidence relation between each actual text box and all expected text boxes is detected, and the expected text box corresponding to each actual text box is determined; a structured finishing result is obtained through the expected text boxes corresponding to the actual text boxes; the bill image and the text blocks are input into a pre-trained text block classification model, and the classification relation among the text blocks is predicted according to the model output result; and the medical bill information is structurally extracted according to the classification relation among the text blocks, the structured finishing result and a preset priority policy. Because row and column information is obtained both by analyzing the text blocks on their own and by model prediction that also uses the bill image, the row and column information of the relevant data in the medical bill is identified more accurately, which facilitates structured extraction of the medical bill information.

From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or by means of hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disk) and comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the methods described in the embodiments of the present application.

It is apparent that the embodiments described above are only some, not all, of the embodiments of the present application. The drawings show preferred embodiments of the present application but do not limit its patent scope. This application may be embodied in many different forms; these embodiments are provided so that the disclosure of the present application will be understood more thoroughly. Although the present application has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalents may be substituted for some of their technical features. Any equivalent structure made using the contents of the specification and drawings of this application, applied directly or indirectly in other related technical fields, likewise falls within the protection scope of this application.

Claims

1. The medical bill information extraction method is characterized by comprising the following steps:

step 201, acquiring a bill image corresponding to a target medical bill;

2. The method for extracting medical bill information according to claim 1, wherein the text blocks include a first type text block and a second type text block, the first type text block includes text blocks corresponding to all template fields, the second type text block includes text blocks corresponding to all non-template fields, and the step of identifying text blocks in the target medical bill by using a text identification technology specifically includes:

3. The medical bill information extraction method according to claim 2, wherein the step of constructing a desired text box and an actual text box according to the text block and a preset frame construction model specifically comprises:

4. The medical bill information extraction method according to claim 1 or 3, wherein the step of detecting the coincidence relation between each actual text box and all the expected text boxes respectively by using a contrast detection algorithm, and determining the expected text boxes corresponding to all the actual text boxes respectively according to the coincidence relation specifically comprises:

Step 405, taking each actual text box as a current detection box in turn;

5. The method for extracting medical bill information according to claim 3, wherein the step of structurally organizing the text blocks through the expected text boxes corresponding to all the actual text boxes respectively to obtain a structured finishing result specifically comprises:

6. The medical bill information extraction method of claim 1, wherein before the step of inputting the bill image and the text blocks into a pre-trained text block classification model and predicting the classification relation between the text blocks according to the model output result, the method further comprises:

7. The method for extracting medical bill information according to claim 1, wherein the step of structurally extracting the medical bill information according to the classification relation among the text blocks, the structured finishing result and a preset priority policy specifically comprises:

8. A medical bill information extraction device, characterized by comprising:

9. A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, implement the steps of the medical bill information extraction method of any of claims 1 to 7.

10. A computer readable storage medium having stored thereon computer readable instructions which, when executed by a processor, perform the steps of the medical bill information extraction method according to any of claims 1 to 7.