WO2021151270A1 - Image structured data extraction method, apparatus, device, and storage medium - Google Patents

Image structured data extraction method, apparatus, device, and storage medium

Info

Publication number: WO2021151270A1
Authority: WO (WIPO PCT)
Application number: PCT/CN2020/098946
Other languages: English (en), French (fr)
Inventors: 施伟斌, 刘鹏, 刘玉宇
Original assignee: 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Application filed by 平安科技(深圳)有限公司
Publication of WO2021151270A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a method, device, equipment, and storage medium for extracting image structured data.
  • The traditional Optical Character Recognition (OCR) model needs to train a separate model for each field.
  • This application provides a method, apparatus, device, and storage medium for extracting image structured data, the purpose of which is to solve the technical problem that the traditional OCR recognition model in the prior art cannot extract structured data from the recognition result.
  • the present application provides a method for extracting structured image data, which includes:
  • Receiving step: receive a request sent by a user to extract image structured data, and obtain, from the request, the original image from which structured data is to be extracted;
  • Recognition step: input the original image into a pre-trained position detection model to obtain the position coordinate information of multiple regions to be recognized in the original image, cut out the multiple regions to be recognized based on the position coordinate information, perform image transformation processing on the cut regions to be recognized, input the regions to be recognized both before and after the transformation processing into a pre-trained recognition model to obtain the initial recognition results corresponding to each region to be recognized, and select the target recognition result corresponding to each region to be recognized from the initial recognition results based on a preset screening rule;
  • Extraction step: use a preset algorithm to calculate the similarity between the characters corresponding to the target recognition result of each region to be recognized and the characters of each category in a preset thesaurus, select the category character corresponding to the maximum similarity value as the category result of that region, fill each category result and target recognition result into a preset template file to generate a structured data file of the original image, and feed the structured data file back to the user.
  • this application also provides a device for extracting structured image data, the device comprising:
  • Receiving module: used to receive a request sent by a user to extract image structured data, and obtain, from the request, the original image from which structured data is to be extracted;
  • Recognition module: used to input the original image into a pre-trained position detection model to obtain the position coordinate information of multiple regions to be recognized in the original image, cut out the multiple regions to be recognized based on the position coordinate information, perform image transformation processing on the cut regions to be recognized, input the regions to be recognized both before and after the transformation processing into a pre-trained recognition model to obtain the initial recognition results corresponding to each region to be recognized, and select the target recognition result corresponding to each region to be recognized from the initial recognition results based on a preset screening rule; and
  • Extraction module: used to calculate, with a preset algorithm, the similarity between the characters corresponding to the target recognition result of each region to be recognized and the characters of each category in a preset thesaurus, select the category character corresponding to the maximum similarity value as the category result of that region, fill each category result and target recognition result into a preset template file to generate a structured data file of the original image, and feed the structured data file back to the user.
  • The present application also provides a computer device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor implements the following steps when executing the computer program:
  • Receiving step: receive a request sent by a user to extract image structured data, and obtain, from the request, the original image from which structured data is to be extracted;
  • Recognition step: input the original image into a pre-trained position detection model to obtain the position coordinate information of multiple regions to be recognized in the original image, cut out the multiple regions to be recognized based on the position coordinate information, perform image transformation processing on the cut regions to be recognized, input the regions to be recognized both before and after the transformation processing into a pre-trained recognition model to obtain the initial recognition results corresponding to each region to be recognized, and select the target recognition result corresponding to each region to be recognized from the initial recognition results based on a preset screening rule;
  • The present application also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following steps:
  • Receiving step: receive a request sent by a user to extract image structured data, and obtain, from the request, the original image from which structured data is to be extracted;
  • Recognition step: input the original image into a pre-trained position detection model to obtain the position coordinate information of multiple regions to be recognized in the original image, cut out the multiple regions to be recognized based on the position coordinate information, perform image transformation processing on the cut regions to be recognized, input the regions to be recognized both before and after the transformation processing into a pre-trained recognition model to obtain the initial recognition results corresponding to each region to be recognized, and select the target recognition result corresponding to each region to be recognized from the initial recognition results based on a preset screening rule;
  • Extraction step: use a preset algorithm to calculate the similarity between the characters corresponding to the target recognition result of each region to be recognized and the characters of each category in a preset thesaurus, select the category character corresponding to the maximum similarity value as the category result of that region, fill each category result and target recognition result into a preset template file to generate a structured data file of the original image, and feed the structured data file back to the user.
  • This application performs image transformation processing on the output of the detection model, expanding the data for the same region to be recognized; both the untransformed and transformed images to be recognized are then input into the recognition model, and the different recognition results are further screened and compared so that the optimal result is obtained as the output, which improves the accuracy of the recognition model's output.
  • The methods of regular matching and thesaurus lookup make up for the recognition model's inability to obtain structured data. Compared with the traditional OCR scheme, the training data required for model training is relatively small, which saves system memory.
  • Figure 1 is an application environment diagram of a preferred embodiment of the computer device of this application.
  • Figure 2 is a schematic diagram of the modules of an image structured data extraction device.
  • FIG. 3 is a schematic flowchart of a preferred embodiment of a method for extracting structured image data according to this application.
  • FIG. 1 is a schematic diagram of a preferred embodiment of the computer device 1 of this application.
  • the computer device 1 includes, but is not limited to: a memory 11, a processor 12, a display 13, and a network interface 14.
  • the computer device 1 is connected to the network through the network interface 14 to obtain original data.
  • The network may be an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, a telephone network, or another wireless or wired network.
  • the memory 11 includes at least one type of readable storage medium
  • The readable storage medium includes flash memory, hard disks, multimedia cards, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, etc.
  • the memory 11 may be an internal storage unit of the computer device 1, for example, a hard disk or a memory of the computer device 1.
  • the memory 11 may also be an external storage device of the computer device 1, such as a plug-in hard disk equipped with the computer device 1, a smart memory card (Smart Media Card, SMC), Secure Digital (SD) card, Flash Card, etc.
  • the memory 11 may also include both the internal storage unit of the computer device 1 and its external storage device.
  • the memory 11 is generally used to store the operating system and various application software installed in the computer device 1, such as the program code of the image structured data extraction program 10, and so on.
  • the memory 11 can also be used to temporarily store various types of data that have been output or will be output.
  • In some embodiments, the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip.
  • the processor 12 is generally used to control the overall operation of the computer device 1, such as performing data interaction or communication-related control and processing.
  • the processor 12 is configured to run the program code or process data stored in the memory 11, for example, run the program code of the image structured data extraction program 10, and so on.
  • the display 13 may be referred to as a display screen or a display unit.
  • The display 13 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, etc.
  • the display 13 is used for displaying the information processed in the computer device 1 and for displaying a visualized work interface, for example, displaying the results of data statistics.
  • the network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the network interface 14 is usually used to establish a communication connection between the computer device 1 and other electronic devices.
  • Figure 1 only shows a computer device 1 with components 11-14 and the image structured data extraction program 10, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.
  • the computer device 1 may further include a user interface.
  • the user interface may include a display (Display) and an input unit such as a keyboard (Keyboard).
  • the optional user interface may also include a standard wired interface and a wireless interface.
  • the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (Organic Light-Emitting Diode, OLED) touch device, etc.
  • the display can also be called a display screen or a display unit as appropriate, and is used to display the information processed in the computer device 1 and to display a visualized user interface.
  • The computer device 1 may also include radio frequency (RF) circuits, sensors, audio circuits, etc., which will not be repeated here.
  • the processor 12 may implement the following steps when executing the image structured data extraction program 10 stored in the memory 11:
  • Receiving step: receive a request sent by a user to extract image structured data, and obtain, from the request, the original image from which structured data is to be extracted;
  • Recognition step: input the original image into a pre-trained position detection model to obtain the position coordinate information of multiple regions to be recognized in the original image, cut out the multiple regions to be recognized based on the position coordinate information, perform image transformation processing on the cut regions to be recognized, input the regions to be recognized both before and after the transformation processing into a pre-trained recognition model to obtain the initial recognition results corresponding to each region to be recognized, and select the target recognition result corresponding to each region to be recognized from the initial recognition results based on a preset screening rule;
  • Extraction step: use a preset algorithm to calculate the similarity between the characters corresponding to the target recognition result of each region to be recognized and the characters of each category in a preset thesaurus, select the category character corresponding to the maximum similarity value as the category result of that region, fill each category result and target recognition result into a preset template file to generate a structured data file of the original image, and feed the structured data file back to the user.
  • the storage device may be the memory 11 of the computer device 1 or other storage devices that are communicatively connected with the computer device 1.
  • The image structured data extraction apparatus 100 described in this application can be installed in a computer device and divided into modules according to the functions it realizes.
  • A module described in this application may also be called a unit, referring to a series of computer program segments that can be executed by the processor of a computer device to complete fixed functions and that are stored in the memory of the computer device.
  • The image structured data extraction device 100 includes: a receiving module 110, a recognition module 120, and an extraction module 130.
  • the receiving module 110 is configured to receive a request for extracting structured data of an image sent by a user, and obtain the original image of the structured data to be extracted carried in the request.
  • The request is parsed to obtain the original image, from which structured data is to be extracted, carried in the request. The request may include the original image itself, or it may include the storage path of the original image from which structured data is to be extracted and the unique identifier of the original image.
  • the original image can be entered when the user submits the request, or it can be obtained from the address specified in the request after the user submits the image structured data extraction request.
  • The original image can be an ID card image, an invoice image, etc.
  • The recognition module 120 is configured to input the original image into a pre-trained position detection model to obtain the position coordinate information of multiple regions to be recognized in the original image, cut out the multiple regions to be recognized based on the position coordinate information, perform image transformation processing on the cut regions to be recognized, input the regions to be recognized both before and after the transformation processing into a pre-trained recognition model to obtain the initial recognition results corresponding to each region to be recognized, and select the target recognition result corresponding to each region to be recognized from the initial recognition results based on a preset screening rule.
  • the original image is input into a pre-trained position detection model to obtain the position coordinate information of multiple regions to be recognized in the original image.
  • The deep learning model of the position detection model can be trained with Faster-RCNN, SSD, YOLO, etc.
  • A preset labeling tool (for example, the LabelImg tool) can be used to label the regions to be recognized in the training images and generate annotation files in a preset format.
  • The preset-format annotation file may be in Extensible Markup Language (XML) format.
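As an illustration, an annotation file produced by a LabelImg-style tool typically follows the Pascal VOC XML layout; the file name, field name, and coordinates below are hypothetical, not taken from the application.

```xml
<!-- Hypothetical LabelImg-style (Pascal VOC) annotation for one region to be recognized -->
<annotation>
  <filename>id_card_001.jpg</filename>
  <size><width>856</width><height>540</height><depth>3</depth></size>
  <object>
    <name>name_field</name>
    <bndbox>
      <xmin>120</xmin><ymin>64</ymin><xmax>300</xmax><ymax>96</ymax>
    </bndbox>
  </object>
</annotation>
```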
  • Inputting the original image into the position detection model yields the coordinate information corresponding to each region to be recognized in the original image; according to the coordinate information, the corresponding regions to be recognized can be cut out of the original image, and image transformation processing is then performed on the cut regions.
  • Performing image transformation processing on the cut regions to be recognized includes: extracting the high-dimensional vector of each region to be recognized and matching each high-dimensional vector against a preset low-dimensional vector library; if a corresponding low-dimensional vector is matched, a paired sample is generated as the transformed feature vector of the region to be recognized; if no corresponding low-dimensional vector is matched, a preset low-dimensional vector in the low-dimensional vector library is selected as the transformed feature vector of the region to be recognized.
  • Performing image transformation processing on the cut regions to be recognized further includes: performing up-sampling, brightness equalization, or random perspective transformation on the cut regions. If the original image or a region to be recognized is blurry, the region can be up-sampled and transformed to supplement its information and then cropped, making the image easier for the recognition model to process and yielding the corresponding recognition result. For the recognition model, illumination contrast also affects the recognition effect.
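The up-sampling and brightness transformations above can be sketched in a few lines. This is a minimal, library-free illustration in which a grayscale region is a list of pixel rows; a simple min-max contrast stretch stands in for brightness equalization, the random perspective transformation is omitted, and the function names are ours, not the application's.

```python
def upsample_2x(region):
    """Nearest-neighbor 2x up-sampling of a grayscale region (list of pixel rows)."""
    out = []
    for row in region:
        wide = [p for p in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                     # duplicate each row
    return out

def equalize_brightness(region):
    """Min-max contrast stretch: map pixel values onto the full 0-255 range."""
    flat = [p for row in region for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat region: nothing to stretch
        return [row[:] for row in region]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in region]

# A dark, low-contrast 2x2 "region": stretching spreads it over 0..255,
# and up-sampling doubles each dimension.
region = [[10, 20], [30, 40]]
up = upsample_2x(region)          # 4x4: each pixel becomes a 2x2 block
eq = equalize_brightness(region)  # [[0, 85], [170, 255]]
```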
  • The recognition model can be obtained by training a recurrent convolutional neural network model.
  • the area to be identified can also be divided into two levels, “important” and “secondary”.
  • For ID cards, in typical business logic the name, ID number, and address are more important than other fields, so the accuracy requirements for them are also higher.
  • Dedicated detection and recognition models can be customized for these three fields, while the other fields are not customized and directly use the general recognition model.
  • After the transformation processing, the number of outputs of the recognition model increases; specifically, multiple initial recognition results are output for the same region to be recognized.
  • Screening the target recognition result corresponding to each region to be recognized from the initial recognition results based on a preset screening rule includes: comparing the confidence of the initial recognition result before the transformation processing with a preset threshold; when that confidence is greater than or equal to the preset threshold, taking the initial recognition result before the transformation processing as the target recognition result; when that confidence is less than the preset threshold and the confidence of the initial recognition result after the transformation processing is greater than the confidence of the initial recognition result before the transformation processing, taking the initial recognition result after the transformation processing as the target recognition result.
  • When the confidence of the result before the transformation processing reaches the preset threshold, the recognition accuracy is considered high and that result can be selected directly as the target result; when it is below the preset threshold, it is compared with the recognition result after the transformation processing before the target recognition result is selected, which improves the accuracy of the selected recognition result.
  • The preset threshold may be, for example, 90%.
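A minimal sketch of this screening rule, assuming each initial result is a (text, confidence) pair and using the example 90% threshold; the function name and tuple layout are our own illustration, not the application's API.

```python
def select_target_result(before, after, threshold=0.90):
    """Pick the target recognition result from the pre- and post-transformation
    initial results, each given as a (text, confidence) pair.

    Rule: if the pre-transformation confidence reaches the preset threshold,
    use that result directly; otherwise fall back to the post-transformation
    result only when its confidence is strictly higher.
    """
    if before[1] >= threshold:
        return before
    if after[1] > before[1]:
        return after
    return before

# A high-confidence pre-transformation result is kept as-is.
assert select_target_result(("ZHANG San", 0.95), ("ZHANC San", 0.97)) == ("ZHANG San", 0.95)
# A low-confidence pre-transformation result yields to a better post-transformation one.
assert select_target_result(("ZHANC San", 0.60), ("ZHANG San", 0.82)) == ("ZHANG San", 0.82)
```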
  • The extraction module 130 is used to calculate, with a preset algorithm, the similarity between the characters corresponding to the target recognition result of each region to be recognized and the characters of each category in a preset thesaurus, select the category character corresponding to the maximum similarity value as the category result of that region, fill each category result and target recognition result into a preset template file to generate a structured data file of the original image, and feed the structured data file back to the user.
  • The recognition model only recognizes the content of each region to be recognized; it cannot judge the semantics or category of the target recognition result, that is, it cannot obtain the structured data of the original image. Obtaining structured data means obtaining the category attribute corresponding to the target recognition result of each region to be recognized.
  • The preset algorithm may be, for example, a cosine similarity algorithm.
  • The preset thesaurus contains keyword information for each category. For example, if the original image is an ID document, the preset thesaurus contains category information such as name, gender, date of birth, ID number, and address, together with the character information corresponding to each category. The category character with the largest similarity value is selected as the category result of the region to be recognized, each category result and target recognition result is filled into a preset template file to generate a structured data file of the original image, and the structured data file is fed back to the user.
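A cosine-similarity category match of the kind described can be sketched as follows; the character-count vectorization, the function names, and the tiny thesaurus are our own illustrative assumptions, not the application's actual lexicon.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between two strings, using character counts as vectors."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[ch] * vb[ch] for ch in va)
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def classify(recognized: str, thesaurus: dict) -> str:
    """Return the category whose keyword characters are most similar to the
    recognized text; the thesaurus maps category -> representative characters."""
    return max(thesaurus, key=lambda cat: cosine_similarity(recognized, thesaurus[cat]))

# Hypothetical two-category thesaurus for an ID-document image.
thesaurus = {
    "gender": "male female",
    "address": "province city district street",
}
# classify("male", thesaurus) -> "gender"
```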
  • the category result can also be verified.
  • For example, if the target recognition result is "Shanghai Sixth Hospital for People" and the names in the preset thesaurus are all standard and correct, then when the similarity value is greater than 99% but the recognition result differs from the name in the preset thesaurus, the thesaurus name can be substituted for the recognition result to correct the error, further improving the accuracy of the output result.
  • In other embodiments, a regular expression is constructed from the characters of the target recognition result of each region to be recognized; the regular expression is matched against the characters of each category in the preset thesaurus, and the matching result is taken as the category result of the region to be recognized.
  • For example, if the target recognition result of a certain region to be recognized is a single character and the constructed regular expression matches the character "male" or "female" in the preset thesaurus, the category result of that region is "gender"; if the regular expression constructed from the target recognition result of a certain region is of the form "*省*市*" ("*province*city*") and matches the "province" and "city" characters in the preset thesaurus, the category result of that region is "address".
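The regex-based matching in the example above might look like the following sketch; the specific patterns, category names, and fallback value are illustrative assumptions, not the application's actual rules.

```python
import re

# Hypothetical regex rules mirroring the examples in the text: a single
# "male"/"female" (男/女) character maps to gender; text containing both the
# "province" (省) and "city" (市) keywords maps to address.
CATEGORY_PATTERNS = {
    "gender": re.compile(r"^(male|female|男|女)$"),
    "address": re.compile(r".*(province|省).*(city|市).*"),
}

def category_of(text: str) -> str:
    """Return the first category whose pattern matches the recognized text."""
    for category, pattern in CATEGORY_PATTERNS.items():
        if pattern.match(text):
            return category
    return "unknown"

assert category_of("男") == "gender"
assert category_of("广东省深圳市南山区") == "address"
assert category_of("2020-06-29") == "unknown"
```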
  • this application also provides a method for extracting structured image data.
  • FIG. 3 is a schematic flowchart of an embodiment of the method for extracting image structured data of this application.
  • the processor 12 of the computer device 1 executes the image structured data extraction program 10 stored in the memory 11 to implement the following steps of the image structured data extraction method:
  • Step S10: receive a request sent by a user to extract image structured data, and obtain, from the request, the original image from which structured data is to be extracted.
  • The request is parsed to obtain the original image, from which structured data is to be extracted, carried in the request. The request may include the original image itself, or it may include the storage path of the original image from which structured data is to be extracted and the unique identifier of the original image.
  • the original image can be entered when the user submits the request, or it can be obtained from the address specified in the request after the user submits the image structured data extraction request.
  • The original image can be an ID card image, an invoice image, etc.
  • Step S20: input the original image into a pre-trained position detection model to obtain the position coordinate information of multiple regions to be recognized in the original image, cut out the multiple regions to be recognized based on the position coordinate information, perform image transformation processing on the cut regions to be recognized, input the regions to be recognized both before and after the transformation processing into a pre-trained recognition model to obtain the initial recognition results corresponding to each region to be recognized, and select the target recognition result corresponding to each region to be recognized from the initial recognition results based on a preset screening rule.
  • the original image is input into a pre-trained position detection model to obtain the position coordinate information of multiple regions to be recognized in the original image.
  • The deep learning model of the position detection model can be trained with Faster-RCNN, SSD, YOLO, etc.
  • A preset labeling tool (for example, the LabelImg tool) can be used to label the regions to be recognized in the training images and generate annotation files in a preset format.
  • The preset-format annotation file may be in Extensible Markup Language (XML) format.
  • Inputting the original image into the position detection model yields the coordinate information corresponding to each region to be recognized in the original image; according to the coordinate information, the corresponding regions to be recognized can be cut out of the original image, and image transformation processing is then performed on the cut regions.
  • Performing image transformation processing on the cut regions to be recognized includes: extracting the high-dimensional vector of each region to be recognized and matching each high-dimensional vector against a preset low-dimensional vector library; if a corresponding low-dimensional vector is matched, a paired sample is generated as the transformed feature vector of the region to be recognized; if no corresponding low-dimensional vector is matched, a preset low-dimensional vector in the low-dimensional vector library is selected as the transformed feature vector of the region to be recognized.
  • Performing image transformation processing on the cut regions to be recognized further includes: performing up-sampling, brightness equalization, or random perspective transformation on the cut regions. If the original image or a region to be recognized is blurry, the region can be up-sampled and transformed to supplement its information and then cropped, making the image easier for the recognition model to process and yielding the corresponding recognition result. For the recognition model, illumination contrast also affects the recognition effect.
  • The recognition model can be obtained by training a recurrent convolutional neural network model.
  • the area to be identified can also be divided into two levels, “important” and “secondary”.
  • For ID cards, in typical business logic the name, ID number, and address are more important than other fields, so the accuracy requirements for them are also higher.
  • Dedicated detection and recognition models can be customized for these three fields, while the other fields are not customized and directly use the general recognition model.
  • After the transformation processing, the number of outputs of the recognition model increases; specifically, multiple initial recognition results are output for the same region to be recognized.
  • Screening the target recognition result corresponding to each region to be recognized from the initial recognition results based on a preset screening rule includes: comparing the confidence of the initial recognition result before the transformation processing with a preset threshold; when that confidence is greater than or equal to the preset threshold, taking the initial recognition result before the transformation processing as the target recognition result; when that confidence is less than the preset threshold and the confidence of the initial recognition result after the transformation processing is greater than the confidence of the initial recognition result before the transformation processing, taking the initial recognition result after the transformation processing as the target recognition result.
  • When the confidence of the result before the transformation processing reaches the preset threshold, the recognition accuracy is considered high and that result can be selected directly as the target result; when it is below the preset threshold, it is compared with the recognition result after the transformation processing before the target recognition result is selected, which improves the accuracy of the selected recognition result.
  • the preset threshold for example, 90%
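The screening rule above can be sketched as a small selection function. The `(text, confidence)` pair representation and the 90% default threshold are assumptions taken from the example, not the patent's actual implementation:

```python
def select_target_result(before, after, threshold=0.90):
    """Choose the target recognition result for one region from its
    pre-transformation and post-transformation candidates.

    `before` / `after` are (text, confidence) pairs for the same region."""
    text_before, conf_before = before
    text_after, conf_after = after
    if conf_before >= threshold:        # confident enough: keep the original read
        return text_before
    if conf_after > conf_before:        # transformation helped: prefer its read
        return text_after
    return text_before                  # otherwise fall back to the original

# The pre-transformation read is below threshold and the transformed
# region was recognized with higher confidence, so the latter wins.
target = select_target_result(("ID-3456", 0.72), ("ID-8456", 0.88))
```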
  • Step S30: use a preset algorithm to calculate the similarity between the characters of the target recognition result of each region to be recognized and the characters of each category in a preset thesaurus; select the category character corresponding to the maximum similarity value as the category result of that region; fill each category result and target recognition result into a preset template file to generate a structured data file of the original image; and feed the structured data file back to the user.
  • the recognition model only recognizes the content of each region to be recognized; it cannot judge the semantics or category of a target recognition result, that is, it cannot obtain the structured data of the original image. Obtaining the structured data means obtaining the category attribute corresponding to the target recognition result of each region to be recognized.
  • a preset algorithm (for example, a cosine similarity algorithm) can be used for the calculation. The preset thesaurus contains keyword information for each category: for example, if the original image is an ID document, the thesaurus contains a large number of category entries such as name, gender, date of birth, ID number, and address, together with the character information corresponding to each category. The category character with the largest similarity value is selected as the category result of the region to be recognized; each category result and target recognition result are filled into a preset template file to generate a structured data file of the original image, which is fed back to the user.
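One plausible reading of the template-filling step, assuming the preset template is a simple key-value document serialized as JSON — the patent does not specify the template format, and `doc_type` and the field names below are hypothetical:

```python
import json

def fill_template(template, pairs):
    """Fill each (category_result, target_recognition_result) pair into a copy
    of the preset template and serialize it as the structured data 'file'."""
    doc = dict(template)                      # keep the template's fixed fields
    for category, text in pairs:
        doc[category] = text
    return json.dumps(doc, ensure_ascii=False)

preset_template = {"doc_type": "id_card"}     # hypothetical preset template
structured = fill_template(preset_template,
                           [("name", "张三"), ("gender", "男")])
```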
  • the category result can also be verified.
  • for example, the target recognition result is “上海市第六入民医院” (“Shanghai Sixth People's Hospital”, containing one mis-read character), while the names in the preset thesaurus are all standard and correct; when the similarity value is greater than 99% and the recognition result differs from the thesaurus name, the thesaurus name can be substituted for the recognition result for error correction, which further improves the accuracy of the output result.
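The cosine-similarity comparison and the error-correction idea can be sketched with character-count vectors. This is a minimal illustration under that assumption, not the patent's actual algorithm, and the thesaurus entries are made up:

```python
from collections import Counter
from math import sqrt

def char_cosine(a, b):
    """Cosine similarity between two strings, using character-count vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[ch] * cb[ch] for ch in ca)
    norm_a = sqrt(sum(v * v for v in ca.values()))
    norm_b = sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_match(text, entries):
    """Return the thesaurus entry most similar to the recognized text."""
    return max(entries, key=lambda entry: char_cosine(text, entry))

thesaurus = ["上海市第六人民医院", "上海市第一人民医院"]
corrected = best_match("上海市第六入民医院", thesaurus)  # '入' is a mis-read of '人'
```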
  • specifically, it is verified whether the category result of each region to be recognized meets a preset verification condition; when it does not, a regular expression is constructed from the characters of the target recognition result of that region, the regular expression is matched against the various characters in the preset thesaurus, and the matching result is taken as the category result of the region.
  • for example, taking an ID-card image as the original image: if the target recognition result of a region to be recognized is a single character and the constructed regular expression matches the character “男” (male) or “女” (female) in the preset thesaurus, the category result of that region is “gender”; if the regular expression constructed from the target recognition result of a region is “*省*市*” (“*province*city*”) and it matches “省” and “市” in the preset thesaurus, the category result of that region is “address”.
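The regular-expression fallback can be sketched as a small rule table. The patterns below mirror the “男/女” and “*省*市*” examples above (plus a hypothetical ID-number rule) and are illustrative only:

```python
import re

# Hypothetical rules mapping a pattern over the recognized text to a category.
CATEGORY_RULES = [
    (re.compile(r"^[男女]$"), "gender"),           # a single '男' or '女' character
    (re.compile(r".*省.*市.*"), "address"),        # the '*province*city*' pattern
    (re.compile(r"^\d{17}[\dXx]$"), "id_number"),  # 18-character ID number (assumed)
]

def match_category(text):
    """Return the first rule category the recognized text matches, else None."""
    for pattern, category in CATEGORY_RULES:
        if pattern.match(text):
            return category
    return None

category = match_category("广东省深圳市福田区")
```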
  • the embodiment of the present application also proposes a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the computer-readable storage medium may be any one of, or any combination of, a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, and the like.
  • the computer-readable storage medium includes an image structured data extraction program 10, and when the image structured data extraction program 10 is executed by a processor, the following operations are implemented:
  • Receiving step: receiving a request sent by a user to extract structured data of an image, and obtaining the original image, whose structured data is to be extracted, carried in the request;
  • Recognition step: inputting the original image into a pre-trained position detection model to obtain position coordinate information of a plurality of regions to be recognized in the original image; cutting the plurality of regions to be recognized based on the position coordinate information; performing image transformation processing on the cut regions; inputting the regions to be recognized, both before and after the transformation processing, into a pre-trained recognition model to obtain the initial recognition result corresponding to each region; and screening the target recognition result corresponding to each region to be recognized from the initial recognition results based on a preset screening rule; and
  • Extraction step: using a preset algorithm to calculate the similarity between the characters of the target recognition result of each region to be recognized and the characters of each category in a preset thesaurus; selecting the category character corresponding to the maximum similarity value as the category result of that region; filling each category result and target recognition result into a preset template file to generate a structured data file of the original image; and feeding the structured data file back to the user.
  • the image structured data extraction method provided in this application can be applied to the fields of smart government affairs, smart education, etc., so as to promote the construction of smart cities.
  • to further ensure the privacy and security of all the above-mentioned data, the image structured data extraction method provided by this application may also store all of that data in a node of a blockchain.
  • for example, the original image whose structured data is to be extracted, or the structured data file, can be stored in the blockchain node.
  • a blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another using cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity of that information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Multimedia (AREA)
  • Character Discrimination (AREA)

Abstract

A method, apparatus, device, and storage medium for extracting structured data from images, relating to artificial intelligence technology. The method inputs an image whose structured data is to be extracted into a position detection model to obtain the coordinates of each region to be recognized in the image; cuts the regions and performs transformation processing; inputs the regions both before and after transformation into a recognition model to obtain initial recognition results; screens out the target recognition result of each region from the initial recognition results; selects category results according to the similarity between the characters of each target recognition result and the characters of each category in a thesaurus; and generates a structured data file from the category results and target recognition results. Structured data can thus be accurately extracted from image recognition results. The method also involves image recognition and blockchain technology within artificial intelligence and can be applied in fields such as smart government and smart education, promoting the construction of smart cities.

Description

图像结构化数据提取方法、装置、设备及存储介质
本申请要求于2020年5月20日提交中国专利局、申请号为CN202010431403.7,发明名称为“图像结构化数据提取方法、电子装置及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及人工智能技术领域,尤其涉及一种图像结构化数据提取方法、装置、设备及存储介质。
背景技术
传统的光学字符识别 (Optical Character Recognition,OCR)模型需要对每个字段都单独训练一个模型,发明人意识到如果所要处理的图片数据字段类型比较多,则需要大量的标注数据来训练多个模型,开发周期较长,模型训练时所占内存空间也较大,且由于传统的OCR识别模型仅识别文字的信息,不能提取识别结果中的结构化数据,这是本领域技术人员亟待解决的问题。
技术问题
鉴于以上内容,本申请提供一种图像结构化数据提取方法、装置、设备及存储介质,其目的在于解决现有技术中传统的OCR识别模型不能提取识别结果中的结构化数据的技术问题。
技术解决方案
为实现上述目的,本申请提供一种图像结构化数据提取方法,该方法包括:
接收步骤:接收用户发出的提取图像结构化数据的请求,获取所述请求中携带的待提取结构化数据的原始图像;
识别步骤:将所述原始图像输入预先训练好的位置检测模型,得到所述原始图像中多个待识别区域的位置坐标信息,基于所述位置坐标信息对所述多个待识别区域进行切割,对切割后的待识别区域执行图像变换处理,将执行变换处理前及执行变换处理后的待识别区域,输入预先训练好的识别模型,得到各个所述待识别区域对应的初始识别结果,基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果;及
提取步骤:利用预设算法计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,选取最大相似度值对应的类别字符作为该待识别区域的类别结果,将各类别结果与目标识别结果填充至预设模板文件生成所述原始图像的结构化数据文件,并将所述结构化数据文件反馈至所述用户。
为了实现上述目的,本申请还提供一种图像结构化数据提取装置,所述装置包括:
接收模块:用于接收用户发出的提取图像结构化数据的请求,获取所述请求中携带的待提取结构化数据的原始图像;
识别模块:用于将所述原始图像输入预先训练好的位置检测模型,得到所述原始图像中多个待识别区域的位置坐标信息,基于所述位置坐标信息对所述多个待识别区域进行切割,对切割后的待识别区域执行图像变换处理,将执行变换处理前及执行变换处理后的待识别区域,输入预先训练好的识别模型,得到各个所述待识别区域对应的初始识别结果,基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果;及
提取模块:用于利用预设算法计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,选取最大相似度值对应的类别字符作为该待识别区域的类别结果,将各类别结果与目标识别结果填充至预设模板文件生成所述原始图像的结构化数据文件,并将所述结构化数据文件反馈至所述用户。
为实现上述目的,本申请还提供一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序,所述处理器执行所述计算机程序时实现如下步骤:
接收步骤:接收用户发出的提取图像结构化数据的请求,获取所述请求中携带的待提取结构化数据的原始图像;
识别步骤:将所述原始图像输入预先训练好的位置检测模型,得到所述原始图像中多个待识别区域的位置坐标信息,基于所述位置坐标信息对所述多个待识别区域进行切割,对切割后的待识别区域执行图像变换处理,将执行变换处理前及执行变换处理后的待识别区域,输入预先训练好的识别模型,得到各个所述待识别区域对应的初始识别结果,基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果;及
提取步骤:利用预设算法计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,选取最大相似度值对应的类别字符作为该待识别区域的类别结果,将各类别结果与目标识别结果填充至预设模板文件生成所述原始图像的结构化数据文件,并将所述结构化数据文件反馈至所述用户。
为实现上述目的,本申请还提供一种计算机可读存储介质,所述计算机可读存储介质上存储有计算机程序,其中,所述计算机程序被处理器执行时实现如下步骤:
接收步骤:接收用户发出的提取图像结构化数据的请求,获取所述请求中携带的待提取结构化数据的原始图像;
识别步骤:将所述原始图像输入预先训练好的位置检测模型,得到所述原始图像中多个待识别区域的位置坐标信息,基于所述位置坐标信息对所述多个待识别区域进行切割,对切割后的待识别区域执行图像变换处理,将执行变换处理前及执行变换处理后的待识别区域,输入预先训练好的识别模型,得到各个所述待识别区域对应的初始识别结果,基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果;及
提取步骤:利用预设算法计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,选取最大相似度值对应的类别字符作为该待识别区域的类别结果,将各类别结果与目标识别结果填充至预设模板文件生成所述原始图像的结构化数据文件,并将所述结构化数据文件反馈至所述用户。
有益效果
本申请通过对检测模型的输出结果进行图像变换处理,扩展同一待识别区域的数据,再将未变换处理和变换处理后的待识别图像均输入识别模型中,对不同识别结果进行进一步的筛选比对,得到最优的结果作为输出结果,可以提高识别模型输出结果的准确率,通过正则匹配和数据库查找的方法弥补了识别模型在获取结构化数据方面的不足,相对于传统OCR方案,模型训练所需要的训练数据相对较少,节省了系统内存。
附图说明
图1为本申请计算机设备较佳实施例的应用环境图;
图2为图像结构化数据提取装置的模块示意图;
图3为本申请图像结构化数据提取方法较佳实施例的流程示意图。
本申请目的的实现、功能特点及优点将结合实施例,参附图做进一步说明。
本发明的实施方式
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处所描述的具体实施例仅用以解释本申请,并不用于限定本申请。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
参照图1所示,为本申请计算机设备1较佳实施例的示意图。
该计算机设备1包括但不限于:存储器11、处理器12、显示器13及网络接口14。所述计算机设备1通过网络接口14连接网络,获取原始数据。其中,所述网络可以是企业内部网(Intranet)、互联网(Internet)、全球移动通讯系统(Global System of Mobile communication,GSM)、宽带码分多址(Wideband Code Division Multiple Access,WCDMA)、4G网络、5G网络、蓝牙(Bluetooth)、Wi-Fi、通话网络等无线或有线网络。
其中,存储器11至少包括一种类型的可读存储介质,所述可读存储介质包括闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘等。在一些实施例中,所述存储器11可以是所述计算机设备1的内部存储单元,例如该计算机设备1的硬盘或内存。在另一些实施例中,所述存储器11也可以是所述计算机设备1的外部存储设备,例如该计算机设备1配备的插接式硬盘,智能存储卡(Smart Media Card, SMC),安全数字(Secure Digital, SD)卡,闪存卡(Flash Card)等。当然,所述存储器11还可以既包括所述计算机设备1的内部存储单元也包括其外部存储设备。本实施例中,存储器11通常用于存储安装于所述计算机设备1的操作系统和各类应用软件,例如图像结构化数据提取程序10的程序代码等。此外,存储器11还可以用于暂时地存储已经输出或者将要输出的各类数据。
处理器12在一些实施例中可以是中央处理器(Central Processing Unit,CPU)、控制器、微控制器、微处理器、或其他数据处理芯片。该处理器12通常用于控制所述计算机设备1的总体操作,例如执行数据交互或者通信相关的控制和处理等。本实施例中,所述处理器12用于运行所述存储器11中存储的程序代码或者处理数据,例如运行图像结构化数据提取程序10的程序代码等。
显示器13可以称为显示屏或显示单元。在一些实施例中显示器13可以是LED显示器、液晶显示器、触控式液晶显示器以及有机发光二极管(Organic Light-Emitting Diode,OLED)触摸器等。显示器13用于显示在计算机设备1中处理的信息以及用于显示可视化的工作界面,例如显示数据统计的结果。
网络接口14可选地可以包括标准的有线接口、无线接口(如WI-FI接口),该网络接口14通常用于在所述计算机设备1与其它电子设备之间建立通信连接。
图1仅示出了具有组件11-14以及图像结构化数据提取程序10的计算机设备1,但是应理解的是,并不要求实施所有示出的组件,可以替代的实施更多或者更少的组件。
可选地,所述计算机设备1还可以包括用户接口,用户接口可以包括显示器(Display)、输入单元比如键盘(Keyboard),可选的用户接口还可以包括标准的有线接口、无线接口。可选地,在一些实施例中,显示器可以是LED显示器、液晶显示器、触控式液晶显示器以及有机发光二极管(Organic Light-Emitting Diode,OLED)触摸器等。其中,显示器也可以适当的称为显示屏或显示单元,用于显示在计算机设备1中处理的信息以及用于显示可视化的用户界面。
该计算机设备1还可以包括射频(Radio Frequency,RF)电路、传感器和音频电路等等,在此不再赘述。
在上述实施例中,处理器12执行存储器11中存储的图像结构化数据提取程序10时可以实现如下步骤:
接收步骤:接收用户发出的提取图像结构化数据的请求,获取所述请求中携带的待提取结构化数据的原始图像;
识别步骤:将所述原始图像输入预先训练好的位置检测模型,得到所述原始图像中多个待识别区域的位置坐标信息,基于所述位置坐标信息对所述多个待识别区域进行切割,对切割后的待识别区域执行图像变换处理,将执行变换处理前及执行变换处理后的待识别区域,输入预先训练好的识别模型,得到各个所述待识别区域对应的初始识别结果,基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果;及
提取步骤:利用预设算法计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,选取最大相似度值对应的类别字符作为该待识别区域的类别结果,将各类别结果与目标识别结果填充至预设模板文件生成所述原始图像的结构化数据文件,并将所述结构化数据文件反馈至所述用户。
所述存储设备可以为计算机设备1的存储器11,也可以为与计算机设备1通讯连接的其它存储设备。
关于上述步骤的详细介绍,请参照下述图2关于图像结构化数据提取装置100的模块图以及图3关于图像结构化数据提取方法实施例的流程图的说明。
本申请所述图像结构化数据提取装置100可以安装于计算机设备中。根据实现的功能。本发所述模块也可以称之为单元,是指一种能够被计算机设备处理器所执行,并且能够完成固定功能的一系列计算机程序段,其存储在计算机设备的存储器中。
参照图2所示,为图像结构化数据提取装置100一实施例的程序模块图。在本实施例中,所述图像结构化数据提取装置100包括:初始化模块110、识别模块120及提取模块130。
接收模块110,用于接收用户发出的提取图像结构化数据的请求,获取所述请求中携带的待提取结构化数据的原始图像。
在本实施例中,在接收到用户发出的提取图像结构化数据的请求后,解析该请求,获取请求中携带的待提取结构化数据的原始图像,其中,请求中可以包括待提取结构化数据的原始图像,也可以包括待提取结构化数据的原始图像的存储路径及原始图像的唯一标识。也就是说,原始图像可以是用户在提交请求时一并录入的,也可以是用户提交图像结构化数据提取请求之后从请求指定的地址中获取的,原始图像可以是身份证件图像、发票图像等。
识别模块120,用于将所述原始图像输入预先训练好的位置检测模型,得到所述原始图像中多个待识别区域的位置坐标信息,基于所述位置坐标信息对所述多个待识别区域进行切割,对切割后的待识别区域执行图像变换处理,将执行变换处理前及执行变换处理后的待识别区域,输入预先训练好的识别模型,得到各个所述待识别区域对应的初始识别结果,基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果。
在本实施例中,将原始图像输入预先训练好的位置检测模型,得到原始图像的多个待识别区域的位置坐标信息,位置检测模型的深度学习模型可以由Faster-RCNN、SSD或Yolo等训练得到,在训练位置检测模型时,可以利用预设的标注工具(例如,Label Imag工具)以矩形框的形式,分别标注出各原始图像的待识别区域,并生成与各原始图像对应的预设格式的标注文件,预设格式的标注文件可以为可扩展标记语言(Extensible Markup Language,XML)格式,通过生成XML格式的标注文件,可使计算机能读取到原始图像的信息,例如,原始图像中各个待识别区域的坐标信息等。
原始图像输入位置检测模型,可以得到每个待识别区域在原图中所对应的坐标信息,根据坐标信息可在原图中切割出相应的待识别区域对切割后的待识别区域执行图像变换处理,进一步地,对切割后的待识别区域执行图像变换处理包括:分别提取各个待识别区域的高维向量,将各所述高维向量分别与预设的低维向量库进行匹配,若匹配到对应的低维向量,则生成配对样本作为该待识别区域变换处理后的特征向量;若未匹配到对应的低维向量,则选取所述低维向量库中预设的低维向量作为该待识别区域变换处理后的特征向量。
通过对位置检测模型的输出结果进行图像变换处理,扩展同一待识别区域的数据,再将变换处理后的待识别区域输入识别模型中,对不同识别结果进行进一步的筛选比对,得到最优的结果作为输出结果,可以提升后续识别模型输出结果的准确率。
在一个实施例中,对切割后的待识别区域执行图像变换处理还包括:对切割后的待识别区域执行上采样处理、亮度均衡处理或随机透视变换处理。若原始图像比较模糊或者待识别区域比较模糊,可以通过对待识别区域进行上采样变换,补充待识别区域的信息,再进行裁剪可以使图片变得更加易于识别模型的计算,从而得出相应的识别结果。对于识别模型而言,光照比较影响识别的效果,如果待识别区域一部分被强光照射或存在反光等现象,这样的待识别区域的图片就不利于识别出准确的结果,因此还可以对待识别区域的图片进行亮度均衡处理。
之后,将执行变换处理前及执行变换处理后的待识别区域,均输入预先训练好的识别模型,得到各待识别区域对应的初始识别结果,基于预设筛选规则从各所述初始识别结果中筛选中各个待识别区域对应的目标识别结果。
识别模型可以通过循环卷积神经网络模型训练的得到。其中,还可以预先将待识别区域分为“重要”、“次要”两个级别,例如对于身份证,通常业务逻辑里面,姓名、身份证号和地址相对于其他字段更重要,对于精度的要求也更高,实际应用场景中,可对这三个字段进行专有检测和识别模型的定制,其余字段不做定制,直接使用通用识别模型。由于执行了图像变换处理,识别模型的输出结果将会增多,具体表现在同一待识别区域会对应输出多个初始识别结果。
在一个实施例中,所述基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果包括:
分别读取各待识别区域的执行变换处理前的初始识别结果的置信度和执行变换处理后的初始识别结果的置信度,当执行变换处理前的初始识别结果的置信度大于或等于预设阈值时,将执行变换处理前的初始识别结果作为所述目标识别结果;
当执行变换处理前的初始识别结果的置信度小于预设阈值,且当执行变换处理后的初始识别结果的置信度大于执行变换处理前的初始识别结果的置信度时,将执行变换处理后的初始识别结果作为所述目标识别结果。
当执行变换处理前的初始识别结果的置信度大于或等于预设阈值(例如,90%),说明识别的准确性高,可直接选取执行变换处理前的识别结果作为目标结果,若小于预设阈值,则与执行变换处理后的识别结果比较置信度大小再选取目标识别结果,可提高选取的识别结果的准确性。
提取模块130,用于利用预设算法计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,选取最大相似度值对应的类别字符作为该待识别区域的类别结果,将各类别结果与目标识别结果填充至预设模板文件生成所述原始图像的结构化数据文件,并将所述结构化数据文件反馈至所述用户。
在本实施例中,识别模型仅把每个待识别区域的内容识别出来,无法判断的目标识别结果中的语义或所属类别,即不能获取原始图像的结构化数据,获取结构化数据是指获取每个待识别区域的目标识别结果的对应的类别属性。可以利用预设算法(例如,余弦相似度算法)计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,预设词库包含各类别的关键字信息,例如,原始图像为身份证件为例,预设词库包含大量的姓名、性别、出生日期、身份证号、地址等类别信息及与类别信息对应的字符信息,选取相似度值最大的类别字符作为待识别区域的类别结果,将各个类别结果与目标识别结果填充至预设模板文件生成原始图像的结构化数据文件,并将所述结构化数据文件反馈至用户。
在一个实施例中,还可以对类别结果进行验证,例如,目标识别结果为“上海市第六入民医院”,该结果中存在错字,而预设词库里中的名称都是标准正确的,即当相似度值大于99%,且识别结果和预设词库的名称不一样时,则可以将预设词库的名称替代该识别结果进行纠错,进一步提升输出结果的准确率。
具体地,验证各所述待识别区域的类别结果是否符合预设的验证条件,当所述待识别区域的类别结果不符合预设的验证条件时,基于该待识别区域的目标识别结果的字符分别构建正则表达式,将该正则表达式与预设词库中各类字符进行匹配,得到匹配结果作为该待识别区域的类别结果。
例如:以原始图像为身份证图像为例,若某个待识别区域的目标识别结果是单字,构建的正则表达式匹配到预设词库中的“男”或“女”的字符,则该待识别区域的类别结果为“性别”;若某个待识别区域的目标识别结果构建的正则表达式为“*省*市*”,匹配到预设词库中的“省”、“市”,则该待识别区域的类别结果为“住址”。
此外,本申请还提供一种图像结构化数据提取方法。参照图3所示,为本申请图像结构化数据提取方法的实施例的方法流程示意图。计算机设备1的处理器12执行存储器11中存储的图像结构化数据提取程序10时实现图像结构化数据提取方法的如下步骤:
步骤S10:接收用户发出的提取图像结构化数据的请求,获取所述请求中携带的待提取结构化数据的原始图像。
在本实施例中,在接收到用户发出的提取图像结构化数据的请求后,解析该请求,获取请求中携带的待提取结构化数据的原始图像,其中,请求中可以包括待提取结构化数据的原始图像,也可以包括待提取结构化数据的原始图像的存储路径及原始图像的唯一标识。也就是说,原始图像可以是用户在提交请求时一并录入的,也可以是用户提交图像结构化数据提取请求之后从请求指定的地址中获取的,原始图像可以是身份证件图像、发票图像等。
步骤S20:将所述原始图像输入预先训练好的位置检测模型,得到所述原始图像中多个待识别区域的位置坐标信息,基于所述位置坐标信息对所述多个待识别区域进行切割,对切割后的待识别区域执行图像变换处理,将执行变换处理前及执行变换处理后的待识别区域,输入预先训练好的识别模型,得到各个所述待识别区域对应的初始识别结果,基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果。
在本实施例中,将原始图像输入预先训练好的位置检测模型,得到原始图像的多个待识别区域的位置坐标信息,位置检测模型的深度学习模型可以由Faster-RCNN、SSD或Yolo等训练得到,在训练位置检测模型时,可以利用预设的标注工具(例如,Label Imag工具)以矩形框的形式,分别标注出各原始图像的待识别区域,并生成与各原始图像对应的预设格式的标注文件,预设格式的标注文件可以为可扩展标记语言(Extensible Markup Language,XML)格式,通过生成XML格式的标注文件,可使计算机能读取到原始图像的信息,例如,原始图像中各个待识别区域的坐标信息等。
原始图像输入位置检测模型,可以得到每个待识别区域在原图中所对应的坐标信息,根据坐标信息可在原图中切割出相应的待识别区域对切割后的待识别区域执行图像变换处理,进一步地,对切割后的待识别区域执行图像变换处理包括:分别提取各个待识别区域的高维向量,将各所述高维向量分别与预设的低维向量库进行匹配,若匹配到对应的低维向量,则生成配对样本作为该待识别区域变换处理后的特征向量;若未匹配到对应的低维向量,则选取所述低维向量库中预设的低维向量作为该待识别区域变换处理后的特征向量。
通过对位置检测模型的输出结果进行图像变换处理,扩展同一待识别区域的数据,再将变换处理后的待识别区域输入识别模型中,对不同识别结果进行进一步的筛选比对,得到最优的结果作为输出结果,可以提升后续识别模型输出结果的准确率。
在一个实施例中,对切割后的待识别区域执行图像变换处理还包括:对切割后的待识别区域执行上采样处理、亮度均衡处理或随机透视变换处理。若原始图像比较模糊或者待识别区域比较模糊,可以通过对待识别区域进行上采样变换,补充待识别区域的信息,再进行裁剪可以使图片变得更加易于识别模型的计算,从而得出相应的识别结果。对于识别模型而言,光照比较影响识别的效果,如果待识别区域一部分被强光照射或存在反光等现象,这样的待识别区域的图片就不利于识别出准确的结果,因此还可以对待识别区域的图片进行亮度均衡处理。
之后,将执行变换处理前及执行变换处理后的待识别区域,均输入预先训练好的识别模型,得到各待识别区域对应的初始识别结果,基于预设筛选规则从各所述初始识别结果中筛选中各个待识别区域对应的目标识别结果。
识别模型可以通过循环卷积神经网络模型训练的得到。其中,还可以预先将待识别区域分为“重要”、“次要”两个级别,例如对于身份证,通常业务逻辑里面,姓名、身份证号和地址相对于其他字段更重要,对于精度的要求也更高,实际应用场景中,可对这三个字段进行专有检测和识别模型的定制,其余字段不做定制,直接使用通用识别模型。由于执行了图像变换处理,识别模型的输出结果将会增多,具体表现在同一待识别区域会对应输出多个初始识别结果。
在一个实施例中,所述基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果包括:
分别读取各待识别区域的执行变换处理前的初始识别结果的置信度和执行变换处理后的初始识别结果的置信度,当执行变换处理前的初始识别结果的置信度大于或等于预设阈值时,将执行变换处理前的初始识别结果作为所述目标识别结果;
当执行变换处理前的初始识别结果的置信度小于预设阈值,且当执行变换处理后的初始识别结果的置信度大于执行变换处理前的初始识别结果的置信度时,将执行变换处理后的初始识别结果作为所述目标识别结果。
当执行变换处理前的初始识别结果的置信度大于或等于预设阈值(例如,90%),说明识别的准确性高,可直接选取执行变换处理前的识别结果作为目标结果,若小于预设阈值,则与执行变换处理后的识别结果比较置信度大小再选取目标识别结果,可提高选取的识别结果的准确性。
步骤S30:利用预设算法计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,选取最大相似度值对应的类别字符作为该待识别区域的类别结果,将各类别结果与目标识别结果填充至预设模板文件生成所述原始图像的结构化数据文件,并将所述结构化数据文件反馈至所述用户。
在本实施例中,识别模型仅把每个待识别区域的内容识别出来,无法判断的目标识别结果中的语义或所属类别,即不能获取原始图像的结构化数据,获取结构化数据是指获取每个待识别区域的目标识别结果的对应的类别属性。可以利用预设算法(例如,余弦相似度算法)计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,预设词库包含各类别的关键字信息,例如,原始图像为身份证件为例,预设词库包含大量的姓名、性别、出生日期、身份证号、地址等类别信息及与类别信息对应的字符信息,选取相似度值最大的类别字符作为待识别区域的类别结果,将各个类别结果与目标识别结果填充至预设模板文件生成原始图像的结构化数据文件,并将所述结构化数据文件反馈至用户。
在一个实施例中,还可以对类别结果进行验证,例如,目标识别结果为“上海市第六入民医院”,该结果中存在错字,而预设词库里中的名称都是标准正确的,即当相似度值大于99%,且识别结果和预设词库的名称不一样时,则可以将预设词库的名称替代该识别结果进行纠错,进一步提升输出结果的准确率。
具体地,验证各所述待识别区域的类别结果是否符合预设的验证条件,当所述待识别区域的类别结果不符合预设的验证条件时,基于该待识别区域的目标识别结果的字符分别构建正则表达式,将该正则表达式与预设词库中各类字符进行匹配,得到匹配结果作为该待识别区域的类别结果。
例如:以原始图像为身份证图像为例,若某个待识别区域的目标识别结果是单字,构建的正则表达式匹配到预设词库中的“男”或“女”的字符,则该待识别区域的类别结果为“性别”;若某个待识别区域的目标识别结果构建的正则表达式为“*省*市*”,匹配到预设词库中的“省”、“市”,则该待识别区域的类别结果为“住址”。
此外,本申请实施例还提出一种计算机可读存储介质,所述计算机可读存储介质可以是非易失性,也可以是易失性,该计算机可读存储介质可以是硬盘、多媒体卡、SD卡、闪存卡、SMC、只读存储器(ROM)、可擦除可编程只读存储器(EPROM)、便携式紧致盘只读存储器(CD-ROM)、USB存储器等等中的任意一种或者几种的任意组合。所述计算机可读存储介质中包括图像结构化数据提取程序10,所述图像结构化数据提取程序10被处理器执行时实现如下操作:
接收步骤:接收用户发出的提取图像结构化数据的请求,获取所述请求中携带的待提取结构化数据的原始图像;
识别步骤:将所述原始图像输入预先训练好的位置检测模型,得到所述原始图像中多个待识别区域的位置坐标信息,基于所述位置坐标信息对所述多个待识别区域进行切割,对切割后的待识别区域执行图像变换处理,将执行变换处理前及执行变换处理后的待识别区域,输入预先训练好的识别模型,得到各个所述待识别区域对应的初始识别结果,基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果;及
提取步骤:利用预设算法计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,选取最大相似度值对应的类别字符作为该待识别区域的类别结果,将各类别结果与目标识别结果填充至预设模板文件生成所述原始图像的结构化数据文件,并将所述结构化数据文件反馈至所述用户。
在一实施例中,本申请所提供的图像结构化数据提取方法可应用于智慧政务、智慧教育等领域中,从而推动智慧城市的建设。
在另一实施例中,本申请所提供的图像结构化数据提取方法,为进一步保证上述所有出现的数据的私密和安全性,上述所有数据还可以存储于一区块链的节点中。例如待提取结构化数据的原始图像、或结构化数据文件等等,这些数据均可存储在区块链节点中。
需要说明的是,本申请所指区块链是分布式数据存储、点对点传输、共识机制、加密算法等计算机技术的新型应用模式。区块链(Blockchain),本质上是一个去中心化的数据库,是一串使用密码学方法相关联产生的数据块,每一个数据块中包含了一批次网络交易的信息,用于验证其信息的有效性(防伪)和生成下一个区块。区块链可以包括区块链底层平台、平台产品服务层以及应用服务层等。
本申请之计算机可读存储介质的具体实施方式与上述图像结构化数据提取方法的具体实施方式大致相同,在此不再赘述。
需要说明的是,上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。并且本文中的术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、装置、物品或者方法不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、装置、物品或者方法所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、装置、物品或者方法中还存在另外的相同要素。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在如上所述的一个存储介质(如ROM/RAM、磁碟、光盘)中,包括若干指令用以使得一台终端设备(可以是手机,计算机,电子装置,或者网络设备等)执行本申请各个实施例所述的方法。
以上仅为本申请的优选实施例,并非因此限制本申请的专利范围,凡是利用本申请说明书及附图内容所作的等效结构或等效流程变换,或直接或间接运用在其他相关的技术领域,均同理包括在本申请的专利保护范围内。

Claims (20)

  1. 一种图像结构化数据提取方法,应用于计算机设备,其中,所述方法包括:
    接收步骤:接收用户发出的提取图像结构化数据的请求,获取所述请求中携带的待提取结构化数据的原始图像;
    识别步骤:将所述原始图像输入预先训练好的位置检测模型,得到所述原始图像中多个待识别区域的位置坐标信息,基于所述位置坐标信息对所述多个待识别区域进行切割,对切割后的待识别区域执行图像变换处理,将执行变换处理前及执行变换处理后的待识别区域,输入预先训练好的识别模型,得到各个所述待识别区域对应的初始识别结果,基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果;及
    提取步骤:利用预设算法计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,选取最大相似度值对应的类别字符作为该待识别区域的类别结果,将各类别结果与目标识别结果填充至预设模板文件生成所述原始图像的结构化数据文件,并将所述结构化数据文件反馈至所述用户。
  2. 如权利要求1所述的图像结构化数据提取方法,其中,所述对切割后的待识别区域执行图像变换处理包括:
    分别提取各个待识别区域的高维向量,将各所述高维向量分别与预设的低维向量库进行匹配,若匹配到对应的低维向量,则生成配对样本作为该待识别区域变换处理后的特征向量;
    若未匹配到对应的低维向量,则选取所述低维向量库中预设的低维向量作为该待识别区域变换处理后的特征向量。
  3. 如权利要求1所述的图像结构化数据提取方法,其中,所述基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果包括:
    分别读取各待识别区域的执行变换处理前的初始识别结果的置信度和执行变换处理后的初始识别结果的置信度,当执行变换处理前的初始识别结果的置信度大于或等于预设阈值时,将执行变换处理前的初始识别结果作为所述目标识别结果;
    当执行变换处理前的初始识别结果的置信度小于预设阈值,且当执行变换处理后的初始识别结果的置信度大于执行变换处理前的初始识别结果的置信度时,将执行变换处理后的初始识别结果作为所述目标识别结果。
  4. 如权利要求1所述的图像结构化数据提取方法,其中,所述提取步骤还包括:
    验证各所述待识别区域的类别结果是否符合预设的验证条件,当所述待识别区域的类别结果不符合预设的验证条件时,基于该待识别区域的目标识别结果的字符分别构建正则表达式,将该正则表达式与预设词库中各类字符进行匹配,得到匹配结果作为该待识别区域的类别结果。
  5. 如权利要求1所述的图像结构化数据提取方法,其中,所述对切割后的待识别区域执行图像变换处理还包括:对切割后的待识别区域执行上采样处理、亮度均衡处理或随机透视变换处理。
  6. 如权利要求1所述的图像结构化数据提取方法,其中,所述预设算法包括余弦相似度算法。
  7. 如权利要求1所述的图像结构化数据提取方法,其中,所述识别模型是通过循环卷积神经网络模型训练的得到。
  8. 一种图像结构化数据提取装置,其中,所述装置包括:
    接收模块:用于接收用户发出的提取图像结构化数据的请求,获取所述请求中携带的待提取结构化数据的原始图像;
    识别模块:用于将所述原始图像输入预先训练好的位置检测模型,得到所述原始图像中多个待识别区域的位置坐标信息,基于所述位置坐标信息对所述多个待识别区域进行切割,对切割后的待识别区域执行图像变换处理,将执行变换处理前及执行变换处理后的待识别区域,输入预先训练好的识别模型,得到各个所述待识别区域对应的初始识别结果,基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果;及
    提取模块:用于利用预设算法计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,选取最大相似度值对应的类别字符作为该待识别区域的类别结果,将各类别结果与目标识别结果填充至预设模板文件生成所述原始图像的结构化数据文件,并将所述结构化数据文件反馈至所述用户。
  9. 一种计算机设备,包括存储器、处理器以及存储在所述存储器中并在所述处理器上运行的计算机程序,其中,所述处理器执行所述计算机程序时实现如下步骤:
    接收步骤:接收用户发出的提取图像结构化数据的请求,获取所述请求中携带的待提取结构化数据的原始图像;
    识别步骤:将所述原始图像输入预先训练好的位置检测模型,得到所述原始图像中多个待识别区域的位置坐标信息,基于所述位置坐标信息对所述多个待识别区域进行切割,对切割后的待识别区域执行图像变换处理,将执行变换处理前及执行变换处理后的待识别区域,输入预先训练好的识别模型,得到各个所述待识别区域对应的初始识别结果,基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果;及
    提取步骤:利用预设算法计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,选取最大相似度值对应的类别字符作为该待识别区域的类别结果,将各类别结果与目标识别结果填充至预设模板文件生成所述原始图像的结构化数据文件,并将所述结构化数据文件反馈至所述用户。
  10. 如权利要求9所述的计算机设备,其中,所述对切割后的待识别区域执行图像变换处理包括:
    分别提取各个待识别区域的高维向量,将各所述高维向量分别与预设的低维向量库进行匹配,若匹配到对应的低维向量,则生成配对样本作为该待识别区域变换处理后的特征向量;
    若未匹配到对应的低维向量,则选取所述低维向量库中预设的低维向量作为该待识别区域变换处理后的特征向量。
  11. 如权利要求9所述的计算机设备,其中,所述基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果包括:
    分别读取各待识别区域的执行变换处理前的初始识别结果的置信度和执行变换处理后的初始识别结果的置信度,当执行变换处理前的初始识别结果的置信度大于或等于预设阈值时,将执行变换处理前的初始识别结果作为所述目标识别结果;
    当执行变换处理前的初始识别结果的置信度小于预设阈值,且当执行变换处理后的初始识别结果的置信度大于执行变换处理前的初始识别结果的置信度时,将执行变换处理后的初始识别结果作为所述目标识别结果。
  12. 如权利要求9所述的计算机设备,其中,所述提取步骤还包括:
    验证各所述待识别区域的类别结果是否符合预设的验证条件,当所述待识别区域的类别结果不符合预设的验证条件时,基于该待识别区域的目标识别结果的字符分别构建正则表达式,将该正则表达式与预设词库中各类字符进行匹配,得到匹配结果作为该待识别区域的类别结果。
  13. 如权利要求9所述的计算机设备,其中,所述对切割后的待识别区域执行图像变换处理还包括:对切割后的待识别区域执行上采样处理、亮度均衡处理或随机透视变换处理。
  14. 如权利要求9所述的计算机设备,其中,所述预设算法包括余弦相似度算法。
  15. 如权利要求9所述的计算机设备,其中,所述识别模型是通过循环卷积神经网络模型训练的得到。
  16. 一种计算机可读存储介质,所述计算机可读存储介质上存储有计算机程序,其中,所述计算机程序被处理器执行时实现如下步骤:
    接收步骤:接收用户发出的提取图像结构化数据的请求,获取所述请求中携带的待提取结构化数据的原始图像;
    识别步骤:将所述原始图像输入预先训练好的位置检测模型,得到所述原始图像中多个待识别区域的位置坐标信息,基于所述位置坐标信息对所述多个待识别区域进行切割,对切割后的待识别区域执行图像变换处理,将执行变换处理前及执行变换处理后的待识别区域,输入预先训练好的识别模型,得到各个所述待识别区域对应的初始识别结果,基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果;及
    提取步骤:利用预设算法计算各个待识别区域的目标识别结果对应的字符与预设词库中各类别的字符的相似度,选取最大相似度值对应的类别字符作为该待识别区域的类别结果,将各类别结果与目标识别结果填充至预设模板文件生成所述原始图像的结构化数据文件,并将所述结构化数据文件反馈至所述用户。
  17. 如权利要求16所述的计算机可读存储介质,其中,所述对切割后的待识别区域执行图像变换处理包括:
    分别提取各个待识别区域的高维向量,将各所述高维向量分别与预设的低维向量库进行匹配,若匹配到对应的低维向量,则生成配对样本作为该待识别区域变换处理后的特征向量;
    若未匹配到对应的低维向量,则选取所述低维向量库中预设的低维向量作为该待识别区域变换处理后的特征向量。
  18. 如权利要求16所述的计算机可读存储介质,其中,所述基于预设筛选规则从各个所述初始识别结果中筛选中各个待识别区域对应的目标识别结果包括:
    分别读取各待识别区域的执行变换处理前的初始识别结果的置信度和执行变换处理后的初始识别结果的置信度,当执行变换处理前的初始识别结果的置信度大于或等于预设阈值时,将执行变换处理前的初始识别结果作为所述目标识别结果;
    当执行变换处理前的初始识别结果的置信度小于预设阈值,且当执行变换处理后的初始识别结果的置信度大于执行变换处理前的初始识别结果的置信度时,将执行变换处理后的初始识别结果作为所述目标识别结果。
  19. 如权利要求16所述的计算机可读存储介质,其中,所述提取步骤还包括:
    验证各所述待识别区域的类别结果是否符合预设的验证条件,当所述待识别区域的类别结果不符合预设的验证条件时,基于该待识别区域的目标识别结果的字符分别构建正则表达式,将该正则表达式与预设词库中各类字符进行匹配,得到匹配结果作为该待识别区域的类别结果。
  20. 如权利要求16所述的计算机可读存储介质,其中,所述对切割后的待识别区域执行图像变换处理还包括:对切割后的待识别区域执行上采样处理、亮度均衡处理或随机透视变换处理。
PCT/CN2020/098946 2020-05-20 2020-06-29 图像结构化数据提取方法、装置、设备及存储介质 WO2021151270A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010431403.7 2020-05-20
CN202010431403.7A CN111695439B (zh) 2020-05-20 2020-05-20 图像结构化数据提取方法、电子装置及存储介质

Publications (1)

Publication Number Publication Date
WO2021151270A1 true WO2021151270A1 (zh) 2021-08-05

Family

ID=72478033

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/098946 WO2021151270A1 (zh) 2020-05-20 2020-06-29 图像结构化数据提取方法、装置、设备及存储介质

Country Status (2)

Country Link
CN (1) CN111695439B (zh)
WO (1) WO2021151270A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723347A (zh) * 2021-09-09 2021-11-30 京东科技控股股份有限公司 信息提取的方法、装置、电子设备及存储介质
CN114792422A (zh) * 2022-05-16 2022-07-26 合肥优尔电子科技有限公司 一种基于增强透视的光学文字识别方法
CN114842483A (zh) * 2022-06-27 2022-08-02 齐鲁工业大学 基于神经网络和模板匹配的标准文件信息提取方法及系统
CN114913320A (zh) * 2022-06-17 2022-08-16 支付宝(杭州)信息技术有限公司 基于模板的证件通用结构化方法和系统

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112686262A (zh) * 2020-12-28 2021-04-20 广州博士信息技术研究院有限公司 一种基于图像识别技术的手册提取结构化数据并快速归档的方法
CN113011254B (zh) * 2021-02-04 2023-11-07 腾讯科技(深圳)有限公司 一种视频数据处理方法、计算机设备及可读存储介质
CN113011274B (zh) * 2021-02-24 2024-04-09 南京三百云信息科技有限公司 图像识别方法、装置、电子设备及存储介质
CN112990091A (zh) * 2021-04-09 2021-06-18 数库(上海)科技有限公司 基于目标检测的研报解析方法、装置、设备和存储介质
CN113837113A (zh) * 2021-09-27 2021-12-24 中国平安财产保险股份有限公司 基于人工智能的文档校验方法、装置、设备及介质
CN114140810B (zh) * 2022-01-30 2022-04-22 北京欧应信息技术有限公司 用于文档结构化识别的方法、设备和介质
CN114637845B (zh) * 2022-03-11 2023-04-14 上海弘玑信息技术有限公司 模型测试方法、装置、设备和存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180307942A1 (en) * 2015-06-05 2018-10-25 Gracenote, Inc. Logo Recognition in Images and Videos
CN109858414A (zh) * 2019-01-21 2019-06-07 南京邮电大学 一种发票分块检测方法
CN109919014A (zh) * 2019-01-28 2019-06-21 平安科技(深圳)有限公司 Ocr识别方法及其电子设备
CN110135411A (zh) * 2019-04-30 2019-08-16 北京邮电大学 名片识别方法和装置
CN110647829A (zh) * 2019-09-12 2020-01-03 全球能源互联网研究院有限公司 一种票据的文本识别方法及系统
CN110717366A (zh) * 2018-07-13 2020-01-21 杭州海康威视数字技术股份有限公司 文本信息的识别方法、装置、设备及存储介质
US10572725B1 (en) * 2018-03-30 2020-02-25 Intuit Inc. Form image field extraction

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107038178B (zh) * 2016-08-03 2020-07-21 平安科技(深圳)有限公司 舆情分析方法和装置
CN107798299B (zh) * 2017-10-09 2020-02-07 平安科技(深圳)有限公司 票据信息识别方法、电子装置及可读存储介质
CN108446621A (zh) * 2018-03-14 2018-08-24 平安科技(深圳)有限公司 票据识别方法、服务器及计算机可读存储介质

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180307942A1 (en) * 2015-06-05 2018-10-25 Gracenote, Inc. Logo Recognition in Images and Videos
US10572725B1 (en) * 2018-03-30 2020-02-25 Intuit Inc. Form image field extraction
CN110717366A (zh) * 2018-07-13 2020-01-21 杭州海康威视数字技术股份有限公司 文本信息的识别方法、装置、设备及存储介质
CN109858414A (zh) * 2019-01-21 2019-06-07 南京邮电大学 一种发票分块检测方法
CN109919014A (zh) * 2019-01-28 2019-06-21 平安科技(深圳)有限公司 Ocr识别方法及其电子设备
CN110135411A (zh) * 2019-04-30 2019-08-16 北京邮电大学 名片识别方法和装置
CN110647829A (zh) * 2019-09-12 2020-01-03 全球能源互联网研究院有限公司 一种票据的文本识别方法及系统

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723347A (zh) * 2021-09-09 2021-11-30 京东科技控股股份有限公司 信息提取的方法、装置、电子设备及存储介质
CN113723347B (zh) * 2021-09-09 2023-11-07 京东科技控股股份有限公司 信息提取的方法、装置、电子设备及存储介质
CN114792422A (zh) * 2022-05-16 2022-07-26 合肥优尔电子科技有限公司 一种基于增强透视的光学文字识别方法
CN114792422B (zh) * 2022-05-16 2023-12-12 合肥优尔电子科技有限公司 一种基于增强透视的光学文字识别方法
CN114913320A (zh) * 2022-06-17 2022-08-16 支付宝(杭州)信息技术有限公司 基于模板的证件通用结构化方法和系统
CN114842483A (zh) * 2022-06-27 2022-08-02 齐鲁工业大学 基于神经网络和模板匹配的标准文件信息提取方法及系统
CN114842483B (zh) * 2022-06-27 2023-11-28 齐鲁工业大学 基于神经网络和模板匹配的标准文件信息提取方法及系统

Also Published As

Publication number Publication date
CN111695439B (zh) 2024-05-10
CN111695439A (zh) 2020-09-22

Similar Documents

Publication Publication Date Title
WO2021151270A1 (zh) 图像结构化数据提取方法、装置、设备及存储介质
WO2020233332A1 (zh) 文本结构化信息提取方法、服务器及存储介质
WO2019184217A1 (zh) 热点事件分类方法、装置及存储介质
WO2022105122A1 (zh) 基于人工智能的答案生成方法、装置、计算机设备及介质
US11080910B2 (en) Method and device for displaying explanation of reference numeral in patent drawing image using artificial intelligence technology based machine learning
CN110442841B (zh) 识别简历的方法及装置、计算机设备、存储介质
WO2018000998A1 (zh) 界面生成方法、装置和系统
WO2019218514A1 (zh) 网页目标信息的提取方法、装置及存储介质
WO2021135469A1 (zh) 基于机器学习的信息抽取方法、装置、计算机设备及介质
WO2022105179A1 (zh) 生物特征图像识别方法、装置、电子设备及可读存储介质
US20110106805A1 (en) Method and system for searching multilingual documents
WO2021208703A1 (zh) 问题解析方法、装置、电子设备及存储介质
CN112287069B (zh) 基于语音语义的信息检索方法、装置及计算机设备
WO2022048363A1 (zh) 网站分类方法、装置、计算机设备及存储介质
CN111783471B (zh) 自然语言的语义识别方法、装置、设备及存储介质
CN112632278A (zh) 一种基于多标签分类的标注方法、装置、设备及存储介质
CN113656547B (zh) 文本匹配方法、装置、设备及存储介质
WO2022105493A1 (zh) 基于语义识别的数据查询方法、装置、设备及存储介质
WO2022105119A1 (zh) 意图识别模型的训练语料生成方法及其相关设备
CN111797217B (zh) 基于faq匹配模型的信息查询方法、及其相关设备
WO2023029513A1 (zh) 基于人工智能的搜索意图识别方法、装置、设备及介质
CN113821622A (zh) 基于人工智能的答案检索方法、装置、电子设备及介质
CN113360654B (zh) 文本分类方法、装置、电子设备及可读存储介质
WO2019085118A1 (zh) 基于主题模型的关联词分析方法、电子装置及存储介质
CN114842982B (zh) 一种面向医疗信息系统的知识表达方法、装置及系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20917041

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20917041

Country of ref document: EP

Kind code of ref document: A1