WO2020164281A1 - Table analysis method based on character positioning and recognition, and medium and computer device - Google Patents

Table analysis method based on character positioning and recognition, and medium and computer device

Info

Publication number
WO2020164281A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture
layout
position information
recognition
table layout
Prior art date
Application number
PCT/CN2019/118422
Other languages
English (en)
Chinese (zh)
Inventor
周罡
卢波
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2020164281A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 - Document-oriented image-based pattern recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 - Document-oriented image-based pattern recognition
    • G06V30/41 - Analysis of document content
    • G06V30/412 - Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables

Definitions

  • This application relates to the field of computer processing technology, and in particular to a table analysis method, medium and computer equipment based on text positioning and recognition.
  • Deep learning is developing rapidly in the field of image recognition. It has surpassed traditional methods in both accuracy and efficiency and has attracted wide attention in the field. Deep learning is a new field of machine learning research; its motivation lies in building neural networks that simulate the human brain's analysis and learning, imitating the mechanism by which the human brain interprets data such as images, sounds and text.
  • Recognition of a table refers to converting the table in a table picture into editable table text, a process that requires both text recognition and image recognition.
  • The existing technical solution performs table analysis based on the presence of table lines; when there are no table lines, the table cannot be extracted from the table picture.
  • the present application provides a form analysis method and corresponding device based on text positioning and recognition, which mainly realizes the positioning and recognition of text in form pictures by using established deep learning models, and improves the efficiency and accuracy of form picture recognition.
  • This application also provides a computer device and a readable storage medium for executing the table analysis method based on text positioning and recognition of this application.
  • the present application provides a method for analyzing table images based on text positioning and recognition, the method including:
  • Inputting the form picture to a pre-trained text positioning network to obtain position information of characters in the form picture includes:
  • a rectangular coordinate system is established, and the coordinates of each vertex of the rectangular frame are obtained as the position information.
  • This application provides a form analysis method based on text positioning and recognition: the form picture is input to a pre-trained text positioning network to obtain the position information of the characters in the form picture; graphic segmentation is performed on the form picture according to the position information, and the segmented cell pictures are input to a pre-trained text recognition network to obtain the cell character content; the first table layout of the table picture is extracted according to the position information; and a table file of the table picture is generated according to the first table layout and the cell character content.
  • the established deep learning model can be used to locate and recognize the text in the table image, which improves the efficiency and accuracy of the table image recognition.
  • This application can detect whether the table picture contains grid lines; if it does, extract the second table layout of the table picture; and compare the second table layout with the first table layout, the first table layout being verified as valid when the comparison shows that the two layouts are consistent.
  • This application can additionally detect whether there are table lines in the table picture. When the table picture has table lines, the table lines are directly extracted, and the obtained first table layout is compared with the second table layout formed by the extracted table lines to verify whether the first table layout is valid.
  • This application uses the text positioning network and the text recognition network to parse table pictures, and is therefore compatible with pictures that have no table lines, complete table lines, or incomplete table lines, giving it a wide scope of application.
  • The present application may further calculate the comparison result of the second table layout and the first table layout, expressed as the number of points of difference between the two layouts. When that number is greater than a preset value, the text positioning network is retrained. Through this mechanism the application can learn flexibly and intelligently, adjusting the pre-trained text positioning network so that the analysis of table pictures becomes more and more accurate.
  • FIG. 1 is a flowchart of a table parsing method based on text positioning recognition in an embodiment
  • Figure 2 is a text positioning network based on scene text detection in the prior art
  • FIG. 3 is a schematic diagram of obtaining position information of characters in the table picture in an embodiment
  • FIG. 4 is a structural block diagram of a table analysis device based on text positioning recognition in an embodiment
  • Fig. 5 is a block diagram of the internal structure of a computer device in an embodiment.
  • An embodiment of the present application provides a table analysis method based on text positioning and recognition. As shown in FIG. 1, the method includes the following steps:
  • Deep network training is performed in advance on multiple input target samples, training a text positioning network capable of locating the text of a table picture and a text recognition network capable of recognizing that text. Specifically, feature point extraction and feature fusion are performed on the sample pictures, and the text positioning network and the text recognition network are finally output.
  • the target sample includes at least a picture sample and the coordinates of a marked rectangular frame with text.
  • Deep network training is a new field in machine learning research. Its motivation is to establish and simulate a neural network that simulates the human brain for analysis and learning. It mimics the mechanism of the human brain to interpret data, such as images, sounds, and text.
  • The general idea of this application is a text detection and recognition process based on deep network training: positioning networks such as FasterRCNN (deep-learning-based target detection) and CTPN (natural scene text detection) locate the text in the picture to obtain its position information, and the area pointed to by the position information is then input to an RNN-based text recognition network such as RCNN for text recognition, obtaining the character string corresponding to the position information.
  • Figure 2 is a text positioning network based on EAST (scene text detection).
  • the text positioning network used in this application is an improvement based on the EAST text positioning network.
  • In the text positioning network used in this application, the score map in the network structure shown in FIG. 2 is connected to an LSTM (Long Short-Term Memory network), after which the score map is enhanced and smoothed, and dice loss is used in place of focal loss during training.
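  • To make the loss substitution concrete, the following is a minimal sketch (my own illustration, not the patent's implementation) of a dice loss over a predicted score map, assuming the prediction and the binary ground-truth mask are NumPy arrays of the same shape:

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Dice loss between a predicted score map and a binary ground-truth mask.

    Both inputs are arrays with values in [0, 1]; the loss is
    1 - Dice coefficient, so perfect overlap gives a loss near 0
    and disjoint maps give a loss near 1.
    """
    intersection = np.sum(pred * target)
    union = np.sum(pred) + np.sum(target)
    return 1.0 - (2.0 * intersection + eps) / (union + eps)
```

  • Unlike focal loss, this formulation is insensitive to the ratio of background to text pixels, which is one common motivation for the swap in text detection networks.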
  • LSTM is a time recurrent neural network, which is suitable for processing and predicting important events with relatively long intervals and delays in time series.
  • Inputting the form picture described in this application to the pre-trained text positioning network to obtain the position information of the characters in the form picture specifically includes: inputting the form picture to the pre-trained text positioning network; taking several character strings as a character string combination; obtaining the smallest rectangular frame surrounding the character string combination; and establishing a rectangular coordinate system and obtaining the coordinates of each vertex of the rectangular frame as the position information.
  • FIG. 3 is a schematic diagram of obtaining position information of characters in the table picture.
  • the table picture contains several character string combinations. After the text positioning network is used, the smallest rectangular frame wrapping each character string combination is output.
  • the position information of the characters in the table picture is expressed as the coordinate value of the smallest rectangular frame that wraps the combination of character strings.
  • the coordinates of the four vertices of the rectangular frame surrounding the character string combination can be directly obtained through the character positioning network.
  • the position information is expressed as the coordinate values of the upper left corner and the lower right corner of the rectangular frame.
  • the minimum and maximum values of the X axis and the minimum and maximum values of the Y axis constitute the coordinates of the upper left corner and the lower right corner of the rectangular frame, thereby obtaining a standard rectangular frame.
  • For example, the coordinates of the four vertices of the smallest rectangular frame surrounding a certain string combination, obtained through the text positioning network, are A(X1, Y1), B(X1, Y2), C(X2, Y1) and D(X2, Y2); according to the sizes of X1, X2, Y1 and Y2, the coordinates of the upper-left and lower-right corners of the rectangle are selected.
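  • As a small illustration of this min/max selection (the function name is mine, not the patent's), the four vertex coordinates can be reduced to a standard upper-left/lower-right box as follows, assuming image coordinates where y grows downward:

```python
def to_standard_box(vertices):
    """Convert four vertex coordinates (in any order) into the upper-left
    and lower-right corners of an axis-aligned rectangular frame.

    The X minimum/maximum and Y minimum/maximum of the vertices form the
    two corners, yielding a standard rectangle.
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys)), (max(xs), max(ys))
```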
  • a rectangular frame is determined according to the position information, and a cell picture is determined according to the rectangular frame.
  • the present application performs image segmentation on the form picture according to the rectangular frame, and cuts out the cell picture corresponding to the rectangular frame from the form picture, wherein each cell picture contains a character string combination.
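  • A minimal sketch of this cutting step, assuming the table picture is a NumPy array and the rectangular frame is given as upper-left and lower-right corners (names are illustrative, not from the patent):

```python
import numpy as np

def crop_cell(table_image, box):
    """Cut the cell picture corresponding to one rectangular frame out of
    the table picture.

    `box` is ((x1, y1), (x2, y2)) with the upper-left corner first;
    array rows are indexed by y and columns by x.
    """
    (x1, y1), (x2, y2) = box
    return table_image[y1:y2, x1:x2]
```

  • Each cropped cell picture would then be fed to the text recognition network to obtain its character content.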
  • the present application inputs the cell picture to the text recognition network to recognize the content of the character string combination in the cell picture to obtain the cell character content.
  • the character recognition network is a classic character recognition CRNN network, and the cell character content that can be edited is obtained through the network.
  • Extracting the first table layout of the table picture according to the position information specifically includes: extracting the coordinate values of the upper-left and lower-right corner points of the rectangular frames from the position information; dividing rectangular frames whose points share the same abscissa into the same column and rectangular frames whose points share the same ordinate into the same row according to those coordinate values; and counting the total number of rows and the total number of columns as the first table layout.
  • the rectangular frame wrapping each character string combination is divided into the positions of the rows and columns corresponding to the table pictures according to the overlap ratio of the position information in the horizontal direction and the vertical direction.
  • the ordinates of the vertices of the rectangular boxes in the same row are the same or similar
  • the abscissas of the rectangular boxes in the same column are the same or similar.
  • This application can determine that two points are in the same row when their ordinates are the same or differ by less than a preset range, and that two points are in the same column when their abscissas are the same or differ by less than a preset range. Accordingly, vertices of rectangular frames with the same or similar ordinates are divided into the same row, and those with the same or similar abscissas into the same column.
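  • The "same or similar" grouping can be sketched as a one-dimensional clustering with a tolerance (a hypothetical helper of mine, not the patent's code); applied to the upper-left ordinates it yields the row count, and applied to the abscissas it yields the column count:

```python
def group_coords(values, tol=5):
    """Cluster coordinate values into rows (ordinates) or columns
    (abscissas): values whose gap to the previous sorted value is
    within `tol` fall into the same group.

    Returns (number_of_groups, mapping from value to group index).
    """
    groups = {}
    index = -1
    last = None
    for v in sorted(values):
        if last is None or v - last > tol:
            index += 1  # gap exceeds tolerance: start a new row/column
        groups[v] = index
        last = v
    return index + 1, groups
```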
  • the first table layout includes at least the number of rows and columns of the table.
  • The name content (title) of the table has a text length that spans columns, so it can be removed first.
  • Generating a table file of the table picture according to the first table layout and the cell character content specifically includes: drawing a table according to the first table layout; and filling the cell character content correspondingly into the cells of the drawn table to generate the table file of the table picture.
  • The table corresponding to the table picture is drawn, and the table contains the same number of cells as there are character string combinations. Further, this application fills the recognized cell character content into the cells of the table to generate a table file, whose content can be saved in csv or json format for data analysis and processing by a program, thereby realizing the analysis of the table picture.
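  • The drawing-and-filling step, for the csv case, can be sketched like this (the `(row, col) -> string` cell mapping is my own assumed interface, not specified by the patent):

```python
import csv
import io

def write_table_csv(layout, cells):
    """Draw a table of `layout = (rows, cols)` and fill each cell with its
    recognized character content, then serialize the table as CSV text.

    `cells` maps (row, col) -> string; cells with no recognized text
    are left empty.
    """
    rows, cols = layout
    buf = io.StringIO()
    writer = csv.writer(buf)
    for r in range(rows):
        writer.writerow([cells.get((r, c), "") for c in range(cols)])
    return buf.getvalue()
```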
  • Before the form picture is input to the pre-trained text positioning network and the position information of the characters in the form picture is obtained, the method further includes: detecting whether the form picture contains grid lines; if it does, extracting the second table layout of the form picture; and comparing the second table layout with the first table layout, the first table layout being verified as valid when the comparison shows that the two layouts are consistent.
  • The second table layout can be extracted through morphological opening and closing operations from image processing.
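  • As a hedged sketch of the opening idea (a pure-NumPy toy, whereas a real pipeline would typically use a library such as OpenCV): eroding a binary image with a wide 1 x `min_len` horizontal structuring element and dilating back keeps only long horizontal runs, i.e. candidate horizontal table lines; a tall vertical element works symmetrically for vertical lines.

```python
import numpy as np

def horizontal_lines(binary, min_len=5):
    """Opening with a 1 x min_len horizontal structuring element.

    A pixel survives erosion only if it sits in a run of `min_len`
    consecutive set pixels in its row; the dilation back is folded into
    the same pass by re-marking the whole surviving window.
    """
    h, w = binary.shape
    opened = np.zeros_like(binary)
    for y in range(h):
        for x in range(w - min_len + 1):
            if binary[y, x:x + min_len].all():
                opened[y, x:x + min_len] = 1
    return opened
```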
  • the present application can verify the reliability of the first table layout and the second table layout by comparing the first table layout with the second table layout.
  • The present application may also calculate a comparison result of the second table layout and the first table layout, expressed as the difference between the first table layout and the second table layout. When the comparison result shows that the number of points of difference between the two layouts is greater than a preset value, the text positioning network is retrained to improve the recognition accuracy of the solution.
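  • The retraining trigger reduces to a simple threshold check. In this sketch (my own simplification: a layout is just its (rows, cols) pair, and the preset value is a hypothetical default of 0) the points of difference are the mismatching layout components:

```python
def needs_retraining(first_layout, second_layout, preset=0):
    """Count the points of difference between the two table layouts
    (here, mismatches between their (rows, cols) components) and flag
    the text positioning network for retraining when that count
    exceeds the preset value.
    """
    diffs = sum(1 for a, b in zip(first_layout, second_layout) if a != b)
    return diffs > preset
```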
  • the present application provides a form image analysis device based on text positioning recognition, including:
  • the input module 11 is used to input form pictures to a pre-trained text positioning network to obtain position information of characters in the form pictures.
  • the segmentation module 12 is configured to perform graphic segmentation on the table picture according to the position information, segment the cell picture corresponding to the position information, and input the cell picture into a pre-trained text recognition network for character recognition to obtain Cell character content.
  • the extraction module 13 is configured to extract the first table layout of the table picture according to the position information.
  • the generating module 14 is configured to generate a table file of the table picture according to the first table layout and the cell character content.
  • an embodiment of the present application provides a computer-readable storage medium, and the computer-readable storage medium may be a non-volatile readable storage medium.
  • The computer-readable storage medium stores computer-readable instructions, and when the instructions are executed by a processor, the table analysis method based on text positioning and recognition according to any one of the technical solutions is implemented.
  • The computer-readable storage medium includes but is not limited to any type of disk (including floppy disk, hard disk, optical disk, CD-ROM and magneto-optical disk), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards or optical cards.
  • A storage device includes any medium that stores or transmits information in a form readable by a device (for example, a computer or mobile phone), and may be a read-only memory, a magnetic disk or an optical disk.
  • The computer-readable storage medium provided by the embodiment of the application can realize: inputting form pictures to a pre-trained text positioning network to obtain the position information of the characters in the form pictures; performing graphic segmentation on the form pictures according to the position information, segmenting out the cell pictures corresponding to the position information, and inputting the cell pictures into a pre-trained text recognition network for character recognition to obtain the cell character content; extracting the first table layout of the table picture according to the position information; and generating a table file of the table picture according to the first table layout and the cell character content.
  • the established deep learning model can be used to locate and recognize the text in the table image, which improves the efficiency and accuracy of the table image recognition.
  • the present application provides a computer device.
  • the computer device includes a processor 303, a memory 305, an input unit 307, and a display unit 309.
  • the memory 305 may be used to store the application program 301 and various functional modules, and the processor 303 runs the application program 301 stored in the memory 305 to execute various functional applications and data processing of the device.
  • the memory 305 may be internal memory or external memory, or include both internal memory and external memory.
  • the internal memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory.
  • External storage can include hard disks, floppy disks, ZIP disks, USB flash drives, tapes, etc.
  • the memory disclosed in this application includes but is not limited to these types of memory.
  • the memory 305 disclosed in this application is merely an example and not a limitation.
  • The input unit 307 is used to receive input signals and the keywords entered by the user.
  • the input unit 307 may include a touch panel and other input devices.
  • The touch panel can collect the user's touch operations on or near it (for example, operations performed on or near the touch panel with a finger, stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program; other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as playback control buttons and switch buttons), a trackball, a mouse and a joystick.
  • the display unit 309 may be used to display information input by the user or information provided to the user and various menus of the computer device.
  • the display unit 309 may take the form of a liquid crystal display, an organic light emitting diode, or the like.
  • The processor 303 is the control center of the computer device. It connects the various parts of the entire computer through various interfaces and lines, and executes various functions and processes data by running or executing the software programs and/or modules stored in the memory 305 and calling the data stored in the memory.
  • The one or more processors 303 shown in FIG. 5 can execute and realize the functions of the input module 11, the segmentation module 12, the extraction module 13, and the generation module 14 shown in FIG. 4.
  • the computer device includes a memory 305 and a processor 303.
  • the memory 305 stores computer-readable instructions.
  • the processor 303 executes the steps of a table analysis method based on character positioning recognition described in the above embodiment.
  • The computer device provided by the embodiment of the application can input form pictures to a pre-trained text positioning network to obtain position information of characters in the form pictures; perform graphic segmentation on the form pictures according to the position information, segment out the cell pictures corresponding to the position information, and input the cell pictures into a pre-trained text recognition network for character recognition to obtain the cell character content; extract the first table layout of the table picture according to the position information; and generate a table file of the table picture according to the first table layout and the cell character content.
  • the established deep learning model can be used to locate and recognize the text in the table image, which improves the efficiency and accuracy of the table image recognition.
  • The present application can also detect whether the table picture contains grid lines; if it does, extract a second table layout of the table picture, compare the second table layout with the first table layout, and, when the comparison shows that the first table layout is consistent with the second table layout, confirm that the first table layout is valid.
  • In other words, when table lines are present in the table picture, they are extracted directly, and the first table layout obtained earlier is compared against the second table layout formed by the extracted table lines to verify whether the first table layout is valid.
  • Because this application parses table pictures with a text positioning network and a text recognition network, it handles pictures with no table lines, complete table lines, or incomplete table lines alike, and therefore has a wide scope of application.
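The grid-line check described above can be sketched with plain array operations. Treat this as an illustrative assumption about one way to extract a second table layout and compare it with the first, not the method claimed in the application; `detect_grid_lines`, `layouts_consistent`, the 0.8 ink-ratio threshold, and the 2-pixel tolerance are all hypothetical choices:

```python
import numpy as np

def detect_grid_lines(binary, ratio=0.8):
    """Find candidate grid lines in a binarized table picture.

    `binary` is a 2-D array holding 1 for ink and 0 for background; any row
    or column whose ink ratio reaches `ratio` is taken to be a table line,
    and together these lines form a 'second table layout'.
    """
    row_lines = np.where(binary.mean(axis=1) >= ratio)[0].tolist()
    col_lines = np.where(binary.mean(axis=0) >= ratio)[0].tolist()
    return row_lines, col_lines

def layouts_consistent(first, second, tol=2):
    """Validate one layout against another: every line coordinate in the
    first layout must have a counterpart in the second within `tol` pixels."""
    return all(any(abs(a - b) <= tol for b in second) for a in first)
```

Running `layouts_consistent` on the row coordinates and then the column coordinates of the two layouts gives the validity check; when the picture has no grid lines, `detect_grid_lines` simply returns empty lists and the first layout is used as-is.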
  • The computer-readable storage medium provided in the embodiments of the present application can implement the above-described embodiments of the table parsing method based on text positioning and recognition.
  • The aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (ROM), or it may be a random access memory (RAM), etc.

Abstract

The present invention relates to a table parsing method based on character positioning and recognition. The method comprises the steps of: inputting a table picture into a pre-trained text positioning network to obtain position information of the characters in the table picture (S11); performing graphic segmentation on the table picture according to the position information to obtain cell pictures corresponding to the position information, and inputting the cell pictures into a pre-trained text recognition network for character recognition so as to obtain the cell character content (S12); extracting a first table layout of the table picture according to the position information (S13); and generating a table file of the table picture according to the first table layout and the cell character content (S14). An established deep learning model can be used to locate and recognize characters in a table picture, thereby improving the efficiency and accuracy of table picture recognition.
PCT/CN2019/118422 2019-02-13 2019-11-14 Table parsing method based on character positioning and recognition, and medium and computer device WO2020164281A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910115364.7A CN109961008A (zh) 2019-02-13 2019-02-13 Table parsing method based on text positioning and recognition, and medium and computer device
CN201910115364.7 2019-02-13

Publications (1)

Publication Number Publication Date
WO2020164281A1 true WO2020164281A1 (fr) 2020-08-20

Family

ID=67023672

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118422 WO2020164281A1 (fr) 2019-02-13 2019-11-14 Table parsing method based on character positioning and recognition, and medium and computer device

Country Status (2)

Country Link
CN (1) CN109961008A (fr)
WO (1) WO2020164281A1 (fr)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111985459A (zh) * 2020-09-18 2020-11-24 Table image correction method and device, electronic equipment and storage medium
CN112132794A (zh) * 2020-09-14 2020-12-25 Text positioning method, device and equipment for audit videos, and readable storage medium
CN112200117A (zh) * 2020-10-22 2021-01-08 Table recognition method and device
CN112364726A (zh) * 2020-10-27 2021-02-12 Method for locating inkjet-printed characters on parts based on an improved EAST
CN112686258A (zh) * 2020-12-10 2021-04-20 Method and device for structuring physical examination report information, readable storage medium and terminal
CN112712014A (zh) * 2020-12-29 2021-04-27 Table picture structure parsing method, system and equipment, and readable storage medium
CN113128490A (zh) * 2021-04-28 2021-07-16 Prescription information scanning and automatic recognition method
CN113378789A (zh) * 2021-07-08 2021-09-10 Cell position detection method and device, and electronic equipment
CN113392811A (zh) * 2021-07-08 2021-09-14 Table extraction method and device, electronic equipment and storage medium
CN113538291A (zh) * 2021-08-02 2021-10-22 Card image tilt correction method and device, computer equipment and storage medium
CN114612921A (zh) * 2022-05-12 2022-06-10 Form recognition method and device, electronic equipment and computer-readable medium
CN115841679A (zh) * 2023-02-23 2023-03-24 Drawing table extraction method and system, computer and readable storage medium
CN113538291B (zh) * 2021-08-02 2024-05-14 Card image tilt correction method and device, computer equipment and storage medium

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109961008A (zh) * 2019-02-13 2019-07-02 Table parsing method based on text positioning and recognition, and medium and computer device
CN110334647A (zh) * 2019-07-03 2019-10-15 Parameter formatting method based on image recognition
CN110347994B (zh) * 2019-07-12 2023-06-30 Table processing method and device
CN110532968B (zh) * 2019-09-02 2023-05-23 Table recognition method and device, and storage medium
CN110826393B (zh) * 2019-09-17 2022-12-30 Automatic extraction method for borehole histogram information
CN110956087B (zh) * 2019-10-25 2024-04-19 Method and device for recognizing tables in pictures, readable medium and electronic equipment
CN110895696A (zh) * 2019-11-05 2020-03-20 Image information extraction method and device
CN111178353A (zh) * 2019-12-16 2020-05-19 Image text positioning method and device
CN111368744B (zh) * 2020-03-05 2023-06-27 Method and device for recognizing unstructured tables in pictures
CN111382717B (zh) * 2020-03-17 2022-09-09 Table recognition method and device, and computer-readable storage medium
CN111428723B (zh) * 2020-04-02 2021-08-24 Character recognition method and device, electronic equipment, and storage medium
CN111639637B (zh) * 2020-05-29 2023-08-15 Table recognition method and device, electronic equipment and storage medium
CN111753727B (zh) * 2020-06-24 2023-06-23 Method, device and equipment for extracting structured information, and readable storage medium
CN111783735B (zh) * 2020-07-22 2021-01-22 Steel document parsing system based on artificial intelligence
CN112149506A (zh) * 2020-08-25 2020-12-29 Method, equipment and storage medium for generating tables from images by combining RPA and AI
CN113807158A (zh) * 2020-12-04 2021-12-17 PDF content extraction method, device and equipment
CN112541332B (zh) * 2020-12-08 2023-06-23 Form information extraction method and device, electronic equipment and storage medium
CN112733855B (zh) * 2020-12-30 2024-04-09 Table structuring method, table recovery equipment, and device with storage function
CN113553892A (zh) * 2020-12-31 2021-10-26 Method for extracting results from test and physical examination reports based on deep learning and OCR
CN113065405B (zh) * 2021-03-08 2022-12-23 Picture recognition method and device, computer equipment and storage medium
CN113297308B (zh) * 2021-03-12 2023-09-22 Table structured information extraction method and device, and electronic equipment
CN112906695B (zh) * 2021-04-14 2022-03-08 Table recognition method adapted to multiple types of OCR interfaces, and related equipment
CN113112567A (zh) * 2021-04-16 2021-07-13 Method and device for generating editable flowcharts, electronic equipment and storage medium
CN113298167A (zh) * 2021-06-01 2021-08-24 Text detection method and system based on a lightweight neural network model
CN113609906A (zh) * 2021-06-30 2021-11-05 Document-oriented table information extraction method
CN113569677A (zh) * 2021-07-16 2021-10-29 Method for generating paper test reports from scanned documents
CN113989822B (zh) * 2021-12-24 2022-03-08 Method for extracting table content from pictures based on computer vision and natural language processing

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101908136A (zh) * 2009-06-08 2010-12-08 Table recognition processing method and system
US20150169972A1 (en) * 2013-12-12 2015-06-18 Aliphcom Character data generation based on transformed imaged data to identify nutrition-related data or other types of data
CN105512611A (zh) * 2015-11-25 2016-04-20 Table image detection and recognition method
CN108805076A (zh) * 2018-06-07 2018-11-13 Method and system for extracting table text from environmental impact assessment reports
CN109961008A (zh) * 2019-02-13 2019-07-02 Table parsing method based on text positioning and recognition, and medium and computer device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105426856A (zh) * 2015-11-25 2016-03-23 Image table text recognition method


Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112132794A (zh) * 2020-09-14 2020-12-25 Text positioning method, device and equipment for audit videos, and readable storage medium
CN111985459B (zh) * 2020-09-18 2023-07-28 Table image correction method and device, electronic equipment and storage medium
CN111985459A (zh) * 2020-09-18 2020-11-24 Table image correction method and device, electronic equipment and storage medium
CN112200117A (zh) * 2020-10-22 2021-01-08 Table recognition method and device
CN112200117B (zh) * 2020-10-22 2023-10-13 Table recognition method and device
CN112364726A (zh) * 2020-10-27 2021-02-12 Method for locating inkjet-printed characters on parts based on an improved EAST
CN112686258A (zh) * 2020-12-10 2021-04-20 Method and device for structuring physical examination report information, readable storage medium and terminal
CN112712014A (zh) * 2020-12-29 2021-04-27 Table picture structure parsing method, system and equipment, and readable storage medium
CN112712014B (zh) * 2020-12-29 2024-04-30 Table picture structure parsing method, system and equipment, and readable storage medium
CN113128490A (zh) * 2021-04-28 2021-07-16 Prescription information scanning and automatic recognition method
CN113128490B (zh) * 2021-04-28 2023-12-05 Prescription information scanning and automatic recognition method
CN113392811B (zh) * 2021-07-08 2023-08-01 Table extraction method and device, electronic equipment and storage medium
CN113378789B (zh) * 2021-07-08 2023-09-26 Cell position detection method and device, and electronic equipment
CN113392811A (zh) * 2021-07-08 2021-09-14 Table extraction method and device, electronic equipment and storage medium
CN113378789A (zh) * 2021-07-08 2021-09-10 Cell position detection method and device, and electronic equipment
CN113538291A (zh) * 2021-08-02 2021-10-22 Card image tilt correction method and device, computer equipment and storage medium
CN113538291B (zh) * 2021-08-02 2024-05-14 Card image tilt correction method and device, computer equipment and storage medium
CN114612921B (zh) * 2022-05-12 2022-07-19 Form recognition method and device, electronic equipment and computer-readable medium
CN114612921A (zh) * 2022-05-12 2022-06-10 Form recognition method and device, electronic equipment and computer-readable medium
CN115841679A (zh) * 2023-02-23 2023-03-24 Drawing table extraction method and system, computer and readable storage medium
CN115841679B (zh) * 2023-02-23 2023-05-05 Drawing table extraction method and system, computer and readable storage medium

Also Published As

Publication number Publication date
CN109961008A (zh) 2019-07-02

Similar Documents

Publication Publication Date Title
WO2020164281A1 (fr) Table parsing method based on character positioning and recognition, and medium and computer device
CN106056996B (zh) 一种多媒体交互教学系统及方法
WO2020164267A1 (fr) Text classification model construction method and apparatus, terminal, and storage medium
WO2020107765A1 (fr) Statement analysis processing method, apparatus and device, and computer-readable storage medium
WO2020253112A1 (fr) Test strategy acquisition method, device, terminal, and readable storage medium
WO2014069741A1 (fr) Automatic scoring apparatus and method
WO2012161359A1 (fr) Method and device for a user interface
WO2019156332A1 (fr) Device for producing an artificial-intelligence character for augmented reality, and service system using same
WO2020107761A1 (fr) Advertising copy processing method, apparatus and device, and computer-readable storage medium
WO2018090740A1 (fr) Method and apparatus for implementing companionship on the basis of mixed-reality technology
WO2015065006A1 (fr) Multimedia apparatus, online education system, and method whereby same provide education content
WO2011068284A1 (fr) Method and system for controlling a language-learning electronic device, and simultaneous interpretation system using same
WO2020159140A1 (fr) Electronic device and control method therefor
WO2016182393A1 (fr) Method and device for analyzing user emotion
WO2020134114A1 (fr) Storage device, and verification-code implementation method, device and equipment
WO2012034469A1 (fr) Gesture-based human-machine interaction system and method, and computer storage medium
WO2023224433A1 (fr) Information generation method and device
CN112016077A (zh) 一种基于滑动轨迹模拟的页面信息获取方法、装置和电子设备
WO2020045909A1 (fr) Apparatus and method for user-interface-integrated software for multiple selection and operation of non-consecutive segmented information
WO2015109772A1 (fr) Data processing device and data processing method
WO2020022645A1 (fr) Method and electronic device for configuring a touchscreen keyboard
CN111860083A (zh) 一种人物关系补全方法及装置
WO2021177719A1 (fr) Method for operating a translation platform
WO2021003922A1 (fr) Page information input optimization method, device and apparatus, and storage medium
WO2022145723A1 (fr) Layout detection method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19915547

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 05.10.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19915547

Country of ref document: EP

Kind code of ref document: A1