WO2020186779A1 - Picture information recognition method, apparatus, computer device and storage medium - Google Patents

Picture information recognition method, apparatus, computer device and storage medium

Info

Publication number
WO2020186779A1
Authority
WO
WIPO (PCT)
Prior art keywords
chart
business
blank
picture
service
Prior art date
Application number
PCT/CN2019/117377
Other languages
English (en)
French (fr)
Inventor
孙强
陆凯杰
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020186779A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition

Definitions

  • This application relates to a method, device, computer equipment and storage medium for identifying picture information.
  • a method, apparatus, computer equipment, and storage medium for identifying picture information are provided.
  • A method for identifying picture information, executed by a computer device, the method comprising: receiving a business picture uploaded by a first terminal; when the business picture contains chart information, determining the chart type corresponding to the business picture; if the chart type is a first type, extracting the chart lines in the business picture and splicing multiple chart lines to obtain a first chart, the first chart including a plurality of blank cells; identifying the information text corresponding to each blank cell; converting the first chart into a second chart, the second chart including a plurality of standard cells; and determining the correspondence between the standard cells and the blank cells, filling the information text into the standard cells according to the correspondence to obtain the business chart corresponding to the business picture, and storing the business chart in a chart library.
  • Receiving a business picture uploaded by the first terminal includes: receiving a service request sent by the first terminal, the service request carrying a service type; obtaining a source service page queue corresponding to the service type, the source service page queue including feature pages and each feature page including blank cells; returning the source service page queue to the first terminal, so that the first terminal displays the source service page queue, collects business data when a feature page is displayed, and fills the collected business data into the blank cells of the feature page to generate a target service page queue; receiving the target service page queue sent by the first terminal, and extracting business data from the target service page queue, the business data including a business document; and scanning the business document to obtain at least one business picture containing chart information.
  • The method further includes: receiving a chart query request sent by a second terminal based on the business document; searching the chart library for the corresponding business chart according to the query field included in the chart query request; acquiring the layout information of the business chart in the business document; and returning the business chart and the corresponding layout information to the second terminal, so that the second terminal quickly locates the business chart in the business document according to the layout information and replaces the corresponding business picture with the acquired business chart.
  • Extracting the chart lines in the business picture and splicing multiple chart lines to obtain the first chart includes: performing horizontal line detection on the business picture according to a horizontal erosion-dilation algorithm to obtain multiple horizontal lines; performing vertical line detection on the business picture according to a vertical erosion-dilation algorithm to obtain multiple vertical lines; intersecting the horizontal lines and the vertical lines to obtain a table line graph; and filtering the non-cell elements in the table line graph by edge detection to obtain the first chart.
  • Identifying the information text corresponding to each blank cell includes: cropping the information block diagram in each blank cell; and inputting the information block diagram into a preset convolutional neural network model to recognize the information text corresponding to each information block diagram.
  • Determining the blank cell that each standard cell matches includes: determining the starting point coordinates of each standard cell, and traversing the second chart according to the starting point coordinates; querying whether a blank cell exists with the same starting point coordinates as the standard cell at the current position in the traversal order; if so, marking the blank cell with the same starting point coordinates as the blank cell that matches that standard cell; otherwise, marking the blank cell that matches the standard cell in the previous column of the same row, or in the previous row of the same column, as the blank cell that matches the current standard cell; and taking the next standard cell in the traversal order as the current standard cell and returning to the querying step, until the traversal of the second chart is complete.
  • A picture information recognition device, comprising: a picture recognition module, configured to receive a business picture uploaded by a first terminal and, when the business picture contains chart information, determine the chart type corresponding to the business picture; a table reconstruction module, configured to, if the chart type is the first type, extract the chart lines in the business picture and splice multiple chart lines to obtain a first chart, the first chart including a plurality of blank cells, and to convert the first chart into a second chart, the second chart including a plurality of standard cells; and a text mapping module, configured to identify the information text corresponding to each blank cell, determine the correspondence between the standard cells and the blank cells, fill the information text into the standard cells according to the correspondence to obtain the business chart corresponding to the business picture, and store the business chart in the chart library.
  • The picture recognition module is further configured to: receive a service request sent by the first terminal, the service request carrying a service type; obtain the source service page queue corresponding to the service type, the source service page queue including feature pages and each feature page including blank cells; return the source service page queue to the first terminal, so that the first terminal displays the source service page queue, collects business data when a feature page is displayed, and fills the collected business data into the blank cells of the feature page to generate a target service page queue; receive the target service page queue sent by the first terminal, and extract business data from the target service page queue, the business data including business files; and scan the business files to obtain at least one business picture containing chart information.
  • A computer device includes a memory and one or more processors. The memory stores computer-readable instructions which, when executed by the one or more processors, implement the steps of the picture information recognition method provided in any one of the embodiments of the present application.
  • One or more non-volatile computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to implement the steps of the picture information recognition method provided in any one of the embodiments of the present application.
  • Fig. 1 is an application scenario diagram of a method for identifying picture information according to one or more embodiments.
  • Fig. 2 is a schematic flowchart of a method for identifying picture information according to one or more embodiments.
  • Fig. 3 is a schematic flowchart of steps for reconstructing the first graph in one or more embodiments.
  • Fig. 4 is a structural block diagram of an apparatus for identifying picture information according to one or more embodiments.
  • Fig. 5 is a block diagram of a computer device according to one or more embodiments.
  • the image information recognition method provided in this application can be applied to the application environment shown in FIG. 1.
  • the first terminal 102 and the server 104 communicate through the network, and the second terminal 106 and the server 104 communicate through the network.
  • the first terminal 102 and the second terminal 106 can be, but are not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • The server 104 can be implemented as an independent server or as a server cluster composed of multiple servers.
  • the service picture can be uploaded.
  • Business pictures can contain chart information.
  • the first terminal 102 uploads the service picture to the server 104.
  • The server 104 recognizes the chart type of the chart included in the business picture based on a preset convolutional neural network model.
  • the chart types include the first type, the second type, and so on.
  • the server 104 extracts chart lines in the business picture, and splices multiple chart lines to obtain the first chart.
  • the first chart includes multiple blank cells.
  • the server 104 recognizes the information text corresponding to each blank cell.
  • the server 104 maps the first chart to the corresponding second chart.
  • the second chart includes multiple standard grids.
  • the server 104 determines the matching relationship between the standard cells and the blank cells, that is, determines the blank cells that each standard cell matches.
  • the server 104 fills the information text corresponding to the blank grid into the matching standard grid to obtain the business chart corresponding to the business picture, and stores the business chart in the chart library.
  • When it subsequently receives a chart query request sent by the second terminal 106, the server 104 responds to the request based on the chart library.
  • the first terminal 102 and the second terminal 106 may be the same terminal.
  • the server extracts the text information contained in the business picture uploaded by the user, and restores the display mode of the text information in the form of a chart.
  • When the user queries the business picture, the text information in the picture can be used directly, which greatly improves the efficiency of obtaining picture information.
  • a method for recognizing picture information is provided. Taking the method applied to the server in FIG. 1 as an example for description, the method includes the following steps:
  • Step 202 Receive a service picture uploaded by the first terminal.
  • a business platform is installed on the first terminal.
  • Business material is uploaded through the business platform on the first terminal.
  • The business material can be a business document or a business picture.
  • the business file may contain one or more business pictures, and at least one business picture records chart information.
  • Business pictures can be screenshots, photos, etc.
  • the server performs binarization processing on the received business pictures containing chart information to convert the color business pictures into black and white pictures.
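The binarization step can be sketched as follows. This is a minimal illustration, assuming the picture is already available as a grid of grayscale values from 0 to 255; the function name `binarize` and the fixed threshold of 128 are assumptions, since the text does not say how the threshold is chosen.

```python
def binarize(gray, threshold=128):
    """Map each grayscale value (0-255) to 0 (black) or 255 (white).

    The fixed threshold is an assumption made for this sketch.
    """
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


# Hypothetical 2x3 grayscale picture: light background, dark strokes.
gray = [
    [250, 240, 30],
    [245, 20, 25],
]
black_white = binarize(gray)
print(black_white)
```

A production system would more likely pick the threshold adaptively, but a fixed value keeps the idea visible.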
  • Step 204 When the business picture contains chart information, determine the chart type corresponding to the business picture.
  • The chart in the business picture can be an Excel table with table lines, an Excel table without table lines but laid out in table form, a column chart, a line chart, and so on.
  • the server trains the initial model based on sample pictures containing different types of charts to obtain an image processing model.
  • the initial model can be a Convolutional Neural Network (CNN) model.
  • the server inputs the binarized business pictures into the image processing model to obtain various picture information of the business pictures.
  • Picture information includes picture location, chart type, and graphic element information.
  • the picture location refers to the page number information of the business picture in the business file.
  • the primitive information includes primitive fields and primitive coordinates.
  • Step 206 If the chart type is the first type, extract the chart lines in the business picture and splice multiple chart lines to obtain a first chart; the first chart includes a plurality of blank cells.
  • The server extracts the horizontal and vertical lines in the business picture through erosion and dilation, and intersects the horizontal and vertical lines according to their coordinate positions to obtain the first chart.
  • The first chart includes multiple blank cells (also referred to as blank grids). Note that the first chart can include merged cells.
  • Step 208 Identify the information text corresponding to each blank cell.
  • Identifying the information text corresponding to each blank cell includes: cropping the information block diagram in each blank cell; and inputting the information block diagram into a preset convolutional neural network model to recognize the information text corresponding to each information block diagram.
  • The cell picture of each blank cell (denoted as the information block diagram) is cropped out according to the cell coordinates.
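Cropping a cell picture by its coordinates can be sketched as below. The picture is modelled as a plain list of pixel rows, and the `(x, y, width, height)` cell layout and the name `crop_cell` are assumptions made for illustration. The recognition of the cropped picture by the convolutional neural network is not shown.

```python
def crop_cell(image, x, y, width, height):
    """Return the sub-picture covered by a cell, given its top-left corner and size."""
    return [row[x:x + width] for row in image[y:y + height]]


# Hypothetical 4x4 picture whose pixel values are just running numbers.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
block = crop_cell(image, x=1, y=1, width=2, height=2)
print(block)
```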
  • Step 210 Convert the first chart to the second chart; the second chart includes multiple standard cells.
  • the server determines the maximum number of columns and the maximum number of rows corresponding to the first chart, and generates the second chart according to the maximum number of rows and the maximum number of columns. It is easy to understand that there is no merged cell in the second chart.
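One way to derive the second chart is sketched below, under the assumption that each blank cell is known as `(x, y, width, height)` and that the maximum number of columns and rows equals the number of distinct starting x and y coordinates. The text does not state how the maxima are computed, so this rule and the name `build_standard_grid` are illustrative only.

```python
def build_standard_grid(blank_cells):
    """Return rows of (x, y) starting points, one per standard cell."""
    xs = sorted({x for x, y, w, h in blank_cells})  # distinct column starts
    ys = sorted({y for x, y, w, h in blank_cells})  # distinct row starts
    return [[(x, y) for x in xs] for y in ys]


# Hypothetical first chart: a header merged across two columns, then two cells.
blank_cells = [
    (0, 0, 200, 30),     # merged header
    (0, 30, 100, 30),
    (100, 30, 100, 30),
]
grid = build_standard_grid(blank_cells)
print(len(grid), "rows x", len(grid[0]), "columns")
```

The resulting grid has no merged cells: the merged header occupies two standard cells.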
  • Step 212 Determine the corresponding relationship between the standard grid and the blank grid, fill the information text into the standard grid according to the corresponding relationship, obtain the business chart corresponding to the business picture, and store the business chart in the chart library.
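Filling the recognized text into the standard cells might look like the following sketch. It assumes the correspondence is already available as a mapping from `(row, column)` of a standard cell to a blank-cell identifier; the identifiers, the sample text and the name `fill_chart` are hypothetical. Repeating the text in every standard cell covered by a merged blank cell is a choice made for this sketch, not something the text prescribes.

```python
def fill_chart(match, texts, n_rows, n_cols):
    """Build the business chart as a list of rows of text."""
    return [
        [texts.get(match.get((r, c)), "") for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical correspondence: blank cell "A" spans both columns of row 0.
match = {(0, 0): "A", (0, 1): "A", (1, 0): "B", (1, 1): "C"}
texts = {"A": "Quarterly report", "B": "Region", "C": "Q1"}
business_chart = fill_chart(match, texts, n_rows=2, n_cols=2)
print(business_chart)
```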
  • The method further includes: receiving a chart query request sent by the second terminal based on the business file; searching the chart library for the corresponding business chart according to the query field contained in the chart query request; obtaining the layout information of the business chart in the business file; and returning the business chart and the corresponding layout information to the second terminal, so that the second terminal quickly locates the business chart in the business file according to the layout information and replaces the corresponding business picture with the acquired business chart.
  • When receiving the chart query request sent by the second terminal based on the business file, the server searches the chart library for the business chart containing the query field carried in the request, obtains the picture location corresponding to the business chart, and sends the business chart and the picture location to the second terminal.
  • the second terminal quickly locates the business picture according to the picture location, and replaces the corresponding business picture in the business file with the acquired business chart according to the picture location.
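The query flow above can be sketched with an in-memory chart library. The entry layout (`chart` plus `picture_location`), the substring match on the query field and the name `query_charts` are assumptions for illustration; the text does not specify how the library is stored or searched.

```python
def query_charts(chart_library, query_field):
    """Return (chart, picture_location) pairs whose cells contain the query field."""
    hits = []
    for entry in chart_library:
        cells = (cell for row in entry["chart"] for cell in row)
        if any(query_field in cell for cell in cells):
            hits.append((entry["chart"], entry["picture_location"]))
    return hits


# Hypothetical library with two stored business charts.
chart_library = [
    {"picture_location": 3, "chart": [["Region", "Q1"], ["North", "120"]]},
    {"picture_location": 7, "chart": [["Item", "Price"], ["Pen", "2"]]},
]
hits = query_charts(chart_library, "Region")
print(hits)
```

The picture location returned with each hit is what lets the second terminal jump to the right page and swap the picture for the chart.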
  • The chart type can be determined from the business picture containing chart information uploaded by the first terminal. If the chart type is the first type, the chart lines in the business picture can be extracted and spliced to obtain a first chart that includes multiple blank cells, and the first chart can be mapped to a second chart that includes multiple standard cells. By identifying the information text corresponding to each blank cell and determining the blank cell that each standard cell matches, the information text of each blank cell can be filled into the matching standard cell to obtain the business chart corresponding to the business picture. Because the business chart is stored in the chart library, a chart query request sent by the second terminal can be answered from the chart library.
  • The display mode of the text information can also be restored in chart form for a chart containing merged cells.
  • When users query business pictures, they can directly use the text information in the business pictures, which greatly improves the efficiency of obtaining picture information.
  • Receiving the business picture uploaded by the first terminal includes: receiving the service request sent by the first terminal, the service request carrying the service type; obtaining the source service page queue corresponding to the service type, the source service page queue including feature pages and each feature page including blank cells; returning the source service page queue to the first terminal, so that the first terminal displays the source service page queue, collects business data when a feature page is displayed, and fills the collected business data into the blank cells of the feature page to generate a target service page queue; receiving the target service page queue sent by the first terminal, and extracting business data from the target service page queue, the business data including business files; and scanning the business files to obtain at least one business picture containing chart information.
  • the server returns a service page to the first terminal according to the service request.
  • the business page includes two options of the first business model and the second business model, and the business page also includes options for multiple business types.
  • the first terminal monitors the selection instruction of the service requester on the service mode option and the service type option.
  • the first terminal generates a corresponding business processing request according to the selected instruction, and sends the business processing request to the server.
  • the business processing request includes the business type and business model.
  • the server obtains the source service page queue pre-stored corresponding to the service type.
  • Each source business page queue contains all business pages involved in handling the corresponding business.
  • the source service page queue may be pre-configured by the service organization for simulated service handling when the service platform releases service products.
  • the source business page queue includes a plurality of business pages arranged in an orderly manner. At least one business page in the source business page queue is a feature page containing blank cells.
  • Before obtaining the source service page queue corresponding to the service type, the method further includes: receiving a page recording request sent by the second terminal; monitoring multiple business pages displayed by the second terminal according to the page recording request; adding a page label to each business page and generating the association relationship between the page label and the business page; when a business page contains an input box, replacing the input box with a blank cell; and generating the source service page queue according to the replaced business pages and the association relationships.
  • the source service page queue can be a video, or an animated image that can be automatically switched according to a preset time frequency or other preset conditions.
  • the sequence of multiple business pages in the source business page queue may be determined according to the jump relationship between business pages when performing corresponding business processing.
  • Each business page has a corresponding page label, and the sort order between business pages can be characterized by the association relationship between the page label and the page. For example, if the first business product label of the first business page is triggered, and the detail page of the first business product is displayed, an association relationship between the first business product label of the first business page and the detail page of the first business product is established.
  • the server sends the source service page queue to the first terminal.
  • the first terminal displays the source business page queue, and when the feature page is displayed, the business data is collected, and the collected business data is filled into the blank cell of the feature page to generate the target business page queue with the business data.
  • the service requester makes a designated action in front of the first terminal according to the queue prompt of the source service page, and enters the service data.
  • the business data can be real scene data, such as fingerprint information, facial images, voice authorization information, and recorded video with a hand-held ID card with the characteristic information of the business requester.
  • the first terminal automatically collects business data and automatically fills in the corresponding blank cells.
  • the target business page queue includes the handling instructions of the corresponding business and the characteristic information of the business requester required to handle the business.
  • the first terminal sends the target service page queue to the server.
  • the server extracts business data from the target business page queue, and performs business processing based on the business data.
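The generation of the target page queue on the first terminal can be sketched as follows. Pages are modelled as dictionaries and a feature page is any page with a `blank_cells` entry; these structures, the field names and the sample values are all hypothetical.

```python
def build_target_queue(source_queue, business_data):
    """Copy the source queue and fill each feature page's blank cells."""
    target_queue = []
    for page in source_queue:
        filled = dict(page)  # shallow copy; the source queue is left unchanged
        if "blank_cells" in page:
            filled["blank_cells"] = {
                name: business_data.get(name) for name in page["blank_cells"]
            }
        target_queue.append(filled)
    return target_queue


source_queue = [
    {"label": "product-intro"},
    {"label": "identity-check",
     "blank_cells": {"fingerprint": None, "face_image": None}},
]
collected = {"fingerprint": "fp-sample", "face_image": "face-sample"}
target_queue = build_target_queue(source_queue, collected)
print(target_queue[1]["blank_cells"])
```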
  • The user can enter all the business data required for the requested business at one time, following the prompts of the displayed source service page queue with blank cells, to generate a target service page queue carrying the user's characteristics, and then simply wait for the business processing result fed back from the background. Users do not need to take part in the business processing and input information node by node, so business processing takes up much less of their time.
  • Extracting the chart lines in the business picture and splicing multiple chart lines to obtain the first chart, that is, the step of reconstructing the first chart, includes:
  • Step 302 Perform horizontal line detection on the business picture according to the horizontal erosion-dilation algorithm to obtain multiple horizontal lines.
  • The server performs horizontal line detection on the business picture through horizontal erosion and dilation to obtain multiple horizontal lines, and obtains the line length and line width of each horizontal line.
  • The server determines the horizontal line with the largest line length and takes the range between its two ends as the "horizontal width". Lines that are not within the horizontal width range are filtered out.
  • Step 304 Perform vertical line detection on the business picture according to the vertical erosion-dilation algorithm to obtain multiple vertical lines.
  • The server performs vertical line detection on the business picture through vertical erosion and dilation to obtain multiple vertical lines, and obtains the line length and line width of each vertical line.
  • The server determines the vertical line with the largest line length and takes the range between its two ends as the "vertical width". Lines that are not within the vertical width range are filtered out. Filtering lines outside the horizontal width range or the vertical width range removes redundant lines that lie outside the table.
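The line detection in Steps 302 and 304 can be illustrated in pure Python, reading it as morphological erosion followed by dilation with a long, one-pixel-thick structuring element. This is a sketch under that assumption: the binary picture is a list of rows with 1 for ink, and the kernel length `k` and the function names are illustrative choices, not values from the text.

```python
def erode_h(img, k):
    """Keep a pixel only if it starts a run of k ink pixels in its row."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w - k + 1):
            if all(img[y][x + d] for d in range(k)):
                out[y][x] = 1
    return out


def dilate_h(img, k):
    """Grow every surviving pixel back into a run of k pixels to its right."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for d in range(k):
                    if x + d < w:
                        out[y][x + d] = 1
    return out


def horizontal_lines(img, k):
    """Erosion then dilation: only horizontal runs of at least k pixels remain."""
    return dilate_h(erode_h(img, k), k)


def transpose(img):
    return [list(col) for col in zip(*img)]


def vertical_lines(img, k):
    """Vertical detection is the horizontal one applied to the transposed picture."""
    return transpose(horizontal_lines(transpose(img), k))


# Hypothetical 4x5 binary picture: a table border plus one stray ink pixel.
img = [
    [1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
h_lines = horizontal_lines(img, 4)
v_lines = vertical_lines(img, 4)
print(h_lines)
print(v_lines)
```

The stray pixel in the second row disappears from both results because it is not part of any run of four pixels, which is how short strokes such as text are separated from table lines.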
  • Step 306 Intersect the horizontal line and the vertical line to obtain a table line graph.
  • Step 308 Filter the non-cell elements in the table line graph through edge detection to obtain the first graph.
  • the server intersects the horizontal and vertical lines to obtain a table line graph.
  • the server obtains the starting point coordinates, cell width, and cell height of each cell in the table line graph according to the edge detection.
  • the server can identify the non-cell elements in the table line chart, filter out the detected non-cell elements, and get the first chart.
  • the cell width and cell height must both be greater than 15 pixels.
  • A redundant vertical line is detected by edge detection as a rectangle with a very small width, so it can be filtered out by the cell width requirement.
  • Likewise, a redundant horizontal line is detected as a rectangle with a very small height, so it can be filtered out by the cell height requirement.
  • The business chart in the business picture is reconstructed by the erosion-dilation algorithm and the messy non-chart elements it contains are filtered out, which provides the user with a clean business chart and further improves the efficiency of obtaining picture information.
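The size-based filter described above can be sketched as follows. The 15-pixel limit comes from the text; the `(x, y, width, height)` rectangle layout, the sample rectangles and the name `filter_cells` are assumptions for illustration.

```python
MIN_SIDE = 15  # pixels; width and height must both be greater than this


def filter_cells(rects, min_side=MIN_SIDE):
    """Keep only rectangles large enough to be real cells."""
    return [r for r in rects if r[2] > min_side and r[3] > min_side]


# Hypothetical edge-detection output: two cells and two redundant lines.
rects = [
    (0, 0, 120, 40),   # real cell
    (120, 0, 3, 40),   # redundant vertical line: tiny width
    (0, 40, 120, 2),   # redundant horizontal line: tiny height
    (0, 42, 60, 30),   # real cell
]
cells = filter_cells(rects)
print(cells)
```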
  • Determining the blank cell matching each standard cell includes: determining the starting point coordinates of each standard cell, and traversing the second chart according to the starting point coordinates; querying whether a blank cell exists with the same starting point coordinates as the standard cell at the current position in the traversal order; if so, marking the blank cell with the same starting point coordinates as the blank cell that matches that standard cell; otherwise, marking the blank cell that matches the standard cell in the previous column of the same row, or in the previous row of the same column, as the blank cell that matches the current standard cell; and taking the next standard cell in the traversal order as the current standard cell and returning to the querying step, until the traversal of the second chart is complete.
  • the server determines the starting point coordinates of each standard grid, and traverses the standard grids according to the starting point coordinates. The position of the upper left corner of each cell can be used as the starting point coordinates.
  • the server queries whether there are blank cells with the same starting point coordinates in the standard cells in the current traversal sequence. If yes, the server marks the blank cells with the same starting point coordinates as blank cells that match the corresponding standard cells.
  • Otherwise, the server marks the blank cell that matches the standard cell in the previous column of the same row, or in the previous row of the same column, as the blank cell matching the current standard cell. Specifically, if a blank cell exists with the same ordinate as the standard cell but a different abscissa, the standard cell is merged with the standard cell in the previous column of the same row, and the server marks the blank cell matching that previous standard cell as the blank cell matching the current standard cell.
  • If a blank cell exists with the same abscissa as the standard cell but a different ordinate, the standard cell is merged with the standard cell in the previous row of the same column, and the server marks the blank cell matching that standard cell as the blank cell matching the current standard cell.
  • the server calculates the degree of intersection between the standard grid and the blank grid. The degree of intersection can be the ratio of the overlapping area of the standard grid and the blank grid. The degree of intersection can be one of 25% or 50%.
  • the server marks the blank cells whose intersection degree meets the preset condition as blank cells that match the corresponding standard cells.
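The degree of intersection can be sketched as an area ratio. The text does not say which area the overlap is divided by, so this sketch assumes the area of the standard cell; the rectangle layout `(x, y, width, height)`, the 50% threshold used in the example and the function names are likewise illustrative.

```python
def overlap_ratio(standard, blank):
    """Overlapping area divided by the area of the standard cell (an assumption)."""
    sx, sy, sw, sh = standard
    bx, by, bw, bh = blank
    overlap_w = max(0, min(sx + sw, bx + bw) - max(sx, bx))
    overlap_h = max(0, min(sy + sh, by + bh) - max(sy, by))
    return (overlap_w * overlap_h) / (sw * sh)


def matches(standard, blank, threshold=0.5):
    """Preset condition of this sketch: the ratio reaches the threshold."""
    return overlap_ratio(standard, blank) >= threshold


standard = (100, 0, 100, 30)
merged_header = (0, 0, 200, 30)      # covers the standard cell completely
cell_below = (100, 30, 100, 30)      # only touches its lower edge
print(overlap_ratio(standard, merged_header))
print(overlap_ratio(standard, cell_below))
```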
  • the server determines the blank grid that matches the standard grid of the next traversal sequence in the above-mentioned manner, until the last standard grid in the second chart.
  • the display mode of the text information in the merged cells can be determined, and the chart including the merged cells can be restored.
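The traversal described above can be sketched as follows. Standard cells are visited row by row; a cell whose starting point coincides with a blank cell matches it directly, and any other cell inherits the match of its left or upper neighbour. Checking that the neighbour's blank cell actually covers the starting point is an extra safeguard added in this sketch, and the data layout and names are hypothetical.

```python
def contains(rect, point):
    """True if the point lies inside the rectangle (x, y, width, height)."""
    x, y, w, h = rect
    px, py = point
    return x <= px < x + w and y <= py < y + h


def match_cells(standard_grid, blank_cells):
    """Map (row, col) of each standard cell to the id of its blank cell."""
    by_start = {(x, y): cid for cid, (x, y, w, h) in blank_cells.items()}
    match = {}
    for r, row in enumerate(standard_grid):
        for c, start in enumerate(row):
            if start in by_start:              # same starting point coordinates
                match[(r, c)] = by_start[start]
                continue
            # Merged cell: try the previous column, then the previous row.
            for neighbour in ((r, c - 1), (r - 1, c)):
                cid = match.get(neighbour)
                if cid is not None and contains(blank_cells[cid], start):
                    match[(r, c)] = cid
                    break
    return match


# Hypothetical first chart: "A" spans two columns, "B" spans two rows.
blank_cells = {
    "A": (0, 0, 200, 30),
    "B": (0, 30, 100, 60),
    "C": (100, 30, 100, 30),
    "D": (100, 60, 100, 30),
}
standard_grid = [
    [(0, 0), (100, 0)],
    [(0, 30), (100, 30)],
    [(0, 60), (100, 60)],
]
match = match_cells(standard_grid, blank_cells)
print(match)
```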
  • At least some of the steps in FIGS. 2 to 3 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily executed at the same time and may be executed at different times; nor are they necessarily executed sequentially, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
  • a picture information recognition device including: a picture recognition module 402, a table reconstruction module 404, and a text mapping module 406, wherein:
  • the picture recognition module 402 is configured to receive the business picture uploaded by the first terminal; when the business picture contains chart information, determine the chart type corresponding to the business picture.
  • The table reconstruction module 404 is configured to, if the chart type is the first type, extract the chart lines in the business picture and splice multiple chart lines to obtain the first chart, the first chart including a plurality of blank cells; and to convert the first chart into the second chart, the second chart including multiple standard cells.
  • The text mapping module 406 is configured to identify the information text corresponding to each blank cell, determine the correspondence between the standard cells and the blank cells, fill the information text into the standard cells according to the correspondence to obtain the business chart corresponding to the business picture, and store the business chart in the chart library.
  • The picture recognition module 402 is further configured to: receive the service request sent by the first terminal, the service request carrying the service type; obtain the source service page queue corresponding to the service type, the source service page queue including feature pages and each feature page including blank cells; return the source service page queue to the first terminal, so that the first terminal displays the source service page queue, collects business data when a feature page is displayed, and fills the collected business data into the blank cells of the feature page to generate the target service page queue; receive the target service page queue sent by the first terminal, and extract business data from the target service page queue, the business data including business files; and scan the business files to obtain at least one business picture containing chart information.
  • the device further includes a picture query module 408, configured to receive a chart query request sent by the second terminal based on the business file; search the chart library for the corresponding business chart according to the query field included in the chart query request; obtain the layout information of the business chart in the business file; and return the business chart and the corresponding layout information to the second terminal, so that the second terminal can quickly locate the business chart in the business file according to the layout information and replace the corresponding business picture with the obtained business chart.
  • the table reconstruction module 404 is further configured to perform horizontal line detection on the business picture according to a horizontal erosion-dilation algorithm to obtain multiple horizontal lines; perform vertical line detection on the business picture according to a vertical erosion-dilation algorithm to obtain multiple vertical lines; intersect the horizontal lines and the vertical lines to obtain a table line map; and filter out the non-cell elements in the table line map through edge detection to obtain the first chart.
  • the text mapping module 406 is also configured to crop the information block image in each blank cell, input the information block images into the preset convolutional neural network model, and recognize the information text corresponding to each information block image.
  • the text mapping module 406 is also configured to determine the start coordinates of each standard cell and traverse the second chart according to the start coordinates; query whether a blank cell exists with the same start coordinates as the standard cell at the current traversal position; if so, mark that blank cell as the blank cell matching the standard cell; otherwise, mark the blank cell matched by the standard cell in the previous column of the same row, or in the previous row of the same column, as the blank cell matching the current standard cell; then take the next standard cell in traversal order as the current one and return to the query step, until the traversal of the second chart is complete.
  • Each module in the above-mentioned picture information recognition device can be implemented in whole or in part by software, hardware and a combination thereof.
  • the foregoing modules may be embedded in, or independent of, the processor of the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the foregoing modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 5.
  • the computer device includes a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device provides computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer equipment is used to store business charts.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • when executed by the processor, the computer-readable instructions implement a picture information recognition method.
  • those skilled in the art will understand that the structure shown in FIG. 5 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine some components, or arrange the components differently.
  • One or more non-volatile storage media storing computer-readable instructions; when the computer-readable instructions are executed by one or more processors, the one or more processors implement the steps of the picture information recognition method provided in any embodiment of the present application.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Processing Or Creating Images (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

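A picture information recognition method, comprising: receiving a business picture uploaded by a first terminal; when the business picture contains chart information, determining the chart type corresponding to the business picture; if the chart type is a first type, extracting the chart lines in the business picture and splicing the multiple chart lines to obtain a first chart, the first chart comprising multiple blank cells; recognizing the information text corresponding to each blank cell; converting the first chart into a second chart, the second chart comprising multiple standard cells; and determining the correspondence between the standard cells and the blank cells, filling the information text into the standard cells according to the correspondence to obtain a business chart corresponding to the business picture, and storing the business chart in a chart library.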

Description

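Picture information recognition method and apparatus, computer device, and storage medium
This application claims priority to Chinese patent application No. 2019102074271, filed with the Chinese Patent Office on March 19, 2019 and entitled "Picture information recognition method and apparatus, computer device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to a picture information recognition method and apparatus, a computer device, and a storage medium.
Background
Text in pictures often needs to be copied and reused in large passages. To improve text-editing efficiency, the conventional approach relies mainly on OCR (Optical Character Recognition) to convert text in picture form into editable text. However, the conventional approach performs only plain text recognition, and its output for charts in a picture is disorganized. Users still cannot directly and quickly copy and use the converted result, which lowers the efficiency of obtaining picture information.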
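Summary
According to various embodiments disclosed in this application, a picture information recognition method and apparatus, a computer device, and a storage medium are provided.
A picture information recognition method, executed by a computer device, the method comprising: receiving a business picture uploaded by a first terminal; when the business picture contains chart information, determining the chart type corresponding to the business picture; if the chart type is a first type, extracting the chart lines in the business picture and splicing the multiple chart lines to obtain a first chart, the first chart comprising multiple blank cells; recognizing the information text corresponding to each blank cell; converting the first chart into a second chart, the second chart comprising multiple standard cells; and determining the correspondence between the standard cells and the blank cells, filling the information text into the standard cells according to the correspondence to obtain a business chart corresponding to the business picture, and storing the business chart in a chart library.
In one embodiment, receiving the business picture uploaded by the first terminal comprises: receiving a business request sent by the first terminal, the business request carrying a business type; obtaining a source business page queue corresponding to the business type, the source business page queue comprising a feature page and the feature page comprising blank units; returning the source business page queue to the first terminal so that the first terminal displays the source business page queue and, when a feature page is displayed, collects business data and fills the collected business data into the blank units of the feature page to generate a target business page queue; receiving the target business page queue sent by the first terminal and extracting business data from the target business page queue, the business data comprising a business file; and scanning the business file to obtain at least one business picture containing chart information.
In one embodiment, the method further comprises: receiving a chart query request sent by a second terminal based on a business file; searching the chart library for the corresponding business chart according to the query field contained in the chart query request; obtaining the layout information of the business chart in the business file; and returning the business chart and the corresponding layout information to the second terminal, so that the second terminal quickly locates the business chart in the business file according to the layout information and replaces the corresponding business picture with the obtained business chart.
In one embodiment, extracting the chart lines in the business picture and splicing the multiple chart lines to obtain the first chart comprises: performing horizontal line detection on the business picture according to a horizontal erosion-dilation algorithm to obtain multiple horizontal lines; performing vertical line detection on the business picture according to a vertical erosion-dilation algorithm to obtain multiple vertical lines; intersecting the horizontal lines and the vertical lines to obtain a table line map; and filtering out the non-cell elements in the table line map through edge detection to obtain the first chart.
In one embodiment, recognizing the information text corresponding to each blank cell comprises: cropping the information block image in each blank cell; and inputting the information block images into a preset convolutional neural network model to recognize the information text corresponding to each information block image.
In one embodiment, determining the blank cell matching each standard cell comprises: determining the start coordinates of each standard cell and traversing the second chart according to the start coordinates; querying whether a blank cell exists with the same start coordinates as the standard cell at the current traversal position; if so, marking the blank cell with the same start coordinates as the blank cell matching that standard cell; otherwise, marking the blank cell matched by the standard cell in the previous column of the same row, or in the previous row of the same column, as the blank cell matching the standard cell at the current traversal position; and taking the next standard cell in traversal order as the current one and returning to the step of querying whether a blank cell with the same start coordinates exists, until the traversal of the second chart is complete.
A picture information recognition apparatus, the apparatus comprising: a picture recognition module, configured to receive a business picture uploaded by a first terminal and, when the business picture contains chart information, determine the chart type corresponding to the business picture; a table reconstruction module, configured to, if the chart type is a first type, extract the chart lines in the business picture and splice the multiple chart lines to obtain a first chart, the first chart comprising multiple blank cells, and to convert the first chart into a second chart, the second chart comprising multiple standard cells; and a text mapping module, configured to recognize the information text corresponding to each blank cell, determine the correspondence between the standard cells and the blank cells, fill the information text into the standard cells according to the correspondence to obtain a business chart corresponding to the business picture, and store the business chart in a chart library.
In one embodiment, the picture recognition module is further configured to receive a business request sent by the first terminal, the business request carrying a business type; obtain a source business page queue corresponding to the business type, the source business page queue comprising a feature page and the feature page comprising blank units; return the source business page queue to the first terminal so that the first terminal displays the source business page queue and, when a feature page is displayed, collects business data and fills the collected business data into the blank units of the feature page to generate a target business page queue; receive the target business page queue sent by the first terminal and extract business data from the target business page queue, the business data comprising a business file; and scan the business file to obtain at least one business picture containing chart information.
A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processor, implement the steps of the picture information recognition method provided in any embodiment of this application.
One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to implement the steps of the picture information recognition method provided in any embodiment of this application.
Details of one or more embodiments of this application are set forth in the accompanying drawings and the description below. Other features and advantages of this application will become apparent from the description, the drawings, and the claims.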
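Brief Description of the Drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is an application scenario diagram of the picture information recognition method according to one or more embodiments.
FIG. 2 is a schematic flowchart of the picture information recognition method according to one or more embodiments.
FIG. 3 is a schematic flowchart of the first-chart reconstruction step according to one or more embodiments.
FIG. 4 is a structural block diagram of the picture information recognition apparatus according to one or more embodiments.
FIG. 5 is a block diagram of the computer device according to one or more embodiments.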
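Detailed Description
To make the technical solutions and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain this application and do not limit it.
The picture information recognition method provided in this application can be applied in the application environment shown in FIG. 1. The first terminal 102 communicates with the server 104 over a network, and the second terminal 106 communicates with the server 104 over a network. The first terminal 102 and the second terminal 106 may each be, but are not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer, or a portable wearable device; the server 104 may be implemented as a standalone server or as a cluster of multiple servers. When a user handles business on the first terminal 102, the user can upload a business picture. The business picture may contain chart information. The first terminal 102 uploads the business picture to the server 104. The server 104 recognizes the chart type of the chart contained in the business picture based on a preset convolutional neural network model. Chart types include a first type, a second type, and so on. When the chart type is the first type, the server 104 extracts the chart lines in the business picture and splices the multiple chart lines to obtain a first chart. The first chart includes multiple blank cells. The server 104 recognizes the information text corresponding to each blank cell. The server 104 maps the first chart to a corresponding second chart. The second chart includes multiple standard cells. The server 104 determines the matching relationship between standard cells and blank cells, that is, the blank cell matching each standard cell. The server 104 fills the information text of each blank cell into the matching standard cells to obtain the business chart corresponding to the business picture, and stores the business chart in the chart library. When a chart query request sent by the second terminal 106 is subsequently received, the server 104 responds to it based on the chart library. The first terminal 102 and the second terminal 106 may be the same terminal. In the above picture information query process, the server extracts the text information contained in the business picture uploaded by the user and restores, in chart form, how that text was presented; when the user queries the business picture, the text information in the picture can be used directly, which greatly improves the efficiency of obtaining picture information.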
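In one embodiment, as shown in FIG. 2, a picture information recognition method is provided. Taking its application to the server in FIG. 1 as an example, the method includes the following steps:
Step 202: receive a business picture uploaded by the first terminal.
A business platform is installed on the first terminal. When a user needs to handle business, the user uploads business materials through the business platform on the first terminal. The business materials may be business files or business pictures. A business file may contain one or more business pictures, at least one of which records chart information. A business picture may be a screenshot, a photo, or the like. The server binarizes each received business picture that contains chart information, converting the color business picture into a black-and-white picture.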
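As an illustration of the binarization step, the following minimal sketch thresholds an RGB picture into a two-level image. The function name `binarize`, the nested-list image representation, and the fixed threshold of 128 are assumptions made for illustration; the application does not say how the threshold is chosen.

```python
def binarize(rgb_image, threshold=128):
    """Convert an RGB image (rows of (r, g, b) tuples) into a 0/1 image.

    1 marks a dark ("ink") pixel and 0 marks background. The fixed
    threshold is an assumption, not something the source specifies.
    """
    binary = []
    for row in rgb_image:
        out_row = []
        for r, g, b in row:
            # ITU-R BT.601 luma weights approximate perceived brightness.
            gray = 0.299 * r + 0.587 * g + 0.114 * b
            out_row.append(1 if gray < threshold else 0)
        binary.append(out_row)
    return binary
```

Marking ink as 1 is a convenience for the later line-detection steps, which look for runs of ink pixels.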
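Step 204: when the business picture contains chart information, determine the chart type corresponding to the business picture.
The chart in a business picture may be an Excel table with table lines, an Excel table without table lines but with a tabular layout, a bar chart, a line chart, or the like. The server trains an initial model on sample pictures containing different types of charts to obtain an image processing model. The initial model may be a convolutional neural network (CNN) model. The server inputs the binarized business picture into the image processing model and obtains several kinds of picture information for the business picture. The picture information includes the picture position, the chart type, and primitive information. The picture position is the page number of the business picture within the business file. The primitive information includes primitive fields, primitive coordinates, and the like.
Step 206: if the chart type is the first type, extract the chart lines in the business picture and splice the multiple chart lines to obtain a first chart; the first chart includes multiple blank cells.
If the chart type is the first type, that is, an Excel table with table lines, the server extracts the horizontal and vertical lines in the business picture by erosion and dilation and intersects the horizontal and vertical lines according to their coordinate positions to obtain the first chart. The first chart includes multiple blank table cells (referred to as blank cells). The first chart may include merged cells.
Step 208: recognize the information text corresponding to each blank cell.
In one embodiment, recognizing the information text corresponding to each blank cell includes: cropping the information block image in each blank cell; and inputting the information block images into a preset convolutional neural network model to recognize the information text corresponding to each information block image.
Edge detection yields the coordinates of every cell, and the cell image of each blank cell (referred to as an information block image) is cropped out according to those coordinates.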
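The cropping step can be sketched as a simple slice over the pixel grid. The `(x, y, width, height)` cell tuple and the name `crop_cell` are assumptions for illustration; the recognition model that consumes the cropped blocks is not shown.

```python
def crop_cell(image, cell):
    """Return the sub-image covered by one cell.

    image: list of pixel rows; cell: (x, y, width, height) with (x, y)
    the top-left corner. This layout is an assumption for illustration.
    """
    x, y, width, height = cell
    return [row[x:x + width] for row in image[y:y + height]]
```

Each cropped block would then be passed to the text recognition model one cell at a time.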
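Step 210: convert the first chart into a second chart; the second chart includes multiple standard cells.
The server determines the maximum number of columns and the maximum number of rows of the first chart and generates the second chart from them. The second chart contains no merged cells.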
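One way to picture the conversion is to derive a regular grid from the edges of the blank cells. The sketch below is an assumption about how the maximum row and column counts could be obtained (by collecting the distinct cell edges); the application only states that the second chart is generated from those counts. It also assumes cell edges are exactly aligned, which real detections would need snapping to achieve.

```python
def build_standard_grid(blank_cells):
    """Build a grid of unmerged standard cells from the blank cells.

    blank_cells: list of (x, y, width, height). Every distinct vertical
    edge becomes a column boundary and every distinct horizontal edge a
    row boundary, so merged cells are split back into unit cells.
    """
    xs = sorted({x for x, _, w, _ in blank_cells} | {x + w for x, _, w, _ in blank_cells})
    ys = sorted({y for _, y, _, h in blank_cells} | {y + h for _, y, _, h in blank_cells})
    return [
        [(left, top, right - left, bottom - top) for left, right in zip(xs, xs[1:])]
        for top, bottom in zip(ys, ys[1:])
    ]
```

For a table whose header spans two columns, the result is a 2 x 2 grid in which the header area is covered by two standard cells.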
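Step 212: determine the correspondence between the standard cells and the blank cells, fill the information text into the standard cells according to the correspondence to obtain the business chart corresponding to the business picture, and store the business chart in the chart library.
A blank cell may match more than one standard cell. For example, if a blank cell is a merged cell, it matches multiple standard cells in the same row or the same column.
In one embodiment, the method further includes: receiving a chart query request sent by the second terminal based on a business file; searching the chart library for the corresponding business chart according to the query field contained in the chart query request; obtaining the layout information of the business chart in the business file; and returning the business chart and the corresponding layout information to the second terminal, so that the second terminal quickly locates the business chart in the business file according to the layout information and replaces the corresponding business picture with the obtained business chart.
When a chart query request sent by the second terminal based on a business file is received, the server searches the chart library for business charts containing the query field carried in the request, obtains the picture position corresponding to each such business chart, and sends the business chart and picture position to the second terminal. The second terminal uses the picture position to quickly locate the business picture and replaces the corresponding business picture in the business file with the obtained business chart.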
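The query flow can be sketched with an in-memory chart library. The `StoredChart` record, its fields, and the substring match are hypothetical choices for illustration; the application does not describe the storage schema or the matching rule.

```python
from dataclasses import dataclass


@dataclass
class StoredChart:
    rows: list  # 2-D list of cell texts (the business chart)
    page: int   # picture position: page number in the business file


def find_charts(library, query_field):
    """Return the stored charts that contain the query field in any cell."""
    return [
        chart
        for chart in library
        if any(query_field in cell for row in chart.rows for cell in row)
    ]
```

A caller would send each returned chart together with its `page` value back to the requesting terminal so the picture can be located and replaced.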
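In this embodiment, the chart type corresponding to a business picture can be determined from the business picture, containing chart information, uploaded by the first terminal; if the chart type is the first type, the chart lines in the business picture can be extracted; splicing the multiple chart lines yields a first chart comprising multiple blank cells; the first chart can be mapped to a corresponding second chart comprising multiple standard cells; by recognizing the information text corresponding to each blank cell and the blank cell matching each standard cell, the information text of each blank cell can be filled into the matching standard cells to obtain the business chart corresponding to the business picture; and storing the business chart in the chart library makes it possible to respond, based on the chart library, to a chart query request sent by the second terminal. Because the text information contained in the business picture uploaded by the user is extracted, the presentation of that text can be restored in chart form even for charts containing merged cells. When the user queries the business picture, the text information in the picture can be used directly, which greatly improves the efficiency of obtaining picture information.
In one embodiment, receiving the business picture uploaded by the first terminal includes: receiving a business request sent by the first terminal, the business request carrying a business type; obtaining a source business page queue corresponding to the business type, the source business page queue including a feature page and the feature page including blank units; returning the source business page queue to the first terminal so that the first terminal displays it and, when a feature page is displayed, collects business data and fills the collected business data into the blank units of the feature page to generate a target business page queue; receiving the target business page queue sent by the first terminal and extracting business data from it, the business data including a business file; and scanning the business file to obtain at least one business picture containing chart information.
The server returns a business page to the first terminal according to the business request. The business page includes options for a first business mode and a second business mode, as well as options for multiple business types. The first terminal listens for the business requester's selection of a business mode option and a business type option. The first terminal generates a corresponding business handling request from the selection and sends it to the server. The business handling request contains the business type and the business mode.
If the business mode is the first business mode, the server obtains the pre-stored source business page queue for the corresponding business type. Each source business page queue contains all the business pages involved in handling the corresponding business. The source business page queue may be configured in advance by the business organization through a simulated business run when it publishes a business product on the business platform. The source business page queue includes multiple business pages in order. At least one business page in the source business page queue is a feature page containing blank units.
In one embodiment, before obtaining the source business page queue corresponding to the business type, the method further includes: receiving a page recording request sent by the second terminal; according to the page recording request, monitoring the multiple business pages displayed by the second terminal; adding a page label to each business page and generating associations between page labels and business pages; when a business page contains an input box, replacing the input box with a blank unit; and generating the source business page queue from the replaced business pages and the associations.
The source business page queue may be a video, or an animated image that switches automatically at a preset time interval or under other preset conditions. The order of the business pages in the queue can be determined from the jump relationships between pages during the corresponding business processing. Each business page has a corresponding page label, and the ordering of business pages can be represented by the associations between page labels and pages. For example, if triggering the first-business-product label on a first business page displays the detail page of the first business product, an association is established between that label on the first business page and the detail page of the first business product.
The server sends the source business page queue to the first terminal. The first terminal displays the source business page queue and, when a feature page is displayed, collects business data and fills it into the blank units of the feature page, generating a target business page queue carrying the business data. Specifically, following the prompts of the source business page queue, the business requester performs the specified actions in front of the first terminal to enter business data. The business data may be real-scene data, such as fingerprint information or a face image carrying the requester's characteristic information, voice authorization information, or a recorded video of the requester holding an identity document. The first terminal automatically collects the business data and fills it into the corresponding blank units. If the acquired data meets the conditions, the next business page is displayed, and so on until the last business page of the source queue has been displayed and the target business page queue is generated. The target business page queue includes the handling instructions for the corresponding business and the requester characteristic information needed to handle it.
The first terminal sends the target business page queue to the server. The server extracts the business data from the target business page queue and performs business processing based on the business data.
In this embodiment, following the prompts of the displayed source business page queue with its blank units, the user can enter in one pass all the business data needed for the requested business, generating a target business page queue carrying the user's characteristics, and then only needs to wait for the business result returned by the back end. The user does not need to take part in the business process by entering information node by node, so the user time taken by business handling is greatly reduced.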
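In one embodiment, as shown in FIG. 3, extracting the chart lines in the business picture and splicing the multiple chart lines to obtain the first chart, that is, the first-chart reconstruction step, includes:
Step 302: perform horizontal line detection on the business picture according to a horizontal erosion-dilation algorithm to obtain multiple horizontal lines.
The server performs horizontal line detection on the business picture by horizontal erosion and dilation, obtains multiple horizontal lines, and obtains the length and width of each line. The server determines the longest horizontal line and filters out the lines lying outside the perpendiculars through its two ends (the span between them is referred to as the "horizontal width"). In other words, lines not within the horizontal width are filtered out.
Step 304: perform vertical line detection on the business picture according to a vertical erosion-dilation algorithm to obtain multiple vertical lines.
The server performs vertical line detection on the business picture by vertical erosion and dilation, obtains multiple vertical lines, and obtains the length and width of each line. The server determines the longest vertical line and filters out the lines lying outside the perpendiculars through its two ends (the span between them is referred to as the "vertical width"). In other words, lines not within the vertical width are filtered out. Filtering out lines that fall outside the horizontal width or outside the vertical width removes superfluous lines outside the table area.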
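To make the erosion-dilation idea concrete, here is a minimal pure-Python sketch. Erosion followed by dilation with a 1 x k structuring element (a morphological opening) keeps only runs of at least k consecutive ink pixels, which is what isolates long straight lines from text strokes. The function names, the 0/1 nested-list image, and the kernel length `k` are assumptions for illustration; a production system would more likely use an image library, and the longest-line range filter described above is not shown.

```python
def erode_h(img, k):
    """Keep a pixel only if it and the k-1 pixels to its right are all ink."""
    w = len(img[0])
    return [[1 if x + k <= w and all(row[x:x + k]) else 0 for x in range(w)]
            for row in img]


def dilate_h(img, k):
    """Grow every surviving pixel k-1 pixels to the right (reflected kernel)."""
    w = len(img[0])
    return [[1 if any(row[max(0, x - k + 1):x + 1]) else 0 for x in range(w)]
            for row in img]


def transpose(img):
    return [list(col) for col in zip(*img)]


def horizontal_lines(img, k):
    """Opening with a 1 x k kernel: keeps horizontal runs of length >= k."""
    return dilate_h(erode_h(img, k), k)


def vertical_lines(img, k):
    """Same opening applied along columns by transposing the image."""
    return transpose(horizontal_lines(transpose(img), k))
```

Taking the pixel-wise OR of the two results gives a line-only image in which isolated strokes shorter than `k` have disappeared.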
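Step 306: intersect the horizontal lines and the vertical lines to obtain a table line map.
Step 308: filter out the non-cell elements in the table line map through edge detection to obtain the first chart.
The server intersects the horizontal and vertical lines to obtain the table line map. Through edge detection, the server obtains the start coordinates, cell width, and cell height of every cell in the table line map. Using the cell width and cell height, the server can identify the non-cell elements in the table line map and filter them out to obtain the first chart. For example, both the cell width and the cell height may be required to exceed 15 pixels. During edge detection, the edge of a superfluous vertical line is detected as a rectangle of very small width, which can be filtered out by the cell-width requirement. Likewise, a superfluous horizontal line is detected as a rectangle of very small height and can be filtered out by the cell-height requirement.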
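The size-based filter can be sketched as follows. The `Box` record and the function name are assumptions for illustration; the 15-pixel minimum is the example value given in the text, and the edge-detection step that produces the boxes is not shown.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x: int       # left edge (start coordinate)
    y: int       # top edge (start coordinate)
    width: int
    height: int


MIN_CELL_SIZE = 15  # pixels; the example threshold mentioned in the text


def filter_cells(boxes, min_size=MIN_CELL_SIZE):
    """Keep only rectangles whose width and height both exceed min_size."""
    return [b for b in boxes if b.width > min_size and b.height > min_size]
```

A stray vertical line shows up as a very narrow box and a stray horizontal line as a very flat one, so both fail the check and are dropped.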
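In this embodiment, the business chart in the business picture is reconstructed with the erosion-dilation algorithm and the cluttered non-chart elements in it are filtered out, so a clean business chart can be provided to the user, further improving the efficiency of obtaining picture information.
In one embodiment, determining the blank cell matching each standard cell includes: determining the start coordinates of each standard cell and traversing the second chart according to the start coordinates; querying whether a blank cell exists with the same start coordinates as the standard cell at the current traversal position; if so, marking the blank cell with the same start coordinates as the blank cell matching that standard cell; otherwise, marking the blank cell matched by the standard cell in the previous column of the same row, or in the previous row of the same column, as the blank cell matching the standard cell at the current traversal position; and taking the next standard cell in traversal order as the current one and returning to the step of querying whether a blank cell with the same start coordinates exists, until the traversal of the second chart is complete.
The server determines the start coordinates of each standard cell and traverses the standard cells according to the start coordinates. The top-left corner of each cell can serve as its start coordinates. The second chart is scanned in a "Z" pattern. The server queries whether a blank cell exists with the same start coordinates as the standard cell at the current traversal position. If so, the server marks the blank cell with the same start coordinates as the blank cell matching that standard cell.
If no blank cell has the same start coordinates, the server marks the blank cell matched by the standard cell in the previous column of the same row, or in the previous row of the same column, as the blank cell matching the current standard cell. Specifically, if a blank cell has the same vertical coordinate as the standard cell but a different horizontal coordinate, the standard cell has been merged with the standard cell in the previous column of the same row, and the server marks the blank cell matched by that standard cell as the blank cell matching the current standard cell. If a blank cell has the same horizontal coordinate as the standard cell but a different vertical coordinate, the standard cell has been merged with the standard cell in the previous row of the same column, and the server marks the blank cell matched by that standard cell as the blank cell matching the current standard cell. In another embodiment, the server computes the intersection ratio between a standard cell and a blank cell. The intersection ratio may be the proportion of overlapping area between the standard cell and the blank cell. The intersection ratio may be one of ratio values such as 25% or 50%. The server marks a blank cell whose intersection ratio meets a preset condition as the blank cell matching the standard cell.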
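The overlap-based alternative can be sketched as below. Measuring the overlap relative to the standard cell's own area, and using 0.5 as the default threshold, are assumptions for illustration; the text only says the ratio must satisfy a preset condition.

```python
def overlap_ratio(standard, blank):
    """Fraction of the standard cell's area covered by the blank cell.

    Both cells are (x, y, width, height) tuples.
    """
    sx, sy, sw, sh = standard
    bx, by, bw, bh = blank
    overlap_w = max(0, min(sx + sw, bx + bw) - max(sx, bx))
    overlap_h = max(0, min(sy + sh, by + bh) - max(sy, by))
    return (overlap_w * overlap_h) / (sw * sh)


def match_by_overlap(standard, blank_cells, threshold=0.5):
    """Index of the blank cell with the largest overlap, or None if below threshold."""
    best_index, best_ratio = None, 0.0
    for index, blank in enumerate(blank_cells):
        ratio = overlap_ratio(standard, blank)
        if ratio > best_ratio:
            best_index, best_ratio = index, ratio
    return best_index if best_ratio >= threshold else None
```

Unlike exact coordinate matching, this variant tolerates small pixel offsets between detected cell edges and the regular grid.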
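In the same way, the server determines the blank cell matching the next standard cell in traversal order, until the last standard cell in the second chart is reached.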
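The start-coordinate traversal can be sketched as follows. The data layout (a grid of `(x, y, width, height)` standard cells plus a flat list of blank cells) and the function names are assumptions for illustration. The containment test used to decide whether the left or the upper neighbour's match is inherited is also an assumption; the text only states that one of the two is inherited.

```python
def match_cells(grid, blank_cells):
    """Map each standard cell (row, col) to the index of its blank cell.

    grid: rows of (x, y, width, height) standard cells.
    blank_cells: list of (x, y, width, height), possibly merged cells.
    """
    by_origin = {(x, y): i for i, (x, y, _, _) in enumerate(blank_cells)}

    def covers(index, x, y):
        cx, cy, cw, ch = blank_cells[index]
        return cx <= x < cx + cw and cy <= y < cy + ch

    matches = {}
    # "Z" order: left to right within a row, rows from top to bottom.
    for r, row in enumerate(grid):
        for c, (x, y, _, _) in enumerate(row):
            if (x, y) in by_origin:
                matches[(r, c)] = by_origin[(x, y)]
                continue
            # No blank cell starts here: inherit from the left neighbour,
            # or failing that from the neighbour above.
            for neighbour in ((r, c - 1), (r - 1, c)):
                index = matches.get(neighbour)
                if index is not None and covers(index, x, y):
                    matches[(r, c)] = index
                    break
    return matches


def fill_grid(grid, texts, matches):
    """Fill each standard cell with the text of its matched blank cell."""
    return [[texts[matches[(r, c)]] if (r, c) in matches else ""
             for c in range(len(row))]
            for r, row in enumerate(grid)]
```

With a header that spans two columns, both standard cells under the header inherit the same text, which preserves the merged cell's content in the regular grid.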
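In this embodiment, matching the standard cells with the blank cells determines how the text information in merged cells is presented, so charts that include merged cells can be restored.
It should be understood that although the steps in the flowcharts of FIG. 2 and FIG. 3 are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in FIG. 2 and FIG. 3 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different times, and they are not necessarily executed sequentially; they may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.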
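In one embodiment, as shown in FIG. 4, a picture information recognition apparatus is provided, including a picture recognition module 402, a table reconstruction module 404, and a text mapping module 406, wherein:
The picture recognition module 402 is configured to receive a business picture uploaded by the first terminal and, when the business picture contains chart information, determine the chart type corresponding to the business picture.
The table reconstruction module 404 is configured to, if the chart type is the first type, extract the chart lines in the business picture and splice the multiple chart lines to obtain a first chart, the first chart including multiple blank cells; and to convert the first chart into a second chart, the second chart including multiple standard cells.
The text mapping module 406 is configured to recognize the information text corresponding to each blank cell; determine the correspondence between the standard cells and the blank cells; fill the information text into the standard cells according to the correspondence to obtain the business chart corresponding to the business picture; and store the business chart in the chart library.
In one embodiment, the picture recognition module 402 is further configured to receive a business request sent by the first terminal, the business request carrying a business type; obtain a source business page queue corresponding to the business type, the source business page queue including a feature page and the feature page including blank units; return the source business page queue to the first terminal so that the first terminal displays it and, when a feature page is displayed, collects business data and fills the collected business data into the blank units of the feature page to generate a target business page queue; receive the target business page queue sent by the first terminal and extract business data from it, the business data including a business file; and scan the business file to obtain at least one business picture containing chart information.
In one embodiment, the apparatus further includes a picture query module 408, configured to receive a chart query request sent by the second terminal based on a business file; search the chart library for the corresponding business chart according to the query field contained in the chart query request; obtain the layout information of the business chart in the business file; and return the business chart and the corresponding layout information to the second terminal, so that the second terminal quickly locates the business chart in the business file according to the layout information and replaces the corresponding business picture with the obtained business chart.
In one embodiment, the table reconstruction module 404 is further configured to perform horizontal line detection on the business picture according to a horizontal erosion-dilation algorithm to obtain multiple horizontal lines; perform vertical line detection on the business picture according to a vertical erosion-dilation algorithm to obtain multiple vertical lines; intersect the horizontal lines and the vertical lines to obtain a table line map; and filter out the non-cell elements in the table line map through edge detection to obtain the first chart.
In one embodiment, the text mapping module 406 is further configured to crop the information block image in each blank cell, input the information block images into a preset convolutional neural network model, and recognize the information text corresponding to each information block image.
In one embodiment, the text mapping module 406 is further configured to determine the start coordinates of each standard cell and traverse the second chart according to the start coordinates; query whether a blank cell exists with the same start coordinates as the standard cell at the current traversal position; if so, mark the blank cell with the same start coordinates as the blank cell matching that standard cell; otherwise, mark the blank cell matched by the standard cell in the previous column of the same row, or in the previous row of the same column, as the blank cell matching the standard cell at the current traversal position; and take the next standard cell in traversal order as the current one and return to the step of querying whether a blank cell with the same start coordinates exists, until the traversal of the second chart is complete.
For the specific limitations of the picture information recognition apparatus, refer to the limitations of the picture information recognition method above; they are not repeated here. Each module of the picture information recognition apparatus may be implemented wholly or partly in software, hardware, or a combination of the two. The modules may be embedded in, or independent of, the processor of the computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.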
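In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 5. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions in the non-volatile storage medium. The database of the computer device is used to store business charts. The network interface of the computer device is used to communicate with external terminals over a network connection. When executed by the processor, the computer-readable instructions implement a picture information recognition method.
Those skilled in the art will understand that the structure shown in FIG. 5 is only a block diagram of the part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine some components, or arrange the components differently.
One or more non-volatile storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to implement the steps of the picture information recognition method provided in any embodiment of this application.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by computer-readable instructions directing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments is described; however, as long as a combination of these technical features is not contradictory, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application, and their description is relatively specific and detailed, but they should not therefore be understood as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art can make a number of variations and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application shall be subject to the appended claims.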

Claims (20)

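  1. A picture information recognition method, executed by a computer device, the method comprising:
    receiving a business picture uploaded by a first terminal;
    when the business picture contains chart information, determining the chart type corresponding to the business picture;
    if the chart type is a first type, extracting the chart lines in the business picture and splicing the multiple chart lines to obtain a first chart, the first chart comprising multiple blank cells;
    recognizing the information text corresponding to each blank cell;
    converting the first chart into a second chart, the second chart comprising multiple standard cells; and
    determining the correspondence between the standard cells and the blank cells, filling the information text into the standard cells according to the correspondence to obtain a business chart corresponding to the business picture, and storing the business chart in a chart library.
  2. The method according to claim 1, wherein receiving the business picture uploaded by the first terminal comprises:
    receiving a business request sent by the first terminal, the business request carrying a business type;
    obtaining a source business page queue corresponding to the business type, the source business page queue comprising a feature page and the feature page comprising blank units;
    returning the source business page queue to the first terminal so that the first terminal displays the source business page queue and, when a feature page is displayed, collects business data and fills the collected business data into the blank units of the feature page to generate a target business page queue;
    receiving the target business page queue sent by the first terminal and extracting business data from the target business page queue, the business data comprising a business file; and
    scanning the business file to obtain at least one business picture containing chart information.
  3. The method according to claim 2, wherein the method further comprises:
    receiving a chart query request sent by a second terminal based on a business file;
    searching the chart library for the corresponding business chart according to the query field contained in the chart query request;
    obtaining the layout information of the business chart in the business file; and
    returning the business chart and the corresponding layout information to the second terminal, so that the second terminal quickly locates the business chart in the business file according to the layout information and replaces the corresponding business picture with the obtained business chart.
  4. The method according to claim 1, wherein extracting the chart lines in the business picture and splicing the multiple chart lines to obtain the first chart comprises:
    performing horizontal line detection on the business picture according to a horizontal erosion-dilation algorithm to obtain multiple horizontal lines;
    performing vertical line detection on the business picture according to a vertical erosion-dilation algorithm to obtain multiple vertical lines;
    intersecting the horizontal lines and the vertical lines to obtain a table line map; and
    filtering out the non-cell elements in the table line map through edge detection to obtain the first chart.
  5. The method according to claim 1, wherein recognizing the information text corresponding to each blank cell comprises:
    cropping the information block image in each blank cell; and
    inputting the information block images into a preset convolutional neural network model to recognize the information text corresponding to each information block image.
  6. The method according to claim 1, wherein determining the blank cell matching each standard cell comprises:
    determining the start coordinates of each standard cell and traversing the second chart according to the start coordinates;
    querying whether a blank cell exists with the same start coordinates as the standard cell at the current traversal position;
    if so, marking the blank cell with the same start coordinates as the blank cell matching that standard cell;
    otherwise, marking the blank cell matched by the standard cell in the previous column of the same row, or in the previous row of the same column, as the blank cell matching the standard cell at the current traversal position; and
    taking the next standard cell in traversal order as the current one and returning to the step of querying whether a blank cell with the same start coordinates exists, until the traversal of the second chart is complete.
  7. A picture information recognition apparatus, the apparatus comprising:
    a picture recognition module, configured to receive a business picture uploaded by a first terminal and, when the business picture contains chart information, determine the chart type corresponding to the business picture;
    a table reconstruction module, configured to, if the chart type is a first type, extract the chart lines in the business picture and splice the multiple chart lines to obtain a first chart, the first chart comprising multiple blank cells, and to convert the first chart into a second chart, the second chart comprising multiple standard cells; and
    a text mapping module, configured to recognize the information text corresponding to each blank cell, determine the correspondence between the standard cells and the blank cells, fill the information text into the standard cells according to the correspondence to obtain a business chart corresponding to the business picture, and store the business chart in a chart library.
  8. The apparatus according to claim 7, wherein the picture recognition module is further configured to receive a business request sent by the first terminal, the business request carrying a business type; obtain a source business page queue corresponding to the business type, the source business page queue comprising a feature page and the feature page comprising blank units; return the source business page queue to the first terminal so that the first terminal displays the source business page queue and, when a feature page is displayed, collects business data and fills the collected business data into the blank units of the feature page to generate a target business page queue; receive the target business page queue sent by the first terminal and extract business data from the target business page queue, the business data comprising a business file; and scan the business file to obtain at least one business picture containing chart information.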
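  9. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    receiving a business picture uploaded by a first terminal;
    when the business picture contains chart information, determining the chart type corresponding to the business picture;
    if the chart type is a first type, extracting the chart lines in the business picture and splicing the multiple chart lines to obtain a first chart, the first chart comprising multiple blank cells;
    recognizing the information text corresponding to each blank cell;
    converting the first chart into a second chart, the second chart comprising multiple standard cells; and
    determining the correspondence between the standard cells and the blank cells, filling the information text into the standard cells according to the correspondence to obtain a business chart corresponding to the business picture, and storing the business chart in a chart library.
  10. The computer device according to claim 9, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    receiving a business request sent by the first terminal, the business request carrying a business type;
    obtaining a source business page queue corresponding to the business type, the source business page queue comprising a feature page and the feature page comprising blank units;
    returning the source business page queue to the first terminal so that the first terminal displays the source business page queue and, when a feature page is displayed, collects business data and fills the collected business data into the blank units of the feature page to generate a target business page queue;
    receiving the target business page queue sent by the first terminal and extracting business data from the target business page queue, the business data comprising a business file; and
    scanning the business file to obtain at least one business picture containing chart information.
  11. The computer device according to claim 10, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    receiving a chart query request sent by a second terminal based on a business file;
    searching the chart library for the corresponding business chart according to the query field contained in the chart query request;
    obtaining the layout information of the business chart in the business file; and
    returning the business chart and the corresponding layout information to the second terminal, so that the second terminal quickly locates the business chart in the business file according to the layout information and replaces the corresponding business picture with the obtained business chart.
  12. The computer device according to claim 9, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    performing horizontal line detection on the business picture according to a horizontal erosion-dilation algorithm to obtain multiple horizontal lines;
    performing vertical line detection on the business picture according to a vertical erosion-dilation algorithm to obtain multiple vertical lines;
    intersecting the horizontal lines and the vertical lines to obtain a table line map; and
    filtering out the non-cell elements in the table line map through edge detection to obtain the first chart.
  13. The computer device according to claim 9, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    cropping the information block image in each blank cell; and
    inputting the information block images into a preset convolutional neural network model to recognize the information text corresponding to each information block image.
  14. The computer device according to claim 9, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    determining the start coordinates of each standard cell and traversing the second chart according to the start coordinates;
    querying whether a blank cell exists with the same start coordinates as the standard cell at the current traversal position;
    if so, marking the blank cell with the same start coordinates as the blank cell matching that standard cell;
    otherwise, marking the blank cell matched by the standard cell in the previous column of the same row, or in the previous row of the same column, as the blank cell matching the standard cell at the current traversal position; and
    taking the next standard cell in traversal order as the current one and returning to the step of querying whether a blank cell with the same start coordinates exists, until the traversal of the second chart is complete.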
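  15. One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    receiving a business picture uploaded by a first terminal;
    when the business picture contains chart information, determining the chart type corresponding to the business picture;
    if the chart type is a first type, extracting the chart lines in the business picture and splicing the multiple chart lines to obtain a first chart, the first chart comprising multiple blank cells;
    recognizing the information text corresponding to each blank cell;
    converting the first chart into a second chart, the second chart comprising multiple standard cells; and
    determining the correspondence between the standard cells and the blank cells, filling the information text into the standard cells according to the correspondence to obtain a business chart corresponding to the business picture, and storing the business chart in a chart library.
  16. The storage medium according to claim 15, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    receiving a business request sent by the first terminal, the business request carrying a business type;
    obtaining a source business page queue corresponding to the business type, the source business page queue comprising a feature page and the feature page comprising blank units;
    returning the source business page queue to the first terminal so that the first terminal displays the source business page queue and, when a feature page is displayed, collects business data and fills the collected business data into the blank units of the feature page to generate a target business page queue;
    receiving the target business page queue sent by the first terminal and extracting business data from the target business page queue, the business data comprising a business file; and
    scanning the business file to obtain at least one business picture containing chart information.
  17. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    receiving a chart query request sent by a second terminal based on a business file;
    searching the chart library for the corresponding business chart according to the query field contained in the chart query request;
    obtaining the layout information of the business chart in the business file; and
    returning the business chart and the corresponding layout information to the second terminal, so that the second terminal quickly locates the business chart in the business file according to the layout information and replaces the corresponding business picture with the obtained business chart.
  18. The storage medium according to claim 15, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    performing horizontal line detection on the business picture according to a horizontal erosion-dilation algorithm to obtain multiple horizontal lines;
    performing vertical line detection on the business picture according to a vertical erosion-dilation algorithm to obtain multiple vertical lines;
    intersecting the horizontal lines and the vertical lines to obtain a table line map; and
    filtering out the non-cell elements in the table line map through edge detection to obtain the first chart.
  19. The storage medium according to claim 15, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    cropping the information block image in each blank cell; and
    inputting the information block images into a preset convolutional neural network model to recognize the information text corresponding to each information block image.
  20. The storage medium according to claim 15, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    determining the start coordinates of each standard cell and traversing the second chart according to the start coordinates;
    querying whether a blank cell exists with the same start coordinates as the standard cell at the current traversal position;
    if so, marking the blank cell with the same start coordinates as the blank cell matching that standard cell;
    otherwise, marking the blank cell matched by the standard cell in the previous column of the same row, or in the previous row of the same column, as the blank cell matching the standard cell at the current traversal position; and
    taking the next standard cell in traversal order as the current one and returning to the step of querying whether a blank cell with the same start coordinates exists, until the traversal of the second chart is complete.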
PCT/CN2019/117377 2019-03-19 2019-11-12 图片信息识别方法、装置、计算机设备和存储介质 WO2020186779A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910207427.1A CN110059687B (zh) 2019-03-19 2019-03-19 图片信息识别方法、装置、计算机设备和存储介质
CN201910207427.1 2019-03-19

Publications (1)

Publication Number Publication Date
WO2020186779A1 true WO2020186779A1 (zh) 2020-09-24

Family

ID=67317058

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117377 WO2020186779A1 (zh) 2019-03-19 2019-11-12 图片信息识别方法、装置、计算机设备和存储介质

Country Status (2)

Country Link
CN (1) CN110059687B (zh)
WO (1) WO2020186779A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112712014A (zh) * 2020-12-29 2021-04-27 平安健康保险股份有限公司 表格图片结构解析方法、系统、设备和可读存储介质
CN112883926A (zh) * 2021-03-24 2021-06-01 泰康保险集团股份有限公司 表格类医疗影像的识别方法及装置
CN113627351A (zh) * 2021-08-12 2021-11-09 达而观信息科技(上海)有限公司 财报科目的匹配方法、装置、计算机设备及存储介质

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059687B (zh) * 2019-03-19 2024-05-28 平安科技(深圳)有限公司 图片信息识别方法、装置、计算机设备和存储介质
CN110516208B (zh) * 2019-08-12 2023-06-09 深圳智能思创科技有限公司 一种针对pdf文档表格提取的系统及方法
CN111881659B (zh) * 2020-09-28 2021-02-26 江西汉辰信息技术股份有限公司 表格图片的处理方法、系统、可读存储介质及计算机设备
CN114627482B (zh) * 2022-05-16 2022-08-12 四川升拓检测技术股份有限公司 基于图像处理与文字识别实现表格数字化处理方法及系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005293605A (ja) * 2005-04-26 2005-10-20 Hitachi Ltd 帳票認識方法
CN107622233A (zh) * 2017-09-11 2018-01-23 畅捷通信息技术股份有限公司 一种表格识别方法、识别系统及计算机装置
CN107679024A (zh) * 2017-09-11 2018-02-09 畅捷通信息技术股份有限公司 识别表格的方法、系统、计算机设备、可读存储介质
CN110059687A (zh) * 2019-03-19 2019-07-26 平安科技(深圳)有限公司 图片信息识别方法、装置、计算机设备和存储介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108241608B (zh) * 2016-12-26 2021-06-22 北京国双科技有限公司 图表数据的处理方法、装置及系统
WO2018175686A1 (en) * 2017-03-22 2018-09-27 Drilling Info, Inc. Extracting data from electronic documents
CN107862303B (zh) * 2017-11-30 2019-04-26 平安科技(深圳)有限公司 表格类图像的信息识别方法、电子装置及可读存储介质
CN108470164A (zh) * 2018-03-20 2018-08-31 上海眼控科技股份有限公司 一种用于财务报表的数字识别系统及方法

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112712014A (zh) * 2020-12-29 2021-04-27 平安健康保险股份有限公司 表格图片结构解析方法、系统、设备和可读存储介质
CN112712014B (zh) * 2020-12-29 2024-04-30 平安健康保险股份有限公司 表格图片结构解析方法、系统、设备和可读存储介质
CN112883926A (zh) * 2021-03-24 2021-06-01 泰康保险集团股份有限公司 表格类医疗影像的识别方法及装置
CN112883926B (zh) * 2021-03-24 2023-07-04 泰康保险集团股份有限公司 表格类医疗影像的识别方法及装置
CN113627351A (zh) * 2021-08-12 2021-11-09 达而观信息科技(上海)有限公司 财报科目的匹配方法、装置、计算机设备及存储介质
CN113627351B (zh) * 2021-08-12 2024-01-30 达观数据有限公司 财报科目的匹配方法、装置、计算机设备及存储介质

Also Published As

Publication number Publication date
CN110059687B (zh) 2024-05-28
CN110059687A (zh) 2019-07-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19920374

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19920374

Country of ref document: EP

Kind code of ref document: A1