WO2016188104A1 - Information processing method and information processing apparatus - Google Patents

Information processing method and information processing apparatus

Info

Publication number
WO2016188104A1
Authority
WO
WIPO (PCT)
Prior art keywords
column
slice
slice image
coordinate value
character recognition
Prior art date
Application number
PCT/CN2015/098836
Other languages
English (en)
French (fr)
Inventor
刘永波
李桂林
方红涛
Original Assignee
中国建设银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国建设银行股份有限公司
Priority to EP15893172.5A (EP3147825A4)
Priority to SG11201610723SA
Publication of WO2016188104A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/235 Image preprocessing by selection of a specific region containing or referencing a pattern, based on user input or interaction

Definitions

  • the invention belongs to the technical field of text recognition, and in particular relates to an information processing method and an information processing device.
  • the amount of data in the ticket is large;
  • the data types or contents of the respective elements in the same column of the ticket are the same, and the data types and contents of the respective elements located in the same row are different;
  • Each element in the same row has the same pixel height, and each element in the same column has the same pixel width.
  • An element in the ticket refers to a character or character string located in a cell; an element can also be understood as a piece of data, such as a user's name, a date, or an amount.
  • If the staff enters the list information row by row, it is often necessary to frequently switch input methods and move the fingers to different areas of the keyboard, which is cumbersome, so staff usually enter tabular information column by column. Since the elements in the same column share a data type, their visual appearance is similar, which easily causes visual fatigue and problems such as entering data into the wrong row.
  • In the prior art, the entire form is intelligently recognized by a text recognition tool, and the data in the form is obtained and stored.
  • an object of the present invention is to provide an information processing method and an information processing apparatus for acquiring elements in a form and reducing an error rate of the acquired data.
  • the present invention provides the following technical solutions:
  • The invention discloses an information processing method for processing elements in a form, where the form is a digitized image and the elements in the form are distributed in N columns, each element being a piece of data and N being an integer greater than 1.
  • the information processing method includes:
  • determining the location area of each element in the form by using the location area of each column in the form, the location area of the positioning element of each column in the form, and the number of elements included in each column;
  • slicing the form according to the location area of each element in the form to obtain a plurality of slice images, wherein each slice image contains one element, and the number of slice images obtained by the slicing process is consistent with the number of elements contained in the form;
  • the obtained string is recorded according to a preset rule.
  • Optical character recognition is performed on the slice images respectively, specifically: performing optical character recognition one by one on the slice images located in the same column, and then one by one on the slice images located in another column, until optical character recognition has been performed on the slice images of every column; wherein the slice image located in the nth column is obtained by slicing the nth column of the form.
  • The information processing method further includes: displaying an element input box corresponding to the first slice image, wherein the first slice image is the slice image currently in the input state; receiving the character string input by the user in the element input box; and comparing the character string input by the user with the character string generated by performing optical character recognition on the first slice image, and issuing a prompt if the two are inconsistent.
  • The method further includes: adjusting the display effect of the first slice image so that the display effect of the first slice image differs from that of the other slice images.
  • the information processing method further includes: receiving a scaling instruction input by the user, and performing corresponding scaling processing on the first slice image in response to the scaling instruction.
  • The location area of the nth column in the form is calibrated by (first coordinate value, second coordinate value, third coordinate value, fourth coordinate value), wherein:
  • the first coordinate value is the distance between the left side of the nth column and the left side of the form;
  • the second coordinate value is the distance between the top end of the nth column and the upper side of the form;
  • the third coordinate value is the distance between the right side of the nth column and the left side of the form;
  • the fourth coordinate value is the distance between the bottom end of the nth column and the upper side of the form.
  • The location area of the positioning element of the nth column in the form is calibrated by (fifth coordinate value, sixth coordinate value, seventh coordinate value, eighth coordinate value), wherein the fifth coordinate value is the distance between the left side of the cell in which the positioning element of the nth column is located and the left side of the form, the sixth coordinate value is the distance between the top end of that cell and the upper side of the form, and the seventh coordinate value is the distance between the right side of that cell and the left side of the form,
  • the eighth coordinate value is a distance between a bottom end of the cell in which the positioning element of the nth column is located and an upper side of the form.
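Both calibration schemes describe a rectangle by distances from the form's left and upper sides. As a minimal sketch (variable names are illustrative, not taken from the disclosure), such a location area can be represented as:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Area:
    """A rectangular location area inside the form image.

    All values are pixel distances measured as in the first calibration
    scheme: left/right from the left side of the form, top/bottom from
    its upper side.
    """
    left: int    # first (or fifth) coordinate value
    top: int     # second (or sixth) coordinate value
    right: int   # third (or seventh) coordinate value
    bottom: int  # fourth (or eighth) coordinate value

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top

# e.g. a column occupying x in [40, 200) and y in [10, 310)
column = Area(left=40, top=10, right=200, bottom=310)
```

Width and height then follow directly from the four coordinate values, which is what the slicing step below relies on.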
  • The present invention also discloses an information processing apparatus for processing elements in a form, where the form is a digitized image and the elements in the form are distributed in N columns, each element being a piece of data and N being an integer greater than 1. The information processing apparatus includes:
  • a column location area determining unit configured to respectively determine a location area of each column in the form in the form
  • An element quantity determining unit configured to determine the number of elements included in each column in the form
  • an element position area determining unit configured to respectively determine the location area of each element in the form by using the location area of each column in the form, the location area of the positioning element of each column in the form, and the number of elements included in each column;
  • an image processing unit configured to slice the form according to the location area of each element in the form to obtain a plurality of slice images, wherein each slice image contains one element and the number of slice images obtained by the slicing process is consistent with the number of elements contained in the form;
  • a character recognition unit configured to respectively perform optical character recognition on the slice image to obtain a character string included in the slice image
  • a storage unit configured to record the obtained character string according to a preset rule.
  • The character recognition unit is specifically configured to perform optical character recognition one by one on the slice images located in the same column, and then one by one on the slice images located in another column, until the slice images of every column have been optically recognized; wherein the slice image located in the nth column is obtained by slicing the nth column of the form.
  • The information processing apparatus further includes: a control unit configured to control the display interface to display an element input box corresponding to the first slice image, wherein the first slice image is the slice image currently in the input state; and a first processing unit configured to receive the character string input by the user in the element input box, compare the character string input by the user with the character string generated by performing optical character recognition on the first slice image, and issue a prompt if the two are inconsistent.
  • The information processing apparatus further includes a second processing unit; after the display interface displays the element input box corresponding to the first slice image, the second processing unit adjusts the display effect of the first slice image so that it differs from the display effect of the other slice images.
  • the information processing apparatus further includes a third processing unit, where the third processing unit is configured to receive a scaling instruction input by the user, and perform corresponding scaling processing on the first slice image in response to the scaling instruction.
  • The information processing method disclosed in the present invention first determines the location area of each column in the form, the location area of the positioning element of each column, and the number of elements included in each column; based on the foregoing information it then determines the location area of each element in the form and slices the form accordingly, so that each element is divided into one slice image; finally, optical character recognition is performed on each slice image to obtain the character string it contains, and the string is recorded.
  • Each element in the form is divided into its own slice image, and optical character recognition is then performed on each slice image separately to acquire the character string it contains. Since a single optical character recognition operation targets only one element, the slice image can be matched against the various data types until its string is recognized, which reduces the error rate of the data.
  • FIG. 3 is a schematic structural diagram of an information processing apparatus according to the present disclosure.
  • FIG. 4 is a schematic structural diagram of another information processing apparatus according to the present disclosure.
  • the present invention discloses an information processing method for processing elements in a form.
  • the form is a digitized image, and may be a scanned copy of the ticket or an image obtained by photographing the ticket.
  • the elements in the form are distributed in N columns, and N is an integer greater than 1.
  • FIG. 1 is a flowchart of an information processing method disclosed by the present invention.
  • the information processing method includes:
  • Step S11 respectively determine the location area of each column in the form in the form.
  • The location area of each column in the form can be calibrated by the distances between that column and the four sides of the form. The user may determine the location area of each column manually and then input the data into the device running the method; the device can also measure the form with existing ranging software to determine the location area of each column.
  • Step S12 respectively determine the location area of the positioning element of each column in the form in the form.
  • The location area of the positioning element of each column in the form may be calibrated by the distances between the cell in which the positioning element is located and the four sides of the form. It should be noted that the cell in which the positioning element is located may be visible to the user or invisible to the user (the border of the cell is colorless).
  • the user inputs the above data to the device running the method. Devices running the method can also make measurements using existing ranging software to determine the location area of the positioning elements of each column in the form.
  • Step S13 Determine the number of elements included in each column in the form.
  • Step S14 determining the location area of each element in the form in the form by using the position area of each column in the form, the position area of the positioning element of each column in the form, and the number of elements included in each column.
  • Take the nth column in the form as an example: from the location area of the nth column and the location area of its positioning element, the total height of all elements in the nth column can be determined; combined with the number of elements the column contains, the average height of each element in the nth column can be computed. Then, based on the position of the positioning element of the nth column and the average element height, the location area of each element in the nth column can be determined.
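A minimal sketch of this step S14 computation, assuming the elements of the nth column span from the top of its positioning element down to the bottom of the column (names are illustrative, not from the disclosure):

```python
def element_areas(column, locator, max_column):
    """Derive per-element location areas for one column.

    `column` and `locator` are (left, top, right, bottom) tuples for the
    column and for the cell of its positioning element; `max_column` is
    the number of elements the column contains.
    """
    left, _, right, col_bottom = column
    loc_top = locator[1]
    # total element height = column bottom - positioning element top,
    # so the average element height is that total divided by the count
    height = (col_bottom - loc_top) / max_column
    return [(left, loc_top + i * height, right, loc_top + (i + 1) * height)
            for i in range(max_column)]
```

For example, a column calibrated as (40, 10, 200, 310) whose positioning element sits at (40, 10, 200, 40) and which holds 10 elements yields an average element height of 30 pixels.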
  • Step S15 The form is sliced according to the position area of each element in the form to obtain a plurality of slice images, wherein each slice image includes one element, and the number of slice images obtained by the slice process is consistent with the number of elements included in the form.
  • In step S14 the location area of each element in the form has been determined, and the form is sliced according to these location areas so that each element is cut into one slice image. That is, M slice images are obtained, where M equals the number of elements contained in the form, and each slice image contains one element.
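Treating the digitized form as a row-major pixel grid, the slicing of step S15 can be sketched as follows (a toy stand-in for real image cropping; names are illustrative):

```python
def slice_form(form_pixels, areas):
    """Cut the form into slice images, one per element.

    `form_pixels` is a 2-D list of pixel values; `areas` are the
    (left, top, right, bottom) element areas from step S14. The number
    of returned slices equals the number of elements in the form.
    """
    return [[row[int(l):int(r)] for row in form_pixels[int(t):int(b)]]
            for (l, t, r, b) in areas]
```

A real implementation would crop an image object instead of list rows, but the correspondence between element areas and slice count is the same.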
  • Step S16 Perform optical character recognition (OCR) on each slice image to obtain a character string included in the slice image.
  • In the prior art, a text recognition tool performs overall intelligent recognition of the form. Because the form contains elements of multiple data types and they are recognized simultaneously, the recognition rate is inevitably lower and the acquired data is prone to errors.
  • In the present invention, optical character recognition is performed on the slice images, and each slice image contains only one element. Since a single optical character recognition operation targets only one element, the slice image can be matched against the various data types until its string is recognized, which reduces the error rate of the data compared with the overall recognition of the prior art.
  • Step S17 Record the obtained character string according to a preset rule.
  • The obtained string may be recorded at a specific location of a preset table, the specific location being determined by the location area, in the form, of the element containing the string.
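One possible "preset rule" (the disclosure leaves the exact rule open) is to map each slice's location area back to a (row, column) cell of the output table; a sketch with illustrative names and assumptions:

```python
import bisect

def cell_of(area, column_lefts, first_top, elem_height):
    """Return the (row, column) table position for a slice image.

    `column_lefts` lists the left edge of every column in ascending
    order; `first_top` is the top of the first element row and
    `elem_height` the average element height from step S14.
    """
    left, top = area[0], area[1]
    col = bisect.bisect_right(column_lefts, left) - 1   # which column edge we passed
    row = int((top - first_top) // elem_height)         # how many rows down
    return row, col
```

Each recognized string can then be written into a table at the position this function returns, preserving the form's layout.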
  • The information processing method disclosed in the present invention first determines the location area of each column in the form, the location area of the positioning element of each column, and the number of elements included in each column; based on the foregoing information it then determines the location area of each element in the form and slices the form accordingly, so that each element is divided into one slice image; finally, optical character recognition is performed on each slice image to obtain the character string it contains, and the string is recorded.
  • Each element in the form is divided into its own slice image, and optical character recognition is then performed on each slice image separately to acquire the character string it contains. Since a single optical character recognition operation targets only one element, the slice image can be matched against the various data types until its string is recognized, which reduces the error rate of the data.
  • In this embodiment, the location area of the nth column in the form is calibrated by (first coordinate value, second coordinate value, third coordinate value, fourth coordinate value):
  • the first coordinate value is the distance between the left side of the nth column and the left side of the form;
  • the second coordinate value is the distance between the top end of the nth column and the upper side of the form;
  • the third coordinate value is the distance between the right side of the nth column and the left side of the form;
  • the fourth coordinate value is the distance between the bottom end of the nth column and the upper side of the form.
  • The location area of the positioning element of the nth column in the form is calibrated by (fifth coordinate value, sixth coordinate value, seventh coordinate value, eighth coordinate value):
  • the fifth coordinate value is the distance between the left side of the cell in which the positioning element of the nth column is located and the left side of the form;
  • the sixth coordinate value is the distance between the top end of that cell and the upper side of the form;
  • the seventh coordinate value is the distance between the right side of that cell and the left side of the form;
  • the eighth coordinate value is the distance between the bottom end of that cell and the upper side of the form.
  • the first coordinate value to the eighth coordinate value may also be configured as:
  • the first coordinate value is the distance between the left side of the nth column and the left side of the form;
  • the second coordinate value is the distance between the top end of the nth column and the upper side of the form;
  • the third coordinate value is the distance between the right side of the nth column and the right side of the form;
  • the fourth coordinate value is the distance between the bottom end of the nth column and the lower side of the form;
  • the fifth coordinate value is the distance between the left side of the cell in which the positioning element of the nth column is located and the left side of the form;
  • the sixth coordinate value is the distance between the top end of that cell and the upper side of the form;
  • the seventh coordinate value is the distance between the right side of that cell and the right side of the form;
  • the eighth coordinate value is the distance between the bottom end of that cell and the lower side of the form.
  • For example, the location area of the first column in the form is (LeftA, TopA, RightA, BottomA);
  • the location area of the positioning element of the first column is (LeftA1, TopA1, RightA1, BottomA1), where LeftA1 is equal to LeftA and RightA1 is equal to RightA.
  • the maximum number of rows in this column is MaxColumn, meaning that the first column contains MaxColumn elements.
  • the total height of the MaxColumn elements in the first column is BottomA-TopA1;
  • the average height Height of each element in the first column is (BottomA-TopA1)/MaxColumn. Then, according to the location area of the positioning element of the first column and the average height of each element, the location areas of the MaxColumn elements in the first column can be determined, specifically:
  • the position area of element A1, located in the first row of the first column, is (LeftA, TopA1, RightA, TopA1+Height);
  • the position area of element A2, located in the second row of the first column, is (LeftA, TopA1+Height, RightA, TopA1+2×Height), and so on.
  • the position area of each element is marked on the form by using a rectangular frame of a specific color, for example, the position area of each element is marked on the form by using a red dotted rectangle.
  • the user can intuitively judge whether the calculated position area of each element matches the actual position area of each element. If the calculated position area of each element deviates from its actual position area, the user can manually adjust the position area of each column in the form and the position area of the positioning elements of each column.
  • The location area of the first column is adjusted to (LeftA+ΔLeftA, TopA+ΔTopA, RightA+ΔRightA, BottomA+ΔBottomA), and the location area of the positioning element of the first column is adjusted to (LeftA1+ΔLeftA, TopA1+ΔTopA1, RightA1+ΔRightA, BottomA1+ΔBottomA1), where ΔLeftA is the adjustment value of the distance between the left side of the first column and the left side of the form, ΔTopA is the adjustment value of the distance between the top end of the first column and the upper side of the form, ΔRightA is the adjustment value of the distance between the right side of the first column and the left side of the form, ΔBottomA is the adjustment value of the distance between the bottom end of the first column and the upper side of the form, ΔTopA1 is the adjustment value of the distance between the top end of the cell in which the positioning element of the first column is located and the upper side of the form, and ΔBottomA1 is the adjustment value of the distance between the bottom end of that cell and the upper side of the form.
  • After the adjustment, the total height of the MaxColumn elements in the first column is (BottomA+ΔBottomA)-(TopA1+ΔTopA1), and the average height Height' of each element in the first column is ((BottomA+ΔBottomA)-(TopA1+ΔTopA1))/MaxColumn. Then, according to the adjusted location area of the positioning element of the first column and the average element height, the location areas of the MaxColumn elements in the first column can be determined, specifically:
  • the position area of element A1, located in the first row of the first column, is (LeftA+ΔLeftA, TopA1+ΔTopA1, RightA+ΔRightA, TopA1+ΔTopA1+Height');
  • the position area of element A2, located in the second row of the first column, is (LeftA+ΔLeftA, TopA1+ΔTopA1+Height', RightA+ΔRightA, TopA1+ΔTopA1+2×Height').
  • Optical character recognition is performed on each slice image in step S16, preferably in the following manner:
  • the slice images located in the same column are optically recognized one by one, and then the slice images located in the other column are optically recognized one by one until the slice images located in the respective columns are optically recognized.
  • the slice image located in the nth column is obtained by slicing the nth column of the form.
  • That is, optical character recognition is performed one by one on the slice images generated from the elements of one column, and only after all slice images of that column have been recognized are the slice images generated from the elements of another column recognized one by one.
  • Performing optical character recognition on the slice images of a single column at a time effectively narrows the range of character recognition, improves the recognition rate, and reduces the time spent on the recognition operation.
  • For example, the elements of the Name column are all of the Chinese-character data type; when performing optical character recognition on the slice images generated from the Name column, only the recognition algorithm corresponding to Chinese characters needs to be used.
  • the elements of the Amount column are all floating point data types. In the process of performing optical character recognition on the slice image generated by the amount column, it is only necessary to perform character recognition using the recognition algorithm corresponding to the floating point type data type.
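Because every element in a column shares one data type, a recognized string can be checked against (or the recognizer restricted to) that single type. A sketch with assumed, illustrative patterns (not taken from the patent):

```python
import re

# One validator per column data type; restricting recognition (or
# post-checking its output) to the column's single type is what lets
# the per-column approach narrow the search space.
COLUMN_PATTERNS = {
    "name": re.compile(r"[\u4e00-\u9fff]+"),    # Chinese characters only
    "amount": re.compile(r"\d+(\.\d{1,2})?"),   # floating-point amounts
}

def matches_column_type(column_kind: str, text: str) -> bool:
    """True when a recognized string is plausible for its column."""
    return COLUMN_PATTERNS[column_kind].fullmatch(text) is not None
```

In practice such checks can flag implausible OCR output for a column before the string is recorded.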
  • The present invention also discloses another preferred embodiment, as shown in FIG. 2. After step S17, the following steps can also be provided:
  • Step S18 displaying an element input box corresponding to the first slice image
  • Step S19 Receive a character string input by the user in the element input box, compare the character string input by the user and the character string generated by performing optical character recognition on the first slice image, and issue a prompt if the two are inconsistent.
  • The first slice image is the slice image currently in the input state.
  • Based on the information processing method shown in FIG. 2, the user performs an entry operation on a certain slice image. If the character string input by the user does not match the character string generated by optical character recognition of that slice image, the user's input may be wrong, the recognition of the slice image may be wrong, or both. A prompt is issued so that the user can check again and ensure that the correct string is finally entered, which further reduces, and may even eliminate, errors in the entered data.
  • The specific manner of issuing the prompt may be, but is not limited to, adjusting the display color of the first slice image or generating a voice prompt.
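The comparison of step S19 reduces to a string equality check; whenever it fails, the device issues a prompt (a sketch; names are illustrative):

```python
def entry_matches(user_text: str, ocr_text: str) -> bool:
    """True when the string typed into the element input box agrees
    with the OCR result for the first slice image; on a mismatch the
    caller should prompt the user to re-check, since either side (or
    both) may be wrong."""
    return user_text.strip() == ocr_text.strip()
```

Stripping surrounding whitespace avoids spurious prompts caused by stray spaces in the OCR output rather than by a real disagreement.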
  • The following step may further be provided: adjusting the display effect of the first slice image so that its display effect differs from that of the other slice images.
  • Adjusting the slice image currently being entered to a different display effect allows the user to see clearly, among the many slice images, which one is to be entered.
  • For example, a red dashed border can be displayed around the first slice image so that the user can identify the first slice image more intuitively.
  • A local area of the form may suffer from unclear characters.
  • For this, the following steps may be provided: receiving a zoom instruction input by the user, and performing a corresponding scaling process on the first slice image in response to the zoom instruction.
  • If the element in the first slice image cannot be seen clearly, the user can input an enlargement instruction and the device enlarges the first slice image so that the contained element becomes visible; after entering the string in the element input box, the user can input a reduction instruction and the device reduces the first slice image, restoring it to its original size.
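The scaling response can be sketched as rescaling the slice's display rectangle about its top-left corner; a factor greater than 1 enlarges, and applying the reciprocal factor restores the original size (names are illustrative):

```python
def scale_area(area, factor):
    """Scale a slice image's display rectangle in response to a user
    zoom instruction, keeping the top-left corner fixed."""
    left, top, right, bottom = area
    return (left, top,
            left + (right - left) * factor,
            top + (bottom - top) * factor)
```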
  • the present invention also discloses an information processing apparatus for processing elements in a form.
  • the form is a digitized image, and may be a scanned copy of the ticket or an image obtained by photographing the ticket.
  • the elements in the form are distributed in N columns, and N is an integer greater than 1. The following description can be referred to in correspondence with the above description regarding the information processing method.
  • FIG. 3 is a schematic structural diagram of an information processing apparatus according to the present disclosure.
  • The information processing apparatus includes a column position area determining unit 1, a positioning element position area determining unit 2, an element number determining unit 3, an element position area determining unit 4, an image processing unit 5, a character recognition unit 6, and a storage unit 7.
  • the column location area determining unit 1 is configured to respectively determine a location area of each column in the form in the form.
  • the element number determining unit 3 is for determining the number of elements included in each column in the form.
  • The element position area determining unit 4 is configured to determine the location area of each element in the form by using the location area of each column in the form, the location area of the positioning element of each column in the form, and the number of elements included in each column.
  • the image processing unit 5 is configured to slice the form according to the position area of each element in the form to obtain a plurality of slice images.
  • Each slice image contains one element, and the number of slice images obtained by the slice process is consistent with the number of elements included in the form.
  • the character recognition unit 6 is configured to respectively perform optical character recognition on the slice image to obtain a character string included in the slice image.
  • the storage unit 7 is configured to record the obtained character string according to a preset rule.
  • The information processing apparatus disclosed by the present invention first determines the location area of each column in the form, the location area of the positioning element of each column, and the number of elements included in each column; based on the foregoing information it then determines the location area of each element in the form and slices the form accordingly, so that each element is divided into one slice image; finally, optical character recognition is performed on each slice image to obtain the character string it contains, and the string is recorded.
  • The information processing apparatus disclosed in the present invention divides each element in the form into one slice image and then performs optical character recognition separately on each slice image to acquire the character string it contains. Since a single optical character recognition operation targets only one element, the slice image can be matched against the various data types until its string is recognized, which reduces the error rate of the data.
  • the position area of the nth column within the form is calibrated by (a first coordinate value, a second coordinate value, a third coordinate value, a fourth coordinate value).
  • the first coordinate value is the distance between the left side of the nth column and the left side of the form; the second coordinate value is the distance between the top of the nth column and the upper side of the form; the third coordinate value is the distance between the right side of the nth column and the left side of the form; the fourth coordinate value is the distance between the bottom of the nth column and the upper side of the form.
  • the position area of the positioning element of the nth column within the form is calibrated by (a fifth coordinate value, a sixth coordinate value, a seventh coordinate value, an eighth coordinate value).
  • the fifth coordinate value is the distance between the left side of the cell containing the positioning element of the nth column and the left side of the form; the sixth coordinate value is the distance between the top of that cell and the upper side of the form; the seventh coordinate value is the distance between the right side of that cell and the left side of the form; the eighth coordinate value is the distance between the bottom of that cell and the upper side of the form.
  • the first to eighth coordinate values may also be configured as follows:
  • the first coordinate value is the distance between the left side of the nth column and the left side of the form; the second coordinate value is the distance between the top of the nth column and the upper side of the form; the third coordinate value is the distance between the right side of the nth column and the right side of the form; the fourth coordinate value is the distance between the bottom of the nth column and the lower side of the form.
  • the fifth coordinate value is the distance between the left side of the cell containing the positioning element of the nth column and the left side of the form; the sixth coordinate value is the distance between the top of that cell and the upper side of the form; the seventh coordinate value is the distance between the right side of that cell and the right side of the form; the eighth coordinate value is the distance between the bottom of that cell and the lower side of the form.
  • the character recognition unit 6 is configured to perform optical character recognition one by one on the slice images of one column, and then one by one on the slice images of another column, until the slice images of every column have undergone optical character recognition.
  • the slice images of the nth column are obtained by slicing the nth column of the form.
  • That is, the character recognition unit 6 performs optical character recognition one by one on the slice images produced from the elements of one column, and only after all the slice images of that column have been recognized does it perform optical character recognition one by one on the slice images produced from the elements of another column.
  • FIG. 4 is a schematic structural diagram of another information processing apparatus according to the present disclosure. Compared with the information processing apparatus shown in FIG. 3, it further includes a control unit 8 and a first processing unit 9.
  • the control unit 8 is configured to control the display interface to display an element input box corresponding to a first slice image, where the first slice image is the slice image currently in the entry state.
  • the first processing unit 9 is configured to receive the character string entered by the user in the element input box, compare it with the character string produced by optical character recognition of the first slice image, and issue a prompt if the two are inconsistent.
  • Compared with the information processing apparatus shown in FIG. 3, the apparatus shown in FIG. 4 lets the user perform an entry operation for a given slice image. If the character string entered by the user is inconsistent with the character string produced by optical character recognition of that slice image, either the user's entry contains an error, or the recognition of the slice image does, or both. A prompt is then issued so that the user checks again, ensuring that the correct character string is finally entered; this further reduces the probability of errors in the recorded data and may even eliminate them.
  • the prompt may be output by, but is not limited to, adjusting the display color of the first slice image or generating a voice prompt.
  • a second processing unit may be provided. After the element input box corresponding to the first slice image is displayed on the display interface, the second processing unit adjusts the display effect of the first slice image so that it differs from the display effect of the other slice images.
  • a third processing unit may further be provided in the above information processing apparatus.
  • the third processing unit is configured to receive a zoom instruction entered by the user and to scale the first slice image accordingly in response to the zoom instruction.
  • the information processing apparatus of the various embodiments in this specification can be implemented by a computer including a processor and a photoelectric recognizer. The photoelectric recognizer is configured to perform optical character recognition on each slice image to obtain the character string it contains, thereby implementing the character recognition unit 6 of this specification, and the processor is configured to execute the steps, other than the optical recognition step, of the information processing methods of the embodiments in this specification.
  • Alternatively, the information processing apparatus may be implemented by a computer including a memory, a photoelectric recognizer, and a processor. Besides storing the character strings recorded according to the preset rule, thereby implementing the storage unit of this specification, the memory also stores a program, and by executing the program the processor performs the steps of the information processing method in this specification other than the optical recognition step.


Abstract

An information processing method and device. The method first determines the position area of each column of a form within the form (S11), determines the position area of each column's positioning element within the form (S12), determines the number of elements included in each column of the form (S13), determines the position area of each element within the form (S14), slices the form according to the position area of each element within the form to obtain a plurality of slice images (S15), performs optical character recognition on each slice image to obtain the character string it contains (S16), and then records the obtained character strings according to a preset rule (S17). The method can acquire the elements in the form and can reduce the error rate of the data.

Description

Information Processing Method and Information Processing Device

Technical Field

The present invention belongs to the technical field of text recognition, and in particular relates to an information processing method and an information processing device.
Background Art

In the daily business of banks, massive numbers of forms need to be processed. At present, when digitizing paper materials, staff must manually enter large amounts of list-type information, for example the provident fund remittance registers handled during housing provident fund collection.

Such list-type information has the following characteristics: 1. the volume of data on a document is large; 2. the elements in the same column of a document have the same data type or content, while the elements in the same row differ in data type and content; 3. the elements in the same row have the same pixel height, and the elements in the same column have the same pixel width. Here, an element of a document is the character or character string located in one cell; an element can also be understood as one item of data, for example a user's name, a date, or an amount.

If staff enter list-type information row by row, they often have to switch input methods frequently and reposition their hands on the keyboard, which makes the operation cumbersome. Staff therefore usually enter list-type information column by column; but since the elements in the same column have the same data type and a similar visual appearance, this easily causes visual fatigue and errors such as misaligned input.

To reduce the labor intensity of the staff, a new way of processing list-type information has appeared: the entire form is intelligently recognized with a text recognition tool, and the data in the form is obtained and stored.

However, the applicant found that intelligently recognizing a form in this way yields data with a relatively high error rate. How to improve the way forms are processed so as to reduce the error rate of the data is therefore a problem that those skilled in the art urgently need to solve.
Summary of the Invention

In view of this, an object of the present invention is to provide an information processing method and an information processing device for acquiring the elements in a form while reducing the error rate of the acquired data.

To achieve the above object, the present invention provides the following technical solutions:

The present invention discloses an information processing method for processing elements in a form, where the form is a digitized image and the elements in the form are distributed in N columns, each element being one item of data and N being an integer greater than 1. The information processing method includes:

determining the position area of each column of the form within the form;

determining the position area, within the form, of the positioning element of each column, where the positioning element of the nth column is the uppermost element among the elements to be entered in the nth column, with n = 1, 2, ... N;

determining the number of elements included in each column of the form;

determining the position area of each element within the form, using the position area of each column within the form, the position area of each column's positioning element within the form, and the number of elements included in each column;

slicing the form according to the position area of each element within the form to obtain a plurality of slice images, where each slice image contains one element and the number of slice images obtained by slicing equals the number of elements contained in the form;

performing optical character recognition on each slice image to obtain the character string contained in the slice image;

recording the obtained character strings according to a preset rule.
Optionally, in the above information processing method, performing optical character recognition on the slice images specifically means: performing optical character recognition one by one on the slice images of one column, then one by one on the slice images of another column, until the slice images of every column have undergone optical character recognition; where the slice images of the nth column are obtained by slicing the nth column of the form.

Optionally, the above information processing method further includes: displaying an element input box corresponding to a first slice image, where the first slice image is the slice image currently in the entry state; receiving a character string entered by the user in the element input box; and comparing the character string entered by the user with the character string produced by optical character recognition of the first slice image, and issuing a prompt if the two are inconsistent.

Optionally, in the above information processing method, after the element input box corresponding to the first slice image is displayed, the method further includes: adjusting the display effect of the first slice image so that it differs from the display effect of the other slice images.

Optionally, the above information processing method further includes: receiving a zoom instruction entered by the user, and scaling the first slice image accordingly in response to the zoom instruction.

Optionally, in the above information processing method, the position area of the nth column within the form is calibrated by (a first coordinate value, a second coordinate value, a third coordinate value, a fourth coordinate value), where the first coordinate value is the distance between the left side of the nth column and the left side of the form, the second coordinate value is the distance between the top of the nth column and the upper side of the form, the third coordinate value is the distance between the right side of the nth column and the left side of the form, and the fourth coordinate value is the distance between the bottom of the nth column and the upper side of the form;

the position area of the positioning element of the nth column within the form is calibrated by (a fifth coordinate value, a sixth coordinate value, a seventh coordinate value, an eighth coordinate value), where the fifth coordinate value is the distance between the left side of the cell containing the positioning element of the nth column and the left side of the form, the sixth coordinate value is the distance between the top of that cell and the upper side of the form, the seventh coordinate value is the distance between the right side of that cell and the left side of the form, and the eighth coordinate value is the distance between the bottom of that cell and the upper side of the form.
The present invention also discloses an information processing device for processing elements in a form, where the form is a digitized image and the elements in the form are distributed in N columns, each element being one item of data and N being an integer greater than 1. The information processing device includes:

a column position area determining unit for determining the position area of each column of the form within the form;

a positioning element position area determining unit for determining the position area, within the form, of the positioning element of each column, where the positioning element of the nth column is the uppermost element among the elements to be entered in the nth column, with n = 1, 2, ... N;

an element number determining unit for determining the number of elements included in each column of the form;

an element position area determining unit for determining the position area of each element within the form, using the position area of each column within the form, the position area of each column's positioning element within the form, and the number of elements included in each column;

an image processing unit for slicing the form according to the position area of each element within the form to obtain a plurality of slice images, where each slice image contains one element and the number of slice images obtained by slicing equals the number of elements contained in the form;

a character recognition unit for performing optical character recognition on each slice image to obtain the character string contained in the slice image;

a storage unit for recording the obtained character strings according to a preset rule.
Optionally, in the above information processing device, the character recognition unit is specifically configured to: perform optical character recognition one by one on the slice images of one column, then one by one on the slice images of another column, until the slice images of every column have undergone optical character recognition; where the slice images of the nth column are obtained by slicing the nth column of the form.

Optionally, the above information processing device further includes: a control unit for controlling the display interface to display an element input box corresponding to a first slice image, where the first slice image is the slice image currently in the entry state; and a first processing unit for receiving the character string entered by the user in the element input box, comparing it with the character string produced by optical character recognition of the first slice image, and issuing a prompt if the two are inconsistent.

Optionally, the above information processing device further includes a second processing unit; after the display interface displays the element input box corresponding to the first slice image, the second processing unit adjusts the display effect of the first slice image so that it differs from the display effect of the other slice images.

Optionally, the above information processing device further includes a third processing unit; the third processing unit is configured to receive a zoom instruction entered by the user and to scale the first slice image accordingly in response to the zoom instruction.

It can thus be seen that the beneficial effects of the present invention are as follows:

The information processing method disclosed by the present invention first determines the position area of each column within the form, the position area of each column's positioning element within the form, and the number of elements included in each column; it then determines the position area of each element within the form from this information and slices the form accordingly, so that each element is cut into one slice image; it then performs optical character recognition on each slice image separately, obtains the character string contained in the slice image, and records it. Based on the disclosed method, each element of the form is divided into its own slice image and optical character recognition is subsequently performed on each slice image separately to acquire the character string it contains; since one optical character recognition operation targets only a single element, the slice image can be recognized against multiple data types until its character string is identified, which reduces the error rate of the data.
Brief Description of the Drawings

To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a flowchart of an information processing method disclosed by the present invention;

FIG. 2 is a flowchart of another information processing method disclosed by the present invention;

FIG. 3 is a schematic structural diagram of an information processing device disclosed by the present invention;

FIG. 4 is a schematic structural diagram of another information processing device disclosed by the present invention.
Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

The present invention discloses an information processing method for processing elements in a form. The form is a digitized image, which may be a scan of a document or a photograph of a document, and the elements in the form are distributed in N columns, N being an integer greater than 1.

The information processing method disclosed by the present invention can acquire the elements in a form while reducing the error rate of the acquired data.
Referring to FIG. 1, FIG. 1 is a flowchart of an information processing method disclosed by the present invention. The information processing method includes:

Step S11: determine the position area of each column of the form within the form.

In practice, the position area of each column within the form can be calibrated by the distances between that column and the four sides of the form. The user may determine the position areas of the columns manually and enter the data into the device running the method; alternatively, the device may measure them with existing distance-measuring software to determine the position area of each column within the form.

Step S12: determine the position area, within the form, of the positioning element of each column, where the positioning element of the nth column is the uppermost element among the elements to be entered in the nth column, with n = 1, 2, ... N.

In practice, the position area of each column's positioning element can be calibrated by the distances between the cell containing the positioning element and the four sides of the form. It should be noted that the cell containing the positioning element may be visible to the user or invisible (the cell border is colorless). The user may determine the position areas of the positioning elements manually and enter the data into the device running the method; alternatively, the device may measure them with existing distance-measuring software.
Step S13: determine the number of elements included in each column of the form.

Step S14: determine the position area of each element within the form, using the position area of each column within the form, the position area of each column's positioning element within the form, and the number of elements included in each column.

Take the nth column of the form as an example: from the position area of the nth column within the form and the position area of its positioning element within the form, the total height of all elements in the nth column can be determined; since the number of elements contained in the nth column is already known, the average height of each element in the column can be computed. Then, from the position of the positioning element of the nth column and the average element height, the position area of every element in the nth column within the form can be determined.
Step S15: slice the form according to the position area of each element within the form to obtain a plurality of slice images, where each slice image contains one element and the number of slice images obtained by slicing equals the number of elements contained in the form.

After step S14, the position area of every element of the form within the form has been determined; slicing the form according to those position areas cuts each element into one slice image. That is, M slice images are obtained, where M equals the number of elements contained in the form and each slice image contains one element.
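As an illustrative sketch of this slicing step (not the patented implementation), the function below treats the form as a list of pixel rows, a stand-in assumption for real bitmap data; each element's position area (left, top, right, bottom) is cut out as one slice image, so the number of slices equals the number of elements:

```python
def slice_form(pixels, element_boxes):
    """Cut the form into one slice image per element position area.

    pixels: the digitized form as a list of pixel rows (illustrative stand-in
            for real image data; a real system would crop a bitmap instead).
    element_boxes: (left, top, right, bottom) pixel coordinates, one per element.
    """
    slices = []
    for left, top, right, bottom in element_boxes:
        # Keep only the rows and columns covered by this element's area.
        slices.append([row[left:right] for row in pixels[top:bottom]])
    return slices
```

With M element boxes the call returns M slice images, matching the statement above that the slice count equals the element count.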
Step S16: perform optical character recognition (OCR) on each slice image to obtain the character string contained in the slice image.

In the prior art, a text recognition tool performs overall intelligent recognition of the form. Because the form contains elements of multiple data types, recognizing elements of multiple data types at the same time inevitably lowers the recognition rate, and the resulting data is prone to errors.

In the present invention, by contrast, optical character recognition is performed on slice images, and one slice image contains only one element. Since one optical character recognition operation targets only a single element, the slice image can be recognized against multiple data types until the character string it contains is identified; compared with the overall recognition of the prior art, this reduces the error rate of the data.
Step S17: record the obtained character strings according to a preset rule.

In practice, an obtained character string may be recorded at a specific position of a preset table, the specific position being determined by the position area, within the form, of the element containing that character string.
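A minimal sketch of one such preset rule, under the assumption that the output table is a dictionary keyed by (row, column): the target cell is derived from the slice's position area, which the method has already computed:

```python
def record_strings(ocr_results, column_regions, first_row_top, row_height):
    """Record each recognized string at the table cell implied by its
    element's position area in the form.

    ocr_results:    list of ((left, top, right, bottom), string) pairs.
    column_regions: (left, top, right, bottom) of each column of the form.
    first_row_top:  distance from the form's upper side to the first data row.
    row_height:     average element height.
    """
    table = {}
    for (left, top, right, bottom), text in ocr_results:
        # Column index: the column region that horizontally contains the slice.
        col = next(i for i, (l, _, r, _) in enumerate(column_regions)
                   if l <= left and right <= r)
        # Row index: vertical offset from the first row, in row heights.
        row = round((top - first_row_top) / row_height)
        table[(row, col)] = text
    return table
```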
The information processing method disclosed by the present invention first determines the position area of each column within the form, the position area of each column's positioning element within the form, and the number of elements included in each column; it then determines the position area of each element within the form from this information and slices the form accordingly, so that each element is cut into one slice image; it then performs optical character recognition on each slice image separately, obtains the character string contained in the slice image, and records it. Based on the disclosed method, each element of the form is divided into its own slice image and optical character recognition is subsequently performed on each slice image separately; since one optical character recognition operation targets only a single element, the slice image can be recognized against multiple data types until its character string is identified, which reduces the error rate of the data.
In practice, the position area of the nth column within the form is calibrated by (a first coordinate value, a second coordinate value, a third coordinate value, a fourth coordinate value), where the first coordinate value is the distance between the left side of the nth column and the left side of the form, the second coordinate value is the distance between the top of the nth column and the upper side of the form, the third coordinate value is the distance between the right side of the nth column and the left side of the form, and the fourth coordinate value is the distance between the bottom of the nth column and the upper side of the form.

The position area of the positioning element of the nth column within the form is calibrated by (a fifth coordinate value, a sixth coordinate value, a seventh coordinate value, an eighth coordinate value), where the fifth coordinate value is the distance between the left side of the cell containing the positioning element of the nth column and the left side of the form, the sixth coordinate value is the distance between the top of that cell and the upper side of the form, the seventh coordinate value is the distance between the right side of that cell and the left side of the form, and the eighth coordinate value is the distance between the bottom of that cell and the upper side of the form.

Of course, the above is only one way of calibrating the position areas of the columns and of their positioning elements. In practice, the first to eighth coordinate values may also be configured as follows:

the first coordinate value is the distance between the left side of the nth column and the left side of the form, the second coordinate value is the distance between the top of the nth column and the upper side of the form, the third coordinate value is the distance between the right side of the nth column and the right side of the form, and the fourth coordinate value is the distance between the bottom of the nth column and the lower side of the form;

the fifth coordinate value is the distance between the left side of the cell containing the positioning element of the nth column and the left side of the form, the sixth coordinate value is the distance between the top of that cell and the upper side of the form, the seventh coordinate value is the distance between the right side of that cell and the right side of the form, and the eighth coordinate value is the distance between the bottom of that cell and the lower side of the form.
The process of determining the position areas of the elements in one column of the form is explained below with an example:

Suppose the position area of the first column of the form is (LeftA, TopA, RightA, BottomA) and the position area of the positioning element of the first column is (LeftA1, TopA1, RightA1, BottomA1), where LeftA1 equals LeftA and RightA1 equals RightA. The maximum number of rows of this column is MaxColumn, that is, the first column contains MaxColumn elements.

It can then be determined that the total height of the MaxColumn rows of elements in the first column is BottomA - TopA1, and that the average element height Height is (BottomA - TopA1)/MaxColumn. Then, from the position area of the positioning element of the first column and the average element height, the position areas of the MaxColumn rows of elements in the first column can be determined; specifically:
the position area of element A1 in the first row of the first column is

(LeftA, TopA1, RightA, TopA1 + Height);

the position area of element A2 in the second row of the first column is

(LeftA, TopA1 + Height, RightA, TopA1 + 2*Height);

...;

the position area of element Amax in the MaxColumn-th row of the first column is

(LeftA, TopA1 + Height*(MaxColumn - 1), RightA, TopA1 + Height*MaxColumn).
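The computation above can be expressed compactly in code; the following is only an illustration of the stated formulas, with parameter names mirroring LeftA, RightA, BottomA, TopA1, and MaxColumn from the example:

```python
def element_regions(left_a, right_a, bottom_a, top_a1, max_column):
    """Position areas of the max_column elements of one column.

    Implements: Height = (BottomA - TopA1) / MaxColumn, and the i-th element
    (0-based) spans TopA1 + i*Height .. TopA1 + (i+1)*Height vertically.
    """
    height = (bottom_a - top_a1) / max_column  # average element height
    return [(left_a, top_a1 + i * height, right_a, top_a1 + (i + 1) * height)
            for i in range(max_column)]
```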
As a preferred scheme, after the position areas of the elements in a column are determined, the position area of each element is marked on the form with a rectangle of a specific color, for example a red dashed rectangle.

From the marked rectangles, the user can judge at a glance whether the computed position areas of the elements match their actual position areas. If there is a deviation between a computed position area and the actual one, the user can manually adjust the position areas of the columns and of the positioning elements of the columns.
Still taking the first column of the form as an example: the position area of the first column is adjusted to (LeftA', TopA', RightA', BottomA'), and the position area of the positioning element of the first column is adjusted to (LeftA', TopA1', RightA', BottomA1'), where LeftA' is the adjusted distance between the left side of the first column and the left side of the form, TopA' is the adjusted distance between the top of the first column and the upper side of the form, RightA' is the adjusted distance between the right side of the first column and the left side of the form, BottomA' is the adjusted distance between the bottom of the first column and the upper side of the form, TopA1' is the adjusted distance between the top of the cell containing the positioning element of the first column and the upper side of the form, and BottomA1' is the adjusted distance between the bottom of that cell and the upper side of the form.

In this case, the total height of the MaxColumn rows of elements in the first column is BottomA' - TopA1', and the average element height is

Height' = (BottomA' - TopA1')/MaxColumn.

Then, from the adjusted position area of the positioning element of the first column and the average element height, the position areas of the MaxColumn rows of elements in the first column can be determined; specifically:

the position area of element A1 in the first row of the first column is

(LeftA', TopA1', RightA', TopA1' + Height');

the position area of element A2 in the second row of the first column is

(LeftA', TopA1' + Height', RightA', TopA1' + 2*Height');

...;

the position area of element Amax in the MaxColumn-th row of the first column is

(LeftA', TopA1' + Height'*(MaxColumn - 1), RightA', TopA1' + Height'*MaxColumn).
In the information processing method shown in FIG. 1 of the present invention, the optical character recognition of the slice images in step S16 is preferably performed as follows:

perform optical character recognition one by one on the slice images of one column, then one by one on the slice images of another column, until the slice images of every column have undergone optical character recognition, where the slice images of the nth column are obtained by slicing the nth column of the form.

That is, optical character recognition is performed one by one on the slice images produced from the elements of one column, and only after all the slice images of that column have been recognized is it performed one by one on the slice images produced from the elements of another column.

Since the elements in the same column of the form have the same data type, and some elements in the same column even have the same content, performing optical character recognition on the slice images of one column in a single pass effectively constrains the character recognition range, which improves the recognition rate and also shortens the time consumed by the recognition operation.

For example, the elements of a name column are all of the Chinese-character data type. When performing optical character recognition on the slice images produced from the name column, only the recognition algorithm corresponding to the Chinese-character data type needs to be used.

For example, the elements of an amount column are all of the floating-point data type. When performing optical character recognition on the slice images produced from the amount column, only the recognition algorithm corresponding to the floating-point data type needs to be used.
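As a hedged illustration of restricting recognition to one column's data type, the sketch below builds a Tesseract-style character whitelist per column. The column names and character sets here are hypothetical, and the pytesseract call is shown only in a comment, since the patent does not name a specific OCR engine:

```python
# Hypothetical mapping from column data type to the characters OCR may output.
COLUMN_CHARSETS = {
    "amount": "0123456789.,",  # floating-point amounts
    "date": "0123456789-/",    # dates
}

def tesseract_config(column_type):
    """Build a Tesseract option string limiting recognition to one data type."""
    charset = COLUMN_CHARSETS[column_type]
    # --psm 7: treat the slice as a single line of text.
    return "--psm 7 -c tessedit_char_whitelist=" + charset

# Hypothetical usage with the pytesseract package (not executed here):
# text = pytesseract.image_to_string(slice_image, config=tesseract_config("amount"))
```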
On the basis of the information processing method disclosed above, the present invention also discloses another preferred scheme, as shown in FIG. 2. After step S17, the following steps may also be provided:

Step S18: display an element input box corresponding to a first slice image;

Step S19: receive the character string entered by the user in the element input box, compare the character string entered by the user with the character string produced by optical character recognition of the first slice image, and issue a prompt if the two are inconsistent.

Here, the first slice image is the slice image currently in the entry state.

Based on the information processing method shown in FIG. 2, the user performs an entry operation for a given slice image. If the character string entered by the user is inconsistent with the character string produced by optical character recognition of that slice image, either the user's entry contains an error, or the recognition of the slice image does, or both. A prompt is then issued so that the user checks again, ensuring that the correct character string is finally entered; this further reduces the probability of errors in the recorded data and may even eliminate them.

In practice, the prompt may be output by, but is not limited to, adjusting the display color of the first slice image or generating a voice prompt.
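The comparison step of S19 can be sketched as follows; the message wording is an illustrative assumption, and a real device would surface the prompt by recoloring the first slice image or by voice, as described above:

```python
def check_entry(user_text, ocr_text):
    """Compare the user's entry with the OCR result for the same slice image.

    Returns None when the two strings agree, otherwise a prompt message asking
    the user to re-check (either the entry or the recognition may be wrong).
    """
    if user_text.strip() == ocr_text.strip():
        return None
    return "Mismatch: entered %r, recognized %r, please re-check." % (user_text, ocr_text)
```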
In addition, on the basis of the information processing method shown in FIG. 2, after the element input box corresponding to the first slice image is displayed, the following step may also be provided: adjust the display effect of the first slice image so that it differs from the display effect of the other slice images.

During manual entry, the slice image on which the user is about to perform the entry operation is given a different display effect, so that the user can pick it out more intuitively among the many slice images.

Considering that forms are mostly black text on a white background, as one implementation a red dashed line may be displayed around the periphery of the first slice image so that the user can see it more intuitively.

Moreover, characters in a local region of the form may be unclear. To make it easy for the user to inspect that region, the following steps may be provided on the basis of the method shown in FIG. 2: receive a zoom instruction entered by the user, and scale the first slice image accordingly in response to the zoom instruction.

If the characters in the first slice image are unclear, the user can enter a zoom-in instruction and the device enlarges the first slice image so that the element it contains can be seen clearly. Once the user has entered a character string in the element input box and that string is consistent with the string produced by optical character recognition of the first slice image, the user can enter a zoom-out instruction and the device shrinks the first slice image back to its original size.
The present invention also discloses an information processing device for processing elements in a form. The form is a digitized image, which may be a scan of a document or a photograph of a document, and the elements in the form are distributed in N columns, N being an integer greater than 1. The description below corresponds to, and may be read together with, the description of the information processing method above.

Referring to FIG. 3, FIG. 3 is a schematic structural diagram of an information processing device disclosed by the present invention. The information processing device includes a column position area determining unit 1, a positioning element position area determining unit 2, an element number determining unit 3, an element position area determining unit 4, an image processing unit 5, a character recognition unit 6, and a storage unit 7.

The column position area determining unit 1 is configured to determine the position area of each column of the form within the form.

The positioning element position area determining unit 2 is configured to determine the position area, within the form, of the positioning element of each column, where the positioning element of the nth column is the uppermost element among the elements to be entered in the nth column, with n = 1, 2, ... N.

The element number determining unit 3 is configured to determine the number of elements included in each column of the form.

The element position area determining unit 4 is configured to determine the position area of each element within the form, using the position area of each column within the form, the position area of each column's positioning element within the form, and the number of elements included in each column.

The image processing unit 5 is configured to slice the form according to the position area of each element within the form to obtain a plurality of slice images, where each slice image contains one element and the number of slice images obtained by slicing equals the number of elements contained in the form.

The character recognition unit 6 is configured to perform optical character recognition on each slice image to obtain the character string contained in the slice image.

The storage unit 7 is configured to record the obtained character strings according to a preset rule.
The information processing device disclosed by the present invention first determines the position area of each column within the form, the position area of each column's positioning element within the form, and the number of elements included in each column; it then determines the position area of each element within the form from this information and slices the form accordingly, so that each element is cut into one slice image; it then performs optical character recognition on each slice image separately, obtains the character string contained in the slice image, and records it. Because the device divides each element of the form into its own slice image, and one optical character recognition operation targets only a single element, each slice image can be recognized against multiple data types until its character string is identified, which reduces the error rate of the data.
In practice, the position area of the nth column within the form is calibrated by (a first coordinate value, a second coordinate value, a third coordinate value, a fourth coordinate value), where the first coordinate value is the distance between the left side of the nth column and the left side of the form, the second coordinate value is the distance between the top of the nth column and the upper side of the form, the third coordinate value is the distance between the right side of the nth column and the left side of the form, and the fourth coordinate value is the distance between the bottom of the nth column and the upper side of the form.

The position area of the positioning element of the nth column within the form is calibrated by (a fifth coordinate value, a sixth coordinate value, a seventh coordinate value, an eighth coordinate value), where the fifth coordinate value is the distance between the left side of the cell containing the positioning element of the nth column and the left side of the form, the sixth coordinate value is the distance between the top of that cell and the upper side of the form, the seventh coordinate value is the distance between the right side of that cell and the left side of the form, and the eighth coordinate value is the distance between the bottom of that cell and the upper side of the form.

Of course, the above is only one way of calibrating the position areas of the columns and of their positioning elements. In practice, the first to eighth coordinate values may also be configured as follows:

the first coordinate value is the distance between the left side of the nth column and the left side of the form, the second coordinate value is the distance between the top of the nth column and the upper side of the form, the third coordinate value is the distance between the right side of the nth column and the right side of the form, and the fourth coordinate value is the distance between the bottom of the nth column and the lower side of the form;

the fifth coordinate value is the distance between the left side of the cell containing the positioning element of the nth column and the left side of the form, the sixth coordinate value is the distance between the top of that cell and the upper side of the form, the seventh coordinate value is the distance between the right side of that cell and the right side of the form, and the eighth coordinate value is the distance between the bottom of that cell and the lower side of the form.
As a preferred mode, the character recognition unit 6 is specifically configured to: perform optical character recognition one by one on the slice images of one column, then one by one on the slice images of another column, until the slice images of every column have undergone optical character recognition, where the slice images of the nth column are obtained by slicing the nth column of the form.

That is, the character recognition unit 6 performs optical character recognition one by one on the slice images produced from the elements of one column, and only after all the slice images of that column have been recognized does it perform optical character recognition one by one on the slice images produced from the elements of another column.

Since the elements in the same column of the form have the same data type, and some elements in the same column even have the same content, performing optical character recognition on the slice images of one column in a single pass effectively constrains the character recognition range, which improves the recognition rate and also shortens the time consumed by the recognition operation.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of another information processing device disclosed by the present invention. Compared with the information processing device shown in FIG. 3, it further includes a control unit 8 and a first processing unit 9.

The control unit 8 is configured to control the display interface to display an element input box corresponding to a first slice image, where the first slice image is the slice image currently in the entry state.

The first processing unit 9 is configured to receive the character string entered by the user in the element input box, compare it with the character string produced by optical character recognition of the first slice image, and issue a prompt if the two are inconsistent.

Compared with the information processing device shown in FIG. 3, the device shown in FIG. 4 lets the user perform an entry operation for a given slice image. If the character string entered by the user is inconsistent with the character string produced by optical character recognition of that slice image, either the user's entry contains an error, or the recognition of the slice image does, or both. A prompt is then issued so that the user checks again, ensuring that the correct character string is finally entered; this further reduces the probability of errors in the recorded data and may even eliminate them.

In practice, the prompt may be output by, but is not limited to, adjusting the display color of the first slice image or generating a voice prompt.
As a preferred scheme, a second processing unit may also be provided on the basis of the information processing device shown in FIG. 4. After the display interface displays the element input box corresponding to the first slice image, the second processing unit adjusts the display effect of the first slice image so that it differs from the display effect of the other slice images.

In addition, a third processing unit may further be provided in the above information processing device. The third processing unit is configured to receive a zoom instruction entered by the user and to scale the first slice image accordingly in response to the zoom instruction.

As understood by those skilled in the art, the information processing device of the various embodiments in this specification can be implemented by a computer including a processor and a photoelectric recognizer. The photoelectric recognizer is configured to perform optical character recognition on each slice image to obtain the character string it contains, thereby implementing the character recognition unit 6 of this specification, and the processor is configured to execute the steps, other than the optical recognition step, of the information processing methods of the various embodiments in this specification. Alternatively, the information processing device may be implemented by a computer including a memory, a photoelectric recognizer, and a processor; besides storing the character strings recorded according to the preset rule, thereby implementing the storage unit of this specification, the memory also stores a program, and by executing the program the processor performs the steps of the information processing method in this specification other than the optical recognition step.
Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.

The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the identical or similar parts of the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant parts may be found in the description of the method.

The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (11)

  1. An information processing method for processing elements in a form, wherein the form is a digitized image and the elements in the form are distributed in N columns, each element being one item of data and N being an integer greater than 1, the information processing method comprising:
    determining the position area of each column of the form within the form;
    determining the position area, within the form, of the positioning element of each column, wherein the positioning element of the nth column is the uppermost element among the elements to be entered in the nth column, with n = 1, 2, ... N;
    determining the number of elements included in each column of the form;
    determining the position area of each element within the form, using the position area of each column within the form, the position area of each column's positioning element within the form, and the number of elements included in each column;
    slicing the form according to the position area of each element within the form to obtain a plurality of slice images, wherein each slice image contains one element and the number of slice images obtained by slicing equals the number of elements contained in the form;
    performing optical character recognition on each slice image to obtain the character string contained in the slice image;
    recording the obtained character strings according to a preset rule.
  2. The method according to claim 1, wherein performing optical character recognition on the slice images specifically comprises:
    performing optical character recognition one by one on the slice images of one column, then one by one on the slice images of another column, until the slice images of every column have undergone optical character recognition;
    wherein the slice images of the nth column are obtained by slicing the nth column of the form.
  3. The method according to claim 1 or 2, wherein the method further comprises:
    displaying an element input box corresponding to a first slice image, wherein the first slice image is the slice image currently in the entry state;
    receiving a character string entered by the user in the element input box, comparing the character string entered by the user with the character string produced by optical character recognition of the first slice image, and issuing a prompt if the two are inconsistent.
  4. The method according to claim 3, wherein, after the element input box corresponding to the first slice image is displayed, the method further comprises:
    adjusting the display effect of the first slice image so that the display effect of the first slice image differs from the display effect of the other slice images.
  5. The method according to claim 4, wherein the method further comprises:
    receiving a zoom instruction entered by the user, and scaling the first slice image accordingly in response to the zoom instruction.
  6. The method according to claim 1, wherein:
    the position area of the nth column within the form is calibrated by (a first coordinate value, a second coordinate value, a third coordinate value, a fourth coordinate value), wherein the first coordinate value is the distance between the left side of the nth column and the left side of the form, the second coordinate value is the distance between the top of the nth column and the upper side of the form, the third coordinate value is the distance between the right side of the nth column and the left side of the form, and the fourth coordinate value is the distance between the bottom of the nth column and the upper side of the form;
    the position area of the positioning element of the nth column within the form is calibrated by (a fifth coordinate value, a sixth coordinate value, a seventh coordinate value, an eighth coordinate value), wherein the fifth coordinate value is the distance between the left side of the cell containing the positioning element of the nth column and the left side of the form, the sixth coordinate value is the distance between the top of that cell and the upper side of the form, the seventh coordinate value is the distance between the right side of that cell and the left side of the form, and the eighth coordinate value is the distance between the bottom of that cell and the upper side of the form.
  7. An information processing device for processing elements in a form, wherein the form is a digitized image and the elements in the form are distributed in N columns, each element being one item of data and N being an integer greater than 1, the information processing device comprising:
    a column position area determining unit for determining the position area of each column of the form within the form;
    a positioning element position area determining unit for determining the position area, within the form, of the positioning element of each column, wherein the positioning element of the nth column is the uppermost element among the elements to be entered in the nth column, with n = 1, 2, ... N;
    an element number determining unit for determining the number of elements included in each column of the form;
    an element position area determining unit for determining the position area of each element within the form, using the position area of each column within the form, the position area of each column's positioning element within the form, and the number of elements included in each column;
    an image processing unit for slicing the form according to the position area of each element within the form to obtain a plurality of slice images, wherein each slice image contains one element and the number of slice images obtained by slicing equals the number of elements contained in the form;
    a character recognition unit for performing optical character recognition on each slice image to obtain the character string contained in the slice image;
    a storage unit for recording the obtained character strings according to a preset rule.
  8. The information processing device according to claim 7, wherein the character recognition unit is specifically configured to:
    perform optical character recognition one by one on the slice images of one column, then one by one on the slice images of another column, until the slice images of every column have undergone optical character recognition;
    wherein the slice images of the nth column are obtained by slicing the nth column of the form.
  9. The information processing device according to claim 7 or 8, further comprising:
    a control unit for controlling the display interface to display an element input box corresponding to a first slice image, wherein the first slice image is the slice image currently in the entry state;
    a first processing unit for receiving the character string entered by the user in the element input box, comparing the character string entered by the user with the character string produced by optical character recognition of the first slice image, and issuing a prompt if the two are inconsistent.
  10. The information processing device according to claim 9, further comprising a second processing unit; after the display interface displays the element input box corresponding to the first slice image, the second processing unit adjusts the display effect of the first slice image so that the display effect of the first slice image differs from the display effect of the other slice images.
  11. The information processing device according to claim 10, further comprising a third processing unit; the third processing unit is configured to receive a zoom instruction entered by the user and to scale the first slice image accordingly in response to the zoom instruction.
PCT/CN2015/098836 2015-11-12 2015-12-25 Information processing method and information processing device WO2016188104A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP15893172.5A EP3147825A4 (en) 2015-11-12 2015-12-25 Information processing method and information processing device
SG11201610723SA SG11201610723SA (en) 2015-11-12 2015-12-25 Information processing method and information processing device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510771802.7 2015-11-12
CN201510771802.7A CN105373791B (zh) 2015-11-12 Information processing method and information processing device

Publications (1)

Publication Number Publication Date
WO2016188104A1 true WO2016188104A1 (zh) 2016-12-01

Family

ID=55375974

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/098836 WO2016188104A1 (zh) 2015-11-12 2015-12-25 信息处理方法及信息处理装置

Country Status (4)

Country Link
EP (1) EP3147825A4 (zh)
CN (1) CN105373791B (zh)
SG (1) SG11201610723SA (zh)
WO (1) WO2016188104A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106775786A (zh) * 2017-03-23 2017-05-31 北京赛迈特锐医疗科技有限公司 System and method for optimizing a complex information entry interface
CN109344831B (zh) * 2018-08-22 2024-04-05 中国平安人寿保险股份有限公司 Data table recognition method and device, and terminal device
CN111104853A (zh) * 2019-11-11 2020-05-05 中国建设银行股份有限公司 Image information entry method and device, electronic device, and storage medium
CN111401365B (zh) * 2020-03-17 2024-03-22 海尔优家智能科技(北京)有限公司 Method and device for automatically generating OCR images

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1877598A (zh) * 2005-06-06 2006-12-13 英华达(上海)电子有限公司 Method for collecting and entering business card information on a mobile phone using image recognition
CN102156855A (zh) * 2011-03-30 2011-08-17 信雅达系统工程股份有限公司 Bank voucher data collection method based on image cutting
CN102567764A (zh) * 2012-01-13 2012-07-11 中国工商银行股份有限公司 Bill voucher and system for improving the efficiency of electronic image recognition
CN103020619A (zh) * 2012-12-05 2013-04-03 上海合合信息科技发展有限公司 Method for automatically segmenting handwritten entries in a digitized notebook

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004015619A1 (ja) * 2002-08-07 2004-02-19 Matsushita Electric Industrial Co., Ltd. Character recognition processing device, character recognition processing method, and portable terminal device
JP2007279828A (ja) * 2006-04-03 2007-10-25 Toshiba Corp Form processing device, form format creation device, form, form processing program, and form format creation program
JP5321109B2 (ja) * 2009-02-13 2013-10-23 富士ゼロックス株式会社 Information processing device and information processing program
JP5465015B2 (ja) * 2010-01-06 2014-04-09 キヤノン株式会社 Device and method for digitizing documents
CN101923643B (zh) * 2010-08-11 2012-11-21 中科院成都信息技术有限公司 General table recognition method
CN104636117A (zh) * 2013-11-12 2015-05-20 江苏奥博洋信息技术有限公司 Automatic segmentation method for table images
CN104462044A (zh) * 2014-12-16 2015-03-25 上海合合信息科技发展有限公司 Table image recognition and editing method and device


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3147825A4 *

Also Published As

Publication number Publication date
CN105373791A (zh) 2016-03-02
CN105373791B (zh) 2018-12-14
SG11201610723SA (en) 2017-03-30
EP3147825A4 (en) 2017-08-30
EP3147825A1 (en) 2017-03-29

Similar Documents

Publication Publication Date Title
US20170351708A1 (en) Automated data extraction from scatter plot images
US9785627B2 (en) Automated form fill-in via form retrieval
KR101304084B1 (ko) 제스처 기반의 선택적인 텍스트 인식
US11113464B2 (en) Synchronizing data-entry fields with corresponding image regions
WO2016188104A1 (zh) 信息处理方法及信息处理装置
TW201617971A (zh) 資訊識別方法及裝置
LU93203B1 (en) Systems, methods and devices for the automated verification and quality control and assurance of vehicle identification plates
EP3940589B1 (en) Layout analysis method, electronic device and computer program product
CN109598185B (zh) 图像识别翻译方法、装置、设备及可读存储介质
US20220415008A1 (en) Image box filtering for optical character recognition
WO2014086277A1 (zh) 方便电子化的专业笔记本及其页码自动识别方法
US11080464B2 (en) Correction techniques of overlapping digital glyphs
JP2015215889A (ja) リフロー型電子書籍生成方法及びウェブサイトシステム
US9396389B2 (en) Techniques for detecting user-entered check marks
WO2013039063A1 (ja) 答案処理装置、答案処理方法、記録媒体、およびシール
BE1026159B1 (fr) Système de traitement d’image et procede de traitement d’image
CN109145907B (zh) 基于常用字字频统计的文本图像倒置检测方法及装置
US9483834B1 (en) Object boundary detection in an image
US9607360B2 (en) Modifying the size of document content based on a pre-determined threshold value
US11055551B2 (en) Correction support device and correction support program for optical character recognition result
US10032073B1 (en) Detecting aspect ratios of document pages on smartphone photographs by learning camera view angles
CN111563511B (zh) 一种智能框题的方法、装置、电子设备及存储介质
WO2014086265A1 (zh) 一种方便电子化的专业笔记本及其电子化方法
Parker et al. Robust binarization of degraded document images using heuristics
JP5844698B2 (ja) 文字認識装置

Legal Events

Date Code Title Description
REEP Request for entry into the european phase

Ref document number: 2015893172

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2015893172

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15893172

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE