CN116469120A - Automatic data processing method and device for electric charge bill and storage medium - Google Patents


Publication number
CN116469120A
Authority
CN
China
Prior art keywords
bill
information
extraction
electric charge
separation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310634880.7A
Other languages
Chinese (zh)
Other versions
CN116469120B (en)
Inventor
周俊
蔡剑
姜志博
徐梦佳
林森
孙一申
胡茜
吕彬
季李昕
姚雅艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Zhejiang Electric Power Co Ltd
Marketing Service Center of State Grid Zhejiang Electric Power Co Ltd
Original Assignee
State Grid Zhejiang Electric Power Co Ltd
Marketing Service Center of State Grid Zhejiang Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Zhejiang Electric Power Co Ltd and Marketing Service Center of State Grid Zhejiang Electric Power Co Ltd
Priority to CN202310634880.7A
Publication of CN116469120A
Application granted
Publication of CN116469120B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/22 Character recognition characterised by the type of writing
    • G06V30/224 Character recognition characterised by the type of writing of printed characters having additional code marks or containing code marks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/414 Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an automatic data processing method, device and storage medium for electric charge bills. The method comprises the following steps: calling a corresponding first separation extraction layer according to a first bill format; extracting the text information of each extraction area based on OCR to obtain bill sub-information, and adding a corresponding area label to each piece of bill sub-information; extracting the bill sub-information of the associated area labels according to a preset comparison strategy to obtain a comparison label set, and verifying the bill sub-information based on a preset verification model to obtain a verification result; if the verification result is abnormal, taking the extraction area corresponding to the comparison label set as a problem area, adjusting the first separation extraction layer based on the problem area to obtain a second separation extraction layer, and combining the second separation extraction layer with the electric charge bill to obtain a feedback image for display; and if the verification result meets the requirement, filling the extracted bill sub-information into a preset electric charge table.

Description

Automatic data processing method and device for electric charge bill and storage medium
Technical Field
The present invention relates to data processing technologies, and in particular, to a method and apparatus for automatically processing electric charge bill data, and a storage medium.
Background
An electric charge bill is the document by which the power supply department notifies a user of that user's power consumption; its content reflects the electricity usage of the power-consuming enterprise over a given period.
Verification of the electric charge bill is a key step in determining whether the bill data has been deducted correctly, and it bears on the interests of both power-consuming enterprises and power transmission enterprises. In the prior art, electric charge bills are verified manually. Because the bills are numerous and many types of verification are required, manual verification is inefficient and error-prone.
Therefore, how to automatically extract and verify electric charge bill data according to the verification requirement, and thereby improve verification efficiency and accuracy, has become an urgent problem to be solved.
Disclosure of Invention
The invention overcomes the defects of the prior art, and provides an automatic data processing method, an automatic data processing device and a storage medium for electric charge bill, which can meet the multi-dimensional verification requirement, and automatically extract and verify the data of the electric charge bill by combining the verification requirement, thereby improving the verification efficiency and accuracy.
In order to solve the technical problems, the technical scheme of the invention is as follows:
The first aspect of the invention provides an automatic data processing method for an electric charge bill, comprising the following steps:
acquiring a first bill format of an electric charge bill, and calling a corresponding first separation extraction layer according to the first bill format, wherein the first separation extraction layer comprises a plurality of extraction areas with different extraction information;
combining the first separation extraction layer with the electric charge bill so that the electric charge bill is divided by the extraction areas, extracting the text information of each extraction area based on OCR to obtain bill sub-information, and adding a corresponding area label to each piece of bill sub-information;
extracting the bill sub-information of the associated area labels according to a preset comparison strategy to obtain comparison label sets, and verifying the bill sub-information based on a preset verification model to obtain a verification result, wherein each comparison label set has a corresponding preset verification model;
if the verification result is abnormal, taking the extraction area corresponding to the comparison label set as a problem area, adjusting the first separation extraction layer based on the problem area to obtain a second separation extraction layer, and combining the second separation extraction layer with the electric charge bill to obtain a feedback image for display;
and if the verification result meets the requirement, filling the extracted bill sub-information into a preset electric charge table.
Optionally, the acquiring a first bill format of the electric charge bill and calling a corresponding first separation extraction layer according to the first bill format, wherein the first separation extraction layer comprises a plurality of extraction areas with different extraction information, includes:
acquiring a first bill format added to an electric charge bill by a user, and calling a corresponding first separation and extraction layer according to the first bill format, wherein each bill format is correspondingly arranged with the corresponding separation and extraction layer;
and determining an extraction area corresponding to each first separation extraction layer, wherein the extraction areas are used as target areas for OCR recognition, and the first separation extraction layers comprise a plurality of extraction areas with different extraction information.
Optionally, the combining the first separation and extraction layer with the electric charge bill to divide the electric charge bill by the extraction areas, identifying text information of each extraction area based on OCR to obtain bill sub-information, and adding a corresponding area tag to each bill sub-information, including:
intercepting the electric charge bill to obtain an information extraction area image, and adjusting the specification of a first separation extraction layer according to the specification of the information extraction area image;
after judging that the specification of the information extraction area image corresponds to that of the first separation extraction layer, correspondingly combining the information extraction area image with the first separation extraction layer, and dividing the electric charge bill based on the extraction areas in the first separation extraction layer;
and identifying the text information of each extraction area based on OCR to obtain bill sub-information, and adding a corresponding area label to each piece of bill sub-information.
Optionally, the intercepting the electric charge bill to obtain an information extraction area image, and adjusting the specification of the first separation extraction layer according to the specification of the information extraction area image includes:
mapping the electric charge bill into a coordinate system (coordinate processing), determining all pixel points whose pixel values lie in a preset pixel interval, and taking the determined pixel points as first pixel points, wherein each first pixel point has a first coordinate;
connecting all adjacent first pixel points with the same abscissa in the first coordinates to obtain first vertical connecting lines, and connecting all adjacent first pixel points with the same ordinate in the first coordinates to obtain first horizontal connecting lines;
taking first vertical connecting lines whose number of first pixel points is greater than a first preset number as second vertical connecting lines, and taking first horizontal connecting lines whose number of first pixel points is greater than a second preset number as second horizontal connecting lines;
intercepting the electric charge bill according to the second vertical connecting lines and the second horizontal connecting lines to obtain an information extraction area image;
and taking the number of pixel points of the second vertical connecting lines and the number of pixel points of the second horizontal connecting lines in the information extraction area image as the specification of the information extraction area image, and adjusting the specification of the first separation extraction layer according to the second vertical connecting lines and the second horizontal connecting lines.
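The selection of second connecting lines by pixel count can be sketched as follows. This is a minimal illustration, not the patented implementation: the tuple representation of a line as `(fixed_coordinate, start, end)`, the function names, and the sample thresholds are all assumptions introduced here.

```python
# A connecting line is modeled as (fixed_coordinate, start, end), with start
# and end inclusive along the line's direction. This layout is an assumption.

def pixel_count(line):
    """Number of first pixel points on a connecting line."""
    _, start, end = line
    return end - start + 1

def select_second_lines(first_lines, preset_number):
    """Keep only the lines that have more pixel points than the preset number."""
    return [line for line in first_lines if pixel_count(line) > preset_number]

# Hypothetical first connecting lines: long table borders plus short strokes
# that belong to printed characters.
first_vertical = [(0, 0, 99), (37, 10, 14), (399, 0, 99)]       # x, y_start, y_end
first_horizontal = [(0, 0, 399), (50, 120, 126), (99, 0, 399)]  # y, x_start, x_end

second_vertical = select_second_lines(first_vertical, preset_number=50)
second_horizontal = select_second_lines(first_horizontal, preset_number=200)

# Specification of the information extraction area image, as pixel counts.
specification = (pixel_count(second_vertical[0]), pixel_count(second_horizontal[0]))
print(second_vertical, second_horizontal, specification)
```

The thresholds discard short strokes inside characters, so only table rules survive as second connecting lines.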
Optionally, the capturing the electric charge bill according to the second vertical connecting line and the second horizontal connecting line to obtain an information extraction area image includes:
determining, among the second vertical connecting lines, the second vertical connecting lines with the largest abscissa and the smallest abscissa respectively, to obtain a vertical connecting line interception group;
determining, among the second horizontal connecting lines, the second horizontal connecting lines with the largest ordinate and the smallest ordinate respectively, to obtain a horizontal connecting line interception group;
locating in the electric charge bill the second vertical connecting lines and second horizontal connecting lines that belong to the vertical connecting line interception group and the horizontal connecting line interception group, and forming a coordinate area interval from the maximum abscissa, the minimum abscissa, the maximum ordinate and the minimum ordinate;
and taking the area formed by the determined second vertical connecting lines, the determined second horizontal connecting lines and all pixel points within the coordinate area interval as the information extraction area image.
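The interception described above amounts to cropping the bill to the bounding box spanned by the outermost second connecting lines. The sketch below assumes the image is a list of pixel rows and that each line is a tuple `(fixed_coordinate, start, end)`; both conventions and the function name are illustrative assumptions.

```python
def crop_information_area(image, second_vertical, second_horizontal):
    """Crop the image to the box bounded by the outermost second connecting lines."""
    xs = [x for x, _, _ in second_vertical]    # abscissa of each vertical line
    ys = [y for y, _, _ in second_horizontal]  # ordinate of each horizontal line
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # The border lines themselves are kept, hence the inclusive upper bounds.
    return [row[x_min:x_max + 1] for row in image[y_min:y_max + 1]]

# A 6x5 toy image whose pixel value encodes its position as 10*y + x.
image = [[10 * y + x for x in range(6)] for y in range(5)]
vertical = [(1, 1, 3), (4, 1, 3)]     # x, y_start, y_end
horizontal = [(1, 1, 4), (3, 1, 4)]   # y, x_start, x_end
area = crop_information_area(image, vertical, horizontal)
print(area)
```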
Optionally, the taking the number of pixel points of the second vertical connecting lines and of the second horizontal connecting lines in the information extraction area image as the specification of the information extraction area image, and adjusting the specification of the first separation extraction layer according to the second vertical connecting lines and the second horizontal connecting lines, includes:
acquiring the number of pixels of a second vertical connecting line in the information extraction area image to obtain a first vertical point number specification, and acquiring the number of pixels of a second horizontal connecting line in the information extraction area image to obtain a first horizontal point number specification;
determining a third vertical connecting line corresponding to the second vertical connecting line and a third transverse connecting line corresponding to the second transverse connecting line in the first separation and extraction layer;
obtaining the number of pixels of a third vertical connecting line in the first separation and extraction layer to obtain a second vertical point number specification, and obtaining the number of pixels of a third horizontal connecting line in the first separation and extraction layer to obtain a second horizontal point number specification;
comparing the first vertical point number specification with the second vertical point number specification, and comparing the first horizontal point number specification with the second horizontal point number specification, to obtain an adjustment ratio;
and adjusting the specification of the first separation extraction layer according to the adjustment ratio.
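As a rough illustration of the adjustment ratio, the sketch below divides the image's point number specifications by those of the layer and rescales the layer's extraction areas. Using a separate ratio per axis, rounding to whole pixels, and the `(left, top, right, bottom)` box layout are assumptions made for this example; the text only states that a ratio is obtained by comparing the specifications.

```python
def adjustment_ratio(first_vertical, second_vertical, first_horizontal, second_horizontal):
    """Return (horizontal scale, vertical scale) from the point number specifications.

    first_*  : pixel counts measured on the information extraction area image
    second_* : pixel counts measured on the first separation extraction layer
    """
    # A horizontal line's pixel count measures width; a vertical line's, height.
    return first_horizontal / second_horizontal, first_vertical / second_vertical

def scale_areas(areas, scale_x, scale_y):
    """Rescale each extraction area (left, top, right, bottom) of the layer."""
    return {
        label: (round(left * scale_x), round(top * scale_y),
                round(right * scale_x), round(bottom * scale_y))
        for label, (left, top, right, bottom) in areas.items()
    }

# Hypothetical numbers: the photographed table is twice as large as the layer.
scale_x, scale_y = adjustment_ratio(200, 100, 800, 400)
layer_areas = {"unit_price": (200, 0, 300, 100), "amount": (300, 0, 400, 100)}
adjusted = scale_areas(layer_areas, scale_x, scale_y)
print(scale_x, scale_y, adjusted)
```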
Optionally, after determining that the information extraction area image corresponds to the specification of the first separation extraction layer, the information extraction area image and the first separation extraction layer are correspondingly combined, and the electric charge bill is divided based on the extraction area in the first separation extraction layer, including:
determining a first central pixel point of the information extraction area image and a second central pixel point of a first separation extraction layer;
overlapping the first central pixel point and the second central pixel point so as to enable the information extraction area image and the first separation extraction layer to be correspondingly combined;
and dividing the electric charge bill into areas based on the extraction areas in the first separation extraction layer.
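The center-based combination can be sketched as a translation that makes the two central pixel points coincide. The integer-division definition of the central pixel, the `(left, top, right, bottom)` box layout, and all names below are assumptions for illustration.

```python
def central_pixel(width, height):
    """Central pixel point of a width x height raster (integer coordinates)."""
    return width // 2, height // 2

def align_by_center(areas, layer_size, image_size):
    """Shift the layer's extraction areas so the layer center lands on the image center."""
    layer_cx, layer_cy = central_pixel(*layer_size)
    image_cx, image_cy = central_pixel(*image_size)
    dx, dy = image_cx - layer_cx, image_cy - layer_cy
    return {
        label: (left + dx, top + dy, right + dx, bottom + dy)
        for label, (left, top, right, bottom) in areas.items()
    }

# Hypothetical sizes: the image is slightly larger than the already scaled layer.
areas = {"last_month_reading": (0, 0, 100, 100), "amount": (300, 0, 400, 100)}
aligned = align_by_center(areas, layer_size=(400, 100), image_size=(404, 110))
print(aligned)
```

Once aligned, each shifted box marks the part of the bill image that belongs to one extraction area.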
Optionally, the extracting the bill sub-information of the associated area labels according to the preset comparison strategy to obtain comparison label sets, and verifying the bill sub-information based on a preset verification model to obtain a verification result, wherein each comparison label set has a corresponding preset verification model, includes:
sequentially extracting the sub-comparison strategies included in the preset comparison strategy, and determining a plurality of labels corresponding to each sub-comparison strategy, wherein the plurality of labels determined by each sub-comparison strategy are associated area labels;
generating an initial set corresponding to the sub-comparison strategy, sequentially extracting the bill sub-information of the corresponding extraction areas according to the plurality of labels, and filling the bill sub-information into the initial set to obtain a comparison label set;
determining a preset verification model corresponding to the sub-comparison strategy, and if the preset verification model is a comparison type model, inputting bill sub-information to corresponding input parameters in the preset verification model;
if the preset verification model is still established after the parameters are input, the verification result is normal;
if the preset verification model is not established after the parameters are input, the verification result is abnormal.
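A comparison type model can be pictured as a relation that either holds or fails once the bill sub-information is substituted in. The sketch below is only an assumed example: the relation "current month reading is not less than last month reading", the label names, and the sample values are hypothetical and do not come from the text.

```python
from decimal import Decimal

def build_label_set(sub_information, labels):
    """Fill an initial set with the bill sub-information of the associated labels."""
    return {label: sub_information[label] for label in labels}

def verify_comparison(label_set, model):
    """Return 'normal' if the relation still holds after the parameters are input."""
    values = {label: Decimal(text) for label, text in label_set.items()}
    return "normal" if model(values) else "abnormal"

def reading_not_decreasing(values):
    # Hypothetical comparison type model: a meter reading should not go down.
    return values["current_month_reading"] >= values["last_month_reading"]

sub_information = {
    "last_month_reading": "1200",
    "current_month_reading": "1350",
    "unit_price": "0.56",
}
labels = ("last_month_reading", "current_month_reading")
label_set = build_label_set(sub_information, labels)
result = verify_comparison(label_set, reading_not_decreasing)
print(result)
```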
Optionally, the method further comprises:
if the preset verification model is a calculation type model, classifying the bill sub-information to obtain calculation bill sub-information and verification bill sub-information, and inputting the calculation bill sub-information into the preset verification model to obtain calculation result information;
if the calculation result information corresponds to the verification bill sub-information, the verification result is normal;
and if the calculation result information does not correspond to the verification bill sub-information, the verification result is abnormal.
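For a calculation type model, the sub-information is split into inputs and a value to check against. The following sketch assumes a hypothetical model, amount = (current reading - last reading) x unit price, along with a small tolerance; the formula, the tolerance, and the label names are illustrative assumptions rather than details given in the text.

```python
from decimal import Decimal

def verify_calculation(sub_information, calculation_labels, verification_label,
                       model, tolerance=Decimal("0.01")):
    """Compute a result from the calculation sub-information and compare it
    with the verification sub-information."""
    calculation = {label: Decimal(sub_information[label]) for label in calculation_labels}
    expected = Decimal(sub_information[verification_label])
    result = model(calculation)
    return "normal" if abs(result - expected) <= tolerance else "abnormal"

def amount_model(values):
    # Hypothetical calculation type model for the billed amount.
    consumption = values["current_month_reading"] - values["last_month_reading"]
    return consumption * values["unit_price"]

sub_information = {
    "last_month_reading": "1200",
    "current_month_reading": "1350",
    "unit_price": "0.56",
    "amount": "84.00",
}
calculation_labels = ("last_month_reading", "current_month_reading", "unit_price")
result = verify_calculation(sub_information, calculation_labels, "amount", amount_model)
print(result)
```

Decimal arithmetic is used so that currency values such as 0.56 are compared without binary floating-point error.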
Optionally, the taking, if the verification result is abnormal, the extraction area corresponding to the comparison label set as a problem area, adjusting the first separation extraction layer based on the problem area to obtain a second separation extraction layer, and combining the second separation extraction layer with the electric charge bill to obtain a feedback image for display, includes:
determining all pixel points on the outline corresponding to the problem area in the first separation extraction layer as problem pixel points, and controlling the problem pixel points to be displayed with a second preset pixel value to obtain a second separation extraction layer;
combining the second separation extraction layer with the electric charge bill to obtain a feedback image, and generating a problem electric charge table according to the bill sub-information of the non-problem areas and the problem areas, wherein the problem electric charge table contains the bill sub-information of the non-problem areas and leaves the entries of the problem areas empty;
feeding back the second separation extraction layer and the problem electricity fee list to a user;
and correcting the electric charge bill according to the manual bill information filled in the empty problem area in the problem electric charge table by the user.
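The feedback step can be sketched in two parts: recoloring the outline of each problem area in the layer, and building a table whose problem entries are left empty for the user. The raster layout (list of rows), the half-open `(left, top, right, bottom)` boxes, the value 128 standing in for the second preset pixel value, and the use of `None` for an empty entry are all assumptions of this example.

```python
def contour_pixels(box):
    """Pixel points on the outline of a box (right and bottom are exclusive)."""
    left, top, right, bottom = box
    points = set()
    for x in range(left, right):
        points.update({(x, top), (x, bottom - 1)})
    for y in range(top, bottom):
        points.update({(left, y), (right - 1, y)})
    return points

def mark_problem_areas(layer, boxes, problem_labels, preset_value=128):
    """Return a copy of the layer with problem area outlines set to preset_value."""
    marked = [row[:] for row in layer]
    for label in problem_labels:
        for x, y in contour_pixels(boxes[label]):
            marked[y][x] = preset_value
    return marked

def problem_table(sub_information, problem_labels):
    """Keep non-problem entries and leave problem entries empty (None)."""
    return {label: (None if label in problem_labels else text)
            for label, text in sub_information.items()}

layer = [[0] * 6 for _ in range(5)]
boxes = {"amount": (1, 1, 5, 4)}
second_layer = mark_problem_areas(layer, boxes, {"amount"})
table = problem_table({"unit_price": "0.56", "amount": "48.00"}, {"amount"})
print(second_layer, table)
```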
Optionally, the correcting the electric charge bill according to the manual bill information filled in the empty problem area in the problem electric charge table by the user includes:
generating a display subgraph corresponding to the manual bill information, determining an image corresponding to the problem area in the electric bill, and adjusting the pixel value in the image corresponding to the problem area to be a third preset pixel value;
and overlapping the display subgraph with the problem area so that the numerical value corresponding to the manual bill information is positioned in the corresponding problem area.
In a second aspect of the present invention, there is provided an automatic data processing apparatus for an electric charge bill, comprising:
the calling module is used for obtaining a first bill format of the electric charge bill, calling a corresponding first separation and extraction layer according to the first bill format, wherein the first separation and extraction layer comprises a plurality of extraction areas with different extraction information;
the combination module is used for combining the first separation and extraction layer with the electric charge bill so that the electric charge bill is divided by the extraction areas, text information of each extraction area is extracted based on OCR to obtain bill sub-information, and corresponding area labels are added to each bill sub-information;
the matching module is used for extracting the bill sub-information of the associated area labels according to a preset comparison strategy to obtain comparison label sets, and verifying the bill sub-information based on a preset verification model to obtain a verification result, wherein each comparison label set has a corresponding preset verification model;
the feedback module is used for taking, if the verification result is abnormal, the extraction area corresponding to the comparison label set as a problem area, adjusting the first separation extraction layer based on the problem area to obtain a second separation extraction layer, and combining the second separation extraction layer with the electric charge bill to obtain a feedback image for display;
and the result module is used for filling the extracted bill sub-information into a preset electric charge table if the verification result meets the requirement.
In a third aspect of the invention, a storage medium is provided, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the method according to the first aspect.
The beneficial effects are as follows. 1. Drawing on the characteristics of electric charge bills, the scheme extracts the required data from the bill through the first separation extraction layer and the corresponding extraction strategy, adds a label to the extracted data, matches the corresponding preset verification model to the verification requirement, and uses that model to verify the extracted data in combination. In addition, when the verification result is abnormal, the scheme can determine the problem area and correct and replace the problem data using the user's feedback. The scheme can therefore meet multi-dimensional verification requirements, automatically extract and verify electric charge bill data according to those requirements, and improve verification efficiency and accuracy. It should be noted that the invention does not merely compute on numerical values: it uses the extraction area in which each value lies to attach the corresponding unit and attribute automatically, performs calculations according to those units and attributes together with the preset model, and adds calculation logic according to the preset relationships between values in different areas. This realizes correlated verification between values and thereby allows the accuracy of the electric charge data to be assessed.
2. During data extraction, the scheme calls the first separation extraction layer that matches the bill format and then adjusts the specification of that layer according to the information extraction area image, which improves the accuracy of data extraction. During specification adjustment, the scheme determines the outer edge lines in the information extraction area image from features such as the coordinates and pixel values of pixel points, and then adjusts the specification of the first separation extraction layer according to the ratio of pixel counts, so that each extraction area lies at the corresponding position and the data in that area is extracted more accurately.
3. When verifying bill data, the scheme provides two verification modes according to the verification requirement. One is comparison verification: the associated comparison data is found through the area labels, and the scheme then judges whether the comparison data meets the requirement to obtain a verification result. The other is calculation verification: the scheme classifies the extracted data into calculation data and verification data, computes a result with a calculation model, and compares the result with the verification data to obtain a verification result. The scheme can thus automatically extract and verify electric charge bill data according to the user's verification requirement, improving verification efficiency and accuracy. In addition, after the problem area is determined, the scheme generates a second separation extraction layer based on the problem area and feeds it back to the user together with the problem electric charge table; the electric charge bill is then corrected according to the manual bill sub-information that the user fills into the empty problem area entries of the problem electric charge table. In this way, when abnormal data appears, it can be corrected quickly through the user's active intervention, so that the data on the bill is accurate.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of an automatic data processing method for electric charge bill according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of a document provided by the present invention;
FIG. 3 is a schematic structural diagram of an automatic data processing device for electric charge bill according to an embodiment of the present invention.
Detailed Description
In order that the invention may be more readily understood, a more particular description thereof will be rendered by reference to specific embodiments that are illustrated in the appended drawings.
Referring to FIG. 1, which is a flow chart of an automatic data processing method for an electric charge bill according to an embodiment of the present invention, the method includes S1-S5:
S1, acquiring a first bill format of an electric charge bill, and calling a corresponding first separation extraction layer according to the first bill format, wherein the first separation extraction layer comprises a plurality of extraction areas with different extraction information.
It will be appreciated that electric charge bills come in a plurality of formats, and the bill formats of different electric charge bills may differ: for example, some bills have 3 rows and 4 columns and others have 2 rows and 5 columns, the length and width may differ, and the position of the information may also differ. Therefore, the scheme first determines the first bill format of the electric charge bill and then finds the first separation extraction layer corresponding to that format.
The first separation extraction layer may be a layer superimposed over the electronic image of the electric charge bill, and includes a plurality of extraction areas having different extraction information, and the extraction areas are used to extract corresponding information on the electric charge bill.
In some embodiments, S1 (acquiring a first bill format of an electric charge bill, and calling a corresponding first separation and extraction layer according to the first bill format, where the first separation and extraction layer includes a plurality of extraction areas with different extraction information) includes S11-S12:
S11, acquiring a first bill format added to the electric charge bill by a user, and calling a corresponding first separation extraction layer according to the first bill format, wherein each bill format is set in correspondence with its separation extraction layer.
It can be understood that the bill format is set corresponding to the corresponding separation extraction layer, so that after the first bill format of the electric charge bill is determined, the first separation extraction layer corresponding to the first bill format can be found.
S12, determining an extraction area corresponding to each first separation extraction layer, and taking the extraction areas as target areas of OCR recognition, wherein the first separation extraction layers comprise a plurality of extraction areas with different extraction information.
After determining the first separation extraction layers, the scheme determines extraction areas corresponding to each first separation extraction layer, and then takes the extraction areas as target areas for OCR recognition, wherein the first separation extraction layers comprise a plurality of extraction areas with different extraction information.
For example, the extraction area may be an information area corresponding to the current month reading, the last month reading, the number of reading, the unit price, the amount of money, and the like.
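The correspondence between bill formats and separation extraction layers can be pictured as a lookup table whose entries list labeled extraction areas. The sketch below is an assumed data structure: the class names, the format key `"format_a"`, the area labels, and the coordinates are hypothetical and only illustrate steps S11 and S12.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class ExtractionArea:
    label: str    # what this area extracts, e.g. "unit_price"
    left: int
    top: int
    right: int    # exclusive
    bottom: int   # exclusive

@dataclass(frozen=True)
class SeparationExtractionLayer:
    width: int
    height: int
    areas: Tuple[ExtractionArea, ...]

# Hypothetical registry: each bill format is set up with its own layer.
LAYERS = {
    "format_a": SeparationExtractionLayer(400, 100, (
        ExtractionArea("last_month_reading", 0, 0, 100, 100),
        ExtractionArea("current_month_reading", 100, 0, 200, 100),
        ExtractionArea("unit_price", 200, 0, 300, 100),
        ExtractionArea("amount", 300, 0, 400, 100),
    )),
}

def call_layer(bill_format: str) -> SeparationExtractionLayer:
    """S11: call the layer that corresponds to the bill format given by the user."""
    try:
        return LAYERS[bill_format]
    except KeyError:
        raise ValueError(f"no separation extraction layer for {bill_format!r}") from None

layer = call_layer("format_a")
ocr_targets = [area.label for area in layer.areas]  # S12: areas become OCR targets
print(ocr_targets)
```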
S2, combining the first separation and extraction layer with the electric charge bill so that the electric charge bill is divided by the extraction areas, extracting text information of each extraction area based on OCR to obtain bill sub-information, and adding a corresponding area label to each bill sub-information.
According to the scheme, the first separation and extraction layer and the electric charge bill are combined, so that the electric charge bill is divided by the extraction areas, and one extraction area corresponds to one information area.
After combination, the scheme uses OCR to extract the text information of each extraction area to obtain bill sub-information, and then adds a corresponding area label to each piece of bill sub-information.
The area label can be, for example, the current month reading, the last month reading, the reading number, the unit price, or the amount.
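Step S2 can be sketched as cropping each extraction area and passing the crop to an OCR routine, with the area label used as the key of the result. No real OCR library is called here: the `ocr` parameter is an injected callable, and the stand-in below simply reads a toy "image" made of characters. The box layout, names, and sample data are assumptions for illustration.

```python
from typing import Callable, Dict, Sequence, Tuple

Box = Tuple[int, int, int, int]  # left, top, right, bottom (right/bottom exclusive)

def crop(image: Sequence, box: Box) -> list:
    """Cut one extraction area out of the bill image (a sequence of rows)."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]

def extract_sub_information(image: Sequence, areas: Dict[str, Box],
                            ocr: Callable[[list], str]) -> Dict[str, str]:
    """Run OCR on every extraction area and label each result with its area label."""
    return {label: ocr(crop(image, box)).strip() for label, box in areas.items()}

def toy_ocr(patch: list) -> str:
    # Stand-in for a real OCR engine: the toy image already consists of characters.
    return "".join(patch)

toy_image = ["1200 1350"]  # one row; characters play the role of pixels
areas = {
    "last_month_reading": (0, 0, 4, 1),
    "current_month_reading": (5, 0, 9, 1),
}
sub_information = extract_sub_information(toy_image, areas, toy_ocr)
print(sub_information)
```

In practice the `ocr` callable would wrap whatever OCR engine is in use; keeping it injectable leaves the labeling logic independent of that choice.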
In some embodiments, S2 (combining the first separation and extraction layer with the electric charge bill so that the electric charge bill is divided by the extraction areas, identifying text information of each extraction area based on OCR to obtain bill sub-information, and adding a corresponding area tag to each bill sub-information) includes S21-S23:
S21, intercepting the electric charge bill to obtain an information extraction area image, and adjusting the specification of the first separation extraction layer according to the specification of the information extraction area image.
It can be understood that the image of the electric charge bill is often captured by photographing with a terminal, so the specification of the information extraction area image obtained by intercepting (cropping) may vary from bill to bill. To extract the information more accurately, the specification of the first separation extraction layer therefore needs to be adjusted to match that of the information extraction area image.
In some embodiments, S21 (intercepting the electric charge bill to obtain an information extraction area image, and adjusting the specification of the first separation extraction layer according to the specification of the information extraction area image) includes S211-S215:
S211, performing coordinate processing on the electric charge bill, determining all pixel points whose pixel values lie in a preset pixel interval, and taking the determined pixel points as first pixel points, wherein each first pixel point has a first coordinate.
First, the scheme performs coordinate processing on the electric charge bill, that is, it assigns a coordinate to every pixel point. It then determines all pixel points whose values lie in the preset pixel interval and takes them as first pixel points, each of which has a first coordinate.
The pixel points in the preset pixel interval may be the pixel points corresponding to black. Referring to fig. 2, an electric charge bill usually separates its information areas with black lines, and the first separation extraction layer must separate the information areas along the same black lines when extracting information, so that the information is extracted accurately.
S212, all the adjacent first pixel points with the same abscissa in the first coordinates are connected to obtain a first vertical connecting line, and all the adjacent first pixel points with the same ordinate in the first coordinates are connected to obtain a first horizontal connecting line.
It will be appreciated that for a vertical black line, the black pixels are adjacent and the abscissa in the first coordinate is the same. For a horizontal black line, the black pixels are adjacent, and the ordinate is the same in the first coordinate.
Therefore, the first vertical connecting lines are obtained by connecting all the adjacent first pixel points with the same abscissa in the first coordinates, and the first horizontal connecting lines are obtained by connecting all the adjacent first pixel points with the same ordinate in the first coordinates.
S213, taking the first vertical connecting lines with the number of the first pixel points being greater than the first preset number as the second vertical connecting lines, and taking the first transverse connecting lines with the number of the first pixel points being greater than the second preset number as the second transverse connecting lines.
In fig. 2, adjacent first pixel points that share an abscissa or an ordinate also occur outside the separating lines. For example, the vertical strokes of the characters in "Zhang Sanj" form first vertical connecting lines, and horizontal strokes likewise form first horizontal connecting lines.
Such interference data must therefore be removed. The black lines that separate the information areas are generally much longer than character strokes and contain more first pixel points. The number of first pixel points in each first vertical connecting line is therefore compared with the first preset number, and the number in each first horizontal connecting line is compared with the second preset number. The lines that exceed their threshold are taken as the second vertical connecting lines and the second horizontal connecting lines, which locates the separating lines accurately.
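As an illustration only, steps S211 to S213 can be sketched in Python as follows. The function name `find_ruling_lines`, the grayscale list-of-rows image representation, the `(fixed coordinate, start, end)` line tuples and the parameter names are assumptions made for this sketch and are not specified by the scheme.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale rows, 0 = black, 255 = white (assumed)
Line = Tuple[int, int, int]      # (fixed coordinate, start, end), end inclusive


def _runs(flags: List[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of consecutive True values."""
    out, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            out.append((start, i - 1))
            start = None
    if start is not None:
        out.append((start, len(flags) - 1))
    return out


def find_ruling_lines(img: Image, dark_max: int,
                      min_vertical: int, min_horizontal: int):
    """Sketch of S211-S213: keep only long runs of dark pixels."""
    height, width = len(img), len(img[0])
    # S211: first pixel points are those inside the preset pixel interval.
    first = [[px <= dark_max for px in row] for row in img]
    # S212 + S213: connect adjacent first pixel points with the same
    # abscissa, then keep runs longer than the first preset number.
    vertical: List[Line] = [
        (x, s, e)
        for x in range(width)
        for s, e in _runs([first[y][x] for y in range(height)])
        if e - s + 1 > min_vertical
    ]
    # Same for rows, using the second preset number.
    horizontal: List[Line] = [
        (y, s, e)
        for y in range(height)
        for s, e in _runs(first[y])
        if e - s + 1 > min_horizontal
    ]
    return vertical, horizontal
```

Short dark runs, such as character strokes, fall below the thresholds and are discarded, while the long separating lines survive as second connecting lines.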
S214, intercepting the electric charge bill according to the second vertical connecting line and the second horizontal connecting line to obtain an information extraction area image.
It can be understood that after the second vertical connecting line and the second horizontal connecting line are obtained, the second vertical connecting line and the second horizontal connecting line are utilized to intercept the electric charge bill to obtain the information extraction area image.
In some embodiments, S214 (the information extraction area image obtained by intercepting the electric charge bill according to the second vertical connection line and the second horizontal connection line) includes S2141-S2144:
S2141, determining the second vertical connecting lines corresponding to the largest abscissa and the smallest abscissa among the second vertical connecting lines, and obtaining a vertical connecting line interception group.
The scheme determines the second vertical connecting lines with the largest and smallest abscissas, that is, the two outermost (rightmost and leftmost) second vertical connecting lines, to obtain the vertical connecting line interception group.
S2142, determining the second transverse connecting lines corresponding to the largest ordinate and the smallest ordinate in the second transverse connecting lines respectively, and obtaining a transverse connecting line interception group.
Similarly, the scheme can determine the second transverse connecting lines corresponding to the largest ordinate and the smallest ordinate in the second transverse connecting lines respectively, namely the uppermost second transverse connecting line and the lowermost second transverse connecting line, so as to obtain the transverse connecting line interception group.
S2143, determining, in the electric charge bill, the second vertical connecting lines and the second horizontal connecting lines corresponding to the vertical connecting line interception group and the horizontal connecting line interception group, and forming a coordinate area interval according to the largest abscissa, the smallest abscissa, the largest ordinate and the smallest ordinate.
It can be understood that the scheme locates, in the electric charge bill, the second vertical connecting lines and the second horizontal connecting lines that belong to the vertical connecting line interception group and the horizontal connecting line interception group, and then forms the coordinate area interval from the largest abscissa, the smallest abscissa, the largest ordinate and the smallest ordinate.
S2144, taking the area formed by the determined second vertical connecting lines, the determined second horizontal connecting lines and all pixel points within the coordinate area interval as the information extraction area image.
In this way, the required information extraction area image can be determined. It can be understood that cropping to the information extraction area image discards the remaining useless information, which reduces interference during data extraction and processing and improves both processing efficiency and accuracy.
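A minimal sketch of S2141 to S2144 follows, under the assumption that each second connecting line is stored as a `(fixed coordinate, start, end)` tuple and the bill is a list of pixel rows. The function name `crop_information_area` and this data layout are hypothetical.

```python
from typing import List, Tuple

Line = Tuple[int, int, int]  # (fixed coordinate, start, end); assumed layout


def crop_information_area(img: List[list],
                          vertical: List[Line],
                          horizontal: List[Line]) -> List[list]:
    """Crop the bill to the area bounded by the outermost second lines."""
    xs = [x for x, _, _ in vertical]      # abscissas of second vertical lines
    ys = [y for y, _, _ in horizontal]    # ordinates of second horizontal lines
    x_min, x_max = min(xs), max(xs)       # S2141: interception group (vertical)
    y_min, y_max = min(ys), max(ys)       # S2142: interception group (horizontal)
    # S2143 + S2144: every pixel inside the coordinate area interval,
    # border lines included, forms the information extraction area image.
    return [row[x_min:x_max + 1] for row in img[y_min:y_max + 1]]
```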
S215, taking the number of pixel points of the second vertical connecting lines and the second horizontal connecting lines in the information extraction area image as the specification of the information extraction area image, and adjusting the specification of the first separation extraction layer according to the second vertical connecting lines and the second horizontal connecting lines.
The scheme takes the number of pixel points of the second vertical connecting lines and the second horizontal connecting lines in the information extraction area image as the specification of that image, uses this specification as the adjustment standard, and adjusts the specification of the first separation extraction layer accordingly.
In some embodiments, S215 (taking the number of pixels of the second vertical connection line and the second horizontal connection line in the information extraction area image as the specification of the information extraction area image, and adjusting the specification of the first separation extraction layer according to the second vertical connection line and the second horizontal connection line) includes S2151-S2155:
S2151, obtaining the number of pixels of the second vertical connecting line in the information extraction area image to obtain a first vertical point number specification, and obtaining the number of pixels of the second horizontal connecting line in the information extraction area image to obtain a first horizontal point number specification.
Firstly, the method can count the number of pixel points of a second vertical connecting line and a second horizontal connecting line in an information extraction area image to be used as the specification of the information extraction area image, and then adjust the specification of a first separation extraction layer according to the second vertical connecting line and the second horizontal connecting line.
S2152, determining a third vertical connecting line corresponding to the second vertical connecting line and a third transverse connecting line corresponding to the second transverse connecting line in the first separation and extraction layer.
It should be noted that the first separation extraction layer also contains horizontal and vertical lines, so the scheme can determine, in the first separation extraction layer, a third vertical connecting line corresponding to the second vertical connecting line and a third horizontal connecting line corresponding to the second horizontal connecting line. These may be, for example, the leftmost, rightmost, uppermost and lowermost lines of the first separation extraction layer.
S2153, obtaining the number of pixels of the third vertical connecting line in the first separation and extraction layer to obtain a second vertical point number specification, and obtaining the number of pixels of the third horizontal connecting line in the first separation and extraction layer to obtain a second horizontal point number specification.
To adjust the specification, the scheme obtains the number of pixel points of the third vertical connecting line in the first separation extraction layer as the second vertical point number specification, and the number of pixel points of the third horizontal connecting line in the first separation extraction layer as the second horizontal point number specification.
S2154, comparing the first vertical point number specification with the second vertical point number specification, and comparing the first transverse point number specification with the second transverse point number specification to obtain an adjustment ratio.
The scheme compares the first vertical point number specification with the second vertical point number specification, and the first horizontal point number specification with the second horizontal point number specification, to obtain the adjustment ratio. Because the format of the first separation extraction layer corresponds to the bill format, the ratio obtained in the vertical dimension should generally equal the ratio obtained in the horizontal dimension, so the layer can be enlarged or reduced by a single factor.
S2155, adjusting the specification of the first separation extraction layer according to the adjustment ratio.
After the adjustment ratio is obtained, it is used to adjust the specification of the first separation extraction layer. For example, if the adjustment ratio is 2, the first separation extraction layer is enlarged by a factor of 2 so that its specification corresponds to that of the information extraction area image.
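The ratio computation in S2151 to S2155 can be sketched as below. The names `adjustment_ratio` and `scale_regions`, the 5% tolerance, the averaging of the two ratios and the `(left, top, right, bottom)` box layout are assumptions of this sketch; the scheme only states that the two ratios should generally agree.

```python
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); assumed layout


def adjustment_ratio(first_vertical: int, first_horizontal: int,
                     second_vertical: int, second_horizontal: int,
                     tolerance: float = 0.05) -> float:
    """S2154: compare image line lengths with layer line lengths."""
    ratio_v = first_vertical / second_vertical
    ratio_h = first_horizontal / second_horizontal
    # The two ratios should match when the layer fits the bill format;
    # the tolerance value is an assumption, not taken from the scheme.
    if abs(ratio_v - ratio_h) > tolerance * max(ratio_v, ratio_h):
        raise ValueError("layer format does not match the bill image")
    return (ratio_v + ratio_h) / 2


def scale_regions(regions: Dict[str, Box], ratio: float) -> Dict[str, Box]:
    """S2155: enlarge or reduce every extraction area by the same factor."""
    return {label: tuple(round(c * ratio) for c in box)
            for label, box in regions.items()}
```

With an image twice the size of the layer, `adjustment_ratio(400, 600, 200, 300)` gives 2.0, matching the enlargement by a factor of 2 described above.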
S22, after judging that the information extraction area image corresponds to the specification of the first separation extraction layer, correspondingly combining and setting the information extraction area image and the first separation extraction layer, and dividing the electric charge bill based on the extraction area in the first separation extraction layer.
It can be understood that after the specification is adjusted, the information extraction area image and the first separation extraction layer can be correspondingly combined and arranged, and then the extraction area in the first separation extraction layer is utilized to divide the electric charge bill.
In some embodiments, S22 (after determining that the information extraction area image corresponds to the specification of the first separation extraction layer, the information extraction area image is set in a corresponding combination with the first separation extraction layer, and the electric charge bill is divided based on the extraction area in the first separation extraction layer) includes S221-S223:
S221, determining a first central pixel point of the information extraction area image and a second central pixel point of the first separation extraction layer.
In order to achieve accurate extraction of information, the information extraction region image and the first separation extraction layer are correspondingly combined and arranged, and positioning is needed, so that a first center pixel point of the information extraction region image and a second center pixel point of the first separation extraction layer can be obtained.
S222, overlapping the first central pixel point and the second central pixel point, so that the information extraction area image and the first separation extraction image layer are correspondingly combined.
After the first central pixel point of the information extraction area image and the second central pixel point of the first separation extraction layer are obtained, the two central pixel points are overlapped, so that the information extraction area image and the first separation extraction layer are correspondingly combined.
S223, dividing the electric charge bill into areas based on the extraction areas in the first separation extraction layer.
After the combination, the method can divide the electric charge bill into areas by utilizing the extraction areas in the first separation extraction layer.
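One way to picture S221 to S223 is as a translation that brings the two central pixel points together. In the sketch below, `center`, `align_regions` and the `(left, top, right, bottom)` box layout are hypothetical names and conventions chosen for illustration.

```python
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); assumed layout


def center(width: int, height: int) -> Tuple[int, int]:
    """Central pixel point of a width x height pixel grid."""
    return (width - 1) // 2, (height - 1) // 2


def align_regions(image_size: Tuple[int, int],
                  layer_size: Tuple[int, int],
                  regions: Dict[str, Box]) -> Dict[str, Box]:
    """S221-S222: shift layer areas so both central pixel points overlap."""
    cx_img, cy_img = center(*image_size)      # first central pixel point
    cx_lay, cy_lay = center(*layer_size)      # second central pixel point
    dx, dy = cx_img - cx_lay, cy_img - cy_lay
    # S223: the shifted boxes divide the bill image into extraction areas.
    return {label: (l + dx, t + dy, r + dx, b + dy)
            for label, (l, t, r, b) in regions.items()}
```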
S23, identifying the text information of each extraction area based on OCR to obtain bill sub-information, and adding a corresponding area label to each piece of bill sub-information.
The scheme uses OCR (optical character recognition) to identify the text information of each extraction area, obtains the bill sub-information, and then adds a corresponding area label to each piece of bill sub-information.
The area label indicates the type of information that the area holds, for example the current month reading, the last month reading, the reading number, the unit price or the amount.
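The labelling in S23 can be sketched as a mapping from area label to recognized text. The scheme does not name an OCR engine, so the sketch below takes the recognizer as a caller-supplied function `ocr`; that parameter, the function name `extract_sub_info` and the box layout are assumptions.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); assumed layout


def extract_sub_info(img: List[list],
                     regions: Dict[str, Box],
                     ocr: Callable[[List[list]], str]) -> Dict[str, str]:
    """Run OCR on each extraction area and key the text by its area label."""
    sub_info = {}
    for label, (left, top, right, bottom) in regions.items():
        # Cut the extraction area out of the information extraction area image.
        patch = [row[left:right + 1] for row in img[top:bottom + 1]]
        # `ocr` stands in for whichever recognizer is actually used.
        sub_info[label] = ocr(patch).strip()
    return sub_info
```

Keeping the recognizer as a parameter means the labelling logic can be exercised with a stub before a real OCR engine is chosen.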
S3, extracting the bill sub-information of associated area labels according to a preset comparison strategy to obtain comparison label sets, and verifying the bill sub-information based on a preset verification model to obtain a verification result, wherein each comparison label set has a corresponding preset verification model.
It should be noted that, the preset comparison strategies in the scheme can have a plurality of types, and respectively correspond to different comparison requirements.
For example, under the data comparison verification strategy, if the current month reading in the bill is smaller than the last month reading, the data must be wrong: either the current month reading or the last month reading is incorrect, so the two values need to be compared.
As another example, if the result calculated from the electric quantity and the unit price does not correspond to the amount in the bill, the data must be wrong: one or more of the electric quantity, the unit price and the amount is incorrect, so a calculation-based comparison is required.
In some embodiments, S3 (extracting the bill sub-information of associated area labels according to a preset comparison strategy to obtain comparison label sets, and verifying the bill sub-information based on a preset verification model to obtain a verification result, where each comparison label set has a corresponding preset verification model) includes S31-S35:
S31, extracting sub-comparison strategies included in the preset comparison strategy in sequence, determining a plurality of labels corresponding to the sub-comparison strategy, wherein the plurality of labels determined by each sub-comparison strategy are associated area labels.
For example, under the data comparison verification strategy, if the lower reading and the upper reading must be checked against each other for errors, the labels corresponding to the sub-comparison strategy may be the lower reading and the upper reading. It will be appreciated that the labels determined by each sub-comparison strategy are associated area labels.
S32, generating an initial set corresponding to the sub-comparison strategy, sequentially extracting bill sub-information of the corresponding extraction area according to the plurality of labels, and filling the bill sub-information into the initial set to obtain a comparison label set.
According to the scheme, an initial set corresponding to the sub-comparison strategy is generated, bill sub-information of a corresponding extraction area is sequentially extracted according to a plurality of labels, and the bill sub-information is filled into the initial set to obtain a comparison label set. By the method, the data to be compared can be correspondingly extracted.
S33, determining the preset verification model corresponding to the sub-comparison strategy, and if the preset verification model is a comparison type model, inputting the bill sub-information into the corresponding input parameters of the preset verification model.
The scheme can determine a preset verification model corresponding to the sub-comparison strategy, and if the preset verification model is a comparison type model, the bill sub-information is input to corresponding input parameters in the preset verification model.
For example, when the current month reading is required to be compared with the previous month reading, the comparison type model can be used for comparison.
S34, if the preset verification model is still established after the parameters are input, the verification result is normal;
S35, if the preset verification model is not established after the parameters are input, the verification result is abnormal.
It can be understood that if the preset verification model is still established after the parameters are input, the verification result is normal, for example, the reading in this month is greater than the reading in the last month.
It may also be appreciated that if the preset verification model is not established after the parameters are input, the verification result is abnormal, for example, the current reading is less than the previous reading.
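A comparison type model of the kind described in S31 to S35 can be sketched as a predicate applied to a comparison label set. The label strings, the function name `verify_comparison` and the choice to treat equal readings as normal are assumptions of this sketch; the text only gives the greater-than and less-than cases.

```python
from decimal import Decimal
from typing import Callable, Dict, Sequence


def verify_comparison(sub_info: Dict[str, str],
                      labels: Sequence[str],
                      model: Callable[..., bool]) -> str:
    """Fill a comparison label set and test it against a comparison model."""
    # S31 + S32: collect the bill sub-information of the associated labels.
    comparison_set = [Decimal(sub_info[label]) for label in labels]
    # S33-S35: the result is normal only if the model still holds.
    return "normal" if model(*comparison_set) else "abnormal"


def reading_order(current: Decimal, last: Decimal) -> bool:
    """Assumed model: the current month reading must not be below last month's."""
    return current >= last


READING_LABELS = ("current_month_reading", "last_month_reading")  # hypothetical
```

For instance, `verify_comparison({"current_month_reading": "1250", "last_month_reading": "1100"}, READING_LABELS, reading_order)` yields "normal", while swapping the two values yields "abnormal".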
On the basis of the above embodiment, the method further comprises:
If the preset verification model is a calculation type model, classifying the bill sub-information to obtain calculation bill sub-information and verification bill sub-information, and inputting the calculation bill sub-information into the preset verification model to obtain calculation result information.
It can be understood that when checking the data of the electric charge, the scheme can utilize a calculation type model, firstly, the bill sub-information is classified to obtain calculation bill sub-information and verification bill sub-information, and then the calculation bill sub-information is input into a preset verification model to obtain calculation result information.
For example, the electric quantity and the unit price may be the calculation bill sub-information, and the amount may be the verification bill sub-information. The scheme inputs the electric quantity and the unit price into the calculation type model, calculates the resulting amount, and then compares it with the verification bill sub-information.
If the calculation result information corresponds to the verification document sub-information, the verification result is normal;
and if the calculation result information does not correspond to the verification document sub-information, the verification result is abnormal.
It can be understood that if the calculation result information corresponds to the verification document sub-information, the verification result is normal; if the calculation result information does not correspond to the verification document sub-information, the verification result is abnormal.
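The calculation type model for the electric quantity, unit price and amount example can be sketched as follows. The function name `verify_amount`, rounding to two decimal places and the half-up rounding mode are assumptions; the scheme does not state how the calculated amount is rounded before comparison.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")  # assumed precision of the amount field


def verify_amount(quantity: str, unit_price: str, amount: str) -> str:
    """Compute quantity x unit price and compare it with the billed amount."""
    # Calculation bill sub-information: electric quantity and unit price.
    computed = (Decimal(quantity) * Decimal(unit_price)).quantize(
        CENT, rounding=ROUND_HALF_UP)
    # Verification bill sub-information: the amount printed on the bill.
    billed = Decimal(amount).quantize(CENT, rounding=ROUND_HALF_UP)
    return "normal" if computed == billed else "abnormal"
```

`Decimal` is used instead of `float` so that values such as 0.538 are multiplied exactly rather than with binary rounding error.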
S4, if the verification result is abnormal, taking an extraction area corresponding to the corresponding comparison tag set as a problem area, adjusting the first separation extraction layer based on the problem area to obtain a second separation extraction layer, and combining the second separation extraction layer with the electric charge bill to obtain a feedback image for display.
When the verification result is abnormal, the scheme can further process the data. Firstly, taking an extraction area corresponding to a corresponding comparison tag set as a problem area, and then adjusting the first separation extraction layer by combining the problem area to obtain a second separation extraction layer.
It will be appreciated that the second separation extraction layer is a layer in which the problem area is marked. The scheme combines the second separation extraction layer with the electric charge bill to obtain a feedback image for display.
In some embodiments, S4 (if the verification result is abnormal, an extraction area corresponding to the corresponding comparison tag set is used as a problem area, the first separation and extraction layer is adjusted based on the problem area to obtain a second separation and extraction layer, and the second separation and extraction layer is combined with the electric charge bill to obtain a feedback image to display) includes S41-S44:
S41, determining all pixel points of the outline corresponding to the problem area in the first separation extraction layer as problem pixel points, and controlling the problem pixel points to be displayed with a second preset pixel value to obtain a second separation extraction layer.
First, all pixel points of the outline corresponding to the problem area in the first separation extraction layer are determined as problem pixel points; for example, the pixel points of the black frame around the problem area are problem pixel points. The problem pixel points are then displayed with a second preset pixel value, for example the pixel value corresponding to red, to obtain the second separation extraction layer.
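Step S41 can be sketched as recoloring the frame of the problem area. The RGB tuple representation, the function name `mark_problem_area` and the `(left, top, right, bottom)` box layout are assumptions of this sketch.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); assumed layout
RED: Pixel = (255, 0, 0)         # example of a second preset pixel value


def mark_problem_area(layer: List[List[Pixel]], box: Box,
                      color: Pixel = RED) -> List[List[Pixel]]:
    """Return a copy of the layer with the problem area outline recolored."""
    left, top, right, bottom = box
    marked = [row[:] for row in layer]   # leave the first layer untouched
    for x in range(left, right + 1):     # top and bottom edges
        marked[top][x] = color
        marked[bottom][x] = color
    for y in range(top, bottom + 1):     # left and right edges
        marked[y][left] = color
        marked[y][right] = color
    return marked
```

Returning a copy keeps the first separation extraction layer available for the next bill of the same format.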
S42, combining the second separation extraction layer with the electric charge bill to obtain a feedback image, and generating a problem electric charge table according to bill information of a non-problem area and a problem area, wherein the problem electric charge table comprises the bill information of the non-problem area and the bill information of the problem area.
After the second separation extraction layer is obtained, the second separation extraction layer and the electric charge bill are combined to obtain a feedback image, and then a problem electric charge table is generated according to bill information of a non-problem area and a problem area.
It can be understood that the problem electricity fee table contains the bill sub-information of the non-problem areas, while the entries for the problem area are left empty. That is, the scheme does not carry the data of the problem area into the table.
S43, feeding back the second separation extraction layer and the problem electricity fee table to the user.
The scheme can feed back the second separation extraction layer and the problem electricity fee list to the user.
S44, correcting the electric charge bill according to manual bill information filled in the empty problem area in the problem electric charge table by the user.
After receiving the second separation extraction layer and the problem electric charge table, the user can manually fill data in the problem area which is empty in the problem electric charge table, and the server can correct the electric charge bill by combining manual bill sub-information filled in the problem area which is empty in the problem electric charge table by the user.
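The problem electricity fee table of S42 to S44 can be pictured as a mapping in which problem entries are empty until the user fills them. The names `build_problem_table` and `apply_manual_entries`, and the use of `None` for an empty entry, are assumptions of this sketch.

```python
from typing import Dict, Optional, Set


def build_problem_table(sub_info: Dict[str, str],
                        problem_labels: Set[str]) -> Dict[str, Optional[str]]:
    """S42: keep non-problem values, leave problem entries empty (None)."""
    return {label: (None if label in problem_labels else value)
            for label, value in sub_info.items()}


def apply_manual_entries(table: Dict[str, Optional[str]],
                         manual: Dict[str, str]) -> Dict[str, Optional[str]]:
    """S44: accept manual bill sub-information only for empty entries."""
    empty = {label for label, value in table.items() if value is None}
    unexpected = set(manual) - empty
    if unexpected:
        raise KeyError(f"not an empty problem entry: {sorted(unexpected)}")
    return {**table, **manual}
```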
In some embodiments, S44 (correction of the electricity bill according to manual bill information filled by the user for the empty problem area in the problem electricity bill) includes S441-S442:
S441, generating a display subgraph corresponding to the manual bill sub-information, determining the image corresponding to the problem area in the electric charge bill, and adjusting the pixel values in the image corresponding to the problem area to a third preset pixel value.
The scheme generates a display subgraph corresponding to the manual bill sub-information, determines the image corresponding to the problem area in the electric charge bill, and adjusts the pixel values in that image to a third preset pixel value. The third preset pixel value may be, for example, the pixel value corresponding to white; in this way the interference data in the problem area is erased.
S442, overlapping the display subgraph with the problem areas so that the numerical values corresponding to the manual bill information are located in the corresponding problem areas.
According to the scheme, the display subgraph and the problem area are overlapped, so that the numerical value corresponding to the manual bill information is located in the corresponding problem area, and the data can be corrected and replaced by combining with the active supplement of the user to the data.
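Steps S441 and S442 can be sketched as erasing the problem area and pasting the display subgraph into it. The function name `correct_region`, the grayscale list-of-rows image, the centered placement and the requirement that the subgraph fit inside the area are assumptions of this sketch.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); assumed layout
WHITE = 255                      # example of a third preset pixel value


def correct_region(img: List[List[int]], box: Box,
                   subgraph: List[List[int]],
                   blank: int = WHITE) -> List[List[int]]:
    """Erase the problem area, then overlay the display subgraph on it."""
    left, top, right, bottom = box
    out = [row[:] for row in img]
    # S441: set every pixel of the problem area to the third preset value.
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            out[y][x] = blank
    # S442: place the subgraph centered in the area (it must fit inside).
    sub_h, sub_w = len(subgraph), len(subgraph[0])
    off_y = top + ((bottom - top + 1) - sub_h) // 2
    off_x = left + ((right - left + 1) - sub_w) // 2
    for y in range(sub_h):
        for x in range(sub_w):
            out[off_y + y][off_x + x] = subgraph[y][x]
    return out
```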
S5, if the verification result meets the requirement, filling the extracted bill information into a preset electric charge table.
It can be understood that if the verification result meets the requirement, the extracted bill information is filled into a preset electric charge table.
Referring to fig. 3, a schematic structural diagram of an automatic data processing device for electric charge bill according to an embodiment of the present invention is provided, where the device includes:
the calling module is used for obtaining a first bill format of the electric charge bill, calling a corresponding first separation and extraction layer according to the first bill format, wherein the first separation and extraction layer comprises a plurality of extraction areas with different extraction information;
the combination module is used for combining the first separation and extraction layer with the electric charge bill so that the electric charge bill is divided by the extraction areas, text information of each extraction area is extracted based on OCR to obtain bill sub-information, and corresponding area labels are added to each bill sub-information;
the matching module is used for extracting bill information of the related area labels according to a preset comparison method to obtain comparison label sets, verifying the bill information based on a preset verification model to obtain a verification result, and each comparison label set is provided with a corresponding preset verification model;
The feedback module is used for taking an extraction area corresponding to the corresponding comparison tag set as a problem area if the verification result is abnormal, adjusting the first separation extraction layer based on the problem area to obtain a second separation extraction layer, and combining the second separation extraction layer with the electric charge bill to obtain a feedback image for display;
and the result module is used for filling the extracted bill information into a preset electric charge list if the verification result meets the requirement.
The present invention also provides a readable storage medium having stored therein a computer program for implementing the methods provided by the various embodiments described above when executed by a processor.
The readable storage medium may be a computer storage medium or a communication medium. Communication media includes any medium that facilitates transfer of a computer program from one place to another. Computer storage media can be any available media that can be accessed by a general purpose or special purpose computer. For example, a readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. In the alternative, the readable storage medium may be integral to the processor. The processor and the readable storage medium may reside in an application specific integrated circuit (ASIC). In addition, the ASIC may reside in a user device. The processor and the readable storage medium may also reside as discrete components in a communication device. The readable storage medium may be read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tape, floppy disk, optical data storage device, etc.
The present invention also provides a program product comprising execution instructions stored in a readable storage medium. The at least one processor of the device may read the execution instructions from the readable storage medium, the execution instructions being executed by the at least one processor to cause the device to implement the methods provided by the various embodiments described above.
It should be understood that the processor may be a central processing unit (CPU), another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present invention may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in a processor for execution.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.
In addition to the above embodiments, the present invention may have other embodiments; all technical schemes formed by equivalent substitution or equivalent transformation fall within the protection scope of the invention.

Claims (13)

1. The automatic data processing method for the electric charge bill is characterized by comprising the following steps:
acquiring a first bill format of an electric charge bill, and calling a corresponding first separation and extraction layer according to the first bill format, wherein the first separation and extraction layer comprises a plurality of extraction areas with different extraction information;
combining the first separation and extraction layer with the electric charge bill so that the electric charge bill is divided by the extraction areas, extracting the text information of each extraction area based on OCR to obtain bill sub-information, and adding a corresponding area label to each bill sub-information;
extracting bill information of the related area labels according to a preset comparison method to obtain comparison label sets, verifying the bill information based on a preset verification model to obtain a verification result, wherein each comparison label set is provided with a corresponding preset verification model;
if the verification result is abnormal, an extraction area corresponding to the corresponding comparison tag set is used as a problem area, the first separation extraction layer is adjusted based on the problem area to obtain a second separation extraction layer, and the second separation extraction layer is combined with the electric charge bill to obtain a feedback image for display;
and if the verification result meets the requirement, filling the extracted bill information into a preset electric charge table.
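The flow of claim 1 can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the names `ExtractionLayer`, `process_bill`, `ocr`, and `checks` are hypothetical, the OCR engine is passed in as a plain callable, and each verification model is assumed to be a predicate over the texts of its associated region labels.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); assumed layout


@dataclass
class ExtractionLayer:
    """Hypothetical stand-in for a separation and extraction layer."""
    regions: Dict[str, Box]  # region label -> extraction area


def process_bill(
    bill_image,
    layer: ExtractionLayer,
    ocr: Callable[[object, Box], str],
    checks: Sequence[Tuple[Sequence[str], Callable[..., bool]]],
) -> dict:
    # OCR every extraction area and key the result by its region label.
    sub_info = {label: ocr(bill_image, box) for label, box in layer.regions.items()}

    # Each check pairs a comparison label set with its own verification model.
    problem_labels: List[str] = []
    for labels, model in checks:
        if not model(*[sub_info[label] for label in labels]):
            problem_labels.extend(labels)

    if problem_labels:
        # Abnormal: report the regions to highlight instead of filling the table.
        return {"status": "abnormal", "problem_regions": sorted(set(problem_labels))}
    # Normal: the extracted values would be written to the electric charge table.
    return {"status": "ok", "table": sub_info}
```

In a real system the `ocr` callable would wrap an OCR engine; here it is left abstract so that the control flow can be read independently of any particular library.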
2. The automatic data processing method for electric charge bill according to claim 1, wherein,
the first bill format of the electric charge bill is obtained, a corresponding first separation and extraction layer is called according to the first bill format, the first separation and extraction layer comprises a plurality of extraction areas with different extraction information, and the method comprises the following steps:
acquiring a first bill format added to an electric charge bill by a user, and calling a corresponding first separation and extraction layer according to the first bill format, wherein each bill format is correspondingly arranged with the corresponding separation and extraction layer;
and determining an extraction area corresponding to each first separation extraction layer, wherein the extraction areas are used as target areas for OCR recognition, and the first separation extraction layers comprise a plurality of extraction areas with different extraction information.
3. The automatic data processing method for electric charge bill according to claim 2, wherein,
combining the first separation and extraction layer with the electric charge bill so that the electric charge bill is divided by the extraction areas, identifying the text information of each extraction area based on OCR to obtain bill sub-information, and adding a corresponding area label to each bill sub-information, wherein the method comprises the following steps:
intercepting the electric charge bill to obtain an information extraction area image, and adjusting the specification of the first separation extraction layer according to the specification of the information extraction area image;
after judging that the information extraction area image corresponds to the specification of the first separation extraction layer, correspondingly combining the information extraction area image with the first separation extraction layer, and dividing the electric charge bill based on the extraction area in the first separation extraction layer;
and identifying the text information of each extraction area based on OCR to obtain bill sub-information, and adding a corresponding area label to each bill sub-information.
4. The automatic data processing method for electric charge bill according to claim 3, wherein,
intercepting the electric charge bill to obtain an information extraction area image, and adjusting the specification of a first separation extraction layer according to the specification of the information extraction area image, wherein the method comprises the following steps:
performing coordinate processing on the electric charge bill, determining all pixel points whose values lie within a preset pixel interval, and taking the determined pixel points as first pixel points, wherein each first pixel point has a first coordinate;
connecting all the adjacent first pixel points with the same abscissa in the first coordinates to obtain a first vertical connecting line, and connecting all the adjacent first pixel points with the same ordinate in the first coordinates to obtain a first horizontal connecting line;
taking first vertical connecting lines in which the number of first pixel points is greater than a first preset number as second vertical connecting lines, and taking first transverse connecting lines in which the number of first pixel points is greater than a second preset number as second transverse connecting lines;
intercepting the electric charge bill according to the second vertical connecting line and the second horizontal connecting line to obtain an information extraction area image;
and taking the number of pixel points of the second vertical connecting lines and the number of pixel points of the second horizontal connecting lines in the information extraction area image as the specification of the information extraction area image, and adjusting the specification of the first separation extraction layer according to the second vertical connecting lines and the second horizontal connecting lines.
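The line detection described in claim 4 can be sketched as a scan for runs of adjacent pixels whose values fall in the preset interval. The sketch below is an illustration under assumptions: the bill is taken to be a grayscale image stored as a list of rows, and the function name `find_runs` and its parameters are hypothetical.

```python
from typing import List, Tuple


def find_runs(
    image: List[List[int]], lo: int, hi: int, min_len: int, vertical: bool
) -> List[Tuple[int, int, int]]:
    """Return runs of adjacent in-interval pixels longer than min_len.

    Each run is (fixed_coord, start, end): for vertical runs the fixed
    coordinate is the abscissa x and start/end are ordinates; for
    horizontal runs the fixed coordinate is the ordinate y.
    """
    height, width = len(image), len(image[0])
    outer, inner = (width, height) if vertical else (height, width)
    runs = []
    for a in range(outer):
        start = None
        # Iterate one step past the end so a run touching the border is closed.
        for b in range(inner + 1):
            if b < inner:
                value = image[b][a] if vertical else image[a][b]
                hit = lo <= value <= hi
            else:
                hit = False
            if hit and start is None:
                start = b
            elif not hit and start is not None:
                # Keep only runs with more pixels than the preset number.
                if b - start > min_len:
                    runs.append((a, start, b - 1))
                start = None
    return runs
```

Calling the function once with `vertical=True` and once with `vertical=False` yields candidates for the second vertical and second horizontal connecting lines respectively.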
5. The automatic data processing method for electric charge bill according to claim 4, wherein,
the intercepting the electric charge bill according to the second vertical connecting line and the second horizontal connecting line to obtain an information extraction area image comprises the following steps:
determining a second vertical connecting line corresponding to the largest abscissa and the smallest abscissa in the second vertical connecting lines respectively to obtain a vertical connecting line intercepting group;
determining a second transverse connecting line corresponding to the largest ordinate and the smallest ordinate in the second transverse connecting lines respectively to obtain a transverse connecting line interception group;
determining, in the electric charge bill, the second vertical connecting lines and the second transverse connecting lines corresponding to the vertical connecting line intercepting group and the transverse connecting line intercepting group respectively, and forming a coordinate region interval according to the maximum abscissa, the minimum abscissa, the maximum ordinate and the minimum ordinate;
and taking the determined areas formed by the second vertical connecting lines, the second horizontal connecting lines and all pixel points in the coordinate area interval as information extraction area images.
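The cropping step of claim 5 reduces to taking the extreme abscissas and ordinates of the detected lines and keeping everything inside them. A minimal sketch, assuming the image is a list of rows and that the hypothetical arguments `vertical_xs` and `horizontal_ys` hold the abscissas of the second vertical lines and the ordinates of the second horizontal lines:

```python
from typing import List, Sequence


def crop_to_outer_lines(
    image: List[List[int]], vertical_xs: Sequence[int], horizontal_ys: Sequence[int]
) -> List[List[int]]:
    # The outermost lines define the coordinate region interval.
    x_min, x_max = min(vertical_xs), max(vertical_xs)
    y_min, y_max = min(horizontal_ys), max(horizontal_ys)
    # Keep the boundary lines themselves, hence the inclusive upper bounds.
    return [row[x_min:x_max + 1] for row in image[y_min:y_max + 1]]
```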
6. The automatic data processing method for electric charge bill according to claim 4, wherein,
taking the number of pixel points of the second vertical connecting lines and of the second horizontal connecting lines in the information extraction area image as the specification of the information extraction area image, and adjusting the specification of the first separation extraction layer according to the second vertical connecting lines and the second horizontal connecting lines, comprises the following steps:
acquiring the number of pixels of a second vertical connecting line in the information extraction area image to obtain a first vertical point number specification, and acquiring the number of pixels of a second horizontal connecting line in the information extraction area image to obtain a first horizontal point number specification;
determining a third vertical connecting line corresponding to the second vertical connecting line and a third transverse connecting line corresponding to the second transverse connecting line in the first separation and extraction layer;
obtaining the number of pixels of the third vertical connecting line in the first separation and extraction layer to obtain a second vertical point number specification, and obtaining the number of pixels of the third horizontal connecting line in the first separation and extraction layer to obtain a second horizontal point number specification;
comparing the first vertical point number specification with the second vertical point number specification, and comparing the first transverse point number specification with the second transverse point number specification to obtain an adjustment ratio;
and adjusting the specification of the first separation and extraction layer according to the adjustment proportion.
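One way to read the adjustment ratio of claim 6 is as a pair of scale factors, one per axis, obtained by dividing the image's line lengths by the layer's line lengths. The claim does not state whether a single ratio or two are used, so the two-factor form below is an assumption, as are the function and parameter names.

```python
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); assumed layout


def scale_layer(
    regions: Dict[str, Box],
    image_vertical_len: int,
    image_horizontal_len: int,
    layer_vertical_len: int,
    layer_horizontal_len: int,
) -> Dict[str, Box]:
    # Vertical line lengths govern the y scale, horizontal ones the x scale.
    scale_y = image_vertical_len / layer_vertical_len
    scale_x = image_horizontal_len / layer_horizontal_len
    return {
        label: (
            round(left * scale_x),
            round(top * scale_y),
            round(right * scale_x),
            round(bottom * scale_y),
        )
        for label, (left, top, right, bottom) in regions.items()
    }
```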
7. The automatic data processing method for electric charge bill according to claim 6, wherein,
after judging that the information extraction area image corresponds to the specification of the first separation extraction layer, correspondingly combining and setting the information extraction area image and the first separation extraction layer, dividing the electric charge bill based on the extraction area in the first separation extraction layer, and comprising the following steps:
determining a first central pixel point of the information extraction area image and a second central pixel point of a first separation extraction layer;
overlapping the first central pixel point and the second central pixel point so as to enable the information extraction area image and the first separation extraction layer to be correspondingly combined;
and dividing the electric charge bill into areas based on the extraction areas in the first separation extraction layer.
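The center alignment of claim 7 can be sketched as computing the offset that makes the two central pixel points coincide and shifting every extraction area by that offset. The names below are hypothetical, and integer division is assumed for locating the central pixel.

```python
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); assumed layout


def align_centers(
    image_size: Tuple[int, int], layer_size: Tuple[int, int], regions: Dict[str, Box]
) -> Dict[str, Box]:
    image_w, image_h = image_size
    layer_w, layer_h = layer_size
    # Offset that moves the layer's central pixel onto the image's central pixel.
    dx = image_w // 2 - layer_w // 2
    dy = image_h // 2 - layer_h // 2
    return {
        label: (left + dx, top + dy, right + dx, bottom + dy)
        for label, (left, top, right, bottom) in regions.items()
    }
```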
8. The automatic data processing method for electric charge bill according to claim 7, wherein,
extracting bill information of the related area labels according to a preset comparison method to obtain comparison label sets, verifying the bill information based on a preset verification model to obtain verification results, wherein each comparison label set is provided with a corresponding preset verification model, and the method comprises the following steps:
sequentially extracting the sub-comparison strategies included in the preset comparison strategy, and determining a plurality of labels corresponding to each sub-comparison strategy, wherein the plurality of labels determined by each sub-comparison strategy are associated area labels;
generating an initial set corresponding to the sub-comparison strategy, sequentially extracting bill sub-information of the corresponding extraction area according to a plurality of labels, and filling the bill sub-information into the initial set to obtain a comparison label set;
determining a preset verification model corresponding to the sub-comparison strategy, and if the preset verification model is a comparison type model, inputting bill sub-information to corresponding input parameters in the preset verification model;
if the preset verification model still holds after the parameters are input, the verification result is normal;
if the preset verification model does not hold after the parameters are input, the verification result is abnormal.
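A comparison-type verification model, as described in claim 8, can be thought of as a relation that must hold between the values of a comparison label set. The sketch below is illustrative only: the function name, the label names, and the example relation (a current meter reading not lower than the previous one) are assumptions and do not come from the claims.

```python
from decimal import Decimal
from typing import Callable, Dict


def verify_comparison(label_set: Dict[str, str], relation: Callable[..., bool]) -> str:
    # Feed each piece of bill sub-information to the model parameter of the same name.
    return "normal" if relation(**label_set) else "abnormal"


def reading_not_decreasing(previous_reading: str, current_reading: str) -> bool:
    # Hypothetical relation: a meter reading should never go down.
    return Decimal(current_reading) >= Decimal(previous_reading)
```

For example, `verify_comparison({"previous_reading": "1200", "current_reading": "1350"}, reading_not_decreasing)` evaluates the relation on the two OCR results.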
9. The automatic data processing method of electric charge bill according to claim 8, further comprising:
if the preset verification model is a calculation type model, classifying the bill sub-information to obtain calculation bill sub-information and verification bill sub-information, and inputting the calculation bill sub-information into the preset verification model to obtain calculation result information;
if the calculation result information corresponds to the verification bill sub-information, the verification result is normal;
and if the calculation result information does not correspond to the verification bill sub-information, the verification result is abnormal.
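For the calculation-type model of claim 9, the calculation bill sub-information is fed into a formula and the result is checked against the verification bill sub-information. A minimal sketch, assuming decimal amounts and a small rounding tolerance; the tolerance, the function name, and the example formula (usage multiplied by unit price) are assumptions.

```python
from decimal import Decimal
from typing import Callable, Sequence


def verify_calculation(
    calc_inputs: Sequence[str],
    expected: str,
    formula: Callable[..., Decimal],
    tolerance: Decimal = Decimal("0.01"),  # assumed allowance for rounding on the bill
) -> str:
    # Decimal avoids binary floating-point error on monetary values.
    result = formula(*[Decimal(value) for value in calc_inputs])
    return "normal" if abs(result - Decimal(expected)) <= tolerance else "abnormal"
```

With usage "150" and unit price "0.538", the formula `lambda usage, price: usage * price` gives 80.700, which matches a printed amount of "80.70".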
10. The automatic data processing method for electric charge bill according to claim 9, wherein,
if the verification result is abnormal, the extraction area corresponding to the corresponding comparison tag set is used as a problem area, the first separation extraction layer is adjusted based on the problem area to obtain a second separation extraction layer, and the second separation extraction layer is combined with the electric charge bill to obtain a feedback image for display, and the method comprises the following steps:
determining all pixel points of the outline corresponding to the problem area in the first separation extraction layer as problem pixel points, and controlling the problem pixel points to be displayed with a second preset pixel value to obtain a second separation extraction layer;
combining the second separation extraction layer with the electric charge bill to obtain a feedback image, and generating a problem electric charge table according to the bill information of the non-problem areas and the problem areas, wherein the problem electric charge table comprises the bill information of the non-problem areas and empty entries for the bill information of the problem areas;
feeding back the second separation extraction layer and the problem electric charge table to a user;
and correcting the electric charge bill according to the manual bill information filled in the empty problem area in the problem electric charge table by the user.
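The feedback step of claim 10 has two parts: repainting the outline of the problem area with a preset pixel value, and building a table in which problem entries are left empty for manual entry. The sketch below assumes the layer is a list of pixel rows and a box given as inclusive (left, top, right, bottom) coordinates; both function names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # inclusive (left, top, right, bottom); assumed layout


def mark_problem_outline(layer: List[List[int]], box: Box, value: int) -> List[List[int]]:
    left, top, right, bottom = box
    marked = [row[:] for row in layer]  # copy so the first layer is left unchanged
    for x in range(left, right + 1):
        marked[top][x] = value
        marked[bottom][x] = value
    for y in range(top, bottom + 1):
        marked[y][left] = value
        marked[y][right] = value
    return marked


def build_problem_table(
    sub_info: Dict[str, str], problem_labels: Iterable[str]
) -> Dict[str, Optional[str]]:
    problems = set(problem_labels)
    # Problem entries are left empty (None) for the user to fill in by hand.
    return {label: (None if label in problems else text) for label, text in sub_info.items()}
```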
11. The automatic data processing method for electric charge bill according to claim 10, wherein,
the correction of the electric charge bill is carried out according to the manual bill information filled in the empty problem area in the problem electric charge table by the user, and comprises the following steps:
generating a display subgraph corresponding to the manual bill information, determining the image corresponding to the problem area in the electric charge bill, and adjusting the pixel values in the image corresponding to the problem area to a third preset pixel value;
and overlapping the display subgraph with the problem area so that the numerical value corresponding to the manual bill information is positioned in the corresponding problem area.
12. An automatic data processing apparatus for an electric charge bill, characterized by comprising:
the calling module is used for obtaining a first bill format of the electric charge bill, and calling a corresponding first separation and extraction layer according to the first bill format, wherein the first separation and extraction layer comprises a plurality of extraction areas with different extraction information;
the combination module is used for combining the first separation and extraction layer with the electric charge bill so that the electric charge bill is divided by the extraction areas, text information of each extraction area is extracted based on OCR to obtain bill sub-information, and corresponding area labels are added to each bill sub-information;
the matching module is used for extracting bill information of the related area labels according to a preset comparison method to obtain comparison label sets, verifying the bill information based on a preset verification model to obtain a verification result, and each comparison label set is provided with a corresponding preset verification model;
the feedback module is used for taking an extraction area corresponding to the corresponding comparison tag set as a problem area if the verification result is abnormal, adjusting the first separation extraction layer based on the problem area to obtain a second separation extraction layer, and combining the second separation extraction layer with the electric charge bill to obtain a feedback image for display;
and the result module is used for filling the extracted bill information into a preset electric charge table if the verification result meets the requirement.
13. A storage medium having stored therein a computer program for implementing the method of any of claims 1 to 11 when executed by a processor.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310634880.7A CN116469120B (en) 2023-05-31 2023-05-31 Automatic data processing method and device for electric charge bill and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310634880.7A CN116469120B (en) 2023-05-31 2023-05-31 Automatic data processing method and device for electric charge bill and storage medium

Publications (2)

Publication Number Publication Date
CN116469120A (en) 2023-07-21
CN116469120B (en) 2023-09-05

Family

ID=87177337

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310634880.7A Active CN116469120B (en) 2023-05-31 2023-05-31 Automatic data processing method and device for electric charge bill and storage medium

Country Status (1)

Country Link
CN (1) CN116469120B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104507047A (en) * 2014-12-15 2015-04-08 福建星网锐捷网络有限公司 Method and device for acquiring POI (point of interest) areas
CN104915114A (en) * 2015-05-29 2015-09-16 小米科技有限责任公司 Method and device for recording information as well as intelligent terminals
WO2019174276A1 (en) * 2018-03-14 2019-09-19 京东方科技集团股份有限公司 Method, device, equipment and medium for locating center of target object region
CN111582085A (en) * 2020-04-26 2020-08-25 中国工商银行股份有限公司 Document shooting image identification method and device
CN112734352A (en) * 2019-10-28 2021-04-30 北京京东尚科信息技术有限公司 Document auditing method and device based on data dimensionality
WO2021147252A1 (en) * 2020-01-22 2021-07-29 平安科技(深圳)有限公司 Ocr-based table format recovery method and apparatus, electronic device, and storage medium
CN113569863A (en) * 2021-09-26 2021-10-29 广东电网有限责任公司中山供电局 Document checking method, system, electronic equipment and storage medium
CN114639173A (en) * 2022-05-18 2022-06-17 国网浙江省电力有限公司 OCR technology-based intelligent auditing method and device for checking and certifying materials
CN114708582A (en) * 2022-05-31 2022-07-05 国网浙江省电力有限公司 AI and RPA-based intelligent electric power data inspection method and device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
WANQING SONG ET AL.: "Bank Bill Recognition Based on an Image Processing", 《2009 THIRD INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTING》, pages 569 - 573 *
熊海涛等: "基于图像识别技术的电力缴费智能核查系统", 《电子世界》, no. 13, pages 161 - 162 *
蔡剑等: "基于特征选择和标签相关性的多标签分类算法", 《计算机与数字工程》, vol. 49, no. 10, pages 1967 - 1972 *
袁嘉怡等: "基于CTPN和CRNN的中英文字识别", 《电脑编程技巧与维护》, no. 09, pages 134 - 137 *

Also Published As

Publication number Publication date
CN116469120B (en) 2023-09-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant