CN116824604A - Financial data management method and system based on image processing - Google Patents

Publication number
CN116824604A
Authority
CN
China
Prior art keywords
dimension
demand
class
relay
folder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311100459.4A
Other languages
Chinese (zh)
Other versions
CN116824604B (en)
Inventor
施志晖
曹杰
王有权
黄进
申冬琴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Finance and Economics
Jiangsu Suning Bank Co Ltd
Original Assignee
Nanjing University of Finance and Economics
Jiangsu Suning Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Finance and Economics, Jiangsu Suning Bank Co Ltd
Priority to CN202311100459.4A
Publication of CN116824604A
Application granted
Publication of CN116824604B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/1918 Fusion techniques, i.e. combining data from various sources, e.g. sensor fusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/1916 Validation; Performance evaluation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science
  • Theoretical Computer Science
  • Physics & Mathematics
  • General Physics & Mathematics
  • Computer Vision & Pattern Recognition
  • Multimedia
  • Human Computer Interaction
  • Data Mining & Analysis
  • Databases & Information Systems
  • General Engineering & Computer Science
  • Information Retrieval, Db Structures And Fs Structures Therefor

Abstract

The invention provides a financial data management method and system based on image processing. A preset recognition layer is retrieved and overlaid on a target bill, the preset recognition layer is aligned with the target bill according to an alignment strategy and a positioning line, and first text information corresponding to each dimension label is extracted based on the recognition target areas. A text verification strategy corresponding to each dimension label is retrieved to verify the first text information. A class-one total node and a primary folder are generated for each second dimension label according to a class-one classification model, a plurality of class-one child nodes and secondary folders are generated based on the classification intervals of the second dimension label, and a class-one management tree corresponding to the second dimension label is generated from the class-one total node and the class-one child nodes. Fusion demand information of a user is parsed to obtain fusion dimension information, the fusion dimension information is numbered in the order given to obtain sequence dimension numbers, and the class-one management trees are fused based on a fusion strategy and the sequence dimension numbers to obtain a class-two management tree.

Description

Financial data management method and system based on image processing
Technical Field
The present invention relates to data processing technology, and in particular, to a financial data management method and system based on image processing.
Background
Financial data management refers to the process by which an enterprise manages its financial data by defined means and according to defined procedures. Financial data is characterized by large volume and frequent access. Enterprises accumulate a large number of financial bills in day-to-day operation, and managing these bills promptly makes later consultation and use easier.
In the prior art, most enterprises still store financial bills in folders. When bills need to be consulted, for example when a large number of bills must be reviewed, a manager has to look through the bills in every folder to pick out those that meet the requirement. Automatic screening according to user requirements is not possible, so a great deal of search time is wasted and errors occur easily.
Therefore, how to sort financial data automatically and retrieve it in a customized way according to different user requirements has become a problem to be solved.
Disclosure of Invention
The embodiment of the invention provides a financial data management method and a financial data management system based on image processing, which can realize automatic arrangement of financial data and customized retrieval of the financial data according to different requirements of users.
In a first aspect of an embodiment of the present invention, there is provided a financial data management method based on image processing, including:
retrieving a preset recognition layer and overlaying it on a target bill, wherein the preset recognition layer comprises a positioning line and a plurality of recognition target areas, aligning the preset recognition layer with the target bill according to an alignment strategy and the positioning line, and extracting first text information corresponding to each dimension label based on the recognition target areas;
retrieving a text verification strategy corresponding to the dimension label to verify the first text information, taking the first text information that meets the verification requirement as second text information of the target bill, and taking the corresponding dimension label as a second dimension label of the target bill;
generating, according to a class-one classification model, a class-one total node and a primary folder corresponding to each second dimension label, generating a plurality of class-one child nodes and secondary folders corresponding to the class-one total node based on the classification intervals corresponding to the second dimension label, and generating a class-one management tree corresponding to the second dimension label from the class-one total node and the class-one child nodes;
and receiving fusion demand information of a user, parsing the fusion demand information to obtain fusion dimension information, numbering the fusion dimension information in the order given to obtain sequence dimension numbers, and fusing the class-one management trees based on a fusion strategy and the sequence dimension numbers to obtain a class-two management tree.
Optionally, in one possible implementation of the first aspect, aligning the preset recognition layer with the target bill according to the alignment strategy and the positioning line, and extracting the first text information corresponding to each dimension label based on the recognition target areas, includes:
acquiring the end point of the positioning line on a first side of the preset recognition layer as a first positioning point and the end point of the positioning line on a second side as a second positioning point, and acquiring the end point of the bill line on the first side of the target bill as a third positioning point and the end point of the bill line on the second side as a fourth positioning point;
expressing the preset recognition layer and the target bill in coordinates to obtain a first coordinate of the first positioning point, a second coordinate of the second positioning point, a third coordinate of the third positioning point and a fourth coordinate of the fourth positioning point;
obtaining a first slope from the first coordinate and the second coordinate and a second slope from the third coordinate and the fourth coordinate, and obtaining a first angle corresponding to the first slope and a second angle corresponding to the second slope by applying the arctangent function to each slope;
obtaining an adjustment angle from the first angle and the second angle, rotating the preset recognition layer by the adjustment angle, and then positioning the first positioning point at the third coordinate of the third positioning point and the second positioning point at the fourth coordinate of the fourth positioning point;
and performing OCR character recognition on the regions of the target bill aligned with the recognition target areas to obtain initial text information for each title in the corresponding recognition target area, wherein each recognition target area has a corresponding dimension label, and extracting the first text information corresponding to each dimension label.
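The rotation and positioning steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function names (line_angle, adjustment_angle, rotate_about, align_layer) are hypothetical, and the sketch assumes that the adjustment angle is the second angle minus the first, that the rotation pivots on the first positioning point, and that the positioning line and the bill line have the same length, so that after translation the second positioning point lands on the fourth.

```python
import math


def line_angle(p, q):
    """Angle (radians) of the line through p and q, from the arctangent of its slope."""
    (x1, y1), (x2, y2) = p, q
    if x2 == x1:
        return math.pi / 2  # vertical line: the slope is undefined
    return math.atan((y2 - y1) / (x2 - x1))


def adjustment_angle(first, second, third, fourth):
    """Assumed definition: bill-line angle minus positioning-line angle."""
    return line_angle(third, fourth) - line_angle(first, second)


def rotate_about(point, pivot, angle):
    """Rotate point around pivot by angle (radians)."""
    px, py = pivot
    dx, dy = point[0] - px, point[1] - py
    c, s = math.cos(angle), math.sin(angle)
    return (px + dx * c - dy * s, py + dx * s + dy * c)


def align_layer(layer_points, first, second, third, fourth):
    """Rotate layer points about the first positioning point, then shift first onto third."""
    angle = adjustment_angle(first, second, third, fourth)
    tx, ty = third[0] - first[0], third[1] - first[1]
    aligned = []
    for pt in layer_points:
        rx, ry = rotate_about(pt, first, angle)
        aligned.append((rx + tx, ry + ty))
    return aligned
```

Because the arctangent only returns angles between -90 and 90 degrees, the sketch relies on the first side and second side being chosen consistently for both lines, for example always left and right. The same transform would be applied to the corners of every recognition target area before OCR is run on the aligned regions.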
Optionally, in one possible implementation of the first aspect, retrieving the text verification strategy corresponding to the dimension label to verify the first text information, taking the first text information that meets the verification requirement as the second text information of the target bill, and taking the corresponding dimension label as the second dimension label of the target bill, includes:
parsing the attribute of the dimension label, and if the attribute of the dimension label is a text attribute, retrieving a preset text quantity and a tail verification set, wherein the tail verification set comprises a plurality of preset tail texts, and taking the trailing characters of the first text information of the dimension label, equal in number to the preset text quantity, as the text to be compared;
if a preset tail text is consistent with the text to be compared, taking the first text information as the second text information of the target bill and the corresponding dimension label as the second dimension label of the target bill;
if the attribute of the dimension label is a communication digital attribute, retrieving a communication verification quantity, counting the characters of the first text information of the dimension label to obtain a quantity to be compared, and, if the quantity to be compared equals the communication verification quantity, taking the first text information as the second text information of the target bill and the corresponding dimension label as the second dimension label of the target bill;
and if the attribute of the dimension label is an amount digital attribute, retrieving an amount verification quantity, taking the number of digits after the decimal point in the first text information of the dimension label as a quantity to be verified, and, if the quantity to be verified equals the amount verification quantity, taking the first text information as the second text information of the target bill and the corresponding dimension label as the second dimension label of the target bill.
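As a rough illustration of the three verification strategies, the following Python sketch checks one extracted string per attribute. The function names, the attribute keys ("text", "communication", "amount") and the config dictionary are assumptions made for the example and do not come from the patent; the sample values in the usage note are likewise invented.

```python
def verify_text(value, tail_length, tail_set):
    """Text attribute: the last tail_length characters must appear in the tail verification set."""
    return len(value) >= tail_length and value[-tail_length:] in tail_set


def verify_communication(value, expected_length):
    """Communication digital attribute: the character count must equal the verification quantity."""
    return len(value) == expected_length


def verify_amount(value, expected_decimals):
    """Amount digital attribute: the number of digits after the decimal point must match."""
    _, separator, fraction = value.partition(".")
    return separator == "." and len(fraction) == expected_decimals


def verify(attribute, value, config):
    """Dispatch to the strategy for the label's attribute (attribute keys are hypothetical)."""
    if attribute == "text":
        return verify_text(value, config["tail_length"], config["tail_set"])
    if attribute == "communication":
        return verify_communication(value, config["length"])
    if attribute == "amount":
        return verify_amount(value, config["decimals"])
    raise ValueError(f"unknown attribute: {attribute}")
```

For instance, verify("amount", "1250.00", {"decimals": 2}) accepts the value, while "1250.0" or "1250" would be rejected and the bill flagged for re-capture. A value that passes becomes the second text information, and its label becomes a second dimension label.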
Optionally, in one possible implementation of the first aspect, generating, according to the class-one classification model, the class-one total node and the primary folder corresponding to each second dimension label, generating the plurality of class-one child nodes and secondary folders corresponding to the class-one total node based on the classification intervals corresponding to the second dimension label, and generating the class-one management tree corresponding to the second dimension label from the class-one total node and the class-one child nodes, includes:
generating the class-one total node corresponding to the second dimension label according to the class-one classification model, creating a primary folder, moving the target bills corresponding to the second dimension label into the primary folder based on the second text information, and associating the class-one total node with the primary folder;
generating a plurality of class-one child nodes connected to the class-one total node based on the classification intervals corresponding to the second dimension label, and generating the corresponding secondary folders;
classifying the target bills in the primary folder according to the classification intervals and the second text information to obtain the secondary bills corresponding to each classification interval, placing the secondary bills in the corresponding secondary folder, and associating each secondary folder with its class-one child node based on the classification interval;
and directly connecting the class-one child nodes to the class-one total node to generate the class-one management tree corresponding to the second dimension label.
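The construction of a class-one management tree can be sketched in memory as follows. This is an illustrative assumption, not the patented implementation: the Node class, the build_class_one_tree function, and the representation of a classification interval as a (label, predicate) pair are all hypothetical, and each node's bills list stands in for the contents of the folder associated with that node instead of a real directory.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    bills: list = field(default_factory=list)     # stands in for the associated folder
    children: list = field(default_factory=list)


def build_class_one_tree(dimension, bills, intervals):
    """Build a total node for one dimension label with one child node per interval.

    intervals is a list of (label, predicate) pairs; the predicate receives the
    bill's verified text value for this dimension.
    """
    # Total node and primary folder: every bill that carries this dimension label.
    total = Node(name=dimension, bills=[b for b in bills if dimension in b])
    for label, matches in intervals:
        # Child node and secondary folder: bills whose value falls in the interval.
        child = Node(name=label, bills=[b for b in total.bills if matches(b[dimension])])
        total.children.append(child)
    return total


# Invented sample data for illustration only.
sample_bills = [
    {"id": 1, "amount": 500.0},
    {"id": 2, "amount": 2500.0},
    {"id": 3, "amount": 800.0},
]
amount_intervals = [
    ("0-1000", lambda v: 0 <= v < 1000),
    ("1000-5000", lambda v: 1000 <= v < 5000),
]
amount_tree = build_class_one_tree("amount", sample_bills, amount_intervals)
```

In this sketch a bill is referenced from both the total node and a child node, whereas the text above moves bills between folders; a file-system version would create the directories and move the files instead.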
Optionally, in one possible implementation of the first aspect, receiving the fusion demand information of the user, parsing the fusion demand information to obtain the fusion dimension information, numbering the fusion dimension information in the order given to obtain the sequence dimension numbers, and fusing the class-one management trees based on the fusion strategy and the sequence dimension numbers to obtain the class-two management tree, includes:
receiving the fusion demand information of the user, and parsing the fusion demand information to obtain the fusion dimension information, wherein the fusion dimension information comprises only demand dimensions, and the demand dimensions comprise at least one of amount, invoice type, time, buyer company name and seller company name;
numbering the demand dimensions in the order given to obtain the sequence dimension number corresponding to each demand dimension, and ordering the demand dimensions by their sequence dimension numbers to obtain a first demand dimension sequence, wherein the demand dimensions correspond one-to-one to the second dimension labels;
and acquiring the class-one management tree corresponding to the second dimension label of the first demand dimension in the first demand dimension sequence as an initial management tree, deleting the first demand dimension from the first demand dimension sequence to obtain a second demand dimension sequence, and fusing the initial management tree successively based on the classification intervals corresponding to the demand dimensions in the second demand dimension sequence to obtain the class-two management tree.
Optionally, in one possible implementation of the first aspect, fusing the initial management tree successively based on the classification intervals corresponding to the demand dimensions in the second demand dimension sequence to obtain the class-two management tree includes:
extracting the first demand dimension in the second demand dimension sequence as the current demand dimension, determining the classification intervals corresponding to the current demand dimension as first relay intervals, constructing, based on the first relay intervals, first relay child nodes directly connected to each class-one child node in the initial management tree together with the corresponding first relay folders, and moving the first relay folders into the secondary folders corresponding to the initial management tree;
acquiring the second text information, for the current demand dimension, of the secondary bills in each secondary folder, classifying the secondary bills in the secondary folder based on the first relay intervals and the second text information, moving the secondary bills into the corresponding first relay folders, and associating each first relay child node with the corresponding first relay folder based on the first relay interval;
extracting the next demand dimension in the second demand dimension sequence as the current demand dimension, determining the classification intervals corresponding to the current demand dimension as second relay intervals, constructing, based on the second relay intervals, second relay child nodes directly connected to each first relay child node together with the corresponding second relay folders, and moving the second relay folders into the first relay folder corresponding to the first relay child node;
acquiring the second text information, for the current demand dimension, of the secondary bills in each first relay folder, classifying the secondary bills in the first relay folder based on the second relay intervals and the second text information, moving the secondary bills into the corresponding second relay folders, and associating each second relay child node with the corresponding second relay folder based on the second relay interval;
and taking the second relay child nodes as first relay child nodes and the second relay folders as first relay folders, and repeating the above steps until no demand dimension remains in the second demand dimension sequence, then stopping to obtain the class-two management tree.
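The repeated relay step amounts to splitting each leaf of the tree by the next demand dimension until the sequence is exhausted. The recursive Python sketch below illustrates this under stated assumptions: nodes are plain dictionaries whose "bills" list stands in for a folder, classification intervals are (label, predicate) pairs, and the names split and build_class_two_tree as well as the sample data are hypothetical.

```python
def split(node, dimensions, intervals):
    """Attach one child per interval of the next dimension, then recurse into each child."""
    if not dimensions:
        return node  # no demand dimension left: stop
    current, remaining = dimensions[0], dimensions[1:]
    for label, matches in intervals[current]:
        child = {
            "name": f"{current}:{label}",
            "bills": [b for b in node["bills"] if matches(b[current])],
            "children": [],
        }
        node["children"].append(split(child, remaining, intervals))
    return node


def build_class_two_tree(bills, dimension_order, intervals):
    """The first dimension gives the initial tree; later dimensions add relay levels."""
    root = {"name": dimension_order[0], "bills": list(bills), "children": []}
    return split(root, dimension_order, intervals)


# Invented sample data for illustration only.
sample_bills = [
    {"id": 1, "invoice_type": "special", "amount": 500.0},
    {"id": 2, "invoice_type": "special", "amount": 2500.0},
    {"id": 3, "invoice_type": "ordinary", "amount": 800.0},
]
sample_intervals = {
    "invoice_type": [
        ("special", lambda v: v == "special"),
        ("ordinary", lambda v: v == "ordinary"),
    ],
    "amount": [
        ("<1000", lambda v: v < 1000),
        (">=1000", lambda v: v >= 1000),
    ],
}
fused = build_class_two_tree(sample_bills, ["invoice_type", "amount"], sample_intervals)
```

Here the user's order "invoice type, then amount" produces a two-level tree in which each invoice-type node is subdivided by amount. Reversing the order would give a different class-two management tree from the same bills.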
Optionally, in one possible implementation of the first aspect, receiving the fusion demand information of the user, parsing the fusion demand information to obtain the fusion dimension information, numbering the fusion dimension information in the order given to obtain the sequence dimension numbers, and fusing the class-one management trees based on the fusion strategy and the sequence dimension numbers to obtain the class-two management tree, includes:
receiving the fusion demand information of the user, and parsing the fusion demand information to obtain the fusion dimension information, wherein the fusion dimension information comprises demand dimensions and a class-one demand interval corresponding to each demand dimension, the demand dimensions comprise at least one of amount, invoice type, time, buyer company name and seller company name, and the class-one demand interval is at least one of the classification intervals corresponding to the second dimension label;
numbering the demand dimensions in the order given to obtain the sequence dimension number corresponding to each demand dimension, and ordering the demand dimensions by their sequence dimension numbers to obtain a first demand dimension sequence;
and acquiring the class-one management tree corresponding to the first demand dimension in the first demand dimension sequence as a management tree to be pruned, together with the class-one demand interval of that demand dimension, retaining in the management tree to be pruned the class-one child nodes corresponding to the class-one demand interval to obtain a pruned first management tree, deleting the first demand dimension from the first demand dimension sequence to obtain a second demand dimension sequence, and fusing the first management tree successively based on the class-one demand intervals corresponding to the demand dimensions in the second demand dimension sequence to obtain the class-two management tree.
Optionally, in one possible implementation of the first aspect, fusing the first management tree successively based on the class-one demand intervals corresponding to the demand dimensions in the second demand dimension sequence to obtain the class-two management tree includes:
extracting the first demand dimension in the second demand dimension sequence as the current demand dimension, determining the class-one demand intervals corresponding to the current demand dimension as first relay intervals, constructing, based on the first relay intervals, first relay child nodes directly connected to each class-one child node in the first management tree together with the corresponding first relay folders, and moving the first relay folders into the secondary folders corresponding to the first management tree;
acquiring the second text information, for the current demand dimension, of the secondary bills in each secondary folder, classifying the secondary bills in the secondary folder based on the first relay intervals and the second text information, moving the secondary bills into the corresponding first relay folders, and associating each first relay child node with the corresponding first relay folder based on the first relay interval;
extracting the next demand dimension in the second demand dimension sequence as the current demand dimension, determining the class-one demand intervals corresponding to the current demand dimension as second relay intervals, constructing, based on the second relay intervals, second relay child nodes directly connected to each first relay child node together with the corresponding second relay folders, and moving the second relay folders into the first relay folder corresponding to the first relay child node;
acquiring the second text information, for the current demand dimension, of the secondary bills in each first relay folder, classifying the secondary bills in the first relay folder based on the second relay intervals and the second text information, moving the secondary bills into the corresponding second relay folders, and associating each second relay child node with the corresponding second relay folder based on the second relay interval;
and taking the second relay child nodes as first relay child nodes and the second relay folders as first relay folders, and repeating the above steps until no demand dimension remains in the second demand dimension sequence, then stopping to obtain the class-two management tree.
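When the user also selects which classification intervals to keep, only the requested branches survive. The sketch below is one possible reading, with hypothetical names (build_pruned_tree, the demand list of (dimension, selected labels) pairs) and invented sample data. Instead of building the full class-one tree and then pruning it as described above, it simply never creates the unselected nodes, which is assumed to give the same result.

```python
def build_pruned_tree(bills, demand, intervals):
    """Nested tree that keeps only the intervals the user selected.

    demand    -- ordered list of (dimension, set of selected interval labels)
    intervals -- dimension -> list of (label, predicate) classification intervals
    """
    def split(node, remaining):
        if not remaining:
            return node
        (dimension, wanted), rest = remaining[0], remaining[1:]
        for label, matches in intervals[dimension]:
            if label not in wanted:
                continue  # interval not requested: no node or folder is created
            child = {
                "name": f"{dimension}:{label}",
                "bills": [b for b in node["bills"] if matches(b[dimension])],
                "children": [],
            }
            node["children"].append(split(child, rest))
        return node

    root = {"name": demand[0][0], "bills": list(bills), "children": []}
    return split(root, demand)


# Invented sample data for illustration only.
sample_bills = [
    {"id": 1, "invoice_type": "special", "amount": 500.0},
    {"id": 2, "invoice_type": "special", "amount": 2500.0},
    {"id": 3, "invoice_type": "ordinary", "amount": 800.0},
]
sample_intervals = {
    "invoice_type": [
        ("special", lambda v: v == "special"),
        ("ordinary", lambda v: v == "ordinary"),
    ],
    "amount": [
        ("<1000", lambda v: v < 1000),
        (">=1000", lambda v: v >= 1000),
    ],
}
sample_demand = [("invoice_type", {"special"}), ("amount", {">=1000"})]
pruned = build_pruned_tree(sample_bills, sample_demand, sample_intervals)
```

With this demand the tree contains a single path, special invoices of at least 1000, so the user reaches the matching bills without browsing unrelated folders.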
Optionally, in one possible implementation of the first aspect, receiving the fusion demand information of the user, parsing the fusion demand information to obtain the fusion dimension information, numbering the fusion dimension information in the order given to obtain the sequence dimension numbers, and fusing the class-one management trees based on the fusion strategy and the sequence dimension numbers to obtain the class-two management tree, includes:
receiving the fusion demand information of the user, and parsing the fusion demand information to obtain the fusion dimension information, wherein the fusion dimension information comprises demand dimensions and a class-two demand interval corresponding to each demand dimension, the demand dimensions comprise at least one of an amount dimension, an invoice type dimension, a time dimension and a company name dimension, and the class-two demand interval is a demand interval actively entered by the user;
numbering the demand dimensions in the order given to obtain the sequence dimension number corresponding to each demand dimension, and ordering the demand dimensions by their sequence dimension numbers to obtain a demand dimension sequence;
acquiring the first demand dimension in the demand dimension sequence as the current demand dimension, retrieving the class-one total node corresponding to the current demand dimension as a class-two total node and the corresponding primary folder as a first folder, and associating the class-two total node with the first folder;
generating a plurality of class-two child nodes connected to the class-two total node based on the class-two demand intervals corresponding to the current demand dimension, and generating the corresponding second folders;
classifying the target bills in the first folder according to the class-two demand intervals and the second text information to obtain the class-two bills corresponding to each class-two demand interval, placing the class-two bills in the corresponding second folders, and associating each second folder with its class-two child node based on the class-two demand interval;
directly connecting the class-two child nodes to the class-two total node to generate a new management tree corresponding to the current demand dimension;
extracting the next demand dimension in the demand dimension sequence as the current demand dimension, determining the class-two demand intervals corresponding to the current demand dimension as first relay intervals, constructing, based on the first relay intervals, first relay child nodes directly connected to each class-two child node together with the corresponding first relay folders, and moving the first relay folders into the second folder corresponding to the class-two child node;
acquiring the second text information, for the current demand dimension, of the class-two bills in each second folder, classifying the class-two bills in the second folder based on the first relay intervals and the second text information, moving the class-two bills into the corresponding first relay folders, and associating each first relay child node with the corresponding first relay folder based on the first relay interval;
and repeating the above steps until no demand dimension remains in the demand dimension sequence, then stopping to obtain the class-two management tree.
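Intervals entered by the user need not coincide with the preset classification intervals, so they have to be turned into classifiers on the fly. The following Python sketch shows one way to do this for a numeric dimension; make_range and classify are hypothetical helpers, and treating each interval as half-open, [low, high), is an assumption, since the text does not say how boundary values are handled.

```python
def make_range(low, high):
    """User-entered numeric interval as a (label, predicate) pair; half-open [low, high)."""
    return (f"[{low}, {high})", lambda value: low <= value < high)


def classify(bills, dimension, ranges):
    """Place each bill in the folder of the first user-entered interval that contains it."""
    folders = {label: [] for label, _ in ranges}
    for bill in bills:
        for label, matches in ranges:
            if matches(bill[dimension]):
                folders[label].append(bill)
                break  # a bill outside every interval is left out of the new tree
    return folders


# Invented sample data for illustration only.
sample_bills = [
    {"id": 1, "amount": 500.0},
    {"id": 2, "amount": 2500.0},
    {"id": 3, "amount": 5000.0},
]
user_ranges = [make_range(0, 1200), make_range(1200, 3400)]
user_folders = classify(sample_bills, "amount", user_ranges)
```

Each key of the resulting dictionary plays the role of a class-two child node with its second folder. Applying classify again to the bills in each folder, with the intervals of the next demand dimension, yields the relay levels.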
In a second aspect of an embodiment of the present invention, there is provided a financial data management system based on image processing, including:
an extraction module, configured to retrieve a preset recognition layer and overlay it on a target bill, wherein the preset recognition layer comprises a positioning line and a plurality of recognition target areas, to align the preset recognition layer with the target bill according to an alignment strategy and the positioning line, and to extract first text information corresponding to each dimension label based on the recognition target areas;
a verification module, configured to retrieve a text verification strategy corresponding to the dimension label to verify the first text information, to take the first text information that meets the verification requirement as second text information of the target bill, and to take the corresponding dimension label as a second dimension label of the target bill;
a generation module, configured to generate, according to a class-one classification model, a class-one total node and a primary folder corresponding to each second dimension label, to generate a plurality of class-one child nodes and secondary folders corresponding to the class-one total node based on the classification intervals corresponding to the second dimension label, and to generate a class-one management tree corresponding to the second dimension label from the class-one total node and the class-one child nodes;
and a fusion module, configured to receive fusion demand information of a user, to parse the fusion demand information to obtain fusion dimension information, to number the fusion dimension information in the order given to obtain sequence dimension numbers, and to fuse the class-one management trees based on a fusion strategy and the sequence dimension numbers to obtain a class-two management tree.
In a third aspect of the embodiments of the present invention, there is provided a storage medium storing a computer program which, when executed by a processor, implements the method of the first aspect and of the various possible implementations of the first aspect.
The beneficial effects of the invention are as follows:
1. the invention can selectively and automatically identify the content in the bill, check and generate the second dimension label corresponding to the bill, and generate different second class management trees according to different demands of users. According to the invention, the positioning lines and the plurality of identification target areas at the preset identification layer are utilized to align with the target bill, so that the identification target areas at the preset identification layer correspond to the corresponding target bill areas, the characters required by the user can be conveniently identified later, and important contents in the bill can be conveniently selected for automatic classification. According to the method and the device, the first class management tree is subjected to fusion processing according to the requirements of the user, so that the second class management tree corresponding to the requirements of the user is obtained, different second class management trees can be conveniently obtained according to different requirements of the user, the user is assisted in fusion searching of financial data, and a result meeting the requirements of the user is obtained.
2. According to the invention, the target bill is positioned according to the positioning line and the identification target areas of the preset identification layer, the required text is automatically recognized to obtain the first text information, and the first text information is verified by the corresponding verification strategies to obtain second text information that meets the requirements. The positioning line of the preset identification layer and the bill line of the target bill are aligned with each other through rotation and alignment processing. First, the end points of the positioning line and of the bill line on the first side and on the second side are determined; the slopes of the positioning line and of the bill line are then obtained, from which the corresponding included angles are derived; the adjustment angle for the rotation is determined from the difference between the two included angles, so that the positioning line and the bill line become parallel; finally, the 2 positioning points of the positioning line are aligned directly with those of the bill line, so that the preset identification layer coincides completely with the target bill. After positioning is completed, only the characters in the areas corresponding to the identification target areas are recognized, which realizes automatic recognition of the important text information in the bill and yields the first text information of each dimension label.
Each dimension label has a corresponding verification strategy. If the attribute of the dimension label is a text attribute, the characters at the tail of the text, as many as the preset text quantity, are selected and compared with the tail verification set to check whether a recognition error has occurred; if the attribute is a communication digital attribute, the number of digits is verified; if the attribute is an amount digital attribute, it is checked whether the number of digits after the decimal point equals the amount verification quantity. The extracted text information is thus verified automatically: if an error is found, the administrator is reminded to photograph the bill again; if not, the first text information is used as the second text information of the target bill.
3. According to the invention, the class-one management tree corresponding to each second dimension label is automatically generated by using the class classification model, so that the user can directly retrieve the data of each second dimension label, and different class-two management trees are generated according to the different demands of users to assist them in performing fused retrieval of financial data. A corresponding class total node is generated for each second dimension label, and each class total node corresponds to a first-level folder that stores the target bills corresponding to that second dimension label. A plurality of class child nodes connected to the class total node are generated according to the classification intervals of the second dimension label, together with the second-level folders corresponding to the class child nodes. The target bills are classified according to the classification intervals and the second text information of the second dimension label of each target bill and are moved to the corresponding second-level folders. Each second dimension label thus has a corresponding class-one management tree, and the corresponding target bills can be retrieved directly for viewing according to the different single-dimension demands of users.
According to the invention, 2 kinds of fusion processing are performed on the class-one management trees according to the demand of the user, so that different fusions can be carried out for different demands. First, when the fusion dimension information of the user includes only demand dimensions, the class-one management trees of those demand dimensions are retrieved and automatically fused to obtain a multi-dimensional class-two management tree, which makes it convenient for the user to subsequently retrieve the corresponding target bills. Second, when the fusion dimension information of the user includes demand dimensions and the class-one demand intervals corresponding to them, the class-one management trees are retrieved and the corresponding nodes are cut out to obtain management trees that conform to the user's demand dimensions and class-one demand intervals; the corresponding management tree is then generated automatically according to the class-one demand intervals selected by the user, so that the demand intervals in that management tree meet the user's demand and the target bills required by the user can be located more accurately.
Drawings
FIG. 1 is a flow chart of a financial data management method based on image processing provided by the invention;
FIG. 2 is a schematic diagram of a management tree according to the present invention;
FIG. 3 is a schematic diagram of an initial management tree according to the present invention;
FIG. 4 is a schematic diagram of a first relay node according to the present invention;
FIG. 5 is a schematic structural diagram of a financial data management system based on image processing according to the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein.
It should be understood that, in the various embodiments of the present invention, the sequence numbers of the processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the present invention.
It should be understood that in the present invention, "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements that are expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present invention, "plurality" means two or more. "And/or" merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may represent: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates that the objects before and after it are in an "or" relationship. "Comprising A, B and C" and "comprising A, B, C" mean that all three of A, B and C are comprised; "comprising A, B or C" means that one of A, B and C is comprised; and "comprising A, B and/or C" means that any 1, any 2, or all 3 of A, B and C are comprised.
It should be understood that in the present invention, "B corresponding to A", "A corresponds to B", or "B corresponds to A" means that B is associated with A and that B can be determined from A. Determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information. A matching B means that the similarity between A and B is greater than or equal to a preset threshold.
As used herein, "if" may be interpreted as "when" or "upon" or "in response to a determination" or "in response to detection", depending on the context.
The technical scheme of the invention is described in detail below through specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.
The invention provides a financial data management method based on image processing, as shown in figure 1, comprising S1-S4:
S1, a preset identification layer is retrieved and overlaid on the target bill, the preset identification layer comprising a positioning line and a plurality of identification target areas; the preset identification layer is aligned with the target bill according to an alignment strategy and the positioning line, and the first text information corresponding to each dimension label is extracted based on the identification target areas.
The preset identification layer is a blank bill layer of the same size as the target bill. For example, if the target bill is a financial bill such as an invoice, the preset identification layer is a blank invoice without filled-in content whose titles and format are otherwise the same.
It should be noted that, because the preset identification layer is a blank bill layer of the same size as the target bill, the two are divided into the same number of areas at the same positions. For example, if the target bill is an invoice, the preset identification layer is a blank invoice, and the two have the same titles and the same areas, such as the buyer area, the seller area, and the amount area. An identification target area is an area that needs to be recognized, for example the seller area, so that only the content corresponding to the identification target areas is recognized subsequently.
It can be understood that, by aligning the positioning line of the preset identification layer with the bill line of the target bill, the preset identification layer is made to correspond completely to the target bill. The server can then extract only the text information within the identification target areas, and within those areas only the first text information corresponding to the dimension labels. A dimension label is a preset title label corresponding to an identification target area. For example, the buyer area of an invoice contains dimension titles such as name, telephone, and account number; if the dimension label is the name, only the text corresponding to the name in the buyer area is extracted.
In some embodiments, step S1 (aligning the preset identification layer with the target bill according to an alignment strategy and the positioning line, and extracting the first text information corresponding to each dimension label based on the identification target areas) includes S11-S15:
S11, the end point of the positioning line on the first side of the preset identification layer is obtained as the first positioning point, the end point of the positioning line on the second side of the preset identification layer is obtained as the second positioning point, the end point of the bill line on the first side of the target bill is obtained as the third positioning point, and the end point of the bill line on the second side of the target bill is obtained as the fourth positioning point.
It should be noted that the preset identification layer and the target bill are both derived from the same bill template. For example, if the template is an invoice, the preset identification layer is a blank invoice and the target bill is an invoice with content, and the two have the same layout. The side on which the buyer area is located can therefore be taken as the first side, and the side on which the password area is located as the second side. This is only an example for ease of understanding; the sides are set according to the actual layout of the bill.
It can be understood that a corresponding bill line is provided above the target bill, for example above the bill tie, and the preset identification layer is provided with a positioning line corresponding to the bill line. The first to fourth positioning points are obtained from the end points of these two lines as described in S11, so that the bill line and the positioning line can subsequently be aligned according to the positioning points and the preset identification layer corresponds completely to the target bill.
S12, coordinate processing is performed on the preset identification layer and the target bill to obtain the first coordinate of the first positioning point, the second coordinate of the second positioning point, the third coordinate of the third positioning point, and the fourth coordinate of the fourth positioning point.
It is easy to understand that once the preset identification layer and the target bill have undergone coordinate processing, that is, have been placed in a coordinate system, the coordinates of the four positioning points can be read off directly.
S13, obtaining a first slope according to the first coordinate and the second coordinate, obtaining a second slope according to the third coordinate and the fourth coordinate, and obtaining a first angle corresponding to the first slope and a second angle corresponding to the second slope based on an arctangent function of the first slope and the second slope.
It can be understood that the equation of the positioning line can be obtained by solving the simultaneous equations given by the first coordinate and the second coordinate, and the first slope of the positioning line can be determined from that equation; similarly, the second slope of the bill line can be obtained from the third coordinate and the fourth coordinate. Given a slope, the angle between the line and the X axis can be obtained with the arctangent function, which yields the first angle of the positioning line and the second angle of the bill line.
S14, obtaining an adjusting angle based on the first angle and the second angle, carrying out rotation processing on a preset identification layer according to the adjusting angle, positioning a first positioning point to a third coordinate of a third positioning point, and positioning a second positioning point to a fourth coordinate of a fourth positioning point.
It should be noted that the invention first rotates the preset identification layer by the adjustment angle and then translates it directly onto the target bill, so that the corresponding positioning points coincide.
It can be understood that the adjustment angle is obtained as the difference between the first angle and the second angle. If the adjustment angle is negative, the first angle is smaller than the second angle, and the preset identification layer is rotated anticlockwise by the magnitude of the adjustment angle; if the adjustment angle is positive, the first angle is larger than the second angle, and the layer is rotated clockwise by the magnitude of the adjustment angle. The first positioning point is then positioned at the third coordinate of the third positioning point and the second positioning point at the fourth coordinate of the fourth positioning point, so that the preset identification layer corresponds completely to the target bill.
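Steps S13 and S14 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `line_angle` and `align_layer` are hypothetical, the Y axis is assumed to point upward (in image coordinates with Y pointing down, the visual rotation direction is mirrored), and the layer and the bill are assumed to have the same scale because they come from the same template.

```python
import math


def line_angle(p, q):
    """Angle in radians between the X axis and the line through p and q."""
    (x1, y1), (x2, y2) = p, q
    if x2 == x1:
        return math.pi / 2  # vertical line: the slope is undefined
    slope = (y2 - y1) / (x2 - x1)
    return math.atan(slope)


def align_layer(first, second, third, fourth):
    """Return the adjustment angle and a function mapping layer points onto the bill."""
    first_angle = line_angle(first, second)   # positioning line (S13)
    second_angle = line_angle(third, fourth)  # bill line (S13)
    adjust = first_angle - second_angle       # adjustment angle (S14)

    # Positive adjust: rotate clockwise by adjust; negative: anticlockwise.
    # With the Y axis pointing up, both cases are a rotation by -adjust.
    cos_a, sin_a = math.cos(-adjust), math.sin(-adjust)

    def to_bill(point):
        # Rotate about the first positioning point, then translate it onto the third.
        dx, dy = point[0] - first[0], point[1] - first[1]
        return (third[0] + dx * cos_a - dy * sin_a,
                third[1] + dx * sin_a + dy * cos_a)

    return adjust, to_bill
```

Because the arctangent of a slope only distinguishes angles within half a turn, this sketch cannot detect a bill photographed upside down; handling that case would need additional information that the text above does not describe.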
S15, character recognition is performed, based on OCR character recognition, on the areas of the target bill that are aligned with the identification target areas to obtain the initial text information corresponding to each identification target area; each identification target area has corresponding dimension labels, and the text corresponding to each dimension label is extracted from the initial text information to obtain the first text information corresponding to each dimension label.
It should be noted that the invention recognizes only the areas of the invoice that need to be recognized and, within those areas, only the information that is needed, for example the name and the telephone number in the buyer area.
It can be understood that when OCR character recognition is performed on an area of the target bill aligned with an identification target area, all the text information in that area is recognized, for example the name, the bank of deposit, the address, and the telephone number.
Therefore, the invention further extracts from the initial text information according to the dimension labels of the identification target area, so as to obtain the first text information corresponding to each dimension label.
For example, there may be a plurality of identification target areas, such as a buyer target area and a seller target area. If the dimension labels of the buyer target area are the name and the telephone, and the dimension labels of the seller target area are the name and the bank, then in the buyer target area only the characters following the name title and the telephone title are extracted to obtain the first text information.
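The extraction in S15 can be sketched as follows, assuming that OCR has already produced the initial text information of one area as a single string in which each title is followed by a colon. That layout, the function name `extract_first_text`, and the sample values are assumptions made for illustration; the source does not specify how the OCR output is structured.

```python
import re


def extract_first_text(initial_text, region_titles, dimension_labels):
    """Keep only the text that follows the titles chosen as dimension labels."""
    # Split at every known title so that each value ends where the next title begins.
    titles = sorted(region_titles, key=len, reverse=True)
    pattern = "(" + "|".join(re.escape(t) for t in titles) + r")\s*[:：]\s*"
    parts = re.split(pattern, initial_text)
    # parts has the form [prefix, title, value, title, value, ...]
    found = {parts[i]: parts[i + 1].strip() for i in range(1, len(parts) - 1, 2)}
    return {label: found[label] for label in dimension_labels if label in found}
```

For a buyer area recognized as `"Name: Example Trading Ltd Address: 1 Sample Road Telephone: 13800000000 Bank: Sample Bank"` with the dimension labels `Name` and `Telephone`, the function returns only those two entries and discards the address and the bank.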
S2, the text verification strategy corresponding to each dimension label is retrieved to verify the first text information, the first text information that meets the verification requirement is used as the second text information of the target bill, and the corresponding dimension label is used as a second dimension label of the target bill.
It can be understood that different dimension labels have their own text verification strategies, and the first text information following each dimension label is verified according to its strategy. The first text information that meets the verification requirement is used as the second text information of the target bill, and the corresponding dimension label is used as a second dimension label of the target bill; for dimension labels that do not meet the requirement, a reminder is sent to the user to photograph the bill again.
In some embodiments, step S2 (retrieving the text verification strategy corresponding to each dimension label to verify the first text information, using the first text information that meets the verification requirement as the second text information of the target bill, and using the corresponding dimension label as a second dimension label of the target bill) includes S21-S24:
S21, the attribute of the dimension label is analyzed; if the attribute of the dimension label is a text attribute, a preset text quantity and a tail verification set are retrieved, the tail verification set comprising a plurality of preset tail texts, and the text of the preset text quantity at the tail of the first text information of the dimension label is obtained as the text to be compared.
It can be understood that the invention has a verification strategy for each kind of dimension label. If the attribute of the dimension label is a text attribute, the preset text quantity and the tail verification set are retrieved. The preset text quantity is a number of characters set in advance, for example 2 or 3, and the tail verification set is the set of standard tail texts for first text information whose attribute is the text attribute.
Furthermore, the method and the device can acquire the texts with the preset text quantity corresponding to the tail in the first text information as the texts to be compared, so that the subsequent comparison with the preset tail texts in the tail check set is convenient.
For example, consider verifying the name in the buyer area of an enterprise invoice. The suffix of an enterprise name is generally a word such as company, group, or college; these words are the preset tail texts, and the set they form is the tail verification set. With a corresponding preset text quantity of 2, the 2 characters at the tail of the name are extracted as the text to be compared. Because the tail text is generally a fixed word such as company or group, a text to be compared that matches one of the preset tail texts indicates that the characters were recognized correctly and were not misrecognized as abnormal text such as "public".
S22, if a preset tail text is consistent with the text to be compared, the first text information is used as the second text information of the target bill, and the corresponding dimension label is used as a second dimension label of the target bill.
It can be understood that if the text to be compared is consistent with any one of the preset tail texts in the tail verification set, the tail text is not abnormal and the verification is passed; the first text information is then used as the second text information of the target bill, and the corresponding dimension label is used as a second dimension label of the target bill.
S23, if the attribute of the dimension label is a communication digital attribute, a communication verification quantity is retrieved, and the number of characters of the first text information corresponding to the dimension label is obtained as the quantity to be compared; if the quantity to be compared is equal to the communication verification quantity, the first text information is used as the second text information of the target bill, and the corresponding dimension label is used as a second dimension label of the target bill.
It can be understood that if the attribute of the dimension label is a communication digital attribute, the communication verification quantity is retrieved. For example, if the dimension label is a mobile phone number, an error in the number of digits may occur during recognition, such as 11 digits becoming 13 digits.
Therefore, the invention extracts the character number of the first text information behind the dimension label with the attribute of the communication digital attribute to obtain the corresponding to-be-compared number, if the to-be-compared number is equal to the communication verification number, the verification is passed, the first text information is used as the second text information of the target bill, and the corresponding dimension label is used as the second dimension label of the target bill.
S24, if the attribute of the dimension label is an amount digital attribute, an amount verification quantity is retrieved, and the number of digits after the decimal point in the first text information corresponding to the dimension label is obtained as the quantity to be verified; if the quantity to be verified is equal to the amount verification quantity, the first text information is used as the second text information of the target bill, and the corresponding dimension label is used as a second dimension label of the target bill.
It can be understood that when the attribute of the dimension label is an amount digital attribute, the format of the amount on a bill such as an invoice is fixed, typically with 2 decimal places.
Therefore, the invention retrieves the amount verification quantity, obtains the number of digits after the decimal point in the first text information as the quantity to be verified, and compares the quantity to be verified with the amount verification quantity. If the two are the same, the format of the amount is correct, so the corresponding first text information is used as the second text information of the target bill, and the corresponding dimension label is used as a second dimension label of the target bill.
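The three verification strategies of S21-S24 can be sketched in Python as follows. All names and constants are hypothetical: the attribute keys `"text"`, `"communication"`, and `"amount"`, the three-letter English suffixes standing in for the preset tail texts, and the check quantities 11 and 2 are assumptions chosen to mirror the examples in the text, not values fixed by the source.

```python
TAIL_CHECK_SET = {"Ltd", "Inc", "LLC"}  # assumed stand-ins for the preset tail texts
PRESET_TEXT_COUNT = 3                   # assumed preset text quantity
COMMUNICATION_CHECK_COUNT = 11          # assumed length of a mobile phone number
AMOUNT_CHECK_COUNT = 2                  # assumed number of decimal places


def check_text(first_text):
    """S21-S22: compare the tail of the text with the tail verification set."""
    return first_text[-PRESET_TEXT_COUNT:] in TAIL_CHECK_SET


def check_communication(first_text):
    """S23: the number of characters must equal the communication verification quantity."""
    return len(first_text) == COMMUNICATION_CHECK_COUNT


def check_amount(first_text):
    """S24: the number of digits after the decimal point must equal the amount verification quantity."""
    _, point, decimals = first_text.partition(".")
    return point == "." and len(decimals) == AMOUNT_CHECK_COUNT


CHECKS = {"text": check_text, "communication": check_communication, "amount": check_amount}


def verify_bill(first_text_by_label, attribute_by_label):
    """Return the second text information and the labels that failed verification."""
    second_text, failed = {}, []
    for label, first_text in first_text_by_label.items():
        if CHECKS[attribute_by_label[label]](first_text):
            second_text[label] = first_text  # label becomes a second dimension label
        else:
            failed.append(label)             # would trigger a re-shooting reminder
    return second_text, failed
```

In this sketch a label whose text fails its check is simply collected in `failed`; how the reminder to photograph the bill again is delivered is left open, as in the text.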
S3, a class total node and a first-level folder corresponding to each second dimension label are generated according to a class classification model, a plurality of class child nodes and second-level folders corresponding to the class total node are generated based on the classification intervals corresponding to the second dimension label, and a class-one management tree corresponding to the second dimension label is generated from the class total node and the class child nodes.
It is easy to understand that the invention generates a class-one management tree for each second dimension label, so that the user can subsequently query the bill data under the corresponding second dimension label.
It can be understood that a corresponding class total node and first-level folder are generated for each second dimension label. Each second dimension label has corresponding classification intervals, and a corresponding number of class child nodes and second-level folders are generated according to those intervals. The target bills of the second dimension label are sorted into the corresponding second-level folders by classification interval, and the class-one management tree corresponding to the second dimension label is generated from the class total node and the class child nodes.
In some embodiments, step S3 (generating a class total node and a first-level folder corresponding to each second dimension label according to a class classification model, generating a plurality of class child nodes and second-level folders corresponding to the class total node based on the classification intervals corresponding to the second dimension label, and generating a class-one management tree corresponding to the second dimension label from the class total node and the class child nodes) includes S31-S34:
S31, a class total node corresponding to the second dimension label is generated according to the class classification model, a first-level folder is created, the target bills corresponding to the second dimension label are moved to the first-level folder based on the second text information, and the class total node is associated with the first-level folder.
It can be understood that a class total node is generated for each second dimension label according to the class classification model, a first-level folder is newly created, the target bills related to the second dimension label are moved to the first-level folder based on the second text information, and the class total node is associated with the first-level folder.
It can be understood that the invention establishes a class-one management tree for each second dimension label, and therefore first establishes the class total node of that class-one management tree.
For example, to establish the class-one management tree for the amount dimension, the class total node corresponding to the amount and the corresponding first-level folder are established first; at this point the first-level folder is empty. All invoices that carry amount information are then collected and moved to the first-level folder for the amount, and the class total node is associated with that first-level folder, so that when the user subsequently triggers the class total node, all invoices of the amount dimension are retrieved.
S32, a plurality of class child nodes connected to the class total node are generated based on the classification intervals corresponding to the second dimension label, and the corresponding second-level folders are generated.
It can be understood that each second dimension label has corresponding classification intervals, from which a plurality of class child nodes connected to the class total node and the corresponding second-level folders are generated.
For example, the classification intervals corresponding to the amount include 0-1000 yuan and more than 1000 yuan; the classification intervals corresponding to the time are 1-3 months, 4-6 months, 7-9 months, and 10-12 months. Each second dimension label is provided with its own preset classification intervals. Thus, taking the amount as an example, 3 class child nodes and 3 second-level folders are generated.
S33, the target bills in the first-level folder are classified according to the classification intervals and the second text information to obtain the second-level bills corresponding to each classification interval, the second-level bills are placed in the corresponding second-level folders, and each second-level folder is associated with its class child node based on the classification interval.
It can be understood that the target bills in the first-level folder are classified according to the classification intervals and the second text information of the corresponding second dimension label; that is, the second-level bills of each classification interval are obtained by assigning each target bill to the classification interval into which its second text information falls.
For example, the classification intervals of the amount, 3 in total, include 0-1000 yuan and more than 1000 yuan. According to the second text information at the amount of each target bill, for example bill 1: 100 yuan, bill 2: 5000 yuan, and bill 3: 100000 yuan, the bills are sorted into the corresponding classification intervals to obtain the second-level bills, and the second-level bills are moved into the second-level folders corresponding to those intervals. For the 2 intervals of 0-1000 yuan and more than 1000 yuan, each second-level folder is associated with the class child node of the same interval, so that when the user subsequently triggers a class child node, the target bills of the corresponding classification interval are retrieved automatically. Only 2 amount intervals are used here as an example for ease of understanding.
S34, the class child nodes are connected directly to the class total node to generate the class-one management tree corresponding to the second dimension label.
It can be understood that, as shown in FIG. 2, the class child nodes and the class total node are directly connected to generate the class-one management tree corresponding to the second dimension label.
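Steps S31-S34 can be sketched with a small in-memory structure. This is an illustration under assumptions, not the patented implementation: the `Node` class and `build_class_one_tree` function are hypothetical, folders are modeled as Python lists rather than file-system folders, bills are dictionaries of second text information, and only the two amount intervals named in the example are used.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    folder: list = field(default_factory=list)    # bills associated with this node
    children: list = field(default_factory=list)  # class child nodes (S34)


def build_class_one_tree(label, bills, intervals, to_value):
    """Build the tree for one second dimension label.

    intervals is a list of (interval name, predicate on the converted value).
    """
    # S31: class total node whose first-level folder holds every bill with this label.
    total = Node(label, folder=[bill for bill in bills if label in bill])
    for interval_name, contains in intervals:
        # S32-S33: one class child node and second-level folder per classification interval.
        matching = [bill for bill in total.folder if contains(to_value(bill[label]))]
        total.children.append(Node(interval_name, folder=matching))
    return total


AMOUNT_INTERVALS = [
    ("0-1000 yuan", lambda value: 0 <= value <= 1000),
    ("more than 1000 yuan", lambda value: value > 1000),
]
```

Triggering a node corresponds here to reading its `folder` list: the total node returns every bill of the dimension, and each child node returns only the bills of its interval.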
S4, fusion demand information of a user is received and analyzed to obtain fusion dimension information, the fusion dimension information is ordered in sequence to obtain sequence dimension serial numbers, and the class-one management trees are fused based on a fusion strategy and the sequence dimension serial numbers to obtain a class-two management tree.
It can be understood that, for the fusion demand information of different users, the invention analyzes the fusion demand information to obtain the fusion dimension information, orders the fusion dimension information in sequence to obtain the sequence dimension serial numbers, and fuses the class-one management trees based on the fusion strategy and the sequence dimension serial numbers to obtain the class-two management tree.
The invention fuses the class-one management trees in 3 modes to obtain the class-two management tree. When the user inputs only the demand dimensions and has no explicit requirement on the demand intervals, the class-one management trees of the corresponding demand dimensions are retrieved and automatically fused using their existing classification intervals to obtain the class-two management tree required by the user, so that fused data retrieval can be performed according to the user's fusion demand to obtain the target bills the user needs. The first mode is as follows:
In some embodiments, step S4 (receiving the fusion demand information of the user, analyzing the fusion demand information to obtain fusion dimension information, ordering the fusion dimension information in sequence to obtain sequence dimension serial numbers, and fusing the class-one management trees based on the fusion strategy and the sequence dimension serial numbers to obtain a class-two management tree) includes A41-A43:
And A41, receiving the fusion demand information of the user, and analyzing the fusion demand information to obtain fusion dimension information, wherein the fusion dimension information only comprises a demand dimension, and the demand dimension comprises at least one of amount, invoice type, time, purchaser company name and seller company name.
It can be understood that the fused requirement information of the user is received, the fused requirement information is analyzed to obtain fused dimension information, the fused dimension information only comprises a requirement dimension, and it is easy to understand that the user only has corresponding requirements for different dimensions, for example, the user A needs various target notes of the amount dimension in the time dimension. The demand dimension includes at least one of an amount, an invoice type, a time, a purchaser company name, and a seller company name.
A42, sequentially sequencing the demand dimensions according to the sequence to obtain sequence dimension serial numbers corresponding to the demand dimensions, sequentially sequencing the demand dimensions based on the sequence dimension serial numbers to obtain a first demand dimension sequence, wherein the demand dimensions are in one-to-one correspondence with the second dimension labels.
It can be understood that the demand dimensions are sequentially ordered according to the sequence to obtain sequential dimension serial numbers corresponding to the demand dimensions, and the demand dimensions are sequentially ordered according to the sequential dimension serial numbers to obtain a first demand dimension sequence, wherein the demand dimensions are in one-to-one correspondence with the second dimension labels.
For example, user A needs target bills organized by the time dimension, the amount dimension and the invoice type. The first demand dimension is time, the second is amount, and the last is invoice type, so the sequence dimension number of time is 1, that of amount is 2, and that of invoice type is 3. Sorting yields the first demand dimension sequence {time, amount, invoice type}, in which each demand dimension corresponds one-to-one to a second dimension label.
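The numbering in this example can be sketched in a few lines of Python. This is a minimal illustration that assumes the sequence dimension number is simply the 1-based position in which the user listed each demand dimension; the function name `order_demand_dimensions` is hypothetical.

```python
def order_demand_dimensions(requested):
    """Assign 1-based sequence dimension numbers in the order the user listed them."""
    numbers = {dimension: index for index, dimension in enumerate(requested, start=1)}
    # Sorting by sequence dimension number gives the first demand dimension sequence.
    sequence = sorted(numbers, key=numbers.get)
    return numbers, sequence


numbers, first_sequence = order_demand_dimensions(["time", "amount", "invoice type"])
print(numbers)
print(first_sequence)
```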
A43, obtaining a management tree of a type corresponding to a second dimension label from the first demand dimension sequence as an initial management tree, deleting the first demand dimension from the first demand dimension sequence to obtain a second demand dimension sequence, and sequentially carrying out fusion processing on the initial management tree based on a classification section corresponding to the demand dimension in the second demand dimension sequence to obtain a second type management tree.
It can be understood that, a management tree of a second dimension label corresponding to a first dimension in the first dimension sequence is obtained as an initial management tree, the first dimension in the first dimension sequence is deleted to obtain a second dimension sequence, and the initial management tree is sequentially fused according to a classification interval corresponding to a dimension in the second dimension sequence to obtain a second management tree.
For example, as shown in fig. 3, the class-one management tree of the second dimension label corresponding to time in the first demand dimension sequence {time, amount, invoice type} is obtained as the initial management tree, and time is deleted to obtain the second demand dimension sequence {amount, invoice type};
the initial management tree is then fused according to the classification interval corresponding to the amount demand dimension in the second demand dimension sequence to obtain the class-two management tree, that is, the data is classified directly into the existing intervals of the amount dimension and the corresponding nodes are generated.
In some embodiments, step A43 (sequentially carrying out fusion processing on the initial management tree based on the classification interval corresponding to the demand dimension in the second demand dimension sequence to obtain a class-two management tree) includes A431-A435:
A431, extracting a first demand dimension in the second demand dimension sequence as a current demand dimension, determining a classification section corresponding to the current demand dimension as a first relay section, constructing a first relay sub-node and a corresponding first relay folder which are directly connected with each sub-node in the initial management tree based on the first relay section, and moving the first relay folder into a second folder corresponding to the initial management tree.
The first relay section is the classification interval corresponding to the current demand dimension, for example, the 2 intervals 0-1000 yuan and above 1000 yuan for the amount dimension in the second demand dimension sequence {amount, invoice type}.
Further, a first relay child node and a corresponding first relay folder which are directly connected with each child node in the initial management tree are constructed based on the first relay interval, and the first relay folder is moved to a second-level folder corresponding to the initial management tree.
It is to be understood that when the initial management tree is obtained, the target bill is further classified according to the classification interval of the subsequent requirement dimension, and a first relay child node directly connected with each type of child node in the initial management tree is generated.
For example, as shown in fig. 4, after obtaining an initial management tree corresponding to time, a first relay child node directly connected with each type of child node in the initial management tree is constructed according to 2 intervals of 0-1000 yuan, more than 1000 yuan. And the secondary notes in the secondary folder are classified into the corresponding first relay folder, so that the first relay child nodes are associated with the corresponding first relay folder, and a subsequent user can conveniently trigger and inquire the corresponding target notes according to the requirements.
And A432, acquiring second text information of the second bill corresponding to the current requirement dimension in the second folder, classifying the second bill in the second folder based on the first relay interval and the second text information, moving the second bill into a corresponding first relay folder, and associating the first relay child node with the corresponding first relay folder based on the first relay interval.
It can be understood that the second text information of the second bill corresponding to the current requirement dimension in the second folder is obtained, so that the second bill corresponding to the second text information can be conveniently classified by using the first relay interval and the second text information.
Further, the secondary bills in the secondary folder are classified and moved to the corresponding first relay folder through the first relay section and the second text information, and the first relay child nodes are associated with the corresponding first relay folder based on the first relay section.
For example, using the amount in the second text information of each second-level bill (such as 100 or 5000) and the 2 intervals 0-1000 yuan and above 1000 yuan, the second-level bills are classified into the corresponding interval and moved to the corresponding first relay folder.
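The classification in this example can be sketched as follows. This is a hedged illustration rather than the patented implementation: the helper `move_into_relay_folders`, the list-of-predicates layout, and the choice to place exactly 1000 yuan in the lower interval are assumptions.

```python
def move_into_relay_folders(second_level_folder, relay_intervals, key):
    """Sort the bills of a second-level folder into one relay folder per interval."""
    relay_folders = {name: [] for name, _ in relay_intervals}
    for bill in second_level_folder:
        for name, contains in relay_intervals:
            if contains(bill[key]):
                relay_folders[name].append(bill)
                break  # a bill is moved into the first matching interval only
    return relay_folders


# First relay intervals for the amount dimension (boundary handling is assumed).
RELAY_INTERVALS = [
    ("0-1000", lambda amount: 0 <= amount <= 1000),
    ("above 1000", lambda amount: amount > 1000),
]

folder = [{"id": "b1", "amount": 100}, {"id": "b2", "amount": 5000}]
relay = move_into_relay_folders(folder, RELAY_INTERVALS, "amount")
print({name: [bill["id"] for bill in members] for name, members in relay.items()})
```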
A433, extracting the next demand dimension in the second demand dimension sequence as the current demand dimension, determining a classification interval corresponding to the current demand dimension as a second relay interval, constructing a second relay sub-node directly connected with each first relay sub-node and a corresponding second relay folder based on the second relay interval, and moving the second relay folder to a first relay folder corresponding to the first relay sub-node.
It can be understood that, consistent with the principle that the first relay child nodes and the corresponding first relay folders are constructed after the current demand dimension is determined before, the method and the device continue to extract the next demand dimension in the second demand dimension sequence as the current demand dimension, and the classification section corresponding to the current demand dimension is used as the second relay section, and based on the second relay section, the second relay child nodes and the corresponding second relay folders which are directly connected with the first relay child nodes are constructed, and the second relay folders are moved to the first relay folders corresponding to the first relay child nodes.
It will be appreciated that if the user's second demand dimension sequence is {amount, invoice type}, the invoice type is then processed on top of the time and amount dimensions by repeating the same steps used for the amount.
And A434, acquiring second text information of the second-level bill in the first relay folder corresponding to the current requirement dimension, classifying the second-level bill in the first relay folder based on the second relay interval and the second text information, moving the second-level bill into a corresponding second relay folder, and associating the second relay child node with the corresponding second relay folder based on the second relay interval.
It can be understood that, consistent with the principle of step a432, the second-level notes in the first relay folder are classified and moved into the corresponding second relay folder based on the second relay interval and the second text information, and the second relay child nodes are associated with the corresponding second relay folder based on the second relay interval, so that the subsequent user can conveniently trigger and inquire the corresponding target notes.
And A435, taking the second relay child node as a first relay child node and the second relay folder as a first relay folder, repeating the steps until the second requirement dimension sequence does not have the requirement dimension, and stopping to obtain a class II management tree.
It can be understood that the above steps are repeated continuously until the second demand dimension sequence does not have a demand dimension, and a second class management tree is obtained, and it is easy to understand that the target bill is classified continuously according to the demand dimension in the second demand dimension sequence, and a corresponding management tree is generated, so that the finally obtained second class management tree corresponds to the fusion demand information of the user, the subsequent user can trigger the nodes in the second class management tree conveniently, and the corresponding target bill is queried.
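Steps A431 to A435 amount to refining every leaf of the current tree once per remaining demand dimension. The sketch below shows one way this could look in Python under stated assumptions: the `Node` class, the `CLASSIFY` table (one time interval per year, two amount intervals), and the function names are hypothetical, relay nodes are created only for intervals that actually contain bills, and a node's `bills` list stands for everything stored anywhere inside its folder.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    bills: list = field(default_factory=list)     # folder associated with this node
    children: list = field(default_factory=list)  # directly connected child nodes


# Hypothetical classification intervals: each dimension maps a bill to an interval label.
CLASSIFY = {
    "time": lambda bill: bill["time"][:4],  # assumed: one interval per year
    "amount": lambda bill: "0-1000" if bill["amount"] <= 1000 else "above 1000",
    "invoice type": lambda bill: bill["invoice type"],
}


def split(node, dimension):
    """Attach one relay child node (with its folder) per interval found under `node`."""
    groups = {}
    for bill in node.bills:
        groups.setdefault(CLASSIFY[dimension](bill), []).append(bill)
    node.children = [Node(label, members) for label, members in sorted(groups.items())]
    return node.children


def fuse(bills, first_sequence):
    """Build the tree for the first dimension, then refine each leaf per remaining dimension."""
    root = Node(first_sequence[0], list(bills))
    leaves = split(root, first_sequence[0])   # initial management tree
    for dimension in first_sequence[1:]:      # second demand dimension sequence
        leaves = [child for leaf in leaves for child in split(leaf, dimension)]
    return root


def show(node, depth=0):
    print("  " * depth + node.label, [bill["id"] for bill in node.bills])
    for child in node.children:
        show(child, depth + 1)


bills = [
    {"id": "b1", "time": "2023-01-05", "amount": 100, "invoice type": "special"},
    {"id": "b2", "time": "2023-03-01", "amount": 5000, "invoice type": "ordinary"},
    {"id": "b3", "time": "2022-07-10", "amount": 800, "invoice type": "ordinary"},
]
root = fuse(bills, ["time", "amount", "invoice type"])
show(root)
```

The loop ends when no demand dimension remains, which mirrors the stopping condition of step A435.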
The invention performs fusion processing on the class-one management tree in 3 modes to obtain the class-two management tree. In the second mode, when a user determines the corresponding demand dimensions and selects the corresponding class-one demand intervals, the class-one management trees of the corresponding demand dimensions are automatically called, then cut and fused according to the class-one demand interval of each demand dimension to obtain the class-two management tree required by the user, so that fused data retrieval can be carried out according to the user's fusion requirement to obtain the target bills the user needs. The second mode is as follows:
In some embodiments, step S4 (receiving the fusion requirement information of the user, analyzing the fusion requirement information to obtain fusion dimension information, sequentially ordering the fusion dimension information according to the order to obtain a sequence dimension number, and performing fusion processing on the class-one management tree based on the fusion policy and the sequence dimension number to obtain a class-two management tree) includes B41-B43:
and B41, receiving the fused demand information of the user, and analyzing the fused demand information to obtain fused dimension information, wherein the fused dimension information comprises a demand dimension and a type of demand interval corresponding to the demand dimension, the demand dimension comprises at least one of an amount, an invoice type, time, a buyer company name and a seller company name, and the type of demand interval is at least one of classification intervals corresponding to the second dimension label.
It can be understood that the fused demand information of the user is received, the fused demand information is analyzed to obtain fused dimension information, the fused dimension information comprises a demand dimension and a type of demand interval corresponding to the demand dimension, and it is easy to understand that the user has not only the demand dimension but also a requirement for the type of demand interval under the demand dimension.
The first-class demand interval is an interval selected by a user for classifying the second dimension label, for example, the classifying interval corresponding to the amount of money is 0-1000 yuan, more than 1000 yuan, and if the user selects 0-1000 yuan, the 0-1000 yuan is the first-class demand interval.
And B42, sequentially sequencing the demand dimensions according to the sequence to obtain sequence dimension serial numbers corresponding to the demand dimensions, and sequentially sequencing the demand dimensions based on the sequence dimension serial numbers to obtain a demand dimension sequence.
It will be appreciated that this is consistent with the a42 principle.
B43, acquiring the class-one management tree corresponding to the first demand dimension in the first demand dimension sequence as the management tree to be cut, together with its class-one demand interval, retaining the class-one child nodes corresponding to the class-one demand interval in the management tree to be cut to obtain a cut first management tree, deleting the first demand dimension from the first demand dimension sequence to obtain a second demand dimension sequence, and sequentially carrying out fusion processing on the first management tree based on the class-one demand intervals corresponding to the demand dimensions in the second demand dimension sequence to obtain the class-two management tree.
It can be understood that the server acquires the class-one management tree corresponding to the first demand dimension in the first demand dimension sequence as the management tree to be cut, and takes the classification intervals the user selected in that tree as the class-one demand interval. The class-one child nodes corresponding to the class-one demand interval are retained in the management tree to be cut, giving the cut first management tree. The first demand dimension is then deleted from the first demand dimension sequence to obtain the second demand dimension sequence, and the first management tree is sequentially fused based on the class-one demand intervals corresponding to the demand dimensions in the second demand dimension sequence to obtain the class-two management tree.
Through the implementation mode, the requirement dimension and the first-class requirement interval of the user are considered at the same time, only the first-class requirement interval meeting the requirement of the user is classified again, and the corresponding second-class management tree is generated, so that the follow-up user can conveniently inquire the target bill.
In some embodiments, step B43 (sequentially carrying out fusion processing on the first management tree based on the class-one demand intervals corresponding to the demand dimensions in the second demand dimension sequence to obtain the class-two management tree) includes B431-B435:
And B431, extracting a first demand dimension in the second demand dimension sequence as a current demand dimension, determining a type of demand interval corresponding to the current demand dimension as a first relay interval, constructing a first relay child node and a corresponding first relay folder which are directly connected with each type of child node in the first management tree based on the first relay interval, and moving the first relay folder into a second folder corresponding to the first management tree.
It is to be understood that, similar to the principle of step a431, the difference is that the present solution only generates the corresponding first relay child node and the corresponding first relay folder for the type of the requirement interval selected by the user, which is not described herein.
And B432, acquiring second text information of the second-level notes in the second-level folder corresponding to the current requirement dimension, classifying the second-level notes in the second-level folder based on the first relay section and the second text information, moving the second-level notes into corresponding first relay folders, and associating the first relay child nodes with the corresponding first relay folders based on the first relay section.
It is to be understood that, consistent with the principle of step a432, the second notes in the second folder are categorized and moved into the corresponding first relay folder by using the first relay section and the second text information, and the first relay child nodes are associated with the corresponding first relay folders based on the first relay section.
B433, extracting the next demand dimension in the second demand dimension sequence as the current demand dimension, determining a first demand interval corresponding to the current demand dimension as a second relay interval, constructing a second relay sub-node directly connected with each first relay sub-node and a corresponding second relay folder based on the second relay interval, and moving the second relay folder to a first relay folder corresponding to the first relay sub-node.
It will be appreciated that this is consistent with the principle of step A433 for constructing relay child nodes, and the repeated details are not described again here.
And B434, acquiring second text information of the second-level bill in the first relay folder corresponding to the current requirement dimension, classifying the second-level bill in the first relay folder based on the second relay interval and the second text information, moving the second-level bill into a corresponding second relay folder, and associating the second relay child node with the corresponding second relay folder based on the second relay interval.
And B435, taking the second relay child node as a first relay child node and the second relay folder as a first relay folder, repeating the steps until the second requirement dimension sequence does not have the requirement dimension, and stopping to obtain a class II management tree.
It can be understood that, consistent with the principle of step A435, the above steps are repeated until no demand dimension remains in the second demand dimension sequence, and the class-two management tree is obtained. At this point, all nodes in the class-two management tree and their corresponding folders have been built according to the class-one demand intervals selected by the user.
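The second mode differs from the first only in that nodes are built for the intervals the user selected. A minimal Python sketch of that cut-and-fuse idea follows; the nested-dictionary tree, the `INTERVAL_OF` table, and the function name `fuse_selected` are assumptions made for illustration and do not come from the source.

```python
# Hypothetical classification intervals for two dimensions.
INTERVAL_OF = {
    "time": lambda bill: bill["time"][:4],  # assumed: one interval per year
    "amount": lambda bill: "0-1000" if bill["amount"] <= 1000 else "above 1000",
}


def fuse_selected(bills, selections):
    """Keep only the user-selected intervals of each dimension, nesting them in order."""
    if not selections:
        return [bill["id"] for bill in bills]  # leaf folder: ids of the remaining bills
    (dimension, chosen), *rest = selections
    groups = {label: [] for label in chosen}   # nodes exist only for chosen intervals
    for bill in bills:
        label = INTERVAL_OF[dimension](bill)
        if label in groups:                    # bills outside the selection are cut away
            groups[label].append(bill)
    return {label: fuse_selected(members, rest) for label, members in groups.items()}


bills = [
    {"id": "b1", "time": "2023-01-05", "amount": 100},
    {"id": "b2", "time": "2023-03-01", "amount": 5000},
    {"id": "b3", "time": "2022-07-10", "amount": 800},
]
tree = fuse_selected(bills, [("time", ["2023"]), ("amount", ["0-1000"])])
print(tree)
```

In this sketch the order of the `selections` list plays the role of the sequence dimension numbers.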
The invention performs fusion processing on the class-one management tree in 3 modes to obtain the class-two management tree. In the third mode, when a user determines the corresponding demand dimensions, the class-one total node and first-level folder of the class-one management tree corresponding to each demand dimension are called, and the class-two demand intervals actively input by the user are received. Corresponding nodes and relay folders are then built automatically according to the class-two demand intervals to obtain the class-two management tree required by the user, so that fused data retrieval can be carried out for the user's different class-two demand intervals to obtain the target bills the user needs. The third mode is as follows:
In some embodiments, step S4 (receiving the fusion requirement information of the user, analyzing the fusion requirement information to obtain fusion dimension information, sequentially ordering the fusion dimension information according to the order to obtain a sequence dimension number, and performing fusion processing on the class-one management tree based on the fusion policy and the sequence dimension number to obtain a class-two management tree) includes C41-C49:
And C41, receiving the fused demand information of the user, and analyzing the fused demand information to obtain fused dimension information, wherein the fused dimension information comprises a demand dimension and a second-class demand interval corresponding to the demand dimension, the demand dimension comprises at least one of an amount dimension, an invoice type dimension, a time dimension and a company name dimension, and the second-class demand interval is a demand interval actively input by the user.
It can be understood that, similar to the B41 principle, the difference is that the first type of demand interval is an interval selected by the user for the classification interval of the second dimension label, and the second type of demand interval in the present solution is an interval directly input by the user, for example, 10-100 yuan.
And C42, sequentially sequencing the demand dimensions according to the sequence to obtain sequence dimension serial numbers corresponding to the demand dimensions, and sequentially sequencing the demand dimensions based on the sequence dimension serial numbers to obtain a demand dimension sequence.
It will be appreciated that the principle is consistent with step B42 and will not be described in detail herein.
And C43, acquiring a first demand dimension in a demand dimension sequence as a current demand dimension, calling a class-one total node corresponding to the current demand dimension as a class-two total node, and associating the class-two total node with a first folder by using a corresponding class-one folder as the first folder.
It can be understood that a first demand dimension in a demand dimension sequence is obtained as a current demand dimension, a class total node and a corresponding class one folder in a class management tree corresponding to the current demand dimension are called, the class total node is taken as a class two total node, the class one folder is taken as a first folder, and the class two total node is associated with the first folder. It is easy to understand that, according to the requirement dimension of the user, a class of total nodes and corresponding class-one folders in the class of management tree corresponding to the class-one total nodes are firstly called.
C44, generating a plurality of class-II child nodes connected with the class-II total nodes based on class-II demand intervals corresponding to the current demand dimension, and generating corresponding second folders;
it can be understood that a plurality of class two child nodes connected with the class two total nodes are generated according to the class two demand intervals actively input by the user, and corresponding second folders are generated.
And C45, classifying the target notes in the first folder according to the second-class demand intervals and the second text information to obtain second-class notes corresponding to each second-class demand interval, placing the second-class notes in the corresponding second folders, and associating the second-class folders with the second-class child nodes based on the classifying intervals.
It can be understood that the target bills in the first folder are classified according to the class-two demand intervals actively input by the user to obtain the second-level bills corresponding to each class-two demand interval, the second-level bills are placed in the corresponding second folders, and the second folders are associated with the class-two child nodes based on the classification intervals. The principle of classification and association is consistent with that described above and is not repeated here.
And C46, directly connecting the second class child nodes with the second class total nodes to generate a new management tree corresponding to the current requirement dimension.
It will be appreciated that a new management tree is generated based on the second class of demand intervals entered by the user.
And C47, extracting the next demand dimension in the demand dimension sequence as the current demand dimension, determining a second class demand interval corresponding to the current demand dimension as a first relay interval, constructing a first relay child node and a corresponding first relay folder which are directly connected with each second class child node based on the first relay interval, and moving the first relay folder into a second folder corresponding to the second class child node.
It can be understood that the principle is consistent with that of step B431, except that the class-two demand interval is actively input by the user, so nodes and folders can continue to be newly created according to that interval.
And C48, acquiring second text information of the second-level notes in the second folder corresponding to the current demand dimension, classifying the second-level notes in the second folder based on the first relay section and the second text information, moving the second-level notes into corresponding first relay folders, and associating the first relay child nodes with the corresponding first relay folders based on the first relay section.
It can be understood that, consistent with the principle of step B432, the target notes are continuously categorized into the corresponding first relay folders according to the second-class requirement interval input by the user, and the first relay child nodes are associated with the corresponding first relay folders.
And C49, taking the first relay child node as a class II child node and the first relay folder as a second folder, repeating the steps until the second requirement dimension sequence does not have the requirement dimension, and stopping to obtain a class II management tree.
It can be understood that the above steps are repeated until the second requirement dimension sequence does not have a requirement dimension, and a second class management tree is obtained, where each node and folder in the second class management tree correspond to a requirement actively input by a user.
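For the third mode, the intervals come from the user rather than from the stored classification intervals, for example 10-100 yuan. The following Python is a sketch under assumptions: intervals are `(low, high)` pairs that include both endpoints, times are ISO date strings compared as text, and `build_custom_tree` is a hypothetical name.

```python
def in_interval(value, interval):
    low, high = interval
    return low <= value <= high  # assumed: both endpoints belong to the interval


def build_custom_tree(bills, demands):
    """Nest user-entered intervals in sequence order; leaf folders list bill ids."""
    if not demands:
        return [bill["id"] for bill in bills]
    (dimension, intervals), *rest = demands
    tree = {}
    for interval in intervals:
        # A new node and folder are created for every interval the user entered.
        members = [bill for bill in bills if in_interval(bill[dimension], interval)]
        tree[interval] = build_custom_tree(members, rest)
    return tree


bills = [
    {"id": "b1", "amount": 50, "time": "2023-02-01"},
    {"id": "b2", "amount": 3000, "time": "2022-05-05"},
    {"id": "b3", "amount": 5, "time": "2023-06-01"},
]
demands = [
    ("amount", [(10, 100), (101, 5000)]),
    ("time", [("2023-01-01", "2023-12-31")]),
]
custom = build_custom_tree(bills, demands)
print(custom)
```

Because both endpoints are included, a bill would land in two folders if the user entered intervals that share a boundary; a real system would need a rule for that case, which the source does not specify.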
Through the embodiment, the two-class management tree corresponding to the user requirement can be generated and the corresponding folders are associated, so that the two-class management tree can be directly triggered by the subsequent user without generating different two-class management trees according to the user requirement, and the corresponding target bill can be directly called.
As shown in fig. 5, the financial data management system based on image processing according to the embodiment of the present invention includes:
the extraction module is used for calling a preset identification layer to be overlapped above the target bill, the preset identification layer comprises a positioning line and a plurality of identification target areas, the preset identification layer is aligned with the target bill according to an alignment strategy and the positioning line, and first text information corresponding to each dimension label is extracted based on the identification target areas;
the verification module is used for retrieving a text verification strategy corresponding to the dimension label to verify the first text information, taking the first text information meeting the verification requirement as second text information of the target bill, and taking the corresponding dimension label as a second dimension label of the target bill;
The generation module is used for generating a class total node and a class-one folder corresponding to each second dimension label according to a class classification model, generating a plurality of class sub-nodes and class-two folders corresponding to the class total node based on a classification interval corresponding to the second dimension label, and generating a class management tree corresponding to the second dimension label according to the class total node and the class sub-nodes;
the fusion module is used for receiving the fusion requirement information of the user, analyzing the fusion requirement information to obtain fusion dimension information, sequentially sequencing the fusion dimension information according to the sequence to obtain sequence dimension serial numbers, and carrying out fusion processing on the first management tree based on the fusion strategy and the sequence dimension serial numbers to obtain a second management tree.
The present invention also provides a storage medium having stored therein a computer program for implementing the methods provided by the various embodiments described above when executed by a processor.
The storage medium may be a computer storage medium or a communication medium. Communication media include any medium that facilitates transfer of a computer program from one place to another. Computer storage media can be any available media that can be accessed by a general purpose or special purpose computer. For example, a storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application specific integrated circuit (ASIC). In addition, the ASIC may reside in a user device. The processor and the storage medium may also reside as discrete components in a communication device. The storage medium may be a read-only memory (ROM), a random-access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
The present invention also provides a program product comprising execution instructions stored in a storage medium. The at least one processor of the device may read the execution instructions from the storage medium, the execution instructions being executed by the at least one processor to cause the device to implement the methods provided by the various embodiments described above.
In the above embodiment of the apparatus, it should be understood that the processor may be a central processing unit (CPU), or may be another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present invention may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in a processor for execution.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (10)

1. A financial data management method based on image processing, comprising:
calling a preset identification layer and superimposing it on a target bill, wherein the preset identification layer comprises a positioning line and a plurality of identification target areas; aligning the preset identification layer with the target bill according to an alignment strategy and the positioning line; and extracting first text information corresponding to each dimension label based on the identification target areas;
retrieving a text verification strategy corresponding to the dimension label to verify the first text information, taking the first text information that meets the verification requirement as second text information of the target bill, and taking the corresponding dimension label as a second dimension label of the target bill;
generating, according to a class classification model, a class-one total node and a first-level folder corresponding to each second dimension label; generating a plurality of class-one child nodes and second-level folders corresponding to the class-one total node based on the classification intervals corresponding to the second dimension label; and generating a class-one management tree corresponding to the second dimension label from the class-one total node and the class-one child nodes;
and receiving fusion demand information of a user, analyzing the fusion demand information to obtain fusion dimension information, ordering the fusion dimension information by its order of precedence to obtain sequence dimension numbers, and performing fusion processing on the class-one management tree based on a fusion strategy and the sequence dimension numbers to obtain a class-two management tree.
2. The method for managing financial data based on image processing according to claim 1, wherein,
the aligning the preset identification layer with the target bill according to the alignment strategy and the positioning line, and extracting the first text information corresponding to each dimension label based on the identification target area, including:
acquiring an end point of the positioning line in the preset identification layer, which is positioned at a first side, as a first positioning point, and an end point of the positioning line in the preset identification layer, which is positioned at a second side, as a second positioning point, and acquiring an end point of the bill line in the target bill, which is positioned at the first side, as a third positioning point, and an end point of the bill line in the target bill, which is positioned at the second side, as a fourth positioning point;
placing the preset identification layer and the target bill in a common coordinate system to obtain a first coordinate of the first positioning point, a second coordinate of the second positioning point, a third coordinate of the third positioning point and a fourth coordinate of the fourth positioning point;
obtaining a first slope from the first coordinate and the second coordinate, obtaining a second slope from the third coordinate and the fourth coordinate, and applying an arctangent function to the first slope and the second slope to obtain a first angle corresponding to the first slope and a second angle corresponding to the second slope;
obtaining an adjustment angle based on the first angle and the second angle, rotating the preset identification layer by the adjustment angle, positioning the first positioning point at the third coordinate of the third positioning point, and positioning the second positioning point at the fourth coordinate of the fourth positioning point;
performing OCR character recognition on the regions of the target bill that are aligned with the identification target areas to obtain initial text information of each title in the corresponding identification target area, wherein each identification target area has a corresponding dimension label, and extracting the first text information corresponding to each dimension label.
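The angle computation in claim 2 can be illustrated with a short sketch. It assumes that the adjustment angle is the difference between the second angle and the first angle, and that alignment is completed by translating the first positioning point onto the third coordinate; the claim states neither choice explicitly. The function names (`line_angle`, `rotate_about`, `align_layer`) and the sample coordinates are hypothetical.

```python
import math


def line_angle(p, q):
    """Angle in radians of the line through p and q, from the arctangent of its slope."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return math.pi / 2  # vertical line: the slope is undefined
    return math.atan((y2 - y1) / (x2 - x1))


def rotate_about(point, pivot, angle):
    """Rotate point counter-clockwise about pivot by angle (radians)."""
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (pivot[0] + cos_a * dx - sin_a * dy,
            pivot[1] + sin_a * dx + cos_a * dy)


def align_layer(first_pt, second_pt, third_pt, fourth_pt):
    """Rotate the layer line onto the bill line, then move its first end point."""
    first_angle = line_angle(first_pt, second_pt)    # positioning line of the layer
    second_angle = line_angle(third_pt, fourth_pt)   # bill line of the target bill
    adjustment = second_angle - first_angle          # assumed definition of the adjustment angle
    rotated_second = rotate_about(second_pt, first_pt, adjustment)
    shift = (third_pt[0] - first_pt[0], third_pt[1] - first_pt[1])
    moved_first = (first_pt[0] + shift[0], first_pt[1] + shift[1])
    moved_second = (rotated_second[0] + shift[0], rotated_second[1] + shift[1])
    return adjustment, moved_first, moved_second


# Hypothetical sample: a horizontal layer line and a bill line tilted by 30 degrees.
bill_end = (2 + 10 * math.cos(math.radians(30)), 3 + 10 * math.sin(math.radians(30)))
angle, p1, p2 = align_layer((0, 0), (10, 0), (2, 3), bill_end)
```

In this sketch the second positioning point lands on the fourth coordinate only because both lines have the same length; if the layer and the bill differ in scale, an additional scaling step would be needed. Because the angles come from the arctangent of a slope, they are limited to the range between -90 and 90 degrees.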
3. The method for managing financial data based on image processing according to claim 2, wherein,
the retrieving a text verification strategy corresponding to the dimension label to verify the first text information, taking the first text information that meets the verification requirement as second text information of the target bill, and taking the corresponding dimension label as a second dimension label of the target bill comprises:
analyzing the attribute of the dimension label, and if the attribute of the dimension label is a text attribute, retrieving a preset text quantity and a tail verification set, wherein the tail verification set comprises a plurality of preset tail texts, and taking the trailing characters of the first text information corresponding to the dimension label, in a number equal to the preset text quantity, as a text to be compared;
if a preset tail text is judged to be consistent with the text to be compared, taking the first text information as second text information of the target bill, and taking the corresponding dimension label as a second dimension label of the target bill;
if the attribute of the dimension label is a communication number attribute, retrieving a communication verification quantity, counting the characters of the first text information corresponding to the dimension label to obtain a quantity to be compared, and, if the quantity to be compared is equal to the communication verification quantity, taking the first text information as second text information of the target bill and the corresponding dimension label as a second dimension label of the target bill;
and if the attribute of the dimension label is an amount number attribute, retrieving an amount verification quantity, taking the number of digits after the decimal point in the first text information corresponding to the dimension label as a quantity to be verified, and, if the quantity to be verified is equal to the amount verification quantity, taking the first text information as second text information of the target bill and the corresponding dimension label as a second dimension label of the target bill.
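The three verification branches of claim 3 can be sketched as simple string checks. This is a minimal illustration under stated assumptions: the attribute names (`"text"`, `"communication_number"`, `"amount"`), the configuration keys and the sample tail texts are hypothetical and do not come from the claims.

```python
def verify_text(value, tail_length, tail_set):
    """Text attribute: the last tail_length characters must match a preset tail text.

    Assumes tail_length is at least 1.
    """
    return value[-tail_length:] in tail_set


def verify_communication_number(value, expected_length):
    """Communication number attribute: the character count must equal the preset quantity."""
    return len(value) == expected_length


def verify_amount(value, decimal_places):
    """Amount attribute: the count of characters after the decimal point must equal the preset quantity."""
    _, _, fraction = value.partition(".")
    return len(fraction) == decimal_places


def verify(attribute, value, config):
    """Dispatch to the check that matches the attribute of the dimension label."""
    if attribute == "text":
        return verify_text(value, config["tail_length"], config["tail_set"])
    if attribute == "communication_number":
        return verify_communication_number(value, config["length"])
    if attribute == "amount":
        return verify_amount(value, config["decimal_places"])
    raise ValueError(f"unknown attribute: {attribute}")
```

For example, with a tail set of `{"Ltd.", "Inc."}` and a tail length of 4, a recognized company name ending in "Ltd." passes, while a name truncated by a recognition error does not. Text that passes becomes the second text information; text that fails is not carried forward.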
4. The method for managing financial data based on image processing as recited in claim 3, wherein,
the generating a class-one total node and a first-level folder corresponding to each second dimension label according to a class classification model, generating a plurality of class-one child nodes and second-level folders corresponding to the class-one total node based on the classification intervals corresponding to the second dimension label, and generating a class-one management tree corresponding to the second dimension label from the class-one total node and the class-one child nodes comprises:
generating a class-one total node corresponding to the second dimension label according to the class classification model, creating a first-level folder, moving the target bills corresponding to the second dimension label into the first-level folder based on the second text information, and associating the class-one total node with the first-level folder;
generating a plurality of class-one child nodes connected with the class-one total node based on the classification intervals corresponding to the second dimension label, and generating the corresponding second-level folders;
classifying the target bills in the first-level folder according to the classification intervals and the second text information to obtain the second-level bills corresponding to each classification interval, placing the second-level bills in the corresponding second-level folder, and associating each second-level folder with a class-one child node based on the classification interval;
and directly connecting the class-one child nodes with the class-one total node to generate the class-one management tree corresponding to the second dimension label.
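The tree construction in claim 4 can be sketched in memory, with a list of bills standing in for each folder. The `Node` class, the half-open interval convention `[low, high)` and the sample bills are assumptions made for illustration only; the claims do not specify interval boundaries or a storage layout.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    bills: list = field(default_factory=list)     # stands in for the associated folder
    children: dict = field(default_factory=dict)  # classification interval -> child Node


def build_class_one_tree(label, bills, intervals):
    """Build a total node with one child node per classification interval."""
    # Total node, associated with a first-level folder that holds every bill.
    root = Node(name=label, bills=list(bills))
    # One child node (and second-level folder) per classification interval.
    for low, high in intervals:
        root.children[(low, high)] = Node(name=f"{label} in [{low}, {high})")
    # Place each bill in the folder of the interval that contains its value.
    for bill in root.bills:
        for (low, high), child in root.children.items():
            if low <= bill[label] < high:
                child.bills.append(bill)
                break
    return root


# Hypothetical sample data: three bills classified by amount.
bills = [
    {"id": "A", "amount": 300.0},
    {"id": "B", "amount": 2500.0},
    {"id": "C", "amount": 800.0},
]
tree = build_class_one_tree("amount", bills, [(0, 1000), (1000, 5000)])
```

Here the total node keeps a reference to every bill while each child node holds only the bills of its interval, mirroring a second-level folder nested inside the first-level folder.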
5. The financial data management method based on image processing according to claim 4, wherein
obtaining the class-two management tree comprises:
receiving the fusion demand information of a user, and analyzing the fusion demand information to obtain fusion dimension information, wherein the fusion dimension information only comprises a demand dimension, and the demand dimension comprises at least one of money, invoice type, time, buyer company name and seller company name;
ordering the demand dimensions by their order of precedence to obtain a sequence dimension number for each demand dimension, and arranging the demand dimensions by sequence dimension number to obtain a first demand dimension sequence, wherein the demand dimensions correspond one-to-one with the second dimension labels;
acquiring the class-one management tree of the second dimension label corresponding to the first demand dimension in the first demand dimension sequence as an initial management tree, deleting that first demand dimension from the first demand dimension sequence to obtain a second demand dimension sequence, and sequentially performing fusion processing on the initial management tree based on the classification intervals corresponding to the demand dimensions in the second demand dimension sequence to obtain the class-two management tree.
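The ordering step of claim 5 can be illustrated as follows. The sketch assumes that the order of precedence is simply the order in which the user listed the demand dimensions; the function name and the sample dimensions are hypothetical.

```python
def order_demand_dimensions(demand_dimensions):
    """Number the demand dimensions, then split off the first one.

    Returns the numbered dimensions, the dimension whose tree becomes the
    initial management tree, and the remaining (second) demand dimension sequence.
    """
    # Sequence dimension numbers start at 1 and follow the user's order (assumed).
    numbered = list(enumerate(demand_dimensions, start=1))
    first_sequence = [dimension for _, dimension in sorted(numbered)]
    initial_dimension = first_sequence[0]
    second_sequence = first_sequence[1:]
    return numbered, initial_dimension, second_sequence


numbered, initial, remaining = order_demand_dimensions(["time", "amount", "invoice type"])
```

With this input, the management tree for "time" would serve as the initial management tree, and "amount" and "invoice type" would drive the subsequent fusion passes in that order.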
6. The financial data management method based on image processing according to claim 5, wherein
the sequentially performing fusion processing on the initial management tree based on the classification intervals corresponding to the demand dimensions in the second demand dimension sequence to obtain the class-two management tree comprises:
extracting the first demand dimension in the second demand dimension sequence as a current demand dimension, determining the classification interval corresponding to the current demand dimension as a first relay interval, constructing, based on the first relay interval, first relay child nodes directly connected with each class-one child node in the initial management tree and the corresponding first relay folders, and moving each first relay folder into the corresponding second-level folder of the initial management tree;
acquiring the second text information, corresponding to the current demand dimension, of the second-level bills in the second-level folder, classifying the second-level bills in the second-level folder based on the first relay interval and the second text information, moving each second-level bill into the corresponding first relay folder, and associating each first relay child node with the corresponding first relay folder based on the first relay interval;
extracting the next demand dimension in the second demand dimension sequence as the current demand dimension, determining the classification interval corresponding to the current demand dimension as a second relay interval, constructing, based on the second relay interval, second relay child nodes directly connected with each first relay child node and the corresponding second relay folders, and moving each second relay folder into the first relay folder corresponding to its first relay child node;
acquiring the second text information, corresponding to the current demand dimension, of the second-level bills in the first relay folder, classifying the second-level bills in the first relay folder based on the second relay interval and the second text information, moving each second-level bill into the corresponding second relay folder, and associating each second relay child node with the corresponding second relay folder based on the second relay interval;
and taking the second relay child node as the first relay child node and the second relay folder as the first relay folder, and repeating the above steps until no demand dimension remains in the second demand dimension sequence, thereby obtaining the class-two management tree.
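The repeated relay step of claim 6 amounts to subdividing every current leaf of the tree by the next demand dimension. The sketch below shows this under stated assumptions: folders are modelled as in-memory lists, intervals are half-open `[low, high)`, and the names `Node`, `split_node` and `fuse` as well as the sample bills are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    bills: list = field(default_factory=list)     # stands in for the node's folder
    children: list = field(default_factory=list)


def split_node(node, dimension, intervals):
    """Create one relay child per interval and move the node's bills into them."""
    children = [Node(f"{dimension} in [{low}, {high})") for low, high in intervals]
    unmatched = []
    for bill in node.bills:
        for (low, high), child in zip(intervals, children):
            if low <= bill[dimension] < high:
                child.bills.append(bill)
                break
        else:
            unmatched.append(bill)  # no interval matched: the bill stays where it was
    node.bills = unmatched
    node.children = children
    return children


def fuse(initial_tree, dimension_sequence, intervals_by_dimension):
    """Subdivide the current leaves once per remaining demand dimension."""
    frontier = list(initial_tree.children)
    for dimension in dimension_sequence:
        next_frontier = []
        for node in frontier:
            next_frontier.extend(split_node(node, dimension, intervals_by_dimension[dimension]))
        frontier = next_frontier  # the new relay nodes play the role of the previous ones
    return initial_tree


# Hypothetical sample: an initial tree by amount, fused with the year dimension.
bills = [
    {"id": "A", "amount": 300.0, "year": 2022},
    {"id": "B", "amount": 2500.0, "year": 2023},
    {"id": "C", "amount": 800.0, "year": 2023},
]
root = Node("amount", bills=list(bills))
split_node(root, "amount", [(0, 1000), (1000, 5000)])
fuse(root, ["year"], {"year": [(2022, 2023), (2023, 2024)]})
```

After fusion, each leaf holds the bills that satisfy every interval on the path from the root, so the depth of the tree equals the number of demand dimensions supplied.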
7. The financial data management method based on image processing according to claim 4, wherein
obtaining the class-two management tree comprises:
receiving fusion demand information of a user, and analyzing the fusion demand information to obtain fusion dimension information, wherein the fusion dimension information comprises demand dimensions and a class-one demand interval corresponding to each demand dimension, the demand dimensions comprise at least one of an amount, an invoice type, a time, a buyer company name and a seller company name, and each class-one demand interval is at least one of the classification intervals corresponding to the second dimension label;
ordering the demand dimensions by their order of precedence to obtain a sequence dimension number for each demand dimension, and arranging the demand dimensions by sequence dimension number to obtain a first demand dimension sequence;
obtaining the class-one management tree corresponding to the first demand dimension in the first demand dimension sequence as a management tree to be trimmed, together with the class-one demand interval of that demand dimension; retaining, in the management tree to be trimmed, the class-one child nodes corresponding to the class-one demand interval to obtain a trimmed first management tree; deleting the first demand dimension from the first demand dimension sequence to obtain a second demand dimension sequence; and sequentially performing fusion processing on the first management tree based on the class-one demand intervals corresponding to the demand dimensions in the second demand dimension sequence to obtain the class-two management tree.
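The trimming step of claim 7 keeps only the child nodes whose classification interval the user asked for. A minimal sketch follows, assuming the tree is held as a dictionary keyed by interval; the dictionary layout, the function name `trim_tree` and the sample data are hypothetical.

```python
def trim_tree(tree, demand_intervals):
    """Return a copy of the tree that keeps only the requested intervals."""
    wanted = set(demand_intervals)
    return {
        "label": tree["label"],
        "children": {
            interval: bills
            for interval, bills in tree["children"].items()
            if interval in wanted
        },
    }


# Hypothetical management tree by amount with three classification intervals.
amount_tree = {
    "label": "amount",
    "children": {
        (0, 1000): ["A", "C"],
        (1000, 5000): ["B"],
        (5000, 10000): [],
    },
}
trimmed = trim_tree(amount_tree, [(0, 1000)])
```

The sketch returns a new tree rather than modifying the original, which is a design choice of this example: the untrimmed management tree remains available for later requests with different demand intervals.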
8. The financial data management method based on image processing according to claim 7, wherein
the sequentially performing fusion processing on the first management tree based on the class-one demand intervals corresponding to the demand dimensions in the second demand dimension sequence to obtain the class-two management tree comprises:
extracting the first demand dimension in the second demand dimension sequence as a current demand dimension, determining the class-one demand interval corresponding to the current demand dimension as a first relay interval, constructing, based on the first relay interval, first relay child nodes directly connected with each class-one child node in the first management tree and the corresponding first relay folders, and moving each first relay folder into the corresponding second-level folder of the first management tree;
acquiring the second text information, corresponding to the current demand dimension, of the second-level bills in the second-level folder, classifying the second-level bills in the second-level folder based on the first relay interval and the second text information, moving each second-level bill into the corresponding first relay folder, and associating each first relay child node with the corresponding first relay folder based on the first relay interval;
extracting the next demand dimension in the second demand dimension sequence as the current demand dimension, determining the class-one demand interval corresponding to the current demand dimension as a second relay interval, constructing, based on the second relay interval, second relay child nodes directly connected with each first relay child node and the corresponding second relay folders, and moving each second relay folder into the first relay folder corresponding to its first relay child node;
acquiring the second text information, corresponding to the current demand dimension, of the second-level bills in the first relay folder, classifying the second-level bills in the first relay folder based on the second relay interval and the second text information, moving each second-level bill into the corresponding second relay folder, and associating each second relay child node with the corresponding second relay folder based on the second relay interval;
and taking the second relay child node as the first relay child node and the second relay folder as the first relay folder, and repeating the above steps until no demand dimension remains in the second demand dimension sequence, thereby obtaining the class-two management tree.
9. The financial data management method based on image processing according to claim 4, wherein
obtaining the class-two management tree comprises:
receiving fusion demand information of a user, and analyzing the fusion demand information to obtain fusion dimension information, wherein the fusion dimension information comprises demand dimensions and a class-two demand interval corresponding to each demand dimension, the demand dimensions comprise at least one of an amount dimension, an invoice type dimension, a time dimension and a company name dimension, and the class-two demand interval is a demand interval actively input by the user;
ordering the demand dimensions by their order of precedence to obtain a sequence dimension number for each demand dimension, and arranging the demand dimensions by sequence dimension number to obtain a demand dimension sequence;
acquiring the first demand dimension in the demand dimension sequence as a current demand dimension, taking the class-one total node corresponding to the current demand dimension as a class-two total node and the corresponding first-level folder as a first folder, and associating the class-two total node with the first folder;
generating a plurality of class-two child nodes connected with the class-two total node based on the class-two demand intervals corresponding to the current demand dimension, and generating the corresponding second folders;
classifying the target bills in the first folder according to the class-two demand intervals and the second text information to obtain the class-two bills corresponding to each class-two demand interval, placing the class-two bills in the corresponding second folder, and associating each second folder with a class-two child node based on the class-two demand interval;
directly connecting the class-two child nodes with the class-two total node to generate a new management tree corresponding to the current demand dimension;
extracting the next demand dimension in the demand dimension sequence as the current demand dimension, determining the class-two demand interval corresponding to the current demand dimension as a first relay interval, constructing, based on the first relay interval, first relay child nodes directly connected with each class-two child node and the corresponding first relay folders, and moving each first relay folder into the second folder corresponding to its class-two child node;
acquiring the second text information, corresponding to the current demand dimension, of the class-two bills in the second folder, classifying the class-two bills in the second folder based on the first relay interval and the second text information, moving each class-two bill into the corresponding first relay folder, and associating each first relay child node with the corresponding first relay folder based on the first relay interval;
and repeating the above steps until no demand dimension remains in the demand dimension sequence, thereby obtaining the class-two management tree.
10. A financial data management system based on image processing, comprising:
an extraction module, configured to call a preset identification layer and superimpose it on a target bill, wherein the preset identification layer comprises a positioning line and a plurality of identification target areas, to align the preset identification layer with the target bill according to an alignment strategy and the positioning line, and to extract first text information corresponding to each dimension label based on the identification target areas;
a verification module, configured to retrieve a text verification strategy corresponding to the dimension label to verify the first text information, to take the first text information that meets the verification requirement as second text information of the target bill, and to take the corresponding dimension label as a second dimension label of the target bill;
a generation module, configured to generate a class-one total node and a first-level folder corresponding to each second dimension label according to a class classification model, to generate a plurality of class-one child nodes and second-level folders corresponding to the class-one total node based on the classification intervals corresponding to the second dimension label, and to generate a class-one management tree corresponding to the second dimension label from the class-one total node and the class-one child nodes;
and a fusion module, configured to receive fusion demand information of a user, to analyze the fusion demand information to obtain fusion dimension information, to order the fusion dimension information by its order of precedence to obtain sequence dimension numbers, and to perform fusion processing on the class-one management tree based on a fusion strategy and the sequence dimension numbers to obtain a class-two management tree.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311100459.4A CN116824604B (en) 2023-08-30 2023-08-30 Financial data management method and system based on image processing

Publications (2)

Publication Number Publication Date
CN116824604A (en) 2023-09-29
CN116824604B (en) 2023-11-21

Family

ID=88118843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311100459.4A Active CN116824604B (en) 2023-08-30 2023-08-30 Financial data management method and system based on image processing

Country Status (1)

Country Link
CN (1) CN116824604B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060274941A1 (en) * 2003-03-28 2006-12-07 Konstantin Zuev Method of pre-analysis of a machine-readable form image
CN106408358A (en) * 2016-12-08 2017-02-15 用友网络科技股份有限公司 Invoice management method and invoice management apparatus
CN110751143A (en) * 2019-09-26 2020-02-04 中电万维信息技术有限责任公司 Electronic invoice information extraction method and electronic equipment
CN114332883A (en) * 2022-01-04 2022-04-12 上海浦东发展银行股份有限公司 Invoice information identification method and device, computer equipment and storage medium
CN114936191A (en) * 2022-07-18 2022-08-23 国网浙江省电力有限公司 Radial multidimensional file storage method based on core data
CN114969449A (en) * 2022-08-01 2022-08-30 太极计算机股份有限公司 Metadata management method and system based on construction structure tree
CN115294586A (en) * 2022-08-11 2022-11-04 北京分贝通科技有限公司 Invoice identification method and device, storage medium and electronic equipment
CN116486423A (en) * 2023-04-24 2023-07-25 北京闪猫技术有限公司 Financial ticketing data processing method based on image recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
马驰翔 (Ma Chixiang): "Research on Key Technologies in Text Image Analysis" (文本图像分析中的关键技术研究), China Doctoral Dissertations Full-text Database *

Also Published As

Publication number Publication date
CN116824604B (en) 2023-11-21

Similar Documents

Publication Publication Date Title
CN109887153B (en) Finance and tax processing method and system
CN108985912B (en) Data reconciliation
US8738552B2 (en) Method and system for classifying documents
US20050289182A1 (en) Document management system with enhanced intelligent document recognition capabilities
CN114117171B (en) Intelligent project file collecting method and system based on energized thinking
US20050210048A1 (en) Automated posting systems and methods
US20050210047A1 (en) Posting data to a database from non-standard documents using document mapping to standard document types
US10482170B2 (en) User interface for contextual document recognition
JP2008515061A (en) A method for searching data elements on the web using conceptual and contextual metadata search engines
CN110688349A (en) Document sorting method, device, terminal and computer readable storage medium
JP2019204535A (en) Accounting support system
US20050210046A1 (en) Context-based conversion of language to data systems and methods
CN115116068A (en) Archive intelligent filing system based on OCR
WO2024060759A1 (en) Supply chain financial asset auditing method and apparatus, and device and medium
CN116824604B (en) Financial data management method and system based on image processing
CN114998920B (en) Supply chain financial file management method and system based on NLP semantic recognition
JP5243054B2 (en) Data management system, method and program
CN113111829B (en) Method and device for identifying document
CN115408598A (en) Information processing method, apparatus, device, storage medium, and program product
US20100125617A1 (en) System for Consolidating Business Documents
CN110309384B (en) Management method for classifying patent files by using dates
EP3523771A1 (en) System and method for verifying unstructured enterprise resource planning data
US20230067956A1 (en) Multiple product identification assistance in an electronic marketplace application
CN115033619A (en) Project cost digital file sorting and adjusting application method and system
CN117876055A (en) Intelligent solution for invoice arrangement and auditing in trade financing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant