CN115471856A - Invoice image information identification method and device and storage medium

Invoice image information identification method and device and storage medium

Info

Publication number
CN115471856A
CN115471856A
Authority
CN
China
Prior art keywords
layer
forest
invoice image
invoice
image information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211012411.3A
Other languages
Chinese (zh)
Inventor
张文洋
杨桂珍
尹旭
褚夕
杨寅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Jinan Power Supply Co of State Grid Shandong Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Jinan Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Jinan Power Supply Co of State Grid Shandong Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202211012411.3A priority Critical patent/CN115471856A/en
Publication of CN115471856A publication Critical patent/CN115471856A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/42 Document-oriented image-based pattern recognition based on the type of document
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means


Abstract

The invention relates to an invoice image information identification method and device and a storage medium. An invoice image is acquired and preprocessed, and the preprocessed image is input into a residual neural network optimized by an evolutionary algorithm to identify the regions to be identified in the invoice. The residual neural network comprises a first convolution layer, a first pooling layer connected with the first convolution layer, a second convolution layer connected with the first pooling layer, a second pooling layer connected with the second convolution layer, a first residual block connected with the second pooling layer, a second residual block connected with the first residual block, a third convolution layer connected with the second residual block, a third pooling layer connected with the third convolution layer, a third residual block connected with the third pooling layer, a global average pooling layer connected with the third residual block, and a depth forest classifier based on multi-objective optimization connected with the global average pooling layer. The relevant digital information in each region to be identified is recognized, and the recognized information is stored in a formatted manner.

Description

Invoice image information identification method and device and storage medium
Technical Field
The invention relates to the field of invoice identification, in particular to an invoice image information identification method, an invoice image information identification device and a storage medium.
Background
In recent years, with the deepening of tax system reform and the implementation of tax laws such as the replacement of business tax with value-added tax, government supervision has been continuously strengthened, the level of supervision has improved rapidly, the means of supervision have been continuously enriched, and the use of invoices in China has increased sharply; special and general value-added tax invoices account for a very large share, about 90% of all invoices. At the present stage the invoice reimbursement process is still very cumbersome: financial reimbursement in enterprises, public institutions and other organizations relies mainly on manual entry by staff of the financial or business departments, which carries high labor costs, consumes a great deal of time, and is prone to entry errors and insufficient accuracy when processing is concentrated at the end of a month or year.
At present, traditional algorithms can identify information in scanned special and general value-added tax invoice images, but they struggle with invoice images captured by mobile phones in natural scenes, with varying zoom ratios, degrees of blur, brightness, inclination angles, sizes and background interference; and when key information in an invoice must be located intelligently, traditional template matching and anchor-point positioning methods lack generality. Text detection and recognition is an important application of computer vision: text recognition technology converts the text contained in an image into a form a computer can directly interpret and use. Deep learning, also called representation learning, is an important branch of machine learning and one of the most active research directions today. Deep learning has profoundly changed pattern recognition; it can be viewed as a credit-assignment mechanism for the latent associations between behaviors and outcomes in adaptive systems. Deep learning was introduced into machine learning as early as 1986 and applied to artificial neural networks around 2000. Deep neural networks are among its most successful lines of research, with major breakthroughs in speech recognition, face recognition, natural language processing, medicine, security and other fields. The development of deep architectures began with artificial neural networks, which have long been a focus of research. The first generation of artificial neural networks consisted of single-layer perceptrons, whose performance was limited by their simple computations. Second-generation artificial neural networks updated neuron weight parameters according to the error rate using the back-propagation algorithm. With the advent of the support vector machine, low-dimensional inseparable problems could be converted into high-dimensional separable ones, surpassing the first and second generations; meanwhile, the Boltzmann machine emerged and addressed problems of the back-propagation algorithm in second-generation networks. Subsequently a large number of deep learning algorithms and neural networks appeared, such as feedforward neural networks, convolutional neural networks, recurrent neural networks, deep belief networks, autoencoders and adversarial networks. Zheng Wei et al. proposed a method and system for automatically identifying and managing value-added tax invoices, mainly addressing the low efficiency and accuracy of invoice entry. The method first acquires an invoice image automatically and preprocesses it to obtain a gray-scale map, then identifies and extracts the invoice information: each region of the invoice content is detected by a cascade target detector, the content of each detected region is recognized by an invoice content recognizer to obtain a result and a score, the score is divided into three levels according to a set confidence interval, and finally the recognized information is manually corrected before being stored in a database.
Zheng Dixin et al. provide an invoice identification method and apparatus and a computer storage medium: text recognition is performed on an invoice image to obtain a text recognition result, the result is split into at least one text line, and the entry information corresponding to entries in the invoice image is determined from the text recognition result contained in each line. Specifically, the text recognition result contained in a first text line is analyzed to determine the correspondence between at least one entry contained in that line and at least one piece of entry information, and the text recognition result contained in the next text line is then analyzed on the basis of that correspondence. The method is innovative and addresses the accuracy of invoice image recognition, but the proposed algorithm needs a large number of samples to support network training, overfits easily on small samples, and there is still no effective method for recognizing images from the unbalanced training samples that are common in practice.
Disclosure of Invention
In order to solve the technical problems or at least partially solve the technical problems, the invention provides an invoice image information identification method, an invoice image information identification device and a storage medium.
The invention provides an invoice image information identification method, which comprises: collecting an invoice image and preprocessing it, and inputting the preprocessed image into a residual neural network optimized by an evolutionary algorithm to identify the regions to be identified in the invoice; the residual neural network comprises a first convolution layer, a first pooling layer connected with the first convolution layer, a second convolution layer connected with the first pooling layer, a second pooling layer connected with the second convolution layer, a first residual block connected with the second pooling layer, a second residual block connected with the first residual block, a third convolution layer connected with the second residual block, a third pooling layer connected with the third convolution layer, a third residual block connected with the third pooling layer, a global average pooling layer connected with the third residual block, and a depth forest classifier based on multi-objective optimization connected with the global average pooling layer; and identifying the relevant digital information in each region to be identified and storing the identified information in a formatted manner.
Furthermore, the preprocessing comprises applying an affine transformation to the image to bring it to a preset size, applying a perspective transformation to correct perspective deformation of the region to be identified, and performing edge detection to extract the effective information of the image.
Furthermore, the first, second and third residual blocks have the same structure, each comprising an input layer connected to an inner convolutional layer; the output of the inner convolutional layer feeds both a threshold screening layer and a global convolutional layer; the output of the global convolutional layer passes through a batch normalization layer, a ReLU activation function layer and a Sigmoid activation function layer; the output of the inner convolutional layer is weighted by the output of the Sigmoid activation function layer and then input into the threshold screening layer; and the input layer is combined with the output of the threshold screening layer to form the block output.
Furthermore, the deep forest classifier adopts a cascade forest structure in which each forest layer is an ensemble of decision trees; the feature vector generated by each forest layer is concatenated with the original feature vector and input into the next layer, up to the last forest layer, and the maximum of the per-class averages of the last forest layer's results is taken as the classification result output by the deep forest classifier.
Further, automatically determining the number of cascade layers according to whether increasing the number of cascade layers improves the performance of the deep forest classifier comprises: each forest generates a class vector through k-fold cross validation, that is, each sample is used as a training sample k−1 times, producing k−1 class vectors; validation data are obtained from the images; each time a new forest layer is grown, the performance of the whole deep forest classifier is evaluated on the validation data, and if the performance shows no significant improvement, no further layers are added.
Furthermore, each layer of the cascade forest structure comprises a random forest and a completely random forest. When the decision trees of the random forest are constructed, √d features are randomly selected from the whole feature space as candidate features, where d is the number of input features, and the feature with the best Gini value is then selected as the splitting feature of the node; the completely random forest instead randomly selects 1 feature from the whole feature space as the splitting feature of each node.
Furthermore, the hyper-parameters involved in the deep forest classifier include the number w_i of random forests in each forest layer, the number θ_i of completely random forests in each forest layer, and the number b_i of decision trees contained in each forest; these hyper-parameters are optimized in a multi-objective manner.

The multi-objective optimization uses the deep forest activation function h as the first optimization function and the order of magnitude β of the deep forest parameters as the second optimization function, and the hyper-parameters are optimized through the first and second optimization functions.
the first and second optimization functions are constrained by a first and second objective, where the first objective is the root mean square error over the training set as:
Figure BDA0003811435100000043
the second objective is sparsity:
Figure BDA0003811435100000044
wherein x is tr Is a training set sample, N tr Is the number of training set samples, o represents the Hadamard product of two numbers converted into vectors, omega ii ,b i Respectively, hyper-parameters of the deep forest classifier; n represents the number of neurons in each layer and the number of layers of the L neural network;
The objective function of the multi-objective optimization model minimizes the two objectives simultaneously, min F = (E_RMSE, sparsity), so that the model is as sparse as possible on the premise of good performance.
Furthermore, the residual neural network is optimized by an evolutionary algorithm whose selection operator rearranges the population individuals using a ranking method; after rearrangement, the probability of an individual being selected is

s = p_0 / (1 − (1 − p_0)^a),
p = s(1 − p_0)^(b−1),

where a is the population size of the evolutionary algorithm, p_0 is the probability that the optimal individual is selected, s is the value obtained by normalizing p_0, and b is the position of the individual after the population is rearranged.
the evolution algorithm optimization adopts the reciprocal of the sum of squared errors as a fitness function:
f (j) =1/E (j), wherein,
Figure BDA0003811435100000053
e is the sum of squared errors, P is the overall output, w is the weight, x is the input characteristic, F is the fitness, j is the number of generations, y j Is a theoretical output.
In a second aspect, the present invention provides an invoice image information recognition device, comprising a processing unit, a bus unit, a storage unit and an image acquisition unit, wherein the bus unit connects the storage unit, the processing unit and the image acquisition unit, the storage unit stores a computer program, and the computer program, when executed by the processing unit, implements the invoice image information recognition method.
In a third aspect, the present invention provides a storage medium for implementing an invoice image information recognition method, wherein the storage medium stores a computer program, and the computer program implements the invoice image information recognition method when executed by a processor.
Compared with the prior art, the technical scheme provided by the embodiments of the invention has the following advantages. The acquired invoice image is preprocessed, and the preprocessed image is input into a residual neural network optimized by an evolutionary algorithm to identify the regions to be identified in the invoice; the residual neural network comprises a first convolution layer, a first pooling layer connected with the first convolution layer, a second convolution layer connected with the first pooling layer, a second pooling layer connected with the second convolution layer, a first residual block connected with the second pooling layer, a second residual block connected with the first residual block, a third convolution layer connected with the second residual block, a third pooling layer connected with the third convolution layer, a third residual block connected with the third pooling layer, a global average pooling layer connected with the third residual block, and a depth forest classifier based on multi-objective optimization connected with the global average pooling layer; the relevant digital information in each region to be identified is identified, and the identified information is stored in a formatted manner. The introduction of the global average pooling layer greatly reduces the number of parameters to be computed and greatly improves the computation speed of the residual neural network; unlike a fully connected layer, it needs no large number of trainable parameters, which avoids overfitting. The global average pooling layer aggregates spatial information and is therefore more robust to spatial transformations of the input data. The first, second and third residual blocks avoid gradient explosion and vanishing by using shortcut connections to skip convolutions, which helps construct deeper neural network structures.
The first, second and third residual blocks also refine the threshold using an attention mechanism, so that the residual neural network automatically generates a threshold for each input to suppress noise, and each group of input data receives its own feature-channel weighting according to the importance of the sample. The data are processed by the global convolutional layer and then pass through the batch normalization layer, the ReLU activation function layer and the Sigmoid activation function layer; the Sigmoid activation function maps the output into [0,1], the mapped scaling coefficient is denoted α, and the final threshold can be expressed as α × A, so that different samples correspond to different thresholds; screening through the threshold screening layer then eliminates or weakens the noise.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
FIG. 1 is a schematic diagram of a residual neural network provided in an embodiment of the present invention;
fig. 2 is a schematic diagram of a first residual block, a second residual block, and a third residual block according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a deep forest classifier provided in an embodiment of the present invention;
fig. 4 is a schematic diagram of an invoice image information recognition apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
Example 1
Referring to fig. 1, the present invention provides an invoice image information recognition method, including:
acquiring an invoice image and preprocessing the acquired invoice image. Specifically, the preprocessing includes performing affine transformation on the invoice image to reach a preset size, performing perspective transformation on the invoice image to correct perspective deformation of an area to be identified of the invoice image, and performing edge detection on the invoice image to extract effective information of the image. At present, domestic invoices are mainly divided into two categories: one type is an electronic invoice and the other type is a paper invoice. The electronic invoice has a standard format and very clear font printing, so that the invoice image of the electronic invoice can be directly used as original input data; the paper invoice can be photographed by a mobile phone, a camera and other photographic equipment, or is converted into an electronic image through scanning to serve as original input data. The invoice image obtained by photographing or scanning the paper invoice usually has the problems of unfixed size, perspective deformation and the like. The invention can effectively improve the invoice image obtained by photographing or scanning by preprocessing the invoice image.
The preprocessed image is input into the residual neural network optimized by the evolutionary algorithm to identify the regions to be identified in the invoice. The selection operator of the evolutionary algorithm rearranges the population individuals using a ranking method; after rearrangement, the probability of an individual being selected is

s = p_0 / (1 − (1 − p_0)^a),
p = s(1 − p_0)^(b−1),

where a is the population size of the evolutionary algorithm, p_0 is the probability that the optimal individual is selected, s is the value obtained by normalizing p_0, and b is the position of the individual after the population is rearranged.
the evolution algorithm optimization adopts the reciprocal of the sum of squared errors as a fitness function:
f (j) =1/E (j), wherein,
Figure BDA0003811435100000082
e is the sum of squares of the errors, P is the overall output, w is the weight, x is the input characteristic, F is the fitness, j is the number of generations, y j Is a theoretical output.
Referring to fig. 1, the residual neural network comprises a first convolution layer, a first pooling layer, a second convolution layer, a second pooling layer, a first residual block, a second residual block, a third convolution layer, a third pooling layer, a third residual block, a global average pooling layer, and a depth forest classifier based on multi-objective optimization, connected in sequence. The global average pooling layer directly averages the feature map of each channel, i.e., each feature map outputs one value, and the result is input to the multi-objective-optimization-based depth forest classifier; the relevant digital information in each region to be identified is then recognized and stored in a formatted manner. The introduction of the global average pooling layer greatly reduces the number of parameters to be computed and greatly improves the computation speed of the residual neural network; unlike a fully connected layer, it needs no large number of trainable parameters, which avoids overfitting. The global average pooling layer aggregates spatial information and is therefore more robust to spatial transformations of the input data.
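For orientation only, here is a PyTorch sketch of the backbone described above; the channel counts, kernel sizes and pooling strides are assumptions, since the patent does not specify them, and `block` stands for the attention-threshold residual block sketched after fig. 2 below.

```python
import torch.nn as nn

class InvoiceBackbone(nn.Module):
    """Sketch of the residual backbone of fig. 1: conv/pool pairs, three
    residual blocks, and global average pooling feeding the deep forest
    classifier (one output value per channel)."""
    def __init__(self, block, channels=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1),         # first convolution layer
            nn.MaxPool2d(2),                              # first pooling layer
            nn.Conv2d(channels, channels, 3, padding=1),  # second convolution layer
            nn.MaxPool2d(2),                              # second pooling layer
            block(channels),                              # first residual block
            block(channels),                              # second residual block
            nn.Conv2d(channels, channels, 3, padding=1),  # third convolution layer
            nn.MaxPool2d(2),                              # third pooling layer
            block(channels),                              # third residual block
            nn.AdaptiveAvgPool2d(1),                      # global average pooling
        )

    def forward(self, x):
        # One value per channel, to be passed to the deep forest classifier.
        return self.features(x).flatten(1)
```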
Referring to fig. 2, the first, second and third residual blocks have the same structure, each comprising an input layer connected to an inner convolutional layer; the output of the inner convolutional layer feeds both a threshold screening layer and a global convolutional layer; the output of the global convolutional layer passes through a batch normalization layer, a ReLU activation function layer and a Sigmoid activation function layer; the output of the inner convolutional layer is weighted by the output of the Sigmoid activation function layer and then input into the threshold screening layer; and the input layer is combined with the output of the threshold screening layer to form the block output. The first, second and third residual blocks avoid gradient explosion and vanishing by using shortcut connections to skip convolutions, which helps construct deeper neural network structures. They also refine the threshold using an attention mechanism, so that the residual neural network automatically generates a threshold for each input to suppress noise, and each group of input data receives its own feature-channel weighting according to the importance of the sample. The data are processed by the global convolutional layer and then pass through the batch normalization layer, the ReLU activation function layer and the Sigmoid activation function layer; the Sigmoid activation function maps the output into [0,1], the mapped scaling coefficient is denoted α, and the final threshold can be expressed as α × A, so that different samples correspond to different thresholds; screening through the threshold screening layer then eliminates or weakens the noise.
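One possible reading of this block, sketched in PyTorch: it follows the deep residual shrinkage pattern the description suggests, with A taken as the channel-wise mean of absolute activations and the threshold screening layer realized as soft thresholding. Both choices are interpretations for illustration, not the patent's definitive layers.

```python
import torch
import torch.nn as nn

class ThresholdResidualBlock(nn.Module):
    """Residual block with an attention-generated threshold (fig. 2)."""
    def __init__(self, channels):
        super().__init__()
        self.inner = nn.Conv2d(channels, channels, 3, padding=1)  # inner conv layer
        self.attn = nn.Sequential(              # global conv -> BN -> ReLU -> Sigmoid
            nn.Conv2d(channels, channels, 1),   # "global" convolution (assumed 1x1)
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),                       # maps to [0, 1]: scaling alpha
        )

    def forward(self, x):
        u = self.inner(x)
        # A: channel-wise mean of |u| per sample (assumed definition of A).
        a = torch.mean(torch.abs(u), dim=(2, 3), keepdim=True)
        alpha = self.attn(a)                    # per-sample, per-channel alpha
        tau = alpha * a                         # final threshold: alpha * A
        # Threshold screening realized as soft thresholding (interpretation).
        screened = torch.sign(u) * torch.relu(torch.abs(u) - tau)
        return x + screened                     # shortcut connection
```

Under these assumptions, `InvoiceBackbone(ThresholdResidualBlock)` builds the network of fig. 1.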
Referring to fig. 3, the deep forest classifier adopts a cascade forest structure in which each forest layer is an ensemble of decision trees; the feature vector generated by each forest layer is concatenated with the original feature vector and input into the next layer, up to the last forest layer, and the maximum of the per-class averages of the last forest layer's results is taken as the classification result output by the deep forest classifier.
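A minimal sketch of one cascade level in Python with scikit-learn: RandomForestClassifier with max_features="sqrt" stands in for the random forest (√d candidate features) and ExtraTreesClassifier with max_features=1 for the completely random forest, while the class vectors are produced by k-fold cross validation as the description requires. Forest and tree counts are illustrative.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict

def cascade_layer(X, y, n_trees=100, k=3):
    """One forest layer: each forest's class vector (from k-fold CV) is
    concatenated with the original features to form the next layer's input."""
    forests = [
        RandomForestClassifier(n_estimators=n_trees, max_features="sqrt"),
        ExtraTreesClassifier(n_estimators=n_trees, max_features=1),
    ]
    class_vecs = [cross_val_predict(f, X, y, cv=k, method="predict_proba")
                  for f in forests]
    for f in forests:
        f.fit(X, y)  # refit on all data for use at prediction time
    augmented = np.hstack([X] + class_vecs)  # input to the next layer
    return forests, class_vecs, augmented
```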
In one possible embodiment, automatically determining the number of cascade layers according to whether increasing the number of cascade layers improves the performance of the deep forest classifier comprises: each forest generates a class vector through k-fold cross validation, that is, each sample is used as a training sample k−1 times, producing k−1 class vectors; validation data are obtained from the images; each time a new forest layer is grown, the performance of the whole deep forest classifier is evaluated on the validation data, and if the performance shows no significant improvement, no further layers are added.
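A sketch of that growth criterion, under stated assumptions: for brevity the class vectors come from a direct fit rather than the k-fold cross validation used above, the labels are assumed to be integers 0..y−1, and the improvement tolerance is an invented parameter.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

def grow_cascade(X_tr, y_tr, X_val, y_val, max_layers=10, tol=1e-3):
    """Grow forest layers one at a time; stop when accuracy on the held-out
    validation data no longer improves noticeably."""
    f_tr, f_val, best_acc = X_tr, X_val, 0.0
    for _ in range(max_layers):
        forests = [RandomForestClassifier(n_estimators=100, max_features="sqrt"),
                   ExtraTreesClassifier(n_estimators=100, max_features=1)]
        for f in forests:
            f.fit(f_tr, y_tr)
        proba_tr = [f.predict_proba(f_tr) for f in forests]
        proba_val = [f.predict_proba(f_val) for f in forests]
        # Class vectors are concatenated with the current features.
        f_tr = np.hstack([f_tr] + proba_tr)
        f_val = np.hstack([f_val] + proba_val)
        # Cascade prediction at this depth: average the layer's class vectors.
        acc = np.mean(np.argmax(np.mean(proba_val, axis=0), axis=1) == y_val)
        if acc <= best_acc + tol:
            break  # no significant improvement: stop adding layers
        best_acc = acc
    return f_tr, f_val
```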
In one possible embodiment, each layer of the cascade forest structure comprises a random forest and a completely random forest; when the decision trees of the random forest are constructed, √d features are randomly selected from the whole feature space as candidate features, where d is the number of input features, and the feature with the best Gini value is then selected as the splitting feature of the node; the completely random forest randomly selects 1 feature from the whole feature space as the splitting feature of each node.
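To make the two split rules concrete, here is a from-scratch sketch; the median split used to score candidates is a simplification for illustration (a real decision tree would scan thresholds), so only the √d-candidate versus single-feature contrast should be read as the described technique.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def choose_split_feature(X, y, completely_random=False, rng=np.random):
    """Random forest: score sqrt(d) random candidate features by Gini;
    completely random forest: pick 1 random feature with no scoring."""
    d = X.shape[1]
    if completely_random:
        return rng.randint(d)  # a single blindly chosen feature
    cand = rng.choice(d, size=max(1, int(np.sqrt(d))), replace=False)

    def split_gini(j):
        # Weighted Gini of a median split on feature j (simplified scoring).
        mask = X[:, j] <= np.median(X[:, j])
        n_l, n_r = mask.sum(), (~mask).sum()
        if n_l == 0 or n_r == 0:
            return 1.0
        return (n_l * gini(y[mask]) + n_r * gini(y[~mask])) / len(y)

    return min(cand, key=split_gini)  # best (lowest) Gini among candidates
```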
In one possible embodiment, the hyper-parameters involved in the deep forest classifier include the number w_i of random forests in each forest layer, the number θ_i of completely random forests in each forest layer, and the number b_i of decision trees contained in each forest; these hyper-parameters are optimized in a multi-objective manner.

The multi-objective optimization uses the deep forest activation function h as the first optimization function and the order of magnitude β of the deep forest parameters as the second optimization function, and the hyper-parameters are optimized through the first and second optimization functions.
the first and second optimization functions are constrained by a first objective and a second objective, wherein the first objective is a root mean square error over a training set as:
Figure BDA0003811435100000103
the second objective is sparsity:
Figure BDA0003811435100000104
wherein x tr Is a training set sample, N tr Is the number of training set samples, o represents the Hadamard product of two numbers converted into vectors, omega ii ,b i Respectively, the hyper-parameters of the deep forest classifier; n represents the number of neurons in each layer and the number of layers of the L neural network;
The objective function of the multi-objective optimization model minimizes the two objectives simultaneously, min F = (E_RMSE, sparsity), so that the deep forest classifier is as sparse as possible on the premise of good performance.
The maximum of the averaged results of the last forest layer in the deep forest classifier is taken as the output classification:

Fin(c) = Max_y{ Ave_m[ c_11, c_12, ..., c_1y, c_21, c_22, ..., c_2y, ..., c_m1, c_m2, ..., c_my ] },

where m is the number of forests contained in each layer of the deep forest, y is the number of classes in the data set, c is a class of the data set, Fin(c) is the classification result output by the deep forest classification model, Max_y takes the maximum over the per-class averages of the last layer's results, and Ave_m is the average of the last layer's results across forests. The invoice information is identified through this classification and finally stored in a json file in a formatted manner.
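A short sketch of this final aggregation and of the formatted storage; the json field names are invented for illustration.

```python
import json
import numpy as np

def final_class(last_layer_vectors):
    """last_layer_vectors: m arrays of shape (y,), one per forest in the
    last layer. Fin(c) = argmax over classes of the per-class average."""
    avg = np.mean(np.stack(last_layer_vectors), axis=0)  # Ave_m
    return int(np.argmax(avg))                           # Max_y

# Formatted storage of the recognized fields (field names are illustrative).
record = {"invoice_code": "...", "invoice_number": "...",
          "amount": "...", "date": "..."}
with open("invoice.json", "w", encoding="utf-8") as fh:
    json.dump(record, fh, ensure_ascii=False, indent=2)
```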
Example 2
Referring to fig. 4, an embodiment of the present invention provides an invoice image information recognition device, comprising a processing unit, a bus unit, a storage unit and an image acquisition unit, wherein the bus unit connects the storage unit, the processing unit and the image acquisition unit, the storage unit stores a computer program and the images acquired by the image acquisition unit, and the computer program, when executed by the processing unit, implements the invoice image information recognition method.
Example 3
The embodiment of the invention provides a storage medium for realizing an invoice image information identification method, wherein the storage medium stores a computer program, and the computer program realizes the invoice image information identification method when being executed by a processor.
In the embodiments provided herein, it should be understood that the disclosed structures and methods may be implemented in other ways. For example, the above-described structural embodiments are merely illustrative, and for example, the division of the units is only one type of logical functional division, and there may be other divisions when the actual implementation is performed, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, structures or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An invoice image information identification method, characterized by comprising the following steps: acquiring an invoice image and preprocessing the acquired invoice image, and inputting the preprocessed image into a residual neural network optimized by an evolutionary algorithm to identify the regions to be identified in the invoice; the residual neural network comprises a first convolution layer, a first pooling layer connected with the first convolution layer, a second convolution layer connected with the first pooling layer, a second pooling layer connected with the second convolution layer, a first residual block connected with the second pooling layer, a second residual block connected with the first residual block, a third convolution layer connected with the second residual block, a third pooling layer connected with the third convolution layer, a third residual block connected with the third pooling layer, a global average pooling layer connected with the third residual block, and a depth forest classifier based on multi-objective optimization connected with the global average pooling layer; and identifying the relevant digital information in each region to be identified, and storing the identified information in a formatted manner.
2. The invoice image information identification method according to claim 1, wherein the preprocessing comprises performing affine transformation on the invoice image to reach a preset size, performing perspective transformation on the invoice image to correct perspective deformation of an area to be identified of the invoice image, and performing edge detection on the invoice image to extract effective information of the image.
3. The invoice image information recognition method of claim 1, wherein the first, second and third residual blocks have the same structure and comprise input layers, the input layers are connected with an internal convolution layer, the output of the internal convolution layer is connected with a threshold screening layer, the output of the internal convolution layer is connected with a global convolution layer, the output of the global convolution layer is connected with a batch normalization layer, a ReLU activation function layer and a Sigmoid activation function layer, the output of the internal convolution layer is weighted with the output of the Sigmoid activation function layer and then input to the threshold screening layer, and the input layers are weighted with the output of the threshold screening layer and then output.
4. The invoice image information recognition method of claim 1, wherein the deep forest classifier adopts a cascade forest structure in which each forest layer is an ensemble of decision trees; the feature vector generated by each forest layer is concatenated with the original feature vector and input into the next layer, up to the last forest layer, and the maximum of the per-class averages of the last forest layer's results is taken as the classification result output by the deep forest classifier.
5. The invoice image information recognition method of claim 4, wherein automatically determining the number of cascade layers according to whether increasing the number of cascade layers improves the performance of the deep forest classifier comprises: each forest generates a class vector through k-fold cross validation, that is, each sample is used as a training sample k−1 times, producing k−1 class vectors; validation data are obtained from the images; each time a new forest layer is grown, the performance of the whole deep forest classifier is evaluated on the validation data, and if the performance shows no significant improvement, no further layers are added.
6. The invoice image information identification method according to claim 4, characterized in that each layer of the cascade forest structure comprises a random forest and a completely random forest; when the decision trees of the random forest are constructed, √d features are randomly selected from the whole feature space as candidate features, where d is the number of input features, and the feature with the best Gini value is then selected as the splitting feature of the node; the completely random forest randomly selects 1 feature from the whole feature space as the splitting feature of each node.
7. The invoice image information recognition method of claim 4, characterized in that the hyper-parameters involved in the deep forest classifier include the number w_i of random forests in each forest layer, the number θ_i of completely random forests in each forest layer, and the number b_i of decision trees contained in each forest, and the hyper-parameters are optimized in a multi-objective manner;

the multi-objective optimization uses the deep forest activation function h as the first optimization function and the order of magnitude β of the deep forest parameters as the second optimization function, and the hyper-parameters are optimized through the first and second optimization functions;
the first and second optimization functions are constrained by a first and second objective, where the first objective is the root mean square error over the training set as:
Figure FDA0003811435090000024
the second objective is sparsity:
Figure FDA0003811435090000025
wherein x tr Is a training set sample, N tr Is the number of training set samples, o represents the Hadamard product of two numbers converted into vectors, omega ii ,b i Respectively, the hyper-parameters of the deep forest classifier; n represents the number of neurons in each layer and the number of layers of the L neural network;
the objective function of the multi-objective optimization model minimizes the two objectives simultaneously, min F = (E_RMSE, sparsity), so that the deep forest classifier is as sparse as possible on the premise of good performance.
8. The invoice image information identification method according to claim 1, characterized in that the residual neural network is optimized by an evolutionary algorithm whose selection operator rearranges the population individuals using a ranking method; after rearrangement, the probability of an individual being selected is

s = p_0 / (1 − (1 − p_0)^a),
p = s(1 − p_0)^(b−1),

where a is the population size of the evolutionary algorithm, p_0 is the probability that the optimal individual is selected, s is the value obtained by normalizing p_0, and b is the position of the individual after the population is rearranged;
the evolutionary algorithm uses the reciprocal of the sum of squared errors as the fitness function:

F(j) = 1/E(j), where E(j) = Σ (y_j − P)²,

E is the sum of squared errors, P is the overall output computed from the weights w and the input features x, F is the fitness, j is the generation number, and y_j is the theoretical output.
9. An invoice image information recognition device, characterized by comprising: a processing unit, a bus unit, a storage unit and an image acquisition unit, wherein the bus unit connects the storage unit, the processing unit and the image acquisition unit, the storage unit stores a computer program, and the computer program, when executed by the processing unit, implements the invoice image information recognition method according to any one of claims 1-8.
10. A storage medium for implementing an invoice image information recognition method, the storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the invoice image information recognition method according to any one of claims 1-8.
CN202211012411.3A 2022-08-23 2022-08-23 Invoice image information identification method and device and storage medium Pending CN115471856A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211012411.3A CN115471856A (en) 2022-08-23 2022-08-23 Invoice image information identification method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211012411.3A CN115471856A (en) 2022-08-23 2022-08-23 Invoice image information identification method and device and storage medium

Publications (1)

Publication Number Publication Date
CN115471856A true CN115471856A (en) 2022-12-13

Family

ID=84367715

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211012411.3A Pending CN115471856A (en) 2022-08-23 2022-08-23 Invoice image information identification method and device and storage medium

Country Status (1)

Country Link
CN (1) CN115471856A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117809262A (en) * 2024-03-01 2024-04-02 广州宇中网络科技有限公司 Real-time image recognition method and customer behavior analysis system
CN117809262B (en) * 2024-03-01 2024-05-28 广州宇中网络科技有限公司 Real-time image recognition method and customer behavior analysis system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination