CN118053161A - Card surface information identification method, apparatus, device, storage medium, and program product

Info

Publication number: CN118053161A
Application number: CN202410339491.6A
Authority: CN
Inventors: 张晓茜; 郑培龙; 余小娟; 黄越
Original assignee: Industrial and Commercial Bank of China Ltd ICBC
Current assignee: Industrial and Commercial Bank of China Ltd ICBC
Priority date: 2024-03-25
Filing date: 2024-03-25
Publication date: 2024-05-17

Abstract

The application relates to a card surface information identification method, a card surface information identification apparatus, a computer device, a storage medium and a computer program product, and relates to the fields of artificial intelligence technology and financial technology. The application can improve the recognition accuracy of bank card surface information. The method comprises the following steps: acquiring a bank card image to be identified, and locating character information in the bank card image through a target detection algorithm based on a visual attention mechanism to obtain an input feature map; generating an attention feature map containing the character information according to the input feature map; performing feature extraction on the attention feature map through a convolution layer of a preset convolutional recurrent neural network model to obtain depth features; performing sequence analysis on the feature sequence corresponding to the depth features through a recurrent layer of the convolutional recurrent neural network model to obtain a card number sequence corresponding to the character information; and performing sequence decoding on the card number sequence through a transcription layer of the convolutional recurrent neural network model to obtain the bank card number information corresponding to the card number sequence in the bank card image.

Description

Card surface information identification method, apparatus, device, storage medium, and program product

Technical Field

The present application relates to the field of artificial intelligence technology and financial technology, and in particular, to a card surface information identification method, apparatus, computer device, storage medium and computer program product.

Background

With the development of computer technology and financial technology, the image recognition capability of computers has improved, and techniques have emerged in the financial technology field that capture a bank card image with an image acquisition device and use the computer to assist in entering the card information.

However, current image recognition technology can only handle images with simple backgrounds and foregrounds. Bank card images have complex backgrounds and varied fonts, and users photograph them from different angles and under different lighting conditions, so recognition is disturbed by factors such as scale, brightness and sharpness. As a result, the current technology suffers from low recognition accuracy for bank card surface information.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a card surface information identification method, apparatus, computer device, computer readable storage medium, and computer program product that can improve the recognition accuracy of bank card surface information.

In a first aspect, the present application provides a card surface information identification method, including:

acquiring a bank card image to be identified, and positioning an area where character information in the bank card image is located through a target detection algorithm based on a visual attention mechanism so as to acquire an input feature map containing the character information;

generating an attention feature map containing the character information according to the input feature map;

performing feature extraction on the attention feature map through a convolution layer of a preset convolutional recurrent neural network model to obtain depth features containing the character information in the attention feature map;

performing sequence analysis on the feature sequence corresponding to the depth features through a recurrent layer of the convolutional recurrent neural network model to obtain a card number sequence corresponding to the character information;

and performing sequence decoding on the card number sequence through a transcription layer of the convolutional recurrent neural network model to obtain the bank card number information corresponding to the card number sequence in the bank card image.

In one embodiment, the generating an attention feature map including the character information according to the input feature map includes:

splitting the input feature map into a first branch, a second branch and a third branch; performing shape adjustment and dimension conversion on the first branch, performing max pooling and shape adjustment on the second branch, and performing max pooling, shape adjustment and dimension conversion on the third branch to obtain a first target branch, a second target branch and a third target branch; performing a first matrix multiplication on the first target branch and the second target branch, and performing a second matrix multiplication on the result of the first matrix multiplication and the third target branch; and performing dimension conversion, shape adjustment and batch normalization on the result of the second matrix multiplication to obtain an initial attention feature map, and adding the initial attention feature map element by element to the input features contained in the input feature map to obtain the attention feature map.

In one embodiment, the generating an attention feature map including the character information according to the input feature map further includes:

performing global max pooling and global average pooling on the input feature map to obtain a pooled input feature map; performing element-by-element weighted summation on the pooled input feature map and the input features contained in the input feature map to obtain a result feature map; and performing a convolution operation with a preset convolution kernel on the result feature map to obtain the attention feature map.

In one embodiment, performing feature extraction on the attention feature map through a convolution layer of a preset convolutional recurrent neural network model to obtain depth features containing the character information in the attention feature map includes:

identifying the character information contained in the attention feature map through the convolution layer; performing feature extraction on the attention feature map according to the character information to obtain a convolution feature map containing the depth features; and obtaining the depth features according to the convolution feature map.

In one embodiment, after performing feature extraction on the attention feature map through a convolution layer of a preset convolutional recurrent neural network model to obtain depth features containing the character information in the attention feature map, the method further includes:

performing frame-by-frame prediction on the depth features based on the convolution feature map through the recurrent layer to obtain a feature prediction result; and labeling the character information in the depth features as a sequence according to the feature prediction result to obtain the feature sequence corresponding to the depth features.

In one embodiment, the method further comprises:

acquiring feedback data of the convolution layer, and analyzing the feedback data through the recurrent layer to obtain an error differential of the convolution layer; and updating the weight parameters of the convolution layer according to the error differential to obtain an updated convolution layer.

In one embodiment, the decoding the card number sequence through the transcription layer of the convolutional recurrent neural network model to obtain the card number information of the bank card corresponding to the card number sequence in the image of the bank card includes:

inputting the card number sequence into a beam search decoding model in the transcription layer to obtain a plurality of candidate output sequences; selecting, through the beam search decoding model, the target output sequence with the highest probability from the plurality of candidate output sequences; and transcribing the target output sequence into the bank card number information.

In a second aspect, the present application further provides a card surface information identifying apparatus, including:

the area positioning module is used for acquiring a bank card image to be identified, and positioning an area where character information in the bank card image is located through a target detection algorithm based on a visual attention mechanism so as to acquire an input feature map containing the character information;

the picture conversion module is used for generating an attention feature map containing the character information according to the input feature map;

the feature extraction module is used for performing feature extraction on the attention feature map through a convolution layer of a preset convolutional recurrent neural network model to obtain depth features containing the character information in the attention feature map;

the sequence analysis module is used for performing sequence analysis on the feature sequence corresponding to the depth features through a recurrent layer of the convolutional recurrent neural network model to obtain a card number sequence corresponding to the character information;

and the sequence decoding module is used for performing sequence decoding on the card number sequence through a transcription layer of the convolutional recurrent neural network model to obtain the bank card number information corresponding to the card number sequence in the bank card image.

In a third aspect, the present application also provides a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:

acquiring a bank card image to be identified, and locating the area where character information in the bank card image is located through a target detection algorithm based on a visual attention mechanism, so as to obtain an input feature map containing the character information; generating an attention feature map containing the character information according to the input feature map; performing feature extraction on the attention feature map through a convolution layer of a preset convolutional recurrent neural network model to obtain depth features containing the character information in the attention feature map; performing sequence analysis on the feature sequence corresponding to the depth features through a recurrent layer of the convolutional recurrent neural network model to obtain a card number sequence corresponding to the character information; and performing sequence decoding on the card number sequence through a transcription layer of the convolutional recurrent neural network model to obtain the bank card number information corresponding to the card number sequence in the bank card image.

In a fourth aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:

In a fifth aspect, the application also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of:

According to the card surface information identification method, apparatus, computer device, storage medium and computer program product described above, a visual attention mechanism model is integrated into the target detection algorithm, and the bank card number is located and detected by the integrated target detection algorithm, so that the features of the card number information are better represented and detection accuracy and speed are improved. Then, building on the detection result, the idea of template matching is applied and the bank card number is recognized through the convolutional recurrent neural network model, which reduces interference from factors such as scale, brightness and sharpness and improves the recognition accuracy of bank card surface information. In summary, the card surface information identification process does not require extensive manual involvement, and the scheme makes it more convenient for users to have their bank card number recognized, which improves business process efficiency. At the same time, it can reduce labor costs for financial institutions such as banks, avoid losing users because manual entry of bank card numbers is tedious or automatic recognition is inaccurate, and improve user satisfaction and experience.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the related art, the drawings that are required to be used in the embodiments or the related technical descriptions will be briefly described, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to the drawings without inventive effort for those skilled in the art.

FIG. 1 is an application environment diagram of a card surface information identification method in one embodiment;

FIG. 2 is a flowchart of a card surface information identification method in one embodiment;

FIG. 3 is a flowchart of the steps for generating an attention feature map in one embodiment;

FIG. 4 is a schematic diagram of the structure of an attention mechanism module in one embodiment;

FIG. 5 is a schematic diagram of the structure of an attention mechanism module in another embodiment;

FIG. 6 is a schematic diagram of the structure of a convolutional recurrent neural network model in one embodiment;

FIG. 7 is a flowchart of a card surface information identification method in an embodiment;

FIG. 8 is a block diagram of the structure of a card surface information identification apparatus in one embodiment;

FIG. 9 is an internal structural diagram of a computer device in one embodiment.

Detailed Description

The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.

The card surface information identification method provided by the embodiments of the present application can be applied in the application environment shown in FIG. 1, in which the terminal communicates with the server through a network. A data storage system may store the data that the server needs to process; it may be integrated on the server, or placed on the cloud or on another network server.

Specifically, the card surface information identification method provided by the embodiment of the application can be executed by a server.

The server acquires a bank card image to be identified, and locates the area where character information in the bank card image is located through a target detection algorithm based on a visual attention mechanism, so as to obtain an input feature map containing the character information; the server generates an attention feature map containing the character information according to the input feature map; the server performs feature extraction on the attention feature map through a convolution layer of a preset convolutional recurrent neural network model to obtain depth features containing the character information in the attention feature map; the server performs sequence analysis on the feature sequence corresponding to the depth features through a recurrent layer of the convolutional recurrent neural network model to obtain a card number sequence corresponding to the character information; and the server performs sequence decoding on the card number sequence through a transcription layer of the convolutional recurrent neural network model to obtain the bank card number information corresponding to the card number sequence in the bank card image.

In the application environment shown in FIG. 1, the terminal may be, but is not limited to, any of various personal computers, notebook computers, smart phones and tablet computers. The server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.

In an exemplary embodiment, as shown in FIG. 2, a card surface information identification method is provided. The method is described by way of example as applied to the server in FIG. 1, and includes the following steps S201 to S205:

Step S201, a bank card image to be identified is obtained, and an area where character information in the bank card image is located is positioned through a target detection algorithm based on a visual attention mechanism so as to obtain an input feature map containing the character information.

Here, visual attention is an attention mechanism that mimics the human visual system. Its goal is to select and focus on the most relevant and meaningful parts of an input image or video, so that those parts can be better understood and processed.

It should be noted that the core of the attention mechanism is to let the network adaptively attend to the regions that need focus instead of processing all the information in the image, which is generally achieved by assigning a higher weight to the target region. Attention mechanisms can be divided into hard attention and soft attention. A hard attention mechanism simply labels image regions with 1 or 0: a region of interest has weight 1 and a non-salient region has weight 0. A soft attention mechanism adaptively learns a graded weight distribution over the different regions; soft attention mechanisms include channel-dimension attention, spatial-dimension attention, and hybrids of the two.
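The difference between the two weighting schemes can be sketched as follows. This is only an illustration: the region scores, the threshold, and the use of softmax to produce the graded weights are assumptions made for the example, not details given in the application.

```python
import math

def hard_attention(scores, threshold):
    """Hard attention: weight 1 for regions of interest, weight 0 for the rest."""
    return [1.0 if s >= threshold else 0.0 for s in scores]

def soft_attention(scores):
    """Soft attention: graded weights; softmax is one common (assumed) choice."""
    m = max(scores)                              # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical saliency scores for four image regions.
region_scores = [0.2, 2.5, 0.1, 1.8]
hard = hard_attention(region_scores, threshold=1.0)
soft = soft_attention(region_scores)
```

With hard attention the two low-scoring regions are discarded outright, whereas soft attention keeps every region and only scales its contribution.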

The target detection algorithm may be a YOLO (You Only Look Once, a real-time object detection algorithm) v4 network model. The YOLO v4 target detection network structure mainly comprises four parts: an image input, a backbone network, a neck network, and a prediction network.

The input feature map refers to a data structure in the deep learning model, and generally includes various feature information of input data.

Specifically, the server acquires a to-be-identified bank card image, and locates the area where character information in the bank card image is located through a target detection algorithm based on a visual attention mechanism so as to acquire at least one input feature map containing the character information.

Step S202, generating an attention feature map containing character information according to the input feature map.

Here, the attention feature map refers to a representation generated by an attention mechanism that expresses the importance of, or the attention paid to, different regions or features in the input data.

Specifically, the server inputs the input feature map into a target detection algorithm based on a visual attention mechanism, and obtains an attention feature map containing character information.

Step S203, performing feature extraction on the attention feature map through a convolution layer of a preset convolutional recurrent neural network model to obtain depth features containing character information in the attention feature map.

The convolutional recurrent neural network model is a model structure that combines a convolutional neural network with a recurrent neural network. It may be a CRNN (Convolutional Recurrent Neural Network) model, which can be divided into a convolution layer, a recurrent layer and a transcription layer.

Wherein, the depth feature refers to a feature representation obtained by converting input data from an original space into a high-dimensional feature space when a deep learning model is used for feature extraction.

Specifically, the server performs feature extraction on the attention feature map through a convolution layer of a preset convolutional recurrent neural network model to obtain depth features containing character information in the attention feature map.

Step S204, performing sequence analysis on the feature sequence corresponding to the depth features through a recurrent layer of the convolutional recurrent neural network model to obtain a card number sequence corresponding to the character information.

Here, sequence analysis refers to the process of analyzing and modeling a sequence of ordered data.

Specifically, the server performs sequence analysis on the feature sequence corresponding to the depth features through a recurrent layer of the convolutional recurrent neural network model to obtain a card number sequence corresponding to the character information.

Step S205, performing sequence decoding on the card number sequence through a transcription layer of the convolutional recurrent neural network model to obtain the bank card number information corresponding to the card number sequence in the bank card image.

Sequence decoding refers to the process of obtaining an output sequence by inference or generation, given a model and an input sequence; for example, the input sequence may be the card number sequence and the output sequence may be the bank card number information.

Specifically, the server performs sequence decoding on the card number sequence through a transcription layer of the convolutional recurrent neural network model to obtain the bank card number information corresponding to the card number sequence in the bank card image.
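As an illustration of sequence decoding, the beam search embodiment described earlier (several candidate output sequences, of which the most probable is transcribed) might be sketched as below. The per-frame probability tables, the blank label `-`, and the CTC-style collapse of repeated labels are assumptions borrowed from commonly published CRNN designs; the application does not specify them.

```python
import math

BLANK = "-"  # hypothetical blank label, as used in CTC-style transcription

def beam_search(frame_probs, beam_width=3):
    """frame_probs: one dict per frame mapping label -> probability.
    Returns up to beam_width (label_path, log_prob) candidates, best first."""
    beams = [((), 0.0)]
    for probs in frame_probs:
        expanded = [
            (path + (label,), score + math.log(p))
            for path, score in beams
            for label, p in probs.items()
            if p > 0.0
        ]
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_width]            # keep only the best candidates
    return beams

def collapse(path):
    """Merge consecutive repeated labels, then drop blanks."""
    out, prev = [], None
    for label in path:
        if label != prev and label != BLANK:
            out.append(label)
        prev = label
    return "".join(out)

# Toy per-frame distributions over the labels "6", "2" and blank.
frames = [
    {"6": 0.8, "2": 0.1, BLANK: 0.1},
    {"6": 0.6, "2": 0.1, BLANK: 0.3},
    {BLANK: 0.7, "2": 0.2, "6": 0.1},
    {"2": 0.9, "6": 0.05, BLANK: 0.05},
]
candidates = beam_search(frames, beam_width=3)
best_path, best_score = candidates[0]
card_number = collapse(best_path)
```

Because the toy frames are scored independently, the top beam here coincides with the greedy path; a beam becomes more useful when candidate scores depend on earlier labels, for example with a prefix-merging decoder.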

In the card surface information identification method above, a visual attention mechanism model is integrated into the target detection algorithm, and the bank card number is located and detected by the integrated target detection algorithm, so that the features of the card number information are better represented and detection accuracy and speed are improved. Then, building on the detection result, the idea of template matching is applied and the bank card number is recognized through the convolutional recurrent neural network model, which reduces interference from factors such as scale, brightness and sharpness and improves the recognition accuracy of bank card surface information. In summary, the card surface information identification process does not require extensive manual involvement, and the scheme makes it more convenient for users to have their bank card number recognized, which improves business process efficiency. At the same time, it can reduce labor costs for financial institutions such as banks, avoid losing users because manual entry of bank card numbers is tedious or automatic recognition is inaccurate, and improve user satisfaction and experience.

In one embodiment, as shown in FIG. 3, generating the attention feature map containing character information according to the input feature map in step S202 specifically includes the following steps:

Step S301, splitting the input feature map into a first branch, a second branch and a third branch.

Step S302, performing shape adjustment and dimension conversion on the first branch, performing max pooling and shape adjustment on the second branch, and performing max pooling, shape adjustment and dimension conversion on the third branch to obtain a first target branch, a second target branch and a third target branch.

Step S303, performing a first matrix multiplication on the first target branch and the second target branch, and performing a second matrix multiplication on the result of the first matrix multiplication and the third target branch.

Step S304, performing dimension conversion, shape adjustment and batch normalization on the result of the second matrix multiplication to obtain an initial attention feature map, and adding the initial attention feature map element by element to the input features contained in the input feature map to obtain the attention feature map.

Specifically, the server splits the input feature map into a first branch, a second branch and a third branch; performs shape adjustment and dimension conversion on the first branch, max pooling and shape adjustment on the second branch, and max pooling, shape adjustment and dimension conversion on the third branch to obtain a first target branch, a second target branch and a third target branch; performs a first matrix multiplication on the first target branch and the second target branch, and a second matrix multiplication on the result of the first matrix multiplication and the third target branch; and performs dimension conversion, shape adjustment and batch normalization on the result of the second matrix multiplication to obtain an initial attention feature map, which is added element by element to the input features contained in the input feature map to obtain the attention feature map.

For example, the overall structure of the NALM (Neural Automata Learning Machine) attention module is shown in FIG. 4. The module captures non-local dependencies by establishing global relationships between the features at a given location and the features at other locations. The input feature map is divided into three branches for processing. The first branch, the θ (theta) branch, undergoes a 1×1 convolution followed by width-height shape adjustment and dimension conversion. The second branch, the φ (phi) branch, undergoes a 1×1 convolution and a max pooling layer, followed by width-height shape adjustment. The third branch, the g branch, first undergoes a 1×1 convolution, then a max pooling layer, and finally width-height shape adjustment and dimension conversion. The processed first and second branches are first matrix-multiplied, the product is passed through a Softmax (an activation function) layer, and the result is then matrix-multiplied with the output of the third branch. That result undergoes dimension conversion, width-height shape adjustment, a 1×1 convolution and batch normalization, and is then added element by element to the input features to generate the attention feature map.
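The three-branch computation can be sketched in NumPy for a single (C, H, W) feature map. This is a minimal sketch under stated assumptions: the function and weight names are hypothetical, max pooling is taken to be 2×2 with stride 2, and the batch normalization step is replaced by a per-channel normalization over one sample because no batch is available here.

```python
import numpy as np

def conv1x1(x, weight):
    """1x1 convolution on a (C, H, W) map: a per-pixel linear map over channels."""
    c, h, w = x.shape
    return (weight @ x.reshape(c, h * w)).reshape(weight.shape[0], h, w)

def max_pool2x2(x):
    """2x2 max pooling with stride 2 (H and W are assumed to be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def softmax(a, axis=-1):
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def non_local_attention(x, w_theta, w_phi, w_g, w_out, eps=1e-5):
    c, h, w = x.shape
    inner = w_theta.shape[0]
    # theta branch: 1x1 conv, flatten width/height, transpose -> (H*W, inner)
    theta = conv1x1(x, w_theta).reshape(inner, -1).T
    # phi branch: 1x1 conv, max pooling, flatten -> (inner, H*W/4)
    phi = max_pool2x2(conv1x1(x, w_phi)).reshape(inner, -1)
    # g branch: 1x1 conv, max pooling, flatten, transpose -> (H*W/4, inner)
    g = max_pool2x2(conv1x1(x, w_g)).reshape(inner, -1).T
    attn = softmax(theta @ phi, axis=-1)          # first matrix multiplication + Softmax
    y = (attn @ g).T.reshape(inner, h, w)         # second multiplication, back to a map
    y = conv1x1(y, w_out)                         # restore the original channel count
    mean = y.mean(axis=(1, 2), keepdims=True)     # stand-in for batch normalization
    var = y.var(axis=(1, 2), keepdims=True)
    y = (y - mean) / np.sqrt(var + eps)
    return x + y                                  # element-by-element residual addition
```

The residual addition means the module can only add to the input features: if the attention path contributes nothing, the input feature map passes through unchanged.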

In the embodiment, the NALM attention module is fused on the feature extraction backbone of the target detection algorithm, so that the features of the card number information can be better represented, the detection precision and speed are improved, and the method is more suitable for the recognition scene of the complex bank card number.

In one embodiment, generating the attention feature map containing character information according to the input feature map in step S202 specifically includes the following steps:

performing global max pooling and global average pooling on the input feature map to obtain a pooled input feature map; performing element-by-element weighted summation on the pooled input feature map and the input features contained in the input feature map to obtain a result feature map; and performing a convolution operation with a preset convolution kernel on the result feature map to obtain the attention feature map.

The result feature map is an intermediate output obtained through convolution operations in a convolutional neural network. It represents abstractions of the input data at different levels and is the basis on which the deep learning model carries out inference and subsequent tasks.

Specifically, the server performs global max pooling and global average pooling on the input feature map to obtain a pooled input feature map; performs element-by-element weighted summation on the pooled input feature map and the input features contained in the input feature map to obtain a result feature map; and performs a convolution operation with a preset convolution kernel on the result feature map to obtain the attention feature map.

By way of example, the overall structure of the CBAM (Convolutional Block Attention Module) attention module is shown in FIG. 5. It can be divided into two parts, channel attention and spatial attention. The two parts are simply connected in series, which combines the advantages of the channel domain and the spatial domain, extracts more effective features, reduces the number of parameters, and lowers the training difficulty. The channel attention module first performs global max pooling over the width and height of the input feature map, then global average pooling, and after these two pooling operations feeds the results into a shared fully connected layer; the feature maps output by the shared fully connected layer are added element by element and weighted through a Sigmoid (an activation function) function to obtain the final result feature map of the channel attention module. In the spatial attention module, global average pooling and global max pooling yield the spatial attention features, the correlation among the spatial features is established through two convolutions while the input and output dimensions are kept unchanged, and a convolution with a 7×7 kernel is then applied, which reduces the parameter count, simplifies computation, and expresses the correlation of the spatial features at a high-dimensional level.
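A compact NumPy sketch of channel attention followed by spatial attention is given below. It follows the commonly published CBAM layout (a shared two-layer fully connected network applied to the max-pooled and average-pooled descriptors, and a single 7×7 convolution over the stacked channel-wise average and max maps), which may differ in detail from the embodiment; all function and weight names are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def channel_attention(x, w1, w2):
    """x: (C, H, W). Weights each channel using pooled global descriptors."""
    max_desc = x.max(axis=(1, 2))                 # global max pooling -> (C,)
    avg_desc = x.mean(axis=(1, 2))                # global average pooling -> (C,)

    def shared_mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)       # shared fully connected layers + ReLU

    weights = sigmoid(shared_mlp(max_desc) + shared_mlp(avg_desc))
    return x * weights[:, None, None]

def spatial_attention(x, kernel):
    """kernel: (2, k, k), applied to the stacked channel-wise average and max maps."""
    k = kernel.shape[-1]
    pad = k // 2
    stacked = np.stack([x.mean(axis=0), x.max(axis=0)])          # (2, H, W)
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))   # zero padding keeps H, W
    h, w = x.shape[1:]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return x * sigmoid(out)[None, :, :]

def cbam(x, w1, w2, kernel):
    """Channel attention and spatial attention connected in series."""
    return spatial_attention(channel_attention(x, w1, w2), kernel)
```

Both stages only rescale the input by factors between 0 and 1, so the output has the same shape as the input and no element grows in magnitude.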

In this embodiment, by fusing CBAM attention modules on the feature fusion branch of the target detection algorithm, features of card number information can be better represented, detection accuracy and speed are improved, and the method is more suitable for recognition scenes of complex bank card numbers.

In one embodiment, performing feature extraction on the attention feature map through a convolution layer of a preset convolutional recurrent neural network model in step S203, to obtain depth features containing character information in the attention feature map, specifically includes the following steps:

identifying character information contained in the attention feature map through the convolution layer; according to the character information, carrying out feature extraction on the attention feature map to obtain a convolution feature map containing depth features; and obtaining depth features according to the convolution feature map.

Wherein the convolution feature map is a data representation in a convolution neural network.

It should be noted that the structure of the convolutional recurrent neural network model is shown in FIG. 6. The model can be divided into three parts, namely a convolution layer, a recurrent layer and a transcription layer, which perform feature extraction, sequence analysis and sequence decoding, respectively.

Specifically, the server identifies character information contained in the attention feature map through the convolution layer; according to the character information, carrying out feature extraction on the attention feature map to obtain a convolution feature map containing depth features; and obtaining depth features according to the convolution feature map.

For example, the server takes the detected image as input, and obtains depth features with information such as image semantics by using a convolutional neural network.

In this embodiment, feature extraction is performed on the attention feature map by using a convolutional layer of the convolutional recurrent neural network model, so that depth features including character information in the attention feature map are rapidly and accurately obtained.

In one embodiment, after feature extraction is performed on the attention feature map through the convolution layer of the preset convolutional recurrent neural network model to obtain the depth features containing the character information in the attention feature map, the method further includes the following steps:

Performing frame-by-frame prediction on the depth features based on the convolution feature map through the recurrent layer to obtain a feature prediction result; and labeling the character information in the depth features as a sequence according to the feature prediction result to obtain the feature sequence corresponding to the depth features.

Specifically, the server performs frame-by-frame prediction on the depth features based on the convolution feature map through the recurrent layer to obtain a feature prediction result, and labels the character information in the depth features as a sequence according to the feature prediction result to obtain the feature sequence corresponding to the depth features.

In this embodiment, by using a recurrent layer with a recurrent structure, the prediction of the feature at time t is combined with context information, that is, the information at time t-1 and the information at time t+1. Card number sequences of different lengths can therefore be identified, and the sequence analysis result is more accurate and stable.
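How a prediction at time t can draw on both earlier and later frames can be illustrated with a toy bidirectional recurrence. This is only a sketch: the scalar frames, the tanh update and the weights `w_in` and `w_rec` are assumptions chosen for readability, not the recurrent layer of the embodiment.

```python
import math


def run_direction(frames, w_in, w_rec):
    # Simple recurrence: each state depends on the current frame and the previous state.
    state, states = 0.0, []
    for x in frames:
        state = math.tanh(w_in * x + w_rec * state)
        states.append(state)
    return states


def bidirectional_states(frames, w_in=0.8, w_rec=0.5):
    # The forward pass carries information from earlier frames (t-1, t-2, ...).
    forward = run_direction(frames, w_in, w_rec)
    # The backward pass carries information from later frames (t+1, t+2, ...).
    backward = run_direction(frames[::-1], w_in, w_rec)[::-1]
    # One (forward, backward) pair per frame, whatever the sequence length.
    return list(zip(forward, backward))
```

The output always has one entry per input frame, so sequences of different lengths need no special handling, and changing a later frame changes the backward component of earlier positions.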

In one embodiment, the method further comprises the following steps:

Acquiring feedback data of the convolution layer, and analyzing the feedback data through the recurrent layer to obtain the error differentiation of the convolution layer; updating the weight parameters of the convolution layer according to the error differentiation to obtain an updated convolution layer;

wherein performing feature extraction on the attention feature map through the convolution layer of the preset convolutional recurrent neural network model to obtain the depth features containing the character information in the attention feature map specifically includes the following step:

and extracting the characteristics of the attention characteristic map through the updated convolution layer to obtain the depth characteristics containing character information in the attention characteristic map.

Error differentiation is a way of training a neural network model: the error between the predicted value and the actual value of the model is calculated and back-propagated to the weight parameters of the convolution layer, so that the parameters are updated and the model can predict more accurately.

Specifically, the server acquires feedback data of the convolution layer, and analyzes the feedback data through the recurrent layer to obtain the error differentiation of the convolution layer; updates the weight parameters of the convolution layer according to the error differentiation to obtain an updated convolution layer; and performs feature extraction on the attention feature map through the updated convolution layer to obtain the depth features containing the character information in the attention feature map.

In this embodiment, the model can predict more accurately by back-propagating the error differential into the weight parameters of the convolutional layer of the upper layer, thereby updating the parameters.
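The idea of back-propagating an error into convolution weights can be shown with a one-dimensional toy case. The sketch below assumes a squared-error loss, a single 1-D kernel and plain gradient descent; the function names and the learning rate are illustrative and are not taken from the embodiment.

```python
def conv1d(signal, kernel):
    # Valid 1-D convolution (no padding, stride 1).
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def loss(signal, kernel, target):
    # Squared error between the predicted and the actual values.
    return sum((y - t) ** 2 for y, t in zip(conv1d(signal, kernel), target))


def kernel_gradient(signal, kernel, target):
    # Derivative of the squared error with respect to each kernel weight.
    errors = [y - t for y, t in zip(conv1d(signal, kernel), target)]
    return [sum(2.0 * e * signal[i + j] for i, e in enumerate(errors))
            for j in range(len(kernel))]


def update_kernel(signal, kernel, target, learning_rate=0.01):
    # Move each weight against its gradient to obtain the updated kernel.
    grad = kernel_gradient(signal, kernel, target)
    return [w - learning_rate * g for w, g in zip(kernel, grad)]
```

Applying `update_kernel` once to a zero kernel already lowers the loss on the same data, which is the sense in which the updated convolution layer predicts more accurately.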

In one embodiment, performing sequence decoding on the card number sequence through the transcription layer of the convolutional recurrent neural network model to obtain the bank card number information corresponding to the card number sequence in the bank card image specifically includes the following steps:

Inputting the card number sequence into a beam search decoding model in the transcription layer to obtain a plurality of candidate output sequences; screening out the target output sequence with the highest probability from the plurality of candidate output sequences through the beam search decoding model; and transcribing the target output sequence into the bank card number information.

Beam search decoding is more accurate because it considers more possibilities, but it has higher computational complexity because a set of candidate sequences must be maintained.

Specifically, the server inputs the card number sequence into the beam search decoding model in the transcription layer to obtain a plurality of candidate output sequences; screens out the target output sequence with the highest probability from the plurality of candidate output sequences through the beam search decoding model; and transcribes the target output sequence into the bank card number information.
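One way such a decoder could keep several candidates and pick the most probable one is a CTC-style prefix beam search, sketched below. Combining beam search with CTC merging in this way is an assumption made for illustration; the function name, the per-frame probability layout and the convention that index 0 is the blank are likewise hypothetical.

```python
from collections import defaultdict


def ctc_beam_search(probs, alphabet, beam_width=3, blank=0):
    """probs: one probability list per frame; alphabet[i] is the symbol for index i."""
    # Each prefix tracks (probability of ending in blank, probability of ending in a symbol).
    beams = {(): (1.0, 0.0)}
    for frame in probs:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            for idx, p in enumerate(frame):
                if idx == blank:
                    next_beams[prefix][0] += (p_b + p_nb) * p
                elif prefix and prefix[-1] == idx:
                    # A repeated symbol merges into the prefix unless a blank separated it.
                    next_beams[prefix][1] += p_nb * p
                    next_beams[prefix + (idx,)][1] += p_b * p
                else:
                    next_beams[prefix + (idx,)][1] += (p_b + p_nb) * p
        # Keep only the beam_width most probable prefixes (the candidate set).
        ranked = sorted(next_beams.items(),
                        key=lambda kv: kv[1][0] + kv[1][1], reverse=True)
        beams = {prefix: tuple(score) for prefix, score in ranked[:beam_width]}
    candidates = [("".join(alphabet[i] for i in prefix), p_b + p_nb)
                  for prefix, (p_b, p_nb) in beams.items()]
    return sorted(candidates, key=lambda c: c[1], reverse=True)
```

With two frames that each assign 0.6 to the blank and 0.4 to a single symbol, the first candidate returned is that symbol with total probability 0.64, whereas the most likely frame-by-frame path is all blanks with probability 0.36. The cost is that the candidate set has to be rebuilt and ranked at every frame.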

For example, the CTC (Connectionist Temporal Classification) algorithm is used as the transcription layer for sequence decoding. From the sequence labels obtained in the previous step, the most likely label is selected and transcribed into character information. During transcription, the input features can easily become inconsistent with the final label output, for example when the same character is judged to be two characters; the blank character introduced by CTC handles this situation well.
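The role of the blank can be seen in the standard CTC collapse rule: consecutive repeated labels are merged, then blanks are removed. The sketch below assumes that index 0 is the blank and that the alphabet string `"-0123456789"` maps indices to digits; both are illustrative conventions, not details from the embodiment.

```python
ALPHABET = "-0123456789"  # index 0 is the blank (hypothetical mapping)


def ctc_collapse(label_ids, blank=0):
    # Merge consecutive repeats, then drop blanks.
    out, prev = [], None
    for i in label_ids:
        if i != prev and i != blank:
            out.append(i)
        prev = i
    return out


def to_digits(label_ids):
    return "".join(ALPHABET[i] for i in ctc_collapse(label_ids))
```

For the per-frame labels `[7, 7, 0, 7, 3, 3, 0]`, `to_digits` yields `"662"`: the first two frames merge into a single 6, while the blank between them and the next 6 preserves the genuinely repeated digit.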

In the embodiment, the transcription layer is utilized to carry out sequence decoding on the card number sequence, and a target output sequence with highest probability is screened out of a plurality of candidate output sequences; the target output sequence is accurately transcribed into the bank card number information, so that the identification accuracy of the bank card surface information is improved.

In one embodiment, as shown in fig. 7, a card surface information identification method in a specific embodiment is provided, which specifically includes the following steps:

Step S701, acquiring a bank card image to be identified, and positioning the area where the character information is located in the bank card image through a target detection algorithm based on a visual attention mechanism to obtain an input feature map containing the character information.

Step S702, splitting an input characteristic image into a first branch, a second branch and a third branch; performing shape adjustment and dimension conversion on the first branch, performing maximum pooling and shape adjustment on the second branch, and performing maximum pooling, shape adjustment and dimension conversion on the third branch to obtain a first target branch, a second target branch and a third target branch; performing primary matrix multiplication on the first target branch and the second target branch, and performing secondary matrix multiplication on the primary matrix multiplication result and the third target branch; and performing dimension conversion, shape adjustment and batch normalization on the secondary matrix multiplication result to obtain an initial attention feature map, and adding the initial attention feature map and input features contained in the input feature image element by element to obtain an attention feature map.

Step S703, carrying out global maximum pooling and global average pooling on the input feature map to obtain a pooled input feature map; carrying out element-by-element weighted summation on the pooled input feature image and the input features contained in the input feature image to obtain a result feature image; and carrying out convolution operation corresponding to a preset convolution kernel on the result feature map to obtain the attention feature map.

Step S704, identifying character information contained in the attention feature map through the convolution layer; according to the character information, carrying out feature extraction on the attention feature map to obtain a convolution feature map containing depth features; and obtaining depth features according to the convolution feature map.

Step S705, performing frame-by-frame prediction on the depth features based on the convolution feature map through the recurrent layer to obtain a feature prediction result; and labeling the character information in the depth features as a sequence according to the feature prediction result to obtain the feature sequence corresponding to the depth features.

Step S706, performing sequence analysis on the feature sequence corresponding to the depth features through the recurrent layer of the convolutional recurrent neural network model to obtain the card number sequence corresponding to the character information.

Step S707, inputting the card number sequence into the beam search decoding model in the transcription layer to obtain a plurality of candidate output sequences; screening a target output sequence with highest probability from a plurality of candidate output sequences through a beam search decoding model; and transcribing the target output sequence into bank card number information.
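The three-branch computation of step S702 can be traced with a small shape-level sketch in plain Python. Several details are assumptions made for illustration: the branches are taken directly from the input (no learned projection), maximum pooling uses a 2×2 window, "dimension conversion" is read as a transpose, and batch normalization is simplified to per-channel standardization. All function names are hypothetical.

```python
def max_pool_2x2(channel):
    # 2x2 maximum pooling with stride 2 (window size is an assumption).
    return [[max(channel[i][j], channel[i][j + 1],
                 channel[i + 1][j], channel[i + 1][j + 1])
             for j in range(0, len(channel[0]) - 1, 2)]
            for i in range(0, len(channel) - 1, 2)]


def flatten(fmap):
    # Shape adjustment: (C, H, W) -> (C, H*W).
    return [[v for row in ch for v in row] for ch in fmap]


def transpose(m):
    # Dimension conversion: swap the two axes.
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def standardize(row, eps=1e-5):
    # Simplified stand-in for batch normalization.
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return [(v - mean) / (var + eps) ** 0.5 for v in row]


def three_branch_attention(fmap):
    h, w = len(fmap[0]), len(fmap[0][0])
    pooled = [max_pool_2x2(ch) for ch in fmap]
    first = transpose(flatten(fmap))      # (H*W, C)
    second = flatten(pooled)              # (C, N)
    third = transpose(flatten(pooled))    # (N, C)
    primary = matmul(first, second)       # (H*W, N)
    secondary = matmul(primary, third)    # (H*W, C)
    initial = [standardize(r) for r in transpose(secondary)]  # (C, H*W)
    # Element-by-element addition with the input features.
    return [[[initial[c][i * w + j] + fmap[c][i][j] for j in range(w)]
             for i in range(h)] for c in range(len(fmap))]
```

The result has the same shape as the input feature map, which is what allows the final element-by-element addition.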

The beneficial effects brought by the embodiment are as follows:

The scheme aims to improve the accuracy of bank card number identification. It introduces the YOLO v4 target detection algorithm and the CRNN convolutional recurrent neural network, and introduces an attention mechanism model into the target detection method, thereby improving identification accuracy. The scheme not only helps users perform bank card identification operations more conveniently and improves business process efficiency, but also reduces the labor cost of banks and improves customer satisfaction and customer experience.

It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with at least some of the other steps or stages.

Based on the same inventive concept, the embodiment of the application also provides a card surface information identification device for implementing the card surface information identification method described above. The solution provided by the device is implemented in a manner similar to that described in the above method, so for the specific limitations in the one or more embodiments of the card surface information identification device provided below, reference may be made to the limitations of the card surface information identification method above, which are not repeated here.

In an exemplary embodiment, as shown in fig. 8, there is provided a card face information identifying apparatus including:

The area positioning module 801 is configured to obtain a to-be-identified card image, and position an area where character information in the card image is located by using a target detection algorithm based on a visual attention mechanism, so as to obtain an input feature map including the character information;

A picture conversion module 802, configured to generate an attention feature map containing character information according to the input feature map;

The feature extraction module 803 is configured to perform feature extraction on the attention feature map through a convolution layer of a preset convolution cyclic neural network model, so as to obtain depth features including character information in the attention feature map;

The sequence analysis module 804 is configured to perform sequence analysis on the feature sequence corresponding to the depth features through a recurrent layer of the convolutional recurrent neural network model, so as to obtain a card number sequence corresponding to the character information;

The sequence decoding module 805 is configured to perform sequence decoding on the card number sequence through a transcription layer of the convolutional recurrent neural network model, so as to obtain the bank card number information corresponding to the card number sequence in the bank card image.
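The way modules 801 to 805 hand their results to one another can be sketched as a simple composition. The class name, field names and the use of plain callables are assumptions made for illustration; the device itself may be realized in software, hardware, or a combination of both.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class CardFaceRecognizer:
    locate_region: Callable[[Any], Any]     # area positioning module 801
    convert_picture: Callable[[Any], Any]   # picture conversion module 802
    extract_features: Callable[[Any], Any]  # feature extraction module 803
    analyze_sequence: Callable[[Any], Any]  # sequence analysis module 804
    decode_sequence: Callable[[Any], str]   # sequence decoding module 805

    def recognize(self, card_image: Any) -> str:
        # Each module consumes the output of the previous one.
        input_map = self.locate_region(card_image)
        attention_map = self.convert_picture(input_map)
        depth_features = self.extract_features(attention_map)
        card_sequence = self.analyze_sequence(depth_features)
        return self.decode_sequence(card_sequence)
```

Keeping each stage behind its own callable means an individual module can be replaced, for example by one of the alternative attention variants, without touching the others.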

In one embodiment, the picture conversion module 802 is further configured to split the input feature map into a first branch, a second branch, and a third branch; performing shape adjustment and dimension conversion on the first branch, performing maximum pooling and shape adjustment on the second branch, and performing maximum pooling, shape adjustment and dimension conversion on the third branch to obtain a first target branch, a second target branch and a third target branch; performing primary matrix multiplication on the first target branch and the second target branch, and performing secondary matrix multiplication on the primary matrix multiplication result and the third target branch; and performing dimension conversion, shape adjustment and batch normalization on the secondary matrix multiplication result to obtain an initial attention feature map, and adding the initial attention feature map and input features contained in the input feature image element by element to obtain an attention feature map.

In one embodiment, the picture conversion module 802 is further configured to perform global maximum pooling and global average pooling on the input feature map to obtain a pooled input feature map; carrying out element-by-element weighted summation on the pooled input feature image and the input features contained in the input feature image to obtain a result feature image; and carrying out convolution operation corresponding to a preset convolution kernel on the result feature map to obtain the attention feature map.

In one embodiment, the feature extraction module 803 is further configured to identify character information contained in the attention feature map through a convolution layer; according to the character information, carrying out feature extraction on the attention feature map to obtain a convolution feature map containing depth features; and obtaining depth features according to the convolution feature map.

In one embodiment, the card surface information identification device further comprises a sequence labeling module, which is configured to perform frame-by-frame prediction on the depth features based on the convolution feature map through the recurrent layer to obtain a feature prediction result, and to label the character information in the depth features as a sequence according to the feature prediction result to obtain the feature sequence corresponding to the depth features.

In one embodiment, the card surface information identification device further comprises a parameter updating module, which is configured to acquire feedback data of the convolution layer, and analyze the feedback data through the recurrent layer to obtain the error differentiation of the convolution layer; and to update the weight parameters of the convolution layer according to the error differentiation to obtain an updated convolution layer;

The feature extraction module 803 is further configured to perform feature extraction on the attention feature map through the updated convolution layer, so as to obtain a depth feature including character information in the attention feature map.

In one embodiment, the sequence decoding module 805 is further configured to input the card number sequence into a beam search decoding model in the transcription layer to obtain a plurality of candidate output sequences; screen out the target output sequence with the highest probability from the plurality of candidate output sequences through the beam search decoding model; and transcribe the target output sequence into the bank card number information.

The modules in the card surface information identification device may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in the form of hardware, or may be stored in a memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.

In one exemplary embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 9. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a card face information identification method.

It will be appreciated by persons skilled in the art that the architecture shown in fig. 9 is merely a block diagram of some of the architecture relevant to the present inventive arrangements and is not limiting as to the computer device to which the present inventive arrangements are applicable, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.

In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.

It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are both information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data are required to meet the related regulations.

Those skilled in the art will appreciate that all or part of the above described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the computer program, when executed, may include the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high density embedded non-volatile memory, resistive random access memory (ReRAM), magneto-resistive random access memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric memory (Ferroelectric Random Access Memory, FRAM), phase change memory (Phase Change Memory, PCM), graphene memory, and the like. Volatile memory may include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration and not limitation, RAM can take various forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.

The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction between the combinations of the technical features, they should be considered to be within the scope of this description.

The foregoing examples illustrate only a few embodiments of the application and are described in detail herein without thereby limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of the application should be assessed as that of the appended claims.

Claims

1. A card face information identification method, characterized in that the method comprises:

Acquiring a bank card image to be identified, and positioning character information in the bank card image through a target detection algorithm based on a visual attention mechanism so as to acquire an input feature map containing the character information;

2. The method of claim 1, wherein generating an attention feature map containing the character information from the input feature map comprises:

splitting the input feature map into a first branch, a second branch and a third branch;

performing shape adjustment and dimension conversion on the first branch to obtain a first target branch, performing maximum pooling and shape adjustment on the second branch to obtain a second target branch, and performing maximum pooling, shape adjustment and dimension conversion on the third branch to obtain a third target branch;

Performing primary matrix multiplication on the first target branch and the second target branch to obtain a primary matrix multiplication result, and performing secondary matrix multiplication on the primary matrix multiplication result and the third target branch to obtain a secondary matrix multiplication result;

And performing dimension conversion, shape adjustment and batch normalization on the secondary matrix multiplication result to obtain an initial attention feature map, and adding the initial attention feature map and input features contained in the input feature image element by element to obtain the attention feature map.

3. The method of claim 1, wherein generating an attention feature map containing the character information from the input feature map comprises:

carrying out global maximum pooling and global average pooling on the input feature map to obtain the input feature map after pooling treatment;

Carrying out element-by-element weighted summation on the input feature map subjected to the pooling treatment and input features contained in the input feature image to obtain a result feature map;

And carrying out convolution operation corresponding to a preset convolution kernel on the result feature map to obtain the attention feature map.

4. The method according to claim 1, wherein the feature extraction of the attention feature map by a convolution layer of a preset convolutional recurrent neural network model to obtain depth features containing the character information in the attention feature map, includes:

Identifying the character information contained in the attention feature map through the convolution layer;

Performing feature extraction on the attention feature map according to the character information to obtain a convolution feature map containing the depth features;

And obtaining the depth characteristic according to the convolution characteristic diagram.

5. The method according to claim 4, wherein after extracting features of the attention feature map by a convolution layer of a predetermined convolutional recurrent neural network model to obtain depth features containing the character information in the attention feature map, further comprising:

Performing frame-by-frame prediction on the depth features based on the convolution feature map through the recurrent layer to obtain a feature prediction result;

and marking the character information in the depth features into a sequence according to the feature prediction result to obtain the feature sequence corresponding to the depth features.

6. The method according to claim 1, wherein the method further comprises:

Acquiring feedback data of the convolution layer, and analyzing the feedback data through the recurrent layer to obtain error differentiation of the convolution layer;

Updating the weight parameters of the convolution layer according to the error differentiation to obtain the updated convolution layer;

the performing feature extraction on the attention feature map through a convolution layer of a preset convolutional recurrent neural network model to obtain depth features containing the character information in the attention feature map comprises:

and extracting the characteristics of the attention characteristic map through the updated convolution layer to obtain the depth characteristics containing the character information in the attention characteristic map.

7. The method according to any one of claims 1 to 6, wherein performing sequence decoding on the card number sequence through the transcription layer of the convolutional recurrent neural network model to obtain the bank card number information corresponding to the card number sequence in the bank card image includes:

inputting the card number sequence into a beam search decoding model in the transcription layer to obtain a plurality of candidate output sequences;

screening out a target output sequence with highest probability from a plurality of candidate output sequences through the beam search decoding model;

And transcribing the target output sequence into the bank card number information.

8. A card face information recognition apparatus, characterized by comprising:

9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.

10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.

11. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.