CN113627297A - Image recognition method, device, equipment and medium - Google Patents

Image recognition method, device, equipment and medium

Publication number
CN113627297A
Authority
CN
China
Prior art keywords
sample
chinese character
image
value
stroke
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110874690.3A
Other languages
Chinese (zh)
Inventor
李书涵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202110874690.3A priority Critical patent/CN113627297A/en
Publication of CN113627297A publication Critical patent/CN113627297A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Abstract

The invention relates to the field of artificial intelligence and provides an image recognition method, device, equipment and medium. A normal distribution algorithm is used to accurately determine the threshold for binarization processing, so that characters are more clearly distinguished from the background and misjudgment is avoided. The second sample is segmented according to a cutting template, which makes recognition more targeted. The feature value of each Chinese character feature is extracted with Chinese character strokes as the basic elements, which reflects both the structural differences between Chinese characters and the structural similarities of shape-near characters. A rarely-used word dictionary and a shape-near word dictionary are further introduced on the basis of the constructed Chinese character sample, so that targeted training can be performed. A recognition model is obtained through training and used to generate a recognition result, so that image recognition is realized by artificial intelligence means, and both the accuracy of image text recognition and the recognition performance are improved. In addition, the invention also relates to blockchain technology: the recognition model can be stored in a blockchain node.

Description

Image recognition method, device, equipment and medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an image identification method, device, equipment and medium.
Background
In recent years, as artificial intelligence has grown in popularity, the field of image recognition has also received increasing attention. Image recognition here refers to automatically recognizing the text content of an image file by computer technology. In daily life, how to efficiently acquire the text content of a large number of image files and quickly enter the text information contained in the images into a system is a research problem in urgent need of a solution.
The manual approach currently in use requires image content to be entered by hand, which is inefficient. For a service system with a huge number of users, manually entering and verifying the content of images submitted by users is clearly not advisable. In particular, when an image file contains a large amount of text but the system actually needs only a small part of it, manual entry wastes time and reduces processing efficiency.
Existing image recognition schemes mostly use traditional methods for both early-stage data processing and later-stage feature extraction, and do not take the particularity of Chinese character forms into account, so their accuracy still needs to be improved.
Disclosure of Invention
In view of the foregoing, it is desirable to provide an image recognition method, apparatus, device and medium, which can implement image recognition by combining with artificial intelligence means, improve the accuracy of image text recognition, and also improve the recognition performance.
An image recognition method, the image recognition method comprising:
obtaining a sample image, and carrying out gray processing on the sample image to obtain a first sample;
determining a target threshold value according to the first sample by utilizing a normal distribution algorithm;
preprocessing the first sample according to the target threshold value to obtain a second sample;
obtaining a pre-configured cutting template, and cutting the second sample according to the cutting template to obtain a third sample;
acquiring a digital feature, an English feature and a Chinese character feature from the third sample;
extracting the characteristic value of each Chinese character characteristic by taking the Chinese character strokes as basic elements, and constructing an initial Chinese character sample according to the characteristic value of each Chinese character characteristic;
acquiring a rarely-used word dictionary and a shape-near word dictionary which are configured in advance, and adding the rarely-used word dictionary and the shape-near word dictionary to the initial Chinese character sample to obtain a target Chinese character sample;
training a preset network according to the digital features, the English features and the target Chinese character sample to obtain a recognition model;
and acquiring an image to be recognized, recognizing the image to be recognized by using the recognition model, and generating a recognition result according to output data of the recognition model.
According to a preferred embodiment of the present invention, the graying the sample image to obtain a first sample includes:
obtaining the R value, the G value and the B value of each sample image;
determining a first weight corresponding to the R value, a second weight corresponding to the G value, and a third weight corresponding to the B value;
calculating a weighted average value according to the R value, the G value and the B value of each sample image, the first weight, the second weight and the third weight to obtain a gray value of each sample image;
and converting each sample image according to the gray value of each sample image to obtain the first sample.
According to a preferred embodiment of the present invention, the determining the target threshold according to the first sample by using a normal distribution algorithm includes:
identifying a text image and a background image of each of the first samples;
acquiring the density of the character image of each first sample and acquiring the density of the background image of each first sample;
acquiring the proportion of the character image pixels of each first sample as a first proportion, and calculating the proportion of the background image pixels of each first sample as a second proportion according to the first proportion;
calculating the mixed probability density of the text image and the background image of each first sample according to the first ratio, the second ratio, the density of the text image of each first sample and the density of the background image of each first sample;
acquiring an initial threshold value;
calculating the error probability sum of the text image and the background image of each first sample according to the initial threshold value and the mixed probability density of the text image and the background image of each first sample;
and when the error probability sum is the minimum value, acquiring the value of the initial threshold value as the target threshold value.
According to a preferred embodiment of the present invention, the preprocessing the first sample according to the target threshold to obtain a second sample includes:
carrying out binarization processing on the first sample according to the target threshold value to obtain a first image set;
carrying out noise reduction processing on the first image set to obtain a second image set;
calculating the angle of the characteristic connecting line in the second image set by adopting a Hough transform algorithm;
and correcting the angle of the characteristic connecting line in the second image set to a horizontal position according to a rotation algorithm to obtain the second sample.
According to a preferred embodiment of the present invention, the extracting of the feature value of each Chinese character feature by taking Chinese character strokes as basic elements includes:
carrying out region division on each Chinese character feature to obtain a transverse subgraph, a longitudinal subgraph and an oblique subgraph of each Chinese character feature;
randomly acquiring pixel points from each Chinese character characteristic as initial pixel points;
determining the initial pixel points as starting points, and detecting black pixel points in a transverse subgraph, a longitudinal subgraph and an oblique subgraph of each Chinese character feature;
determining transverse strokes according to the number and the length of black pixel points in the detected transverse subgraph of each Chinese character feature;
determining longitudinal strokes according to the number and the length of black pixel points in the detected longitudinal subgraph of each Chinese character feature;
determining oblique strokes according to the number and the length of the black pixel points in the detected oblique subgraph of each Chinese character characteristic;
and constructing a characteristic value of each Chinese character characteristic according to the transverse stroke, the longitudinal stroke and the oblique stroke of each Chinese character characteristic.
According to a preferred embodiment of the present invention, the constructing of the initial Chinese character sample according to the feature value of each Chinese character feature includes:
acquiring a length threshold of a transverse stroke of each Chinese character characteristic as a transverse length threshold, acquiring a length threshold of a longitudinal stroke of each Chinese character characteristic as a longitudinal length threshold, and acquiring a length threshold of an oblique stroke of each Chinese character characteristic as an oblique length threshold;
acquiring the length of the transverse stroke, the length of the longitudinal stroke and the length of the oblique stroke of each Chinese character feature from the feature value of each Chinese character feature;
when the length of a transverse stroke of a Chinese character feature is detected to be greater than or equal to the transverse length threshold, determining the detected transverse stroke as a target transverse stroke;
when the length of a longitudinal stroke of a Chinese character feature is detected to be greater than or equal to the longitudinal length threshold, determining the detected longitudinal stroke as a target longitudinal stroke;
when the length of an oblique stroke of a Chinese character feature is detected to be greater than or equal to the oblique length threshold, determining the detected oblique stroke as a target oblique stroke;
and constructing Chinese character information corresponding to each Chinese character characteristic according to the target transverse stroke, the target longitudinal stroke and the target oblique stroke corresponding to each Chinese character characteristic to obtain the initial Chinese character sample.
According to a preferred embodiment of the present invention, the generating a recognition result according to the output data of the recognition model includes:
calling a pre-configured character library;
acquiring all features in the output data;
matching in the character library by using all features in the output data;
and determining the characters that match all the features as the recognition result.
An image recognition device, the image recognition device comprising:
the processing unit is used for acquiring a sample image and carrying out gray processing on the sample image to obtain a first sample;
a determining unit, configured to determine a target threshold according to the first sample by using a normal distribution algorithm;
the processing unit is further configured to pre-process the first sample according to the target threshold to obtain a second sample;
the cutting unit is used for obtaining a pre-configured cutting template and cutting the second sample according to the cutting template to obtain a third sample;
the acquisition unit is used for acquiring digital characteristics, English characteristics and Chinese character characteristics from the third sample;
the construction unit is used for extracting the characteristic value of each Chinese character characteristic by taking the Chinese character strokes as basic elements and constructing an initial Chinese character sample according to the characteristic value of each Chinese character characteristic;
the adding unit is used for acquiring a pre-configured rarely-used word dictionary and a shape-near word dictionary, and adding the rarely-used word dictionary and the shape-near word dictionary to the initial Chinese character sample to obtain a target Chinese character sample;
the training unit is used for training a preset network according to the digital features, the English features and the target Chinese character samples to obtain a recognition model;
and the identification unit is used for acquiring an image to be identified, identifying the image to be identified by using the identification model and generating an identification result according to the output data of the identification model.
A computer device, the computer device comprising:
a memory storing at least one instruction; and
a processor executing instructions stored in the memory to implement the image recognition method.
A computer-readable storage medium having stored therein at least one instruction for execution by a processor in a computer device to implement the image recognition method.
According to the above technical solution, the invention obtains a sample image and performs graying processing on it to obtain a first sample; graying the color image reduces the complexity of image processing. A target threshold is determined from the first sample by a normal distribution algorithm, so that the threshold for binarization processing is determined accurately, characters are more clearly distinguished from the background, and misjudgment is avoided. The first sample is preprocessed according to the target threshold to obtain a second sample. A pre-configured cutting template is obtained and the second sample is segmented according to it to obtain a third sample; performing feature extraction after segmentation is more targeted and improves recognition accuracy and recognition efficiency. Digital features, English features and Chinese character features are acquired from the third sample. The feature value of each Chinese character feature is extracted with Chinese character strokes as the basic elements, and an initial Chinese character sample is constructed from these feature values; this reflects both the structural differences between Chinese characters and the structural similarities of shape-near characters, and optimizes the extraction of Chinese character features so that the extracted features are more accurate. A pre-configured rarely-used word dictionary and shape-near word dictionary are obtained and added to the initial Chinese character sample to obtain a target Chinese character sample; introducing these dictionaries on the basis of the constructed Chinese character sample allows targeted training and effectively improves the recognition accuracy for rarely-used characters and shape-near characters. A preset network is trained according to the digital features, the English features and the target Chinese character sample to obtain a recognition model. An image to be recognized is then acquired and recognized with the recognition model, and a recognition result is generated from the output data of the recognition model. Image recognition is thus realized by artificial intelligence means, improving both the accuracy of image text recognition and the recognition performance.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the image recognition method of the present invention.
FIG. 2 is a functional block diagram of an image recognition apparatus according to a preferred embodiment of the present invention.
FIG. 3 is a schematic structural diagram of a computer device for implementing the image recognition method according to the preferred embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flow chart of a preferred embodiment of the image recognition method according to the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
The image recognition method is applied to one or more computer devices. A computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device may be any electronic product capable of human-computer interaction with a user, for example, a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game console, an Internet Protocol Television (IPTV), a smart wearable device, and the like.
The computer device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a Cloud Computing (Cloud Computing) based Cloud consisting of a large number of hosts or network servers.
The Network in which the computer device is located includes, but is not limited to, the internet, a wide area Network, a metropolitan area Network, a local area Network, a Virtual Private Network (VPN), and the like.
S10, acquiring a sample image, and performing graying processing on the sample image to obtain a first sample.
In this embodiment, the sample image may be obtained by using a web crawler technology, or may be directly obtained from a designated database, which is not limited in the present invention.
The designated database may include any enterprise or platform database in which sufficient images are stored for training.
In at least one embodiment of the present invention, the graying the sample image to obtain the first sample includes:
obtaining R (RED), G (GREEN), B (BLUE) values for each sample image;
determining a first weight corresponding to the R value, a second weight corresponding to the G value, and a third weight corresponding to the B value;
calculating a weighted average value according to the R value, the G value and the B value of each sample image, the first weight, the second weight and the third weight to obtain a gray value of each sample image;
and converting each sample image according to the gray value of each sample image to obtain the first sample.
It can be understood that the sample image is usually a color image, and therefore, in order to reduce the processing amount, the sample image needs to be grayed, that is, the original image is converted into a grayscale image.
For example, after a number of tests, suitable weights are obtained: the processing effect is better when the first weight wR is 0.36, the second weight wG is 0.54, and the third weight wB is 0.10. The weighted average is then calculated and assigned to all three channels: R = G = B = R × wR + G × wG + B × wB = 0.36R + 0.54G + 0.10B, where on the right-hand side R denotes the R value, G denotes the G value, and B denotes the B value.
According to the embodiment, the color image can be subjected to gray scale processing, and the complexity of image processing is reduced.
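The weighted average described in this step can be illustrated with a minimal sketch. It uses the example weights given above; representing an image as a nested list of (R, G, B) tuples and the function names `to_gray` and `gray_image` are assumptions made only for illustration.

```python
# Example weights from the text (wR, wG, wB); other weights may suit other data.
W_R, W_G, W_B = 0.36, 0.54, 0.10


def to_gray(pixel):
    """Return the weighted-average gray value of one (R, G, B) pixel as an int."""
    r, g, b = pixel
    return int(round(r * W_R + g * W_G + b * W_B))


def gray_image(image):
    """Convert a nested list of (R, G, B) pixels into a nested list of gray values."""
    return [[to_gray(pixel) for pixel in row] for row in image]
```

Because the three weights sum to 1, a pure white pixel (255, 255, 255) stays at 255 and a pure black pixel stays at 0.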
S11, determining a target threshold according to the first sample by using a normal distribution algorithm.
It can be understood that, in the processing of a Chinese character image, the gray values of the Chinese character pixels are normally distributed, and the gray values of the background are also normally distributed. When both are shown in the same gray-level histogram, two peaks and one valley are formed. The threshold is selected to separate the Chinese character pixels from the background pixels, and it is chosen as the gray value at the valley. In actual processing, because the image to be recognized is affected by environmental factors such as illumination, the difference between the peaks and the valley is not particularly obvious, which affects the selection of the threshold. Too high a threshold will cause Chinese character pixels to be removed as background. Therefore, the target threshold needs to be determined first.
In at least one embodiment of the present invention, the determining the target threshold from the first sample using a normal distribution algorithm includes:
identifying a text image and a background image of each of the first samples;
acquiring the density of the character image of each first sample and acquiring the density of the background image of each first sample;
acquiring the proportion of the character image pixels of each first sample as a first proportion, and calculating the proportion of the background image pixels of each first sample as a second proportion according to the first proportion;
calculating the mixed probability density of the text image and the background image of each first sample according to the first ratio, the second ratio, the density of the text image of each first sample and the density of the background image of each first sample;
acquiring an initial threshold value;
calculating the error probability sum of the text image and the background image of each first sample according to the initial threshold value and the mixed probability density of the text image and the background image of each first sample;
and when the error probability sum is the minimum value, acquiring the value of the initial threshold value as the target threshold value.
For example, let the density, mean and variance of the text image be P1(x), u and m², respectively, and let the density, mean and variance of the background image be P2(x), v and n², respectively. Assuming that the text image pixel proportion is Q and the background image pixel proportion is (1-Q), the mixed probability density of the text image and the background image of each first sample is:
P(x)=QP1(x)+(1-Q)P2(x)
assuming that the selected initial threshold is T, the probability that the character image pixel point is misjudged is as follows:
[Equation image BDA0003190171000000091, giving C1(T), is not reproduced in this text]
the probability of mistaking the background image as a Chinese character is:
[Equation image BDA0003190171000000092, giving C2(T), is not reproduced in this text]
therefore, the sum of the error probabilities of the text image and the background image of each first sample can be obtained as follows:
C(T)=QC1(T)+(1-Q)C2(T)
wherein k is a positive integer.
In order to minimize the probability of error, Q is 0.5 by mathematical calculation,
[Equation image BDA0003190171000000093, giving the resulting value of T, is not reproduced in this text]
at this time, the value of the current T is determined as the target threshold.
Through the embodiment, the threshold value of the binarization processing can be accurately determined, so that the characters and the background are more obviously distinguished, and the misjudgment is avoided.
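The search for the threshold that minimizes C(T) = QC1(T) + (1-Q)C2(T) can be sketched numerically. The expressions for C1(T) and C2(T) are available only as images in this text, so the sketch assumes the common minimum-error form: text pixels are darker than the background (u < v), C1(T) is the probability that a text pixel has a gray value above T, and C2(T) is the probability that a background pixel has a gray value at or below T. The function names and the 0 to 255 search range are likewise assumptions.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def error_sum(t, q, u, m, v, n):
    """C(T) = Q*C1(T) + (1-Q)*C2(T) under the assumptions stated above.

    u, m: mean and standard deviation of the text pixels (variance m**2).
    v, n: mean and standard deviation of the background pixels (variance n**2).
    """
    c1 = 1.0 - normal_cdf(t, u, m)  # text pixel above T, misjudged as background
    c2 = normal_cdf(t, v, n)        # background pixel at or below T, misjudged as text
    return q * c1 + (1.0 - q) * c2


def find_threshold(q, u, m, v, n, levels=range(256)):
    """Return the gray level in `levels` with the smallest error probability sum."""
    return min(levels, key=lambda t: error_sum(t, q, u, m, v, n))
```

Under these assumptions, with Q = 0.5 and equal variances the minimum falls midway between the two means; for instance, `find_threshold(0.5, 60, 15, 180, 15)` returns 120.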
S12, preprocessing the first sample according to the target threshold to obtain a second sample.
In at least one embodiment of the present invention, the preprocessing the first sample according to the target threshold to obtain a second sample includes:
carrying out binarization processing on the first sample according to the target threshold value to obtain a first image set;
carrying out noise reduction processing on the first image set to obtain a second image set;
calculating the angle of the characteristic connecting line in the second image set by adopting a Hough transform algorithm;
and correcting the angle of the characteristic connecting line in the second image set to a horizontal position according to a rotation algorithm to obtain the second sample.
In this embodiment, binarization processing is used to convert the first sample into a picture containing only black and white, which facilitates targeted feature processing.
Further, in practice, digital images are often affected by noise from the imaging device and the external environment during digitization and transmission; such images are called noisy images. The process of reducing noise in a digital image is referred to as image denoising. Noise in images has many sources and may arise during image acquisition, transmission, compression, and so on. The types of noise also differ, such as salt-and-pepper noise and Gaussian noise, and different processing algorithms exist for different kinds of noise, which are not described in detail here.
Furthermore, when image recognition is performed, the uploaded pictures are not necessarily horizontal. Therefore, in most image preprocessing, the picture needs to be rotated by a program so that it is as close to horizontal as possible, which makes it easier to cut and yields a good picture. When performing tilt correction, a Hough transform algorithm is used to bring the picture to the horizontal. The principle of the Hough transform is to connect discontinuous characters into a straight line, which is convenient for straight line detection. After the angle of the straight line is calculated, the tilted picture is corrected to the correct horizontal position by a rotation algorithm, thereby correcting the characters in the image.
Through this embodiment, the sample can be preprocessed, which facilitates subsequent use.
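The binarization and angle-estimation parts of this step can be sketched as follows. This is a simplified illustration, not the patented procedure: it assumes dark text on a light background, votes in a small Hough accumulator over whole-degree angles, and leaves out both noise reduction and the rotation itself. The function names and the `angle_range` parameter are hypothetical.

```python
import math


def binarize(gray, threshold):
    """Map gray values to 1 (text, assumed dark: value <= threshold) or 0 (background)."""
    return [[1 if value <= threshold else 0 for value in row] for row in gray]


def hough_skew_angle(binary, angle_range=15):
    """Estimate the tilt (in whole degrees) of the dominant line of text pixels.

    For each candidate tilt a, every text pixel votes for the line
    x*cos(theta) + y*sin(theta) = rho with theta = a + 90 degrees.
    The tilt whose best rho bin collects the most votes is returned.
    """
    best_angle, best_votes = 0, -1
    for a in range(-angle_range, angle_range + 1):
        theta = math.radians(a + 90)  # normal direction of a line tilted by a
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        votes = {}
        for y, row in enumerate(binary):
            for x, value in enumerate(row):
                if value:
                    rho = int(round(x * cos_t + y * sin_t))
                    votes[rho] = votes.get(rho, 0) + 1
        peak = max(votes.values(), default=0)
        if peak > best_votes:
            best_angle, best_votes = a, peak
    return best_angle
```

The returned angle would then be passed to a rotation routine that turns the picture by the opposite amount. In practice an image library would normally be used for both the Hough transform and the rotation.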
S13, obtaining a pre-configured cutting template, and segmenting the second sample according to the cutting template to obtain a third sample.
It can be understood that, after the tilt correction of the picture is completed, the image can be segmented into blocks and characters.
The pictures to be recognized may be identity cards, bank cards, business licenses and various other documents. Because the formats of such pictures are fixed and follow certain regularities, a cutting template for each of these picture types is configured in advance. The second sample is segmented according to the cutting template and feature extraction is then performed, which is more targeted and improves recognition accuracy and recognition efficiency.
For example, the layouts of identity cards, bank cards and driving licenses can be distinguished by extracting the feature values of the various images. During image segmentation, a point where the pixel count is zero is a segmentation point of a field, that is, a segmentation position. Column cut points can be obtained from the vertical histogram and row cut points from the horizontal histogram, from which information such as row height and column width is obtained, and the second sample is quickly segmented by using this information.
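The histogram-based cut points described above can be sketched as follows. The sketch assumes a binarized image stored as a nested list in which 1 marks a text pixel; the function names `projection` and `cut_segments` are hypothetical.

```python
def projection(binary, axis):
    """Count text pixels per row (axis='row') or per column (axis='col')."""
    if axis == "row":
        return [sum(row) for row in binary]
    return [sum(column) for column in zip(*binary)]


def cut_segments(profile):
    """Return (start, end) pairs, end exclusive, of runs where the profile is nonzero.

    Positions where the profile is zero separate the fields, so each pair
    marks one text row or one text column.
    """
    segments, start = [], None
    for index, count in enumerate(profile):
        if count and start is None:
            start = index
        elif not count and start is not None:
            segments.append((start, index))
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments
```

Row segments come from `cut_segments(projection(binary, "row"))`. Applying the column projection to the rows of one segment then gives the column cut points within that line, and the row height and column width follow from the differences `end - start`.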
S14, acquiring digital features, English features and Chinese character features from the third sample.
It can be understood that, because the number of English letters and digits is limited, the digital features and the English features are easy to extract and process, whereas Chinese characters are highly complex and are not well suited to being recognized together with digits and English letters. Therefore, in this embodiment the digital features, the English features and the Chinese character features are acquired separately for subsequent targeted processing, which improves the processing effect.
S15, extracting the feature value of each Chinese character feature by taking Chinese character strokes as basic elements, and constructing an initial Chinese character sample according to the feature value of each Chinese character feature.
In at least one embodiment of the present invention, the extracting of the feature value of each Chinese character feature by taking Chinese character strokes as basic elements includes:
carrying out region division on each Chinese character feature to obtain a transverse subgraph, a longitudinal subgraph and an oblique subgraph of each Chinese character feature;
randomly acquiring pixel points from each Chinese character characteristic as initial pixel points;
determining the initial pixel points as starting points, and detecting black pixel points in a transverse subgraph, a longitudinal subgraph and an oblique subgraph of each Chinese character feature;
determining transverse strokes according to the number and the length of black pixel points in the detected transverse subgraph of each Chinese character feature;
determining longitudinal strokes according to the number and the length of black pixel points in the detected longitudinal subgraph of each Chinese character feature;
determining oblique strokes according to the number and the length of the black pixel points in the detected oblique subgraph of each Chinese character characteristic;
and constructing a characteristic value of each Chinese character characteristic according to the transverse stroke, the longitudinal stroke and the oblique stroke of each Chinese character characteristic.
Specifically, the Chinese character features are extracted as follows. First, the Chinese character to be recognized is divided into regions and decomposed into a horizontal subgraph, a vertical subgraph and an oblique subgraph. The initial value of the intersection count is set to zero and an image handle is obtained; pixel points are then detected from left to right, so that the number and the length of each basic stroke of the Chinese character, such as horizontal strokes and vertical strokes, can be extracted. For example, when horizontal strokes are processed, the initial horizontal-stroke count is set to 0 and each pixel point is detected in turn from left to right, beginning at the starting point. When a black pixel point is detected, its position is marked as the starting point of a horizontal stroke and the count is incremented by 1. The count is then incremented by 1 for each further pixel point of the horizontal stroke, and detection stops when a blank pixel is detected.
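The left-to-right scan described above can be sketched as follows. This is a minimal illustration, assuming the horizontal subgraph is a binary grid in which 1 marks a black pixel; the function name `horizontal_runs` and the toy glyph are hypothetical. Each maximal run of black pixels in a row is reported with its starting column and its length.

```python
from typing import List, Tuple


def horizontal_runs(grid: List[List[int]]) -> List[Tuple[int, int, int]]:
    """Return (row, start_col, length) for every maximal run of black pixels."""
    runs = []
    for r, row in enumerate(grid):
        c = 0
        while c < len(row):
            if row[c] == 1:
                start = c                      # starting point of a candidate stroke
                while c < len(row) and row[c] == 1:
                    c += 1                     # extend until a blank pixel is met
                runs.append((r, start, c - start))
            else:
                c += 1
    return runs


# Toy glyph (assumed data): one long horizontal bar crossed by a vertical bar.
glyph = [
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
runs = horizontal_runs(glyph)
longest = max(length for _, _, length in runs)
```

Vertical strokes can be scanned with the same function applied to the transposed grid, for example `horizontal_runs([list(col) for col in zip(*glyph)])`.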
Further, the constructing of the initial Chinese character sample according to the feature value of each Chinese character feature includes:
acquiring a length threshold of a transverse stroke of each Chinese character characteristic as a transverse length threshold, acquiring a length threshold of a longitudinal stroke of each Chinese character characteristic as a longitudinal length threshold, and acquiring a length threshold of an oblique stroke of each Chinese character characteristic as an oblique length threshold;
acquiring the length of the transverse stroke, the length of the longitudinal stroke and the length of the oblique stroke of each Chinese character feature from the feature value of each Chinese character feature;
when the length of a transverse stroke of a Chinese character feature is detected to be greater than or equal to the transverse length threshold, determining the detected transverse stroke as a target transverse stroke;
when the length of a longitudinal stroke of a Chinese character feature is detected to be greater than or equal to the longitudinal length threshold, determining the detected longitudinal stroke as a target longitudinal stroke;
when the length of an oblique stroke of a Chinese character feature is detected to be greater than or equal to the oblique length threshold, determining the detected oblique stroke as a target oblique stroke;
and constructing Chinese character information corresponding to each Chinese character characteristic according to the target transverse stroke, the target longitudinal stroke and the target oblique stroke corresponding to each Chinese character characteristic to obtain the initial Chinese character sample.
Wherein, the Chinese character information includes, but is not limited to: the number of strokes of each type, the length, the number of intersections, etc.
For example: during design, a 32-by-32 dot matrix can be adopted, and the length of a candidate transverse stroke is compared with one quarter of the side length of the dot matrix. If the length is less than one quarter, the candidate is not considered a transverse stroke and can be removed as noise; only when the length is greater than or equal to one quarter is the candidate considered a transverse stroke, and its length is then recorded. Vertical strokes and oblique strokes (left-falling, right-falling, etc.) are processed in a similar way.
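The length-threshold filtering in this example can be sketched as follows. This is a minimal illustration, assuming candidate stroke lengths have already been measured per stroke type; the function name `build_character_info`, the dictionary layout and the sample lengths are hypothetical. Only the 32-by-32 dot matrix and the one-quarter threshold come from the text.

```python
from typing import Dict, List

GRID_SIZE = 32               # side length of the dot matrix, from the example
QUARTER = GRID_SIZE // 4     # one quarter of the side length, i.e. 8 pixels


def build_character_info(
    run_lengths: Dict[str, List[int]],
    thresholds: Dict[str, int],
) -> Dict[str, Dict[str, object]]:
    """Keep runs at or above the threshold of their type; shorter runs are noise."""
    info = {}
    for kind, lengths in run_lengths.items():
        kept = [n for n in lengths if n >= thresholds[kind]]
        info[kind] = {"count": len(kept), "lengths": kept}
    return info


# Assumed candidate run lengths for one character.
candidates = {"horizontal": [3, 8, 20], "vertical": [7, 12], "oblique": [2]}
thresholds = {"horizontal": QUARTER, "vertical": QUARTER, "oblique": QUARTER}
info = build_character_info(candidates, thresholds)
```

Using one threshold per stroke type keeps the sketch in line with the separate transverse, longitudinal and oblique length thresholds listed above, even though the example sets all three to the same value.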
Through the above embodiment, the Chinese character information of each Chinese character feature can be constructed with Chinese character strokes as basic elements, which reflects not only the differences between Chinese character structures but also the structural commonalities of similarly shaped characters, so that the extraction of Chinese character features is optimized and the extracted features are more accurate.
And S16, acquiring a pre-configured rare word dictionary and a shape-near word dictionary, and adding the rare word dictionary and the shape-near word dictionary to the initial Chinese character sample to obtain a target Chinese character sample.
In this embodiment, the rare word dictionary and the shape-near word dictionary may be custom-configured according to the actual application scenario, which is not limited in the present invention.
In the above embodiment, the rare word dictionary and the shape-near word dictionary are further introduced on the basis of the constructed Chinese character sample, so that targeted training can be performed and the accuracy of recognizing rare characters and similarly shaped characters can be effectively improved.
And S17, training a preset network according to the numeric features, the English features and the target Chinese character sample to obtain a recognition model.
The preset network may be an SVM (Support Vector Machine).
Specifically, in the training process, indexes such as accuracy, recall rate, and F (F-Measure) value of the model may be continuously obtained until the indexes such as accuracy, recall rate, and F value meet requirements, and the training is stopped, so that the recognition model may be obtained.
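The stopping check described above can be sketched as follows. This is a minimal illustration, assuming that "accuracy" is used here in the sense of precision, that the F value is the balanced F1 score, and that the required levels are supplied as hypothetical target values; none of the function names or numbers come from the original text.

```python
from typing import Sequence, Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def meets_requirements(metrics: Sequence[float], targets: Sequence[float]) -> bool:
    """Training may stop once every metric reaches its target."""
    return all(m >= t for m, t in zip(metrics, targets))


# Assumed validation counts and assumed targets.
metrics = precision_recall_f1(tp=90, fp=10, fn=30)
stop_training = meets_requirements(metrics, targets=(0.8, 0.7, 0.8))
```

In a training loop, the counts would be recomputed on a held-out set after each round and the loop would end when `meets_requirements` returns true.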
And S18, acquiring an image to be recognized, recognizing the image to be recognized by using the recognition model, and generating a recognition result according to the output data of the recognition model.
In at least one embodiment of the present invention, the generating a recognition result according to the output data of the recognition model includes:
calling a pre-configured character library;
acquiring all features in the output data;
matching in the character library by using all features in the output data;
and determining the matched character having all the features as the recognition result.
For example: when the output data contains the features horizontal, left-falling and right-falling, and the number of intersections is one, the character "大" ("large") can be matched in the character library.
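The library lookup in this example can be sketched as follows. This is a minimal illustration, assuming each library entry stores the set of stroke types and the number of intersections of a character; the three-entry `CHARACTER_LIBRARY` and the function `match_characters` are hypothetical, and only the entry for "大" ("large") reflects the example in the text.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Features = Tuple[FrozenSet[str], int]

# Hypothetical miniature character library: stroke types and intersection count.
CHARACTER_LIBRARY: Dict[str, Features] = {
    "大": (frozenset({"horizontal", "left-falling", "right-falling"}), 1),
    "十": (frozenset({"horizontal", "vertical"}), 1),
    "人": (frozenset({"left-falling", "right-falling"}), 0),
}


def match_characters(strokes: Iterable[str], intersections: int) -> List[str]:
    """Return every library character whose features all match the output data."""
    wanted = (frozenset(strokes), intersections)
    return [ch for ch, feats in CHARACTER_LIBRARY.items() if feats == wanted]


result = match_characters(["horizontal", "left-falling", "right-falling"], 1)
```

A real library would need richer features (stroke counts and lengths, for instance) to separate characters that share the same stroke types.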
The image is preprocessed through a complete pipeline of graying, binarization, image noise reduction, tilt correction, text segmentation and the like. For the binarization in the image preprocessing stage, a threshold suited to the usage scenario is calculated with appropriate parameters and methods, so that the text content can be effectively distinguished from the rest of the image. Feature extraction divides the Chinese characters into structural regions, after which feature values for the number and length of the basic structures of the Chinese characters, such as horizontal, vertical and oblique strokes, can be obtained respectively. Furthermore, when the feature values are classified and trained, feature training on rare characters and similarly shaped characters is also taken into account, so that both the accuracy and the performance of image text recognition are improved.
Further, the recognition result can be fed back to the user for selection.
It should be noted that, in order to further improve data security and prevent malicious tampering, the recognition model may be stored in a blockchain node.
According to the technical scheme, the image recognition can be realized by combining an artificial intelligence means, the accuracy of image text recognition is improved, and the recognition performance is also improved.
Fig. 2 is a functional block diagram of an image recognition apparatus according to a preferred embodiment of the present invention. The image recognition device 11 comprises a processing unit 110, a determining unit 111, a segmentation unit 112, an obtaining unit 113, a constructing unit 114, an adding unit 115, a training unit 116 and a recognition unit 117. The module/unit referred to in the present invention refers to a series of computer program segments that can be executed by the processor 13 and that can perform a fixed function, and that are stored in the memory 12. In the present embodiment, the functions of the modules/units will be described in detail in the following embodiments.
The processing unit 110 obtains a sample image, and performs a graying process on the sample image to obtain a first sample.
In this embodiment, the sample image may be obtained by using a web crawler technology, or may be directly obtained from a designated database, which is not limited in the present invention.
The designated database may include any enterprise or platform database in which sufficient images are stored for training.
In at least one embodiment of the present invention, the processing unit 110 performing graying processing on the sample image to obtain a first sample includes:
obtaining R (RED), G (GREEN), B (BLUE) values for each sample image;
determining a first weight corresponding to the R value, a second weight corresponding to the G value, and a third weight corresponding to the B value;
calculating a weighted average value according to the R value, the G value and the B value of each sample image, the first weight, the second weight and the third weight to obtain a gray value of each sample image;
and converting each sample image according to the gray value of each sample image to obtain the first sample.
It can be understood that the sample image is usually a color image, and therefore, in order to reduce the processing amount, the sample image needs to be grayed, that is, the original image is converted into a grayscale image.
For example: after multiple tests, suitable weights were obtained: the processing effect is better when the first weight wR is 0.36, the second weight wG is 0.54, and the third weight wB is 0.10. Further, the weighted average is calculated and assigned to all three channels: R = G = B = R × wR + G × wG + B × wB = 0.36R + 0.54G + 0.10B, where on the right-hand side R denotes the R value, G denotes the G value, and B denotes the B value.
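The weighted average above can be sketched as follows. This is a minimal illustration, assuming 8-bit RGB pixels stored as tuples and rounding to the nearest integer gray level; the function names are hypothetical, while the weights 0.36, 0.54 and 0.10 are those given in the example.

```python
from typing import List, Tuple

W_R, W_G, W_B = 0.36, 0.54, 0.10   # weights from the example; they sum to 1


def to_gray(pixel: Tuple[int, int, int]) -> int:
    """Weighted average of the R, G and B values of one pixel."""
    r, g, b = pixel
    return round(r * W_R + g * W_G + b * W_B)


def gray_image(image: List[List[Tuple[int, int, int]]]) -> List[List[int]]:
    """Convert a color image (rows of RGB tuples) to a grayscale image."""
    return [[to_gray(p) for p in row] for row in image]


sample = [[(255, 255, 255), (0, 0, 0)], [(100, 200, 50), (255, 0, 0)]]
gray = gray_image(sample)
```

Because the weights sum to 1, white stays at 255 and black stays at 0, so the gray values remain within the original 0 to 255 range.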
According to the embodiment, the color image can be subjected to gray scale processing, and the complexity of image processing is reduced.
The determining unit 111 determines a target threshold from the first sample using a normal distribution algorithm.
It can be understood that, in the processing of Chinese character images, the pixels of the Chinese characters are normally distributed and the pixels of the background are also normally distributed; when both are plotted in the same gray-level histogram, two peaks and one trough are formed. The threshold is selected to separate the Chinese character pixels from the background pixels, and is taken as the pixel value at the trough. In actual processing, because the image to be recognized is affected by environmental factors such as illumination, the difference between the peaks and the trough is not particularly obvious, which affects the selection of the threshold. Too high a threshold will cause Chinese character pixels to be removed as background. Therefore, the target threshold needs to be determined first.
In at least one embodiment of the present invention, the determining unit 111 determines the target threshold according to the first sample by using a normal distribution algorithm, including:
identifying a text image and a background image of each of the first samples;
acquiring the density of the character image of each first sample and acquiring the density of the background image of each first sample;
acquiring the proportion of the character image pixels of each first sample as a first proportion, and calculating the proportion of the background image pixels of each first sample as a second proportion according to the first proportion;
calculating the mixed probability density of the text image and the background image of each first sample according to the first ratio, the second ratio, the density of the text image of each first sample and the density of the background image of each first sample;
acquiring an initial threshold value;
calculating the error probability sum of the text image and the background image of each first sample according to the initial threshold value and the mixed probability density of the text image and the background image of each first sample;
and when the error probability sum is the minimum value, acquiring the value of the initial threshold value as the target threshold value.
For example, let the density, mean and variance of the text image be P1(x), u and m², respectively, and let the density, mean and variance of the background image be P2(x), v and n², respectively. Assuming that the text image pixel ratio is Q, the background image pixel ratio is (1-Q), and the mixed probability density of the text image and the background image of each first sample can be obtained:
P(x)=QP1(x)+(1-Q)P2(x)
assuming that the selected initial threshold is T, the probability that the character image pixel point is misjudged is as follows:
Figure BDA0003190171000000171
the probability of mistaking the background image as a Chinese character is:
Figure BDA0003190171000000172
therefore, the sum of the error probabilities of the text image and the background image of each first sample can be obtained as follows:
C(T)=QC1(T)+(1-Q)C2(T)
wherein k is a positive integer.
In order to minimize the probability of error, Q is 0.5 by mathematical calculation,
Figure BDA0003190171000000173
at this time, the value of the current T is determined as the target threshold.
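The search for the threshold that minimizes C(T) can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the expressions for C1(T) and C2(T) appear only as figures in the original, so the sketch assumes that P1 and P2 are normal densities, that text pixels are darker than background pixels (u < v), that a text pixel is misjudged when its gray value exceeds T, and that a background pixel is misjudged when its gray value falls below T. The function names and the sample parameters are hypothetical.

```python
import math


def normal_cdf(x: float, mean: float, std: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def error_sum(t: float, q: float, u: float, m: float, v: float, n: float) -> float:
    """C(T) = Q*C1(T) + (1-Q)*C2(T) under the assumptions stated above."""
    c1 = 1.0 - normal_cdf(t, u, m)   # text pixel brighter than T, judged as background
    c2 = normal_cdf(t, v, n)         # background pixel darker than T, judged as text
    return q * c1 + (1.0 - q) * c2


def best_threshold(q: float, u: float, m: float, v: float, n: float) -> int:
    """Scan the 256 gray levels and return the one with the smallest error sum."""
    return min(range(256), key=lambda t: error_sum(t, q, u, m, v, n))


# Assumed parameters: equal standard deviations and Q = 0.5.
target = best_threshold(q=0.5, u=60.0, m=15.0, v=180.0, n=15.0)
```

With Q = 0.5 and equal standard deviations, the two weighted densities cross midway between the means, so the scan settles on the gray level halfway between u and v.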
Through the embodiment, the threshold value of the binarization processing can be accurately determined, so that the characters and the background are more obviously distinguished, and the misjudgment is avoided.
The processing unit 110 preprocesses the first sample according to the target threshold to obtain a second sample.
In at least one embodiment of the present invention, the preprocessing the first sample by the processing unit 110 according to the target threshold to obtain a second sample includes:
carrying out binarization processing on the first sample according to the target threshold value to obtain a first image set;
carrying out noise reduction processing on the first image set to obtain a second image set;
calculating the angle of the characteristic connecting line in the second image set by adopting a Hough transform algorithm;
and correcting the angle of the characteristic connecting line in the second image set to a horizontal position according to a rotation algorithm to obtain the second sample.
In this embodiment, binarization processing is adopted to convert the first sample into a picture including only black and white colors, so that characteristic processing with pertinence is facilitated.
Further, in practice, digital images are often affected by noise from the imaging device and the external environment during digitization and transmission; such images are called noisy images. The process of reducing noise in a digital image is referred to as image denoising. There are many sources of noise in images, arising from image acquisition, transmission, compression and other stages. The types of noise also differ, such as salt-and-pepper noise and Gaussian noise, and different processing algorithms exist for different kinds of noise, which are not described herein.
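The binarization and noise-reduction steps can be sketched as follows. This is a minimal illustration, assuming dark text on a lighter background so that pixels below the target threshold become 1 (black) and the rest become 0 (white). The text does not specify a denoising algorithm, so removing isolated black pixels is used here only as one simple assumed option for salt-and-pepper noise; both function names are hypothetical.

```python
from typing import List


def binarize(gray: List[List[int]], threshold: int) -> List[List[int]]:
    """Map pixels darker than the threshold to 1 (text) and the rest to 0."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def remove_isolated(binary: List[List[int]]) -> List[List[int]]:
    """Clear black pixels that have no black neighbor among the 8 surrounding pixels."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1:
                continue
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                out[y][x] = 0      # a lone speck is treated as noise
    return out


# Assumed gray values: a two-pixel stroke plus one isolated dark speck.
gray = [
    [200, 200, 200, 200],
    [200, 50, 60, 200],
    [200, 200, 200, 200],
    [40, 200, 200, 200],
]
first_image = binarize(gray, threshold=120)
second_image = remove_isolated(first_image)
```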
Furthermore, when image recognition is performed, the uploaded pictures are often not horizontal. Therefore, during image preprocessing the pictures usually need to be rotated programmatically so that they are as close to horizontal as possible, which makes it easier to cut the pictures and obtain a good result. When tilt correction is performed, the Hough transform algorithm is used to bring the picture to the horizontal. The principle of the Hough transform is to connect discontinuous characters into a straight line, which facilitates straight-line detection. After the angle of the straight line is calculated, the tilted picture is corrected to the horizontal position by a rotation algorithm, thereby correcting the characters in the image.
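The angle calculation by Hough transform can be sketched as follows. This is a minimal illustration, assuming the black pixels of a text line are given as (x, y) coordinates and that the tilt lies within plus or minus 15 degrees; the function name `estimate_skew_degrees` and the search range are hypothetical. Each candidate angle collects votes in the line parameterization rho = x*cos(theta) + y*sin(theta), and the angle whose best line gathers the most votes is taken as the tilt.

```python
import math
from typing import Iterable, Tuple


def estimate_skew_degrees(
    points: Iterable[Tuple[int, int]],
    angle_range: int = 15,
) -> int:
    """Return the tilt, in whole degrees, of the dominant near-horizontal line.

    A horizontal line has its normal at theta = 90 degrees, so a line tilted
    by `skew` degrees is searched at theta = 90 + skew.
    """
    pts = list(points)
    if not pts:
        return 0
    best_votes, best_skew = -1, 0
    for skew in range(-angle_range, angle_range + 1):
        theta = math.radians(90 + skew)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        votes = {}
        for x, y in pts:
            rho = round(x * cos_t + y * sin_t)   # accumulator bin for this point
            votes[rho] = votes.get(rho, 0) + 1
        peak = max(votes.values())
        if peak > best_votes:
            best_votes, best_skew = peak, skew
    return best_skew


# Assumed test data: pixels along a line tilted by 10 degrees.
points = [(x, round(x * math.tan(math.radians(10)))) for x in range(100)]
skew = estimate_skew_degrees(points)
```

Rotating the picture by the negative of the estimated angle would then bring the line back to the horizontal; the sign convention depends on whether the y axis points up or down in the image coordinate system.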
Through the above embodiment, the sample can be preprocessed, which facilitates subsequent use.
The segmentation unit 112 obtains a preconfigured cutting template, and segments the second sample according to the cutting template to obtain a third sample.
It can be understood that after the inclination correction of the picture is completed, the segmentation of the image into blocks and characters can be performed.
Before segmentation, the picture to be recognized may be an identity card, a bank card, a business license, or another kind of picture. Because the formats of such pictures are fixed and follow certain regularities, cutting templates for these types of pictures are configured in advance, the second sample is segmented according to the cutting template, and feature extraction is then performed, which makes the processing targeted and improves recognition accuracy and efficiency.
For example: the layouts of identity cards, bank cards, driving licenses and the like can be distinguished by extracting the feature values of the various images. During image segmentation, a point where the pixel count of the histogram is zero is a segmentation point of a field, that is, a cut position. Column cut points can be obtained from the vertical histogram and row cut points from the horizontal histogram, from which information such as row height and column width is further obtained, and the second sample is rapidly segmented by using this information.
The obtaining unit 113 obtains numeric features, English features and Chinese character features from the third sample.
It can be understood that, because the number of English letters and digits is limited, the numeric features and the English features are easy to extract and process, whereas Chinese characters are highly complex and are not well suited to being recognized together with the numeric features and the English features. Therefore, in this embodiment the numeric features, the English features and the Chinese character features are obtained separately for subsequent targeted processing, thereby improving the processing effect.
The constructing unit 114 extracts a feature value of each Chinese character feature by using Chinese character strokes as basic elements, and constructs an initial Chinese character sample according to the feature value of each Chinese character feature.
In at least one embodiment of the present invention, the constructing unit 114 extracting the feature value of each Chinese character feature by using Chinese character strokes as basic elements includes:
carrying out region division on each Chinese character feature to obtain a transverse subgraph, a longitudinal subgraph and an oblique subgraph of each Chinese character feature;
randomly acquiring pixel points from each Chinese character characteristic as initial pixel points;
determining the initial pixel points as starting points, and detecting black pixel points in a transverse subgraph, a longitudinal subgraph and an oblique subgraph of each Chinese character feature;
determining transverse strokes according to the number and the length of black pixel points in the detected transverse subgraph of each Chinese character feature;
determining longitudinal strokes according to the number and the length of black pixel points in the detected longitudinal subgraph of each Chinese character feature;
determining oblique strokes according to the number and the length of the black pixel points in the detected oblique subgraph of each Chinese character characteristic;
and constructing a characteristic value of each Chinese character characteristic according to the transverse stroke, the longitudinal stroke and the oblique stroke of each Chinese character characteristic.
Specifically, the Chinese character features are extracted as follows. First, the Chinese character to be recognized is divided into regions and decomposed into a horizontal subgraph, a vertical subgraph and an oblique subgraph. The initial value of the intersection count is set to zero and an image handle is obtained; pixel points are then detected from left to right, so that the number and the length of each basic stroke of the Chinese character, such as horizontal strokes and vertical strokes, can be extracted. For example, when horizontal strokes are processed, the initial horizontal-stroke count is set to 0 and each pixel point is detected in turn from left to right, beginning at the starting point. When a black pixel point is detected, its position is marked as the starting point of a horizontal stroke and the count is incremented by 1. The count is then incremented by 1 for each further pixel point of the horizontal stroke, and detection stops when a blank pixel is detected.
Further, the constructing unit 114 constructing the initial Chinese character sample according to the feature value of each Chinese character feature includes:
acquiring a length threshold of a transverse stroke of each Chinese character characteristic as a transverse length threshold, acquiring a length threshold of a longitudinal stroke of each Chinese character characteristic as a longitudinal length threshold, and acquiring a length threshold of an oblique stroke of each Chinese character characteristic as an oblique length threshold;
acquiring the length of the transverse stroke, the length of the longitudinal stroke and the length of the oblique stroke of each Chinese character feature from the feature value of each Chinese character feature;
when the length of a transverse stroke of a Chinese character feature is detected to be greater than or equal to the transverse length threshold, determining the detected transverse stroke as a target transverse stroke;
when the length of a longitudinal stroke of a Chinese character feature is detected to be greater than or equal to the longitudinal length threshold, determining the detected longitudinal stroke as a target longitudinal stroke;
when the length of an oblique stroke of a Chinese character feature is detected to be greater than or equal to the oblique length threshold, determining the detected oblique stroke as a target oblique stroke;
and constructing Chinese character information corresponding to each Chinese character characteristic according to the target transverse stroke, the target longitudinal stroke and the target oblique stroke corresponding to each Chinese character characteristic to obtain the initial Chinese character sample.
Wherein, the Chinese character information includes, but is not limited to: the number of strokes of each type, the length, the number of intersections, etc.
For example: during design, a 32-by-32 dot matrix can be adopted, and the length of a candidate transverse stroke is compared with one quarter of the side length of the dot matrix. If the length is less than one quarter, the candidate is not considered a transverse stroke and can be removed as noise; only when the length is greater than or equal to one quarter is the candidate considered a transverse stroke, and its length is then recorded. Vertical strokes and oblique strokes (left-falling, right-falling, etc.) are processed in a similar way.
Through the above embodiment, the Chinese character information of each Chinese character feature can be constructed with Chinese character strokes as basic elements, which reflects not only the differences between Chinese character structures but also the structural commonalities of similarly shaped characters, so that the extraction of Chinese character features is optimized and the extracted features are more accurate.
The adding unit 115 obtains a pre-configured rare word dictionary and a shape-near word dictionary, and adds the rare word dictionary and the shape-near word dictionary to the initial Chinese character sample to obtain a target Chinese character sample.
In this embodiment, the rare word dictionary and the shape-near word dictionary may be custom-configured according to the actual application scenario, which is not limited in the present invention.
In the above embodiment, the rare word dictionary and the shape-near word dictionary are further introduced on the basis of the constructed Chinese character sample, so that targeted training can be performed and the accuracy of recognizing rare characters and similarly shaped characters can be effectively improved.
The training unit 116 trains a preset network according to the numeric features, the English features and the target Chinese character sample to obtain a recognition model.
The preset network may be an SVM (Support Vector Machine).
Specifically, in the training process, indexes such as accuracy, recall rate, and F (F-Measure) value of the model may be continuously obtained until the indexes such as accuracy, recall rate, and F value meet requirements, and the training is stopped, so that the recognition model may be obtained.
The recognition unit 117 acquires an image to be recognized, recognizes the image to be recognized using the recognition model, and generates a recognition result from output data of the recognition model.
In at least one embodiment of the present invention, the recognition unit 117 generating the recognition result according to the output data of the recognition model includes:
calling a pre-configured character library;
acquiring all features in the output data;
matching in the character library by using all features in the output data;
and determining the matched character having all the features as the recognition result.
For example: when the output data contains the features horizontal, left-falling and right-falling, and the number of intersections is one, the character "大" ("large") can be matched in the character library.
The image is preprocessed through a complete pipeline of graying, binarization, image noise reduction, tilt correction, text segmentation and the like. For the binarization in the image preprocessing stage, a threshold suited to the usage scenario is calculated with appropriate parameters and methods, so that the text content can be effectively distinguished from the rest of the image. Feature extraction divides the Chinese characters into structural regions, after which feature values for the number and length of the basic structures of the Chinese characters, such as horizontal, vertical and oblique strokes, can be obtained respectively. Furthermore, when the feature values are classified and trained, feature training on rare characters and similarly shaped characters is also taken into account, so that both the accuracy and the performance of image text recognition are improved.
Further, the recognition result can be fed back to the user for selection.
It should be noted that, in order to further improve data security and prevent malicious tampering, the recognition model may be stored in a blockchain node.
According to the technical scheme, the image recognition can be realized by combining an artificial intelligence means, the accuracy of image text recognition is improved, and the recognition performance is also improved.
Fig. 3 is a schematic structural diagram of a computer device according to a preferred embodiment of the present invention for implementing an image recognition method.
The computer device 1 may comprise a memory 12, a processor 13 and a bus, and may further comprise a computer program, such as an image recognition program, stored in the memory 12 and executable on the processor 13.
It will be understood by those skilled in the art that the schematic diagram is merely an example of the computer device 1, and does not constitute a limitation to the computer device 1, the computer device 1 may have a bus-type structure or a star-shaped structure, the computer device 1 may further include more or less other hardware or software than those shown, or different component arrangements, for example, the computer device 1 may further include an input and output device, a network access device, etc.
It should be noted that the computer device 1 is only an example; other existing or future electronic products that can be adapted to the present invention should also be included in the scope of protection of the present invention, and are incorporated herein by reference.
The memory 12 includes at least one type of readable storage medium, which includes flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, etc. The memory 12 may in some embodiments be an internal storage unit of the computer device 1, for example a removable hard disk of the computer device 1. The memory 12 may also be an external storage device of the computer device 1 in other embodiments, such as a plug-in removable hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the computer device 1. Further, the memory 12 may also include both an internal storage unit and an external storage device of the computer device 1. The memory 12 can be used not only for storing application software installed in the computer apparatus 1 and various kinds of data such as codes of an image recognition program, etc., but also for temporarily storing data that has been output or is to be output.
The processor 13 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 13 is a Control Unit (Control Unit) of the computer device 1, connects various components of the entire computer device 1 by using various interfaces and lines, and executes various functions and processes data of the computer device 1 by running or executing programs or modules (for example, executing an image recognition program and the like) stored in the memory 12 and calling data stored in the memory 12.
The processor 13 executes the operating system of the computer device 1 and various installed application programs. The processor 13 executes the application program to implement the steps in the various image recognition method embodiments described above, such as the steps shown in fig. 1.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to accomplish the present invention. The one or more modules/units may be a series of computer readable instruction segments capable of performing certain functions, which are used to describe the execution of the computer program in the computer device 1. For example, the computer program may be divided into a processing unit 110, a determining unit 111, a slicing unit 112, an obtaining unit 113, a building unit 114, an adding unit 115, a training unit 116, a recognition unit 117.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions that enable a computer device (which may be a personal computer, a computer device, or a network device) or a processor to execute parts of the image recognition method according to the embodiments of the present invention.
The integrated modules/units of the computer device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented.
The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, such as a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, computer memory, Read-Only Memory (ROM), or Random-Access Memory (RAM).
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods, where each data block contains information about a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
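As an illustration only, the linking of data blocks by a cryptographic method can be sketched as a simple hash chain. The function names `make_block` and `chain_is_valid`, the block layout, and the choice of SHA-256 are assumptions made for this sketch and are not specified by the present disclosure.

```python
import hashlib
import json


def make_block(records, prev_hash):
    """Build a block whose hash covers its records and the previous block's hash."""
    body = json.dumps({"records": records, "prev_hash": prev_hash}, sort_keys=True)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {"records": records, "prev_hash": prev_hash, "hash": digest}


def chain_is_valid(chain):
    """Check that every block hash is intact and points at its predecessor."""
    for i, block in enumerate(chain):
        expected = make_block(block["records"], block["prev_hash"])["hash"]
        if block["hash"] != expected:
            return False  # block content was altered after hashing
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False  # link to the previous block is broken
    return True


genesis = make_block(["a"], "0" * 64)
second = make_block(["b"], genesis["hash"])
```

Because each hash covers the previous block's hash, altering the records of an earlier block invalidates the chain from that block onward.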
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one line is shown in FIG. 3, but this does not mean only one bus or one type of bus. The bus is arranged to enable connection communication between the memory 12 and at least one processor 13 or the like.
Although not shown, the computer device 1 may further include a power supply (such as a battery) for supplying power to each component. Preferably, the power supply may be logically connected to the at least one processor 13 through a power management device, so that functions such as charge management, discharge management, and power consumption management are realized through the power management device. The power supply may also include any component such as one or more DC or AC power sources, recharging devices, power failure detection circuitry, power converters or inverters, and power status indicators. The computer device 1 may further include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the computer device 1 may further include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), which is generally used for establishing a communication connection between the computer device 1 and other computer devices.
Optionally, the computer device 1 may further comprise a user interface, which may include a display and an input unit such as a keyboard, and optionally a standard wired interface or a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is used for displaying information processed in the computer device 1 and for displaying a visualized user interface.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
Fig. 3 shows only the computer device 1 with the components 12-13. It will be understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the computer device 1, which may comprise fewer or more components than shown, a combination of certain components, or a different arrangement of components.
With reference to fig. 1, the memory 12 of the computer device 1 stores a plurality of instructions to implement an image recognition method, and the processor 13 executes the plurality of instructions to implement:
obtaining a sample image, and carrying out gray processing on the sample image to obtain a first sample;
determining a target threshold value according to the first sample by utilizing a normal distribution algorithm;
preprocessing the first sample according to the target threshold value to obtain a second sample;
obtaining a pre-configured cutting template, and cutting the second sample according to the cutting template to obtain a third sample;
acquiring a digital feature, an English feature and a Chinese character feature from the third sample;
extracting the characteristic value of each Chinese character characteristic by taking the Chinese character strokes as basic elements, and constructing an initial Chinese character sample according to the characteristic value of each Chinese character characteristic;
acquiring a rarely-used word dictionary and a shape-near word dictionary which are configured in advance, and adding the rarely-used word dictionary and the shape-near word dictionary to the initial Chinese character sample to obtain a target Chinese character sample;
training a preset network according to the digital features, the English features and the target Chinese character sample to obtain a recognition model;
and acquiring an image to be recognized, recognizing the image to be recognized by using the recognition model, and generating a recognition result according to output data of the recognition model.
For the specific manner in which the processor 13 implements the above instructions, reference may be made to the description of the relevant steps in the embodiment corresponding to fig. 1, which is not repeated here.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the modules is only one logical functional division, and other divisions may be used in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the present invention may also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. An image recognition method, characterized in that the image recognition method comprises:
obtaining a sample image, and carrying out gray processing on the sample image to obtain a first sample;
determining a target threshold value according to the first sample by utilizing a normal distribution algorithm;
preprocessing the first sample according to the target threshold value to obtain a second sample;
obtaining a pre-configured cutting template, and cutting the second sample according to the cutting template to obtain a third sample;
acquiring a digital feature, an English feature and a Chinese character feature from the third sample;
extracting the characteristic value of each Chinese character characteristic by taking the Chinese character strokes as basic elements, and constructing an initial Chinese character sample according to the characteristic value of each Chinese character characteristic;
acquiring a rarely-used word dictionary and a shape-near word dictionary which are configured in advance, and adding the rarely-used word dictionary and the shape-near word dictionary to the initial Chinese character sample to obtain a target Chinese character sample;
training a preset network according to the digital features, the English features and the target Chinese character sample to obtain a recognition model;
and acquiring an image to be recognized, recognizing the image to be recognized by using the recognition model, and generating a recognition result according to output data of the recognition model.
2. The image recognition method of claim 1, wherein the carrying out gray processing on the sample image to obtain a first sample comprises:
obtaining the R value, the G value and the B value of each sample image;
determining a first weight corresponding to the R value, a second weight corresponding to the G value, and a third weight corresponding to the B value;
calculating a weighted average value according to the R value, the G value and the B value of each sample image, the first weight, the second weight and the third weight to obtain a gray value of each sample image;
and converting each sample image according to the gray value of each sample image to obtain the first sample.
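By way of a non-limiting sketch, the weighted-average graying described above might look as follows. The default weights 0.299, 0.587 and 0.114 are the commonly used ITU-R BT.601 luma coefficients and are an assumption here, since the claim does not fix the first, second and third weights; the name `to_gray` is likewise hypothetical.

```python
def to_gray(pixels, weights=(0.299, 0.587, 0.114)):
    """Convert rows of (R, G, B) tuples into rows of integer gray values.

    The weights are illustrative assumptions; any first, second and third
    weight may be supplied. Dividing by their sum gives a weighted average.
    """
    w_r, w_g, w_b = weights
    total = w_r + w_g + w_b
    return [
        [round((w_r * r + w_g * g + w_b * b) / total) for (r, g, b) in row]
        for row in pixels
    ]


sample = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
gray = to_gray(sample)
```

With these weights, green contributes most to the gray value and blue least, which roughly follows perceived brightness.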
3. The image recognition method of claim 1, wherein the determining a target threshold from the first sample using a normal distribution algorithm comprises:
identifying a text image and a background image of each of the first samples;
acquiring the density of the text image of each first sample and acquiring the density of the background image of each first sample;
acquiring the proportion of the text image pixels of each first sample as a first proportion, and calculating the proportion of the background image pixels of each first sample as a second proportion according to the first proportion;
calculating the mixed probability density of the text image and the background image of each first sample according to the first ratio, the second ratio, the density of the text image of each first sample and the density of the background image of each first sample;
acquiring an initial threshold value;
calculating the error probability sum of the text image and the background image of each first sample according to the initial threshold value and the mixed probability density of the text image and the background image of each first sample;
and when the error probability sum is the minimum value, acquiring the value of the initial threshold value as the target threshold value.
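The threshold selection above can be sketched, under stated assumptions, as a search for the threshold that minimizes the total misclassification probability of a two-component normal mixture. The sketch assumes that the text pixels are darker than the background, that each density is a normal distribution given as a (mean, standard deviation) pair, and that candidate thresholds are the integers 0 to 255; the function names are hypothetical.

```python
import math


def lower_tail(x, mean, std):
    """P(value <= x) for a normal distribution."""
    return 0.5 * math.erfc(-(x - mean) / (std * math.sqrt(2.0)))


def upper_tail(x, mean, std):
    """P(value > x) for a normal distribution."""
    return 0.5 * math.erfc((x - mean) / (std * math.sqrt(2.0)))


def error_sum(t, p_text, text, background):
    """Error probability sum at threshold t.

    p_text is the first proportion; the second proportion is 1 - p_text.
    Text is assumed darker, so text above t and background at or below t
    are the two kinds of misclassification.
    """
    p_background = 1.0 - p_text
    text_as_background = p_text * upper_tail(t, *text)
    background_as_text = p_background * lower_tail(t, *background)
    return text_as_background + background_as_text


def target_threshold(p_text, text, background, candidates=range(256)):
    """Return the candidate threshold with the smallest error probability sum."""
    return min(candidates, key=lambda t: error_sum(t, p_text, text, background))
```

When the two proportions and standard deviations are equal, the selected threshold falls midway between the two means; when text pixels are rarer, the threshold shifts toward the text mean.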
4. The image recognition method of claim 1, wherein the preprocessing the first sample according to the target threshold to obtain a second sample comprises:
carrying out binarization processing on the first sample according to the target threshold value to obtain a first image set;
carrying out noise reduction processing on the first image set to obtain a second image set;
calculating the angle of the characteristic connecting line in the second image set by adopting a Hough transform algorithm;
and correcting the angle of the characteristic connecting line in the second image set to a horizontal position according to a rotation algorithm to obtain the second sample.
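A minimal sketch of the binarization and angle-estimation steps is given below. It uses a simplified Hough-style vote over candidate inclinations rather than a full Hough transform implementation, and it omits the noise reduction and rotation steps. The function names, the one-degree angle step, and the search range of plus or minus 45 degrees are assumptions made for illustration.

```python
import math
from collections import Counter


def binarize(gray, threshold):
    """Mark pixels darker than the threshold as 1 (text) and the rest as 0."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def skew_angle(binary, max_skew=45, step=1):
    """Estimate the inclination (degrees) of the dominant line.

    Angles are measured from the horizontal in image coordinates
    (x to the right, y downward).
    """
    points = [(x, y) for y, row in enumerate(binary) for x, v in enumerate(row) if v]
    best_angle, best_votes = 0, 0
    for angle in range(-max_skew, max_skew + 1, step):
        # The normal of a line inclined by `angle` points at angle + 90 degrees.
        theta = math.radians(angle + 90)
        # Points on the same line share (after rounding) the same distance rho.
        votes = Counter(
            round(x * math.cos(theta) + y * math.sin(theta)) for x, y in points
        )
        top = max(votes.values(), default=0)
        if top > best_votes:
            best_angle, best_votes = angle, top
    return best_angle
```

Under these assumptions, rotating the image by the negative of the returned angle would bring the dominant line to the horizontal position.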
5. The image recognition method of claim 1, wherein said extracting feature values of each Chinese character feature using Chinese character strokes as basic elements comprises:
carrying out region division on each Chinese character feature to obtain a transverse subgraph, a longitudinal subgraph and an oblique subgraph of each Chinese character feature;
randomly acquiring pixel points from each Chinese character characteristic as initial pixel points;
determining the initial pixel points as starting points, and detecting black pixel points in a transverse subgraph, a longitudinal subgraph and an oblique subgraph of each Chinese character feature;
determining transverse strokes according to the number and the length of black pixel points in the detected transverse subgraph of each Chinese character feature;
determining longitudinal strokes according to the number and the length of black pixel points in the detected longitudinal subgraph of each Chinese character feature;
determining oblique strokes according to the number and the length of the black pixel points in the detected oblique subgraph of each Chinese character characteristic;
and constructing a characteristic value of each Chinese character characteristic according to the transverse stroke, the longitudinal stroke and the oblique stroke of each Chinese character characteristic.
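The stroke detection above can be illustrated with the following simplified sketch, which measures maximal runs of black pixels along the transverse, longitudinal and oblique directions. It scans the whole character image instead of dividing it into subgraphs and starting from a randomly chosen pixel, and the names `runs`, `stroke_features` and the `min_len` parameter are assumptions made for this illustration.

```python
def runs(binary, dx, dy, min_len=3):
    """Lengths of maximal runs of black (1) pixels along direction (dx, dy)."""
    height, width = len(binary), len(binary[0])

    def black(x, y):
        return 0 <= x < width and 0 <= y < height and binary[y][x] == 1

    lengths = []
    for y in range(height):
        for x in range(width):
            # A run starts where the preceding pixel along the direction is not black.
            if black(x, y) and not black(x - dx, y - dy):
                n, cx, cy = 0, x, y
                while black(cx, cy):
                    n += 1
                    cx, cy = cx + dx, cy + dy
                if n >= min_len:
                    lengths.append(n)
    return lengths


def stroke_features(binary):
    """Collect run lengths per stroke direction as a simple feature value."""
    return {
        "transverse": runs(binary, 1, 0),
        "longitudinal": runs(binary, 0, 1),
        "oblique": runs(binary, 1, 1) + runs(binary, -1, 1),
    }


cross = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
features = stroke_features(cross)
```

For the cross-shaped example, one transverse and one longitudinal stroke of length 5 are found, and short diagonal coincidences are discarded by the minimum run length.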
6. The image recognition method of claim 5, wherein the constructing of the initial Chinese character sample according to the feature value of each Chinese character feature comprises:
acquiring a length threshold of a transverse stroke of each Chinese character characteristic as a transverse length threshold, acquiring a length threshold of a longitudinal stroke of each Chinese character characteristic as a longitudinal length threshold, and acquiring a length threshold of an oblique stroke of each Chinese character characteristic as an oblique length threshold;
acquiring the length of the transverse stroke, the length of the longitudinal stroke and the length of the oblique stroke of each Chinese character feature from the feature value of each Chinese character feature;
when the length of a transverse stroke of a Chinese character feature is detected to be greater than or equal to the transverse length threshold, determining the detected transverse stroke as a target transverse stroke;
when the length of a longitudinal stroke of a Chinese character feature is detected to be greater than or equal to the longitudinal length threshold, determining the detected longitudinal stroke as a target longitudinal stroke;
when the length of an oblique stroke of a Chinese character feature is detected to be greater than or equal to the oblique length threshold, determining the detected oblique stroke as a target oblique stroke;
and constructing Chinese character information corresponding to each Chinese character characteristic according to the target transverse stroke, the target longitudinal stroke and the target oblique stroke corresponding to each Chinese character characteristic to obtain the initial Chinese character sample.
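The threshold filtering above can be sketched as follows, assuming that each feature value is a mapping from stroke direction to a list of stroke lengths. That data layout and the name `build_sample` are hypothetical and serve only to illustrate the comparison against the per-direction length thresholds.

```python
def build_sample(features, thresholds):
    """Keep only the strokes whose length reaches the threshold of their direction."""
    return {
        direction: [n for n in lengths if n >= thresholds[direction]]
        for direction, lengths in features.items()
    }


thresholds = {"transverse": 3, "longitudinal": 3, "oblique": 3}
raw = {"transverse": [5, 1], "longitudinal": [5], "oblique": [2, 2]}
target = build_sample(raw, thresholds)
```

Strokes shorter than the threshold, which are more likely to be noise or stroke intersections, are dropped before the Chinese character information is constructed.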
7. The image recognition method of claim 1, wherein the generating recognition results from output data of the recognition model comprises:
calling a pre-configured character library;
acquiring all features in the output data;
matching in the character library by using all features in the output data;
and determining the characters matched by all the features as the recognition result.
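As a hedged sketch of the matching step, the character library may be modeled as a mapping from each character to its set of features, with a character counting as matched only when it contains every feature in the output data. The feature strings, the library contents and the name `recognize` are assumptions made for illustration.

```python
def recognize(output_features, library):
    """Return the library characters that contain every output feature."""
    wanted = set(output_features)
    return [char for char, feats in library.items() if wanted <= set(feats)]


library = {
    "十": ("transverse:1", "longitudinal:1"),
    "一": ("transverse:1",),
}
result = recognize(["transverse:1", "longitudinal:1"], library)
```

A character that matches only some of the features, such as the second entry above, is not returned as a recognition result.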
8. An image recognition apparatus, characterized in that the image recognition apparatus comprises:
the processing unit is used for acquiring a sample image and carrying out gray processing on the sample image to obtain a first sample;
a determining unit, configured to determine a target threshold according to the first sample by using a normal distribution algorithm;
the processing unit is further configured to pre-process the first sample according to the target threshold to obtain a second sample;
the cutting unit is used for obtaining a pre-configured cutting template and cutting the second sample according to the cutting template to obtain a third sample;
the acquisition unit is used for acquiring digital characteristics, English characteristics and Chinese character characteristics from the third sample;
the construction unit is used for extracting the characteristic value of each Chinese character characteristic by taking the Chinese character strokes as basic elements and constructing an initial Chinese character sample according to the characteristic value of each Chinese character characteristic;
the adding unit is used for acquiring a pre-configured rarely-used word dictionary and a shape-near word dictionary, and adding the rarely-used word dictionary and the shape-near word dictionary to the initial Chinese character sample to obtain a target Chinese character sample;
the training unit is used for training a preset network according to the digital features, the English features and the target Chinese character samples to obtain a recognition model;
and the recognition unit is used for acquiring an image to be recognized, recognizing the image to be recognized by using the recognition model and generating a recognition result according to the output data of the recognition model.
9. A computer device, characterized in that the computer device comprises:
a memory storing at least one instruction; and
a processor executing instructions stored in the memory to implement the image recognition method of any one of claims 1 to 7.
10. A computer-readable storage medium characterized by: the computer-readable storage medium has stored therein at least one instruction that is executed by a processor in a computer device to implement the image recognition method of any one of claims 1 to 7.
CN202110874690.3A 2021-07-30 2021-07-30 Image recognition method, device, equipment and medium Pending CN113627297A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110874690.3A CN113627297A (en) 2021-07-30 2021-07-30 Image recognition method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110874690.3A CN113627297A (en) 2021-07-30 2021-07-30 Image recognition method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN113627297A true CN113627297A (en) 2021-11-09

Family

ID=78381912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110874690.3A Pending CN113627297A (en) 2021-07-30 2021-07-30 Image recognition method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113627297A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210297A (en) * 2019-04-25 2019-09-06 上海海事大学 The method declaring at customs the positioning of single image Chinese word and extracting
CN111435445A (en) * 2019-12-24 2020-07-21 珠海大横琴科技发展有限公司 Training method and device of character recognition model and character recognition method and device
CN111461112A (en) * 2020-03-03 2020-07-28 华南理工大学 License plate character recognition method based on double-cycle transcription network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination