CN114187593A - Image processing method and device - Google Patents

Image processing method and device Download PDF

Info

Publication number
CN114187593A
CN114187593A (application CN202111526049.7A)
Authority
CN
China
Prior art keywords
image
training
feature
recognition model
character recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111526049.7A
Other languages
Chinese (zh)
Other versions
CN114187593B (en)
Inventor
Zhang Jiaxin (张家鑫)
Huang Can (黄灿)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Youzhuju Network Technology Co Ltd
Original Assignee
Beijing Youzhuju Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Youzhuju Network Technology Co Ltd filed Critical Beijing Youzhuju Network Technology Co Ltd
Priority to CN202111526049.7A priority Critical patent/CN114187593B/en
Publication of CN114187593A publication Critical patent/CN114187593A/en
Application granted granted Critical
Publication of CN114187593B publication Critical patent/CN114187593B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 - Computing arrangements based on specific mathematical models
    • G06N7/02 - Computing arrangements based on specific mathematical models using fuzzy logic

Abstract

The application discloses an image processing method, which includes: acquiring an image to be processed; and inputting the image to be processed into a pre-trained character recognition model to obtain the characters included in the image to be processed. The character recognition model is used to extract image features of the image to be processed and obtain the characters included in the image to be processed according to the image features. When the character recognition model is trained, a first feature of the training image can be extracted, the features of the corresponding characters in the first feature are determined and then blurred to obtain a second feature, a character prediction result is obtained according to the second feature, and the parameters of the character recognition model are updated based on the character prediction result and the label corresponding to the training image. The character recognition model thereby acquires the ability to predict the real characters behind the blurred character features, so the method can accurately recognize the characters in the image to be processed.

Description

Image processing method and device
Technical Field
The present application relates to the field of image processing, and in particular, to an image processing method and apparatus.
Background
In some scenarios, it is desirable to identify characters in an image. However, the current methods for recognizing characters in images cannot accurately recognize the characters in the images.
Therefore, a solution that can accurately recognize characters in an image is urgently needed.
Disclosure of Invention
The technical problem to be solved by the application is how to accurately identify characters in an image, and an image processing method and device are provided.
In a first aspect, an embodiment of the present application provides an image processing method, where the method includes:
acquiring an image to be processed including characters;
inputting the image to be processed into the character recognition model to obtain characters included in the image to be processed; wherein:
the character recognition model is used for: extracting image features of the image to be processed, and obtaining characters included in the image to be processed according to the image features; wherein:
the character recognition model is obtained by training in the following way:
acquiring a training image and a label corresponding to the training image, wherein the label corresponding to the training image is used for indicating characters included in the training image;
training a character recognition model based on the training image and a label corresponding to the training image, wherein the character recognition model is used for recognizing characters in the image; wherein:
the training of the character recognition model based on the training images and the labels corresponding to the training images comprises:
extracting a first feature of the training image;
determining the characteristics of the corresponding characters in the first characteristics;
fuzzy processing is carried out on the characteristics of the corresponding characters in the first characteristics to obtain second characteristics;
obtaining a character prediction result according to the second characteristic;
and updating the parameters of the character recognition model based on the character prediction result and the label corresponding to the training image.
Optionally, the blurring processing on the feature of the corresponding character in the first feature includes any one or more of the following:
removing part of the characteristics of the corresponding characters in the first characteristics; or,
and modifying part of the characteristics of the corresponding characters in the first characteristics.
Optionally, the character recognition model includes a decoder and N encoders;
the first i encoders in the N encoders are connected in series, the first i encoders are used for obtaining the first characteristic according to the training image, the first characteristic is the output of the ith encoder, and i is a positive integer smaller than N;
the last (N-i) encoders are connected in series, and the last (N-i) encoders are used for processing the second characteristic to obtain a third characteristic;
and the decoder is used for obtaining the character prediction result according to the third characteristic.
Optionally, when the character recognition model is used to recognize characters in the image to be processed, the N encoders are configured to extract image features of the image to be processed, and the decoder is configured to obtain characters included in the image to be processed according to the image features.
Optionally, the determining the characteristics of the corresponding characters in the first characteristics includes:
and determining the characteristics of the corresponding characters in the first characteristics by using a characteristic extraction module, wherein the characteristic extraction module is used for determining the characteristics of the corresponding characters in the first characteristics according to the first characteristics, and the characteristic extraction module is independent of the character recognition model.
Optionally, the feature extraction module is a classification module using a connectionist temporal classification (CTC) algorithm.
Optionally, the performing fuzzy processing on the feature of the corresponding character in the first feature to obtain a second feature includes:
and carrying out fuzzy processing on the characteristics of the corresponding characters in the first characteristics by using a characteristic processing module independent of the character recognition model to obtain second characteristics.
Optionally, obtaining a character prediction result according to the second feature includes:
and obtaining a character prediction result according to the second characteristic and the characteristic of the corresponding background noise in the first characteristic.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including:
an acquisition unit configured to acquire an image to be processed including characters;
the processing unit is used for inputting the image to be processed into the character recognition model to obtain characters included in the image to be processed; wherein:
the character recognition model is used for: extracting image features of the image to be processed, and obtaining characters included in the image to be processed according to the image features; wherein:
the character recognition model is obtained by training in the following way:
acquiring a training image and a label corresponding to the training image, wherein the label corresponding to the training image is used for indicating characters included in the training image;
training a character recognition model based on the training image and a label corresponding to the training image, wherein the character recognition model is used for recognizing characters in the image; wherein:
the training of the character recognition model based on the training images and the labels corresponding to the training images comprises:
extracting a first feature of the training image;
determining the characteristics of the corresponding characters in the first characteristics;
fuzzy processing is carried out on the characteristics of the corresponding characters in the first characteristics to obtain second characteristics;
obtaining a character prediction result according to the second characteristic;
and updating the parameters of the character recognition model based on the character prediction result and the label corresponding to the training image.
Optionally, the blurring processing on the feature of the corresponding character in the first feature includes any one or more of the following:
removing part of the characteristics of the corresponding characters in the first characteristics; or,
and modifying part of the characteristics of the corresponding characters in the first characteristics.
Optionally, the character recognition model includes a decoder and N encoders;
the first i encoders in the N encoders are connected in series, the first i encoders are used for obtaining the first characteristic according to the training image, the first characteristic is the output of the ith encoder, and i is a positive integer smaller than N;
the last (N-i) encoders are connected in series, and the last (N-i) encoders are used for processing the second characteristic to obtain a third characteristic;
and the decoder is used for obtaining the character prediction result according to the third characteristic.
Optionally, when the character recognition model is used to recognize characters in the image to be processed, the N encoders are configured to extract image features of the image to be processed, and the decoder is configured to obtain characters included in the image to be processed according to the image features.
Optionally, the determining the characteristics of the corresponding characters in the first characteristics includes:
and determining the characteristics of the corresponding characters in the first characteristics by using a characteristic extraction module, wherein the characteristic extraction module is used for determining the characteristics of the corresponding characters in the first characteristics according to the first characteristics, and the characteristic extraction module is independent of the character recognition model.
Optionally, the feature extraction module is a classification module using a connectionist temporal classification (CTC) algorithm.
Optionally, the performing fuzzy processing on the feature of the corresponding character in the first feature to obtain a second feature includes:
and carrying out fuzzy processing on the characteristics of the corresponding characters in the first characteristics by using a characteristic processing module independent of the character recognition model to obtain second characteristics.
Optionally, obtaining a character prediction result according to the second feature includes:
and obtaining a character prediction result according to the second characteristic and the characteristic of the corresponding background noise in the first characteristic.
In a third aspect, an embodiment of the present application provides an apparatus, which includes a processor and a memory;
the processor is configured to execute instructions stored in the memory to cause the apparatus to perform the method of any of the first aspects above.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium comprising instructions that instruct a device to perform the method according to any one of the above first aspects.
In a fifth aspect, embodiments of the present application provide a computer program product, which when run on a computer, causes the computer to perform the method of any of the above first aspects.
Compared with the prior art, the embodiment of the application has the following advantages:
the embodiment of the application provides an image processing method, which comprises the following steps: acquiring an image to be processed; inputting the image to be processed into a character recognition model obtained by pre-training to obtain characters included in the image to be processed; wherein: the character recognition model is used for: extracting image features of the image to be processed, obtaining characters included in the image to be processed according to the image features, and acquiring a training image and a label corresponding to the training image when training the character recognition model, wherein the label corresponding to the training image is used for indicating the characters included in the training image; and training a character recognition model based on the training image and the label corresponding to the training image, wherein the character recognition model is used for recognizing characters in the image. When the character recognition model is trained, the first feature of the training image can be extracted, the feature of the corresponding character in the first feature is determined, then the feature of the corresponding character in the first feature is subjected to fuzzy processing to obtain a second feature, a character prediction result is obtained according to the second feature, and the parameter of the character recognition model is updated based on the character prediction result and the label corresponding to the training image. In the training of the character recognition model, the character prediction result is obtained from the second feature obtained by blurring the feature of the corresponding character in the first feature, so that the character recognition model has the capability of predicting the character corresponding to the feature corresponding to the character subjected to blurring, and therefore, the character can be accurately recognized by the character recognition model even if the character itself in the image to be processed is unclear. Namely: by the image processing method provided by the embodiment of the application, the characters in the image to be processed can be accurately identified.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only some embodiments described in the present application, and other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a schematic flowchart of a model training method according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of an image processing method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a character recognition model according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The inventor of the present application has found through research that a machine learning model can be trained in advance, and the trained machine learning model can then be used to recognize an image so as to determine the characters included in the image.
In one example, the machine learning model may be a Transformer model, where the Transformer model includes an encoder (encoder) and a decoder (decoder), the encoder is configured to encode an image to obtain image features, and the decoder is configured to decode the features output by the encoder to obtain characters included in the image.
However, even when characters in an image are recognized using a Transformer model, recognition may be inaccurate; for example, the recognition result may be wrong because the characters themselves in the image are unclear.
In order to solve the above problem, an embodiment of the present application provides a model training method and apparatus.
Various non-limiting embodiments of the present application are described in detail below with reference to the accompanying drawings.
Exemplary method
The embodiment of the application provides an image processing method in which a character recognition model obtained by pre-training is used to recognize characters in an image to be processed. When this character recognition model recognizes the characters in the image to be processed, the characters can be recognized accurately even if they are unclear in the image.
Next, the training process of the character recognition model will be described first.
Referring to fig. 1, the figure is a schematic flow chart of a model training method provided in the embodiment of the present application. In this embodiment, the method may be executed by a terminal or a server, and the embodiment of the present application is not particularly limited.
The method shown in fig. 1, for example, may comprise the steps of: S101-S102.
It should be noted that the process of model training is a process of multiple iterative computations, each iteration can adjust the parameters of the model, and the adjusted parameters participate in the next iterative computation.
Fig. 1 illustrates one iteration in training the character recognition model, taking one training image as an example. It can be understood that many groups of training images are used to train the character recognition model, and each group of training images is processed similarly when the character recognition model is trained. After training on multiple groups of training images, a character recognition model whose accuracy meets the requirement can be obtained.
S101: Acquire a training image and a label corresponding to the training image, where the label corresponding to the training image is used to indicate the characters included in the training image.
In one example, the training image may be an image that includes characters. The training image may be obtained by shooting with a shooting device, may also be obtained from a network resource, and may also be obtained in other manners, which is not specifically limited in the embodiments of the present application.
In one example, a raw image may be acquired and then processed to obtain the training image. The raw image may be processed, for example, by resizing it: in one example, the width and height of the raw image are scaled proportionally so that the height of the processed image equals a preset height (for example, 32).
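For illustration only, the following is a minimal sketch of this proportional scaling, assuming Python with Pillow (the patent names no implementation; the function name and the bilinear filter are assumptions):

```python
from PIL import Image

TARGET_HEIGHT = 32  # the preset height mentioned above

def resize_to_height(image: Image.Image, target_height: int = TARGET_HEIGHT) -> Image.Image:
    """Scale width and height proportionally so the result has the preset height."""
    scale = target_height / image.height
    new_width = max(1, round(image.width * scale))
    return image.resize((new_width, target_height), Image.BILINEAR)
```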
In one example, the labels corresponding to the training images may be manually labeled.
S102: and training a character recognition model based on the training image and the label corresponding to the training image, wherein the character recognition model is used for recognizing characters in the image.
S102, in a particular implementation, may include the following S1021-S1025.
S1021: extracting a first feature of the training image.
In one example, the character recognition model may include a feature extraction module to extract a first feature of the training image.
In one example, the feature extraction module may include i encoders, where i is an integer greater than or equal to 1, and the first feature is the output of the i-th encoder. When i is greater than 1, the i encoders are connected in series: the first of the i encoders processes the training image, and the output of the j-th encoder is the input of the (j+1)-th encoder, where (j+1) is less than or equal to i. The encoder mentioned here may be a native encoder of a conventional Transformer model and is not described in detail here.
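As a hedged sketch of this serial arrangement, assuming PyTorch and standard Transformer encoder layers (the patent does not fix the layer internals, the feature width, or the head count):

```python
import torch
import torch.nn as nn

D_MODEL = 256  # assumed feature width

def make_encoders(num_layers: int) -> nn.ModuleList:
    return nn.ModuleList(
        nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=8, batch_first=True)
        for _ in range(num_layers)
    )

def run_serial(encoders: nn.ModuleList, x: torch.Tensor) -> torch.Tensor:
    # The output of the j-th encoder is the input of the (j+1)-th encoder.
    for enc in encoders:
        x = enc(x)
    return x  # for the first i encoders, this is the "first feature"
```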
S1022: and determining the characteristics of the corresponding characters in the first characteristics.
S1023: and carrying out fuzzy processing on the characteristics of the corresponding characters in the first characteristics to obtain second characteristics.
S1024: and obtaining a character prediction result according to the second characteristic.
In the embodiment of the present application, the character recognition model is trained so that characters can be recognized even when the characters included in the image to be processed are unclear. To this end, the features of the corresponding characters in the first feature may be identified and blurred to obtain a second feature, and a character prediction result is obtained according to the second feature. Since the character prediction result is obtained from the second feature, the character recognition model in effect acquires the ability to predict, from a blurred feature, the character that the feature corresponded to before blurring. Thus, the trained character recognition model can accurately recognize a character even if the character itself in the image to be processed is unclear.
With respect to S1022, it should be noted that, in an example, the feature extraction module may be used to determine the feature of the corresponding character in the first feature. The feature extraction module is configured to classify the first feature to determine a feature of a corresponding character in the first feature. Wherein the feature extraction module may be a module independent of the character recognition model. In training the character recognition model, the first features are processed by means of the feature extraction module. After the training of the character recognition model is completed, the feature extraction module does not need to participate in calculation when the character recognition model is used for recognizing characters in the image to be processed.
In one example, a Connectionist Temporal Classification (CTC) algorithm can identify whether a feature corresponds to a character or to background noise. Thus, the feature extraction module may be a CTC module. In other words, in a specific implementation, S1022 may utilize the CTC module to classify the first feature so as to determine the features of the corresponding characters in the first feature.
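A minimal sketch of such a CTC-style feature extraction module, assuming a per-frame linear classifier whose blank class is read as background noise (the class layout and all names are assumptions, not the patent's design):

```python
import torch
import torch.nn as nn

class CTCFeatureExtractor(nn.Module):
    """Classifies each frame of the first feature as a character or as background noise."""

    def __init__(self, d_model: int, num_characters: int):
        super().__init__()
        # index 0 is the CTC blank, interpreted here as background noise
        self.proj = nn.Linear(d_model, num_characters + 1)

    def character_mask(self, first_feature: torch.Tensor) -> torch.Tensor:
        logits = self.proj(first_feature)  # (batch, frames, num_characters + 1)
        pred = logits.argmax(dim=-1)       # per-frame class decision
        return pred != 0                   # True for character frames, False for noise
```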
Regarding S1023, it should be noted that, in a specific implementation, part of the first feature may be blurred to obtain the second feature. For example, 15% of the first feature is blurred.
There are many ways to blur part of the first feature. In one example, some of the features of the corresponding characters in the first feature may be removed. In another example, some of the features of the corresponding characters in the first feature may be modified, e.g., a feature originally corresponding to a first character is modified into a feature corresponding to another character different from the first character.
As an example, of the selected 15% of features, one part (for example, 80%) may be removed, another part (for example, 10%) may be modified into features corresponding to other characters, and the remaining part (for example, 10%) may be left unchanged.
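The following is a hedged sketch of this blurring step under the proportions given above (15% selected; of those, roughly 80% removed, 10% modified, 10% unchanged). Zeroing a frame as "removal" and swapping in another frame's feature as "modification" are assumptions:

```python
import torch

def blur_character_features(first_feature: torch.Tensor,
                            char_mask: torch.Tensor,
                            p_select: float = 0.15) -> torch.Tensor:
    """first_feature: (batch, frames, d); char_mask: (batch, frames) bool."""
    out = first_feature.clone()
    # select about 15% of the character frames at random
    selected = char_mask & (torch.rand_like(char_mask, dtype=torch.float) < p_select)
    roll = torch.rand_like(selected, dtype=torch.float)
    remove = selected & (roll < 0.8)                  # ~80%: remove the feature (zero it)
    modify = selected & (roll >= 0.8) & (roll < 0.9)  # ~10%: replace with another frame's feature
    out[remove] = 0.0
    if modify.any():
        perm = torch.randperm(first_feature.size(1))
        out[modify] = first_feature[:, perm][modify]  # feature of a different position
    return out  # the "second feature"; the remaining ~10% stay unchanged
```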
In one example, S1023 may be implemented by a feature processing module independent of the character recognition model. In other words, in a specific implementation, the S1023 may use a feature processing module independent of the character recognition model to perform fuzzy processing on the feature of the corresponding character in the first feature to obtain the second feature.
Regarding S1024, it should be noted that, in one example, the character recognition model may include a character recognition module configured to obtain the character prediction result according to the second feature. In one example, the character recognition module may be a decoder, which may be a native decoder of a conventional Transformer model and is not described in detail here.
"Obtaining the character prediction result according to the second feature" may here be understood as processing a third feature, obtained by further processing the second feature, so as to obtain the character prediction result.
In one example, the character recognition model may include another (N-i) encoders in addition to the aforementioned i encoders that extract the first feature; in other words, the character recognition model may include N encoders. When (N-i) is greater than 1, the last (N-i) encoders are connected in series and are used to process the second feature to obtain a third feature. It can be understood that, for the (i+1)-th of the N encoders, its input is no longer the first feature output by the i-th encoder but the second feature output by the feature processing module. In other words, in the training phase of the character recognition model, a feature extraction module and a feature processing module independent of the character recognition model sit between the i-th encoder and the (i+1)-th encoder.
It should be noted here that although the input of the (i+1)-th encoder is the second feature in the training stage of the character recognition model, its input is the output of the i-th encoder in the application stage.
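Putting the pieces together, a hedged sketch of this encoder wiring, with the blurring applied between the i-th and (i+1)-th encoders only during training (image embedding is omitted; `extractor` and `blur_fn` refer to the sketches above, not to anything named in the patent):

```python
import torch

def encode(first_encoders, last_encoders, extractor, blur_fn,
           feats: torch.Tensor, training: bool) -> torch.Tensor:
    x = feats
    for enc in first_encoders:                   # first i encoders -> first feature
        x = enc(x)
    if training:
        char_mask = extractor.character_mask(x)  # module independent of the model
        x = blur_fn(x, char_mask)                # second feature
    # at inference, the (i+1)-th encoder consumes the i-th encoder's output directly
    for enc in last_encoders:                    # last (N-i) encoders -> third feature
        x = enc(x)
    return x
```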
S1025: and updating the parameters of the character recognition model based on the character prediction result and the label corresponding to the training image.
Since the label corresponding to the training image indicates the characters included in the training image, and the character prediction result is the characters in the training image as recognized by the character recognition model, the parameters of the character recognition model may be updated based on the character prediction result and the label. In subsequent training iterations, the character prediction result of the parameter-adjusted character recognition model can then move closer to the label corresponding to the training image.
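A minimal sketch of this update step, assuming a cross-entropy loss and a standard gradient-based optimizer (the patent states only that the parameters are updated from the prediction and the label; the loss and optimizer choices are assumptions):

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, image_feats, label_ids):
    logits = model(image_feats)  # character prediction result: (batch, seq, vocab)
    loss = F.cross_entropy(logits.transpose(1, 2), label_ids)  # compare with the label
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()             # adjusted parameters join the next iteration
    return loss.item()
```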
In one example, when the character prediction result is obtained, in addition to the second feature, a feature corresponding to background noise in the first feature may be considered. For this case, S1024, when implemented specifically, may be: and obtaining a character prediction result according to the second characteristic and the characteristic of the corresponding background noise in the first characteristic. The "obtaining the character prediction result according to the second feature and the feature of the corresponding background noise in the first feature" may be understood as: "obtaining a character prediction result according to the second feature and a part or all of the features of the first feature corresponding to the background noise".
For this case, the last (N-i) encoders of the character recognition model may process the second feature and the feature corresponding to the background noise in the first feature to obtain a fourth feature. Accordingly, the decoder may obtain a character prediction result for the fourth feature.
In one example, the feature of the first feature corresponding to the background noise may also be determined by the aforementioned feature extraction module.
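For this variant, a short sketch of how the second feature could be combined with the background-noise features of the first feature before entering the last (N-i) encoders (the merge rule below is an assumption):

```python
import torch

def merge_with_noise(first_feature: torch.Tensor,
                     second_feature: torch.Tensor,
                     char_mask: torch.Tensor) -> torch.Tensor:
    # character frames carry the blurred second feature;
    # background-noise frames keep their original first-feature values
    return torch.where(char_mask.unsqueeze(-1), second_feature, first_feature)
```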
Next, an image processing method provided in an embodiment of the present application will be described. Referring to fig. 2, the figure is a schematic flowchart of an image processing method according to an embodiment of the present application. The image processing method shown in fig. 2 may include the following S201-S202.
S201: an image to be processed including characters is acquired.
The image to be processed may be obtained by shooting with a shooting device, may also be obtained from a network resource, and may also be obtained in other manners, which is not specifically limited in the embodiment of the present application.
S202: inputting the image to be processed into the character recognition model to obtain characters included in the image to be processed; wherein: the character recognition model is used for: and extracting the image characteristics of the image to be processed, and obtaining characters included in the image to be processed according to the image characteristics.
After the image to be processed is obtained, the image to be processed may be input to a trained character recognition model, and the character recognition model may output characters included in the image to be processed.
The character recognition model mentioned here refers to a model obtained by training using the method shown in fig. 1.
As can be seen from the above description of fig. 1, the character recognition model includes N encoders and a decoder. When the character recognition model is used to recognize characters in an image to be processed, the N encoders extract image features of the image to be processed, and the decoder obtains the characters included in the image to be processed according to the image features.
The character recognition model will now be described with reference to fig. 3. Fig. 3 is a schematic structural diagram of a character recognition model according to an embodiment of the present application.
As shown in fig. 3, the character recognition model 300 includes N encoders and a decoder 330, the N encoders including: i encoders 310 and (N-i) encoders 320. Wherein:
The encoders 310 and the encoders 320 have the same structure.
In the model training phase:
the output of the ith encoder is the first feature, and then the feature extraction module 400 and the feature processing module 500 independent of the character recognition model process the first feature to obtain the second feature, which is used as the input of the (i +1) th encoder, and the (N-i) encoders 320 obtain the third feature according to the second feature. The decoder 330 obtains a character prediction result according to the third characteristic.
The feature extraction module 400 and the feature processing module 500 process the first feature to obtain a second feature, when the second feature is specifically implemented: the first feature is used as an input of the feature extraction module 400, an output of the feature extraction module 400 is used as an input of the feature processing module 500, and the feature processing module 500 outputs a second feature.
In one example, N = 7 and i = 2.
In the model use stage:
the N encoders are used for processing the image to be processed to obtain the image characteristics of the image to be processed;
the decoder 330 is configured to obtain characters included in the image to be processed according to the image features.
Exemplary device
Based on the method provided by the above embodiment, the embodiment of the present application further provides an apparatus, which is described below with reference to the accompanying drawings.
Referring to fig. 4, the figure is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application. The apparatus 600 may specifically include, for example: an acquisition unit 601 and a processing unit 602.
An acquisition unit 601 configured to acquire an image to be processed including characters;
a processing unit 602, configured to input the image to be processed into the character recognition model, so as to obtain characters included in the image to be processed; wherein:
the character recognition model is used for: extracting image features of the image to be processed, and obtaining characters included in the image to be processed according to the image features; wherein:
the character recognition model is obtained by training in the following way:
acquiring a training image and a label corresponding to the training image, wherein the label corresponding to the training image is used for indicating characters included in the training image;
training a character recognition model based on the training image and a label corresponding to the training image, wherein the character recognition model is used for recognizing characters in the image; wherein:
the training of the character recognition model based on the training images and the labels corresponding to the training images comprises:
extracting a first feature of the training image;
determining the characteristics of the corresponding characters in the first characteristics;
fuzzy processing is carried out on the characteristics of the corresponding characters in the first characteristics to obtain second characteristics;
obtaining a character prediction result according to the second characteristic;
and updating the parameters of the character recognition model based on the character prediction result and the label corresponding to the training image.
Optionally, the blurring processing on the feature of the corresponding character in the first feature includes any one or more of the following:
removing part of the characteristics of the corresponding characters in the first characteristics; or,
and modifying part of the characteristics of the corresponding characters in the first characteristics.
Optionally, the character recognition model includes a decoder and N encoders;
the first i encoders in the N encoders are connected in series, the first i encoders are used for obtaining the first characteristic according to the training image, the first characteristic is the output of the ith encoder, and i is a positive integer smaller than N;
the last (N-i) encoders are connected in series, and the last (N-i) encoders are used for processing the second characteristic to obtain a third characteristic;
and the decoder is used for obtaining the character prediction result according to the third characteristic.
Optionally, when the character recognition model is used to recognize characters in the image to be processed, the N encoders are configured to extract image features of the image to be processed, and the decoder is configured to obtain characters included in the image to be processed according to the image features.
Optionally, the determining the characteristics of the corresponding characters in the first characteristics includes:
and determining the characteristics of the corresponding characters in the first characteristics by using a characteristic extraction module, wherein the characteristic extraction module is used for determining the characteristics of the corresponding characters in the first characteristics according to the first characteristics, and the characteristic extraction module is independent of the character recognition model.
Optionally, the feature extraction module is a classification module using a connectionist temporal classification (CTC) algorithm.
Optionally, the performing fuzzy processing on the feature of the corresponding character in the first feature to obtain a second feature includes:
and carrying out fuzzy processing on the characteristics of the corresponding characters in the first characteristics by using a characteristic processing module independent of the character recognition model to obtain second characteristics.
Optionally, obtaining a character prediction result according to the second feature includes:
and obtaining a character prediction result according to the second characteristic and the characteristic of the corresponding background noise in the first characteristic.
Since the apparatus 600 is a device corresponding to the image processing method provided in the above method embodiment, and the specific implementation of each unit of the apparatus 600 is the same as the image processing method described in the above method embodiment, reference may be made to the relevant description part of the above method embodiment for the specific implementation of each unit of the apparatus 600, and details are not repeated here.
An embodiment of the present application further provides an apparatus, which includes a processor and a memory;
the processor is used for executing the instructions stored in the memory so as to cause the equipment to execute the image processing method provided by the above method embodiment.
The embodiment of the application provides a computer-readable storage medium which comprises instructions for instructing equipment to execute the image processing method provided by the method embodiment.
The embodiment of the present application further provides a computer program product, which when running on a computer, causes the computer to execute the image processing method provided by the above method embodiment.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice in the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (11)

1. An image processing method, characterized in that the method comprises:
acquiring an image to be processed including characters;
inputting the image to be processed into the character recognition model to obtain characters included in the image to be processed; wherein:
the character recognition model is used for: extracting image features of the image to be processed, and obtaining characters included in the image to be processed according to the image features; wherein:
the character recognition model is obtained by training in the following way:
acquiring a training image and a label corresponding to the training image, wherein the label corresponding to the training image is used for indicating characters included in the training image;
training a character recognition model based on the training image and a label corresponding to the training image, wherein the character recognition model is used for recognizing characters in the image; wherein:
the training of the character recognition model based on the training images and the labels corresponding to the training images comprises:
extracting a first feature of the training image;
determining the characteristics of the corresponding characters in the first characteristics;
fuzzy processing is carried out on the characteristics of the corresponding characters in the first characteristics to obtain second characteristics;
obtaining a character prediction result according to the second characteristic;
and updating the parameters of the character recognition model based on the character prediction result and the label corresponding to the training image.
2. The method according to claim 1, wherein the blurring the feature of the corresponding character in the first feature includes any one or more of:
removing part of the characteristics of the corresponding characters in the first characteristics; or,
and modifying part of the characteristics of the corresponding characters in the first characteristics.
3. The method of claim 1, wherein the character recognition model comprises a decoder and N encoders;
the first i encoders in the N encoders are connected in series, the first i encoders are used for obtaining the first characteristic according to the training image, the first characteristic is the output of the ith encoder, and i is a positive integer smaller than N;
the last (N-i) encoders are connected in series, and the last (N-i) encoders are used for processing the second characteristic to obtain a third characteristic;
and the decoder is used for obtaining the character prediction result according to the third characteristic.
4. The method according to claim 3, wherein the N encoders are configured to extract image features of the image to be processed when the character recognition model is used to recognize characters in the image to be processed, and the decoder is configured to obtain characters included in the image to be processed according to the image features.
5. The method of claim 1, wherein determining the feature of the first feature for the corresponding character comprises:
and determining the characteristics of the corresponding characters in the first characteristics by using a characteristic extraction module, wherein the characteristic extraction module is used for determining the characteristics of the corresponding characters in the first characteristics according to the first characteristics, and the characteristic extraction module is independent of the character recognition model.
6. The method of claim 5, wherein the feature extraction module is a connectionist temporal classification (CTC) classification module.
7. The method according to claim 1, wherein the blurring the feature of the corresponding character in the first feature to obtain a second feature comprises:
and carrying out fuzzy processing on the characteristics of the corresponding characters in the first characteristics by using a characteristic processing module independent of the character recognition model to obtain second characteristics.
8. The method according to any one of claims 1-7, wherein obtaining a character prediction result according to the second feature comprises:
and obtaining a character prediction result according to the second characteristic and the characteristic of the corresponding background noise in the first characteristic.
9. An image processing apparatus, characterized in that the apparatus comprises:
an acquisition unit configured to acquire an image to be processed including characters;
the processing unit is used for inputting the image to be processed into the character recognition model to obtain characters included in the image to be processed; wherein:
the character recognition model is used for: extracting image features of the image to be processed, and obtaining characters included in the image to be processed according to the image features; wherein:
the character recognition model is obtained by training in the following way:
acquiring a training image and a label corresponding to the training image, wherein the label corresponding to the training image is used for indicating characters included in the training image;
training a character recognition model based on the training image and a label corresponding to the training image, wherein the character recognition model is used for recognizing characters in the image; wherein:
the training of the character recognition model based on the training images and the labels corresponding to the training images comprises:
extracting a first feature of the training image;
determining the characteristics of the corresponding characters in the first characteristics;
fuzzy processing is carried out on the characteristics of the corresponding characters in the first characteristics to obtain second characteristics;
obtaining a character prediction result according to the second characteristic;
and updating the parameters of the character recognition model based on the character prediction result and the label corresponding to the training image.
10. An apparatus, comprising a processor and a memory;
the processor is to execute instructions stored in the memory to cause the device to perform the method of any of claims 1 to 8.
11. A computer-readable storage medium comprising instructions that direct a device to perform the method of any of claims 1-8.
CN202111526049.7A 2021-12-14 2021-12-14 Image processing method and device Active CN114187593B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111526049.7A CN114187593B (en) 2021-12-14 2021-12-14 Image processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111526049.7A CN114187593B (en) 2021-12-14 2021-12-14 Image processing method and device

Publications (2)

Publication Number Publication Date
CN114187593A (zh) 2022-03-15
CN114187593B CN114187593B (en) 2024-01-30

Family

ID=80543747

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111526049.7A Active CN114187593B (en) 2021-12-14 2021-12-14 Image processing method and device

Country Status (1)

Country Link
CN (1) CN114187593B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108288078A (en) * 2017-12-07 2018-07-17 腾讯科技(深圳)有限公司 Character identifying method, device and medium in a kind of image
WO2018223994A1 (en) * 2017-06-07 2018-12-13 众安信息技术服务有限公司 Method and device for synthesizing chinese printed character image
CN110516577A (en) * 2019-08-20 2019-11-29 Oppo广东移动通信有限公司 Image processing method, device, electronic equipment and storage medium
US20200176121A1 (en) * 2018-11-29 2020-06-04 January, Inc. Systems, methods, and devices for biophysical modeling and response prediction
CN112163435A (en) * 2020-10-20 2021-01-01 腾讯科技(深圳)有限公司 Machine translation method, machine translation model training method, device and equipment
CN112215221A (en) * 2020-09-22 2021-01-12 国交空间信息技术(北京)有限公司 Automatic vehicle frame number identification method
CN112883967A (en) * 2021-02-24 2021-06-01 北京有竹居网络技术有限公司 Image character recognition method, device, medium and electronic equipment
CN112883968A (en) * 2021-02-24 2021-06-01 北京有竹居网络技术有限公司 Image character recognition method, device, medium and electronic equipment
CN112883966A (en) * 2021-02-24 2021-06-01 北京有竹居网络技术有限公司 Image character recognition method, device, medium and electronic equipment
CN113011420A (en) * 2021-03-10 2021-06-22 北京百度网讯科技有限公司 Character recognition method, model training method, related device and electronic equipment
CN113313064A (en) * 2021-06-23 2021-08-27 北京有竹居网络技术有限公司 Character recognition method and device, readable medium and electronic equipment
CN113436137A (en) * 2021-03-12 2021-09-24 北京世纪好未来教育科技有限公司 Image definition recognition method, device, equipment and medium
CN113642583A (en) * 2021-08-13 2021-11-12 北京百度网讯科技有限公司 Deep learning model training method for text detection and text detection method

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018223994A1 (en) * 2017-06-07 2018-12-13 众安信息技术服务有限公司 Method and device for synthesizing chinese printed character image
CN108288078A (en) * 2017-12-07 2018-07-17 腾讯科技(深圳)有限公司 Character identifying method, device and medium in a kind of image
US20200176121A1 (en) * 2018-11-29 2020-06-04 January, Inc. Systems, methods, and devices for biophysical modeling and response prediction
CN110516577A (en) * 2019-08-20 2019-11-29 Oppo广东移动通信有限公司 Image processing method, device, electronic equipment and storage medium
CN112215221A (en) * 2020-09-22 2021-01-12 国交空间信息技术(北京)有限公司 Automatic vehicle frame number identification method
CN112163435A (en) * 2020-10-20 2021-01-01 腾讯科技(深圳)有限公司 Machine translation method, machine translation model training method, device and equipment
CN112883967A (en) * 2021-02-24 2021-06-01 北京有竹居网络技术有限公司 Image character recognition method, device, medium and electronic equipment
CN112883968A (en) * 2021-02-24 2021-06-01 北京有竹居网络技术有限公司 Image character recognition method, device, medium and electronic equipment
CN112883966A (en) * 2021-02-24 2021-06-01 北京有竹居网络技术有限公司 Image character recognition method, device, medium and electronic equipment
CN113011420A (en) * 2021-03-10 2021-06-22 北京百度网讯科技有限公司 Character recognition method, model training method, related device and electronic equipment
CN113436137A (en) * 2021-03-12 2021-09-24 北京世纪好未来教育科技有限公司 Image definition recognition method, device, equipment and medium
CN113313064A (en) * 2021-06-23 2021-08-27 北京有竹居网络技术有限公司 Character recognition method and device, readable medium and electronic equipment
CN113642583A (en) * 2021-08-13 2021-11-12 北京百度网讯科技有限公司 Deep learning model training method for text detection and text detection method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHI QIAO et al.: "SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition", arXiv, pages 1-10 *
XING YUAN: "Application of Deep Learning in Handwritten Digit Recognition" (in Chinese), China Master's Theses Full-text Database, Information Science and Technology, no. 05, pages 138-439 *

Also Published As

Publication number Publication date
CN114187593B (en) 2024-01-30

Similar Documents

Publication Publication Date Title
EP3340129A1 (en) Artificial neural network class-based pruning
CN110211119B (en) Image quality evaluation method and device, electronic equipment and readable storage medium
CN112950581A (en) Quality evaluation method and device and electronic equipment
CN110956615B (en) Image quality evaluation model training method and device, electronic equipment and storage medium
CN110827297A (en) Insulator segmentation method for generating countermeasure network based on improved conditions
US20220036167A1 (en) Sorting method, operation method and operation apparatus for convolutional neural network
CN109685805B (en) Image segmentation method and device
CN113221601A (en) Character recognition method, device and computer readable storage medium
CN113327584A (en) Language identification method, device, equipment and storage medium
CN110796003B (en) Lane line detection method and device and electronic equipment
CN112183224A (en) Model training method for image recognition, image recognition method and device
CN114187593B (en) Image processing method and device
CN111327946A (en) Video quality evaluation and feature dictionary training method, device and medium
CN110704678A (en) Evaluation sorting method, evaluation sorting system, computer device and storage medium
CN116128044A (en) Model pruning method, image processing method and related devices
CN116129417A (en) Digital instrument reading detection method based on low-quality image
CN115982965A (en) Carbon fiber material damage detection method and device for denoising diffusion sample increment learning
CN115358410A (en) Method, device and equipment for enhancing field of pre-training model and storage medium
CN111340329B (en) Actor evaluation method and device and electronic equipment
CN114724144A (en) Text recognition method, model training method, device, equipment and medium
CN108021918B (en) Character recognition method and device
CN114220106A (en) Image processing method and device
CN114220107A (en) Image processing method and device
CN115204381A (en) Weak supervision model training method and device and electronic equipment
CN114004974A (en) Method and device for optimizing images shot in low-light environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant