CN112949623B - Model training method, image quality evaluation method, model training device, image quality evaluation device, electronic device, and medium - Google Patents

Model training method, image quality evaluation method, model training device, image quality evaluation device, electronic device, and medium

Info

Publication number
CN112949623B
CN112949623B (application CN202110520999.2A)
Authority
CN
China
Prior art keywords
image
transformation
sample
value
loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110520999.2A
Other languages
Chinese (zh)
Other versions
CN112949623A
Inventor
赵明
田科
朱红
吴中勤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Century TAL Education Technology Co Ltd
Original Assignee
Beijing Century TAL Education Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Century TAL Education Technology Co Ltd filed Critical Beijing Century TAL Education Technology Co Ltd
Priority to CN202110520999.2A
Publication of CN112949623A
Application granted
Publication of CN112949623B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a model training method, an image quality evaluation method, a model training device, an image quality evaluation device, an electronic device, and a medium, applied to the technical field of image processing. The model training method includes: acquiring a plurality of sample clear images, and performing character recognition on each sample clear image to obtain corresponding first text information; transforming each sample clear image based on a plurality of image attribute transformation indexes to obtain a plurality of sample transformed images corresponding to that sample clear image; performing character recognition on each sample transformed image to obtain corresponding second text information, and determining the Levenshtein distance between the first text information corresponding to the sample clear image and the second text information; determining a bad case value corresponding to the sample transformed image based on the Levenshtein distance; and training to generate an image quality evaluation model, with each sample transformed image as input and the image attribute transformation index and bad case value corresponding to each sample transformed image as labels. The method and the device can improve the efficiency of model training.

Description

Model training method, image quality evaluation method, model training device, image quality evaluation device, electronic device, and medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for model training and image quality assessment, an electronic device, and a medium.
Background
At present, when character recognition is performed using OCR (optical character recognition) technology, an OCR model cannot accurately recognize the text information in a low-quality image because of the influence of image quality. Therefore, it is necessary to evaluate the quality of the image to be recognized in advance.
In the related art, a neural network model may be trained in advance and used to evaluate image quality. However, this approach requires collecting and manually labeling a large amount of training data, which makes model training inefficient. Moreover, the related art lacks a method for converting a low-quality image into a high-quality image.
Disclosure of Invention
To solve, or at least partially solve, the above technical problems, the present application provides a model training method, an image quality evaluation method, a model training device, an image quality evaluation device, an electronic device, and a medium.
According to a first aspect of the present application, there is provided an image quality assessment model training method, including:
acquiring a plurality of sample clear images, and performing character recognition on each sample clear image to obtain first text information corresponding to the sample clear image;
for a single sample clear image, transforming the sample clear image based on a plurality of image attribute transformation indexes to obtain a plurality of sample transformed images corresponding to the sample clear image;
performing character recognition on each sample transformed image to obtain second text information corresponding to each sample transformed image, and determining a Levenshtein distance between the first text information corresponding to the sample clear image and the second text information;
determining a bad case value corresponding to the sample transformed image based on the Levenshtein distance, wherein the bad case value is used for indicating whether the sample transformed image belongs to a bad case;
and training to generate an image quality evaluation model, with each sample transformed image as input and the image attribute transformation index and bad case value corresponding to each sample transformed image as labels.
In an optional implementation manner, the training to generate an image quality evaluation model, with each sample transformed image as input and the image attribute transformation index and bad case value corresponding to each sample transformed image as labels, includes:
inputting the sample transformed image into an initial model to obtain an image attribute transformation prediction index and a bad case prediction value corresponding to the sample transformed image;
and training the initial model based on a preset loss function according to the image attribute transformation prediction index corresponding to the sample transformed image and the image attribute transformation index and bad case value corresponding to the sample transformed image, to generate the image quality evaluation model.
In an optional implementation manner, the training the initial model based on a preset loss function according to the image attribute transformation prediction index corresponding to the sample transformed image and the image attribute transformation index and bad case value corresponding to the sample transformed image includes:
determining a first loss function value based on a first loss function according to the image attribute transformation prediction index corresponding to the sample transformed image and the image attribute transformation index corresponding to the sample transformed image;
determining a second loss function value based on a second loss function according to the image attribute transformation prediction index, the image attribute transformation index, and the bad case value corresponding to the sample transformed image;
and determining a loss function value of the initial model according to the first loss function value and the second loss function value, and training the initial model based on the loss function value.
In an optional implementation, the determining a second loss function value based on a second loss function according to the image attribute transformation prediction index, the image attribute transformation index, and the bad case value corresponding to the sample transformed image includes:
if the bad case value corresponding to the sample transformed image is a first numerical value, acquiring an image-recognizable boundary value corresponding to the sample transformed image;
determining the actual transformation degree corresponding to the sample transformed image based on the second loss function according to the image attribute transformation index corresponding to the sample transformed image and the image-recognizable boundary value;
determining the predicted transformation degree corresponding to the sample transformed image based on the second loss function according to the image attribute transformation prediction index corresponding to the sample transformed image and the image-recognizable boundary value;
determining the second loss function value according to the actual transformation degree and the predicted transformation degree;
and if the bad case value corresponding to the sample transformed image is a second numerical value, determining that the second loss function value is 0.
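The masked second loss described above can be sketched as follows. The text does not give a concrete formula for the transformation degrees, so the ratio of transformation index to image-recognizable boundary value, and the squared-error comparison of the two degrees, are illustrative assumptions; only the masking by the bad case value is taken from the description.

```python
def second_loss(pred_index, true_index, boundary, bad_case_value):
    """Second loss, computed only for bad cases (bad_case_value == 1).

    The mapping index / boundary -> "transformation degree" is an assumed
    placeholder; the source only states that both degrees are derived from
    the transformation indexes and the image-recognizable boundary value.
    """
    if bad_case_value == 0:  # second numerical value: not a bad case
        return 0.0
    actual_degree = true_index / boundary      # actual transformation degree
    predicted_degree = pred_index / boundary   # predicted transformation degree
    return (actual_degree - predicted_degree) ** 2

print(second_loss(0.5, 0.8, 1.0, 1))  # bad case: nonzero loss
print(second_loss(0.5, 0.8, 1.0, 0))  # good case: loss is 0
```

In training, this value would be combined with the first loss function value to obtain the overall loss of the initial model.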
In an optional implementation, the determining, based on the Levenshtein distance, a bad case value corresponding to the sample transformed image includes:
if the Levenshtein distance is greater than a preset value, determining that the bad case value corresponding to the sample transformed image is a first numerical value;
and if the Levenshtein distance is not greater than the preset value, determining that the bad case value corresponding to the sample transformed image is a second numerical value.
In an optional embodiment, before the transforming the sample clear image based on the plurality of image attribute transformation indexes, the method further comprises:
obtaining a plurality of transformation indexes corresponding to each image attribute in at least one image attribute;
and generating a plurality of image attribute transformation indexes according to a plurality of transformation indexes corresponding to each image attribute, wherein the image attribute transformation indexes comprise one transformation index corresponding to each image attribute.
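The combination step above can be sketched as a Cartesian product over the per-attribute transformation indexes, so that each generated image attribute transformation index contains exactly one transformation index per attribute. The attribute names and numeric values below are hypothetical.

```python
from itertools import product

# Hypothetical per-attribute transformation indexes (values are illustrative).
per_attribute = {
    "brightness": [-0.5, 0.0, 0.5],
    "contrast":   [-0.3, 0.0, 0.3],
}

# Each combined index holds exactly one transformation index per attribute.
attrs = list(per_attribute)
combined = [dict(zip(attrs, values))
            for values in product(*(per_attribute[a] for a in attrs))]

print(len(combined))   # 3 * 3 = 9 combined indexes
print(combined[0])
```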
In an optional implementation manner, the network structure of the image quality evaluation model is the ResNet network structure with the softmax layer removed.
According to a second aspect of the present application, there is provided an image quality evaluation method, the method including:
acquiring an image to be evaluated, and processing the image to be evaluated through an image quality evaluation model to obtain an image attribute transformation index and a bad case value corresponding to the image to be evaluated;
determining whether the image to be evaluated belongs to a bad case or not based on the bad case value;
wherein the image quality assessment model is trained based on the method of the first aspect.
In an optional implementation manner, after the determining whether the image to be evaluated belongs to a bad case based on the bad case value, the method further includes:
if the image to be evaluated belongs to a bad case, generating an image optimization index corresponding to the image to be evaluated based on the image attribute transformation index;
and performing image transformation on the image to be evaluated based on the image optimization index to obtain a clear image corresponding to the image to be evaluated.
According to a third aspect of the present application, there is provided an image quality evaluation model training apparatus including:
the reference data acquisition module is used for acquiring a plurality of sample clear images, and performing character recognition on each sample clear image to obtain first text information corresponding to the sample clear image;
the image transformation module is used for transforming, for a single sample clear image, the sample clear image based on a plurality of image attribute transformation indexes to obtain a plurality of sample transformed images corresponding to the sample clear image;
the character recognition module is used for performing character recognition on each sample transformed image to obtain second text information corresponding to each sample transformed image;
the Levenshtein distance determining module is used for determining the Levenshtein distance between the first text information corresponding to the sample clear image and the second text information;
a bad case value determining module, configured to determine a bad case value corresponding to the sample transformed image based on the Levenshtein distance, where the bad case value is used to indicate whether the sample transformed image belongs to a bad case;
and the model training module is used for training to generate an image quality evaluation model, with each sample transformed image as input and the image attribute transformation index and bad case value corresponding to each sample transformed image as labels.
In an optional implementation manner, the model training module is specifically configured to input the sample transformation image into an initial model, so as to obtain an image attribute transformation prediction index and a bad case prediction value corresponding to the sample transformation image; and training the initial model based on a preset loss function according to the image attribute transformation prediction index corresponding to the sample transformation image and the image attribute transformation index and the bad case value corresponding to the sample transformation image to generate an image quality evaluation model.
In an optional implementation manner, the model training module trains the initial model based on a preset loss function, according to the image attribute transformation prediction index corresponding to the sample transformed image and the image attribute transformation index and bad case value corresponding to the sample transformed image, by:
determining a first loss function value based on a first loss function according to the image attribute transformation prediction index corresponding to the sample transformation image and the image attribute transformation index corresponding to the sample transformation image;
determining a second loss function value based on a second loss function according to the image attribute transformation prediction index, the image attribute transformation index and the bad case value corresponding to the sample transformation image;
and determining a loss function value of the initial model according to the first loss function value and the second loss function value, and training the initial model based on the loss function value.
In an optional implementation manner, the model training module determines the second loss function value based on the second loss function, according to the image attribute transformation prediction index, the image attribute transformation index, and the bad case value corresponding to the sample transformed image, by:
if the bad case value corresponding to the sample transformed image is a first numerical value, acquiring an image-recognizable boundary value corresponding to the sample transformed image;
determining the actual transformation degree corresponding to the sample transformed image based on the second loss function according to the image attribute transformation index corresponding to the sample transformed image and the image-recognizable boundary value;
determining the predicted transformation degree corresponding to the sample transformed image based on the second loss function according to the image attribute transformation prediction index corresponding to the sample transformed image and the image-recognizable boundary value;
determining the second loss function value according to the actual transformation degree and the predicted transformation degree;
and if the bad case value corresponding to the sample transformed image is a second numerical value, determining that the second loss function value is 0.
In an optional implementation manner, the bad case value determining module is specifically configured to determine that the bad case value corresponding to the sample transformed image is a first numerical value if the Levenshtein distance is greater than a preset value, and to determine that the bad case value corresponding to the sample transformed image is a second numerical value if the Levenshtein distance is not greater than the preset value.
In an optional implementation manner, the image quality assessment model training apparatus further includes:
the transformation index acquisition module is used for acquiring a plurality of transformation indexes corresponding to each image attribute in at least one image attribute;
the image attribute transformation index generation module is used for generating a plurality of image attribute transformation indexes according to a plurality of transformation indexes corresponding to each image attribute, wherein the image attribute transformation indexes comprise one transformation index corresponding to each image attribute.
In an optional implementation manner, the network structure of the image quality evaluation model is the ResNet network structure with the softmax layer removed.
According to a fourth aspect of the present application, there is provided an image quality evaluation apparatus comprising:
an image-to-be-evaluated acquisition module, configured to acquire an image to be evaluated and process the image to be evaluated through an image quality evaluation model to obtain an image attribute transformation index and a bad case value corresponding to the image to be evaluated;
the image quality evaluation module is used for determining whether the image to be evaluated belongs to a bad case or not based on the bad case value;
wherein the image quality assessment model is trained based on the method of the first aspect.
In an optional embodiment, the image quality evaluation apparatus further includes:
the image optimization index generation module is used for generating an image optimization index corresponding to the image to be evaluated based on the image attribute transformation index if the image to be evaluated belongs to a bad case;
and the clear image generation module is used for carrying out image transformation on the image to be evaluated based on the image optimization index to obtain a clear image corresponding to the image to be evaluated.
According to a fifth aspect of the present application, there is provided an electronic device comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of the first or second aspect via execution of the executable instructions.
According to a sixth aspect of the present application, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the method of the first or second aspect.
Compared with the prior art, the technical scheme provided by the embodiment of the application has the following advantages:
the method comprises the steps of obtaining a plurality of sample clear images, and transforming the sample clear images based on a plurality of image attribute transformation indexes to obtain a plurality of sample transformation images corresponding to the sample clear images. And respectively carrying out character recognition on the sample clear image and the corresponding sample conversion image to obtain first text information and second text information. And determining whether the sample conversion image belongs to a bad example according to the Levinstein distance between the first text information and the second text information. Therefore, data required by model training can be obtained without a manual marking mode, and the efficiency of model training is improved. In addition, the output of the image quality evaluation model comprises the image attribute transformation index and the bad example value of the image to be evaluated, so that the image quality can be determined according to the bad example value, and a reference basis can be provided for a subsequent output clear image according to the image attribute transformation index.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly described below; other drawings can be obtained by those skilled in the art from these drawings without inventive effort.
Fig. 1 is a schematic diagram illustrating a system architecture of an exemplary application environment to which an image quality evaluation model training method and an image quality evaluation method according to an embodiment of the present application can be applied;
FIG. 2 is a flowchart of an image quality assessment model training method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an image quality assessment model training method according to an embodiment of the present disclosure;
FIG. 4 is a schematic illustration of a sample sharp image in an embodiment of the present application;
FIG. 5 is a diagram illustrating a text recognition result of the sample sharp image shown in FIG. 4 according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a sample transformed image in an embodiment of the present application;
FIG. 7 is a diagram illustrating a text recognition result of the sample transformed image shown in FIG. 6 according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a sample transformed image according to an embodiment of the present application;
FIG. 9 is a diagram illustrating a text recognition result of the sample transformed image shown in FIG. 8 according to an embodiment of the present disclosure;
FIG. 10 is a schematic diagram of a sample transformed image according to an embodiment of the present application;
FIG. 11 is a flowchart illustrating a method for training an image quality assessment model according to an embodiment of the present disclosure;
FIG. 12 is a flowchart of an image quality evaluation method according to an embodiment of the present application;
FIG. 13 is a schematic structural diagram of an image quality assessment model training apparatus according to an embodiment of the present application;
FIG. 14 is a schematic structural diagram of an image quality evaluation apparatus according to an embodiment of the present application;
fig. 15 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order that the above-mentioned objects, features and advantages of the present application may be more clearly understood, the solution of the present application will be further described below. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, but the present application may be practiced in other ways than those described herein; it is to be understood that the embodiments described in this specification are only some embodiments of the present application and not all embodiments.
Fig. 1 is a schematic diagram of the system architecture of an exemplary application environment to which the image quality evaluation model training method and the image quality evaluation method according to the embodiments of the present application can be applied.
As shown in fig. 1, system architecture 100 may include one or more of terminal device 101, terminal device 102, terminal device 103, network 104, and server 105. Network 104 is the medium used to provide communication links between terminal device 101, terminal device 102, terminal device 103, and server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few. The terminal devices 101, 102, 103 may be various electronic devices having a display screen, including but not limited to desktop computers, portable computers, smart phones, tablet computers, and the like. It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, server 105 may be a server cluster comprised of multiple servers, or the like.
The image quality evaluation model training method and the image quality evaluation method provided by the embodiments of the present application are generally executed by the server 105; accordingly, the image quality evaluation model training device and the image quality evaluation device may be disposed in the server 105. However, as those skilled in the art will readily understand, the image quality evaluation model training method and the image quality evaluation method provided in the embodiments of the present application may also be executed by the terminal devices 101, 102, and 103. For example, the server 105 may train and generate the image quality evaluation model in advance through the image quality evaluation model training method provided in the embodiments of the present application. The terminal devices 101, 102, and 103 upload an image to be evaluated to the server 105; the server 105 processes the image based on the image quality evaluation model, determines the image attribute transformation index and bad case value corresponding to the image, and sends them to the terminal devices 101, 102, and 103.
Referring to fig. 2, fig. 2 is a flowchart of an image quality assessment model training method in an embodiment of the present application, which may include the following steps:
step S210, obtaining a plurality of sample clear images, and performing character recognition on each sample clear image to obtain first text information corresponding to the sample clear images.
In the embodiment of the application, a sample clear image refers to an image from which characters can be correctly recognized; that is, when the sample clear image is input into an OCR model, correct text information can be obtained. Character recognition can be performed on each sample clear image through an OCR model or another character recognition technology to obtain the first text information corresponding to the sample clear image.
Step S220, for a single sample clear image, transforming the sample clear image based on the plurality of image attribute transformation indexes to obtain a plurality of sample transformed images corresponding to the sample clear image.
Specifically, for each sample clear image, the image attributes of the sample clear image can be transformed to different degrees by automatic means to obtain sample transformed images. The image attributes include at least one of: brightness, saturation, contrast, noise, sharpness, and the like. The transformation range of an image attribute can be [a, b], where a is the minimum value of negative transformation and b is the maximum value of positive transformation; the transformation range corresponding to each image attribute can differ. For example, dimming the brightness is a negative transformation and brightening it is a positive transformation. The character recognition results corresponding to the transformation extremes a and b are both empty.
The method and the device can transform a single image attribute to different degrees, and can also transform a plurality of image attributes to different degrees respectively, so that more distinct sample transformed images can be obtained. More training data in turn improves the accuracy of model training.
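As a minimal sketch of transforming a single attribute to different degrees, the function below scales the brightness of a grayscale image (a nested list of 0-255 pixel values) by a signed index. The linear mapping is an assumption made for illustration; a real pipeline would more likely use an image library such as Pillow or OpenCV.

```python
def transform_brightness(pixels, index):
    """Scale brightness by a signed index in an assumed range [-1, 1].

    index < 0 darkens the image, index > 0 brightens it; results are
    clamped to the valid 8-bit range [0, 255].
    """
    factor = 1.0 + index
    return [[min(255, max(0, int(round(p * factor)))) for p in row]
            for row in pixels]

image = [[100, 200], [0, 255]]
print(transform_brightness(image, -0.5))  # darkened
print(transform_brightness(image, 0.5))   # brightened, clamped at 255
```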
Step S230, performing character recognition on each sample transformed image to obtain second text information corresponding to each sample transformed image, and determining the Levenshtein distance between the first text information corresponding to the sample clear image and the second text information.
Similarly, the second text information corresponding to each sample transformed image can be obtained by performing character recognition on it. The second text information may or may not be correctly recognized text. Therefore, the correctness of the second text information can be judged by calculating the Levenshtein distance between the first text information and the second text information. The Levenshtein distance is an edit distance: the minimum number of single-character editing operations required to convert one string into the other. The allowed editing operations are replacing one character with another, inserting a character, and deleting a character.
As can be seen, the smaller the Levenshtein distance, the higher the correctness of the second text information; the larger the Levenshtein distance, the lower its correctness. When the Levenshtein distance is 0, the second text information is identical to the first text information, indicating that the image quality of the sample transformed image is high and the text information in it can be correctly recognized. When the Levenshtein distance is greater than or equal to 1, the second text information differs from the first text information, indicating that the image quality of the sample transformed image is not high and the text information in it is not correctly recognized.
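The Levenshtein distance described above can be computed with the standard dynamic-programming recurrence, keeping only one row of the distance table at a time:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                  # cost of deleting the first i chars of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # 3
```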
Step S240, determining a bad case value corresponding to the sample transformed image based on the Levenshtein distance, wherein the bad case value is used for indicating whether the sample transformed image belongs to a bad case.
In the embodiment of the present application, the image quality of the sample transformed image may be determined based on the Levenshtein distance. The image quality can be divided into two categories: good cases and bad cases. Accordingly, two bad case values may be set in advance to indicate whether the sample transformed image belongs to a bad case. For example, the bad case value corresponding to a good case may be 0, i.e. not a bad case, and the bad case value corresponding to a bad case may be 1, i.e. a bad case.
As described above, the smaller the Levenshtein distance, the higher the image quality of the sample transformed image. In an optional implementation manner, if the Levenshtein distance is greater than a preset value, the bad case value corresponding to the sample transformed image is determined to be a first value, where the first value may be 1; if the Levenshtein distance is not greater than the preset value, the bad case value corresponding to the sample transformed image is determined to be a second value, where the second value may be 0. The preset value may be 1, 2, etc.
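The thresholding rule above can be sketched as follows (a hypothetical helper; the constant names and the default preset value of 1 are assumptions for illustration):

```python
# Illustrative constants; the patent only requires two distinct values.
FIRST_VALUE = 1   # bad case
SECOND_VALUE = 0  # not a bad case

def bad_case_value(levenshtein_distance: int, preset_value: int = 1) -> int:
    """Label a sample transformed image: a distance above the preset
    value marks the image as a bad case."""
    return FIRST_VALUE if levenshtein_distance > preset_value else SECOND_VALUE

print(bad_case_value(0), bad_case_value(3))  # 0 1
```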
Step S250, taking each sample transformed image as input, taking the image attribute transformation index and the bad case value corresponding to each sample transformed image as labels, and training to generate an image quality evaluation model.
Specifically, an initial model may be set in advance, and the network structure of the initial model is the same as that of the image quality evaluation model to be generated. In the training process, the network parameter values in the initial model are continuously updated through the training data (each sample transformed image, together with the image attribute transformation index and bad case value corresponding to each sample transformed image) until the initial model converges, thereby generating the image quality evaluation model.
After obtaining each sample transformed image and its corresponding image attribute transformation index and bad case value, the present application may also perform model training using part of the data (for example, 80% of the data) and verify the accuracy of the model using the remaining data after training is completed. If the accuracy of the image quality evaluation model does not meet the requirement, new sample clear images can be obtained, steps S210-S240 can be executed again to obtain more training data, and the image quality evaluation model can be regenerated by retraining. In this way, the accuracy of the image quality evaluation model can be improved.
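The 80/20 split mentioned above might be sketched as follows (illustrative; the patent does not prescribe how the data is partitioned or shuffled):

```python
import random

def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle (image, label) pairs and split them into a training set
    and a validation set used to verify model accuracy."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

train, val = split_dataset(list(range(100)))
print(len(train), len(val))  # 80 20
```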
According to the image quality evaluation model training method, a plurality of sample clear images are obtained, and each sample clear image is transformed based on a plurality of image attribute transformation indexes to obtain a plurality of sample transformed images corresponding to that sample clear image. Character recognition is performed on the sample clear image and on the corresponding sample transformed images to obtain the first text information and the second text information, respectively. Whether a sample transformed image belongs to a bad case is then determined according to the Levenshtein distance between the first text information and the second text information. Therefore, the data required for model training can be obtained without manual annotation, which improves the efficiency of model training. In addition, the output of the image quality evaluation model comprises the image attribute transformation index and the bad case value of the image to be evaluated, so the image quality can be determined according to the bad case value, and the image attribute transformation index can provide a reference basis for subsequently outputting a clear image.
Referring to fig. 3, fig. 3 is a schematic diagram of an image quality assessment model training method in an embodiment of the present application.
Firstly, a plurality of sample clear images can be obtained, and for each sample clear image the optical character recognition model can accurately recognize its text information, namely the first text information. Image attribute transformation can be performed on each sample clear image based on the plurality of image attribute transformation indexes, yielding a plurality of sample transformed images. Similarly, character recognition can be performed on each sample transformed image through the optical character recognition model to obtain the second text information. Whether a sample transformed image is a bad case is judged by calculating the Levenshtein distance between the first text information and the second text information.
In this way, the training data required for training the model can be obtained: the sample transformed image is used as input, and the image attribute transformation index and bad case value corresponding to the sample transformed image are used as labels. A preset initial model is trained on this data to finally generate the image quality evaluation model.
The following embodiment describes the training method of the image quality assessment model in detail.
Assuming that one of the plurality of sample clear images is shown in fig. 4, the image is clear, that is, its image quality is high; it is input into an OCR model for character recognition, and the recognized text information is shown in fig. 5. It can be seen that, for fig. 4, the text information can be accurately recognized. Assume that the brightness, saturation, contrast, noise and sharpness transformation indexes of a sample clear image are each 0 and the image is not a bad case, so the corresponding bad case value is 0; the label data corresponding to fig. 4 is then [0, 0, 0, 0, 0, 0], where the first five 0s represent brightness, saturation, contrast, noise and sharpness respectively, and the last 0 represents the bad case value.
Referring to fig. 6, fig. 6 is a schematic diagram of a sample transformed image in an embodiment of the present application; in the image attribute transformation, the brightness is -50, the contrast is -45, the sharpness is -70, the saturation is 0, and the noise is 0. The text information obtained by performing character recognition on fig. 6 can be seen in fig. 7. For example, character strings such as "cross-sectional area S = 2.0cm2, height from the floor" and "the velocity of water at each position on the cross-sectional area of the nozzle" are not recognized. Therefore, the Levenshtein distance between the text information in fig. 7 and the text information in fig. 5 is large, and the bad case value corresponding to fig. 6 is 1. The label data corresponding to fig. 6 is [-50, -45, -70, 0, 0, 1].
Referring to fig. 8, fig. 8 is a schematic diagram of a sample transformed image in an embodiment of the present application; in the image attribute transformation, the brightness is 60, the contrast is 40, the sharpness is -50, the saturation is 60, and the noise is 0. The text information obtained by performing character recognition on fig. 8 can be seen in fig. 9. It can be seen that only a few characters are recognized in fig. 9, and thus the bad case value corresponding to fig. 8 is also 1. The label data corresponding to fig. 8 is [60, 40, -50, 60, 0, 1].
Referring to fig. 10, fig. 10 is a schematic diagram of a sample transformed image in an embodiment of the present application; in the image attribute transformation, the brightness is -90, the contrast is -45, the sharpness is -70, the saturation is 0, and the noise is 0. When character recognition is performed on fig. 10, the recognition result is empty, that is, no character can be recognized, so the bad case value corresponding to fig. 10 is 1. The label data corresponding to fig. 10 is [-90, -45, -70, 0, 0, 1].
By analogy, according to the above manner, a plurality of different image attribute transformations can be performed to obtain a plurality of sample transformed images, character recognition is performed on each of them, and the corresponding label data is obtained by judging whether each sample transformed image is a bad case. The image quality evaluation model is then generated by training on the sample transformed images and the label data corresponding to them.
It can be seen that the method transforms high-quality clear sample images in a multi-dimensional manner, and uses the transformation data (covering the five dimensions of brightness, saturation, contrast, noise and sharpness) together with whether the transformed image is a bad case as labels for training the image quality evaluation model. The process requires no manual annotation; the training data is generated automatically, which can improve the efficiency of model training. The image quality evaluation model can output a bad case value for judging the quality of the image to be evaluated, and can also output an image attribute transformation index for determining an image optimization index of the image to be evaluated, so as to optimize the image to be evaluated and obtain a corresponding clear image.
Referring to fig. 11, fig. 11 is a flowchart of another method for training an image quality assessment model in an embodiment of the present application, which may include the following steps:
step S1110, obtaining a plurality of sample clear images, and performing character recognition on each sample clear image to obtain first text information corresponding to the sample clear images.
This step is the same as step S210 in the embodiment of fig. 2, and specific reference may be made to the description in the embodiment of fig. 2, which is not repeated herein.
Step S1120, a plurality of transformation indexes corresponding to each image attribute of the at least one image attribute are obtained.
As previously mentioned, the image attributes may include at least one of: brightness, saturation, contrast, noise, sharpness, etc. Each image attribute may correspond to a plurality of transformation indexes. For example, as for brightness, if the brightness of the sample clear image is 0 (%), the plurality of transformation indexes corresponding to brightness may be -90, -80, -70, -60, -50, -40, -30, -20, -10, 20, 30, 40, 50, 60, 70, 80, 90 (%), and the like. As for saturation, if the saturation of the sample clear image is 100 (%), the plurality of transformation indexes corresponding to saturation may be -20, -15, -10, -5, 10, 15, 20, 25, 30, 35, 40, 45, 50 (%), etc.
Step S1130, a plurality of image attribute transformation indexes are generated according to a plurality of transformation indexes corresponding to each image attribute, where the image attribute transformation indexes include one transformation index corresponding to each image attribute.
In the embodiment of the present application, the plurality of transformation indexes corresponding to each image attribute can be combined by selecting one transformation index per image attribute, so that a plurality of image attribute transformation indexes can be obtained. For example, a transformation index of -50 for brightness, 60 for saturation, -30 for contrast, 0 for noise, and 20 for sharpness may be selected. If the image attribute transformation index is expressed in the order of brightness, saturation, contrast, noise, and sharpness, this image attribute transformation index may be expressed as [-50, 60, -30, 0, 20]. After the plurality of image attribute transformation indexes are obtained, the sample clear image may be transformed using them.
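Selecting one transformation index per attribute and combining the selections is a Cartesian product; a sketch in Python with a reduced, hypothetical candidate set per attribute (the candidate values here are illustrative, not the patent's full lists):

```python
import itertools

# Hypothetical candidate transformation indexes per attribute, in the
# order [brightness, saturation, contrast, noise, sharpness].
candidates = {
    "brightness": [-50, 0, 60],
    "saturation": [0, 60],
    "contrast":   [-45, 0, 40],
    "noise":      [0],
    "sharpness":  [-70, 0],
}

def image_attribute_transform_indexes(cands):
    """All combinations that pick one transformation index per attribute."""
    return [list(combo) for combo in itertools.product(*cands.values())]

indexes = image_attribute_transform_indexes(candidates)
print(len(indexes))   # 3 * 2 * 3 * 1 * 2 = 36
print(indexes[0])     # [-50, 0, -45, 0, -70]
```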
Step S1140, for a single sample sharp image, transform the sample sharp image based on the plurality of image attribute transformation indexes to obtain a plurality of sample transformed images corresponding to the sample sharp image.
Step S1150, performing character recognition on each sample transformed image to obtain second text information corresponding to each sample transformed image, and determining the Levenshtein distance between the second text information and the first text information corresponding to the sample clear image.
Step S1160, determining a bad case value corresponding to the sample transformed image based on the Levenshtein distance, wherein the bad case value is used for indicating whether the sample transformed image belongs to a bad case.
The steps S1140 to S1160 are the same as the steps S220 to S240 in the embodiment of fig. 2, and refer to the description in the embodiment of fig. 2, which is not repeated herein.
And step S1170, inputting the sample conversion image into the initial model to obtain an image attribute conversion prediction index and a bad case prediction value corresponding to the sample conversion image.
In the embodiment of the present application, the initial model refers to a preset model having the same network structure as the image quality evaluation model, with its network parameter values set to preset initial values. In an alternative embodiment, the network structure of the initial model and the image quality evaluation model may be the ResNet network structure without its softmax layer: the final fully-connected layer of the ResNet network structure is changed into the output layer, and the softmax function is deleted. ResNet is a residual network comprising a plurality of residual blocks. Because ResNet introduces skip connections, the information of a previous residual block can flow into the next residual block without obstruction, which improves information flow and also avoids the vanishing gradient and degradation problems caused by an overly deep network, so that good performance can be guaranteed while training a deeper network.
In the training process, the sample conversion image can be input into the initial model, and the image attribute conversion prediction index and the bad case prediction value corresponding to the sample conversion image are obtained. In this embodiment of the application, the predicted value of the bad case output by the initial model may be a preset first value or a preset second value. It can be understood that there is usually a deviation between the image attribute transformation prediction index and the actual image attribute transformation index, and there may also be a deviation between the bad case prediction value and the actual bad case value, and the error can be gradually reduced through training.
It should be noted that, in the present application, the loss function value of the initial model may be calculated from the image attribute transformation prediction index and the actual image attribute transformation index corresponding to the sample transformed image, together with the bad case prediction value and the actual bad case value. Alternatively, the loss function value may be calculated without using the bad case prediction value, as in step S1180 below.
Step S1180, training the initial model based on a preset loss function according to the image attribute transformation prediction index corresponding to the sample transformation image, and the image attribute transformation index and the bad case value corresponding to the sample transformation image, and generating an image quality evaluation model.
Specifically, a first loss function value may be calculated according to the image attribute transformation prediction index and the image attribute transformation index corresponding to the sample transformation image, and a second loss function value may be determined according to the bad case value corresponding to the sample transformation image.
In an alternative embodiment, the first loss function value may be determined based on the first loss function according to the image attribute transformation prediction index corresponding to the sample transformed image and the image attribute transformation index corresponding to the sample transformed image. The first loss function may be an L1 norm loss function, an L2 norm loss function, or the like, where the L1 norm refers to the sum of the absolute values of the elements in a vector, and the L2 norm refers to the square root of the sum of the squares of the elements. For example, for the vector a = [1, -1, 3, -2, -1], the L1 norm of a is 8 and the L2 norm of a is 4.
Suppose that the image attribute transformation index corresponding to the sample transformed image is [10, 70, 0, 10, 10], corresponding in turn to brightness, saturation, contrast, noise, and sharpness, and that the image attribute transformation prediction index output by the initial model is [12, 75, 2, 11, 13]. The weights corresponding to the image attributes may be the same or different. If the same weight a is assigned to each image attribute, the first loss function value determined based on the L1 norm is a × (|12-10| + |75-70| + |2-0| + |11-10| + |13-10|) = 13a.
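With the numbers above, the L1-based first loss function value can be reproduced in a few lines (the function name and uniform weight are illustrative assumptions):

```python
def first_loss(pred, target, weight=1.0):
    """Weighted L1 loss between the predicted and the actual image
    attribute transformation indexes (same weight a for every attribute)."""
    return weight * sum(abs(p - t) for p, t in zip(pred, target))

pred   = [12, 75, 2, 11, 13]   # image attribute transformation prediction index
target = [10, 70, 0, 10, 10]   # actual image attribute transformation index
print(first_loss(pred, target))  # 13.0, i.e. 13a for a = 1
```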
In the embodiment of the present application, the second loss function value may be determined based on the second loss function according to the image attribute transformation prediction index, the image attribute transformation index, and the bad case value corresponding to the sample transformation image. The second loss function may be an L1 norm loss function, an L2 norm loss function, or the like.
The manner in which the second loss function value is determined may be different for different bad case values. Optionally, if the bad case value corresponding to the sample transform image is the first numerical value, which indicates that the image quality of the sample transform image is poor, the recognizable boundary value of the image corresponding to the sample transform image may be obtained. The image recognizable boundary value comprises a plurality of recognizable boundaries of image attributes, each image attribute corresponds to two boundaries, if the image attribute of the sample transformation image is within the corresponding boundary range, the sample transformation image can be recognized, otherwise, the sample transformation image cannot be recognized.
For example, for the aforementioned sample transform image, the corresponding image attribute transform index is [10,70,0,10,10], the recognizable boundary values of the image corresponding to the sample transform image can be represented as [ (-7,5), (-20,60), (-10,5), (-10,10), (-10,10) ], and it can be seen that each image attribute corresponds to two boundary values.
Then, the actual transformation degree corresponding to the sample transformation image can be determined based on the second loss function according to the image attribute transformation index and the image recognizable boundary value corresponding to the sample transformation image. And transforming the prediction index and the recognizable boundary value of the image according to the image attribute corresponding to the sample transformation image, and determining the prediction transformation degree corresponding to the sample transformation image based on the second loss function.
If the second loss function is the L1 loss function, the actual transformation degree and the predicted transformation degree both represent the difference between each image attribute's transformation index and its boundary values. For each image attribute in the image attribute transformation index and the image attribute transformation prediction index, if the transformation index of the image attribute is within the range of the image recognizable boundary values, the difference between the transformation index and the boundary value is 0; otherwise, the difference is that between the transformation index of the image attribute and the boundary value on the side the transformation index exceeds.
For example, the image attribute transformation index corresponding to the sample transformed image is [10, 70, 0, 10, 10], the image attribute transformation prediction index is [12, 75, 2, 11, 13], and the image recognizable boundary values corresponding to the sample transformed image are [(-7, 5), (-20, 60), (-10, 5), (-10, 10), (-10, 10)]. For the first transformation index, 10, the corresponding boundary values are -7 and 5; 10 exceeds the positive boundary 5, so the difference between the transformation index and the boundary value is 10-5. For the third transformation index, 0, which lies within the range (-10, 5), the difference is 0. Therefore, the actual transformation degree corresponding to the sample transformed image is: (10-5) + (70-60) + 0 + (10-10) + (10-10) = 15. Similarly, the predicted transformation degree corresponding to the sample transformed image is: (12-5) + (75-60) + 0 + (11-10) + (13-10) = 26.
After determining the actual degree of transformation and the predicted degree of transformation, the second loss function value may be determined based on the actual degree of transformation and the predicted degree of transformation. For example, the absolute value of the difference between the two is directly used as the second loss function value.
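The boundary-excess computation and the resulting second loss function value from the worked example can be sketched as follows (names are illustrative):

```python
def transform_degree(index, boundaries):
    """Sum, over attributes, of how far each transformation index falls
    outside its image recognizable boundary range (0 when inside)."""
    total = 0
    for value, (low, high) in zip(index, boundaries):
        if value > high:
            total += value - high
        elif value < low:
            total += low - value
    return total

boundaries = [(-7, 5), (-20, 60), (-10, 5), (-10, 10), (-10, 10)]
actual    = transform_degree([10, 70, 0, 10, 10], boundaries)   # 15
predicted = transform_degree([12, 75, 2, 11, 13], boundaries)   # 26
second_loss = abs(predicted - actual)
print(actual, predicted, second_loss)  # 15 26 11
```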
If the bad case value corresponding to the sample transformed image is the second numerical value, which indicates that the image quality of the sample transformed image is better, it may be determined that the second loss function value is 0. That is, in the case where the bad example value is the second numerical value, the degree of transformation of the sample transformed image may be ignored.
Further, a loss function value of the initial model may be determined from the first loss function value and the second loss function value, and the initial model may be trained based on the loss function value. For example, the sum of the first loss function value and the second loss function value may be used as the loss function value of the initial model, and the weighted sum of the first loss function value and the second loss function value may be used as the loss function value of the initial model.
When the first loss function value and the second loss function value are weighted and summed, since the first loss function value includes the loss function values of a plurality of image attributes, if the number of image attributes is N, the first loss function value has N weights. If the weight corresponding to each of the N image attributes is a and the weight of the second loss function value is b, the relationship between a and b may satisfy: N × a + b = 1.
According to the image quality evaluation model training method, the loss function value is determined by different methods for different bad case values during training, so that the bad case value and the image attribute transformation index output by the trained image quality evaluation model are more strongly correlated. For example, when the image attribute transformation indexes are all 0 and the bad case value is 0, indicating high image quality, then whenever the image attribute transformation indexes output by the image quality evaluation model are all 0, the bad case value it outputs is also 0; when the image attribute transformation index it outputs is not 0, the bad case value it outputs is 1. This improves the intuitiveness of the output of the image quality evaluation model. In addition, the image quality can be determined according to the bad case value, and the image attribute transformation index can provide a reference basis for subsequently outputting a clear image.
Referring to fig. 12, fig. 12 is a flowchart of an image quality evaluation method in an embodiment of the present application, which may include the following steps:
step S1210, acquiring an image to be evaluated, and processing the image to be evaluated through an image quality evaluation model to obtain an image attribute transformation index and a bad case value corresponding to the image to be evaluated; the image quality evaluation model is obtained by training based on the image quality evaluation model training method.
In the embodiment of the application, corresponding to the model training process, in the model application stage, the image to be evaluated is processed through the image quality evaluation model, and the image attribute transformation index and the bad case value corresponding to the image to be evaluated are obtained. The image attribute transformation indexes comprise transformation indexes corresponding to all image attributes, and the transformation indexes are transformation indexes between the image to be evaluated and the clear image corresponding to the image to be evaluated.
In step S1220, based on the bad case value, it is determined whether the image to be evaluated belongs to a bad case.
The bad case value is used for indicating the image quality of the image to be evaluated, for example, if the bad case value is a first numerical value, it is determined that the image to be evaluated belongs to a bad case, and if the bad case value is a second numerical value, it is determined that the image to be evaluated does not belong to the bad case.
In an optional implementation manner, after determining whether the image to be evaluated belongs to a bad case, if the image to be evaluated belongs to the bad case, the image optimization index corresponding to the image to be evaluated can be generated based on the image attribute transformation index; and performing image transformation on the image to be evaluated based on the image optimization index to obtain a clear image corresponding to the image to be evaluated.
For example, if during model training the brightness, saturation, contrast, noise and sharpness corresponding to a clear image are all 0, then in the model application stage, if the image attribute transformation index corresponding to the image to be evaluated is [-30, 75, -60, 0, -50], the image optimization index is [30, -75, 60, 0, 50]. The image to be evaluated is processed according to the image optimization index to obtain the corresponding clear image.
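In the example above, the image optimization index is simply the negation of the predicted image attribute transformation index, which can be sketched as:

```python
def image_optimization_index(transform_index):
    """Reverse the predicted transformation: negate each attribute's
    transformation index to recover a clear image."""
    return [-v for v in transform_index]

print(image_optimization_index([-30, 75, -60, 0, -50]))  # [30, -75, 60, 0, 50]
```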
The image quality evaluation method provided by the embodiment of the application can determine the image quality according to the bad case value, and can generate the image optimization index according to the image attribute transformation index under the condition of low image quality so as to optimize the image to be evaluated and obtain the corresponding clear image.
Corresponding to the above method embodiment, an embodiment of the present application further provides an image quality assessment model training device, referring to fig. 13, where fig. 13 is a schematic structural diagram of the image quality assessment model training device in the embodiment of the present application, and the image quality assessment model training device may include:
the reference data obtaining module 1310 is configured to obtain a plurality of sample clear images, perform character recognition on each sample clear image, and obtain first text information corresponding to the sample clear images;
an image transformation module 1320, configured to transform, for a single sample sharp image, the sample sharp image based on a plurality of image attribute transformation indexes, so as to obtain a plurality of sample transformation images corresponding to the sample sharp image;
the character recognition module 1330 is configured to perform character recognition on each sample conversion image to obtain second text information corresponding to each sample conversion image;
the Levenshtein distance determining module 1340 is configured to determine the Levenshtein distance between the first text information and the second text information corresponding to the sample clear image;
a bad case value determining module 1350, configured to determine a bad case value corresponding to the sample transformed image based on the levenstein distance, where the bad case value is used to indicate whether the sample transformed image belongs to a bad case;
the model training module 1360 is configured to train and generate an image quality evaluation model by using each sample transformed image as an input and using an image attribute transformation index and a bad case value corresponding to each sample transformed image as a label.
In an optional implementation manner, the model training module 1360 is specifically configured to input the sample transformed image into the initial model, so as to obtain an image attribute transformation prediction index and a bad case prediction value corresponding to the sample transformed image; and training the initial model based on a preset loss function according to the image attribute transformation prediction index corresponding to the sample transformation image and the image attribute transformation index and the bad case value corresponding to the sample transformation image to generate an image quality evaluation model.
In an alternative embodiment, the model training module 1360 implements training of the initial model based on a preset loss function according to the image attribute transformation prediction index corresponding to the sample transformation image, and the image attribute transformation index and the bad case value corresponding to the sample transformation image by:
determining a first loss function value based on a first loss function according to the image attribute transformation prediction index corresponding to the sample transformation image and the image attribute transformation index corresponding to the sample transformation image;
determining a second loss function value based on a second loss function according to the image attribute transformation prediction index, the image attribute transformation index and the bad case value corresponding to the sample transformation image;
and determining a loss function value of the initial model according to the first loss function value and the second loss function value, and training the initial model based on the loss function value.
In an alternative embodiment, the model training module 1360 may implement the transforming of the prediction index, the image attribute transformation index, and the bad case value according to the image attribute corresponding to the sample transformed image by:
if the bad case value corresponding to the sample transformation image is the first numerical value, acquiring an identifiable boundary value of the image corresponding to the sample transformation image;
determining the actual transformation degree corresponding to the sample transformation image based on the second loss function according to the image attribute transformation index and the recognizable image boundary value corresponding to the sample transformation image;
transforming the prediction index and the recognizable boundary value of the image according to the image attribute corresponding to the sample transformed image, and determining the prediction transformation degree corresponding to the sample transformed image based on the second loss function;
determining a second loss function value according to the actual transformation degree and the prediction transformation degree;
and if the bad case value corresponding to the sample transformation image is the second numerical value, determining that the second loss function value is 0.
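A sketch of this gated second loss (the ratio-to-boundary form of the transformation degrees and the concrete flag values `1`/`0` are assumptions; the embodiment only fixes that a non-bad-case sample yields a second loss of 0):

```python
def second_loss(pred_index, true_index, boundary, bad_case_value,
                first_value=1, second_value=0):
    # A bad case value equal to the second numerical value means the sample
    # transformation image is still recognizable, so the second loss is 0.
    if bad_case_value == second_value:
        return 0.0
    # Actual and predicted transformation degrees, each measured relative to
    # the image recognizable boundary value (the ratio form is an assumption;
    # the embodiment leaves the exact formula open).
    actual = [t / b for t, b in zip(true_index, boundary)]
    predicted = [p / b for p, b in zip(pred_index, boundary)]
    return sum((a - q) ** 2 for a, q in zip(actual, predicted)) / len(actual)
```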
In an optional implementation manner, the bad case value determining module 1350 is specifically configured to determine that the bad case value corresponding to the sample transformation image is the first value if the Levenshtein distance is greater than a preset value; and if the Levenshtein distance is not greater than the preset value, determine that the bad case value corresponding to the sample transformation image is the second value.
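The thresholding above can be illustrated with a standard dynamic-programming Levenshtein distance; the preset value of 3 and the flag values `1`/`0` below are assumptions chosen for illustration:

```python
def levenshtein(a, b):
    # Classic single-row dynamic-programming edit distance between
    # the first text information a and the second text information b.
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

def bad_case_value(first_text, second_text, preset_value=3,
                   first_value=1, second_value=0):
    # preset_value and the two flag values are assumptions; the patent
    # only fixes the "greater than preset value -> first value" rule.
    if levenshtein(first_text, second_text) > preset_value:
        return first_value
    return second_value
```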
In an optional implementation manner, the image quality assessment model training apparatus further includes:
the transformation index acquisition module is used for acquiring a plurality of transformation indexes corresponding to each image attribute in at least one image attribute;
the image attribute transformation index generation module is used for generating a plurality of image attribute transformation indexes according to a plurality of transformation indexes corresponding to each image attribute, wherein the image attribute transformation indexes comprise one transformation index corresponding to each image attribute.
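One plausible reading of "each image attribute transformation index comprises one transformation index corresponding to each image attribute" is a Cartesian product over the per-attribute candidate lists; the attribute names and degree values below are hypothetical:

```python
from itertools import product

def generate_transform_indexes(attr_transforms):
    # attr_transforms maps each image attribute to its candidate
    # transformation degree values (e.g. blur radii, brightness factors).
    # Each generated image attribute transformation index picks exactly
    # one transformation index per attribute.
    attrs = list(attr_transforms)
    return [dict(zip(attrs, combo))
            for combo in product(*(attr_transforms[a] for a in attrs))]
```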
In an alternative embodiment, the network structure of the image quality evaluation model is a ResNet network structure with the softmax layer removed.
Referring to fig. 14, fig. 14 is a schematic structural diagram of an image quality evaluation apparatus in an embodiment of the present application, and may include:
the to-be-evaluated image obtaining module 1410 is configured to obtain an image to be evaluated, process the image to be evaluated through the image quality evaluation model, and obtain an image attribute transformation index and a bad case value corresponding to the image to be evaluated;
the image quality evaluation module 1420 is configured to determine whether the image to be evaluated belongs to a bad case based on the bad case value;
and the image quality evaluation model is obtained by training based on the image quality evaluation model training method.
In an alternative embodiment, the image quality evaluation apparatus further includes:
the image optimization index generation module is used for generating an image optimization index corresponding to the image to be evaluated based on the image attribute transformation index if the image to be evaluated belongs to a bad case;
and the clear image generation module is used for carrying out image transformation on the image to be evaluated based on the image optimization index to obtain a clear image corresponding to the image to be evaluated.
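The evaluation flow of modules 1410/1420 and the two optional modules above can be sketched as follows. The sign-flip rule for deriving the image optimization index is an assumption; the patent does not fix the mapping from transformation index to optimization index:

```python
def evaluate_image(model, image, first_value=1):
    # The image quality evaluation model yields an image attribute
    # transformation index and a bad case value for the input image.
    transform_index, bad_case = model(image)
    result = {"is_bad_case": bad_case == first_value}
    if result["is_bad_case"]:
        # Build an image optimization index by inverting the predicted
        # degradation, so that applying it yields a clear image
        # (the inversion rule is a hypothetical choice for illustration).
        result["optimize_index"] = {k: -v for k, v in transform_index.items()}
    return result
```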
The details of each module or unit in the above device have been described in detail in the corresponding method, and therefore, the details are not described herein again.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the application, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.
An embodiment of the present application further provides an electronic device, including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to perform the steps of the image quality assessment model training method or the image quality assessment method described above.
Fig. 15 is a schematic structural diagram of an electronic device in an embodiment of the present application. It should be noted that the electronic device 1500 shown in fig. 15 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 15, the electronic apparatus 1500 includes a Central Processing Unit (CPU) 1501 which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 1502 or a program loaded from a storage section 1508 into a Random Access Memory (RAM) 1503. In the RAM 1503, various programs and data necessary for system operation are also stored. The CPU 1501, the ROM 1502, and the RAM 1503 are connected to each other by a bus 1504. An input/output (I/O) interface 1505 is also connected to bus 1504.
The following components are connected to the I/O interface 1505: an input portion 1506 including a keyboard, a mouse, and the like; an output portion 1507 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage portion 1508 including a hard disk and the like; and a communication section 1509 including a network interface card such as a Local Area Network (LAN) card, a modem, or the like. The communication section 1509 performs communication processing via a network such as the internet. A drive 1510 is also connected to the I/O interface 1505 as needed. A removable medium 1511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1510 as necessary, so that a computer program read out therefrom is mounted into the storage section 1508 as necessary.
In particular, according to embodiments of the application, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1509, and/or installed from the removable medium 1511. When the computer program is executed by the central processing unit 1501, various functions defined in the apparatus of the present application are executed.
In an embodiment of the present application, a computer-readable storage medium is further provided, on which a computer program is stored, and the computer program, when executed by a processor, implements the above image quality assessment model training method or image quality assessment method.
It should be noted that the computer readable storage medium shown in the present application can be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory, a read-only memory, an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, radio frequency, etc., or any suitable combination of the foregoing.
In an embodiment of the present application, a computer program product is further provided, which, when running on a computer, causes the computer to execute the above image quality assessment model training method or image quality assessment method.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a/an …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is merely exemplary of the present application and is presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (11)

1. An image quality assessment model training method, characterized in that the method comprises:
acquiring a plurality of sample clear images, and performing character recognition on each sample clear image to obtain first text information corresponding to the sample clear images;
for a single sample clear image, transforming the sample clear image based on a plurality of image attribute transformation indexes to obtain a plurality of sample transformation images corresponding to the sample clear image; wherein each of the image attribute transformation indexes includes: a transformation degree value corresponding to at least one image attribute;
performing character recognition on each sample transformation image to obtain second text information corresponding to each sample transformation image, and determining a Levenshtein distance between the first text information corresponding to the sample clear image and the second text information;
determining a bad case value corresponding to the sample transformation image based on the Levenshtein distance, wherein the bad case value is used for indicating whether the sample transformation image belongs to a bad case;
inputting the sample transformation image into an initial model to obtain an image attribute transformation prediction index and a bad case prediction value corresponding to the sample transformation image;
training the initial model based on a preset loss function according to the image attribute transformation prediction index corresponding to the sample transformation image and the image attribute transformation index and the bad case value corresponding to the sample transformation image to generate an image quality evaluation model;
the training of the initial model based on a preset loss function according to the image attribute transformation prediction index corresponding to the sample transformation image, and the image attribute transformation index and the bad case value corresponding to the sample transformation image comprises:
determining a first loss function value based on a first loss function according to the image attribute transformation prediction index corresponding to the sample transformation image and the image attribute transformation index corresponding to the sample transformation image;
determining a second loss function value based on a second loss function according to the image attribute transformation prediction index, the image attribute transformation index and the bad case value corresponding to the sample transformation image; wherein different bad case values correspond to different ways of determining the second loss function value;
and determining a loss function value of the initial model according to the first loss function value and the second loss function value, and training the initial model based on the loss function value.
2. The method of claim 1, wherein the determining a second loss function value based on a second loss function according to the image attribute transformation prediction index, the image attribute transformation index, and the bad case value corresponding to the sample transformation image comprises:
if the bad case value corresponding to the sample transformation image is a first numerical value, acquiring an image recognizable boundary value corresponding to the sample transformation image;
determining the actual transformation degree corresponding to the sample transformation image based on the second loss function according to the image attribute transformation index and the image recognizable boundary value corresponding to the sample transformation image;
determining the prediction transformation degree corresponding to the sample transformation image based on the second loss function according to the image attribute transformation prediction index and the image recognizable boundary value corresponding to the sample transformation image;
determining a second loss function value according to the actual transformation degree and the prediction transformation degree;
and if the bad case value corresponding to the sample transformation image is a second numerical value, determining that the second loss function value is 0.
3. The method of claim 1, wherein the determining the bad case value corresponding to the sample transformation image based on the Levenshtein distance comprises:
if the Levenshtein distance is greater than a preset numerical value, determining that the bad case value corresponding to the sample transformation image is a first numerical value;
and if the Levenshtein distance is not greater than the preset numerical value, determining that the bad case value corresponding to the sample transformation image is a second numerical value.
4. The method of claim 1, wherein prior to said transforming the sample sharp image based on the plurality of image property transformation indicators, the method further comprises:
obtaining a plurality of transformation indexes corresponding to each image attribute in at least one image attribute;
and generating a plurality of image attribute transformation indexes according to a plurality of transformation indexes corresponding to each image attribute, wherein the image attribute transformation indexes comprise one transformation index corresponding to each image attribute.
5. The method of claim 1, wherein the network structure of the image quality assessment model is a ResNet network structure with the softmax layer removed.
6. An image quality evaluation method, characterized in that the method comprises:
acquiring an image to be evaluated, and processing the image to be evaluated through an image quality evaluation model to obtain an image attribute transformation index and a bad case value corresponding to the image to be evaluated;
determining whether the image to be evaluated belongs to a bad case or not based on the bad case value;
wherein the image quality assessment model is trained based on the method of any one of claims 1 to 5.
7. The method of claim 6, wherein after the determining whether the image to be evaluated belongs to a bad case based on the bad case value, the method further comprises:
if the image to be evaluated belongs to a bad case, generating an image optimization index corresponding to the image to be evaluated based on the image attribute transformation index;
and performing image transformation on the image to be evaluated based on the image optimization index to obtain a clear image corresponding to the image to be evaluated.
8. An image quality evaluation model training apparatus, characterized in that the apparatus comprises:
the reference data acquisition module is used for acquiring a plurality of sample clear images, and performing character recognition on each sample clear image to obtain first text information corresponding to the sample clear images;
the image transformation module is used for transforming, for a single sample clear image, the sample clear image based on a plurality of image attribute transformation indexes to obtain a plurality of sample transformation images corresponding to the sample clear image; wherein each of the image attribute transformation indexes includes: a transformation degree value corresponding to at least one image attribute;
the character recognition module is used for performing character recognition on each sample transformation image to obtain second text information corresponding to each sample transformation image;
the Levenshtein distance determining module is used for determining the Levenshtein distance between the first text information corresponding to the sample clear image and the second text information;
a bad case value determining module, configured to determine a bad case value corresponding to the sample transformation image based on the Levenshtein distance, where the bad case value is used to indicate whether the sample transformation image belongs to a bad case;
the model training module is used for inputting the sample transformation image into an initial model to obtain an image attribute transformation prediction index and a bad case prediction value corresponding to the sample transformation image; and training the initial model based on a preset loss function according to the image attribute transformation prediction index corresponding to the sample transformation image and the image attribute transformation index and the bad case value corresponding to the sample transformation image to generate an image quality evaluation model;
the model training module implements the training of the initial model based on a preset loss function according to the image attribute transformation prediction index corresponding to the sample transformation image, and the image attribute transformation index and the bad case value corresponding to the sample transformation image, by:
determining a first loss function value based on a first loss function according to the image attribute transformation prediction index corresponding to the sample transformation image and the image attribute transformation index corresponding to the sample transformation image;
determining a second loss function value based on a second loss function according to the image attribute transformation prediction index, the image attribute transformation index and the bad case value corresponding to the sample transformation image; wherein different bad case values correspond to different ways of determining the second loss function value;
and determining a loss function value of the initial model according to the first loss function value and the second loss function value, and training the initial model based on the loss function value.
9. An image quality evaluation apparatus characterized by comprising:
the device comprises an image to be evaluated acquisition module, an image quality evaluation module and a comparison module, wherein the image to be evaluated acquisition module is used for acquiring an image to be evaluated and processing the image to be evaluated through an image quality evaluation model to obtain an image attribute transformation index and a bad case value corresponding to the image to be evaluated;
the image quality evaluation module is used for determining whether the image to be evaluated belongs to a bad case or not based on the bad case value;
wherein the image quality assessment model is trained based on the method of any one of claims 1 to 7.
10. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any one of claims 1 to 5, or to perform the method of claim 6 or 7, via execution of the executable instructions.
11. A storage medium having stored thereon a computer program for implementing the method of any one of claims 1 to 5, or for implementing the method of claim 6 or 7, when executed by a processor.
CN202110520999.2A 2021-05-13 2021-05-13 Model training method, image quality evaluation method, model training device, image quality evaluation device, electronic device, and medium Active CN112949623B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110520999.2A CN112949623B (en) 2021-05-13 2021-05-13 Model training method, image quality evaluation method, model training device, image quality evaluation device, electronic device, and medium

Publications (2)

Publication Number Publication Date
CN112949623A CN112949623A (en) 2021-06-11
CN112949623B true CN112949623B (en) 2021-08-13

Family

ID=76233818

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110520999.2A Active CN112949623B (en) 2021-05-13 2021-05-13 Model training method, image quality evaluation method, model training device, image quality evaluation device, electronic device, and medium

Country Status (1)

Country Link
CN (1) CN112949623B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104978578B (en) * 2015-04-21 2018-07-27 深圳市点通数据有限公司 Mobile phone photograph text image method for evaluating quality
US10108883B2 (en) * 2016-10-28 2018-10-23 Intuit Inc. Image quality assessment and improvement for performing optical character recognition
CN111192258A (en) * 2020-01-02 2020-05-22 广州大学 Image quality evaluation method and device
CN112668640B (en) * 2020-12-28 2023-10-17 泰康保险集团股份有限公司 Text image quality evaluation method, device, equipment and medium

Similar Documents

Publication Publication Date Title
CN112711660B (en) Method for constructing text classification sample and method for training text classification model
CN110362814B (en) Named entity identification method and device based on improved loss function
CN112084752B (en) Sentence marking method, device, equipment and storage medium based on natural language
CN109993298A (en) Method and apparatus for compressing neural network
CN111767833A (en) Model generation method and device, electronic equipment and storage medium
CN110990627A (en) Knowledge graph construction method and device, electronic equipment and medium
CN113688986A (en) Longitudinal federal prediction optimization method, device, medium, and computer program product
CN114781611A (en) Natural language processing method, language model training method and related equipment
CN111784401A (en) Order taking rate prediction method, device, equipment and readable storage medium
CN111768247A (en) Order-placing rate prediction method, device and readable storage medium
CN111325031A (en) Resume parsing method and device
CN112949623B (en) Model training method, image quality evaluation method, model training device, image quality evaluation device, electronic device, and medium
CN117972038A (en) Intelligent question-answering method, device and computer readable medium
CN117349424A (en) Processing method and device of prompt template applied to language model and electronic equipment
CN114548192A (en) Sample data processing method and device, electronic equipment and medium
CN116684903A (en) Cell parameter processing method, device, equipment and storage medium
CN111324344A (en) Code statement generation method, device, equipment and readable storage medium
CN116028788A (en) Feature binning method, device, computer equipment and storage medium
CN112447173A (en) Voice interaction method and device and computer storage medium
CN114416990B (en) Method and device for constructing object relation network and electronic equipment
CN115358397A (en) Parallel graph rule mining method and device based on data sampling
CN115345600A (en) RPA flow generation method and device
CN118053049B (en) Image evaluation method and device, electronic equipment and storage medium
CN109062903B (en) Method and apparatus for correcting wrongly written words
CN116306917B (en) Task processing method, device, equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant