CN112364860A - Training method and device of character recognition model and electronic equipment
- Publication number: CN112364860A
- Application number: CN202011227096.7A
- Authority: CN (China)
- Legal status: Granted
Classifications
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
- G06F18/24—Classification techniques
- G06N3/045—Combinations of networks
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
Embodiments of the present disclosure disclose a training method and apparatus for a character recognition model, and an electronic device. One embodiment of the method comprises: importing a target training sample comprising a character image into a character recognition network to be trained to obtain a first character, wherein the annotation data of the target training sample is a second character indicated by the character image; generating a first loss value according to the second character and the first character, wherein the first loss value is used to characterize the degree of difference between the first character and the second character; generating a second loss value based on the first loss value, wherein the second loss value is used to characterize the parameter adjustment amplitude of the character recognition network to be trained; and performing parameter adjustment on the character recognition network to be trained based on the second loss value, and generating a character recognition model based on the character recognition network to be trained after the parameter adjustment. A new way of training a character recognition model can thereby be provided.
Description
Technical Field
The present disclosure relates to the field of internet technologies, and in particular, to a method and an apparatus for training a character recognition model, and an electronic device.
Background
Optical character recognition (OCR) refers to the process of analyzing and recognizing an image containing text to obtain the text content and layout information; that is, the text in the image is recognized and returned in text form.
A character recognition model may be used to recognize which text a character image indicates.
Disclosure of Invention
This Summary is provided to introduce concepts in a simplified form that are further described below in the detailed description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In a first aspect, an embodiment of the present disclosure provides a training method for a character recognition model, the method comprising: importing a target training sample comprising a character image into a character recognition network to be trained to obtain a first character, wherein the annotation data of the target training sample is a second character indicated by the character image; generating a first loss value according to the second character and the first character, wherein the first loss value is used to characterize the degree of difference between the first character and the second character; generating a second loss value based on the first loss value, wherein the second loss value is used to characterize the parameter adjustment amplitude of the character recognition network to be trained; and performing parameter adjustment on the character recognition network to be trained based on the second loss value, and generating a character recognition model based on the character recognition network to be trained after the parameter adjustment.
In a second aspect, an embodiment of the present disclosure provides a training apparatus for a character recognition model, applied to a terminal device, the apparatus including: an importing unit, configured to import a target training sample comprising a character image into a character recognition network to be trained to obtain a first character, wherein the annotation data of the target training sample is a second character indicated by the character image; a first generating unit, configured to generate a first loss value according to the second character and the first character, where the first loss value is used to characterize the degree of difference between the first character and the second character; a second generating unit, configured to generate a second loss value based on the first loss value, where the second loss value is used to characterize the parameter adjustment amplitude of the character recognition network to be trained; and an adjusting unit, configured to perform parameter adjustment on the character recognition network to be trained based on the second loss value, and to generate a character recognition model based on the character recognition network to be trained after the parameter adjustment.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of training a character recognition model according to the first aspect.
In a fourth aspect, the disclosed embodiments provide a computer readable medium, on which a computer program is stored, which when executed by a processor, implements the steps of the training method of the character recognition model according to the first aspect.
According to the training method and apparatus for the character recognition model and the electronic device provided by the embodiments of the present disclosure, the target training sample can be imported into the character recognition network to be trained to obtain a first character (namely, a recognition result); then, the first character is compared with the second character (namely, the annotation data of the target training sample) to generate a first loss value, where the first loss value can characterize the degree of difference between the first character and the second character; then, a second loss value is generated based on the first loss value, where the second loss value can be used to characterize the amplitude of the network parameter adjustment; finally, parameter adjustment can be performed on the character recognition network to be trained based on the second loss value, and a character recognition model is generated based on the adjusted character recognition network to be trained. A new training method for the character recognition model can thereby be provided: it performs parameter adjustment to different degrees (characterized by the second loss value) based on the recognition effect (characterized by the first loss value) of the character recognition network to be trained on the target training sample, so that the learning intensity is increased where the recognition effect is poor, improving the recognition accuracy of the trained character recognition model.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.
FIG. 1 is a flow diagram of one embodiment of a training method of a character recognition model according to the present disclosure;
FIG. 2 is a schematic diagram illustrating an embodiment of an apparatus for training a character recognition model according to the present disclosure;
FIG. 3 is an exemplary system architecture to which the training method of the character recognition model of one embodiment of the present disclosure may be applied;
fig. 4 is a schematic diagram of a basic structure of an electronic device provided according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Referring to FIG. 1, a flow of one embodiment of a training method of a character recognition model according to the present disclosure is shown. The training method of the character recognition model shown in fig. 1 includes the following steps:
Step 101, importing a target training sample comprising a character image into a character recognition network to be trained to obtain a first character.

In this embodiment, an execution body (e.g., a server) of the training method for the character recognition model may import the target training sample into the character recognition network to be trained to obtain the first character.
In this embodiment, the target training sample may include a character image. Here, the character image may indicate a second character.
In this embodiment, the target training sample may be associated with the second character. Here, the second character may be used as the annotation data.
In this embodiment, the character recognition network to be trained may be an untrained or not-fully-trained neural network that can be used to detect characters. The structure of the neural network can be built according to the actual application scenario, and is not limited herein. As an example, the character recognition network to be trained may be built from one or more of a convolutional neural network, a long short-term memory (LSTM) network, and the like.
In this embodiment, the input of the character recognition network to be trained may be a training sample, and the output may be a character result obtained by detecting the input training sample.
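As a purely illustrative sketch of such a network (not part of this disclosure), the following assumes a PyTorch implementation combining a small convolutional feature extractor with a bidirectional LSTM; the class name, layer sizes, and input shape are assumptions chosen for the example:

```python
import torch
import torch.nn as nn

class CharRecognitionNet(nn.Module):
    """Illustrative character recognition network: CNN features + LSTM."""

    def __init__(self, num_classes: int, hidden: int = 128):
        super().__init__()
        # Convolutional feature extractor over the character image.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Recurrent layer over the horizontal feature columns.
        self.rnn = nn.LSTM(input_size=64 * 8, hidden_size=hidden,
                           batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # images: (batch, 1, 32, W) grayscale character images (assumed shape).
        feats = self.cnn(images)                      # (batch, 64, 8, W/4)
        b, c, h, w = feats.shape
        seq = feats.permute(0, 3, 1, 2).reshape(b, w, c * h)
        out, _ = self.rnn(seq)                        # (batch, W/4, 2*hidden)
        return self.fc(out)                          # per-column class logits
```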
In this embodiment, the target training sample may be any one or more samples; the qualifier "target" does not limit the training sample and is added only for convenience of description.
Step 102, generating a first loss value according to the second character and the first character.

In this embodiment, the execution body may generate the first loss value according to the first character and the second character.
In this embodiment, the first loss value may be used to characterize the degree of difference between the second character and the first character.
In this embodiment, the second character serves as the annotation data, and the degree of difference between the first character and the second character can be used as the first loss value.
In this embodiment, a specific manner of generating the first loss value according to the first character and the second character is not limited. As an example, the first loss value may be generated using various loss functions.
As an example, a loss function may be set, which may be used to measure the performance of the neural network on a particular task.
As an example, the first character may be represented in the form of a first vector. The second character may be represented in the form of a second vector. An absolute value of a difference between the first vector and the second vector may be taken as the first loss value.
As an example, the first character may be represented in the form of a first vector. The second character may be represented in the form of a second vector. Then, a similarity between the first vector and the second vector may be determined. Then, the reciprocal of the similarity is regarded as a first loss value.
As an example, for a training sample, the neural network may output a numeric prediction; squaring the difference between this prediction and the actual value used as the annotation data gives the distance between the predicted value and the actual value, and training the neural network is expected to reduce this distance, that is, the loss.
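The three examples above can be summarized in a short sketch; this is a minimal illustration assuming the characters have already been encoded as vectors, and the function and mode names are invented for the example:

```python
import torch
import torch.nn.functional as F

def first_loss(first_vec: torch.Tensor, second_vec: torch.Tensor,
               mode: str = "squared") -> torch.Tensor:
    """Illustrative variants of the first loss value described above."""
    if mode == "absolute":
        # Absolute value of the difference between the two vectors.
        return (first_vec - second_vec).abs().sum()
    if mode == "reciprocal_similarity":
        # Reciprocal of the similarity between the two vectors.
        sim = F.cosine_similarity(first_vec, second_vec, dim=0)
        return 1.0 / sim.clamp_min(1e-8)
    # Squared distance between the prediction and the annotation.
    return ((first_vec - second_vec) ** 2).sum()
```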
Step 103, generating a second loss value based on the first loss value.

In this embodiment, the execution body may transform the first loss value to generate the second loss value.
In this embodiment, the transformation manner of the first loss value may be determined according to an actual application scenario, which is not limited herein.
In this embodiment, the second loss value may be used to characterize the parameter adjustment amplitude of the character recognition network to be trained.
Here, the parameter adjustment amplitude may be understood as the degree of change of a parameter value after adjustment relative to its value before adjustment. As an example, the degree of change may be expressed as a percentage of change; if there are multiple parameters, it may be expressed as the average of the percentage changes of the parameters.
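A sketch of this definition only, using the averaging rule suggested above (the rule itself is an example, not mandated by the disclosure):

```python
def adjustment_amplitude(params_before, params_after):
    """Average percentage change of parameter values after one adjustment."""
    changes = [abs(after - before) / (abs(before) + 1e-12) * 100.0
               for before, after in zip(params_before, params_after)]
    return sum(changes) / len(changes)
```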
Step 104, performing parameter adjustment on the character recognition network to be trained based on the second loss value, and generating a character recognition model based on the character recognition network to be trained after the parameter adjustment.
In this embodiment, the executing entity may perform parameter adjustment on the character recognition network to be trained based on the second loss value, and generate the character recognition model based on the character recognition network to be trained after the parameter adjustment.
In this embodiment, the parameters in the character recognition network may include, but are not limited to, at least one of the following: weights in the character recognition network, and bias terms in the character recognition network.
In this embodiment, based on the second loss value, the parameter of the character recognition network to be trained may be adjusted in various ways, which is not limited herein.
By way of example, the character recognition network to be trained may be subjected to parameter adjustment in a manner of back propagation, gradient descent, or the like.
In this embodiment, the character recognition network to be trained after the parameter adjustment may be used directly as the character recognition model, or it may be trained further to obtain the character recognition model.
In some application scenarios, training of the character recognition network to be trained may be stopped when a preset stop condition is met. As an example, the preset stop condition may include, but is not limited to: the number of iterations (or updates) of the character recognition network to be trained reaching a preset count threshold, or the loss value between the recognition result output by the character recognition network to be trained and the annotation data being smaller than a preset loss threshold.
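Putting steps 101-104 and the stop conditions together, a minimal training-loop sketch might look as follows; it assumes PyTorch, classification-style logits and labels, and a caller-supplied `transform_loss` implementing step 103, all of which are illustrative choices rather than requirements of this disclosure:

```python
import torch

def train(network, optimizer, samples, transform_loss,
          max_iters: int = 10000, loss_threshold: float = 1e-3):
    """Illustrative loop: scale the first loss into the second loss,
    back-propagate the second loss, and stop on either preset condition."""
    for iteration, (image, second_char) in enumerate(samples):
        logits = network(image)                                   # step 101
        first_loss = torch.nn.functional.cross_entropy(
            logits, second_char)                                  # step 102
        second_loss = transform_loss(first_loss)                  # step 103
        optimizer.zero_grad()
        second_loss.backward()                                    # step 104
        optimizer.step()
        if iteration + 1 >= max_iters or first_loss.item() < loss_threshold:
            break
    return network
```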
It should be noted that, with the training method of the character recognition model provided in this embodiment, the target training sample may be imported into the character recognition network to be trained to obtain a first character (i.e., a recognition result); then, the first character is compared with the second character (namely, the annotation data of the target training sample) to generate a first loss value, where the first loss value can characterize the degree of difference between the first character and the second character; then, a second loss value is generated based on the first loss value, where the second loss value can be used to characterize the amplitude of the network parameter adjustment; finally, parameter adjustment can be performed on the character recognition network to be trained based on the second loss value, and a character recognition model is generated based on the adjusted character recognition network to be trained. A new training method for the character recognition model can thereby be provided: it performs parameter adjustment to different degrees (characterized by the second loss value) based on the recognition effect (characterized by the first loss value) of the character recognition network to be trained on the target training sample, so that the learning intensity is increased where the recognition effect is poor, improving the recognition accuracy of the trained character recognition model.
In some embodiments, the above method further comprises: performing character recognition using the generated character recognition model.
It should be noted that, when character recognition is performed using the generated character recognition model, training samples with a poor recognition effect have been learned with greater intensity during training, so the character recognition model achieves a better recognition effect in such character recognition scenarios.
In some embodiments, step 103 may include: obtaining the second loss value corresponding to the first loss value of the target training sample according to a preset correspondence between first loss values and second loss values.
As an example, a first loss value smaller than 0.1 may correspond to a second loss value of 0.01; a first loss value of at least 0.1 but smaller than 0.5 may correspond to a second loss value of 0.5; and a first loss value of at least 0.5 but smaller than 1 may correspond to a second loss value of 1.
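A direct sketch of this example correspondence (the behavior for first loss values of 1 or more is not specified above, so the pass-through branch is an assumption):

```python
def second_loss_from_table(first_loss: float) -> float:
    """Illustrative lookup implementing the example correspondence above."""
    if first_loss < 0.1:
        return 0.01
    if first_loss < 0.5:
        return 0.5
    if first_loss < 1.0:
        return 1.0
    return first_loss  # not covered by the example above; assumed pass-through
```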
In some embodiments, step 103 may include: transforming the first loss value to generate the second loss value.
Here, the first loss value itself may be transformed to generate the second loss value.
Here, the second loss value is generated from the specific numerical value of the first loss value itself; this improves the accuracy of the second loss value and can thereby improve the accuracy of the character recognition model.
In some embodiments, the transforming the first loss value to generate the second loss value may include: converting the first loss value according to the sample type of the target training sample to generate the second loss value, where the sample type is related to the character type of the characters in the sample.
In some embodiments, a scaling direction of the second loss value relative to the first loss value is related to a sample type to which the target training sample belongs.
Here, the second loss value may relate to the first loss value in three ways: it may be larger, smaller, or unchanged. If it is larger, the scaling direction of the second loss value relative to the first loss value is enlargement; if it is smaller, the scaling direction is reduction; if it is unchanged, no scaling is applied.
Here, the sample type of the training sample may be predefined.
In some application scenarios, the sample type to which the target training sample belongs may be divided according to the sample's proportion in the sample set. As an example, samples including uncommon words, which may account for a small proportion of the sample set, may be determined to be first-type samples; samples including common words, which may account for a large proportion of the sample set, may be determined to be second-type samples. It can be understood that the distinction between uncommon words and common words may follow the relevant criteria, which are not described herein.
In some application scenarios, if the target training sample belongs to the first type of sample, the second loss value may be amplified relative to the first loss value; if the target training sample belongs to the second type of sample, the second loss value may be reduced relative to the first loss value. The case of the loss value remaining unchanged occurs with very small probability and can be treated as incidental.
It should be noted that, by making the scaling direction of the second loss value relative to the first loss value depend on the sample type to which the target training sample belongs, the first loss value can be scaled differently for different types of samples, so that different degrees of parameter adjustment can be applied to the character recognition network to be trained for different types of samples. In this way, the recognition accuracy of the character recognition network to be trained can be improved.
For example, for the first type of sample, the recognition result of the character recognition network to be trained may be poor (i.e., the first loss value may be large), and then the first loss value may be amplified to obtain the second loss value. Therefore, parameter adjustment is carried out on the character recognition network to be trained on the basis of the second loss value, and compared with the parameter adjustment range based on the first loss value, the parameter adjustment range in the character recognition network to be trained can be enlarged, so that the character recognition network to be trained can fully learn the relevant characteristics of the first type sample.
As an example, for the second type sample, the recognition result of the character recognition model to be trained may be better (i.e., the first loss value may be smaller), and then the first loss value may be reduced to obtain the second loss value. Therefore, parameter adjustment is carried out on the character recognition network to be trained based on the second loss value, and compared with the adjustment range based on the first loss value, the adjustment range of the parameters in the character recognition network to be trained can be reduced, so that the learning degree of the character recognition network to be trained on the second type sample is reduced under the condition that the character recognition network to be trained can recognize the second type sample, and the learning result of the character recognition network to be trained on the first type sample is consolidated.
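The two examples can be sketched as follows; the factor of 2 and the type labels are invented for illustration:

```python
def second_loss_by_sample_type(first_loss: float, sample_type: str,
                               gain: float = 2.0) -> float:
    """Amplify the loss for first-type (uncommon word) samples and
    reduce it for second-type (common word) samples."""
    if sample_type == "first":
        return first_loss * gain   # enlarge: learn rare samples harder
    if sample_type == "second":
        return first_loss / gain   # reduce: consolidate what is learned
    return first_loss              # unchanged otherwise
```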
In some embodiments, the transforming the first loss value to generate the second loss value may include: transforming the first loss value by using a preset transformation function to generate the second loss value.
Here, the argument of the transformation function is a first loss value, and the dependent variable of the transformation function is a second loss value.
Here, the form of the transformation function may be set according to an actual application scenario, and is not limited herein.
In some embodiments, the transformation function may include at least one coefficient.
Here, the transform function is provided, and the first loss value can be directly transformed by the transform function without setting a threshold value for the first loss value. This can improve the flexibility of converting the first loss value. Specifically, the function of scaling the first loss value can be realized by setting the transformation function, and some coefficients of the transformation function can be adjusted according to an actual application scenario, so that the coefficients can be flexibly adjusted for different sample sets or different samples, thereby improving the flexibility of training, increasing the training speed as much as possible, and improving the accuracy of the obtained model.
In some embodiments, the coefficients in the transform function are adjusted according to the ratio of the first type of samples in the set of training samples.
Here, the coefficients in the transformation function may be adjusted according to the ratio of the first type samples in the training sample set.
As an example, when the proportion is large, the coefficients may be adjusted in the direction of reducing the degree of scaling; when the proportion is small, the coefficients may be adjusted in the direction of increasing the degree of scaling. The reason is that when the proportion is large, the character recognition network to be trained encounters first-type samples frequently, so the amplitude of parameter adjustment per training sample can be appropriately reduced; when the proportion is small, the network encounters first-type samples rarely, so the amplitude of parameter adjustment per training sample can be appropriately increased.
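As one hypothetical way to encode this rule (the linear relationship below is an assumption; the disclosure only fixes the direction of adjustment):

```python
def adjusted_coefficient(rare_ratio: float, base_coefficient: float = 2.0) -> float:
    """Shrink the scaling coefficient when first-type samples are plentiful,
    grow it when they are scarce; linearity is assumed for illustration."""
    return base_coefficient * (1.0 - rare_ratio)
```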
In some embodiments, the transforming the first loss value to generate the second loss value may include: determining a scaling factor according to the first loss value; and determining the product of the first loss value and the scaling coefficient as a second loss value.
Here, the scaling factor for the first loss value may be determined according to the first loss value itself. The scaling factor determined from the first loss value may have a higher degree of matching with the first loss value. Alternatively, in the case where the first loss value is smaller, the scaling factor may be smaller; in the case where the first loss value is larger, the scaling factor may be larger.
Thereby, the second loss value can be determined from the first loss value itself, whereby the accuracy of the second loss value can be improved. The more accurate second loss value can be utilized to adjust the parameters of the character recognition network to be trained in a finer granularity.
In some embodiments, the determining a scaling factor according to the first loss value may include: determining a first candidate value by taking a first coefficient greater than 1 as the base and the opposite number of the first loss value as the exponent; determining a second candidate value by taking the difference between the second coefficient and the first candidate value as the base and a third coefficient as the exponent; and determining the product of the second candidate value and the third coefficient as the scaling coefficient.
Here, the first coefficient may be greater than 1; as an example, the first coefficient may be the natural constant e.
Here, the opposite number of the first loss value is negatively correlated with the first loss value, so the first candidate value is negatively correlated with the first loss value.
Here, the value of the second coefficient may be determined according to actual conditions, and as an example, the value of the second coefficient may be 1. The second candidate is positively correlated with the first candidate.
Here, the value of the third coefficient may be determined according to actual conditions, and as an example, the value of the third coefficient may be a positive number.
It should be noted that, by the above-mentioned manner of determining the scaling factor, the determination of the scaling factor can be adjusted more accurately by adjusting a plurality of coefficients, so as to improve the accuracy of the determined second loss value.
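Under the example choices above (first coefficient e, second coefficient 1, and an assumed positive third coefficient of 2), the scaling coefficient can be sketched as follows; the resulting weight resembles a focal-loss-style factor, which is an observation about the formula, not a statement from this disclosure:

```python
import math

def scaling_factor(first_loss: float, c1: float = math.e,
                   c2: float = 1.0, c3: float = 2.0) -> float:
    """Scaling coefficient built as described above:
    first candidate  a = c1 ** (-first_loss)   (negatively correlated with the loss)
    second candidate b = (c2 - a) ** c3
    scaling factor   s = c3 * b
    """
    a = c1 ** (-first_loss)
    b = (c2 - a) ** c3
    return c3 * b

def second_loss(first_loss: float) -> float:
    # A small first loss yields a small factor; a large one, a larger factor.
    return scaling_factor(first_loss) * first_loss
```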
In some embodiments, the transforming the first loss value to generate the second loss value may include at least one of: amplifying the first loss value in response to determining that the first loss value is greater than a first threshold; and reducing the first loss value in response to determining that the first loss value is not greater than the first threshold.
Here, the first loss value may be compared with a first threshold value, and whether to enlarge or reduce the first loss value may be determined according to the comparison result.
Optionally, the manner of enlarging or reducing the first loss value may be set according to an actual application scenario, and is not limited herein.
It should be noted that, by setting the first threshold value and determining the scaling direction of the first loss value, the calculation amount for determining the scaling direction can be reduced, and the calculation speed can be increased, thereby increasing the speed for training the character recognition network to be trained.
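A minimal sketch of this threshold rule (the threshold and the enlarge/reduce factors are illustrative assumptions; the disclosure leaves the exact scaling open):

```python
def second_loss_by_threshold(first_loss: float, threshold: float = 0.5,
                             gain: float = 2.0) -> float:
    """Enlarge the first loss above the threshold, reduce it otherwise."""
    if first_loss > threshold:
        return first_loss * gain   # amplify: poor recognition, learn harder
    return first_loss / gain       # reduce: recognition already acceptable
```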
With further reference to fig. 2, as an implementation of the method shown in the above figures, the present disclosure provides an embodiment of an apparatus for training a character recognition model, where the apparatus embodiment corresponds to the method embodiment shown in fig. 1, and the apparatus may be applied to various electronic devices.
As shown in fig. 2, the training apparatus for a character recognition model of the present embodiment includes: an importing unit 201, a first generating unit 202, a second generating unit 203, and an adjusting unit 204. The importing unit is used to import a target training sample comprising a character image into the character recognition network to be trained to obtain a first character, wherein the annotation data of the target training sample is a second character indicated by the character image; the first generating unit is configured to generate a first loss value according to the second character and the first character, where the first loss value is used to characterize the degree of difference between the first character and the second character; the second generating unit is configured to generate a second loss value based on the first loss value, where the second loss value is used to characterize the parameter adjustment amplitude of the character recognition network to be trained; and the adjusting unit is used to perform parameter adjustment on the character recognition network to be trained based on the second loss value and to generate a character recognition model based on the character recognition network to be trained after the parameter adjustment.
In this embodiment, specific processes of the importing unit 201, the first generating unit 202, the second generating unit 203, and the adjusting unit 204 of the training apparatus for character recognition models and technical effects brought by the specific processes can refer to the related descriptions of step 101, step 102, step 103, and step 104 in the corresponding embodiment of fig. 1, and are not repeated herein.
In some embodiments, the generating a second loss value based on the first loss value comprises: transforming the first loss value to generate the second loss value.
In some embodiments, the transforming the first loss value to generate the second loss value comprises: converting the first loss value according to the sample type of the target training sample to generate the second loss value, wherein the sample type is related to the character type of the characters in the sample.
In some embodiments, the sample types include at least one of: the first type and the second type, wherein a first-type sample comprises an uncommon word character image, and a second-type sample comprises a common word character image.
In some embodiments, the generating a second loss value based on the first loss value comprises: transforming the first loss value by using a preset transformation function to generate the second loss value, wherein the independent variable of the transformation function is the first loss value, the dependent variable of the transformation function is the second loss value, and the transformation function comprises at least one coefficient.
In some embodiments, the coefficients in the transformation function are adjusted according to the proportion of first-type samples in the training sample set, wherein a first-type sample comprises an uncommon word character image.
In some embodiments, the generating a second loss value based on the first loss value comprises: determining a scaling factor according to the first loss value; and determining the product of the first loss value and the scaling coefficient as the second loss value.
In some embodiments, the determining a scaling factor according to the first loss value comprises: determining a first candidate value by taking a first coefficient greater than 1 as the base and the opposite number of the first loss value as the exponent; determining a second candidate value by taking the difference between the second coefficient and the first candidate value as the base and a third coefficient as the exponent; and determining the product of the second candidate value and the third coefficient as the scaling coefficient.
In some embodiments, the generating a second loss value based on the first loss value comprises at least one of: amplifying the first loss value in response to determining that the first loss value is greater than a first threshold; and reducing the first loss value in response to determining that the first loss value is not greater than the first threshold.
In some embodiments, the apparatus is further configured to: perform character recognition using the generated character recognition model.
Referring to fig. 3, fig. 3 illustrates an exemplary system architecture to which the training method of the character recognition model of one embodiment of the present disclosure may be applied.
As shown in fig. 3, the system architecture may include terminal devices 301, 302, 303, a network 304, and a server 305. The network 304 serves as a medium for providing communication links between the terminal devices 301, 302, 303 and the server 305. Network 304 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The terminal devices 301, 302, 303 may interact with the server 305 over the network 304 to receive or send messages and the like. Various client applications may be installed on the terminal devices 301, 302, 303, such as web browser applications, search applications, and news and information applications. A client application in the terminal devices 301, 302, 303 may receive a user's instruction and perform the corresponding function according to that instruction, for example, adding corresponding information in response to the user's instruction.
The terminal devices 301, 302, 303 may be hardware or software. When the terminal devices 301, 302, 303 are hardware, they may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III, mpeg compression standard Audio Layer 3), MP4 players (Moving Picture Experts Group Audio Layer IV, mpeg compression standard Audio Layer 4), laptop portable computers, desktop computers, and the like. When the terminal device 301, 302, 303 is software, it can be installed in the electronic devices listed above. It may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services) or as a single piece of software or software module. And is not particularly limited herein.
The server 305 may be a server providing various services, for example, receiving an information acquisition request sent by the terminal devices 301, 302, 303, and acquiring the presentation information corresponding to the information acquisition request in various ways according to the information acquisition request. And the relevant data of the presentation information is sent to the terminal devices 301, 302, 303.
It should be noted that the training method of the character recognition model provided by the embodiment of the present disclosure may be executed by a terminal device, and accordingly, the training apparatus of the character recognition model may be disposed in the terminal device 301, 302, 303. In addition, the training method of the character recognition model provided by the embodiment of the present disclosure may also be executed by the server 305, and accordingly, a training device of the character recognition model may be disposed in the server 305.
It should be understood that the number of terminal devices, networks, and servers in fig. 3 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to fig. 4, shown is a schematic diagram of an electronic device (e.g., a terminal device or a server of fig. 3) suitable for use in implementing embodiments of the present disclosure. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 4, the electronic device may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 401 that may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)402 or a program loaded from a storage means 408 into a Random Access Memory (RAM) 403. In the RAM403, various programs and data necessary for the operation of the electronic apparatus 400 are also stored. The processing device 401, the ROM 402, and the RAM403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
Generally, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 407 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 408 including, for example, tape, hard disk, etc.; and a communication device 409. The communication means 409 may allow the electronic device to communicate with other devices wirelessly or by wire to exchange data. While fig. 4 illustrates an electronic device having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 409, or from the storage device 408, or from the ROM 402. The computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure when executed by the processing device 401.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may interconnect with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: import a target training sample comprising a character image into a character recognition network to be trained to obtain a first character, wherein the annotation data of the target training sample is a second character indicated by the character image; generate a first loss value according to the second character and the first character, wherein the first loss value is used to characterize the degree of difference between the first character and the second character; generate a second loss value based on the first loss value, wherein the second loss value is used to characterize the parameter adjustment amplitude of the character recognition network to be trained; and perform parameter adjustment on the character recognition network to be trained based on the second loss value, and generate a character recognition model based on the character recognition network to be trained after the parameter adjustment.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The name of a unit does not in some cases constitute a limitation on the unit itself; for example, the importing unit may also be described as "a unit that imports a target training sample into the character recognition network to be trained".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description is only of the preferred embodiments of the present disclosure and an illustration of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to technical solutions formed by the particular combination of features described above, and also encompasses other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, technical solutions formed by replacing the above features with (but not limited to) features with similar functions disclosed in this disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
Claims (13)
1. A training method of a character recognition model is characterized by comprising the following steps:
importing a target training sample comprising a character image into a character recognition network to be trained to obtain a first character, wherein the annotation data of the target training sample is a second character indicated by the character image;
generating a first loss value according to the second character and the first character, wherein the first loss value is used for representing the difference degree between the first character and the second character;
generating a second loss value based on the first loss value, wherein the second loss value is used for representing the parameter adjustment amplitude of the character recognition network to be trained;
and adjusting parameters of the character recognition network to be trained based on the second loss value, and generating a character recognition model based on the character recognition network to be trained after parameter adjustment.
2. The method of claim 1, wherein generating a second penalty value based on the first penalty value comprises:
and transforming the first loss value to generate the second loss value.
3. The method of claim 2, wherein transforming the first loss value to generate the second loss value comprises:
and converting the first loss value according to the sample type of the target training sample to generate a second loss value, wherein the sample type is related to the character type of the characters in the sample.
4. The method of claim 3, wherein the sample type comprises at least one of: the first type and the second type, wherein the first type sample comprises an uncommon word character image, and the second type sample comprises a common word character image.
5. The method of claim 2, wherein generating the second loss value based on the first loss value comprises:
and transforming the first loss value by using a preset transformation function to generate the second loss value, wherein the independent variable of the transformation function is the first loss value, the dependent variable of the transformation function is the second loss value, and the transformation function comprises at least one coefficient.
6. The method of claim 5, wherein the coefficients in the transformation function are adjusted according to the proportion of a first type of sample in the training sample set, wherein the first type of sample comprises an image of a rare character.
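Claim 6 only states that the coefficient follows the proportion of rare-character samples; the inverse-proportion rule below is one assumed instantiation of that idea.

```python
def rare_coefficient(num_rare, num_total, floor=1.0, cap=100.0):
    # The scarcer the rare-character samples in the training set, the larger
    # the coefficient, so their contribution is compensated; `floor` and
    # `cap` bound the result and are illustrative values.
    if num_rare == 0:
        return cap
    return min(cap, max(floor, num_total / num_rare))
```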
7. The method of claim 2, wherein generating the second loss value based on the first loss value comprises:
determining a scaling factor according to the first loss value;
and determining the product of the first loss value and the scaling factor as the second loss value.
8. The method of claim 7, wherein determining a scaling factor based on the first loss value comprises:
determining a first candidate value by taking a first coefficient greater than 1 as the base and the negative of the first loss value as the exponent;
determining a second candidate value by taking the difference between a second coefficient and the first candidate value as the base and a third coefficient as the exponent;
and determining the product of the second candidate value and the third coefficient as the scaling factor.
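Transcribed literally into Python, claims 7 and 8 may read as below. The concrete values (c1 = e, c2 = 1.5, c3 = 2.0) are assumptions; the claims only require the first coefficient to exceed 1.

```python
import math

def scaling_factor(first_loss, c1=math.e, c2=1.5, c3=2.0):
    first_candidate = c1 ** (-first_loss)            # base c1 > 1, exponent = negative first loss
    second_candidate = (c2 - first_candidate) ** c3  # base = (c2 - first candidate), exponent c3
    return second_candidate * c3                     # product with the third coefficient

def second_loss(first_loss, **coeffs):
    return first_loss * scaling_factor(first_loss, **coeffs)  # claim 7
```

With these example values the factor stays below 1 for small first losses (c1 raised to a small negative power is close to 1) and rises above 1 for large ones, which is consistent with the threshold behaviour of claim 9 below.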
9. The method of claim 2, wherein generating the second loss value based on the first loss value comprises at least one of:
amplifying the first loss value in response to determining that the first loss value is greater than a first threshold;
and reducing the first loss value in response to determining that the first loss value is not greater than the first threshold.
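Claim 9 can also be realised directly as a piecewise rule; the threshold and the two factors in this sketch are illustrative assumptions.

```python
def threshold_transform(first_loss, threshold=1.0, amplify=2.0, reduce=0.5):
    # Losses above the threshold are amplified, the rest are reduced, so hard
    # samples drive larger parameter updates than easy ones.
    factor = amplify if first_loss > threshold else reduce
    return factor * first_loss
```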
10. The method according to any one of claims 1-9, further comprising:
and performing character recognition by using the generated character recognition model.
11. An apparatus for training a character recognition model, comprising:
an importing unit, configured to import a target training sample comprising a character image into a character recognition network to be trained to obtain a first character, wherein the annotation data of the target training sample is a second character indicated by the character image;
a first generating unit, configured to generate a first loss value according to the second character and the first character, wherein the first loss value is used to characterize the degree of difference between the first character and the second character;
a second generating unit, configured to generate a second loss value based on the first loss value, wherein the second loss value is used to characterize the magnitude of parameter adjustment of the character recognition network to be trained;
and an adjusting unit, configured to adjust parameters of the character recognition network to be trained based on the second loss value, and to generate a character recognition model based on the parameter-adjusted character recognition network.
12. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-10.
13. A computer-readable medium, on which a computer program is stored which, when executed by a processor, implements the method according to any one of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011227096.7A CN112364860B (en) | 2020-11-05 | 2020-11-05 | Training method and device of character recognition model and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112364860A (en) | 2021-02-12
CN112364860B CN112364860B (en) | 2024-06-25 |
Family
ID=74509605
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011227096.7A Active CN112364860B (en) | 2020-11-05 | 2020-11-05 | Training method and device of character recognition model and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112364860B (en) |
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019228358A1 (en) * | 2018-05-31 | 2019-12-05 | 华为技术有限公司 | Deep neural network training method and apparatus |
WO2019232872A1 (en) * | 2018-06-04 | 2019-12-12 | 平安科技(深圳)有限公司 | Handwritten character model training method, chinese character recognition method, apparatus, device, and medium |
US20200104635A1 (en) * | 2018-09-28 | 2020-04-02 | Konica Minolta Laboratory U.S.A., Inc. | Invertible text embedding for lexicon-free offline handwriting recognition |
CN111242273A (en) * | 2018-11-29 | 2020-06-05 | 华为终端有限公司 | Neural network model training method and electronic equipment |
CN109753966A (en) * | 2018-12-16 | 2019-05-14 | 初速度(苏州)科技有限公司 | A kind of Text region training system and method |
US20200320289A1 (en) * | 2019-04-08 | 2020-10-08 | Kyocera Document Solutions Inc. | Optical Character Recognition Training Data Generation for Neural Networks by Parsing Page Description Language Jobs |
CN113033249A (en) * | 2019-12-09 | 2021-06-25 | 中兴通讯股份有限公司 | Character recognition method, device, terminal and computer storage medium thereof |
CN111428750A (en) * | 2020-02-20 | 2020-07-17 | 商汤国际私人有限公司 | Text recognition model training and text recognition method, device and medium |
CN111667066A (en) * | 2020-04-23 | 2020-09-15 | 北京旷视科技有限公司 | Network model training and character recognition method and device and electronic equipment |
CN113971399A (en) * | 2020-07-23 | 2022-01-25 | 北京金山数字娱乐科技有限公司 | Training method and device for recognition model and text recognition method and device |
US20220405524A1 (en) * | 2021-06-17 | 2022-12-22 | International Business Machines Corporation | Optical character recognition training with semantic constraints |
CN113961666A (en) * | 2021-09-18 | 2022-01-21 | 腾讯科技(深圳)有限公司 | Keyword recognition method, apparatus, device, medium, and computer program product |
CN114596570A (en) * | 2022-03-07 | 2022-06-07 | 云知声智能科技股份有限公司 | Training method of character recognition model, character recognition method and device |
Non-Patent Citations (2)
Title |
---|
JUNYANG CAI, et al.: "TH-GAN: generative adversarial network based transfer learning for historical chinese character recognition", 2019 International Conference on Document Analysis and Recognition (ICDAR), Proceedings, pages 178-183 *
CEN, Guandong: "Research and Application of Text Detection and Recognition Algorithms Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology, no. 2, pages 138-2012 *
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113176830A (en) * | 2021-04-30 | 2021-07-27 | 北京百度网讯科技有限公司 | Recognition model training method, recognition device, electronic equipment and storage medium |
CN113792853B (en) * | 2021-09-09 | 2023-09-05 | 北京百度网讯科技有限公司 | Training method of character generation model, character generation method, device and equipment |
CN113792853A (en) * | 2021-09-09 | 2021-12-14 | 北京百度网讯科技有限公司 | Training method of character generation model, character generation method, device and equipment |
CN113792855A (en) * | 2021-09-09 | 2021-12-14 | 北京百度网讯科技有限公司 | Model training and word stock establishing method, device, equipment and storage medium |
CN113792526A (en) * | 2021-09-09 | 2021-12-14 | 北京百度网讯科技有限公司 | Training method of character generation model, character generation method, device, equipment and medium |
CN113792854A (en) * | 2021-09-09 | 2021-12-14 | 北京百度网讯科技有限公司 | Model training and word stock establishing method, device, equipment and storage medium |
CN113792854B (en) * | 2021-09-09 | 2024-02-13 | 北京百度网讯科技有限公司 | Model training and word stock building method, device, equipment and storage medium |
CN113792526B (en) * | 2021-09-09 | 2024-02-09 | 北京百度网讯科技有限公司 | Training method of character generation model, character generation method, device, equipment and medium |
CN113792855B (en) * | 2021-09-09 | 2023-06-23 | 北京百度网讯科技有限公司 | Model training and word stock building method, device, equipment and storage medium |
CN113920497A (en) * | 2021-12-07 | 2022-01-11 | 广东电网有限责任公司东莞供电局 | Nameplate recognition model training method, nameplate recognition method and related devices |
CN113920497B (en) * | 2021-12-07 | 2022-04-08 | 广东电网有限责任公司东莞供电局 | Nameplate recognition model training method, nameplate recognition method and related devices |
CN114049634A (en) * | 2022-01-12 | 2022-02-15 | 深圳思谋信息科技有限公司 | Image recognition method and device, computer equipment and storage medium |
CN114049634B (en) * | 2022-01-12 | 2022-05-13 | 深圳思谋信息科技有限公司 | Image recognition method and device, computer equipment and storage medium |
CN114386370A (en) * | 2022-01-14 | 2022-04-22 | 北京有竹居网络技术有限公司 | Font generation method, device and equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112364860B (en) | Training method and device of character recognition model and electronic equipment | |
CN110298413B (en) | Image feature extraction method and device, storage medium and electronic equipment | |
CN110826567B (en) | Optical character recognition method, device, equipment and storage medium | |
CN109961032B (en) | Method and apparatus for generating classification model | |
CN112650841A (en) | Information processing method and device and electronic equipment | |
CN110009101B (en) | Method and apparatus for generating a quantized neural network | |
CN112241761B (en) | Model training method and device and electronic equipment | |
CN111897950A (en) | Method and apparatus for generating information | |
CN113449070A (en) | Multimodal data retrieval method, device, medium and electronic equipment | |
CN113140012A (en) | Image processing method, image processing apparatus, image processing medium, and electronic device | |
CN113191257B (en) | Order of strokes detection method and device and electronic equipment | |
CN113051933B (en) | Model training method, text semantic similarity determination method, device and equipment | |
CN111783731B (en) | Method and device for extracting video features | |
CN111797822A (en) | Character object evaluation method and device and electronic equipment | |
CN111949837A (en) | Information processing method, information processing apparatus, electronic device, and storage medium | |
CN111797263A (en) | Image label generation method, device, equipment and computer readable medium | |
CN112257459B (en) | Language translation model training method, translation method, device and electronic equipment | |
CN115424060A (en) | Model training method, image classification method and device | |
CN114004229A (en) | Text recognition method and device, readable medium and electronic equipment | |
CN113051400A (en) | Method and device for determining annotation data, readable medium and electronic equipment | |
CN111680754A (en) | Image classification method and device, electronic equipment and computer-readable storage medium | |
CN113220922A (en) | Image searching method and device and electronic equipment | |
CN112418233A (en) | Image processing method, image processing device, readable medium and electronic equipment | |
CN111897951A (en) | Method and apparatus for generating information | |
CN111611420A (en) | Method and apparatus for generating image description information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |