CN111325177B - Fraction recognition method based on custom-weight target detection - Google Patents

Fraction recognition method based on custom-weight target detection

Info

Publication number
CN111325177B
CN111325177B (application CN202010144718.3A)
Authority
CN
China
Prior art keywords
character
detection
weight
representing
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010144718.3A
Other languages
Chinese (zh)
Other versions
CN111325177A (en)
Inventor
田博帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Hongsong Information Technology Co ltd
Original Assignee
Nanjing Hongsong Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Hongsong Information Technology Co ltd filed Critical Nanjing Hongsong Information Technology Co ltd
Priority to CN202010144718.3A priority Critical patent/CN111325177B/en
Publication of CN111325177A publication Critical patent/CN111325177A/en
Application granted granted Critical
Publication of CN111325177B publication Critical patent/CN111325177B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention relates to a fraction recognition method based on custom-weight target detection, comprising the following steps: (1) character labeling: each character contained in the fraction is labeled; (2) weight customization: loss weights are assigned by custom definition for fraction recognition; (3) model training: N rounds of iterative training are performed to obtain an optimized model for detecting individual fraction characters; (4) character detection: comprising character localization and character prediction, the detection result for each character in the fraction is obtained with the character-detection model trained in step (3); (5) recognition and reconstruction: the character coordinates from step (4) are sorted and the fraction expression is reconstructed and output; (6) parsing and judgment: the fraction expression output in step (5) is parsed and a judgment is given. The method addresses the accuracy of character detection and thereby improves the fraction recognition rate.

Description

Fraction recognition method based on custom-weight target detection
Technical Field
The invention relates to the technical field of image processing, and in particular to a fraction recognition method based on custom-weight target detection.
Background
Since the start of the 21st century, artificial intelligence has developed rapidly and is heading toward its heyday. As the technology continues to advance, new methods gradually replace traditional ones and offer clear advantages over traditional approaches.
In the field of text recognition, numerous automated grading products have appeared in succession, supporting the automated review of various question types, each of which involves different techniques; a complete automated grading product is therefore an integration of several complex technologies. Grading the mathematics discipline may involve recognizing and judging questions that contain fractions. Conventional fraction recognition mostly relies on segmentation: the image is cut apart to obtain the numerator and denominator, so the difficulty of the problem reduces to the accuracy of segmentation. With the introduction of deep learning, however, fraction recognition has gradually been simplified and can be solved in several ways, such as end-to-end recognition or character detection.
Here, the idea of character detection is applied to detecting the characters of a fraction, thereby achieving fraction recognition. Based on this idea, the invention proposes a fraction recognition method based on custom-weight target detection, which addresses the accuracy of character detection and thereby improves the fraction recognition rate.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a fraction recognition method based on custom-weight target detection that addresses the accuracy of character detection and improves the fraction recognition rate.
In order to solve the above technical problem, the invention adopts the following technical scheme. The fraction recognition method based on custom-weight target detection specifically comprises the following steps (sketched as a pipeline after the list):
(1) Character labeling: each character contained in the fraction is labeled first;
(2) Weight customization: loss weights are assigned by custom definition for fraction recognition;
(3) Model training: N rounds of iterative training are performed to obtain an optimized model for detecting individual fraction characters;
(4) Character detection: character detection comprises character localization and character prediction; the detection result for each character in the fraction is obtained with the character-detection model trained in step (3);
(5) Recognition and reconstruction: the character coordinates are sorted according to the detection results of step (4) and the fraction expression is reconstructed and output;
(6) Parsing and judgment: the fraction expression output in step (5) is parsed and a judgment is given.
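The six steps can be read as a single pipeline. The following sketch illustrates the flow only: every function is a stub standing in for a component detailed later in the description, and none of the names, signatures, or return values come from the patent itself.

```python
# Illustrative stubs only; placeholders for the components of steps (1)-(6).

def label_characters(images):                   # step (1): annotate digits, minus sign, fraction line
    return images

def train_detector(samples, lambda_coord=0.8):  # steps (2)-(3): custom weights, N training rounds
    return "trained-model"

def detect_characters(model, image):            # step (4): per-character coordinates + prediction
    return [("3", 50, 20), ("fraction_line", 50, 50), ("4", 50, 80)]

def reconstruct(detections):                    # step (5): sort coordinates, rebuild the expression
    return "3/4"

def judge(expression):                          # step (6): parse the expression and give a judgment
    return expression, "proper fraction"

model = train_detector(label_characters(["image"]))
print(judge(reconstruct(detect_characters(model, "image"))))
```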
As a preferred technical solution of the invention, in step (2) the loss function of the target detection algorithm YOLOv3 is adopted for detecting the fraction line, calculated as:

$$
\begin{aligned}
Loss ={} & \lambda_{coord}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\right] \\
& +\lambda_{coord}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[\left(\sqrt{w_i}-\sqrt{\hat{w}_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat{h}_i}\right)^2\right] \\
& +\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left(C_i-\hat{C}_i\right)^2+\lambda_{noobj}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{noobj}\left(C_i-\hat{C}_i\right)^2 \\
& +\sum_{i=0}^{K\times K}\mathbb{1}_{i}^{obj}\sum_{c\in classes}\left(p_i(c)-\hat{p}_i(c)\right)^2
\end{aligned}
$$

The formula comprises three partial losses: the coordinate loss, the confidence loss, and the class loss.

Here YOLOv3 divides the character image to be detected into a $K \times K$ grid with $M$ candidate boxes per cell; $(x_i, y_i)$ denotes the true center coordinates of the target and $(\hat{x}_i, \hat{y}_i)$ the center coordinates of its candidate box; $(w_i, h_i)$ denotes the true width and height of the target and $(\hat{w}_i, \hat{h}_i)$ the width and height of its candidate box; $\mathbb{1}_{ij}^{obj}$ indicates whether the $j$-th candidate box of the $i$-th cell is responsible for detecting the target ($\mathbb{1}_{ij}^{obj}=1$ if so, otherwise $\mathbb{1}_{ij}^{obj}=0$), and $\mathbb{1}_{ij}^{noobj}$ indicates that the $j$-th candidate box of the $i$-th cell is not responsible for the target; $\hat{C}_i$ denotes the confidence label, equal to 1 for a true target and 0 otherwise, and $C_i$ the predicted confidence; $p_i(c)$ denotes the true probability that the $i$-th cell belongs to class $c$ and $\hat{p}_i(c)$ the predicted probability; $classes$ denotes the set of defined target classes; $\lambda_{coord}$ is the weight coefficient for candidate targets and $\lambda_{noobj}$ the weight coefficient for the no-target case. To emphasize detection of the fraction line, $\lambda_{coord}$ is increased and set to 0.8.
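As a reading aid, the loss above can be sketched in NumPy. This is a minimal illustration, not the patent's implementation: the tensor layout (a (K*K, M, 5+C) array of [x, y, w, h, confidence, class probabilities]), the mask shapes, and the value of lambda_noobj are all assumptions, and the class loss is accumulated over responsible boxes rather than per cell for brevity.

```python
import numpy as np

LAMBDA_COORD = 0.8   # increased per the text to emphasize the fraction line
LAMBDA_NOOBJ = 0.5   # assumed value; the text does not specify lambda_noobj

def yolo_loss(pred, true, obj_mask):
    """pred, true: (K*K, M, 5+C) arrays of [x, y, w, h, conf, classes...];
    obj_mask: (K*K, M), 1 where box j of cell i is responsible for a target."""
    noobj_mask = 1.0 - obj_mask
    # coordinate loss: centers plus square-rooted widths/heights
    xy = ((pred[..., 0:2] - true[..., 0:2]) ** 2).sum(-1)
    wh = ((np.sqrt(pred[..., 2:4]) - np.sqrt(true[..., 2:4])) ** 2).sum(-1)
    coord = LAMBDA_COORD * (obj_mask * (xy + wh)).sum()
    # confidence loss, split between responsible and non-responsible boxes
    conf_err = (pred[..., 4] - true[..., 4]) ** 2
    conf = (obj_mask * conf_err).sum() + LAMBDA_NOOBJ * (noobj_mask * conf_err).sum()
    # class loss over responsible boxes (simplifying the per-cell sum above)
    cls = (obj_mask[..., None] * (pred[..., 5:] - true[..., 5:]) ** 2).sum()
    return coord + conf + cls

K, M, C = 13, 3, 12   # 12 classes: digits 0-9, minus sign, fraction line
pred = np.random.rand(K * K, M, 5 + C)
true = np.random.rand(K * K, M, 5 + C)
mask = (np.random.rand(K * K, M) > 0.9).astype(float)
print(yolo_loss(pred, true, mask))
```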
As a preferred embodiment of the invention, the characters in step (1) comprise the digits 0-9, the minus sign, and the fraction line.
As a preferred technical solution of the invention, during model training in step (3), N rounds of iterative training are performed; in each round the batch size (the number of samples per training step) is set according to the actual memory of the graphics card, and once the loss value has remained stable, training is stopped promptly, yielding the trained character-detection model.
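The stopping rule in step (3), stop once the loss has stabilized, can be made concrete as a plateau check. A hedged sketch follows: the window size, tolerance, and simulated loss curve are all illustrative assumptions, since the patent specifies only "N rounds of iteration" and stopping when the loss continuously reaches a stable state.

```python
def loss_has_plateaued(history, window=5, tol=1e-3):
    """True once the last `window` loss values vary by less than `tol`."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return max(recent) - min(recent) < tol

if __name__ == "__main__":
    import random
    # simulated per-epoch losses standing in for real YOLOv3 training rounds;
    # in practice the batch size of each round is chosen to fit GPU memory
    losses = [1.0 / (epoch + 1) + random.uniform(0.0, 1e-4) for epoch in range(200)]
    history = []
    for epoch_loss in losses:
        history.append(epoch_loss)
        if loss_has_plateaued(history):
            break
    print(f"stopped after {len(history)} rounds at loss {history[-1]:.4f}")
```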
As a preferred embodiment of the invention, the detection result in step (4) comprises the coordinates of each character and the prediction for each character.
As a preferred technical solution of the invention, parsing the output fraction expression and giving a judgment in step (6) includes judging whether the result is a mixed number or a proper or improper fraction, or parsing it into a specific form such as LaTeX or a custom format.
As a preferred technical solution of the invention, the coordinate loss in step (2) is used to frame the fraction line correctly and to judge whether the framing deviates; the class loss is used to judge whether the fraction line is classified correctly; the confidence loss represents the degree of confidence that the computed fraction-line value is its true value.
Compared with the prior art, the invention has the following beneficial effects: the fraction recognition method based on custom-weight target detection customizes the weights of the loss function in the target detection network, enforcing the weight-assignment principle of "large weights for hard-to-detect characters, small weights for easy-to-detect characters" so that fraction characters are detected accurately; this addresses the accuracy of character detection and thereby improves the fraction recognition rate.
Drawings
The technical scheme of the invention is further described below with reference to the accompanying drawings:
FIG. 1 is a flow chart of the fraction recognition method based on custom-weight target detection of the present invention;
FIG. 2 shows a fraction expression in the fraction recognition method based on custom-weight target detection of the present invention.
Detailed Description
The present invention will be further described in detail below with reference to the drawings and embodiments, which serve only to illustrate the invention and are not to be construed as limiting its scope.
Embodiment: as shown in FIG. 1, the fraction recognition method based on custom-weight target detection specifically comprises the following steps:
As shown in FIG. 2, (1) character labeling: each character contained in the fraction is labeled first;
the characters in step (1) comprise the digits 0-9, the minus sign, and the fraction line, twelve character classes in total;
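For concreteness, the twelve classes can be written out as a label map. A minimal sketch, assuming an arbitrary index order and hypothetical class names (the patent fixes neither):

```python
# digits 0-9, minus sign, fraction line: 12 detection classes in total
CLASSES = [str(d) for d in range(10)] + ["minus", "fraction_line"]
CLASS_TO_ID = {name: idx for idx, name in enumerate(CLASSES)}
assert len(CLASSES) == 12
```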
(2) Weight customization: loss weights are assigned by custom definition for fraction recognition. Weight customization means assigning weights manually; in fraction recognition it is mainly used to define the loss weight of the fraction line, which, as the unique marker distinguishing a fraction, plays a decisive role. In the actual detection process the fraction line is harder to detect than the other characters and therefore demands stricter treatment: compared with easy-to-detect targets, increasing its loss weight when computing the difference between the predicted value and the true value improves the ability to detect it;
in step (2), the loss function of the target detection algorithm YOLOv3 is used for detecting the fraction line, calculated as:

$$
\begin{aligned}
Loss ={} & \lambda_{coord}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\right] \\
& +\lambda_{coord}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[\left(\sqrt{w_i}-\sqrt{\hat{w}_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat{h}_i}\right)^2\right] \\
& +\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left(C_i-\hat{C}_i\right)^2+\lambda_{noobj}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{noobj}\left(C_i-\hat{C}_i\right)^2 \\
& +\sum_{i=0}^{K\times K}\mathbb{1}_{i}^{obj}\sum_{c\in classes}\left(p_i(c)-\hat{p}_i(c)\right)^2
\end{aligned}
$$

the formula comprises three partial losses: the coordinate loss, the confidence loss, and the class loss;

where YOLOv3 divides the character image to be detected into a $K \times K$ grid with $M$ candidate boxes per cell; $(x_i, y_i)$ denotes the true center coordinates of the target and $(\hat{x}_i, \hat{y}_i)$ the center coordinates of its candidate box; $(w_i, h_i)$ denotes the true width and height of the target and $(\hat{w}_i, \hat{h}_i)$ the width and height of its candidate box; $\mathbb{1}_{ij}^{obj}$ indicates whether the $j$-th candidate box of the $i$-th cell is responsible for detecting the target ($\mathbb{1}_{ij}^{obj}=1$ if so, otherwise $\mathbb{1}_{ij}^{obj}=0$), and $\mathbb{1}_{ij}^{noobj}$ indicates that the $j$-th candidate box of the $i$-th cell is not responsible for the target; $\hat{C}_i$ denotes the confidence label, equal to 1 for a true target and 0 otherwise, and $C_i$ the predicted confidence; $p_i(c)$ denotes the true probability that the $i$-th cell belongs to class $c$ and $\hat{p}_i(c)$ the predicted probability; $classes$ denotes the set of defined target classes; $\lambda_{coord}$ is the weight coefficient for candidate targets and $\lambda_{noobj}$ the weight coefficient for the no-target case; to emphasize detection of the fraction line, $\lambda_{coord}$ is increased and set to 0.8;

the coordinate loss in step (2) is used to frame the fraction line correctly and to judge whether the framing deviates; the class loss is used to judge whether the fraction line is classified correctly; the confidence loss represents the degree of confidence that the computed fraction-line value is its true value; the loss weight of the fraction line can be modified here in a customized manner and is generally set to a larger value, so that a larger penalty is incurred when the loss is computed, improving the detection and prediction of the fraction line;
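One hedged way to realize the customized fraction-line weight described above is a per-class weight table consulted when the class loss is computed. The numeric weight 2.0 and the table form are illustrative assumptions; the patent states only that the fraction line's weight is set to a larger value.

```python
# per-class loss weights: "hard to detect -> large weight, easy -> small weight"
CLASS_WEIGHTS = {str(d): 1.0 for d in range(10)}
CLASS_WEIGHTS.update({"minus": 1.0, "fraction_line": 2.0})  # assumed values

def weighted_class_loss(squared_errors):
    """squared_errors: mapping class name -> summed squared class error."""
    return sum(CLASS_WEIGHTS[name] * err for name, err in squared_errors.items())

print(weighted_class_loss({"3": 0.10, "fraction_line": 0.10}))  # line penalized 2x
```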
(3) Model training: N rounds of iterative training are performed to obtain an optimized model for detecting individual fraction characters;
during model training in step (3), N rounds of iterative training are performed; in each round the batch size is set according to the actual memory of the graphics card, and once the loss value has remained stable, training is stopped promptly, yielding the optimal character-detection model;
(4) Character detection: character detection comprises character localization and character prediction; the detection result for each character in the fraction is obtained with the character-detection model trained in step (3); the detection result in step (4) comprises the coordinates of each character and the prediction for each character;
(5) Recognition and reconstruction: the character coordinates are sorted according to the detection results of step (4) and the fraction expression is reconstructed and output (a reconstruction sketch follows);
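A hedged sketch of the reconstruction in step (5) follows. A detection is assumed to be a (label, x_center, y_center) triple in image coordinates (y grows downward), and the grouping rule, numerator above the fraction line and denominator below, each read left to right, is an illustration consistent with FIG. 2 rather than the patent's literal code.

```python
def reconstruct_fraction(detections):
    """detections: list of (label, x_center, y_center) for one fraction."""
    bar = next(d for d in detections if d[0] == "fraction_line")
    chars = [d for d in detections if d[0] != "fraction_line"]
    read = lambda group: "".join(
        lbl for lbl, _, _ in sorted(group, key=lambda d: d[1]))  # left to right
    numerator = read([d for d in chars if d[2] < bar[2]])    # above the line
    denominator = read([d for d in chars if d[2] > bar[2]])  # below the line
    return numerator, denominator

# e.g. a detected "3/4": the digit 3 sits above the fraction line, 4 below
print(reconstruct_fraction([("fraction_line", 50, 50), ("3", 50, 20), ("4", 50, 80)]))
```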
(6) Parsing and judgment: the fraction expression output in step (5) is parsed and a judgment is given; parsing the output expression and giving a judgment in step (6) includes judging whether the result is a mixed number or a proper or improper fraction, or parsing it into a specific form such as LaTeX or a custom format.
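The judgment in step (6) can likewise be sketched. The proper/improper test below uses the usual definition (the fraction is proper when |numerator| < |denominator|), and the LaTeX rendering is one of the output forms the text mentions; the function name and signature are illustrative assumptions.

```python
def parse_and_judge(numerator: str, denominator: str):
    """Return (judgment, LaTeX form) for a reconstructed simple fraction."""
    n, d = int(numerator), int(denominator)
    kind = "proper fraction" if abs(n) < abs(d) else "improper fraction"
    return kind, rf"\frac{{{numerator}}}{{{denominator}}}"

print(parse_and_judge("3", "4"))   # ('proper fraction', '\\frac{3}{4}')
```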
It will be apparent to those skilled in the art that the invention is described above by way of illustration only and is not limited to the embodiments described; insubstantial modifications of the method concepts and technical schemes of the invention, as well as direct applications of those concepts and schemes to other contexts without modification, all fall within the protection scope of the invention.

Claims (6)

1. A fraction recognition method based on custom-weight target detection, characterized by comprising the following steps:
(1) character labeling: each character contained in the fraction is labeled first;
(2) weight customization: loss weights are assigned by custom definition for fraction recognition;
(3) model training: N rounds of iterative training are performed to obtain an optimized model for detecting individual fraction characters;
(4) character detection: character detection comprises character localization and character prediction; the detection result for each character in the fraction is obtained with the character-detection model trained in step (3);
(5) recognition and reconstruction: the character coordinates are sorted according to the detection results of step (4) and the fraction expression is reconstructed and output;
(6) parsing and judgment: the fraction expression output in step (5) is parsed and a judgment is given;
in step (2), the loss function of the target detection algorithm YOLOv3 is used for detecting the fraction line, calculated as:

$$
\begin{aligned}
Loss ={} & \lambda_{coord}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\right] \\
& +\lambda_{coord}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[\left(\sqrt{w_i}-\sqrt{\hat{w}_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat{h}_i}\right)^2\right] \\
& +\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left(C_i-\hat{C}_i\right)^2+\lambda_{noobj}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{noobj}\left(C_i-\hat{C}_i\right)^2 \\
& +\sum_{i=0}^{K\times K}\mathbb{1}_{i}^{obj}\sum_{c\in classes}\left(p_i(c)-\hat{p}_i(c)\right)^2
\end{aligned}
$$

the formula comprises three partial losses: the coordinate loss, the confidence loss, and the class loss; YOLOv3 divides the character image to be detected into a $K \times K$ grid with $M$ candidate boxes per cell; $(x_i, y_i)$ denotes the true center coordinates of the target and $(\hat{x}_i, \hat{y}_i)$ the center coordinates of its candidate box; $(w_i, h_i)$ denotes the true width and height of the target and $(\hat{w}_i, \hat{h}_i)$ the width and height of its candidate box; $\mathbb{1}_{ij}^{obj}$ indicates whether the $j$-th candidate box of the $i$-th cell is responsible for detecting the target ($\mathbb{1}_{ij}^{obj}=1$ if so, otherwise $\mathbb{1}_{ij}^{obj}=0$), and $\mathbb{1}_{ij}^{noobj}$ indicates that the $j$-th candidate box of the $i$-th cell is not responsible for the target; $\hat{C}_i$ denotes the confidence label, equal to 1 for a true target and 0 otherwise, and $C_i$ the predicted confidence; $\lambda_{coord}$ is the weight coefficient for candidate targets and $\lambda_{noobj}$ the weight coefficient for the no-target case; to emphasize detection of the fraction line, $\lambda_{coord}$ is increased and set to 0.8; $p_i(c)$ denotes the true probability that the $i$-th cell belongs to class $c$, $\hat{p}_i(c)$ the predicted probability, and $classes$ the set of defined target classes.
2. The fraction recognition method based on custom-weight target detection according to claim 1, characterized in that the characters in step (1) comprise the digits 0-9, the minus sign, and the fraction line.
3. The fraction recognition method based on custom-weight target detection according to claim 1, characterized in that during model training in step (3), N rounds of iterative training are performed; in each round the batch size is set according to the actual memory of the graphics card, and once the loss value has remained stable, training is stopped promptly, yielding the trained character-detection model.
4. The fraction recognition method based on custom-weight target detection according to claim 3, characterized in that the detection result in step (4) comprises the coordinates of each character and the prediction for each character.
5. The fraction recognition method based on custom-weight target detection according to claim 3, characterized in that parsing the output fraction expression and giving a judgment in step (6) comprises judging whether the result is a mixed number or a proper or improper fraction, or parsing it into a specific form such as LaTeX or a custom format.
6. The fraction recognition method based on custom-weight target detection according to claim 3, characterized in that the coordinate loss in step (2) is used to frame the fraction line correctly and to judge whether the framing deviates; the class loss is used to judge whether the fraction line is classified correctly; and the confidence loss represents the degree of confidence that the computed fraction-line value is its true value.
CN202010144718.3A 2020-03-04 2020-03-04 Fraction recognition method based on custom-weight target detection Active CN111325177B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010144718.3A CN111325177B (en) 2020-03-04 2020-03-04 Fraction recognition method based on custom-weight target detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010144718.3A CN111325177B (en) 2020-03-04 2020-03-04 Fraction recognition method based on custom-weight target detection

Publications (2)

Publication Number Publication Date
CN111325177A CN111325177A (en) 2020-06-23
CN111325177B true CN111325177B (en) 2023-05-12

Family

ID=71173124

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010144718.3A Active CN111325177B (en) Fraction recognition method based on custom-weight target detection

Country Status (1)

Country Link
CN (1) CN111325177B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111723777A (en) * 2020-07-07 2020-09-29 广州织点智能科技有限公司 Method and device for judging commodity taking and placing process, intelligent container and readable storage medium
CN112651353B (en) * 2020-12-30 2024-04-16 南京红松信息技术有限公司 Target calculation positioning and identifying method based on custom label
CN112801046B (en) * 2021-03-19 2021-08-06 北京世纪好未来教育科技有限公司 Image processing method, image processing device, electronic equipment and computer storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101789073B (en) * 2009-01-22 2013-06-26 富士通株式会社 Character recognition device and character recognition method thereof
CN105205448B (en) * 2015-08-11 2019-03-15 中国科学院自动化研究所 Text region model training method and recognition methods based on deep learning
CN109635875A (en) * 2018-12-19 2019-04-16 浙江大学滨海产业技术研究院 A kind of end-to-end network interface detection method based on deep learning
CN110765865B (en) * 2019-09-18 2022-06-28 北京理工大学 Underwater target detection method based on improved YOLO algorithm

Also Published As

Publication number Publication date
CN111325177A (en) 2020-06-23

Similar Documents

Publication Publication Date Title
CN111325177B (en) Fraction recognition method based on custom-weight target detection
CN108985334B (en) General object detection system and method for improving active learning based on self-supervision process
CN109977780A (en) A kind of detection and recognition methods of the diatom based on deep learning algorithm
CN113298151A (en) Remote sensing image semantic description method based on multi-level feature fusion
CN112307777B (en) Knowledge graph representation learning method and system
CN111369535B (en) Cell detection method
CN108009571A (en) A kind of semi-supervised data classification method of new direct-push and system
CN112949517B (en) Plant stomata density and opening degree identification method and system based on deep migration learning
CN108959474A (en) Entity relationship extracting method
CN109947923A (en) A kind of elementary mathematics topic type extraction method and system based on term vector
CN113723083A (en) Weighted negative supervision text emotion analysis method based on BERT model
CN111461121A (en) Electric meter number identification method based on YOLOV3 network
CN113657098B (en) Text error correction method, device, equipment and storage medium
CN106448660A (en) Natural language fuzzy boundary determining method with introduction of big data analysis
CN111797935B (en) Semi-supervised depth network picture classification method based on group intelligence
CN116958548A (en) Pseudo tag self-distillation semantic segmentation method based on category statistics driving
Azizah et al. Tajweed-YOLO: Object Detection Method for Tajweed by Applying HSV Color Model Augmentation on Mushaf Images
CN112035646A (en) Key content extraction method
CN114120367B (en) Pedestrian re-recognition method and system based on circle loss measurement under meta-learning framework
CN115049870A (en) Target detection method based on small sample
CN112329389B (en) Chinese character stroke automatic extraction method based on semantic segmentation and tabu search
CN112232681A (en) Intelligent paper marking method for computational analysis type non-selection questions
CN110033037A (en) A kind of recognition methods of digital instrument reading
CN117236409B (en) Small model training method, device and system based on large model and storage medium
CN117114004B (en) Door control deviation correction-based few-sample two-stage named entity identification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant