CN111325177B - Weight-customized-based target detection partial identification method - Google Patents
Weight-customized-based target detection partial identification method
- Publication number: CN111325177B
- Application number: CN202010144718.3A
- Authority
- CN
- China
- Legal status: Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/413—Classification of content, e.g. text, photographs or tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to a fraction recognition method based on target detection with customized weights, which comprises the following steps: (1) character labeling: first, label each character contained in the fraction; (2) weight customization: allocate loss weights in a customized manner and perform fraction recognition; (3) model training: perform N rounds of iterative training to obtain an optimized training model for detecting individual fraction characters; (4) character detection: character detection comprises character localization and character prediction; the detection result for each character in the fraction is obtained through the trained character detection model from step (3); (5) recognition and reconstruction: sort the character coordinates according to the detection results of step (4) and reconstruct the fraction expression for output; (6) parsing and judgment: parse the fraction expression output in step (5) and give a judgment. The method improves the accuracy of character detection and thereby the recognition rate of fraction expressions.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to a fraction recognition method based on target detection with customized weights.
Background
In the 21st century, artificial intelligence technology has developed rapidly and is entering its golden age. As the technology continues to advance, new techniques gradually replace traditional ones and offer clear advantages over traditional methods.
In the field of text recognition, numerous automated grading products have emerged, supporting the automatic review of various question types, each involving different techniques; a complete automated grading product is therefore an integration of multiple complex technologies. Grading in mathematics may involve recognizing and judging questions that contain fractions. Conventional fraction recognition mostly relies on segmentation, aiming to extract the numerator and denominator, which reduces the problem to segmentation accuracy. With the introduction of deep learning, however, fraction recognition has been gradually simplified and can be solved in several ways, for example end-to-end recognition or character detection.
Here, the idea of character detection is applied to the characters of a fraction so as to achieve fraction recognition. Based on this idea, the invention provides a fraction recognition method based on target detection with customized weights, which improves the accuracy of character detection and thereby the recognition rate of fraction expressions.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a fraction recognition method based on target detection with customized weights, which improves the accuracy of character detection and thereby the fraction recognition rate.
To solve this technical problem, the invention adopts the following technical scheme: the fraction recognition method based on target detection with customized weights specifically comprises the following steps:
(1) Character labeling: first, label each character contained in the fraction;
(2) Weight customization: allocate loss weights in a customized manner and perform fraction recognition;
(3) Model training: perform N rounds of iterative training to obtain an optimized training model for detecting individual fraction characters;
(4) Character detection: character detection comprises character localization and character prediction; the detection result for each character in the fraction is obtained through the trained character detection model from step (3);
(5) Recognition and reconstruction: sort the character coordinates according to the detection results of step (4) and reconstruct the fraction expression for output;
(6) Parsing and judgment: parse the fraction expression output in step (5) and give a judgment.
As a preferable technical scheme of the invention, the loss function of the target detection algorithm YoLo V3 is adopted in the step (2) for detecting the score line, and a calculation formula is as follows:
The formula comprises three parts: coordinate loss, confidence loss, and class loss:

$$
\begin{aligned}
Loss ={}& \lambda_{coord}\sum_{i=0}^{K^2}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\right] \\
&+\lambda_{coord}\sum_{i=0}^{K^2}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[\left(\sqrt{w_i}-\sqrt{\hat{w}_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat{h}_i}\right)^2\right] \\
&+\sum_{i=0}^{K^2}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left(C_i-\hat{C}_i\right)^2
+\lambda_{noobj}\sum_{i=0}^{K^2}\sum_{j=0}^{M}\mathbb{1}_{ij}^{noobj}\left(C_i-\hat{C}_i\right)^2 \\
&+\sum_{i=0}^{K^2}\mathbb{1}_{i}^{obj}\sum_{c\in classes}\left(p_i(c)-\hat{p}_i(c)\right)^2
\end{aligned}
$$

wherein YOLOv3 divides the character image to be detected into $K\times K$ grid cells, each with $M$ candidate boxes; $(x_i,y_i)$ are the true center coordinates of the target and $(\hat{x}_i,\hat{y}_i)$ the center coordinates of the candidate box; $(w_i,h_i)$ are the true width and height of the target and $(\hat{w}_i,\hat{h}_i)$ the width and height of the candidate box; $\mathbb{1}_{ij}^{obj}$ indicates whether the $j$-th candidate box of the $i$-th grid cell is responsible for detecting the target ($1$ if so, $0$ otherwise), and $\mathbb{1}_{ij}^{noobj}$ indicates that the $j$-th candidate box of the $i$-th grid cell is not responsible for the target; $\hat{C}_i$ is the confidence label ($1$ for a true target, $0$ otherwise) and $C_i$ the predicted confidence; $p_i(c)$ is the true probability that the $i$-th grid cell contains class $c$ and $\hat{p}_i(c)$ the predicted probability; $classes$ is the set of defined target classes; $\lambda_{coord}$ is the weight coefficient for candidate targets and $\lambda_{noobj}$ the weight coefficient for boxes without a target. To emphasize detection of the fraction line, $\lambda_{coord}$ is increased and set to $0.8$.
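To make the weighted loss concrete, the following is a minimal Python sketch of the three-part loss described above. The data layout (one dict per grid cell and candidate box) and the value of LAMBDA_NOOBJ are illustrative assumptions; only the raised coordinate weight comes from the text.

```python
import math

# Loss weights as described above: lambda_coord is raised to emphasise
# the hard-to-detect fraction line; lambda_noobj down-weights empty boxes.
LAMBDA_COORD = 0.8
LAMBDA_NOOBJ = 0.5  # assumed value; the text only specifies lambda_coord

def box_loss(pred, true):
    """Squared coordinate error for one responsible candidate box.

    pred/true: dicts with keys x, y, w, h (centre coords, width, height).
    Width and height use square roots, as in the YOLO loss, so that errors
    on small boxes (such as a thin fraction line) weigh relatively more.
    """
    xy = (pred["x"] - true["x"]) ** 2 + (pred["y"] - true["y"]) ** 2
    wh = (math.sqrt(pred["w"]) - math.sqrt(true["w"])) ** 2 \
       + (math.sqrt(pred["h"]) - math.sqrt(true["h"])) ** 2
    return xy + wh

def total_loss(cells):
    """Sum coordinate, confidence and class losses over all grid cells.

    cells: list of dicts, one per (grid cell, candidate box), with keys:
      obj (bool)         - whether this box is responsible for a target
      pred_box, true_box - coordinate dicts (used only when obj is True)
      conf, conf_label   - predicted confidence and its 0/1 label
      probs, true_probs  - per-class predicted / true probabilities
    """
    loss = 0.0
    for cell in cells:
        if cell["obj"]:
            loss += LAMBDA_COORD * box_loss(cell["pred_box"], cell["true_box"])
            loss += (cell["conf"] - cell["conf_label"]) ** 2
            loss += sum((p - t) ** 2
                        for p, t in zip(cell["probs"], cell["true_probs"]))
        else:
            loss += LAMBDA_NOOBJ * (cell["conf"] - cell["conf_label"]) ** 2
    return loss
```

Raising LAMBDA_COORD increases the penalty on coordinate errors, which is what the method relies on to pin down the thin fraction line.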
As a preferred embodiment of the present invention, the characters in step (1) comprise digits, the minus sign, and the fraction line, where the digits comprise 0 to 9.
As a preferred technical scheme of the invention, when the model is trained in step (3), N rounds of iterative training are performed; in each round, the batch size (the number of samples per training step) is set according to the actual memory of the graphics card, and when the loss value has settled into a stable state, training is stopped promptly, yielding the trained model for character detection.
As a preferred embodiment of the present invention, the detection result in step (4) comprises the coordinate values of each character and the predicted value of each character.
As a preferred technical scheme of the invention, parsing the output fraction expression and giving a judgment in step (6) includes judging whether the result is a mixed number or a proper or improper fraction, or parsing it into a specific form such as LaTeX or a custom format.
As a preferred technical solution of the present invention, the coordinate loss in step (2) is used to frame the fraction line correctly and to judge whether the bounding box deviates; the class loss is used to judge whether the fraction line is classified correctly; the confidence loss represents the degree of confidence that the computed fraction line is a true fraction line.
Compared with the prior art, the invention has the following beneficial effects: the fraction recognition method based on target detection with customized weights customizes the weights of the loss function in the target detection network, enforcing the weight allocation principle of "large weights for hard-to-detect characters, small weights for easy-to-detect characters" so as to detect fraction characters accurately; it improves the accuracy of character detection and thereby the recognition rate of fraction expressions.
Drawings
The technical scheme of the invention is further described below with reference to the accompanying drawings:
FIG. 1 is a flow chart of the fraction recognition method based on target detection with customized weights according to the present invention;
FIG. 2 shows a fraction expression used in the fraction recognition method based on target detection with customized weights according to the present invention.
Detailed Description
The present invention will be further described in detail with reference to the drawings and embodiments, which serve only to illustrate the invention and are not to be construed as limiting its scope.
Embodiment: as shown in FIG. 1, the fraction recognition method based on target detection with customized weights specifically comprises the following steps:
As shown in FIG. 2, (1) character labeling: first, label each character contained in the fraction;
The characters in step (1) comprise digits, the minus sign, and the fraction line, where the digits comprise 0 to 9, giving 12 character classes in total;
(2) Weight customization: loss weights are allocated in a customized manner and fraction recognition is performed. Customizing weights means allocating them manually; in fraction recognition this mainly concerns the loss weight of the fraction line, which plays a decisive role as the unique mark distinguishing a fraction. In actual detection the fraction line is harder to detect than the other characters and is therefore subject to stricter requirements; compared with an easily detected target, increasing the weight of its loss term, the penalty applied when the difference between predicted and true values is computed, improves the ability to detect it;
in the step (2), a loss function of a target detection algorithm YoLo V3 is used for detecting the score line, and a calculation formula is as follows:
the formula includes three partial loss calculations, respectively: coordinate loss, confidence loss, and category loss;
wherein YoLo V3 gridding the character image to be detected into K x K grids each having M candidate frames, (x) i ,y i ) Representing the central coordinates of the object,candidate frame center coordinates representing the object, (w) i ,h i ) Watch (watch)True width and height of the target, +.>Width and height of candidate box representing object, +.>Whether the jth candidate box representing the ith grid is responsible for detecting the target, if so +.>Otherwise-> The jth candidate box representing the ith grid is not responsible for the target; />Representing a confidence label, wherein the confidence label is truly 1, and is conversely 0; c (C) i Representing a prediction confidence; p is p i (c) Representing the true probability of the ith grid being class c,/->Representing the prediction probability of the ith grid as class c; class represents defined target categories; lambda (lambda) coord Weighting coefficients, lambda, representing candidate targets noobj A weight coefficient indicating no target; to pay attention to the detection of the score line, the lambda is adjusted to be increased coord Is set as: 0.8;
The coordinate loss in step (2) is used to judge whether the fraction line can be framed correctly and whether the bounding box deviates; the class loss is used to judge whether the fraction line is classified correctly; the confidence loss represents the degree of confidence that the computed fraction line is a true one. Here, the weight value for the fraction line can be modified in a customized manner; it is generally set to a larger value so that a larger penalty is incurred when the loss is computed, improving the detection and prediction of the fraction line;
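The per-character weighting principle just described can be sketched in a few lines of Python. The specific weight values and class names below are hypothetical illustrations; the text only prescribes that hard-to-detect characters, above all the fraction line, receive larger loss weights.

```python
# Hypothetical per-class loss weights: the fraction line, as the hardest
# character to detect and the decisive mark of a fraction, gets the
# largest weight; digits are comparatively easy and get the baseline.
CLASS_WEIGHTS = {str(d): 1.0 for d in range(10)}  # digits 0-9
CLASS_WEIGHTS["minus"] = 1.2       # assumed value
CLASS_WEIGHTS["frac_line"] = 2.0   # assumed value, deliberately largest

def weighted_class_loss(label, pred_prob, true_prob):
    """Squared class-probability error scaled by the per-class weight,
    so misclassifying a fraction line is penalised more heavily."""
    return CLASS_WEIGHTS[label] * (pred_prob - true_prob) ** 2
```

With this scheme, an identical probability error costs twice as much on the fraction line as on a digit, steering training toward the character that determines whether a fraction is recognized at all.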
(3) Model training: perform N rounds of iterative training to obtain an optimized training model for detecting individual fraction characters;
When the model is trained in step (3), N rounds of iterative training are performed; in each round, the batch size is set according to the actual memory of the graphics card, and when the loss value has settled into a stable state, training is stopped promptly, yielding an optimal training model for character detection;
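The stopping rule of step (3), halting once the loss has stayed stable, can be sketched as follows. The `fit_one_epoch` interface, `patience` window, and `tol` threshold are assumed names and values for illustration, not specified by the method.

```python
def train(model, loader, epochs, patience=5, tol=1e-3):
    """Iterative training with a plateau-based stop: halt once the loss
    has varied by less than `tol` over the last `patience` epochs.

    `model.fit_one_epoch(loader)` is an assumed interface that runs one
    training epoch and returns its loss value.
    """
    history = []
    for _ in range(epochs):
        loss = model.fit_one_epoch(loader)
        history.append(loss)
        if len(history) > patience:
            recent = history[-patience:]
            if max(recent) - min(recent) < tol:  # loss has plateaued
                break
    return model, history
```

Stopping at the plateau rather than running all N rounds saves training time and avoids fitting noise once the model has converged.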
(4) Character detection: character detection comprises character localization and character prediction; the detection result for each character in the fraction is obtained through the trained character detection model from step (3). The detection result in step (4) comprises the coordinate values of each character and the predicted value of each character;
(5) Recognition and reconstruction: sort the character coordinates according to the detection results of step (4) and reconstruct the fraction expression for output;
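Step (5) can be sketched as follows: sort the detections by coordinate and split the characters around the fraction line. The tuple format and the "frac" label are assumptions for illustration; the method does not fix a data format.

```python
def reconstruct_fraction(detections):
    """Rebuild a fraction expression from per-character detections:
    characters above the fraction line form the numerator, those below
    form the denominator, each read left to right.

    detections: list of (label, x_center, y_center) tuples, where the
    fraction line is labelled "frac" (an assumed convention).
    """
    _, _, line_y = next(d for d in detections if d[0] == "frac")
    above = sorted((d for d in detections if d[0] != "frac" and d[2] < line_y),
                   key=lambda d: d[1])  # numerator: left to right
    below = sorted((d for d in detections if d[0] != "frac" and d[2] > line_y),
                   key=lambda d: d[1])  # denominator: left to right
    numerator = "".join(d[0] for d in above)
    denominator = "".join(d[0] for d in below)
    return f"{numerator}/{denominator}"
```

This simple vertical split suffices for a single fraction; nested or mixed-number layouts would need the same idea applied recursively.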
(6) Parsing and judgment: parse the fraction expression output in step (5) and give a judgment. Parsing the output fraction expression and giving a judgment in step (6) includes judging whether the result is a mixed number or a proper or improper fraction, or parsing it into a specific form such as LaTeX or a custom format.
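The judgment of step (6) can be sketched for the simple proper/improper case as follows. The "num/den" input format and the function name are assumptions; handling mixed numbers would need an extra integer part, omitted here.

```python
def classify_fraction(expr):
    """Judge a reconstructed "num/den" expression as a proper or improper
    fraction and render it in LaTeX form, as in step (6)."""
    num_s, den_s = expr.split("/")
    num, den = int(num_s), int(den_s)
    kind = "proper" if abs(num) < abs(den) else "improper"
    latex = rf"\frac{{{num_s}}}{{{den_s}}}"  # {{ escapes a literal brace
    return kind, latex
```

The same parse step could instead emit a custom format; LaTeX is shown because the text names it as one target form.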
It will be apparent to those skilled in the art that the invention is described above only by way of illustration and is not limited to the embodiments described; any insubstantial modification of the method concept and technical scheme of the invention, and any direct application of the inventive concept and scheme to other fields without modification, fall within the scope of protection of the invention.
Claims (6)
1. A fraction recognition method based on target detection with customized weights, characterized by comprising the following steps:
(1) Character labeling: first, label each character contained in the fraction;
(2) Weight customization: allocate loss weights in a customized manner and perform fraction recognition;
(3) Model training: perform N rounds of iterative training to obtain an optimized training model for detecting individual fraction characters;
(4) Character detection: character detection comprises character localization and character prediction; the detection result for each character in the fraction is obtained through the trained character detection model from step (3);
(5) Recognition and reconstruction: sort the character coordinates according to the detection results of step (4) and reconstruct the fraction expression for output;
(6) Parsing and judgment: parse the fraction expression output in step (5) and give a judgment;
wherein in step (2), the loss function of the target detection algorithm YOLOv3 is used for detecting the fraction line, calculated as follows:
the formula includes three partial loss calculations, respectively: coordinate loss, confidence loss, and category loss; wherein YoLo V3 gridding the character image to be detected into K x K grids each having M candidate frames, (x) i ,y i ) Representing the central coordinates of the object,candidate frame center coordinates representing the object, (w) i ,h i ) Representing the true width and height of the target, +.>Width and height of candidate box representing object, +.>The j candidate box representing the i-th grid isWhether or not to take charge of detecting the target, if soOtherwise-> The jth candidate box representing the ith grid is not responsible for the target; />Representing a confidence label, wherein the confidence label is truly 1, and is conversely 0; lambda (lambda) coord Weighting coefficients, lambda, representing candidate targets noobj A weight coefficient indicating no target; to pay attention to the detection of the score line, the lambda is adjusted to be increased coord Is set as: 0.8; c (C) i Representing prediction confidence, p i (c) Representing the true probability of the ith grid being class c,/->Representing the predicted probability of class c for the ith grid, class represents the defined target class.
2. The fraction recognition method according to claim 1, wherein the characters in step (1) comprise digits, the minus sign, and the fraction line, and the digits comprise 0 to 9.
3. The fraction recognition method according to claim 1, wherein during model training in step (3), N rounds of iterative training are performed; in each round, the batch size is set according to the actual memory of the graphics card, and when the loss value has settled into a stable state, training is stopped promptly, yielding the trained model for character detection.
4. The fraction recognition method according to claim 3, wherein the detection result in step (4) comprises the coordinate values of each character and the predicted value of each character.
5. The fraction recognition method according to claim 3, wherein parsing the output fraction expression and giving a judgment in step (6) includes judging whether the result is a mixed number or a proper or improper fraction, or parsing it into a specific form such as LaTeX or a custom format.
6. The fraction recognition method according to claim 3, wherein the coordinate loss in step (2) is used to frame the fraction line correctly and to judge whether the bounding box deviates; the class loss is used to judge whether the fraction line is classified correctly; the confidence loss represents the degree of confidence that the computed fraction line is a true fraction line.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010144718.3A CN111325177B (en) | 2020-03-04 | 2020-03-04 | Weight-customized-based target detection partial identification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111325177A CN111325177A (en) | 2020-06-23 |
CN111325177B true CN111325177B (en) | 2023-05-12 |
Family
ID=71173124
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010144718.3A Active CN111325177B (en) | 2020-03-04 | 2020-03-04 | Weight-customized-based target detection partial identification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111325177B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111723777A (en) * | 2020-07-07 | 2020-09-29 | 广州织点智能科技有限公司 | Method and device for judging commodity taking and placing process, intelligent container and readable storage medium |
CN112651353B (en) * | 2020-12-30 | 2024-04-16 | 南京红松信息技术有限公司 | Target calculation positioning and identifying method based on custom label |
CN112801046B (en) * | 2021-03-19 | 2021-08-06 | 北京世纪好未来教育科技有限公司 | Image processing method, image processing device, electronic equipment and computer storage medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101789073B (en) * | 2009-01-22 | 2013-06-26 | 富士通株式会社 | Character recognition device and character recognition method thereof |
CN105205448B (en) * | 2015-08-11 | 2019-03-15 | 中国科学院自动化研究所 | Text region model training method and recognition methods based on deep learning |
CN109635875A (en) * | 2018-12-19 | 2019-04-16 | 浙江大学滨海产业技术研究院 | A kind of end-to-end network interface detection method based on deep learning |
CN110765865B (en) * | 2019-09-18 | 2022-06-28 | 北京理工大学 | Underwater target detection method based on improved YOLO algorithm |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |