CN111325177B - Fraction recognition method based on custom-weight target detection - Google Patents

Fraction recognition method based on custom-weight target detection

Info

Publication number
CN111325177B
CN111325177B (application CN202010144718.3A)
Authority
CN
China
Prior art keywords
character
detection
weight
representing
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010144718.3A
Other languages
Chinese (zh)
Other versions
CN111325177A (en)
Inventor
田博帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Hongsong Information Technology Co ltd
Original Assignee
Nanjing Hongsong Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Hongsong Information Technology Co ltd filed Critical Nanjing Hongsong Information Technology Co ltd
Priority to CN202010144718.3A priority Critical patent/CN111325177B/en
Publication of CN111325177A publication Critical patent/CN111325177A/en
Application granted granted Critical
Publication of CN111325177B publication Critical patent/CN111325177B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention relates to a fraction recognition method based on custom-weight target detection, comprising the following steps: (1) character labeling: each character contained in the fraction is labeled; (2) weight customization: loss weights are assigned by custom definition for fraction recognition; (3) model training: N rounds of iterative training are performed to obtain an optimized model for detecting individual fraction characters; (4) character detection: comprising character localization and character prediction, the detection result for each character in the fraction is obtained with the character-detection model trained in step (3); (5) recognition and reconstruction: the character coordinates from step (4) are sorted and the fraction expression is reconstructed and output; (6) parsing and judgment: the fraction expression output in step (5) is parsed and a judgment is given. The method addresses the accuracy of character detection and thereby improves the fraction recognition rate.

Description

Fraction recognition method based on custom-weight target detection
Technical Field
The invention relates to the technical field of image processing, and in particular to a fraction recognition method based on custom-weight target detection.
Background
Since the start of the 21st century, artificial intelligence has developed rapidly and is heading toward its heyday. As the technology continues to advance, new methods gradually replace traditional ones and offer clear advantages over traditional approaches.
In the field of text recognition, numerous automated grading products have appeared in succession, supporting the automated review of various question types, each of which involves different techniques; a complete automated grading product is therefore an integration of several complex technologies. Grading the mathematics discipline may involve recognizing and judging questions that contain fractions. Conventional fraction recognition mostly relies on segmentation: the image is cut apart to obtain the numerator and denominator, so the difficulty of the problem reduces to the accuracy of segmentation. With the introduction of deep learning, however, fraction recognition has gradually been simplified and can be solved in several ways, such as end-to-end recognition or character detection.
Here, the idea of character detection is applied to detecting the characters of a fraction, thereby achieving fraction recognition. Based on this idea, the invention proposes a fraction recognition method based on custom-weight target detection, which addresses the accuracy of character detection and thereby improves the fraction recognition rate.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a fraction recognition method based on custom-weight target detection that addresses the accuracy of character detection and improves the fraction recognition rate.
In order to solve the above technical problem, the invention adopts the following technical scheme. The fraction recognition method based on custom-weight target detection specifically comprises the following steps (sketched as a pipeline after the list):
(1) Character labeling: each character contained in the fraction is labeled first;
(2) Weight customization: loss weights are assigned by custom definition for fraction recognition;
(3) Model training: N rounds of iterative training are performed to obtain an optimized model for detecting individual fraction characters;
(4) Character detection: character detection comprises character localization and character prediction; the detection result for each character in the fraction is obtained with the character-detection model trained in step (3);
(5) Recognition and reconstruction: the character coordinates are sorted according to the detection results of step (4) and the fraction expression is reconstructed and output;
(6) Parsing and judgment: the fraction expression output in step (5) is parsed and a judgment is given.
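The six steps can be read as a single pipeline. The following sketch illustrates the flow only: every function is a stub standing in for a component detailed later in the description, and none of the names, signatures, or return values come from the patent itself.

```python
# Illustrative stubs only; placeholders for the components of steps (1)-(6).

def label_characters(images):                   # step (1): annotate digits, minus sign, fraction line
    return images

def train_detector(samples, lambda_coord=0.8):  # steps (2)-(3): custom weights, N training rounds
    return "trained-model"

def detect_characters(model, image):            # step (4): per-character coordinates + prediction
    return [("3", 50, 20), ("fraction_line", 50, 50), ("4", 50, 80)]

def reconstruct(detections):                    # step (5): sort coordinates, rebuild the expression
    return "3/4"

def judge(expression):                          # step (6): parse the expression and give a judgment
    return expression, "proper fraction"

model = train_detector(label_characters(["image"]))
print(judge(reconstruct(detect_characters(model, "image"))))
```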
As a preferred technical solution of the invention, in step (2) the loss function of the target detection algorithm YOLOv3 is adopted for detecting the fraction line, calculated as:

$$
\begin{aligned}
Loss ={} & \lambda_{coord}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\right] \\
& +\lambda_{coord}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[\left(\sqrt{w_i}-\sqrt{\hat{w}_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat{h}_i}\right)^2\right] \\
& +\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left(C_i-\hat{C}_i\right)^2+\lambda_{noobj}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{noobj}\left(C_i-\hat{C}_i\right)^2 \\
& +\sum_{i=0}^{K\times K}\mathbb{1}_{i}^{obj}\sum_{c\in classes}\left(p_i(c)-\hat{p}_i(c)\right)^2
\end{aligned}
$$

The formula comprises three partial losses: the coordinate loss, the confidence loss, and the class loss.

Here YOLOv3 divides the character image to be detected into a $K \times K$ grid with $M$ candidate boxes per cell; $(x_i, y_i)$ denotes the true center coordinates of the target and $(\hat{x}_i, \hat{y}_i)$ the center coordinates of its candidate box; $(w_i, h_i)$ denotes the true width and height of the target and $(\hat{w}_i, \hat{h}_i)$ the width and height of its candidate box; $\mathbb{1}_{ij}^{obj}$ indicates whether the $j$-th candidate box of the $i$-th cell is responsible for detecting the target ($\mathbb{1}_{ij}^{obj}=1$ if so, otherwise $\mathbb{1}_{ij}^{obj}=0$), and $\mathbb{1}_{ij}^{noobj}$ indicates that the $j$-th candidate box of the $i$-th cell is not responsible for the target; $\hat{C}_i$ denotes the confidence label, equal to 1 for a true target and 0 otherwise, and $C_i$ the predicted confidence; $p_i(c)$ denotes the true probability that the $i$-th cell belongs to class $c$ and $\hat{p}_i(c)$ the predicted probability; $classes$ denotes the set of defined target classes; $\lambda_{coord}$ is the weight coefficient for candidate targets and $\lambda_{noobj}$ the weight coefficient for the no-target case. To emphasize detection of the fraction line, $\lambda_{coord}$ is increased and set to 0.8.
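As a reading aid, the loss above can be sketched in NumPy. This is a minimal illustration, not the patent's implementation: the tensor layout (a (K*K, M, 5+C) array of [x, y, w, h, confidence, class probabilities]), the mask shapes, and the value of lambda_noobj are all assumptions, and the class loss is accumulated over responsible boxes rather than per cell for brevity.

```python
import numpy as np

LAMBDA_COORD = 0.8   # increased per the text to emphasize the fraction line
LAMBDA_NOOBJ = 0.5   # assumed value; the text does not specify lambda_noobj

def yolo_loss(pred, true, obj_mask):
    """pred, true: (K*K, M, 5+C) arrays of [x, y, w, h, conf, classes...];
    obj_mask: (K*K, M), 1 where box j of cell i is responsible for a target."""
    noobj_mask = 1.0 - obj_mask
    # coordinate loss: centers plus square-rooted widths/heights
    xy = ((pred[..., 0:2] - true[..., 0:2]) ** 2).sum(-1)
    wh = ((np.sqrt(pred[..., 2:4]) - np.sqrt(true[..., 2:4])) ** 2).sum(-1)
    coord = LAMBDA_COORD * (obj_mask * (xy + wh)).sum()
    # confidence loss, split between responsible and non-responsible boxes
    conf_err = (pred[..., 4] - true[..., 4]) ** 2
    conf = (obj_mask * conf_err).sum() + LAMBDA_NOOBJ * (noobj_mask * conf_err).sum()
    # class loss over responsible boxes (simplifying the per-cell sum above)
    cls = (obj_mask[..., None] * (pred[..., 5:] - true[..., 5:]) ** 2).sum()
    return coord + conf + cls

K, M, C = 13, 3, 12   # 12 classes: digits 0-9, minus sign, fraction line
pred = np.random.rand(K * K, M, 5 + C)
true = np.random.rand(K * K, M, 5 + C)
mask = (np.random.rand(K * K, M) > 0.9).astype(float)
print(yolo_loss(pred, true, mask))
```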
As a preferred embodiment of the invention, the characters in step (1) comprise the digits 0-9, the minus sign, and the fraction line.
As a preferred technical solution of the invention, during model training in step (3), N rounds of iterative training are performed; in each round the batch size (the number of samples per training step) is set according to the actual memory of the graphics card, and once the loss value has remained stable, training is stopped promptly, yielding the trained character-detection model.
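The stopping rule in step (3), stop once the loss has stabilized, can be made concrete as a plateau check. A hedged sketch follows: the window size, tolerance, and simulated loss curve are all illustrative assumptions, since the patent specifies only "N rounds of iteration" and stopping when the loss continuously reaches a stable state.

```python
def loss_has_plateaued(history, window=5, tol=1e-3):
    """True once the last `window` loss values vary by less than `tol`."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return max(recent) - min(recent) < tol

if __name__ == "__main__":
    import random
    # simulated per-epoch losses standing in for real YOLOv3 training rounds;
    # in practice the batch size of each round is chosen to fit GPU memory
    losses = [1.0 / (epoch + 1) + random.uniform(0.0, 1e-4) for epoch in range(200)]
    history = []
    for epoch_loss in losses:
        history.append(epoch_loss)
        if loss_has_plateaued(history):
            break
    print(f"stopped after {len(history)} rounds at loss {history[-1]:.4f}")
```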
As a preferred embodiment of the invention, the detection result in step (4) comprises the coordinates of each character and the prediction for each character.
As a preferred technical solution of the invention, parsing the output fraction expression and giving a judgment in step (6) includes judging whether the result is a mixed number or a proper or improper fraction, or parsing it into a specific form such as LaTeX or a custom format.
As a preferred technical solution of the invention, the coordinate loss in step (2) is used to frame the fraction line correctly and to judge whether the framing deviates; the class loss is used to judge whether the fraction line is classified correctly; the confidence loss represents the degree of confidence that the computed fraction-line value is its true value.
Compared with the prior art, the invention has the following beneficial effects: the fraction recognition method based on custom-weight target detection customizes the weights of the loss function in the target detection network, enforcing the weight-assignment principle of "large weights for hard-to-detect characters, small weights for easy-to-detect characters" so that fraction characters are detected accurately; this addresses the accuracy of character detection and thereby improves the fraction recognition rate.
Drawings
The technical scheme of the invention is further described below with reference to the accompanying drawings:
FIG. 1 is a flow chart of the fraction recognition method based on custom-weight target detection of the present invention;
FIG. 2 shows a fraction expression in the fraction recognition method based on custom-weight target detection of the present invention.
Detailed Description
The present invention will be further described in detail below with reference to the drawings and embodiments, which serve only to illustrate the invention and are not to be construed as limiting its scope.
Embodiment: as shown in FIG. 1, the fraction recognition method based on custom-weight target detection specifically comprises the following steps:
As shown in FIG. 2, (1) character labeling: each character contained in the fraction is labeled first;
the characters in step (1) comprise the digits 0-9, the minus sign, and the fraction line, twelve character classes in total;
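For concreteness, the twelve classes can be written out as a label map. A minimal sketch, assuming an arbitrary index order and hypothetical class names (the patent fixes neither):

```python
# digits 0-9, minus sign, fraction line: 12 detection classes in total
CLASSES = [str(d) for d in range(10)] + ["minus", "fraction_line"]
CLASS_TO_ID = {name: idx for idx, name in enumerate(CLASSES)}
assert len(CLASSES) == 12
```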
(2) Weight customization: loss weights are assigned by custom definition for fraction recognition. Weight customization means assigning weights manually; in fraction recognition it is mainly used to define the loss weight of the fraction line, which, as the unique marker distinguishing a fraction, plays a decisive role. In the actual detection process the fraction line is harder to detect than the other characters and therefore demands stricter treatment: compared with easy-to-detect targets, increasing its loss weight when computing the difference between the predicted value and the true value improves the ability to detect it;
in step (2), the loss function of the target detection algorithm YOLOv3 is used for detecting the fraction line, calculated as:

$$
\begin{aligned}
Loss ={} & \lambda_{coord}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\right] \\
& +\lambda_{coord}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[\left(\sqrt{w_i}-\sqrt{\hat{w}_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat{h}_i}\right)^2\right] \\
& +\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left(C_i-\hat{C}_i\right)^2+\lambda_{noobj}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{noobj}\left(C_i-\hat{C}_i\right)^2 \\
& +\sum_{i=0}^{K\times K}\mathbb{1}_{i}^{obj}\sum_{c\in classes}\left(p_i(c)-\hat{p}_i(c)\right)^2
\end{aligned}
$$

the formula comprises three partial losses: the coordinate loss, the confidence loss, and the class loss;

where YOLOv3 divides the character image to be detected into a $K \times K$ grid with $M$ candidate boxes per cell; $(x_i, y_i)$ denotes the true center coordinates of the target and $(\hat{x}_i, \hat{y}_i)$ the center coordinates of its candidate box; $(w_i, h_i)$ denotes the true width and height of the target and $(\hat{w}_i, \hat{h}_i)$ the width and height of its candidate box; $\mathbb{1}_{ij}^{obj}$ indicates whether the $j$-th candidate box of the $i$-th cell is responsible for detecting the target ($\mathbb{1}_{ij}^{obj}=1$ if so, otherwise $\mathbb{1}_{ij}^{obj}=0$), and $\mathbb{1}_{ij}^{noobj}$ indicates that the $j$-th candidate box of the $i$-th cell is not responsible for the target; $\hat{C}_i$ denotes the confidence label, equal to 1 for a true target and 0 otherwise, and $C_i$ the predicted confidence; $p_i(c)$ denotes the true probability that the $i$-th cell belongs to class $c$ and $\hat{p}_i(c)$ the predicted probability; $classes$ denotes the set of defined target classes; $\lambda_{coord}$ is the weight coefficient for candidate targets and $\lambda_{noobj}$ the weight coefficient for the no-target case; to emphasize detection of the fraction line, $\lambda_{coord}$ is increased and set to 0.8;

the coordinate loss in step (2) is used to frame the fraction line correctly and to judge whether the framing deviates; the class loss is used to judge whether the fraction line is classified correctly; the confidence loss represents the degree of confidence that the computed fraction-line value is its true value; the loss weight of the fraction line can be modified here in a customized manner and is generally set to a larger value, so that a larger penalty is incurred when the loss is computed, improving the detection and prediction of the fraction line;
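One hedged way to realize the customized fraction-line weight described above is a per-class weight table consulted when the class loss is computed. The numeric weight 2.0 and the table form are illustrative assumptions; the patent states only that the fraction line's weight is set to a larger value.

```python
# per-class loss weights: "hard to detect -> large weight, easy -> small weight"
CLASS_WEIGHTS = {str(d): 1.0 for d in range(10)}
CLASS_WEIGHTS.update({"minus": 1.0, "fraction_line": 2.0})  # assumed values

def weighted_class_loss(squared_errors):
    """squared_errors: mapping class name -> summed squared class error."""
    return sum(CLASS_WEIGHTS[name] * err for name, err in squared_errors.items())

print(weighted_class_loss({"3": 0.10, "fraction_line": 0.10}))  # line penalized 2x
```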
(3) Model training: N rounds of iterative training are performed to obtain an optimized model for detecting individual fraction characters;
during model training in step (3), N rounds of iterative training are performed; in each round the batch size is set according to the actual memory of the graphics card, and once the loss value has remained stable, training is stopped promptly, yielding the optimal character-detection model;
(4) Character detection: character detection comprises character localization and character prediction; the detection result for each character in the fraction is obtained with the character-detection model trained in step (3); the detection result in step (4) comprises the coordinates of each character and the prediction for each character;
(5) Recognition and reconstruction: the character coordinates are sorted according to the detection results of step (4) and the fraction expression is reconstructed and output (a reconstruction sketch follows);
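A hedged sketch of the reconstruction in step (5) follows. A detection is assumed to be a (label, x_center, y_center) triple in image coordinates (y grows downward), and the grouping rule, numerator above the fraction line and denominator below, each read left to right, is an illustration consistent with FIG. 2 rather than the patent's literal code.

```python
def reconstruct_fraction(detections):
    """detections: list of (label, x_center, y_center) for one fraction."""
    bar = next(d for d in detections if d[0] == "fraction_line")
    chars = [d for d in detections if d[0] != "fraction_line"]
    read = lambda group: "".join(
        lbl for lbl, _, _ in sorted(group, key=lambda d: d[1]))  # left to right
    numerator = read([d for d in chars if d[2] < bar[2]])    # above the line
    denominator = read([d for d in chars if d[2] > bar[2]])  # below the line
    return numerator, denominator

# e.g. a detected "3/4": the digit 3 sits above the fraction line, 4 below
print(reconstruct_fraction([("fraction_line", 50, 50), ("3", 50, 20), ("4", 50, 80)]))
```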
(6) Parsing and judgment: the fraction expression output in step (5) is parsed and a judgment is given; parsing the output expression and giving a judgment in step (6) includes judging whether the result is a mixed number or a proper or improper fraction, or parsing it into a specific form such as LaTeX or a custom format.
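The judgment in step (6) can likewise be sketched. The proper/improper test below uses the usual definition (the fraction is proper when |numerator| < |denominator|), and the LaTeX rendering is one of the output forms the text mentions; the function name and signature are illustrative assumptions.

```python
def parse_and_judge(numerator: str, denominator: str):
    """Return (judgment, LaTeX form) for a reconstructed simple fraction."""
    n, d = int(numerator), int(denominator)
    kind = "proper fraction" if abs(n) < abs(d) else "improper fraction"
    return kind, rf"\frac{{{numerator}}}{{{denominator}}}"

print(parse_and_judge("3", "4"))   # ('proper fraction', '\\frac{3}{4}')
```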
It will be apparent to those skilled in the art that the invention is described above by way of illustration only and is not limited to the embodiments described; insubstantial modifications of the method concepts and technical schemes of the invention, as well as direct applications of those concepts and schemes to other contexts without modification, all fall within the protection scope of the invention.

Claims (6)

1. A fraction recognition method based on custom-weight target detection, characterized by comprising the following steps:
(1) character labeling: each character contained in the fraction is labeled first;
(2) weight customization: loss weights are assigned by custom definition for fraction recognition;
(3) model training: N rounds of iterative training are performed to obtain an optimized model for detecting individual fraction characters;
(4) character detection: character detection comprises character localization and character prediction; the detection result for each character in the fraction is obtained with the character-detection model trained in step (3);
(5) recognition and reconstruction: the character coordinates are sorted according to the detection results of step (4) and the fraction expression is reconstructed and output;
(6) parsing and judgment: the fraction expression output in step (5) is parsed and a judgment is given;
in step (2), the loss function of the target detection algorithm YOLOv3 is used for detecting the fraction line, calculated as:

$$
\begin{aligned}
Loss ={} & \lambda_{coord}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\right] \\
& +\lambda_{coord}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left[\left(\sqrt{w_i}-\sqrt{\hat{w}_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat{h}_i}\right)^2\right] \\
& +\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{obj}\left(C_i-\hat{C}_i\right)^2+\lambda_{noobj}\sum_{i=0}^{K\times K}\sum_{j=0}^{M}\mathbb{1}_{ij}^{noobj}\left(C_i-\hat{C}_i\right)^2 \\
& +\sum_{i=0}^{K\times K}\mathbb{1}_{i}^{obj}\sum_{c\in classes}\left(p_i(c)-\hat{p}_i(c)\right)^2
\end{aligned}
$$

the formula comprises three partial losses: the coordinate loss, the confidence loss, and the class loss; YOLOv3 divides the character image to be detected into a $K \times K$ grid with $M$ candidate boxes per cell; $(x_i, y_i)$ denotes the true center coordinates of the target and $(\hat{x}_i, \hat{y}_i)$ the center coordinates of its candidate box; $(w_i, h_i)$ denotes the true width and height of the target and $(\hat{w}_i, \hat{h}_i)$ the width and height of its candidate box; $\mathbb{1}_{ij}^{obj}$ indicates whether the $j$-th candidate box of the $i$-th cell is responsible for detecting the target ($\mathbb{1}_{ij}^{obj}=1$ if so, otherwise $\mathbb{1}_{ij}^{obj}=0$), and $\mathbb{1}_{ij}^{noobj}$ indicates that the $j$-th candidate box of the $i$-th cell is not responsible for the target; $\hat{C}_i$ denotes the confidence label, equal to 1 for a true target and 0 otherwise, and $C_i$ the predicted confidence; $\lambda_{coord}$ is the weight coefficient for candidate targets and $\lambda_{noobj}$ the weight coefficient for the no-target case; to emphasize detection of the fraction line, $\lambda_{coord}$ is increased and set to 0.8; $p_i(c)$ denotes the true probability that the $i$-th cell belongs to class $c$, $\hat{p}_i(c)$ the predicted probability, and $classes$ the set of defined target classes.
2. The fraction recognition method based on custom-weight target detection according to claim 1, characterized in that the characters in step (1) comprise the digits 0-9, the minus sign, and the fraction line.
3. The fraction recognition method based on custom-weight target detection according to claim 1, characterized in that during model training in step (3), N rounds of iterative training are performed; in each round the batch size is set according to the actual memory of the graphics card, and once the loss value has remained stable, training is stopped promptly, yielding the trained character-detection model.
4. The fraction recognition method based on custom-weight target detection according to claim 3, characterized in that the detection result in step (4) comprises the coordinates of each character and the prediction for each character.
5. The fraction recognition method based on custom-weight target detection according to claim 3, characterized in that parsing the output fraction expression and giving a judgment in step (6) comprises judging whether the result is a mixed number or a proper or improper fraction, or parsing it into a specific form such as LaTeX or a custom format.
6. The fraction recognition method based on custom-weight target detection according to claim 3, characterized in that the coordinate loss in step (2) is used to frame the fraction line correctly and to judge whether the framing deviates; the class loss is used to judge whether the fraction line is classified correctly; and the confidence loss represents the degree of confidence that the computed fraction-line value is its true value.
CN202010144718.3A 2020-03-04 2020-03-04 Fraction recognition method based on custom-weight target detection Active CN111325177B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010144718.3A CN111325177B (en) 2020-03-04 2020-03-04 Fraction recognition method based on custom-weight target detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010144718.3A CN111325177B (en) 2020-03-04 2020-03-04 Fraction recognition method based on custom-weight target detection

Publications (2)

Publication Number Publication Date
CN111325177A CN111325177A (en) 2020-06-23
CN111325177B true CN111325177B (en) 2023-05-12

Family

ID=71173124

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010144718.3A Active CN111325177B (en) Fraction recognition method based on custom-weight target detection

Country Status (1)

Country Link
CN (1) CN111325177B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111723777A (en) * 2020-07-07 2020-09-29 广州织点智能科技有限公司 Method and device for judging commodity taking and placing process, intelligent container and readable storage medium
CN112651353B (en) * 2020-12-30 2024-04-16 南京红松信息技术有限公司 Target calculation positioning and identifying method based on custom label
CN112801046B (en) * 2021-03-19 2021-08-06 北京世纪好未来教育科技有限公司 Image processing method, image processing device, electronic equipment and computer storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101789073B (en) * 2009-01-22 2013-06-26 富士通株式会社 Character recognition device and character recognition method thereof
CN105205448B (en) * 2015-08-11 2019-03-15 中国科学院自动化研究所 Text region model training method and recognition methods based on deep learning
CN109635875A (en) * 2018-12-19 2019-04-16 浙江大学滨海产业技术研究院 A kind of end-to-end network interface detection method based on deep learning
CN110765865B (en) * 2019-09-18 2022-06-28 北京理工大学 Underwater target detection method based on improved YOLO algorithm

Also Published As

Publication number Publication date
CN111325177A (en) 2020-06-23

Similar Documents

Publication Publication Date Title
CN111325177B (en) Fraction recognition method based on custom-weight target detection
CN108985334B (en) General object detection system and method for improving active learning based on self-supervision process
CN109977780A (en) A kind of detection and recognition methods of the diatom based on deep learning algorithm
CN113298151A (en) Remote sensing image semantic description method based on multi-level feature fusion
CN112307777B (en) Knowledge graph representation learning method and system
CN111369535B (en) Cell detection method
CN108009571A (en) A kind of semi-supervised data classification method of new direct-push and system
CN112949517B (en) Plant stomata density and opening degree identification method and system based on deep migration learning
CN108959474A (en) Entity relationship extracting method
CN109947923A (en) A kind of elementary mathematics topic type extraction method and system based on term vector
CN113723083A (en) Weighted negative supervision text emotion analysis method based on BERT model
CN111461121A (en) Electric meter number identification method based on YOLOV3 network
CN113657098B (en) Text error correction method, device, equipment and storage medium
CN106448660A (en) Natural language fuzzy boundary determining method with introduction of big data analysis
CN111797935B (en) Semi-supervised depth network picture classification method based on group intelligence
CN116958548A (en) Pseudo tag self-distillation semantic segmentation method based on category statistics driving
Azizah et al. Tajweed-YOLO: Object Detection Method for Tajweed by Applying HSV Color Model Augmentation on Mushaf Images
CN112035646A (en) Key content extraction method
CN114120367B (en) Pedestrian re-recognition method and system based on circle loss measurement under meta-learning framework
CN115049870A (en) Target detection method based on small sample
CN112329389B (en) Chinese character stroke automatic extraction method based on semantic segmentation and tabu search
CN112232681A (en) Intelligent paper marking method for computational analysis type non-selection questions
CN110033037A (en) A kind of recognition methods of digital instrument reading
CN117236409B (en) Small model training method, device and system based on large model and storage medium
CN117114004B (en) Door control deviation correction-based few-sample two-stage named entity identification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant