CN114550181A - Method, device and medium for identifying question - Google Patents

Info

Publication number
CN114550181A
Authority
CN
China
Prior art keywords
correction
trace
result
question
small
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210126218.6A
Other languages
Chinese (zh)
Other versions
CN114550181B (en
Inventor
秦曙光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Readboy Software Technology Co Ltd
Original Assignee
Zhuhai Readboy Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Readboy Software Technology Co Ltd filed Critical Zhuhai Readboy Software Technology Co Ltd
Priority to CN202210126218.6A priority Critical patent/CN114550181B/en
Publication of CN114550181A publication Critical patent/CN114550181A/en
Application granted granted Critical
Publication of CN114550181B publication Critical patent/CN114550181B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/258 Heading extraction; Automatic titling; Numbering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a machine-learning-based method for identifying sub-questions. All question regions and correction traces in a test paper are identified by machine learning, it is then determined whether each question has sub-questions, and the correction result of each sub-question and answer is determined according to whether sub-questions and correction traces exist, so that the correction results of the test paper can be identified more accurately.

Description

Method, device and medium for identifying question
Technical Field
The invention relates to the technical field of education, in particular to a method, a device and a medium for identifying a question.
Background
Smart classrooms are developing rapidly, and functions such as unified examinations and teaching assistance have appeared. However, the automatic correction of test papers and teaching-assistance materials still needs improvement. Existing identification systems can basically only identify main questions (big questions), which makes it hard for teachers to tally scores and does not support later question recommendation based on fine-grained knowledge points.
Question identification mainly faces the following problems: 1) Teachers often use a single correction symbol to mark all sub-questions of a main question as right or wrong, for example drawing one check mark or one cross over several sub-questions to show that they are all right or all wrong. 2) The sub-questions of one main question may be partly right and partly wrong, and the teacher may also correct each sub-question individually. 3) The sub-questions may be laid out vertically or horizontally, which makes it hard to obtain the result of each question accurately from the correction traces.
The background description provided herein is for the purpose of generally presenting the context of the disclosure. Unless otherwise indicated herein, the material described in this section is not prior art to the claims in this application and is not admitted to be prior art by inclusion in this section.
Disclosure of Invention
To address the technical problems in the related art, the invention provides a machine-learning-based question identification method, which comprises the following steps:
S1, obtaining template data of the target to be recognized, segmenting the image of the target according to the template data, and extracting correction traces from each segmented image in turn; recognizing the result of each correction trace with a preset recognition model;
S2, obtaining the number of valid correction traces in the segmented image; if there is exactly one valid correction trace, the correction result of the segmented image is the recognition result of that trace;
S3, if the number of correction traces is not one, finding the correction trace that occupies the largest area among all correction traces, and recording its recognition result as the default correction result (the default value) of the question;
S4, obtaining the region corresponding to each sub-question in the segmented region; if there are no sub-questions, judging whether a correction trace recognized as wrong exists within the question, and if so, judging the whole question to be wrong;
S5, if sub-questions exist, traversing the regions of all sub-questions in turn and judging whether a correction trace exists within each sub-question; if one exists, its result is used as the recognition result of that sub-question, and if none exists, the default value is used as the correction result.
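The decision flow of steps S2 to S5 can be sketched in a few lines of Python. This is only an illustrative reading of the steps, not the patented implementation. The names Rect, Trace and grade_question are hypothetical, a trace is assumed to belong to a sub-question when the center of its bounding rectangle falls inside the sub-question's region, and a question with no trace at all is assumed to default to "wrong", in line with the unanswered-question rule stated in the detailed description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rect:
    """Axis-aligned rectangle: upper-left corner plus width and height."""
    x: float
    y: float
    w: float
    h: float

    @property
    def area(self) -> float:
        return self.w * self.h

    def contains(self, px: float, py: float) -> bool:
        # Half-open test, so a point on a shared edge belongs to one region only.
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


@dataclass
class Trace:
    box: Rect
    label: str  # recognition result: "correct", "wrong" or "half"

    @property
    def center(self):
        return (self.box.x + self.box.w / 2, self.box.y + self.box.h / 2)


def grade_question(traces: List[Trace], sub_regions: List[Rect]) -> List[str]:
    """Return one result for a question without sub-questions,
    otherwise one result per sub-question."""
    # S2 / S3: a single trace decides directly; otherwise the largest one
    # supplies the default value.
    if not traces:
        default = "wrong"  # assumption: nothing marked means unanswered
    elif len(traces) == 1:
        default = traces[0].label
    else:
        default = max(traces, key=lambda t: t.box.area).label

    # S4: no sub-questions, so any "wrong" trace makes the whole question wrong.
    if not sub_regions:
        if any(t.label == "wrong" for t in traces):
            return ["wrong"]
        return [default]

    # S5: each sub-question takes the trace found inside it, else the default.
    results = []
    for region in sub_regions:
        hit = next((t for t in traces if region.contains(*t.center)), None)
        results.append(hit.label if hit else default)
    return results
```

With one large check mark over the question and a small cross inside the second sub-question, this sketch yields "correct" for the first sub-question and "wrong" for the second.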
Specifically, when the default value is determined in step S3, a secondary verification may further be performed on the largest correction trace, comprising:
S31, judging whether the question has sub-questions; if not, no secondary verification is needed; if so, further identifying how the sub-questions are laid out;
S32, if the sub-questions are laid out vertically, calculating the ratio of the height of the largest correction trace to the overall height of the sub-questions, and if the ratio exceeds a preset threshold, taking the result of the largest correction trace as the default value;
S33, if the sub-questions are laid out horizontally, calculating the ratio of the horizontal length of the largest correction trace to the overall horizontal length of the sub-questions, and if the ratio exceeds a preset threshold, taking the result of the largest correction trace as the default value;
S34, if the sub-questions are laid out both vertically and horizontally, judging from the sizes or the ratio of the vertical height and the horizontal length whether the question as a whole tends toward a vertical or a horizontal layout, and verifying according to that layout.
Specifically, the correction trace in step S5 is also subjected to a secondary verification: it is regarded as a valid correction trace if and only if the ratio of the correction trace to the sub-question area is greater than a preset threshold.
Specifically, the template data includes page data and question data.
Specifically, the page data includes the width and height of the page and/or the page number; the question data includes question coordinate data.
In a second aspect, another embodiment of the present invention discloses a machine-learning-based question identification device, which includes the following units:
a correction trace recognition unit, configured to obtain template data of the target to be recognized, segment the target according to the template data, and extract correction traces from each segmented image in turn, and to recognize the result of each correction trace with a preset recognition model;
a valid correction trace judging unit, configured to obtain the number of valid correction traces in the segmented image, wherein if there is exactly one valid correction trace, the correction result of the segmented image is the recognition result of that trace;
a maximum correction trace judging unit, configured to find, if the number of correction traces is not one, the correction trace that occupies the largest area among all correction traces, and to record its recognition result as the default correction result (the default value) of the question;
a sub-question judging unit, configured to obtain the region corresponding to each sub-question in the segmented region, and, if there are no sub-questions, to judge whether a correction trace recognized as wrong exists within the question and, if so, to judge the whole question to be wrong;
a correction result determining unit, configured to traverse the regions of all sub-questions in turn if sub-questions exist, and to judge whether a correction trace exists within each sub-question; if one exists, its result is used as the recognition result of that sub-question, and if none exists, the default value is used as the correction result.
Specifically, the maximum correction trace judging unit further includes:
a secondary verification unit, configured to perform a secondary verification on the largest correction trace when the maximum correction trace judging unit determines the default value, the secondary verification unit further comprising:
a second sub-question judging unit, configured to judge whether the question has sub-questions; if not, no secondary verification is needed; if so, the layout of the sub-questions is further identified;
a first sub-question direction processing unit, configured to calculate, if the sub-questions are laid out vertically, the ratio of the height of the largest correction trace to the overall height of the sub-questions, and to take the result of the largest correction trace as the default value if the ratio exceeds a preset threshold;
a second sub-question direction processing unit, configured to calculate, if the sub-questions are laid out horizontally, the ratio of the horizontal length of the largest correction trace to the overall horizontal length of the sub-questions, and to take the result of the largest correction trace as the default value if the ratio exceeds a preset threshold;
a third sub-question direction processing unit, configured to judge, if the sub-questions are laid out both vertically and horizontally, whether the question as a whole tends toward a vertical or a horizontal layout according to the sizes or the ratio of the vertical height and the horizontal length, and to verify according to that layout.
Specifically, the correction trace in the sub-question judging unit is subjected to a secondary verification: it is regarded as a valid correction trace if and only if the ratio of the correction trace to the sub-question area is greater than a preset threshold.
Specifically, the template data includes page data and question data; the page data includes the width and height of the page and/or the page number; the question data includes question coordinate data.
In a third aspect, another embodiment of the present invention discloses a non-volatile memory storing instructions which, when executed by a processor, implement the above machine-learning-based question identification method.
The invention identifies all question regions and correction traces in a test paper by machine learning, then determines whether each question has sub-questions, and determines the correction result of each sub-question and answer according to whether sub-questions and correction traces exist, so that the correction results of the test paper can be identified more accurately.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
Example one
Referring to FIG. 1, this embodiment discloses a question identification method, which includes the following steps:
S1, obtaining template data of the target to be recognized, segmenting the image of the target according to the template data, and extracting correction traces from each segmented image in turn; recognizing the result of each correction trace with a preset recognition model;
The target to be recognized is a test paper; specifically, it may be a paper test paper whose image is obtained through a scanner.
Specifically, the template data of the target to be recognized may be a template that is established in advance and corresponds to the target to be recognized.
The template data includes at least page data and question data, and may further include one or more of the title of the target to be recognized and its corresponding grade, class, subject, chapter, section and the like.
The page data includes at least the width and height of the page, and may further include the page number and other data.
The question data includes at least the coordinate data of each question, and may further include one or more of score data, answer data, explanation data, micro-course link data, knowledge point data, similar-question data and the like for the coordinate region corresponding to the question.
Preferably, when the coordinate data of a question is obtained, the structure type of the question is determined first: if the question is only a main question, only the coordinate data of the main question is obtained; if the question consists of a main question and its sub-questions, the coordinate data of every level of the question is recorded.
The question coordinate data refers to the coordinates of the minimum circumscribed rectangle of the question. How the data is stored is not specifically limited: it may consist of the coordinates of the upper-left and lower-right corners of the rectangle, or of the coordinates of the upper-left corner together with the width and height of the rectangle. In this embodiment, the page is cut according to the acquired coordinate data so that each question in the page is cropped out separately.
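As an illustration of the two storage forms and of cropping by template coordinates, the following Python sketch may help. The class and function names (Box, PageTemplate, crop, cut_questions) are hypothetical, and the image is modelled as a plain list of pixel rows so that the example needs no imaging library.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Box:
    """Minimum circumscribed rectangle: upper-left corner plus width and height."""
    left: int
    top: int
    width: int
    height: int

    @classmethod
    def from_corners(cls, left: int, top: int, right: int, bottom: int) -> "Box":
        # Alternative storage form: upper-left and lower-right corners.
        return cls(left, top, right - left, bottom - top)

    def corners(self):
        return (self.left, self.top, self.left + self.width, self.top + self.height)


@dataclass
class PageTemplate:
    page_width: int
    page_height: int
    questions: List[Box]
    page_number: Optional[int] = None  # optional, as in the template description


def crop(image: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Cut one rectangular region out of an image given as rows of pixels."""
    left, top, right, bottom = box.corners()
    return [list(row[left:right]) for row in image[top:bottom]]


def cut_questions(image: Sequence[Sequence[int]], template: PageTemplate) -> List[List[List[int]]]:
    """Crop every question region listed in the template."""
    return [crop(image, q) for q in template.questions]
```

Both storage forms describe the same rectangle, so a template can accept either and convert once on load.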
Specifically, after the target to be recognized is obtained, the target to be recognized is cut according to the coordinate information in the template data of the target to be recognized, so that the question of the target to be recognized is obtained.
In this embodiment, image segmentation takes the rectangular region of each main question in the target as one unit.
In this embodiment, correction traces are extracted by restricting the HSV color value range. Each color corresponds to a region of HSV color space, so by specifying the color of the correction pen, pixels of that color, and hence the correction traces, can be extracted.
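A minimal sketch of color-range extraction is given below, assuming a red correction pen. It uses only the Python standard library (colorsys) on an image given as rows of (R, G, B) tuples; the function names and all thresholds (hue tolerance, minimum saturation and value) are illustrative assumptions, not values from the patent.

```python
import colorsys
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def is_red_ink(pixel: Pixel,
               hue_tolerance: float = 0.05,
               min_saturation: float = 0.4,
               min_value: float = 0.3) -> bool:
    """Return True if the pixel falls inside the assumed HSV range of red ink."""
    r, g, b = (c / 255.0 for c in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # h, s, v are all in [0, 1]
    # Red sits at both ends of the hue circle, so the range wraps around 0.
    near_red = h <= hue_tolerance or h >= 1.0 - hue_tolerance
    return near_red and s >= min_saturation and v >= min_value


def correction_mask(image: Sequence[Sequence[Pixel]]) -> List[List[bool]]:
    """Boolean mask marking the pixels that look like correction-pen ink."""
    return [[is_red_ink(p) for p in row] for row in image]
```

Printed text (black) and paper (white) fail the saturation or value test, so only saturated red pixels survive in the mask. A different pen color would need a different hue range.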
Each extracted correction trace is cropped according to its minimum circumscribed rectangle, and the coordinate position of each rectangular crop in the image is recorded.
Preferably, when a correction trace is cropped according to its minimum circumscribed rectangle, a redundancy value may further be set so that the cropped rectangle is slightly larger than the correction trace, for fault tolerance.
Preferably, during extraction, a preliminary filter based on the size of the correction traces may further be applied to eliminate noise. The filtering method is not specifically limited; for example, traces whose share of the original image is smaller than a preset minimum proportion may be screened out. If the area of a trace is less than 0.4% of the whole image area, the trace is considered too small and is treated as interference rather than a real correction trace.
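The bounding rectangle, the redundancy margin and the size filter can be sketched as follows. The function names are hypothetical, the margin is an arbitrary example value, and the trace "area" is assumed here to mean the area of its bounding rectangle before padding (the text does not say whether it is the rectangle or the ink pixel count). The 0.4% figure is the example threshold given above.

```python
from typing import Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def bounding_box(points: Iterable[Tuple[int, int]]) -> Box:
    """Minimum axis-aligned rectangle around the (x, y) pixels of one trace."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def pad_box(box: Box, margin: int, image_width: int, image_height: int) -> Box:
    """Grow the rectangle by a redundancy margin, clipped to the image."""
    left, top, right, bottom = box
    return (max(0, left - margin), max(0, top - margin),
            min(image_width, right + margin), min(image_height, bottom + margin))


def box_area(box: Box) -> int:
    left, top, right, bottom = box
    return (right - left) * (bottom - top)


def keep_real_traces(boxes: List[Box], image_width: int, image_height: int,
                     min_ratio: float = 0.004) -> List[Box]:
    """Drop traces whose area is below min_ratio (0.4%) of the image area."""
    image_area = image_width * image_height
    return [b for b in boxes if box_area(b) / image_area >= min_ratio]
```

For a 100 x 100 image the threshold works out to 40 square pixels, so a 10 x 10 mark is kept while a 3 x 3 speck is discarded.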
Specifically, the recognition model of this embodiment is trained by machine learning. The machine learning used includes, but is not limited to, neural network models.
The specific training algorithm of the recognition model is not limited. The training process further includes:
Determining the label types for training. Four labels are set according to teachers' usual correction habits: a check mark represents correct, a slash and a cross represent wrong, and a circle represents half correct.
Collecting a large number of samples for each label type and training a classifier with a preset algorithm to generate the recognition model, which then predicts the correction result corresponding to each extracted correction trace.
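The mapping from the four label types to correction results can be written down as a small lookup. In the sketch below the classifier itself is not modelled; predict_result simply takes hypothetical per-label scores, as a trained model might output them, and returns the result for the highest-scoring label. All names are illustrative.

```python
from enum import Enum
from typing import Dict


class Mark(Enum):
    """The four label types used for training."""
    CHECK = "check"
    SLASH = "slash"
    CROSS = "cross"
    CIRCLE = "circle"


class Result(Enum):
    CORRECT = "correct"
    WRONG = "wrong"
    HALF_CORRECT = "half_correct"


# Check mark means correct, slash and cross both mean wrong, circle means half correct.
MARK_TO_RESULT: Dict[Mark, Result] = {
    Mark.CHECK: Result.CORRECT,
    Mark.SLASH: Result.WRONG,
    Mark.CROSS: Result.WRONG,
    Mark.CIRCLE: Result.HALF_CORRECT,
}


def predict_result(scores: Dict[Mark, float]) -> Result:
    """Pick the label with the highest score and map it to a correction result."""
    best_mark = max(scores, key=scores.get)
    return MARK_TO_RESULT[best_mark]
```

Keeping the mark type separate from the result lets two visually different marks (slash and cross) share one outcome without merging their training samples.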
Preferably, before recognition with the recognition model, a step may be added to judge whether an extracted correction trace is a compliant correction trace. This judgment is likewise made by a model trained on a large number of samples; it serves as a preliminary screen and improves recognition accuracy.
S2, obtaining the number of valid correction traces in the segmented image; if there is exactly one valid correction trace, the correction result of the segmented image is the recognition result of that trace.
S3, if the number of correction traces is not one, finding the correction trace that occupies the largest area among all correction traces, and recording its recognition result as the default correction result (the default value) of the question.
S4, obtaining the region corresponding to each sub-question in the segmented region; if there are no sub-questions, judging whether a correction trace recognized as wrong exists within the question, and if so, judging the whole question to be wrong.
S5, if sub-questions exist, traversing the regions of all sub-questions in turn and judging whether a correction trace exists within each sub-question. If one exists, its result is used as the recognition result of that sub-question, and if none exists, the default value is used as the correction result.
Specifically, when the default value is determined in step S3, a secondary verification may further be performed on the largest correction trace, comprising:
S31, judging whether the question has sub-questions; if not, no secondary verification is needed; if so, further identifying how the sub-questions are laid out.
S32, if the sub-questions are laid out vertically, calculating the ratio of the height of the largest correction trace to the overall height of the question, and if the ratio exceeds a preset threshold, taking the result of the largest correction trace as the default value (when a question has several sub-questions stacked vertically and the teacher means to mark them all correct, the correction trace must be tall in the vertical direction).
S33, if the sub-questions are laid out horizontally, calculating the ratio of the horizontal length of the largest correction trace to the overall horizontal length of the question, and if the ratio exceeds a preset threshold, taking the result of the largest correction trace as the default value (when a question has several sub-questions side by side and the teacher means to mark them all correct, the correction trace must be long enough in the horizontal direction).
S34, if the sub-questions are laid out both vertically and horizontally, judging from the sizes or the ratio of the vertical height and the horizontal length whether the question as a whole tends toward a vertical or a horizontal layout, and verifying according to that layout.
Specifically, if the largest correction trace does not satisfy the secondary verification condition, the default value is wrong. In practice, many students leave questions blank and teachers do not mark them, so unanswered questions should default to wrong.
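The secondary verification of steps S31 to S34, together with the fallback to "wrong", can be sketched as below. The function name, the layout strings and the threshold of 0.5 are assumptions for illustration. For a mixed layout the sketch simply treats the question as vertical when its overall height is at least its overall width, which is one possible reading of "the sizes or the ratio of the vertical height and the horizontal length".

```python
from typing import Optional


def verified_default(trace_width: float, trace_height: float, trace_label: str,
                     area_width: float, area_height: float,
                     layout: Optional[str], threshold: float = 0.5) -> str:
    """Return the default value after secondary verification of the largest trace.

    layout is None when the question has no sub-questions, otherwise one of
    "vertical", "horizontal" or "mixed".
    """
    # S31: no sub-questions, so no secondary verification is needed.
    if layout is None:
        return trace_label

    # S34: a mixed layout is reduced to its dominant direction (assumed rule).
    if layout == "mixed":
        layout = "vertical" if area_height >= area_width else "horizontal"

    # S32 / S33: compare the trace extent with the overall extent in that direction.
    if layout == "vertical":
        ratio = trace_height / area_height
    else:
        ratio = trace_width / area_width

    # A trace that is too short fails verification and the default becomes "wrong".
    return trace_label if ratio > threshold else "wrong"
```

A tall check mark covering 80% of a vertically stacked question passes and makes "correct" the default, while a small check covering 20% fails and leaves the default as "wrong".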
Specifically, the correction trace in step S5 may also be subjected to a secondary verification: it is regarded as a valid correction trace if and only if the ratio of the correction trace to the sub-question area is greater than a preset threshold.
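One way to express this area check is shown below. It is a sketch under assumptions: the ratio is computed from the part of the trace's bounding rectangle that overlaps the sub-question region (equal to the trace's own area when the trace lies fully inside), and the threshold of 0.05 is an arbitrary example. The function names are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def area(box: Box) -> float:
    """Area of a rectangle; empty or inverted rectangles count as zero."""
    left, top, right, bottom = box
    return max(0.0, right - left) * max(0.0, bottom - top)


def overlap(a: Box, b: Box) -> Box:
    """Intersection rectangle of a and b (may be empty)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def is_valid_trace(trace: Box, sub_question: Box, min_ratio: float = 0.05) -> bool:
    """A trace counts only if it covers more than min_ratio of the sub-question."""
    sub_area = area(sub_question)
    if sub_area == 0:
        return False
    return area(overlap(trace, sub_question)) / sub_area > min_ratio
```

This keeps stray dots or pen slips inside a sub-question from overriding the default value.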
In this embodiment, all question regions and correction traces in the test paper are identified by machine learning, it is then determined whether each question has sub-questions, and the correction result of each sub-question and answer is determined according to whether sub-questions and correction traces exist, so that the correction results of the test paper can be identified more accurately.
Example two
Referring to FIG. 2, this embodiment discloses a machine-learning-based question identification device, which includes the following units:
a correction trace recognition unit, configured to obtain template data of the target to be recognized, segment the target according to the template data, and extract correction traces from each segmented image in turn, and to recognize the result of each correction trace with a preset recognition model;
The target to be recognized is a test paper; specifically, it may be a paper test paper whose image is obtained through a scanner.
Specifically, the template data of the target to be recognized may be a template that is established in advance and corresponds to the target to be recognized.
The template data includes at least page data and question data, and may further include one or more of the title of the target to be recognized and its corresponding grade, class, subject, chapter, section and the like.
The page data includes at least the width and height of the page, and may further include the page number and other data.
The question data includes at least the coordinate data of each question, and may further include one or more of score data, answer data, explanation data, micro-course link data, knowledge point data, similar-question data and the like for the coordinate region corresponding to the question.
Preferably, when the coordinate data of a question is obtained, the structure type of the question is determined first: if the question is only a main question, only the coordinate data of the main question is obtained; if the question consists of a main question and its sub-questions, the coordinate data of every level of the question is recorded. For example, if one main question contains three sub-questions and the first sub-question itself contains two sub-questions, the overall coordinate region of the main question, the coordinate region of each sub-question, and the coordinate region of each second-level sub-question must all be recorded.
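The multi-level example above maps naturally onto a tree. The sketch below is illustrative only: QuestionNode, all_regions and leaf_regions are hypothetical names, the numbering scheme and the coordinates are made up, and each box is stored as (left, top, width, height).

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)


@dataclass
class QuestionNode:
    """One question at any level, with the regions of its own sub-questions."""
    label: str
    box: Box
    children: List["QuestionNode"] = field(default_factory=list)


def all_regions(node: QuestionNode) -> Iterator[Tuple[str, Box]]:
    """Every recorded region, parent first, then its sub-questions in order."""
    yield node.label, node.box
    for child in node.children:
        yield from all_regions(child)


def leaf_regions(node: QuestionNode) -> Iterator[Tuple[str, Box]]:
    """Only the lowest-level questions, i.e. those without sub-questions."""
    if not node.children:
        yield node.label, node.box
    for child in node.children:
        yield from leaf_regions(child)


# One main question with three sub-questions; the first has two of its own.
main_question = QuestionNode("1", (0, 0, 800, 600), [
    QuestionNode("1.1", (0, 0, 800, 300), [
        QuestionNode("1.1.1", (0, 0, 800, 150)),
        QuestionNode("1.1.2", (0, 150, 800, 150)),
    ]),
    QuestionNode("1.2", (0, 300, 800, 150)),
    QuestionNode("1.3", (0, 450, 800, 150)),
])
```

Walking the tree with all_regions gives the six regions that must be recorded in this example, and leaf_regions gives the four regions that would each receive their own correction result.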
The question coordinate data refers to the coordinates of the minimum circumscribed rectangle of the question. How the data is stored is not specifically limited: it may consist of the coordinates of the upper-left and lower-right corners of the rectangle, or of the coordinates of the upper-left corner together with the width and height of the rectangle. In this embodiment, the page is cut according to the obtained coordinate data so that each question in the page is cropped out separately.
Specifically, after the target to be recognized is obtained, it is cut according to the coordinate information in its template data to obtain the questions of the target.
In this embodiment, image segmentation takes the rectangular region of each main question in the target as one unit.
In this embodiment, correction traces are extracted by restricting the HSV color value range. Each color corresponds to a region of HSV color space, so by specifying the color of the correction pen, pixels of that color, and hence the correction traces, can be extracted.
Each extracted correction trace is cropped according to its minimum circumscribed rectangle, and the coordinate position of each rectangular crop in the image is recorded.
Preferably, when a correction trace is cropped according to its minimum circumscribed rectangle, a redundancy value may further be set so that the cropped rectangle is slightly larger than the correction trace, for fault tolerance.
Preferably, during extraction, a preliminary filter based on the size of the correction traces may further be applied to eliminate noise. The filtering method is not specifically limited; for example, traces whose share of the original image is smaller than a preset minimum proportion may be screened out. If the area of a trace is less than 0.4% of the whole image area, the trace is considered too small and is treated as interference rather than a real correction trace.
Specifically, the recognition model of this embodiment is trained by machine learning. The machine learning used includes, but is not limited to, neural network models.
The specific training algorithm of the recognition model is not limited. The training process further includes:
Determining the label types for training. Four labels are set according to teachers' usual correction habits: a check mark represents correct, a slash and a cross represent wrong, and a circle represents half correct.
Collecting a large number of samples for each label type and training a classifier with a preset algorithm to generate the recognition model, which then predicts the correction result corresponding to each extracted correction trace.
Preferably, before recognition with the recognition model, a step may be added to judge whether an extracted correction trace is a compliant correction trace. This judgment is likewise made by a model trained on a large number of samples; it serves as a preliminary screen and improves recognition accuracy.
The effective correction trace judging unit is used for obtaining the number of effective correction traces in the segmented image; if the number is one, the recognition result of that correction trace is the correction result of the segmented image;
a maximum correction trace judging unit, configured to, if the number of effective correction traces is not one, find the correction trace with the largest area among all correction traces and record its recognition result as the default correction result of the question, referred to as the default value;
the sub-question judging unit is used for obtaining the region corresponding to each sub-question within the segmented region; if there are no sub-questions, it judges whether a correction trace whose result is wrong lies within the question, and if so, the overall correction result of the question is judged to be wrong;
the correction result determining unit is used for, if sub-questions exist, traversing the regions of all sub-questions in turn and judging whether a correction trace lies within each sub-question; if so, the result of that correction trace is the recognition result of the sub-question, and if not, the default value is used as its correction result.
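The decision flow carried out by these units can be sketched in a few lines. Several points are assumptions rather than statements of the patent: the `Box` and `Trace` classes and the `grade_question` function are hypothetical, a trace counts as lying within a sub-question only if its bounding box is fully inside the sub-question region, a question with no traces yields `None`, and a question without sub-questions and without any wrong trace falls back to the default value.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Box:
    x: float
    y: float
    w: float
    h: float

    @property
    def area(self) -> float:
        return self.w * self.h

    def contains(self, other: "Box") -> bool:
        # Assumption: "within range" means the box lies fully inside this one.
        return (self.x <= other.x and self.y <= other.y
                and other.x + other.w <= self.x + self.w
                and other.y + other.h <= self.y + self.h)


@dataclass
class Trace:
    box: Box
    result: str  # "correct", "wrong" or "half"


def grade_question(traces: List[Trace],
                   sub_questions: Dict[str, Box]) -> Optional[Dict[str, str]]:
    """Return correction results for one segmented question image."""
    if not traces:
        return None  # assumption: the text does not cover this case
    if len(traces) == 1:
        # Exactly one effective trace: its result is the question's result.
        return {"question": traces[0].result}
    # Otherwise the largest trace supplies the default value.
    default = max(traces, key=lambda t: t.box.area).result
    if not sub_questions:
        # No sub-questions: any wrong trace makes the whole question wrong.
        if any(t.result == "wrong" for t in traces):
            return {"question": "wrong"}
        return {"question": default}  # assumption: fall back to the default
    results = {}
    for name, region in sub_questions.items():
        hit = next((t for t in traces if region.contains(t.box)), None)
        results[name] = hit.result if hit else default
    return results
```

For example, a large check mark spanning three stacked sub-questions plus a small cross inside the second one would give correct, wrong, correct: the cross overrides the default only where it sits.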
The maximum correction trace judging unit further includes:
a secondary verification unit, configured to perform secondary verification on the largest correction trace when the maximum correction trace judging unit determines the default value; the secondary verification unit further comprises:
the second sub-question judging unit is used for judging whether the question has sub-questions; if not, no secondary verification is needed, and if so, the distribution mode of the sub-questions is further identified;
the first sub-question direction processing unit is used for, if the sub-questions are distributed vertically, calculating the ratio of the vertical height of the largest correction trace to the overall vertical height of the question, and taking the result of the largest correction trace as the default value if the ratio exceeds a preset threshold;
the second sub-question direction processing unit is used for, if the sub-questions are distributed horizontally, calculating the ratio of the horizontal length of the largest correction trace to the overall horizontal length of the question, and taking the result of the largest correction trace as the default value if the ratio exceeds a preset threshold;
and the third sub-question direction processing unit is used for, if the sub-questions are distributed both vertically and horizontally, judging from the magnitudes or the ratio of the vertical height and the horizontal length whether the question as a whole leans toward a vertical or a horizontal distribution, and then proceeding according to that dominant distribution.
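The secondary verification can be illustrated with a short function. This is a sketch under stated assumptions: the `Layout` enum and `accept_as_default` are hypothetical names, the layout is supplied by the caller rather than detected, the 0.5 threshold is an arbitrary placeholder for the unspecified preset threshold, and a mixed layout is treated as vertical when the question is at least as tall as it is wide.

```python
from enum import Enum


class Layout(Enum):
    NONE = "none"              # the question has no sub-questions
    VERTICAL = "vertical"      # sub-questions stacked top to bottom
    HORIZONTAL = "horizontal"  # sub-questions laid out left to right
    BOTH = "both"              # sub-questions arranged in both directions


def accept_as_default(trace_w: float, trace_h: float,
                      question_w: float, question_h: float,
                      layout: Layout, threshold: float = 0.5) -> bool:
    """Decide whether the largest trace may supply the default value."""
    if layout is Layout.NONE:
        return True  # no sub-questions, so no secondary verification
    if layout is Layout.BOTH:
        # Assumption: pick the dominant direction from the question's shape.
        layout = (Layout.VERTICAL if question_h >= question_w
                  else Layout.HORIZONTAL)
    if layout is Layout.VERTICAL:
        return trace_h / question_h > threshold
    return trace_w / question_w > threshold
```

The intuition is that a mark meant for the whole question should span most of the direction along which the sub-questions are laid out; a mark covering only a small fraction of that span more likely belongs to a single sub-question.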
Specifically, the correction traces handled by the sub-question judging unit are subject to secondary verification: a correction trace is regarded as an effective correction trace if and only if the proportion of the correction trace relative to the sub-question area is greater than a preset threshold.
In this embodiment, all question regions and correction traces in the test paper are identified by machine learning, whether each question has sub-questions is further identified, and the correction result of each sub-question is then determined according to whether sub-questions and correction traces exist, so that the correction result of the test paper can be recognized more accurately.
EXAMPLE III
Referring to fig. 3, fig. 3 is a schematic structural diagram of a topic identification device based on machine learning according to the embodiment. The machine learning-based topic identification apparatus 20 of this embodiment comprises a processor 21, a memory 22, and a computer program stored in said memory 22 and executable on said processor 21. The processor 21 realizes the steps in the above-described method embodiments when executing the computer program. Alternatively, the processor 21 implements the functions of the modules/units in the above-described device embodiments when executing the computer program.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 22 and executed by the processor 21 to accomplish the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program in the machine learning based topic identification apparatus 20. For example, the computer program may be divided into the modules in the second embodiment, and for the specific functions of the modules, reference is made to the working process of the apparatus in the foregoing embodiment, which is not described herein again.
The machine learning based topic identification apparatus 20 may include, but is not limited to, a processor 21 and a memory 22. Those skilled in the art will appreciate that the schematic diagram is merely an example of the machine learning based topic identification apparatus 20 and does not limit it; the apparatus may include more or fewer components than shown, combine certain components, or use different components. For example, it may also include input/output devices, network access devices, buses, and the like.
The processor 21 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor or any conventional processor. The processor 21 is the control center of the machine learning based topic identification apparatus 20 and connects the various parts of the entire apparatus through various interfaces and lines.
The memory 22 may be used to store the computer programs and/or modules, and the processor 21 implements various functions of the machine learning-based question recognition apparatus 20 by running or executing the computer programs and/or modules stored in the memory 22 and calling data stored in the memory 22. The memory 22 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. In addition, the memory 22 may include high speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other volatile solid state storage device.
If the integrated modules/units of the machine learning based topic identification apparatus 20 are implemented in the form of software functional units and sold or used as stand-alone products, they can be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the method embodiments above may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by the processor 21, implements the steps of those method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, such as a recording medium, USB flash drive, removable hard disk, magnetic disk, optical disk, computer memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier signal, telecommunications signal, or software distribution medium. It should be noted that the content of the computer-readable medium may be expanded or restricted as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals.
It should be noted that the above-described device embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and should not be taken as limiting the scope of the present invention, which is intended to cover any modifications, equivalents, improvements, etc. within the spirit and scope of the present invention.

Claims (10)

1. A machine learning based method for identifying questions, comprising the following steps:
S1, obtaining template data of a target to be recognized, performing image segmentation on the target according to the template data, and extracting correction traces from each segmented image in turn; recognizing the result of each correction trace with a preset recognition model;
S2, obtaining the number of effective correction traces in the segmented image, wherein if the number is one, the recognition result of that correction trace is the correction result of the segmented image;
S3, if the number of effective correction traces is not one, further finding the correction trace with the largest area among all correction traces, and recording its recognition result as the default correction result of the question, referred to as the default value;
S4, obtaining the region corresponding to each sub-question within the segmented region; if there are no sub-questions, judging whether a correction trace whose result is wrong lies within the question, and if so, judging the overall correction result of the question to be wrong;
S5, if sub-questions exist, traversing the regions of all sub-questions in turn and judging whether a correction trace lies within each sub-question; if so, taking the result of that correction trace as the recognition result of the sub-question, and if not, taking the default value as its correction result.
2. The method according to claim 1, wherein determining the default value in step S3 further comprises a secondary verification of the largest correction trace, the secondary verification comprising:
S31, judging whether the question has sub-questions; if not, no secondary verification is needed, and if so, further identifying the distribution mode of the sub-questions;
S32, if the sub-questions are distributed vertically, further calculating the ratio of the vertical height of the largest correction trace to the overall vertical height of the question, and if the ratio exceeds a preset threshold, taking the result of the largest correction trace as the default value;
S33, if the sub-questions are distributed horizontally, further calculating the ratio of the horizontal length of the largest correction trace to the overall horizontal length of the question, and if the ratio exceeds a preset threshold, taking the result of the largest correction trace as the default value;
S34, if the sub-questions are distributed both vertically and horizontally, further judging from the magnitudes or the ratio of the vertical height and the horizontal length whether the question as a whole leans toward a vertical or a horizontal distribution, and proceeding according to that dominant distribution.
3. The method according to claim 2, wherein the correction trace in step S5 is subject to secondary verification, and is regarded as an effective correction trace if and only if the proportion of the correction trace relative to the sub-question area is greater than a preset threshold.
4. The method of claim 3, wherein the template data comprises page data and question data.
5. The method of claim 4, wherein the page data comprises page width and height data and/or a page number, and the question data comprises question coordinate data.
6. A machine learning based device for identifying questions, comprising the following units:
the correction trace recognition unit is used for obtaining template data of a target to be recognized, segmenting the target according to the template data, and extracting correction traces from each segmented image in turn; recognizing the result of each correction trace with a preset recognition model;
the effective correction trace judging unit is used for obtaining the number of effective correction traces in the segmented image, wherein if the number is one, the recognition result of that correction trace is the correction result of the segmented image;
the maximum correction trace judging unit is used for, if the number of effective correction traces is not one, further finding the correction trace with the largest area among all correction traces, and recording its recognition result as the default correction result of the question, referred to as the default value;
the sub-question judging unit is used for obtaining the region corresponding to each sub-question within the segmented region; if there are no sub-questions, judging whether a correction trace whose result is wrong lies within the question, and if so, judging the overall correction result of the question to be wrong;
the correction result determining unit is used for, if sub-questions exist, traversing the regions of all sub-questions in turn and judging whether a correction trace lies within each sub-question; if so, taking the result of that correction trace as the recognition result of the sub-question, and if not, taking the default value as its correction result.
7. The apparatus of claim 1, the maximum correction trace judging unit further comprising:
a secondary verification unit, configured to perform secondary verification on the largest correction trace when the maximum correction trace judging unit determines the default value, the secondary verification unit further comprising:
the second sub-question judging unit is used for judging whether the question has sub-questions; if not, no secondary verification is needed, and if so, the distribution mode of the sub-questions is further identified;
the first sub-question direction processing unit is used for, if the sub-questions are distributed vertically, calculating the ratio of the vertical height of the largest correction trace to the overall vertical height of the question, and taking the result of the largest correction trace as the default value if the ratio exceeds a preset threshold;
the second sub-question direction processing unit is used for, if the sub-questions are distributed horizontally, calculating the ratio of the horizontal length of the largest correction trace to the overall horizontal length of the question, and taking the result of the largest correction trace as the default value if the ratio exceeds a preset threshold;
and the third sub-question direction processing unit is used for, if the sub-questions are distributed both vertically and horizontally, judging from the magnitudes or the ratio of the vertical height and the horizontal length whether the question as a whole leans toward a vertical or a horizontal distribution, and then proceeding according to that dominant distribution.
8. The apparatus according to claim 7, wherein the correction trace in the sub-question judging unit is subject to secondary verification, and is regarded as an effective correction trace only when the proportion of the correction trace relative to the sub-question area is greater than a preset threshold.
9. The apparatus of claim 3, the template data comprising page data and question data; the page data comprises page width and height data and/or a page number; the question data comprises question coordinate data.
10. A non-volatile memory storing instructions which, when executed by a processor, implement the machine learning based question identification method of any one of claims 1-5.
CN202210126218.6A 2022-02-10 2022-02-10 Method, device and medium for identifying question Active CN114550181B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210126218.6A CN114550181B (en) 2022-02-10 2022-02-10 Method, device and medium for identifying question

Publications (2)

Publication Number Publication Date
CN114550181A true CN114550181A (en) 2022-05-27
CN114550181B CN114550181B (en) 2023-01-10

Family

ID=81674185

Country Status (1)

Country Link
CN (1) CN114550181B (en)

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant