CN116629270A - Subjective question scoring method and device based on examination big data and text semantics - Google Patents

Subjective question scoring method and device based on examination big data and text semantics

Info

Publication number
CN116629270A
Authority
CN
China
Prior art keywords
scored
subjective question
subjective
content
answer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310690851.2A
Other languages
Chinese (zh)
Other versions
CN116629270B (en)
Inventor
马赫
董淑娟
倪小明
郭南明
杜育林
刘佳荣
洪潜凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Nanfang Human Resources Evaluation Center Co ltd
Original Assignee
Guangzhou Nanfang Human Resources Evaluation Center Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Nanfang Human Resources Evaluation Center Co ltd
Priority to CN202310690851.2A
Publication of CN116629270A
Application granted
Publication of CN116629270B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]

Abstract

The invention claims a subjective question scoring method and device based on examination big data and text semantics. The answer content of an input subjective question is preprocessed, including removing special characters and word segmentation; the preprocessed answer content is mapped to a numerical sequence; the numerical sequence is input into a feature extraction model to obtain feature vectors of the answer content; semantic clustering is performed on the feature vectors, and the subjective question is scored against reference data. The invention can clearly distinguish the current answering state, analyzes the multidimensional features of the answer content in a targeted manner, and jointly considers the frame and content weight of the current answer content. It is relevant to artificial intelligence applications and can supply information and logical guidance as an aid to decision-making.
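The preprocessing and mapping steps named in the abstract can be illustrated with a minimal sketch. The function names (`preprocess`, `build_vocab`, `to_sequence`) are hypothetical, and whitespace tokenisation is only an assumption standing in for a real word segmenter; the patent does not specify either.

```python
import re

def preprocess(answer: str) -> list[str]:
    """Remove special characters, then split the answer into tokens.

    Assumption: whitespace splitting stands in for a real word segmenter
    (Chinese text would need a dedicated segmentation tool).
    """
    cleaned = re.sub(r"[^\w\s]", " ", answer.lower())  # replace punctuation and symbols
    return cleaned.split()

def build_vocab(token_lists: list[list[str]]) -> dict[str, int]:
    """Give each distinct token an integer id; 0 is reserved for unknown tokens."""
    vocab: dict[str, int] = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab

def to_sequence(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Map tokens to the numerical sequence fed to a feature extraction model."""
    return [vocab.get(tok, 0) for tok in tokens]

answers = ["Supply & demand set the price!", "Price is set by demand."]
tokenised = [preprocess(a) for a in answers]
vocab = build_vocab(tokenised)
sequences = [to_sequence(t, vocab) for t in tokenised]
```

Reserving id 0 for unknown tokens is a design choice of this sketch, made so that answers containing words outside the vocabulary still map to a sequence of the right length.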

Description

Subjective question scoring method and device based on examination big data and text semantics
Technical Field
The invention belongs to the field of big data processing, and particularly relates to a subjective question scoring method and device based on examination big data and text semantics.
Background
At present, subjective questions are usually scored by keyword matching: every conceivable keyword that could serve as an answer is set manually in advance, the examinee's answer is searched for these keywords, and the question is awarded points if a keyword is found and no points otherwise.
However, scoring that relies purely on keywords depends heavily on the keyword set prepared in advance. Once that set is incomplete, the probability of obvious errors in examinees' scores rises sharply, so a more objective technical scheme for scoring subjective questions is still needed.
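The weakness of the keyword approach described above can be shown with a small sketch. The function `keyword_score` and the sample keyword set are hypothetical and serve only to illustrate the background; they are not part of the claimed method.

```python
def keyword_score(answer: str, keywords: set[str], points: float) -> float:
    """Award full points if any preset keyword occurs in the answer, else zero."""
    text = answer.lower()
    return points if any(k in text for k in keywords) else 0.0

keywords = {"photosynthesis"}  # hypothetical, deliberately incomplete keyword set
hit = keyword_score("Plants use photosynthesis to make sugar.", keywords, 5.0)
miss = keyword_score("Plants turn sunlight into sugar.", keywords, 5.0)
```

The second answer expresses the same idea as the first but scores zero because its wording is not in the preset keyword set, which is the failure mode the background describes.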
Disclosure of Invention
In order to solve the problem that the verification and output of subjective question answer content are currently inaccurate, the invention claims a subjective question scoring method and device based on examination big data and text semantics.
According to a first aspect of the present invention, the present invention claims a subjective question scoring method based on examination big data and text semantics, comprising:
collecting subjective question answering contents to be scored, and partitioning and standardizing the subjective question answering contents to be scored to obtain standard answering contents to be scored;
inputting the standard answer contents to be scored into a plurality of feature extraction models to obtain a semantic clustering result set of the subjective question answer contents to be scored under each feature extraction model;
and collecting subjective question scoring results of the subjective question answering contents to be scored according to the semantic clustering result set of the subjective question answering contents to be scored.
Further, collecting the subjective question answer content to be scored, and partitioning and standardizing it to obtain the standard answer content to be scored, specifically comprises:
performing semantic transfer on the selected subjective question standard answer according to the spatial position relationship between the subjective question answer area to be scored and the answer area of the selected standard answer, so that the two answer areas are aligned, comprising: performing word segmentation on the selected standard answer, as part of the semantic transfer, according to the line vector of the answer area to be scored; partitioning the selected standard answer in equal proportion according to the size of the answer frame of the answer area to be scored; and comparing and pasting the selected standard answer according to the positions of the detection points in the answer area to be scored;
resampling the content of the answer area to be scored according to the answer area points of the selected standard answer, so that the answer area to be scored has the same number of vertices as the answer area of the selected standard answer and the vertex positions of the two correspond; and establishing a score point correspondence between the two answer areas according to their vertex serial numbers;
and, according to the subjective question offset line vector, bringing the main line words of the answer content onto the same transverse line, and partitioning the grid model of the answer content on the three transverse lines in equal proportion, to obtain the standard answer content to be scored.
Further, the plurality of feature extraction models at least comprise a first feature extraction model, a second feature extraction model, a third feature extraction model and a fourth feature extraction model;
inputting the standard answer contents to be scored into a plurality of feature extraction models to obtain a semantic clustering result set of the answer contents of subjective questions to be scored under each feature extraction model, wherein the semantic clustering result set specifically comprises:
collecting a first answer content set, a second answer content set, a third answer content set and a fourth answer content set from the standard answer content to be scored, and inputting them into the first, second, third and fourth feature extraction models respectively;
running each feature extraction model to obtain a first candidate scoring content, a second candidate scoring content, a third candidate scoring content and a fourth candidate scoring content respectively;
and collecting a semantic clustering result set of the subjective question answering contents to be scored under each feature extraction model according to the candidate scoring contents.
Further, the first feature extraction model, the second feature extraction model, the third feature extraction model and the fourth feature extraction model are convolutional neural networks trained through deep learning;
the first answer content set is a frame-related answer content set, the second answer content set is a content weight-related answer content set, the third answer content set is a validity-related answer content set, and the fourth answer content set is a minutiae answer content set;
the first candidate scoring content comprises an unordered frame, a total score frame and an identification probability value of the total score frame;
the second candidate scoring content comprises identification probability values for infinitesimal weight, smaller weight, moderate weight, larger weight and infinite weight;
the third candidate scoring content comprises identification probability values for invalidity and validity;
the fourth candidate scoring content comprises identification probability values for missing detail content, defective detail content and complete detail content;
the semantic clustering result set of the subjective question answer content to be scored is a tuple formed by the identification probability values of the elements in each candidate scoring content.
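One way to picture the semantic clustering result set is as a tuple of four probability distributions, one per feature extraction model. The sketch below is only an illustration under stated assumptions: the raw model outputs are stand-in numbers rather than real convolutional network outputs, the frame labels are placeholder names, and `LABELS`, `softmax` and `result_set` are hypothetical.

```python
import math

# Label sets per model; the frame labels are placeholder names (assumption).
LABELS = {
    "frame": ("frame_type_1", "frame_type_2", "frame_type_3"),
    "weight": ("infinitesimal", "smaller", "moderate", "larger", "infinite"),
    "validity": ("invalid", "valid"),
    "detail": ("missing", "defective", "complete"),
}

def softmax(logits: list[float]) -> list[float]:
    """Turn raw model outputs into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def result_set(logits_by_model: dict[str, list[float]]) -> tuple[dict[str, float], ...]:
    """Return one {label: probability} dict per model, in a fixed order."""
    out = []
    for name, labels in LABELS.items():
        logits = logits_by_model[name]
        if len(logits) != len(labels):
            raise ValueError(f"{name}: expected {len(labels)} outputs")
        out.append(dict(zip(labels, softmax(logits))))
    return tuple(out)

# Stand-in outputs of the four models (made-up numbers for illustration).
example = result_set({
    "frame": [0.0, 0.0, 0.0],
    "weight": [0.1, 0.4, 2.0, 0.3, 0.0],
    "validity": [0.0, 0.0],
    "detail": [0.2, 1.5, 0.9],
})
```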
Further, collecting subjective question scoring results of the subjective question answering contents to be scored according to a semantic clustering result set of the subjective question answering contents to be scored, which specifically comprises the following steps:
collecting the verification output of each candidate scoring content according to the semantic clustering result set of the subjective question answer content to be scored;
obtaining the subjective question scoring result of the subjective question answer content to be scored according to the verification outputs;
and storing the subjective question scoring result of the subjective question answer content to be scored together with the verification outputs as a five-tuple.
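One reading of the five-tuple is a score followed by the four verification outputs. The sketch below assumes that each verification output is the most probable label of its candidate scoring content, and it uses a purely hypothetical scoring rule, since the text does not state how the score is derived. `ScoringRecord`, `verify`, `score_answer` and the placeholder frame labels are illustrative names only.

```python
from typing import NamedTuple

class ScoringRecord(NamedTuple):
    """Five-tuple: the scoring result plus one verification output per model."""
    score: float
    frame: str
    weight: str
    validity: str
    detail: str

def verify(probabilities: dict[str, float]) -> str:
    """Assumed verification output: the label with the highest probability."""
    return max(probabilities, key=probabilities.get)

def score_answer(result_set: tuple[dict[str, float], ...], full_marks: float) -> ScoringRecord:
    frame, weight, validity, detail = (verify(p) for p in result_set)
    # Hypothetical rule, not taken from the patent: invalid answers score zero,
    # otherwise the detail label scales the full marks.
    if validity == "invalid":
        score = 0.0
    else:
        factor = {"missing": 0.4, "defective": 0.7, "complete": 1.0}[detail]
        score = round(full_marks * factor, 2)
    return ScoringRecord(score, frame, weight, validity, detail)

record = score_answer(
    (
        {"frame_type_1": 0.2, "frame_type_2": 0.7, "frame_type_3": 0.1},
        {"smaller": 0.1, "moderate": 0.8, "larger": 0.1},
        {"invalid": 0.1, "valid": 0.9},
        {"missing": 0.1, "defective": 0.6, "complete": 0.3},
    ),
    full_marks=10.0,
)
```

Storing the verification outputs alongside the score keeps the reasons for a score available for later review.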
According to a second aspect of the present invention, the present invention claims a subjective question scoring apparatus based on examination big data and text semantics, comprising:
the standard processing module is used for collecting the subjective question answer content to be scored, and partitioning and standardizing it to obtain the standard answer content to be scored;
the branch processing module is used for inputting the standard answer content to be scored into a plurality of feature extraction models to obtain a semantic clustering result set of the subjective question answer content to be scored under each feature extraction model;
and the result output module is used for collecting the subjective question scoring result of the subjective question answer content to be scored according to the semantic clustering result set of the subjective question answer content to be scored.
Further, collecting the subjective question answer content to be scored, and partitioning and standardizing it to obtain the standard answer content to be scored, specifically comprises:
performing semantic transfer on the selected subjective question standard answer according to the spatial position relationship between the subjective question answer area to be scored and the answer area of the selected standard answer, so that the two answer areas are aligned, comprising: performing word segmentation on the selected standard answer, as part of the semantic transfer, according to the line vector of the answer area to be scored; partitioning the selected standard answer in equal proportion according to the size of the answer frame of the answer area to be scored; and comparing and pasting the selected standard answer according to the positions of the detection points in the answer area to be scored;
resampling the content of the answer area to be scored according to the answer area points of the selected standard answer, so that the answer area to be scored has the same number of vertices as the answer area of the selected standard answer and the vertex positions of the two correspond; and establishing a score point correspondence between the two answer areas according to their vertex serial numbers;
and, according to the subjective question offset line vector, bringing the main line words of the answer content onto the same transverse line, and partitioning the grid model of the answer content on the three transverse lines in equal proportion, to obtain the standard answer content to be scored.
Further, the plurality of feature extraction models at least comprise a first feature extraction model, a second feature extraction model, a third feature extraction model and a fourth feature extraction model;
inputting the standard answer contents to be scored into a plurality of feature extraction models to obtain a semantic clustering result set of the answer contents of subjective questions to be scored under each feature extraction model, wherein the semantic clustering result set specifically comprises:
collecting a first answer content set, a second answer content set, a third answer content set and a fourth answer content set from the standard answer content to be scored, and inputting them into the first, second, third and fourth feature extraction models respectively;
running each feature extraction model to obtain a first candidate scoring content, a second candidate scoring content, a third candidate scoring content and a fourth candidate scoring content respectively;
and collecting a semantic clustering result set of the subjective question answering contents to be scored under each feature extraction model according to the candidate scoring contents.
Further, the first feature extraction model, the second feature extraction model, the third feature extraction model and the fourth feature extraction model are convolutional neural networks trained through deep learning;
the first answer content set is a frame-related answer content set, the second answer content set is a content weight-related answer content set, the third answer content set is a validity-related answer content set, and the fourth answer content set is a minutiae answer content set;
the first candidate scoring content comprises an unordered frame, a total score frame and an identification probability value of the total score frame;
the second candidate scoring content comprises identification probability values for infinitesimal weight, smaller weight, moderate weight, larger weight and infinite weight;
the third candidate scoring content comprises identification probability values for invalidity and validity;
the fourth candidate scoring content comprises identification probability values for missing detail content, defective detail content and complete detail content;
the semantic clustering result set of the subjective question answer content to be scored is a tuple formed by the identification probability values of the elements in each candidate scoring content.
Further, collecting subjective question scoring results of the subjective question answering contents to be scored according to a semantic clustering result set of the subjective question answering contents to be scored, which specifically comprises the following steps:
collecting the verification output of each candidate scoring content according to the semantic clustering result set of the subjective question answer content to be scored;
obtaining the subjective question scoring result of the subjective question answer content to be scored according to the verification outputs;
and storing the subjective question scoring result of the subjective question answer content to be scored together with the verification outputs as a five-tuple.
The invention claims a subjective question scoring method and device based on examination big data and text semantics. The subjective question answer content to be scored is collected, partitioned and standardized to obtain the standard answer content to be scored; the standard answer content to be scored is input into a plurality of feature extraction models to obtain a semantic clustering result set of the subjective question answer content to be scored under each feature extraction model; and the subjective question scoring result is obtained from that semantic clustering result set. The invention can clearly distinguish the current answering state, analyzes the multidimensional features of the answer content in a targeted manner, and jointly considers the frame, content weight, validity and detail of the current answer content. It is relevant to artificial intelligence applications and can supply information and logical guidance as an aid to decision-making.
Drawings
FIG. 1 is a workflow diagram of the subjective question scoring method based on examination big data and text semantics as claimed in the present application;
FIG. 2 is a second workflow diagram of the subjective question scoring method based on examination big data and text semantics as claimed in the present application;
FIG. 3 is a third workflow diagram of the subjective question scoring method based on examination big data and text semantics as claimed in the present application;
FIG. 4 is a fourth workflow diagram of the subjective question scoring method based on examination big data and text semantics as claimed in the present application;
FIG. 5 is a block diagram of the structure of the subjective question scoring device based on examination big data and text semantics as claimed in the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. It will be understood that the terms "first," "second," and the like, as used herein, may be used to describe various elements, but these elements are not limited by these terms unless otherwise specified. These terms are only used to distinguish one element from another element. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
According to a first embodiment of the present invention, referring to fig. 1, the present invention claims a subjective question scoring method based on examination big data and text semantics, comprising:
collecting subjective question answering contents to be scored, and partitioning and standardizing the subjective question answering contents to be scored to obtain standard answering contents to be scored;
inputting the standard answer contents to be scored into a plurality of feature extraction models to obtain a semantic clustering result set of the subjective question answer contents to be scored under each feature extraction model;
and collecting subjective question scoring results of the subjective question answering contents to be scored according to the semantic clustering result set of the subjective question answering contents to be scored.
The invention identifies the test paper simply, rapidly and accurately, and can score objective, subjective and descriptive questions rapidly and accurately through data processing. More specifically, reference marks in the test paper and a recognizable answer sheet are used, so that only the answer sheet area, rather than the whole picture, needs to be checked; the test paper can therefore be recognized quickly and photographed automatically, which improves convenience for the user. The reference marks also allow the position of the test paper to be determined accurately and any rotation to be detected; they provide a tilt correction index when the image is corrected and an answer position extraction index when the answer region is extracted. Further, when scoring an objective question, the binarized values of marks, checks and entries recognized in the answer region are not compared against an absolute value; instead, the mark or check position is identified relatively by finding the lowest value within each question, which improves the reliability of the scoring result. For a subjective or descriptive question, the scorer can view the answers alone, without the questions, which improves readability and makes scoring easier.
"Subjective questions" are questions whose answers are judged with some subjectivity by reference to at least one reference answer, such as a text, where the reference answer may consist of a plurality of keywords (i.e., scoring points). The judgment may have at least three outcomes: correct, incorrect and unknown (i.e., it cannot be determined whether the answer is correct or incorrect). Non-limiting examples of such questions are short-answer, essay-writing and fill-in-the-blank questions.
Further, collecting the subjective question answer content to be scored, and partitioning and standardizing it to obtain the standard answer content to be scored, specifically comprises:
performing semantic transfer on the selected subjective question standard answer according to the spatial position relationship between the subjective question answer area to be scored and the answer area of the selected standard answer, so that the two answer areas are aligned.
Referring to FIG. 2, this specifically comprises: performing word segmentation on the selected standard answer, as part of the semantic transfer, according to the line vector of the answer area to be scored; partitioning the selected standard answer in equal proportion according to the size of the answer frame of the answer area to be scored; and comparing and pasting the selected standard answer according to the positions of the detection points in the answer area to be scored;
resampling the content of the answer area to be scored according to the answer area points of the selected standard answer, so that the answer area to be scored has the same number of vertices as the answer area of the selected standard answer and the vertex positions of the two correspond; and establishing a score point correspondence between the two answer areas according to their vertex serial numbers;
and, according to the subjective question offset line vector, bringing the main line words of the answer content onto the same transverse line, and partitioning the grid model of the answer content on the three transverse lines in equal proportion, to obtain the standard answer content to be scored.
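The resampling step above, which makes the two answer areas share a vertex count so that vertices can be paired by serial number, can be sketched as follows. This is only an illustration under assumptions: each answer area is represented as an open polyline of 2D points, resampling is done by equal arc-length spacing, and `resample` and the sample coordinates are hypothetical. The patent does not state the resampling scheme.

```python
import math

Point = tuple[float, float]

def resample(points: list[Point], n: int) -> list[Point]:
    """Resample an open polyline to n vertices evenly spaced by arc length."""
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two input points and n >= 2")
    cum = [0.0]  # cumulative arc length at each input vertex
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out: list[Point] = []
    seg = 0
    for k in range(n):
        target = total * k / (n - 1)
        # advance to the segment that contains the target arc length
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

# Hypothetical areas: the reference has 5 vertices, the answer to be scored has 3.
reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
answer = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0)]
resampled = resample(answer, len(reference))

# Score point correspondence by vertex serial number.
correspondence = {i: (a, r) for i, (a, r) in enumerate(zip(resampled, reference))}
```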
Further, the plurality of feature extraction models at least comprise a first feature extraction model, a second feature extraction model, a third feature extraction model and a fourth feature extraction model;
referring to fig. 3, inputting standard answer contents to be scored into a plurality of feature extraction models to obtain a semantic clustering result set of subjective question answer contents to be scored under each feature extraction model, which specifically includes:
collecting a first answer content set, a second answer content set, a third answer content set and a fourth answer content set from the standard answer content to be scored, and inputting them into the first, second, third and fourth feature extraction models respectively;
running each feature extraction model to obtain a first candidate scoring content, a second candidate scoring content, a third candidate scoring content and a fourth candidate scoring content respectively;
and collecting a semantic clustering result set of the subjective question answering contents to be scored under each feature extraction model according to the candidate scoring contents.
Further, the first feature extraction model, the second feature extraction model, the third feature extraction model and the fourth feature extraction model are convolutional neural networks trained through deep learning;
the first answer content set is a frame-related answer content set, the second answer content set is a content weight-related answer content set, the third answer content set is a validity-related answer content set, and the fourth answer content set is a minutiae answer content set;
the first candidate scoring content comprises an unordered frame, a total score frame and an identification probability value of the total score frame;
the second candidate scoring content comprises identification probability values for infinitesimal weight, smaller weight, moderate weight, larger weight and infinite weight;
the third candidate scoring content comprises identification probability values for invalidity and validity;
the fourth candidate scoring content comprises identification probability values for missing detail content, defective detail content and complete detail content;
the semantic clustering result set of the subjective question answer content to be scored is a tuple formed by the identification probability values of the elements in each candidate scoring content.
The number of recognition probabilities output by the frame feature extraction model may be equal to or smaller than the number of preset frame types. For example, if there are 3 preset frame types, the frame feature extraction model may output 3 recognition probabilities, one for each preset frame type, each being the probability that the subjective question answer content to be scored belongs to that type. Where the model outputs fewer values than there are preset types, the values output are the largest of the recognition probabilities; for example, the model may output only 1 recognition probability, namely the largest of the recognition probabilities over the 3 preset frame types.
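The choice between outputting all recognition probabilities and only the largest ones can be sketched as a top-k selection. The helper `top_k`, the placeholder frame labels and the probability values are assumptions made for illustration.

```python
def top_k(probabilities: dict[str, float], k: int) -> list[tuple[str, float]]:
    """Return the k largest (label, probability) pairs, largest first."""
    if not 1 <= k <= len(probabilities):
        raise ValueError("k must be between 1 and the number of preset types")
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

# Placeholder labels and made-up probabilities for 3 preset frame types.
frame_probs = {"frame_type_1": 0.15, "frame_type_2": 0.70, "frame_type_3": 0.15}
all_three = top_k(frame_probs, 3)  # one probability per preset type
best_only = top_k(frame_probs, 1)  # only the largest probability
```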
In this embodiment, key points are extracted from the content with smaller weight, moderate weight and larger weight respectively, and an affine matrix from the larger weight to the smaller weight and moderate weight is established based on these key points;
answer content blocks are extracted from the content with smaller weight and moderate weight, and a training answer content data set is established based on these blocks; whether an extracted answer content block lies in the content weight area is judged according to the affine matrix; if so, the block is marked as a positive sample, otherwise it is marked as a negative sample.
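The labeling of positive and negative samples can be sketched as below. The sketch assumes that each answer content block is represented by its centre point, that the affine matrix is a given 2x3 matrix, and that the content weight area is an axis-aligned rectangle; none of these representations is specified in the text, and `apply_affine` and `label_blocks` are hypothetical helpers.

```python
Point = tuple[float, float]
Rect = tuple[float, float, float, float]  # x_min, y_min, x_max, y_max

def apply_affine(a, point: Point) -> Point:
    """Apply a 2x3 affine matrix to a point."""
    x, y = point
    return (a[0][0] * x + a[0][1] * y + a[0][2],
            a[1][0] * x + a[1][1] * y + a[1][2])

def label_blocks(block_centres: list[Point], affine, weight_area: Rect) -> list[tuple[Point, int]]:
    """Label a block 1 (positive) if its mapped centre lies in the area, else 0."""
    x_min, y_min, x_max, y_max = weight_area
    labelled = []
    for centre in block_centres:
        x, y = apply_affine(affine, centre)
        inside = x_min <= x <= x_max and y_min <= y <= y_max
        labelled.append((centre, 1 if inside else 0))
    return labelled

affine = ((1.0, 0.0, 10.0), (0.0, 1.0, 0.0))  # made-up matrix: translation by (10, 0)
area = (10.0, 0.0, 20.0, 5.0)                 # made-up content weight area
samples = label_blocks([(2.0, 1.0), (15.0, 1.0)], affine, area)
```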
Specifically, the foreground information and background information of the smaller-weight, moderate-weight and larger-weight content are distinguished respectively; the position of the organization within each foreground is detected, and the organization is extracted in full from each foreground, yielding the smaller-weight, moderate-weight and larger-weight organizations. Key points are then extracted from each organization and matched, and an affine matrix from the larger-weight organization to the smaller-weight and moderate-weight organizations is established.
Key points are extracted from the smaller-weight, moderate-weight and larger-weight organizations respectively;
the similarity between the key points of the different weight levels is calculated;
key point matching pairs between the smaller-weight, moderate-weight and larger-weight organizations are established according to the key point similarity; an affine matrix is generated based on the key point matching pairs;
The positions of the subjective questions in the larger-weight content are extracted to generate a larger-weight marking grid map, and the affine matrix is applied to this marking grid map to obtain the corresponding positions in the smaller-weight and moderate-weight organizations.
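Generating an affine matrix from key point matching pairs and applying it to a marked position can be illustrated as below. The sketch assumes exactly three non-collinear matching pairs and solves the resulting linear system with Cramer's rule; the source does not state how the matrix is computed, and the names `affine_from_pairs` and `apply_affine` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def affine_from_pairs(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """2x3 affine matrix mapping three source key points onto their matches."""
    if len(src) != 3 or len(dst) != 3:
        raise ValueError("this sketch expects exactly three matching pairs")
    base = [[x, y, 1.0] for x, y in src]
    d = det3(base)
    if abs(d) < 1e-12:
        raise ValueError("source key points are collinear")
    rows = []
    for axis in (0, 1):  # solve once for x' and once for y'
        target = [p[axis] for p in dst]
        coeffs = []
        for col in range(3):
            m = [row[:] for row in base]
            for r in range(3):
                m[r][col] = target[r]  # Cramer's rule: replace one column
            coeffs.append(det3(m) / d)
        rows.append(coeffs)
    return rows


def apply_affine(matrix: Sequence[Sequence[float]], point: Point) -> Point:
    """Map one marked position through the affine matrix."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


# Hypothetical matching pairs: larger-weight key points and their matches.
larger = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
smaller = [(1.0, -1.0), (3.0, -1.0), (1.0, 2.0)]
matrix = affine_from_pairs(larger, smaller)
mapped = apply_affine(matrix, (1.0, 1.0))
```

With more than three matching pairs, a least-squares fit would normally replace the exact solve used here.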
In this embodiment, the score point judgment information of the subjective question answer content to be scored is identified from its valid answer content; the scoring elements recorded in the identification sensor for the subjective question answer content to be scored are collected, and whether the scoring elements are correct is verified against the score point judgment information collected from the answer content identification module.
Specifically, the valid answer content of the subjective question to be scored is collected; the answer content characteristic information of the valid answer content acquired by the vision acquisition device is extracted and matched with the answer content characteristic information stored in a memory, and the validity information corresponding to the matched answer content characteristic information is determined as the score point judgment information of the subjective question answer content to be scored.
The scoring element recorded in the identification sensor is read; the determined score point judgment information of the subjective question answer content to be scored is collected, and the scoring element is compared with the score point judgment information to judge whether the scoring element recorded in the identification sensor is correct.
A first check value recorded in the identification sensor is read, a second check value is calculated from the scoring element according to a second preset check algorithm, and whether the first check value matches the second check value is judged.
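The check value comparison can be sketched as follows. The source does not name the preset check algorithm, so CRC-32 from the Python standard library is used here purely as a stand-in; the function names and the example scoring element are assumptions.

```python
import zlib


def compute_check_value(score_element: str) -> int:
    """Check value of a scoring element (CRC-32 is an assumed stand-in algorithm)."""
    return zlib.crc32(score_element.encode("utf-8"))


def check_values_match(score_element: str, recorded_check_value: int) -> bool:
    """Compare the recorded (first) check value with the recomputed (second) one."""
    return compute_check_value(score_element) == recorded_check_value


# Hypothetical scoring element and the check value recorded alongside it.
element = "score-point-3:valid"
first_check_value = compute_check_value(element)
is_intact = check_values_match(element, first_check_value)
is_tampered_ok = check_values_match("score-point-3:invalid", first_check_value)
```

A mismatch indicates that the recorded scoring element changed after its check value was written.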
In this embodiment, identifying the probability values of the fourth candidate scoring content, which covers missing detail content, defective detail content and complete detail content, includes:
storing the stretched resolution answer content and configuration information of subjective question answer content to be scored, which is generated according to the stretched resolution answer content, in a resolution memory, wherein the stretched resolution answer content is obtained by stretching original resolution answer content acquired by a display screen device, and the subjective question answer content to be scored is used for displaying on the display screen;
obtaining configuration information of answering contents of subjective questions to be scored and stretched resolution answering contents according to the resolution data processing request;
recovering the stretched resolution response content according to the configuration information to obtain a basic resolution response content;
amplifying the basic resolution answer content according to the configuration information to obtain subjective question answer content to be scored;
displaying the answer content of the subjective questions to be scored;
for each pixel point to be scored in any answer content area shown on the display screen, searching the resolution memory, according to the position coordinate information of that pixel point, for the subjective question answer content to be scored that matches the position coordinate information;
and comparing the pixel value of the pixel point to be scored with the corresponding pixel value of the matched subjective question answer content to be scored, and judging from the comparison result whether amplification exists.
Further, referring to fig. 4, according to a semantic clustering result set of the subjective question answering contents to be scored, subjective question scoring results of the subjective question answering contents to be scored are collected, which specifically includes:
collecting the verification output of each candidate scoring content according to the semantic clustering result set of the subjective question answer content to be scored;
obtaining the subjective question scoring result of the subjective question answer content to be scored according to the verification output;
and storing the subjective question scoring result of the subjective question answer content to be scored together with the verification output as a five-tuple.
In this embodiment, the candidate scoring content of each feature extraction model is essentially a vector produced by a softmax operation, and the candidate scoring contents of the feature extraction models are assembled into a tuple to form the verification output.
For example, for a given subjective question answer content to be scored, the verification output of the content weight feature extraction model is [0, 0.1, 0.7, 0.2], which represents the predicted probability values of the four results of the content weight branch.
Wherein the probability of infinitesimal weight is 0;
the probability of smaller weight is 0.1;
the probability of moderate weight is 0.7;
the probability of a larger weight is 0.2.
In this case, a winner-takes-all strategy is adopted, under which the verification output of the content weight feature extraction model is taken to be "moderate weight".
Similarly, the framework recognition feature extraction model, the validity recognition feature extraction model and the minutiae feature extraction model all adopt the same strategy. Thus, the prediction vector is converted into a single prediction category.
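The winner-takes-all conversion of a prediction vector into a single category can be sketched as follows, using the content weight example given above. The label list and the function name `winner_takes_all` are introduced here for illustration only.

```python
from typing import List, Sequence

# Labels for the four content weight results in the example above.
WEIGHT_LABELS = [
    "infinitesimal weight",
    "smaller weight",
    "moderate weight",
    "larger weight",
]


def winner_takes_all(probs: Sequence[float], labels: List[str]) -> str:
    """Return the label whose predicted probability is largest."""
    if len(probs) != len(labels):
        raise ValueError("one probability is required per label")
    best_index = max(range(len(probs)), key=probs.__getitem__)
    return labels[best_index]


category = winner_takes_all([0, 0.1, 0.7, 0.2], WEIGHT_LABELS)
```

The same function could be reused for the other branches by passing their own label lists.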
It is emphasized that each frame has a corresponding examination type result, so that tens of thousands of five-tuple results (frame state, content weight state, validity state, magnification state, examination type) are obtained while an answer is examined. This allows the marker to quickly locate particular special scenarios of a case during subsequent retrospective review.
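One way to store the five-tuple results and locate special scenarios afterwards is sketched below. The record type, field values and helper `find_records` are hypothetical; the source specifies only the five fields of the tuple.

```python
from typing import List, NamedTuple


class ScoringRecord(NamedTuple):
    """Five-tuple stored for each frame (field names follow the text above)."""
    frame_state: str
    content_weight_state: str
    validity_state: str
    magnification_state: str
    examination_type: str


def find_records(records: List[ScoringRecord], **criteria: str) -> List[ScoringRecord]:
    """Return the records whose named fields all equal the requested values."""
    return [
        record for record in records
        if all(getattr(record, field) == value for field, value in criteria.items())
    ]


# Hypothetical stored results.
records = [
    ScoringRecord("unordered", "moderate", "valid", "none", "type-A"),
    ScoringRecord("unordered", "larger", "invalid", "amplified", "type-A"),
    ScoringRecord("ordered", "moderate", "valid", "none", "type-B"),
]
invalid_cases = find_records(records, validity_state="invalid")
type_a_unordered = find_records(records, frame_state="unordered", examination_type="type-A")
```

Filtering on any combination of the five fields lets a marker jump directly to, for example, all invalid answers of one examination type.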
According to a second embodiment of the present invention, referring to fig. 5, the present invention claims a subjective question scoring device based on examination big data and text semantics, comprising:
the standard processing module is used for collecting the answer content of the subjective questions to be scored, and partitioning and standardizing the answer content of the subjective questions to be scored to obtain the answer content of the standard to be scored;
the branch processing module inputs the standard answer contents to be scored into a plurality of feature extraction models to obtain a semantic clustering result set of the subjective question answer contents to be scored under each feature extraction model;
and the result output module is used for collecting subjective question scoring results of the subjective question answering contents to be scored according to the semantic clustering result set of the subjective question answering contents to be scored.
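The cooperation of the three modules can be sketched as a simple pipeline. Everything below is a stand-in under stated assumptions: the standardization step merely collapses whitespace, the feature extraction models are stub functions returning fixed probabilities, and all names are hypothetical.

```python
from typing import Callable, Dict, List

# A model maps standardized answer content to a probability vector.
Model = Callable[[str], List[float]]


def standard_processing(raw_answer: str) -> str:
    """Stand-in for partitioning and standardization: collapse whitespace."""
    return " ".join(raw_answer.split())


def branch_processing(standard_answer: str, models: Dict[str, Model]) -> Dict[str, List[float]]:
    """Run every feature extraction model and collect its probability vector."""
    return {name: model(standard_answer) for name, model in models.items()}


def result_output(result_set: Dict[str, List[float]]) -> Dict[str, int]:
    """Reduce each probability vector to the index of its largest entry."""
    return {
        name: max(range(len(probs)), key=probs.__getitem__)
        for name, probs in result_set.items()
    }


# Hypothetical stub models; real models would be trained networks.
stub_models: Dict[str, Model] = {
    "frame": lambda text: [0.2, 0.8],
    "validity": lambda text: [0.1, 0.9] if text else [0.9, 0.1],
}
standardized = standard_processing("  an   answer ")
scores = result_output(branch_processing(standardized, stub_models))
```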
Further, collecting the answer content of the subjective questions to be scored, and partitioning and standardizing the answer content of the subjective questions to be scored to obtain the answer content of the standard to be scored, which specifically comprises the following steps:
according to the spatial position relation between the subjective question answer area to be scored and the subjective question answer area of the selected subjective question standard answer, carrying out semantic transfer on the selected subjective question standard answer to enable the subjective question answer area of the selected subjective question standard answer to be aligned with the subjective question answer area to be scored, and comprising the following steps: according to the line vector of the subjective question answer area to be scored, performing word segmentation processing on the selected subjective question standard answer through word segmentation processing semantic transfer; according to the size of an answer frame of an answer area of the subjective questions to be scored, carrying out equal proportion partition on the selected subjective question standard answers; comparing and pasting the selected subjective question standard answers according to the positions of detection points in the subjective question answering areas to be scored;
Resampling the contents of the subjective question answering areas to be scored according to the subjective question answering area points of the selected subjective question standard answers, so that the number of the vertices of the subjective question answering areas to be scored is the same as that of the vertices of the subjective question answering areas in the selected subjective question standard answers, and the positions of the vertices of the subjective question answering areas are corresponding to those of the vertices of the subjective question answering areas in the selected subjective question standard answers; establishing a score point corresponding relation between the two subjective question answering areas according to the subjective question answering areas to be scored and the vertex serial numbers of the subjective question answering areas of the selected subjective question standard answers;
and according to the subjective question offset line vector, processing the main line word of the subjective question answer content to the same transverse line, and carrying out equal proportion partition on the grid model of the subjective question answer content on the three transverse lines to obtain the answer content of the standard to be scored.
Further, the plurality of feature extraction models at least comprise a first feature extraction model, a second feature extraction model, a third feature extraction model and a fourth feature extraction model;
inputting the standard answer contents to be scored into a plurality of feature extraction models to obtain a semantic clustering result set of the answer contents of subjective questions to be scored under each feature extraction model, wherein the semantic clustering result set specifically comprises:
collecting a first answer content set, a second answer content set, a third answer content set and a fourth answer content set of the standard answer content to be scored, and inputting them into the first feature extraction model, the second feature extraction model, the third feature extraction model and the fourth feature extraction model respectively;
each feature extraction model then produces, respectively, a first candidate scoring content, a second candidate scoring content, a third candidate scoring content and a fourth candidate scoring content;
and collecting a semantic clustering result set of the subjective question answering contents to be scored under each feature extraction model according to the candidate scoring contents.
Further, the first feature extraction model, the second feature extraction model, the third feature extraction model and the fourth feature extraction model are convolutional neural networks trained through deep learning;
the first answer content set is a frame-related answer content set, the second answer content set is a content weight-related answer content set, the third answer content set is a validity-related answer content set, and the fourth answer content set is a minutiae answer content set;
the first candidate scoring content comprises an unordered frame, a total score frame and an identification probability value of the total score frame;
the second candidate scoring content comprises identification probability values for infinitesimal weight, smaller weight, moderate weight, larger weight and infinite weight;
the third candidate scoring content comprises identification probability values for invalid and valid;
the fourth candidate scoring content comprises identification probability values for missing detail content, defective detail content and complete detail content;
The semantic clustering result set of the subjective question answer contents to be scored is a tuple formed from the identification probability values of the elements in each candidate scoring content.
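Assembling the result set from the four branches can be illustrated as follows. The sketch assumes each model emits raw scores that are normalized with softmax, consistent with the description of the candidate scoring content as a softmax vector; the branch names, the number of frame classes and the example scores are assumptions.

```python
import math
from typing import Dict, List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one branch's raw scores."""
    shift = max(logits)
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Fixed branch order: frame, content weight, validity, detail content.
BRANCH_ORDER = ("frame", "content_weight", "validity", "detail")


def build_result_set(branch_logits: Dict[str, List[float]]) -> Tuple[Tuple[float, ...], ...]:
    """Tuple of per-branch probability vectors, in BRANCH_ORDER."""
    return tuple(tuple(softmax(branch_logits[name])) for name in BRANCH_ORDER)


# Hypothetical raw scores for one answer.
example_logits = {
    "frame": [0.1, 1.2, 0.3],
    "content_weight": [0.0, 0.5, 2.0, 0.7, -1.0],
    "validity": [-0.4, 1.1],
    "detail": [0.2, 0.1, 1.5],
}
result_set = build_result_set(example_logits)
```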
Further, collecting subjective question scoring results of the subjective question answering contents to be scored according to a semantic clustering result set of the subjective question answering contents to be scored, which specifically comprises the following steps:
collecting the verification output of each candidate scoring content according to the semantic clustering result set of the subjective question answer content to be scored;
obtaining the subjective question scoring result of the subjective question answer content to be scored according to the verification output;
and storing the subjective question scoring result of the subjective question answer content to be scored together with the verification output as a five-tuple.
Those skilled in the art will appreciate that various modifications and improvements can be made to the disclosure. For example, the various devices or components described above may be implemented in hardware, or may be implemented in software, firmware, or a combination of some or all of the three.
A flowchart is used in this disclosure to describe the steps of a method according to embodiments of the present disclosure. It should be understood that the preceding or following steps need not be performed in the exact order shown. Rather, the various steps may be processed in reverse order or simultaneously. Also, other operations may be added to these processes.
Those of ordinary skill in the art will appreciate that all or a portion of the steps of the methods described above may be implemented by a computer program to instruct related hardware, and the program may be stored in a computer readable storage medium, such as a read only memory, a magnetic disk, or an optical disk. Alternatively, all or part of the steps of the above embodiments may be implemented using one or more integrated circuits. Accordingly, each module/unit in the above embodiment may be implemented in the form of hardware, or may be implemented in the form of a software functional module. The present disclosure is not limited to any specific form of combination of hardware and software.
Unless defined otherwise, all terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The foregoing is illustrative of the present disclosure and is not to be construed as limiting thereof. Although a few exemplary embodiments of this disclosure have been described, those skilled in the art will readily appreciate that many modifications are possible in the exemplary embodiments without materially departing from the novel teachings and advantages of this disclosure. Accordingly, all such modifications are intended to be included within the scope of this disclosure as defined in the claims. It is to be understood that the foregoing is illustrative of the present disclosure and is not to be construed as limited to the specific embodiments disclosed, and that modifications to the disclosed embodiments, as well as other embodiments, are intended to be included within the scope of the appended claims. The disclosure is defined by the claims and their equivalents.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

Claims (10)

1. A subjective question scoring method based on examination big data and text semantics, characterized by comprising the following steps:
collecting the answer content of the subjective questions to be scored, and partitioning and standardizing the answer content of the subjective questions to be scored to obtain the answer content of the standard to be scored;
inputting the standard answer contents to be scored into a plurality of feature extraction models to obtain a semantic clustering result set of the answer contents of subjective questions to be scored under each feature extraction model;
and collecting subjective question scoring results of the subjective question answering contents to be scored according to the semantic clustering result set of the subjective question answering contents to be scored.
2. The method for scoring subjective questions based on examination big data and text semantics of claim 1,
the method comprises the steps of collecting the answer content of the subjective questions to be scored, partitioning and standardizing the answer content of the subjective questions to be scored to obtain the answer content of the standard to be scored, and specifically comprises the following steps:
according to the spatial position relation between the subjective question answering area to be scored and the subjective question answering area of the selected subjective question standard answer, carrying out semantic transfer on the selected subjective question standard answer to enable the subjective question answering area of the selected subjective question standard answer to be aligned with the subjective question answering area to be scored, comprising the following steps: according to the line vector of the subjective question answer area to be scored, performing word segmentation processing on the selected subjective question standard answer through word segmentation processing semantic transfer; according to the size of an answer frame of an answer area of the subjective questions to be scored, carrying out equal proportion partition on the selected subjective question standard answers; comparing and pasting the selected subjective question standard answers according to the positions of detection points in the subjective question answering areas to be scored;
Resampling the contents of the subjective question answering areas to be scored according to the subjective question answering area points of the selected subjective question standard answers, so that the number of the vertices of the subjective question answering areas to be scored is the same as that of the vertices of the subjective question answering areas in the selected subjective question standard answers, and the positions of the vertices of the subjective question answering areas are corresponding to those of the vertices of the subjective question answering areas in the selected subjective question standard answers; establishing a score point corresponding relation between the two subjective question answering areas according to the subjective question answering areas to be scored and the vertex serial numbers of the subjective question answering areas of the selected subjective question standard answers;
and according to the subjective question offset line vector, processing the main line word of the subjective question answer content to the same transverse line, and carrying out equal proportion partition on the grid model of the subjective question answer content on the three transverse lines to obtain the answer content of the standard to be scored.
3. The method for scoring subjective questions based on examination big data and text semantics of claim 1,
the plurality of feature extraction models at least comprise a first feature extraction model, a second feature extraction model, a third feature extraction model and a fourth feature extraction model;
inputting the standard answer content to be scored into a plurality of feature extraction models to obtain a semantic clustering result set of the answer content of the subjective questions to be scored under each feature extraction model, wherein the semantic clustering result set comprises the following specific steps:
Collecting a first answer content set, a second answer content set, a third answer content set and a fourth answer content set of the standard answer content to be scored, and respectively inputting a first feature extraction model, a second feature extraction model, a third feature extraction model and a fourth feature extraction model correspondingly;
each feature extraction model operation respectively obtains a first candidate scoring content, a second candidate scoring content, a third candidate scoring content and a fourth candidate scoring content;
and collecting a semantic clustering result set of the subjective question answer content to be scored under each feature extraction model according to the candidate scoring content.
4. The method for scoring subjective questions based on examination big data and text semantics of claim 3,
the first feature extraction model, the second feature extraction model, the third feature extraction model and the fourth feature extraction model are convolutional neural networks subjected to deep learning training;
the first answer content set is a frame-related answer content set, the second answer content set is a content weight-related answer content set, the third answer content set is a validity-related answer content set, and the fourth answer content set is a minutiae answer content set;
The first candidate scoring content comprises an unordered frame, a total score frame and an identification probability value of the total score frame;
the second candidate scoring content comprises identification probability values for infinitesimal weight, smaller weight, moderate weight, larger weight and infinite weight;
the third candidate scoring content comprises identification probability values for invalid and valid;
the fourth candidate scoring content comprises identification probability values for missing detail content, defective detail content and complete detail content; and the semantic clustering result set of the subjective question answer contents to be scored is a tuple formed from the identification probability values of the elements in each candidate scoring content.
5. The method for scoring subjective questions based on examination big data and text semantics of claim 4,
the collecting subjective question scoring results of the subjective question answering contents to be scored according to the semantic clustering result set of the subjective question answering contents to be scored specifically comprises the following steps:
collecting the verification output of each candidate scoring content according to the semantic clustering result set of the subjective question answering content to be scored;
obtaining the subjective question scoring result of the subjective question answering content to be scored according to the verification output;
and storing the subjective question scoring result of the subjective question answering content to be scored together with the verification output as a five-tuple.
6. A subjective question scoring device based on examination big data and text semantics, characterized by comprising:
the standard processing module is used for collecting the answer content of the subjective questions to be scored, and partitioning and standardizing the answer content of the subjective questions to be scored to obtain the answer content of the standard to be scored;
the branch processing module inputs the standard answer contents to be scored into a plurality of feature extraction models to obtain a semantic clustering result set of the answer contents of subjective questions to be scored under each feature extraction model;
and the result output module is used for collecting subjective question scoring results of the subjective question answering contents to be scored according to the semantic clustering result set of the subjective question answering contents to be scored.
7. The subjective question scoring device based on examination big data and text semantics according to claim 6, wherein the collecting of subjective question answering contents to be scored, and the partitioning and standardizing of the subjective question answering contents to be scored to obtain standard answering contents to be scored, specifically includes:
according to the spatial position relation between the subjective question answering area to be scored and the subjective question answering area of the selected subjective question standard answer, carrying out semantic transfer on the selected subjective question standard answer to enable the subjective question answering area of the selected subjective question standard answer to be aligned with the subjective question answering area to be scored, comprising the following steps: according to the line vector of the subjective question answer area to be scored, performing word segmentation processing on the selected subjective question standard answer through word segmentation processing semantic transfer; according to the size of an answer frame of an answer area of the subjective questions to be scored, carrying out equal proportion partition on the selected subjective question standard answers; comparing and pasting the selected subjective question standard answers according to the positions of detection points in the subjective question answering areas to be scored;
Resampling the contents of the subjective question answering areas to be scored according to the subjective question answering area points of the selected subjective question standard answers, so that the number of the vertices of the subjective question answering areas to be scored is the same as that of the vertices of the subjective question answering areas in the selected subjective question standard answers, and the positions of the vertices of the subjective question answering areas are corresponding to those of the vertices of the subjective question answering areas in the selected subjective question standard answers; establishing a score point corresponding relation between the two subjective question answering areas according to the subjective question answering areas to be scored and the vertex serial numbers of the subjective question answering areas of the selected subjective question standard answers;
and according to the subjective question offset line vector, processing the main line word of the subjective question answer content to the same transverse line, and carrying out equal proportion partition on the grid model of the subjective question answer content on the three transverse lines to obtain the answer content of the standard to be scored.
8. The subjective question scoring device based on examination big data and text semantics of claim 7,
the plurality of feature extraction models at least comprise a first feature extraction model, a second feature extraction model, a third feature extraction model and a fourth feature extraction model;
inputting the standard answer content to be scored into a plurality of feature extraction models to obtain a semantic clustering result set of the answer content of the subjective questions to be scored under each feature extraction model, wherein the semantic clustering result set comprises the following specific steps:
Collecting a first answer content set, a second answer content set, a third answer content set and a fourth answer content set of the standard answer content to be scored, and respectively inputting a first feature extraction model, a second feature extraction model, a third feature extraction model and a fourth feature extraction model correspondingly;
each feature extraction model operation respectively obtains a first candidate scoring content, a second candidate scoring content, a third candidate scoring content and a fourth candidate scoring content;
and collecting a semantic clustering result set of the subjective question answer content to be scored under each feature extraction model according to the candidate scoring content.
9. The subjective question scoring device based on examination big data and text semantics of claim 8,
the first feature extraction model, the second feature extraction model, the third feature extraction model and the fourth feature extraction model are convolutional neural networks trained by deep learning;
the first answer content set is a frame-related answer content set, the second answer content set is a content-weight-related answer content set, the third answer content set is a validity-related answer content set, and the fourth answer content set is a detail-related answer content set;
the first candidate scoring content comprises an unordered frame, a total score frame and an identification probability value of the total score frame;
the second candidate scoring content comprises identification probability values for infinitesimal weight, smaller weight, moderate weight, larger weight and infinite weight;
the third candidate scoring content comprises identification probability values for invalid and valid;
the fourth candidate scoring content comprises identification probability values for missing detail content, defective detail content and complete detail content; and the semantic clustering result set of the subjective question answer content to be scored is a multi-tuple formed by the identification probability values of the elements in each candidate scoring content.
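One way to picture the multi-tuple of identification probability values is sketched below. Using a softmax over raw model scores is an assumption (the claim only says that probability values are produced), the short label strings are shorthand for the categories named above, and the frame model is left out for brevity since it would be handled the same way.

```python
import math
from typing import Dict, Sequence, Tuple

# Shorthand labels for the weight, validity and detail categories in claim 9.
LABELS: Dict[str, Tuple[str, ...]] = {
    "weight": ("infinitesimal", "smaller", "moderate", "larger", "infinite"),
    "validity": ("invalid", "valid"),
    "detail": ("missing", "defective", "complete"),
}

def softmax(scores: Sequence[float]) -> Tuple[float, ...]:
    """Turn raw scores into probabilities that sum to 1 (assumed normalisation)."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return tuple(e / total for e in exps)

def clustering_result(raw: Dict[str, Sequence[float]]) -> Tuple[Tuple[float, ...], ...]:
    """Build the multi-tuple: one group of probabilities per candidate content."""
    groups = []
    for name, labels in LABELS.items():
        if len(raw[name]) != len(labels):
            raise ValueError(f"{name}: expected {len(labels)} scores")
        groups.append(softmax(raw[name]))
    return tuple(groups)
```

Each inner tuple lines up position by position with its label tuple, so the probability for "valid" is the second element of the validity group.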
10. The subjective question scoring device based on examination big data and text semantics of claim 9, wherein collecting the subjective question scoring result of the subjective question answer content to be scored according to the semantic clustering result set of the subjective question answer content to be scored specifically comprises:
collecting the verification output of each candidate scoring content according to the semantic clustering result set of the subjective question answer content to be scored;
obtaining the subjective question scoring result of the subjective question answer content to be scored according to the verification output;
and storing the subjective question scoring result of the subjective question answer content to be scored together with the verification output as a five-tuple.
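The five-tuple in claim 10 can be read as one scoring result plus four verification outputs. The sketch below takes that reading and assumes that a verification output is the most probable label of a candidate scoring content. The claims give no scoring formula, so the `DETAIL_FACTOR` table and the rule in `build_record` are hypothetical placeholders, as are all names used here.

```python
from typing import NamedTuple, Sequence

class ScoreRecord(NamedTuple):
    """Five-tuple: the scoring result followed by four verification outputs."""
    score: float
    frame: str
    weight: str
    validity: str
    detail: str

def verification_output(probs: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label with the highest probability (assumed interpretation)."""
    if len(probs) != len(labels):
        raise ValueError("probabilities and labels must align")
    return labels[max(range(len(probs)), key=lambda i: probs[i])]

# Hypothetical factors; not taken from the patent.
DETAIL_FACTOR = {"missing": 0.0, "defective": 0.5, "complete": 1.0}

def build_record(checks: Sequence[str], full_marks: float) -> ScoreRecord:
    """Derive a score from the four verification outputs and pack the five-tuple."""
    frame, weight, validity, detail = checks
    # Placeholder rule: an invalid answer scores zero, otherwise scale by detail.
    score = 0.0 if validity == "invalid" else full_marks * DETAIL_FACTOR[detail]
    return ScoreRecord(score, frame, weight, validity, detail)
```

Storing the verification outputs next to the score keeps each result traceable to the model decisions that produced it.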
CN202310690851.2A 2023-06-12 2023-06-12 Subjective question scoring method and device based on examination big data and text semantics Active CN116629270B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310690851.2A CN116629270B (en) 2023-06-12 2023-06-12 Subjective question scoring method and device based on examination big data and text semantics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310690851.2A CN116629270B (en) 2023-06-12 2023-06-12 Subjective question scoring method and device based on examination big data and text semantics

Publications (2)

Publication Number Publication Date
CN116629270A true CN116629270A (en) 2023-08-22
CN116629270B CN116629270B (en) 2024-02-02

Family

ID=87597341

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310690851.2A Active CN116629270B (en) 2023-06-12 2023-06-12 Subjective question scoring method and device based on examination big data and text semantics

Country Status (1)

Country Link
CN (1) CN116629270B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117252739A (en) * 2023-11-17 2023-12-19 山东山大鸥玛软件股份有限公司 Method, system, electronic equipment and storage medium for evaluating paper

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108121702A (en) * 2017-12-26 2018-06-05 科大讯飞股份有限公司 Mathematics subjective item reads and appraises method and system
CN108764074A (en) * 2018-05-14 2018-11-06 山东师范大学 Subjective item intelligently reading method, system and storage medium based on deep learning
CN109213999A (en) * 2018-08-20 2019-01-15 成都佳发安泰教育科技股份有限公司 A kind of subjective item methods of marking
CN114781611A (en) * 2022-04-21 2022-07-22 润联软件系统(深圳)有限公司 Natural language processing method, language model training method and related equipment
KR20220120253A (en) * 2021-02-23 2022-08-30 주식회사 브레인벤쳐스 Artificial intelligence-based subjective automatic grading system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108121702A (en) * 2017-12-26 2018-06-05 科大讯飞股份有限公司 Mathematics subjective item reads and appraises method and system
CN108764074A (en) * 2018-05-14 2018-11-06 山东师范大学 Subjective item intelligently reading method, system and storage medium based on deep learning
CN109213999A (en) * 2018-08-20 2019-01-15 成都佳发安泰教育科技股份有限公司 A kind of subjective item methods of marking
KR20220120253A (en) * 2021-02-23 2022-08-30 주식회사 브레인벤쳐스 Artificial intelligence-based subjective automatic grading system
CN114781611A (en) * 2022-04-21 2022-07-22 润联软件系统(深圳)有限公司 Natural language processing method, language model training method and related equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117252739A (en) * 2023-11-17 2023-12-19 山东山大鸥玛软件股份有限公司 Method, system, electronic equipment and storage medium for evaluating paper
CN117252739B (en) * 2023-11-17 2024-03-12 山东山大鸥玛软件股份有限公司 Method, system, electronic equipment and storage medium for evaluating paper

Also Published As

Publication number Publication date
CN116629270B (en) 2024-02-02

Similar Documents

Publication Publication Date Title
CN110008933B (en) Universal intelligent marking system and method
JP6472621B2 (en) Classifier construction method, image classification method, and image classification apparatus
CN110929573A (en) Examination question checking method based on image detection and related equipment
CN111460250B (en) Image data cleaning method, image data cleaning device, image data cleaning medium, and electronic apparatus
CN110210286A (en) Abnormality recognition method, device, equipment and storage medium based on eye fundus image
US9171477B2 (en) Method and system for recognizing and assessing surgical procedures from video
CN116629270B (en) Subjective question scoring method and device based on examination big data and text semantics
CN110659584B (en) Intelligent mark-remaining paper marking system based on image recognition
CN111626177B (en) PCB element identification method and device
CN110909035A (en) Personalized review question set generation method and device, electronic equipment and storage medium
JP7077483B2 (en) Problem correction methods, devices, electronic devices and storage media for mental arithmetic problems
CN112381099A (en) Question recording system based on digital education resources
CN112347997A (en) Test question detection and identification method and device, electronic equipment and medium
CN112991343B (en) Method, device and equipment for identifying and detecting macular region of fundus image
CN109978868A (en) Toy appearance quality determining method and its relevant device
CN111914706A (en) Method and device for detecting and controlling quality of character detection output result
CN110705610A (en) Evaluation system and method based on handwriting detection and temporary writing capability
KR100795951B1 (en) System for grading examination paper and control method
CN112016334A (en) Appraising method and device
CN110335244A (en) A kind of tire X-ray defect detection method based on more Iterative classification devices
CN113673631B (en) Abnormal image detection method and device
CN115937543A (en) Neural network-based pelvis image key point identification method and system
TWI453703B (en) Method and system for assessment of learning
CN113392844A (en) Deep learning-based method for identifying text information on medical film
CN112613500A (en) Campus dynamic scoring system based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant