CN112528601A - Question bank construction method, device, equipment and storage medium based on Word document - Google Patents


Publication number
CN112528601A
Authority
CN
China
Prior art keywords
question
test
answer
word document
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011510690.7A
Other languages
Chinese (zh)
Inventor
姚俊峰
彭兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yuande Education Technology Co ltd
Original Assignee
Shenzhen Yuande Education Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yuande Education Technology Co ltd filed Critical Shenzhen Yuande Education Technology Co ltd
Priority to CN202011510690.7A priority Critical patent/CN112528601A/en
Publication of CN112528601A publication Critical patent/CN112528601A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/151Transformation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/166Editing, e.g. inserting or deleting
    • G06F40/177Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a method, a device, equipment and a storage medium for constructing a question bank based on a Word document. The question bank construction method comprises the following steps: filing a test paper or test question book to form a Word document; loading the Word document with a preset program; sequentially identifying question type features, test question features and reference answer features in the Word document; obtaining the question types, test question contents and answer contents included in the Word document based on the question type features, the test question features and the reference answer features; binding the test question contents to the answer contents; and storing the bound test question contents and answer contents in a test question library. With this method, test papers and test question books can be entered automatically and split into test questions and answers without adding auxiliary labels; the steps are simple, and the efficiency of question bank construction is greatly improved.

Description

Question bank construction method, device, equipment and storage medium based on Word document
Technical Field
The invention relates to the technical field of education, and in particular to a question bank construction method, device, equipment and storage medium based on a Word document.
Background
The technologies currently on the market for generating question banks by importing test papers are mainly the following. First, the software form-filling method, which is similar to writing an e-mail: an operator copies the electronic content into edit boxes. Second, the Excel table import method, which is operated in the same manner as the first. Third, the Word document format-painter method, which predefines specific styles, such as a question type style, a question stem style and an answer style; each part of the document is then manually painted with the appropriate style. Fourth, the Word document method combined with a complete set of additional labels that help the program segment the test paper; it is operated like the style method, except that the styles are replaced by labels, for example stem, option and answer labels.
These methods of generating a question bank suffer from long entry time, high operating difficulty and a tendency to introduce errors. They also do not support parsing mathematical formulas: a formula has to be converted manually into a picture for entry, so its typesetting format is not visible and, being a picture, it cannot be displayed directly as a formula.
Accordingly, the prior art is yet to be improved and developed.
Disclosure of Invention
The invention mainly aims to solve the technical problems of long input time and low efficiency of the existing question bank construction method.
The invention provides a question bank construction method based on Word documents, which comprises the following steps:
filing the test paper or the test question book to form a Word document;
loading the Word document by using a preset program;
sequentially identifying question type characteristics, test question characteristics and reference answer characteristics in the Word document;
obtaining question types, question contents and answer contents included in the Word document based on the question type characteristics, the test question characteristics and the reference answer characteristics;
binding the test question content and the answer content;
and storing the bound test question content and the bound answer content into a test question library.
Optionally, in a first implementation manner of the first aspect of the present invention, the sequentially identifying a question type feature, a question feature, and a reference answer feature in the Word document includes:
reading paragraphs in the Word document;
and intercepting the first two characters in the paragraph, and judging whether the first two characters in the paragraph are a question type feature, a test question feature or a reference answer feature.
Optionally, in a second implementation manner of the first aspect of the present invention, the intercepting the first two characters in the paragraph and determining whether the first two characters in the paragraph are question type features, question features, and reference answer features includes:
if the first two characters in the paragraph are a question type feature, switching the program to the question type identification state, and then obtaining the question type;
if the first two characters in the paragraph are a test question feature, switching the program to the test question identification state, and then obtaining the test question content;
if the first two characters in the paragraph are a reference answer feature, switching the program to the reference answer identification state, and then obtaining the answer content.
Optionally, in a third implementation manner of the first aspect of the present invention, the storing the bound test question content and the answer content in a test question library includes:
generating an adjacency list of the question types, the test question contents and the answer contents;
and storing the test question content and the answer content into a test question library according to the adjacency list.
Optionally, in a fourth implementation manner of the first aspect of the present invention, the test question features include a question number feature, an option feature, a formula feature, and a table feature.
Optionally, in a fifth implementation manner of the first aspect of the present invention, in the Word document, the question type feature uses a Chinese ordinal numeral or a Roman numeral followed by a period or an enumeration comma (、); the question number feature uses Arabic numerals followed by a period or an enumeration comma; the option feature uses a capital English letter followed by a period or an enumeration comma; and the reference answer feature uses the "reference answer" keyword, after which the corresponding question numbers, answers and analyses are written.
Optionally, in a sixth implementation manner of the first aspect of the present invention, the program loads the Word document in a read-only manner.
A second aspect of the present invention provides a question bank construction device, including:
the filing module is used for filing the test paper or the test question book to form a Word document;
the loading module is used for loading the Word document by using a preset program;
the identification module is used for sequentially identifying question type characteristics, test question characteristics and reference answer characteristics in the Word document;
the obtaining module is used for obtaining question types, test question contents and answer contents included in the Word document based on the question type characteristics, the test question characteristics and the reference answer characteristics;
the binding module is used for binding the test question content and the answer content;
and the storage module is used for storing the bound test question content and the bound answer content into a test question library.
A third aspect of the present invention provides question bank construction equipment, including: a memory having instructions stored therein and at least one processor, the memory and the at least one processor being interconnected by a line;
the at least one processor calls the instructions in the memory to cause the question bank constructing device to execute the question bank constructing method based on the Word document according to any one of the above items.
A fourth aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for building a question bank based on Word documents according to any one of the above items.
Beneficial effects: the invention provides a method, a device, equipment and a storage medium for constructing a question bank based on a Word document. The question bank construction method comprises the following steps: filing a test paper or test question book to form a Word document; loading the Word document with a preset program; sequentially identifying question type features, test question features and reference answer features in the Word document; obtaining the question types, test question contents and answer contents included in the Word document based on the question type features, the test question features and the reference answer features; binding the test question contents to the answer contents; and storing the bound test question contents and answer contents in a test question library. With this method, test papers and test question books can be entered automatically and split into test questions and answers without adding auxiliary labels; the steps are simple, and the efficiency of question bank construction is greatly improved.
Drawings
FIG. 1 is a schematic diagram of an embodiment of a question bank construction method based on Word documents according to the present invention;
FIG. 2 is a schematic diagram of an embodiment of the question bank construction device according to the present invention;
FIG. 3 is a schematic diagram of an embodiment of the question bank construction equipment according to the present invention.
Detailed Description
The embodiment of the invention provides a method, a device, equipment and a storage medium for establishing a question bank based on a Word document.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For convenience of understanding, a specific flow of the embodiment of the present invention is described below, and with reference to fig. 1, a first aspect of the present invention provides a question bank construction method based on Word documents, where the question bank construction method includes:
S100, filing the test paper or the test question book to form a Word document;
In this embodiment, the Word document contains many elements, such as question types, question stems, options, answers and analyses. Each element has different features, and the number of lines each element occupies in the Word document is not fixed.
S200, loading the Word document by using a preset program;
In this embodiment, Word provides hundreds of objects that a program can interact with, organized in a hierarchy that closely follows the user interface. At the top of the hierarchy is the Application object, which represents the current instance of Word. The Application object contains the Document, Selection, Bookmark and Range objects, each of which has many methods and properties that can be used to manipulate and interact with it.
These objects overlap. For example, the Document and Selection objects are both members of the Application object, but the Document object is also a member of the Selection object, and both the Document and Selection objects contain Bookmark and Range objects. The overlap exists because there are multiple ways to access the same type of object. For example, when applying formatting to a Range object, you may want to access the range of the current selection, of a particular paragraph, of a section, or of the entire document.
S300, sequentially identifying question type characteristics, test question characteristics and reference answer characteristics in the Word document;
In this embodiment, a question type feature is, for example, "一、单选题" (I. Single-choice questions); a test question feature is, for example, "1. x + y = ____"; and the reference answer feature is handled, for example, by checking whether the paragraph contains the "reference answer" keyword and, if so, starting to parse the answers that follow.
S400, obtaining question types, test question contents and answer contents included in the Word document based on the question type features, the test question features and the reference answer features; in this embodiment, the test question content includes, for example, a question stem part and an option part, and the answer content includes an analysis part.
S500, binding the test question content and the answer content; in this embodiment, for example, the first test question is bound one-to-one to the answer to the first test question; specifically, the binding is performed according to the question number that precedes both the test question and the answer;
S600, storing the bound test question content and the bound answer content into a test question library.
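Step S500 can be illustrated with a minimal Python sketch that pairs each test question with the answer carrying the same question number. The `BoundQuestion` type, the `bind_by_number` helper and the dictionary inputs are hypothetical names chosen for this illustration; the patent does not specify these data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BoundQuestion:
    number: int            # question number shared by the question and its answer
    content: str           # stem (and options) of the test question
    answer: Optional[str]  # None when no answer with this number was found


def bind_by_number(questions: dict, answers: dict) -> list:
    """Pair each question with the answer that has the same question number."""
    return [
        BoundQuestion(number, content, answers.get(number))
        for number, content in sorted(questions.items())
    ]


# Hypothetical sample data: question number -> text.
questions = {1: "x + y = ____", 2: "Which option is correct? A. ... B. ..."}
answers = {1: "5", 2: "B"}
bound = bind_by_number(questions, answers)
```

Keeping unanswered questions in the result (with `answer` set to `None`) is one possible design choice; it lets a later step decide whether a missing answer should count as a segmentation error.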
Specifically, when the Word document-based question bank construction method is applied to test question books, the whole operation process can be summarized as follows: the Word documents are filed according to the hierarchical structure of the book catalogue; the program loads a Word document and intelligently segments it; the test paper information is stored in JSON format, and if segmentation fails, the error information is written to a log file; the program then calls a Web API service written on ASP.NET to upload the JSON data, and the service program writes the data into a SQL Server database through Entity Framework.
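The "store as JSON, otherwise log the error" part of this process might look roughly like the Python sketch below. The dictionary layout, the logger name `paper_import` and the rule that a question without an answer counts as a segmentation error are all assumptions made for illustration; the upload to the Web API service is not shown.

```python
import json
import logging
from typing import Optional

# Hypothetical logger; attaching a logging.FileHandler would send errors to a log file.
logger = logging.getLogger("paper_import")


def paper_to_json(paper: dict) -> Optional[str]:
    """Serialize segmented paper data, or log errors and return None."""
    # Assumed error rule: every question must have a non-empty answer.
    errors = [
        f"question {question.get('number')} has no answer"
        for category in paper.get("categories", [])
        for question in category.get("questions", [])
        if not question.get("answer")
    ]
    if errors:
        for message in errors:
            logger.error(message)
        return None
    # ensure_ascii=False keeps Chinese text readable in the JSON output.
    return json.dumps(paper, ensure_ascii=False)


good_paper = {
    "categories": [
        {"name": "single_choice",
         "questions": [{"number": 1, "content": "x + y = ____", "answer": "B"}]}
    ]
}
bad_paper = {
    "categories": [
        {"name": "single_choice",
         "questions": [{"number": 1, "content": "x + y = ____", "answer": ""}]}
    ]
}
```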
In an optional implementation manner of the first aspect of the present invention, the sequentially identifying a question type feature, a question feature, and a reference answer feature in the Word document includes:
reading paragraphs in the Word document;
and intercepting the first two characters in the paragraph, and judging whether the first two characters in the paragraph are a question type feature, a test question feature or a reference answer feature.
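A check on the first two characters of a paragraph could be sketched in Python as follows. The character sets, the use of the Chinese keyword "参考答案" for "reference answer" and the returned label names are assumptions for illustration; multi-character ordinals such as "十一" are deliberately not handled in this simplified version.

```python
CHINESE_ORDINALS = "一二三四五六七八九十"
ROMAN_NUMERALS = "ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩ"
SEPARATORS = ".．、"  # half-width period, full-width period, enumeration comma
DIGITS = "0123456789"


def classify_prefix(paragraph: str) -> str:
    """Classify a paragraph by looking only at its first two characters."""
    head = paragraph.lstrip()[:2]
    if len(head) < 2:
        return "other"
    first, second = head
    if head == "参考":  # first two characters of the assumed keyword "参考答案"
        return "reference_answer"
    if first in CHINESE_ORDINALS + ROMAN_NUMERALS and second in SEPARATORS:
        return "question_type"
    # "1." and "12" both start a numbered question.
    if first in DIGITS and (second in DIGITS or second in SEPARATORS):
        return "question"
    if first in "ABCDEF" and second in SEPARATORS:
        return "option"
    return "other"
```

For example, `classify_prefix("一、单选题")` yields `"question_type"` and `classify_prefix("A. 3")` yields `"option"`.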
In this embodiment, the identification of question type features, test question features and reference answer features in the Word document may also be implemented by a neural network model that integrates techniques such as image processing and natural language processing algorithms. Specifically, the test paper image in the Word document is preprocessed to obtain a binary image; the binary image is analyzed to determine its layout areas; text line detection is performed on each layout area; the text lines of each layout area are traversed, and the corresponding text image is extracted from the largest circumscribed rectangle of each text line; the text image is input to a text recognition model to recognize the text information of the test paper; the text information is combined with the corresponding text lines to obtain a Word document with recognized text; and the test question information in the test paper image is then extracted according to the different feature labels.
In an optional implementation manner of the first aspect of the present invention, the intercepting the first two characters in the paragraph and determining whether the first two characters in the paragraph are question type features, test question features, and reference answer features includes:
if the first two characters in the paragraph are a question type feature, switching the program to the question type identification state, and then obtaining the question type;
if the first two characters in the paragraph are a test question feature, switching the program to the test question identification state, and then obtaining the test question content;
if the first two characters in the paragraph are a reference answer feature, switching the program to the reference answer identification state, and then obtaining the answer content.
In this embodiment, the program uses a state machine to identify the question type features, test question features and reference answer features in paragraphs. In the Word document, the question types and test questions can be placed together, with the reference answers placed after the test questions. The program reads the beginning of each paragraph in the Word document from top to bottom to determine whether the region being read is a test question region or an answer region, and thereby identifies every part of the Word document and extracts the test question content and answer content.
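The state machine described above can be sketched in Python as a single top-to-bottom pass over the paragraphs. This is a simplified illustration, not the patented implementation: the `ParseState` names, the regular expressions, the keyword "参考答案" and the dictionary output format are assumptions.

```python
import re
from enum import Enum, auto


class ParseState(Enum):
    QUESTION_TYPE = auto()
    QUESTION = auto()
    REFERENCE_ANSWER = auto()


TYPE_RE = re.compile(r"^[一二三四五六七八九十ⅠⅡⅢⅣⅤⅥ]+[.．、]")
QUESTION_RE = re.compile(r"^(\d+)[.．、]")
ANSWER_KEYWORD = "参考答案"  # assumed literal for the "reference answer" keyword


def segment(paragraphs):
    """Return (questions, answers), both keyed by question number."""
    state, question_type, current = None, None, None
    questions, answers = {}, {}
    for text in (p.strip() for p in paragraphs):
        if not text:
            continue
        if ANSWER_KEYWORD in text:           # everything after this is answers
            state, current = ParseState.REFERENCE_ANSWER, None
            continue
        if TYPE_RE.match(text):              # e.g. "一、单选题"
            if state is not ParseState.REFERENCE_ANSWER:
                state, question_type, current = ParseState.QUESTION_TYPE, text, None
            continue
        match = QUESTION_RE.match(text)
        if match:                            # e.g. "1. x + y = ____" or "1. B"
            current = int(match.group(1))
            body = text[match.end():].strip()
            if state is ParseState.REFERENCE_ANSWER:
                answers[current] = [body]
            else:
                state = ParseState.QUESTION
                questions[current] = {"type": question_type, "content": [body]}
            continue
        if current is not None:              # continuation line (options, analysis)
            if state is ParseState.REFERENCE_ANSWER:
                answers[current].append(text)
            else:
                questions[current]["content"].append(text)
    return questions, answers


sample = ["一、单选题", "1. 1 + 1 = ____", "A. 1", "B. 2",
          "二、填空题", "2. x = ____",
          "参考答案", "一、单选题", "1. B", "2. 3"]
questions, answers = segment(sample)
```

Because the same numbering pattern appears in both regions, the current state is what decides whether "1. B" is a test question or an answer.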
In an optional implementation manner of the first aspect of the present invention, the storing the bound test question content and the answer content in a test question library includes:
generating an adjacency list of the question types, the test question contents and the answer contents;
and storing the test question content and the answer content into a test question library according to the adjacency list.
In this embodiment, the structure of the adjacency list is, for example: single-choice questions: test question 1 / test question 2; fill-in-the-blank questions: test question 3; cloze questions: first passage, test question 4 / first passage, test question 5 / second passage, test question 6 / second passage, test question 7. When stored in the test question library, single-choice questions are stored in the single-choice storage area, fill-in-the-blank questions in the fill-in-the-blank storage area, and cloze questions in the cloze storage area.
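A minimal Python sketch of such an adjacency list maps each question type to the list of its test questions and then stores each question in the area for its type. The function names, the type labels and the in-memory dictionary standing in for the test question library are hypothetical.

```python
from collections import defaultdict


def build_adjacency(records):
    """Build {question_type: [question_id, ...]} from (type, id) pairs."""
    adjacency = defaultdict(list)
    for question_type, question_id in records:
        adjacency[question_type].append(question_id)
    return dict(adjacency)


def store(adjacency, contents, bank):
    """Store each question in the storage area of its question type."""
    for question_type, question_ids in adjacency.items():
        area = bank.setdefault(question_type, {})
        for question_id in question_ids:
            area[question_id] = contents[question_id]
    return bank


records = [("single_choice", 1), ("single_choice", 2), ("fill_blank", 3)]
contents = {1: "Q1 + answer", 2: "Q2 + answer", 3: "Q3 + answer"}
adjacency = build_adjacency(records)
bank = store(adjacency, contents, {})
```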
In an optional implementation manner of the first aspect of the present invention, the test question features include a question number feature, an option feature, a formula feature, and a table feature. Specifically, the invention supports segmentation of essentially all question types: single-choice, multiple-choice, fill-in-the-blank, reading comprehension, cloze, problem-solving and true/false questions. It supports parsing the basic typesetting styles required by test questions, such as tables, underlines, emphasis dots and centering. It supports parsing mathematical formulas and converting them into two internationally common formats, MathML and LaTeX. Test papers can be segmented automatically without adding auxiliary labels such as stem or option labels.
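The detailed steps later in the description convert such typesetting styles into HTML5 tags. A rough Python sketch of that idea for text runs is given below; the `Run` type and its fields are hypothetical stand-ins for whatever the document library exposes, and only underline, superscript and subscript are covered.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Run:
    """A piece of text with uniform formatting (hypothetical model)."""
    text: str
    underline: bool = False
    superscript: bool = False
    subscript: bool = False


def run_to_html(run: Run) -> str:
    """Wrap the escaped text of one run in the matching HTML5 tags."""
    html = escape(run.text)
    if run.superscript:
        html = f"<sup>{html}</sup>"
    if run.subscript:
        html = f"<sub>{html}</sub>"
    if run.underline:
        html = f"<u>{html}</u>"
    return html


def paragraph_to_html(runs) -> str:
    """Concatenate the HTML of all runs in a paragraph."""
    return "".join(run_to_html(run) for run in runs)


html = paragraph_to_html([
    Run("x"),
    Run("2", superscript=True),
    Run(" = ____", underline=True),
])
```

Escaping the text before wrapping it keeps characters such as `<` in a question stem from being interpreted as markup.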
In an alternative implementation manner of the first aspect of the present invention, in the Word document, the question type feature uses a Chinese ordinal numeral or a Roman numeral followed by a period or an enumeration comma (、); the question number feature uses Arabic numerals followed by a period or an enumeration comma; the option feature uses a capital English letter followed by a period or an enumeration comma; and the reference answer feature uses the "reference answer" keyword, after which the corresponding question numbers, answers and analyses are written.
In this embodiment, the following segmentation rules were obtained by analyzing sample test papers and test question books. Options start with a capital letter A to F, followed by a period (full-width or half-width) or an enumeration comma (、). Question numbers are Arabic numerals (digits 0-9), followed by a period (full-width or half-width) or an enumeration comma. Answer blanks use an underline style, a solid underline or round brackets. Question types use Chinese ordinal numerals (一, 二, and so on) or Roman numerals (Ⅰ, Ⅱ, and so on), followed by a period (full-width or half-width) or an enumeration comma. In reading comprehension questions, the articles are distinguished by (一), (二), and so on. As long as a document follows these rules, the program can automatically segment information such as question numbers, question stems, options and test questions.
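These segmentation rules can be written down as regular expressions. The Python sketch below is one possible encoding, assuming the separators are the half-width period, the full-width period and the enumeration comma; the pattern names and the `match_rule` helper are hypothetical.

```python
import re

SEP = r"[.．、]"  # half-width period, full-width period, enumeration comma

TYPE_RE = re.compile(rf"^\s*([一二三四五六七八九十]+|[ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩ]+){SEP}")
NUMBER_RE = re.compile(rf"^\s*(\d+){SEP}")
OPTION_RE = re.compile(rf"^\s*([A-F]){SEP}")
ARTICLE_RE = re.compile(r"^\s*[(（]([一二三四五六七八九十]+)[)）]")
# Answer blanks: a run of underscores or an empty pair of round brackets.
BLANK_RE = re.compile(r"_{2,}|[(（]\s*[)）]")

RULES = (
    ("question_type", TYPE_RE),
    ("question_number", NUMBER_RE),
    ("option", OPTION_RE),
    ("article", ARTICLE_RE),
)


def match_rule(line: str):
    """Return (rule_name, marker) for the first matching rule, else None."""
    for name, pattern in RULES:
        match = pattern.match(line)
        if match:
            return name, match.group(1)
    return None


def has_blank(line: str) -> bool:
    """Report whether the line contains an answer blank."""
    return BLANK_RE.search(line) is not None
```

Blanks that exist only as an underline style (with no underscore characters in the text) cannot be detected this way; they would have to be read from the run formatting instead.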
In an alternative implementation manner of the first aspect of the present invention, the program loads the Word document in a read-only manner.
In a specific embodiment of the method for constructing the question bank based on the Word document in the first aspect of the present invention, the method includes the following steps:
Step 1: create a Document object and load the Word file in read-only mode;
Step 2: obtain all Section objects of the document;
Step 3: read the Section object at sectionIndex, where sectionIndex is a loop counter that starts from zero and whose maximum value is document.
The first time a Section object is obtained through Section[sectionIndex], sectionIndex equals 0; the second time, sectionIndex equals 1; and so on. Each time a Section object is obtained, determine whether sectionIndex is smaller than Section. If yes, execute step 4;
Step 4: obtain all Paragraph objects through document.
Step 5: read the Paragraph object at paragraphIndex, where paragraphIndex is a loop counter that starts from zero and whose maximum value is paragraphLength-1;
The first time a Paragraph object is obtained through Paragraph[paragraphIndex], paragraphIndex equals 0; the second time, paragraphIndex equals 1; and so on. Each time a Paragraph object is obtained, determine whether paragraphIndex is smaller than Paragraph. If yes, execute step 6;
Step 6: obtain the paragraph level, font size and font display style through the Format object.
Step 7: obtain all ParagraphItem objects through the Paragraph.
Step 8: read the ParagraphItem object at paragraphItemIndex, where paragraphItemIndex is a loop counter that starts from zero and whose maximum value is ParagraphItem.Length-1;
The first time a ParagraphItem object is obtained through ParagraphItem[paragraphItemIndex], paragraphItemIndex equals 0; the second time, paragraphItemIndex equals 1; and so on. Each time a ParagraphItem object is obtained, determine whether paragraphItemIndex is smaller than ParagraphItem.Length-1; if not, the paragraph has been completely parsed, and step 10 is executed; if yes, execute step 9;
Step 9: process the paragraphItem according to its DocumentObjectType, and execute step 8 after the processing is finished;
Condition 1: if the DocumentObjectType equals DocumentObjectType.TextRange, obtain the underline, superscript, subscript and other styles of the paragraph from the paragraphItem. Then convert these styles into the corresponding HTML5 tags, and append the parsed content to the parsing buffer;
Condition 2: if the DocumentObjectType equals DocumentObjectType.Picture, convert the paragraphItem into a DocPicture object and call DocPicture.Image to save the picture locally; then generate an HTML5 <img> tag, and append the parsed content to the parsing buffer;
Condition 3: if the DocumentObjectType equals DocumentObjectType.Object, convert the paragraphItem into a DocObject object and determine from the DocObject.PackageFileName string whether the OLE object is an audio file; if so, save the binary data of DocObject.NativeData locally; then generate an HTML5 <audio> tag, and append the parsed content to the parsing buffer;
Condition 4: if the DocumentObjectType equals DocumentObjectType, convert the paragraphItem into an OfficeMath object and call an OfficeMath. If the program is set to convert to the LaTeX output format, a
System.Xml.Xsl.XslCompiledTransform object loads the "MML2LaTex\mmltex.xsl" file, and then the XslCompiledTransform. is called. If the program is set to convert to the MathML format, the OfficeMath.ToMathMLCode method outputs a string, and the string.Replace("mml:", "") method is called on that string; then append the parsed content to the parsing buffer;
Condition 5: if the DocumentObjectType equals DocumentObjectType, add an HTML5 <br/> tag to the parsing string buffer; then append the parsed content to the parsing buffer;
Condition 6: if the DocumentObjectType equals DocumentObjectType.Table, convert the paragraphItem into a Table object, read Table.Rows to obtain the TableRow objects of all rows, and read all columns through TableRow.Cells. Call TableRow.Cells.Paragraphs to obtain the number of rows and columns of the table, and obtain the content of each cell through TableRow.Cells.ParagraphsItems.Items; all content is represented with HTML5 table tags; then append the parsed content to the parsing buffer;
Step 10: determine whether PaperParseStatus indicates the reference answer parsing part; if so, execute step 100; otherwise, execute step 11;
Step 11: check whether the content of the parsing buffer is a question type start paragraph by calling string.StartsWith to compare it with the question type features ["一、" … "六、"] or ["Ⅰ、", etc.]; if not, execute step 12; if so, set the state of PaperParseStatus to PaperParseStatus. Create a QuestionCategory object to store the question type information; add the QuestionCategory to the QuestionCategoryControl object; execute step 5;
Step 12: check whether the content of the parsing buffer is a question start paragraph by calling Regex to match the string against the regular expression 'Lambda \ s \ [1-9| l ] \ d \ {1} {0 }'; if yes, execute step 13; if not, call string.StartsWith to compare it with the reading comprehension features ["(一)", "(二)", "(三)", "(四)", "(五)", "(六)"] among the question types; if yes, execute step 13; if not, execute step 14;
Step 13: set PaperParseStatus equal to PaperParseStatus. Set QuestionControl.eQuestionParseStatus to EQuestionParseStatus.QuestionParse_Title; add a Questions object to the QuestionCategoryControl object; execute step 5;
Step 14: check whether the string in the parsing buffer is a reference answer start paragraph by calling paragraph.IndexOf("reference answer") to see whether it matches; if not, execute step 15; if so, set the state of PaperParseStatus to PaperParseStatus. Execute step 5;
Step 15: process according to the state of PaperParseStatus, and execute step 5 after the processing is finished;
Condition 1: PaperParseStatus equals PaperParseStatus.
Condition 2: PaperParseStatus equals PaperParseStatus.
Condition 3: PaperParseStatus equals PaperParseStatus.
Step 153: check whether the content of the parsing buffer is an option start paragraph by comparing it with the option features ["A", "B", "C", "D", "E", "F"], each followed by a period or an enumeration comma; if yes, call the Questions.AddOptionParagraph method to save the option content; set the state of eQuestionParseStatus to EQuestionParseStatus.QuestionParse_Option; execute step 5;
Step 154: check whether the content of the parsing buffer is the start paragraph of a reading comprehension article by determining whether eQuestionParseStatus equals EQuestionParseStatus.QuestionParse_Option and the content matches ["B", "C", "D", "E", "F"]; if not, execute step 5; if yes, store the string; save the Questions object to the QuestionCategory object; execute step 5;
Step 100: check whether the text in the parsing buffer is the beginning paragraph of a question type in the answer section by calling string.StartsWith to compare it with the question type features (Chinese ordinals such as "one," through "six,", or Roman numerals such as "I,"); if not, execute step 101; if so, set the state of PaperParseStatus to PaperParseStatus. Create a QuestionCategory object to store the question type information; add the QuestionCategory to the QuestionCategoryControl object; execute step 5;
Step 101: check whether the text in the parsing buffer is the beginning paragraph of an answer to a question by calling Regex to match the string against the regular expression 'Lambda \ s \ [1-9| l ] \ d \ {1} {0 }'; if yes, execute step 102; if not, execute step 103;
Step 102: classify and process the text according to the content of the buffer, and execute step 5 after the processing is finished;
Condition one: when the regular expression matches "1, answer xxxx", the answer text is extracted with string.Substring;
Condition two: when the regular expression matches "22, solution: xxxx", the question analysis text is extracted with string.Substring;
Condition three: when the regular expression matches "1.C analysis", where the analysis part must contain Chinese characters, the answer and the analysis are separated with Regex.Split;
Step 103: set PaperParseStatus equal to PaperParseStatus. Set QuestionControl.eQuestionParseStatus to EQuestionParseStatus.QuestionParse_Title; add a Questions object to the QuestionCategoryControl object; execute step 5;
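The three answer-line conditions under step 102 can be illustrated with a short sketch. This is a minimal Python illustration and not the applicant's implementation: Python's `re` module and string slicing stand in for the Regex, string.Substring, and Regex.Split calls named in the text, and the pattern `NUMBER_RE`, the function name `parse_answer_line`, the English keywords, and the assumption that whitespace separates the answer letter from the analysis are all hypothetical.

```python
import re

# Hypothetical pattern: a question number followed by a dot, pause mark, or comma.
NUMBER_RE = re.compile(r"^\s*(\d+)\s*[.、,]\s*(.*)$")
# Any CJK unified ideograph, used for the "must contain Chinese characters" check.
CJK_RE = re.compile(r"[\u4e00-\u9fff]")


def parse_answer_line(line):
    """Return (question_number, answer, analysis), or None if the line
    does not look like the start of an answer."""
    m = NUMBER_RE.match(line)
    if m is None:
        return None
    number, body = int(m.group(1)), m.group(2)
    lowered = body.lower()
    if lowered.startswith("answer"):
        # Condition one: take the answer text as a substring.
        return number, body[len("answer"):].lstrip(" :："), ""
    if lowered.startswith("solution"):
        # Condition two: take the analysis text as a substring.
        return number, "", body[len("solution"):].lstrip(" :：")
    if CJK_RE.search(body):
        # Condition three: split the answer from the analysis
        # (assumes they are separated by whitespace).
        parts = re.split(r"\s+", body, maxsplit=1)
        if len(parts) == 2:
            return number, parts[0], parts[1]
    return None
```

For example, `parse_answer_line("22, solution: x = 3")` yields the question number 22 with an empty answer and the analysis text `x = 3`, while a paragraph with no leading number returns `None` and would fall through to step 103.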
Referring to fig. 2, the second aspect of the present invention further provides a question bank constructing apparatus, including:
the filing module 10 is used for filing the test paper or the test question book to form a Word document;
the loading module 20 is used for loading the Word document by using a preset program;
the identification module 30 is used for sequentially identifying question type characteristics, test question characteristics and reference answer characteristics in the Word document;
an obtaining module 40, configured to obtain question types, question contents, and answer contents included in the Word document based on the question type features, the test question features, and the reference answer features;
a binding module 50, configured to bind the test question content and the answer content;
and a storage module 60, configured to store the bound test question content and the bound answer content in a test question library.
In an optional implementation manner of the second aspect of the present invention, the identification module is further configured to read a paragraph in the Word document, intercept the first two characters of the paragraph, and judge whether those two characters are a question type feature, a test question feature, or a reference answer feature.
In an optional implementation manner of the second aspect of the present invention, the identification module is further configured to: if the first two characters in the paragraph are a question type feature, switch the program to the question type identification state and then obtain the question type;
if the first two characters in the paragraph are a test question feature, switch the program to the test question identification state and then obtain the test question content;
if the first two characters in the paragraph are a reference answer feature, switch the program to the reference answer identification state and then obtain the answer content.
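The state switching described above can be sketched as a small state machine. The following Python sketch is illustrative only: the enum `ParseState`, the functions `classify` and `walk`, the feature tables, and the use of "参考答案" as the reference answer keyword are assumptions made for the example, not names or values taken from the patent. It also makes the simplifying assumption that once the reference answer state is entered, later paragraphs stay in it.

```python
from enum import Enum, auto


class ParseState(Enum):
    QUESTION_TYPE = auto()
    QUESTION = auto()
    REFERENCE_ANSWER = auto()


# Hypothetical feature tables for the first two characters of a paragraph.
CHINESE_ORDINALS = "一二三四五六七八九十"
SEPARATORS = ".、"            # dot or pause mark
ANSWER_PREFIX = "参考"        # first two characters of the assumed keyword "参考答案"


def classify(paragraph):
    """Map the first two characters of a paragraph to a state, or None."""
    head = paragraph.strip()[:2]
    if head == ANSWER_PREFIX:
        return ParseState.REFERENCE_ANSWER
    if len(head) == 2 and head[1] in SEPARATORS:
        if head[0] in CHINESE_ORDINALS:
            return ParseState.QUESTION_TYPE
        if head[0] in "123456789":
            return ParseState.QUESTION
    return None


def walk(paragraphs):
    """Pair each paragraph with the state the parser is in when it is read."""
    state = None
    tagged = []
    for paragraph in paragraphs:
        detected = classify(paragraph)
        # Simplification: numbered lines after the keyword are answers,
        # so the reference answer state is never left.
        if detected is not None and state is not ParseState.REFERENCE_ANSWER:
            state = detected
        tagged.append((state, paragraph))
    return tagged
```

Paragraphs that carry no feature, such as option lines or continuation text, inherit the current state, which is how the content following a question number ends up attached to that question.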
In an optional implementation manner of the second aspect of the present invention, the storage module is further configured to generate an adjacency list of the question types, the test question contents, and the answer contents; and storing the test question content and the answer content into a test question library according to the adjacency list.
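One way to read "adjacency list" here is the parent-pointer table model, in which every row records the identifier of its parent. The Python sketch below follows that reading; the patent does not specify a schema, so the `Node` fields, the `kind` labels, and the helper names `build_adjacency_list` and `children` are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    node_id: int
    parent_id: Optional[int]   # None for a top-level question type
    kind: str                  # "type", "question", or "answer"
    content: str


def build_adjacency_list(parsed: List[Tuple[str, List[Tuple[str, str]]]]) -> List[Node]:
    """Flatten (question_type, [(question, answer), ...]) into parent-pointer rows."""
    rows: List[Node] = []
    next_id = 1
    for question_type, items in parsed:
        type_id = next_id
        rows.append(Node(type_id, None, "type", question_type))
        next_id += 1
        for question, answer in items:
            question_id = next_id
            rows.append(Node(question_id, type_id, "question", question))
            next_id += 1
            # The answer row points at its question, which binds the two.
            rows.append(Node(next_id, question_id, "answer", answer))
            next_id += 1
    return rows


def children(rows: List[Node], parent_id: int) -> List[Node]:
    """Return the rows whose parent is the given node."""
    return [row for row in rows if row.parent_id == parent_id]
```

Each row could then be written to a single database table, and the questions of a type or the answer of a question retrieved by filtering on `parent_id`.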
In an optional implementation manner of the second aspect of the present invention, the question features include a question number feature, an option feature, a formula feature, and a table feature.
In an alternative implementation manner of the second aspect of the present invention, in the Word document, the question type feature uses a Chinese ordinal or a Roman numeral, followed by a dot or a pause mark; the question number feature uses Arabic numerals, followed by a dot or a pause mark; the option feature uses a capital English letter, followed by a dot or a pause mark; and the reference answer feature uses the "reference answer" keyword, after which the corresponding question numbers, answers, and analysis are written.
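These formatting conventions lend themselves to a handful of regular expressions. The sketch below is a hedged Python illustration: the patterns, the function name `feature_of`, and the keyword "参考答案" are assumptions, and the Roman numeral class is limited to I, V, and X so that option letters such as "C" and "D" are not mistaken for numerals.

```python
import re

SEP = r"[.、]"  # a dot or a pause mark

# Hypothetical patterns for the conventions described in the text.
QUESTION_TYPE_RE = re.compile(rf"^\s*(?:[一二三四五六七八九十]+|[IVX]+){SEP}")
QUESTION_NUMBER_RE = re.compile(rf"^\s*[0-9]+{SEP}")
OPTION_RE = re.compile(rf"^\s*[A-Z]{SEP}")
ANSWER_KEYWORD = "参考答案"  # assumed original of the "reference answer" keyword


def feature_of(paragraph):
    """Name the feature a paragraph starts with, or "content" if none."""
    if paragraph.strip().startswith(ANSWER_KEYWORD):
        return "reference_answer"
    # Question types are tested before options, so "V." counts as a numeral.
    if QUESTION_TYPE_RE.match(paragraph):
        return "question_type"
    if QUESTION_NUMBER_RE.match(paragraph):
        return "question_number"
    if OPTION_RE.match(paragraph):
        return "option"
    return "content"
```

Because the conventions are positional, documents that do not follow them (for instance, options written in lowercase) would be classified as plain content under this sketch.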
In an alternative implementation manner of the second aspect of the present invention, the program loads the Word document in a read-only manner.
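As an illustration of read-only loading, the following Python sketch opens a document without write access and extracts its paragraph text. It assumes the document is in the .docx (Office Open XML) format, which is a ZIP archive containing `word/document.xml`; the patent does not state the file format or the loading API, and the function name `read_paragraphs` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_paragraphs(docx_file):
    """Return the text of each paragraph; the archive is opened in read mode only."""
    with zipfile.ZipFile(docx_file, mode="r") as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread over one or more w:t runs.
        text = "".join(t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

The sketch ignores formulas, tables, and images, and a paragraph nested inside another (for example in a text box) would have its text counted twice, so it is only a starting point for the paragraph-by-paragraph reading the method relies on.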
Fig. 3 is a schematic structural diagram of a question bank constructing apparatus provided by an embodiment of the present invention. The apparatus may vary considerably with configuration or performance, and may include one or more processors 70 (central processing units, CPUs), a memory 80, and one or more storage media 90 (e.g., one or more mass storage devices) for storing applications or data. The memory and the storage medium may be transient or persistent storage. The program stored in the storage medium may include one or more modules (not shown), and each module may include a series of instruction operations for the question bank constructing apparatus. Further, the processor may be configured to communicate with the storage medium and to execute, on the question bank constructing apparatus, the series of instruction operations in the storage medium.
The question bank constructing apparatus may further include one or more power supplies 100, one or more wired or wireless network interfaces 110, one or more input/output interfaces 120, and/or one or more operating systems, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. Those skilled in the art will appreciate that the configuration shown in fig. 3 does not limit the question bank constructing apparatus, which may include more or fewer components than those shown, a combination of some components, or a different arrangement of components.
The present invention also provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium, or a volatile computer-readable storage medium, wherein instructions are stored in the computer-readable storage medium, and when the instructions are run on a computer, the instructions cause the computer to execute the steps of the method for building the question bank based on the Word document.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses, and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A question bank construction method based on Word documents is characterized by comprising the following steps:
filing the test paper or the test question book to form a Word document;
loading the Word document by using a preset program;
sequentially identifying question type characteristics, test question characteristics and reference answer characteristics in the Word document;
obtaining question types, question contents and answer contents included in the Word document based on the question type characteristics, the test question characteristics and the reference answer characteristics;
binding the test question content and the answer content;
and storing the bound test question content and the bound answer content into a test question library.
2. The method for constructing a Word document-based question bank according to claim 1, wherein the sequentially identifying question type features, question features and reference answer features in the Word document comprises:
reading paragraphs in the Word document;
and intercepting the first two characters in the paragraph, and judging whether the first two characters in the paragraph are a question type feature, a test question feature, or a reference answer feature.
3. The method for constructing a question bank based on Word documents according to claim 2, wherein the step of intercepting the first two characters in the paragraph and judging whether the first two characters in the paragraph are a question type feature, a test question feature, or a reference answer feature comprises the following steps:
if the first two characters in the paragraph are a question type feature, switching the program to a question type identification state, and then obtaining the question type;
if the first two characters in the paragraph are a test question feature, switching the program to a test question identification state, and then obtaining the test question content;
if the first two characters in the paragraph are a reference answer feature, switching the program to a reference answer identification state, and then obtaining the answer content.
4. The method for building a question bank based on Word documents according to claim 3, wherein the step of storing the bound test question contents and the bound answer contents into the question bank comprises the steps of:
generating an adjacency list of the question types, the test question contents and the answer contents;
and storing the test question content and the answer content into a test question library according to the adjacency list.
5. The method of claim 1, wherein the question features include a question number feature, an option feature, a formula feature and a table feature.
6. A method of constructing a question bank based on a Word document according to claim 5, wherein, in the Word document, the question type feature uses a Chinese ordinal or a Roman numeral, followed by a dot or a pause mark; the question number feature uses Arabic numerals, followed by a dot or a pause mark; the option feature uses a capital English letter, followed by a dot or a pause mark; and the reference answer feature uses the "reference answer" keyword, after which the corresponding question numbers, answers, and analysis are written.
7. A method for building a question bank based on a Word document according to claim 1, wherein said program loads said Word document in a read-only manner.
8. A question bank constructing apparatus, comprising:
the filing module is used for filing the test paper or the test question book to form a Word document;
the loading module is used for loading the Word document by using a preset program;
the identification module is used for sequentially identifying question type characteristics, test question characteristics and reference answer characteristics in the Word document;
the obtaining module is used for obtaining question types, test question contents and answer contents included in the Word document based on the question type characteristics, the test question characteristics and the reference answer characteristics;
the binding module is used for binding the test question content and the answer content;
and the storage module is used for storing the bound test question content and the bound answer content into a test question library.
9. A question bank constructing device, characterized in that the question bank constructing device comprises: a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line;
the at least one processor invokes the instructions in the memory to cause the question bank constructing device to execute the question bank constructing method based on the Word document according to any one of claims 1 to 7.
10. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method for building a Word document-based question bank according to any one of claims 1 to 7.
CN202011510690.7A 2020-12-18 2020-12-18 Question bank construction method, device, equipment and storage medium based on Word document Pending CN112528601A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011510690.7A CN112528601A (en) 2020-12-18 2020-12-18 Question bank construction method, device, equipment and storage medium based on Word document

Publications (1)

Publication Number Publication Date
CN112528601A true CN112528601A (en) 2021-03-19

Family

ID=75001753

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011510690.7A Pending CN112528601A (en) 2020-12-18 2020-12-18 Question bank construction method, device, equipment and storage medium based on Word document

Country Status (1)

Country Link
CN (1) CN112528601A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination