WO2020177531A1 - Question assistance method and system - Google Patents

Question assistance method and system

Info

Publication number
WO2020177531A1
Authority
WO
WIPO (PCT)
Prior art keywords
question
vector
calculation
answer
neural network
Prior art date
Application number
PCT/CN2020/075826
Other languages
English (en)
French (fr)
Inventor
何涛
石凡
罗欢
陈明权
Original Assignee
杭州大拿科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 杭州大拿科技股份有限公司
Publication of WO2020177531A1

Classifications

    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00 - Teaching not covered by other main groups of this subclass
    • G09B19/02 - Counting; Calculating
    • G09B19/025 - Counting; Calculating with electrically operated apparatus or devices
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/60 - Type of objects
    • G06V20/62 - Text, e.g. of license plates, overlay texts or captions on TV images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/26 - Techniques for post-processing, e.g. correcting the recognition result
    • G06V30/262 - Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
    • G06V30/274 - Syntactic or semantic context, e.g. balancing
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B7/00 - Electrically-operated teaching apparatus or devices working with questions and answers
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B7/00 - Electrically-operated teaching apparatus or devices working with questions and answers
    • G09B7/02 - Electrically-operated teaching apparatus or devices working with questions and answers of the type wherein the student is expected to construct an answer to the question which is presented or wherein the machine gives an answer to the question presented by a student
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 - Computing arrangements using knowledge-based models
    • G06N5/04 - Inference or reasoning models
    • G06N5/041 - Abduction
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition

Definitions

  • The present disclosure relates to the field of artificial intelligence technology, and in particular to a question assistance method and system.
  • An object of the present disclosure is to provide a method and system for assisting with questions.
  • A question assistance method includes: acquiring, through an image acquisition device, an image including at least a first question presented on a first surface; identifying, through a first computing device and a pre-trained first neural network model and based on the image, the first region where the first question in the image is located; recognizing, through a second computing device and a pre-trained second neural network model and based on the first region, the characters in the first region so as to obtain the first question; determining, through a third computing device and a pre-trained third neural network model, the type of the first question based on the first question; and, if the type of the first question is a calculation question, generating the first answer to the calculation question and a step-by-step solution process through fourth and fifth computing devices respectively, and displaying the question, the first answer, and the step-by-step solution process.
  • A question assistance system includes: one or more pre-trained neural network models; one or more electronic devices with image acquisition and display functions, configured to acquire an image including at least a first question presented on a first surface; and one or more computing devices configured to: identify, based on the neural network models and the image, the first region where the first question in the image is located; recognize, based on the neural network models and the first region, the characters in the first region so as to obtain the first question; determine, based on the neural network models and the first question, the type of the first question; and, if the type of the first question is a calculation question, generate a first answer to the calculation question and a step-by-step solution process, wherein the one or more electronic devices are further configured to display the calculation question, the first answer, and the step-by-step solution process.
  • A question assistance system includes: one or more processors; and one or more memories configured to store a series of computer-executable instructions and computer-accessible data associated with those instructions, wherein, when the series of computer-executable instructions is executed by the one or more processors, the one or more processors perform the method described above.
  • A non-transitory computer-readable storage medium stores a series of computer-executable instructions which, when executed by one or more computing devices, cause the one or more computing devices to perform the methods described above.
  • FIGS. 1A and 1B are schematic diagrams showing display screens of a display device used by a question assistance method according to an embodiment of the present disclosure.
  • Fig. 2 is a flowchart schematically showing at least a part of a question assistance method according to an embodiment of the present disclosure.
  • Fig. 3 is a flowchart schematically showing at least a part of a question assistance method according to an embodiment of the present disclosure.
  • Fig. 4 is a structural diagram schematically showing at least a part of a question assistance system according to an embodiment of the present disclosure.
  • Fig. 5 is a structural diagram schematically showing at least a part of a question assistance system according to an embodiment of the present disclosure.
  • The present disclosure provides a question assistance method, which can be used, for example, for teaching and learning.
  • The user can use a first electronic device with an image acquisition function to take photos or videos of the question that needs assistance to obtain an image of the question, and can then use a second electronic device with a display function (the first and second electronic devices may be the same device or different devices) to display the question (either as recognized characters or as the image of the question), the answer to the question, and the solution process for the question.
  • In some embodiments, the solution process is a step-by-step solution process; as shown in FIG. 1A, the user can easily understand the solution method through the step-by-step solution process.
  • In some embodiments, the solution process is a graphical solution process; as shown in FIG. 1B, the user can understand the solution method from another perspective through the graphical solution process.
  • The method of the present disclosure can assist with a single question. In some embodiments, the method of the present disclosure can assist with multiple questions on an entire test paper.
  • Step S11: obtain an image including at least a first question presented on a first surface, through the image acquisition device in the first electronic device.
  • The image can be any form of visual presentation, such as a photo or a video.
  • The image acquisition device may include a camera, an imaging module, an image processing module, etc., and may also include a communication module for receiving or downloading images.
  • Accordingly, image acquisition by the image acquisition device may include taking photos or videos, or receiving or downloading them.
  • The first surface may include paper (such as a test paper, book, or brochure), a whiteboard, a chalkboard, a display screen (such as a TV, computer, tablet, or learning machine screen), or various other surfaces.
  • Step S12: using the first computing device and the pre-trained first neural network model, identify, based on the image, the first region where the first question in the image is located.
  • The input of the first neural network model is the image including the first question, and the output is the first region where the first question in the image is located.
  • The first neural network model can be pre-trained with a large number of training samples, according to the above input and output, by any known method. For example, it can be obtained through the following training process: establish a training set of image samples, in which each image sample includes at least one question; annotate each image sample to mark the location of at least one question in it; and train the first neural network on the annotated training set to obtain the first neural network model.
  • The first neural network can be any known neural network, such as a deep residual network or a recurrent neural network.
  • Training the first neural network may also include: testing the output accuracy of the trained first neural network on an image sample test set; if the output accuracy is less than a predetermined first threshold, increasing the number of image samples in the training set, with each added image sample annotated as described above; and retraining the first neural network on the enlarged training set. The output accuracy of the retrained first neural network is then tested again on the test set, until the output accuracy of the first neural network meets the requirement, that is, is not less than the predetermined first threshold. The trained first neural network whose output accuracy meets the requirement can then be used as the pre-trained first neural network model in step S12.
  • One or more image samples from the training set can be moved into the test set, or one or more image samples from the test set can be moved into the training set.
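The retraining loop described above (train, test against an accuracy threshold, enlarge the annotated sample set, retrain) can be sketched as follows. This is a minimal illustration with hypothetical names; `train_fn` stands in for real neural-network training.

```python
def evaluate_accuracy(model, test_set):
    """Fraction of test samples for which the model's output is correct."""
    correct = sum(1 for sample, label in test_set if model(sample) == label)
    return correct / len(test_set)

def train_until_accurate(train_fn, train_set, test_set, extra_samples, threshold=0.9):
    """Train, test, and keep enlarging the training set with additional
    annotated samples until the tested output accuracy is not less than
    the predetermined threshold, then return the trained model."""
    extra = list(extra_samples)
    model = train_fn(train_set)
    while evaluate_accuracy(model, test_set) < threshold:
        if not extra:
            raise RuntimeError("no more annotated samples available")
        train_set = train_set + [extra.pop(0)]  # add annotated image samples
        model = train_fn(train_set)             # retrain on the enlarged set
    return model
```

A trivial `train_fn` that memorizes its training labels is enough to exercise the loop.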
  • Step S13: through the second computing device and the pre-trained second neural network model, recognize the characters in the first region based on the first region, so as to obtain the first question.
  • The input of the second neural network model is the first region of the image where the first question is located (for example, the first region cut out of the complete image), and the output is the characters in the first region.
  • The characters referred to herein include text (words, graphic text, letters, numbers, symbols, etc.) and pictures.
  • The second neural network model can likewise be pre-trained with a large number of training samples, according to the above input and output, by any known method. For example: establish an image sample training set in which each image sample is the image of a region and each region includes one question; annotate each image sample to mark the characters in its region; and train the second neural network on the annotated training set to obtain the second neural network model.
  • the second neural network can be any known neural network.
  • Training the second neural network can also include verifying the output accuracy of the model on a test set; if the accuracy does not meet the requirement, the number of samples in the training set can be increased and the network retrained.
  • Step S14: determine the type of the first question based on the first question, through the third computing device and the pre-trained third neural network model.
  • Question types can include calculation questions, word problems, fill-in-the-blank questions, multiple-choice questions, operation questions, etc.
  • The input of the third neural network model is the first question, and the output is the type of the first question.
  • The third neural network model can be obtained by pre-training the third neural network, by any known method, with a large number of training samples according to the above input and output.
  • The third neural network can be any known neural network, such as a deep convolutional neural network.
  • Step S151: generate the first answer to the calculation question and the step-by-step solution process through the fourth and fifth computing devices, respectively.
  • The first answer is a reference answer given by applying the method of the present disclosure for calculation-question assistance, and the fourth computing device for generating the first answer may be any known calculation engine.
  • Generating the step-by-step solution process of a calculation question by the fifth computing device includes: obtaining the corresponding rules from a preset rule library according to the formal features of the calculation question (such as the number of unknowns, powers, positions, and operation symbols); and generating the step-by-step solution process of the calculation question according to the corresponding rules.
  • The following specific example illustrates this.
  • Suppose the formal feature of the question is determined to be a linear equation in one unknown containing denominators.
  • The obtained rule may, for example, comprise five steps: removing the denominators, removing the parentheses, moving terms, merging like terms, and reducing the coefficient of the unknown to 1. According to a rule comprising these five steps, the step-by-step solution process can then be generated.
  • The step of removing the denominators usually multiplies both sides of the equation by the least common multiple of the two denominators (for example, in the above example, the least common multiple of the denominators 3 and 5 is 15).
  • When a denominator itself contains a fraction, the step of removing the denominators can include two sub-steps: first eliminate the fraction inside the denominator (for example, by multiplying the numerator and denominator by the reciprocal of the denominator), and then multiply both sides of the equation by the least common multiple of the two denominators.
  • For example (the equation itself is not reproduced here): eliminate the fraction in the denominator, that is, multiply the numerator and denominator on the left side of the equation by 5, the reciprocal of the left-side denominator, and multiply the numerator and denominator on the right side by 4/3, the reciprocal of the right-side denominator; then multiply both sides of the equation by the least common multiple of the two resulting denominators, and the equation becomes 15x = 4(x+1). In this way, the result of the denominator-removal step in the step-by-step solution process of the above example is obtained.
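Under the assumption that the rule for a linear equation in one unknown comprises the steps named above, the denominator-clearing and coefficient steps can be sketched for an equation of the form a·x + b = c·x + d with rational coefficients. Function and variable names here are illustrative, not from the patent.

```python
from fractions import Fraction
from math import lcm

def solve_linear(lhs, rhs):
    """Step-by-step solver for a*x + b = c*x + d with Fraction coefficients.
    Follows the rule steps named above: clear denominators, move terms,
    merge like terms, and reduce the coefficient of x to 1.
    lhs and rhs are (a, b) pairs; returns (steps, solution)."""
    (a, b), (c, d) = lhs, rhs
    steps = []
    # remove denominators: multiply both sides by the LCM of all denominators
    m = lcm(a.denominator, b.denominator, c.denominator, d.denominator)
    a, b, c, d = a * m, b * m, c * m, d * m
    steps.append(f"multiply both sides by {m}")
    # move terms (x-terms to the left, constants to the right) and merge
    a, d = a - c, d - b
    steps.append("move terms and merge like terms")
    # reduce the coefficient of x to 1
    steps.append(f"divide both sides by {a}")
    return steps, d / a
```

For x/3 = (x+1)/5 (denominators 3 and 5, LCM 15) this produces 5x = 3x + 3 and the solution x = 3/2.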
  • Step S152: display the question of the calculation problem and/or the recognized first region, together with the first answer and the step-by-step solution process, through the display device in the second electronic device.
  • the first and second electronic devices may be the same device or different devices.
  • the image capturing device and the display device can be located in the same electronic device or in different electronic devices.
  • For an illustrative example (screen 100) of the display screen of the display device, refer to FIG. 1A.
  • The screen 100 includes a title 106, the question 101 of the calculation problem recognized by the second computing device and the second neural network model, the image region 107 of the calculation question identified by the first computing device and the first neural network model, the answer 102 to the calculation question generated by the fourth computing device, and the step-by-step solution process 108, 109 (including 109-1 and 109-2) generated by the fifth computing device.
  • Although both the question 101 of the calculation problem and its image region 107 are displayed on the screen 100, those skilled in the art should understand that only one of them needs to be displayed, or even neither of them.
  • In some embodiments, the step-by-step solution process of the calculation question is displayed only upon a first trigger.
  • By default, only the first answer (i.e., the reference answer) may be displayed, so that the user can first think about the solution steps, and then trigger the display (for example, by operating a specific operation device of the second electronic device, or a specific area of its display screen) only when the user needs to view them; the display device then displays the step-by-step solution process.
  • For example, the method of the present disclosure can display only the calculation question 101 and the first answer 102 by default, and display the step-by-step solution process 108, 109 only when the user performs a specified first operation (for example, a tap, double tap, long press, deep press, or swipe) on the area of the display screen 100 where the calculation question 101 is located, the area where the image region 107 is located, the area where the answer 102 is located, the blank area 103, and/or another designated area (for example, the area of the partial title 105 or the area of the title 106).
  • The indication of designated areas in the drawings of the present application is only schematic, and the designated areas may obviously include other areas not shown in the drawings.
  • The name 108, the process 109-1, and the result 109-2 need not all be displayed; it is sufficient to display any one or any two of them.
  • For example, the screen 100 may display the name 108 and the result 109-2 of the operation corresponding to each step by default, as an aid for the user.
  • If the user wants to learn more about an operation, such as how its result 109-2 is obtained, the user can operate (for example, tap) a designated area (for example, the area of the special mark 104) to trigger the display of the process 109-1.
  • In some embodiments, a graphical solution process of the calculation question can be generated by a sixth computing device, and upon a second trigger the display device displays the question of the calculation problem and/or the recognized first region, together with the first answer and the step-by-step and/or graphical solution process.
  • For an illustrative example (screen 200) of the display screen of the display device, refer to FIG. 1B. Since the graphical solution process 204 is more intuitive and easier to understand, displaying it further improves the assistance effect.
  • The graphical solution process can be displayed only upon the second trigger, for example when the user performs a specified second operation (such as a tap, double tap, long press, deep press, or swipe) on the area of the display screen 200 where the question 201 of the calculation problem is located, the area where the first answer 202 is located, a specific operation area (such as the area of the partial title 205 or the area of the title 206), and/or the blank area 203.
  • The method of the present disclosure may by default display only the calculation question and the first answer, showing the step-by-step solution process upon the first trigger and the graphical solution process upon the second trigger. In some embodiments, the method may by default display only the calculation question, the first answer, and the step-by-step solution process, showing the graphical solution process upon the second trigger. In some embodiments, the method may by default display only the calculation question, the first answer, and the graphical solution process, showing the step-by-step solution process upon the first trigger.
  • Generating the graphical solution process of the calculation question by the sixth computing device may include: converting the calculation question into a function graph based on the plotly library or the pm algorithm model; and generating the graphical solution process of the calculation question according to the function graph.
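One way to obtain a function graph from a calculation question is to treat the two sides of the equation as functions of x and sample them; the sampled traces can then be rendered, for example as plotly scatter plots. The sketch below is an assumption about how such a conversion might work, not the patent's pm algorithm model, and rendering is omitted.

```python
def function_graph_data(f_left, f_right, xs):
    """Sample both sides of an equation as functions of x.
    Where the two curves cross, the equation holds, so the crossing
    x-values graphically indicate the solution. Returns the two sampled
    traces plus the x-values at which the curves cross."""
    left = [f_left(x) for x in xs]
    right = [f_right(x) for x in xs]
    diff = [l - r for l, r in zip(left, right)]
    crossings = [xs[i] for i in range(len(xs))
                 if diff[i] == 0
                 or (i + 1 < len(xs) and diff[i] * diff[i + 1] < 0)]
    return left, right, crossings
```

For x/3 = (x+1)/5 sampled on [0, 3], the two traces cross at x = 1.5, the equation's solution.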
  • In some embodiments, the question assistance method can also correct a second answer associated with the first question (for example, the user's answer to the first question) presented on the first surface.
  • In this case, the first region where the first question is located and the second region where the second answer is located are identified in the image. The characters in the first region are recognized through the second computing device and the pre-trained second neural network model, thereby obtaining the first question; and the characters in the second region are recognized through a seventh computing device and a pre-trained fourth neural network model, thereby obtaining the second answer.
  • An eighth computing device compares the first and second answers to determine whether they are the same or different.
  • The display device displays the question, the first answer, the second answer, the comparison result of the first and second answers, and the step-by-step solution process.
  • The comparison result of the first and second answers can be displayed with a specific symbol (such as "√" or "×"), or a specific mark can be used to flag a second answer (the user's answer) that differs from the first answer (the reference answer).
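The comparison and marking step can be illustrated as follows; this is a minimal sketch, and the function name and mark symbols are illustrative choices.

```python
def grade_answers(reference_answers, recognized_answers):
    """Compare each recognized (user) answer with the corresponding
    reference answer and attach a mark: a check for a match, a cross
    for a mismatch. Returns (recognized_answer, mark) pairs."""
    return [(got, "√" if ref == got else "×")
            for ref, got in zip(reference_answers, recognized_answers)]
```

The marked list can then be rendered next to the questions on the display device.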
  • The training method of the fourth neural network model may be similar to that of the second neural network model.
  • The second neural network model is used to recognize the characters in the first region, while the fourth neural network model recognizes the characters in the second region; the two may be different models trained separately. It should be understood that the second and fourth neural network models may also be the same model.
  • Step S161: perform feature extraction on the word problem through a ninth computing device and a pre-trained fifth neural network model to generate a two-dimensional feature vector.
  • The two-dimensional feature vector can be a feature map and can be generated by any method known in the art; for example, a deep convolutional neural network can be used to process the image region where the word problem is located and extract features. Specifically, a first two-dimensional feature vector is generated for the text in the word problem and a second two-dimensional feature vector is generated for the pictures in the word problem; the first and second two-dimensional feature vectors are then spliced to obtain the two-dimensional feature vector.
  • The input of the fifth neural network model is the first question (including text and pictures), and the output is the two-dimensional feature vector corresponding to the first question (composed of the first and second two-dimensional feature vectors).
  • the fifth neural network model can be obtained by pre-training the fifth neural network by any known method using a large number of training samples according to the above-mentioned input and output.
  • the fifth neural network can be any known neural network, such as a deep convolutional neural network.
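The splicing of the text feature map and the picture feature map can be sketched with plain lists standing in for real CNN feature maps; this is an illustrative simplification, not the patent's implementation.

```python
def splice_feature_maps(text_features, picture_features):
    """Concatenate the first (text) and second (picture) two-dimensional
    feature vectors row-wise into a single two-dimensional feature vector.
    Rows must share a common width for the result to stay rectangular."""
    width = len(text_features[0])
    assert all(len(row) == width for row in text_features + picture_features)
    return text_features + picture_features
```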
  • Step S162: search a preset vector index library, through a tenth computing device, for a question vector matching the two-dimensional feature vector (for example, the vector of the question closest to the first question).
  • The vector index library includes multiple groups, and each group includes one or more vectors. These vectors are all two-dimensional feature vectors generated by feature extraction from known word problems (for example, questions in a pre-collected word-problem test bank). Any two vectors from the same group have the same length, and any two vectors from different groups have different lengths.
  • Searching the vector index library for the question vector can include: first finding, according to the length of the two-dimensional feature vector, the group in the vector index library that matches that length; then searching within that group for the question vector. In this way, the question vector matching the two-dimensional feature vector can be found more quickly.
  • Each group has its own index that matches (e.g., equals) the length of the vectors in the group.
  • Finding the group that matches the length of the two-dimensional feature vector thus includes indexing into the matched group according to the length of the two-dimensional feature vector.
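The length-indexed group lookup described above can be sketched as a dictionary keyed by vector length; within the matched group, the closest vector is found by distance. The Euclidean metric here is an assumption, since the patent only requires "closest".

```python
import math

def build_vector_index(known_vectors):
    """Group known question vectors so that each group's index (key)
    equals the common length of the vectors it contains."""
    index = {}
    for vec in known_vectors:
        index.setdefault(len(vec), []).append(vec)
    return index

def find_matching_vector(index, query):
    """Jump directly to the group whose index matches the query length,
    then search only within that group for the closest vector."""
    group = index.get(len(query), [])
    if not group:
        return None
    return min(group, key=lambda v: math.dist(v, query))
```

Because vectors of other lengths are never compared, the search touches only one group per query.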
  • Step S163: generate, through an eleventh computing device, the fourth answer (i.e., the reference answer) to the word problem based on a preset third answer associated with the question vector; and step S164: display the word problem and the fourth answer through the display device.
  • The fourth answer may also come from a pre-collected word-problem test bank.
  • The test bank includes questions and the reference answers corresponding to the questions.
  • The answer associated with the matched question is extracted from the test bank; this is the third answer.
  • The third answer is used as a template and is modified according to the differences between the first question and the closest question, to obtain the fourth answer.
  • Each of the above pre-trained first to fifth neural network models can be stored in whole on one or more storage media of any of the following, or a first part can be stored on one or more storage media of one of the following and a second part on one or more storage media of another: the first and/or second electronic device, one or more remote servers, and one or more of the first to eleventh computing devices.
  • Any two of the first to eleventh computing devices that perform the processing of the above steps may be the same computing device or different computing devices.
  • Each of the first to eleventh computing devices may include one or more processors, and the one or more processors belonging to one computing device may all be located in the physical housing of the first and/or second electronic device, may all be located in the physical housing of one or more remote servers, or may be split such that a first part is located in the physical housing of the first and/or second electronic device and a second part in the physical housing of one or more remote servers.
  • Each of the first to eleventh computing devices may also include one or more memories to store instructions executable by the one or more processors and the data required to execute the instructions, such as at least part of one or more neural network models.
  • The above describes the process of handling a single question (a calculation question or a word problem).
  • The question assistance method of the present disclosure can also process multiple questions on an entire test paper together. It should be understood that the single-question process of the above embodiments also applies to processing multiple questions together; for brevity, the details of applying the above process are not repeated in the following embodiments.
  • An image of essentially the entire test paper is acquired through the image acquisition device in the first electronic device.
  • the entire test paper includes multiple questions, and the types of multiple questions may be the same or different.
  • the types of questions can include calculation questions, applied questions, fill-in-the-blank questions, multiple-choice questions, operation questions, etc.
  • Through the first computing device and the first neural network model, the multiple regions where the multiple questions in the image are located are identified.
  • Through the second computing device and the second neural network model, the characters in these regions are recognized respectively, thereby obtaining the multiple questions included in the image of the entire test paper.
  • Through the third computing device and the third neural network model, the type of each of the multiple questions is determined.
  • In some embodiments, the test paper also includes answers; the method can then also identify the region where the answer to each question is located while identifying the region where each question is located. The corresponding model is then used to recognize the characters in each answer region, so that the answers on the whole test paper can be corrected by comparing them with the reference answers.
  • determining the type of each of the multiple questions is based both on the question itself (for example, the text and pictures it includes) and on the position of each question in the entire test paper (for example, the position, within the image of the whole test paper, of the region where the question is located).
  • in a test paper, the distribution of question types is relatively fixed: for example, calculation questions appear at the beginning of the paper, followed by multiple-choice or fill-in-the-blank questions, and finally word problems and operation questions. Therefore, taking a question's position in the entire test paper into account when identifying its type benefits recognition accuracy.
  • the position can be a detailed location, such as coordinates; a rough location, such as which part of the paper the question falls in (the upper-left part, the middle-right part, etc.); or the ordinal position of the question, such as being part of the first major question, and so on.
  • the input of the third neural network model is each question together with its position in the whole test paper, and the output is the type of each question.
  • in the training samples, each question, the location of the region where its answer is located, and the question type are marked.
  • using the first neural network model to identify multiple areas in the image where multiple questions are located includes the following process: using a deep convolutional neural network to extract the two-dimensional feature vector of the entire test paper picture.
  • anchors (also called anchor boxes) of different shapes are generated for each grid cell of the two-dimensional feature vector.
  • each anchor includes the center coordinates of a labeling box and the box's width and height. Because the text lines in a test paper are mostly long bars, multiple anchors can be defined in advance, including rectangular boxes with aspect ratios of 2:1, 3:1, 4:1, and other ratios.
  • the area of each identified question is marked with a rectangular frame of appropriate shape.
  • the image samples used for training include ground-truth boxes marking the true region of each question and answer in the sample (obtained, for example, by manual labeling); the pictures and text within a question are marked with ground-truth boxes. During training, the generated anchors are regressed against the ground-truth boxes so that the labeling boxes move closer to the true positions of the questions, enabling the first neural network model to better identify the region where each question is located.
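The anchor scheme described above can be sketched in a few lines. The 2:1, 3:1, and 4:1 aspect ratios come from the text; the stride, base height, and function name are illustrative assumptions, not the patent's implementation:

```python
# Sketch: generating long, bar-shaped anchor boxes for each cell of a
# feature map. Aspect ratios follow the text (wide rectangles suited to
# text lines); stride and base_h are illustrative choices.

def generate_anchors(grid_w, grid_h, stride=16, base_h=16,
                     aspect_ratios=(2, 3, 4)):
    """Return a list of (cx, cy, w, h) anchors, one set per grid cell."""
    anchors = []
    for gy in range(grid_h):
        for gx in range(grid_w):
            # Center of this grid cell in image coordinates.
            cx = (gx + 0.5) * stride
            cy = (gy + 0.5) * stride
            for ratio in aspect_ratios:
                # Width = ratio * height, giving 2:1, 3:1, 4:1 boxes.
                anchors.append((cx, cy, base_h * ratio, base_h))
    return anchors

anchors = generate_anchors(grid_w=4, grid_h=3)
# 4 * 3 cells with 3 aspect ratios each -> 36 anchors
```

During training, each of these anchors would then be regressed toward the nearest ground-truth box, as the text describes.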
  • questions are usually in printed type while answers are usually handwritten; and, especially for word problems, the character set of the question and that of the answer often differ, the answer's character set usually being smaller than the question's (the question typically mixing Chinese characters with numbers, letters, and symbols).
  • therefore, different models can be used to recognize the characters in the question and in the answer, and the two models can be trained on different training image sample sets. In either case, the recognition model can use dilated ("hole"/atrous) convolution to extract features of the characters (both text and pictures), giving the extracted features a larger receptive field.
  • dilated convolution allows recognition based on the context of handwritten text; it also allows recognition at intervals rather than character by character, which is convenient for parallel processing by the machine. The features are then decoded by an attention model, and finally variable-length text is output.
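The "recognition at intervals" above is exactly what a dilated convolution's sampling pattern looks like. A minimal 1-D sketch in pure Python (illustrative only — the real model operates on 2-D feature maps inside a neural network):

```python
# Dilated ("hole"/atrous) convolution: the kernel taps are spread out with
# gaps of size `dilation`, enlarging the receptive field without adding
# kernel weights. Valid mode, plain Python lists.

def dilated_conv1d(signal, kernel, dilation):
    """1-D convolution whose taps sample the input `dilation` apart."""
    span = (len(kernel) - 1) * dilation + 1   # receptive field per output
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(kernel[k] * signal[start + k * dilation]
                       for k in range(len(kernel))))
    return out

# A 3-tap kernel with dilation=2 covers 5 input positions per output,
# versus 3 positions with an ordinary (dilation=1) convolution.
```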
  • the method of the present invention further includes the process shown in FIG. 3.
  • Step S21: through the ninth computing device and the fifth neural network model, feature extraction is performed on the images of the question regions of multiple word problems {T1, T2, ..., Tn} to generate multiple two-dimensional feature vectors {a1, a2, ..., an}.
  • Step S22: through the tenth computing device, the nearest vectors {b1, b2, ..., bn}, each closest to one of the two-dimensional feature vectors, are searched from a preset vector index library.
  • Step S23: according to the preset tag of each vector in the vector index library (the tag of each vector is the identification ID of the test paper the vector comes from), the multiple test papers {P1, P2, ..., Pn} corresponding to the multiple nearest vectors are obtained.
  • Step S24: the test paper occurring most often among the multiple test papers is determined to be the matching test paper P.
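Steps S23–S24 amount to a majority vote over the paper IDs attached to the nearest vectors. A minimal sketch (the function name and sample IDs are ours):

```python
# Majority vote of steps S23-S24: each question's nearest vector is tagged
# with the ID of the test paper it came from; the most frequent ID wins.
from collections import Counter

def find_matching_paper(nearest_papers):
    """Return the most frequent paper ID among the nearest-vector tags."""
    return Counter(nearest_papers).most_common(1)[0][0]

# Three of five questions point at paper "P7", so P is "P7".
papers = ["P7", "P7", "P3", "P7", "P9"]
matching = find_matching_paper(papers)
```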
  • Step S25: for each of the multiple questions, it is determined whether the test paper corresponding to the nearest vector of that question's two-dimensional feature vector is the matching test paper. Taking question T1 as an example, it is determined whether the test paper P1 corresponding to the nearest vector b1 of T1's two-dimensional feature vector a1 is the matching test paper P.
  • If so, step S261 is performed: the nearest vector b1, closest to the two-dimensional feature vector a1 of question T1, is determined to be the question vector t of that question. If not, step S262 is performed: shortest-edit-distance matching is carried out between the two-dimensional feature vector a1 of question T1 and the multiple vectors tagged with the ID of the matching test paper P; the vector s with the smallest edit distance to a1 is found and determined to be the question vector t of that question.
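The fallback of step S262 relies on edit distance. The patent applies it to feature vectors; the sketch below illustrates the same minimum-edit-distance selection on character sequences, with a standard Levenshtein implementation (function names are ours):

```python
# Classic Levenshtein edit distance via dynamic programming, plus the
# "pick the candidate with the smallest edit distance" selection of S262.

def edit_distance(a, b):
    """Minimum number of insertions, deletions, substitutions a -> b."""
    dp = list(range(len(b) + 1))          # distances for the empty prefix
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # delete from a
                                     dp[j - 1] + 1,    # insert into a
                                     prev + (ca != cb))  # substitute
    return dp[-1]

def closest_by_edit_distance(query, candidates):
    """Among candidates tagged with the matching paper's ID, pick the one
    with the smallest edit distance to the query (step S262)."""
    return min(candidates, key=lambda c: edit_distance(query, c))
```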
  • Step S27: through the eleventh computing device, the fourth answer (i.e., the reference answer) to question T1 is generated according to the preset third answer (for example, the master answer) associated with the question vector t of question T1.
  • Step S28 Display the fourth answers to these application questions on the display device.
  • FIG. 4 is a structural diagram schematically showing at least a part of a question assistance system 400 according to an embodiment of the present disclosure.
  • the system 400 may include one or more neural network models 410, one or more electronic devices 420, one or more computing devices 430, one or more remote servers 440, and a network 450.
  • one or more neural network models 410, one or more electronic devices 420, one or more computing devices 430, and one or more remote servers 440 may be connected to each other through a network 450.
  • the network 450 may be any wired or wireless network, and may also include a cable.
  • although FIG. 4 shows the one or more neural network models 410 in a separate box, independent of the one or more electronic devices 420, the one or more computing devices 430, the one or more remote servers 440, and the network 450 in the system 400, it should be understood that the one or more neural network models 410 may actually be stored on any of the other entities 420, 430, 440 included in the system 400.
  • one or more computing devices may include server computing devices that operate as a load-balanced server farm.
  • various aspects of the subject matter described herein can be implemented by multiple computing devices communicating with each other, for example, through a network.
  • each of the one or more electronic devices 420, one or more computing devices 430, and one or more remote servers 440 may be located at a different node of the network 450 and can communicate directly or indirectly with the other nodes of the network 450.
  • the system 400 may also include other devices not shown in FIG. 4, with each different device located at a different node of the network 450.
  • Various protocols and systems can be used to interconnect the network 450 and the components in the system described herein, so that the network 450 can be part of the Internet, the World Wide Web, a specific intranet, a wide area network, or a local area network.
  • the network 450 may utilize standard communication protocols such as Ethernet, WiFi, and HTTP, protocols that are proprietary to one or more companies, and various combinations of the foregoing protocols. Although certain advantages are obtained when transferring or receiving information as described above, the subject matter described herein is not limited to any specific information transfer method.
  • Each of one or more electronic devices 420, one or more computing devices 430, and one or more remote servers 440 may be configured to be similar to the system 500 shown in FIG. 5, that is, have one or more processors 510, one or more memories 520, and instructions and data.
  • each of the one or more electronic devices 420, one or more computing devices 430, and one or more remote servers 440 may be a personal computing device intended for use by a user or a commercial computer device used by an enterprise, and may have all the components commonly used with such devices, such as a central processing unit (CPU); memory storing data and instructions (for example, RAM and internal hard drives); one or more I/O devices such as displays (for example, monitors with screens, touch screens, projectors, televisions, or other devices operable to display information), mice, keyboards, touch screens, microphones, and speakers; and/or network interface devices.
  • the one or more electronic devices 420 may also include one or more cameras for capturing still images or recording video streams, and all components for connecting these elements to each other.
  • one or more electronic devices 420 may each include a full-sized personal computing device, they may optionally include a mobile computing device capable of wirelessly exchanging data with a server through a network such as the Internet.
  • the one or more electronic devices 420 may be a mobile phone, or a device such as a PDA with wireless support, a tablet PC, or a netbook that can obtain information via the Internet.
  • one or more electronic devices 420 may be a wearable computing system.
  • FIG. 5 is a structural diagram schematically showing at least a part of a question assistance system 500 according to an embodiment of the present disclosure.
  • the system 500 includes one or more processors 510, one or more memories 520, and other components (not shown) generally present in devices such as computers.
  • each of the one or more memories 520 can store content accessible by the one or more processors 510, including instructions 521 executable by the one or more processors 510 and data 522 that can be retrieved, manipulated, or stored by the one or more processors 510.
  • the instruction 521 may be any instruction set to be directly executed by the one or more processors 510, such as machine code, or any instruction set to be executed indirectly, such as a script.
  • the terms "instruction”, “application”, “process”, “step” and “program” in this article can be used interchangeably in this article.
  • the instructions 521 may be stored in an object code format for direct processing by one or more processors 510, or stored in any other computer language, including scripts or collections of independent source code modules that are interpreted on demand or compiled in advance.
  • the instructions 521 may include instructions that cause, for example, the one or more processors 510 to act as the various neural networks described herein. The remainder of this document explains the functions, methods, and routines of the instructions 521 in more detail.
  • the one or more memories 520 may be any temporary or non-transitory computer-readable storage media capable of storing content accessible by the one or more processors 510, such as hard drives, memory cards, ROM, RAM, DVD, CD, USB memory, writable memory and read-only memory, etc.
  • One or more of the one or more memories 520 may include a distributed storage system, where the instructions 521 and/or data 522 may be stored on multiple different storage devices physically located at the same or different geographic locations.
  • one or more of the one or more memories 520 may be connected to the one or more processors 510 via a network, and/or may be directly connected to or incorporated into any of the one or more processors 510.
  • One or more processors 510 may retrieve, store, or modify data 522 according to instructions 521.
  • the data 522 stored in the one or more memories 520 may include various images to be recognized, various image sample sets, and parameters used for various neural networks as described above. Other data not associated with images or neural networks may also be stored in one or more memories 520.
  • the data 522 may also be stored in computer registers (not shown), in a relational database as a table with many different fields and records, or as an XML document.
  • the data 522 may be formatted in any computing device readable format, such as but not limited to binary values, ASCII, or Unicode.
  • the data 522 may include any information sufficient to identify the related information, such as numbers, descriptive text, proprietary codes, pointers, references to data stored in other storage such as other network locations, or information used by a function to compute the related data.
  • the one or more processors 510 may be any conventional processors, such as a central processing unit (CPU), a graphics processing unit (GPU), etc., which are commercially available. Alternatively, the one or more processors 510 may also be dedicated components, such as an application specific integrated circuit (ASIC) or other hardware-based processors. Although not required, one or more processors 510 may include dedicated hardware components to perform specific calculation processes faster or more efficiently, such as image processing on images.
  • although the one or more processors 510 and the one or more memories 520 are schematically shown in the same box in FIG. 5, the system 500 may actually include multiple processors or memories that exist in the same physical housing or in different physical housings.
  • for example, one of the one or more memories 520 may be a hard disk drive or other storage medium located in a housing different from that of each of the one or more computing devices described above. Therefore, references to a processor, computer, computing device, or memory should be understood to include references to a collection of processors, computers, computing devices, or memories that may or may not operate in parallel.
  • references to “one embodiment” or “some embodiments” means that the feature, structure, or characteristic described in conjunction with the embodiment is included in at least one embodiment or at least some embodiments of the present disclosure. Therefore, the appearances of the phrases “in one embodiment” and “in some embodiments” in various places in this disclosure do not necessarily refer to the same or the same embodiments. In addition, in one or more embodiments, any suitable combination and/or sub-combination may be used to combine features, structures, or characteristics.
  • the word "exemplary” means “serving as an example, instance, or illustration” and not as a “model” to be copied exactly. Any implementation described exemplarily herein is not necessarily interpreted as being preferred or advantageous over other implementations. Moreover, the present disclosure is not limited by any expressed or implied theory given in the above technical field, background art, summary of the invention, or specific embodiments.
  • the word “substantially” means to include any minor changes caused by design or manufacturing defects, device or component tolerances, environmental influences, and/or other factors.
  • the word “substantially” also allows the difference between the perfect or ideal situation caused by parasitic effects, noise, and other practical considerations that may be present in the actual implementation.
  • "connected" means that one element/node/feature is directly joined to (or directly communicates with) another element/node/feature, electrically, mechanically, logically, or otherwise.
  • "coupled" means that one element/node/feature can be directly or indirectly joined to (or directly or indirectly communicate with) another element/node/feature mechanically, electrically, logically, or otherwise, allowing interaction even where the two features may not be directly connected. That is, "coupled" is intended to include both direct and indirect joining of elements or other features, including joining via one or more intermediate elements.
  • a component may be, but is not limited to, a process, an object, an executable state, an execution thread, and/or a program running on the processor.
  • an application program running on a server and the server may be one component.
  • One or more components may exist inside an executing process and/or thread, and a component may be located on one computer and/or distributed between two or more computers.
  • embodiments of the present disclosure may also include the following examples:
  • A question assistance method, comprising:
  • the first answer to the calculation problem and the step-by-step problem-solving process are respectively generated by the fourth and fifth computing devices.
  • the title of the calculation question and/or the first area is displayed by a display device, and the first answer and the step-by-step problem solving process are displayed.
  • generating, by the fifth computing device, the step-by-step problem-solving process of the calculation question comprises:
  • the step-by-step problem solving process of the calculation problem is generated according to the corresponding rule.
  • step-by-step problem-solving process includes one or more steps
  • displaying the step-by-step problem-solving process on the display device includes: displaying, in order, the operation results corresponding to the one or more steps.
  • displaying the step-by-step problem-solving process on the display device further comprises: displaying, in an area of the display screen associated with the corresponding results of the one or more steps, the operation name and/or process corresponding to the one or more steps.
  • the first trigger comprises: a specified first operation performed on the area of the display screen where the question of the calculation question is located, the area where the first answer of the calculation question is located, a blank area, and/or a specified area.
  • the graphical problem solving process of the calculation problem is displayed by the display device.
  • a graphical problem solving process of the calculation problem is generated according to the function graph.
  • the second trigger comprises: a specified second operation performed on the area of the display screen where the question of the calculation question is located, the area where the first answer of the calculation question is located, a specific operation area, and/or a blank area.
  • the second answer and the result are also displayed by the display device.
  • the fourth answer to the applied problem is displayed on the display device.
  • the vector index library includes a plurality of groups, each group including one or more vectors, wherein any two vectors from the same group have the same length and any two vectors from different groups have different lengths,
  • searching for the topic vector from the vector index library includes:
  • each group has its own index, the index matches the length of the vector in the group, and the index is found in the vector index library.
  • the length matching group of the two-dimensional feature vector includes:
  • the question assistance method according to claim 1, wherein the image comprises substantially the entire test paper on which the first question presented on the first surface is located, and wherein determining the type of the first question is also based on the position of the first area in the entire test paper.
  • the test paper further includes a plurality of second questions of the word-problem type other than the first question, and the method further comprises:
  • test paper corresponding to the nearest vector that is closest to the two-dimensional feature vector of the word problem is the matching test paper, then:
  • test paper corresponding to the nearest vector that is closest to the two-dimensional feature vector of the word problem is not the matching test paper, then:
  • shortest-edit-distance matching is performed between the two-dimensional feature vector of the word problem and the multiple vectors from the matching test paper; the vector with the shortest edit distance to the two-dimensional feature vector of the word problem is found and determined to be the question vector of the word problem;
  • the fourth answer to the applied problem is displayed on the display device.
  • a topic assistance system including:
  • one or more pre-trained neural network models;
  • One or more electronic devices with an image acquisition function and a display function, configured to acquire an image including at least the first question presented on the first surface;
  • One or more computing devices configured to:
  • the type of the first question is a calculation question, then generate the first answer to the calculation question and a step-by-step problem-solving process,
  • the one or more electronic devices are also configured to display the title, the first answer, and the step-by-step problem solving process of the calculation problem.
  • the one or more computing devices are further configured to: if the type of the first question is a calculation question, generate a graphical problem-solving process of the calculation question; and
  • the one or more electronic devices are further configured to display the graphical problem solving process of the calculation problem.
  • the one or more computing devices are further configured to: if the type of the first question is a word problem, then:
  • the one or more electronic devices are also configured to display a fourth answer to the word problem.
  • the question assistance system according to claim 18, wherein the image includes substantially the entire test paper on which the first question presented on the first surface is located, and wherein determining the type of the first question is also based on the position of the first area in the entire test paper.
  • the one or more computing devices are also configured to:
  • feature extraction is performed on the first topic and the multiple second topics respectively to generate multiple two-dimensional feature vectors
  • test paper corresponding to the nearest vector that is closest to the two-dimensional feature vector of the word problem is the matching test paper, then:
  • test paper corresponding to the nearest vector that is closest to the two-dimensional feature vector of the word problem is not the matching test paper, then:
  • shortest-edit-distance matching is performed between the two-dimensional feature vector of the word problem and the multiple vectors from the matching test paper; the vector with the shortest edit distance to the two-dimensional feature vector of the word problem is found and determined to be the question vector of the word problem;
  • the one or more electronic devices are further configured to display a fourth answer to the word problem.
  • the question assistance system further comprises one or more remote servers, and one or more of the one or more neural network models are stored in one or more storage media in the one or more remote servers.
  • the question assistance system according to claim 18, wherein the question assistance system further comprises one or more remote servers, and one or more of the one or more computing devices are located in the physical enclosures of the one or more remote servers.
  • a topic assistance system including:
  • one or more processors; and
  • One or more memories configured to store a series of computer-executable instructions and computer-accessible data associated with the series of computer-executable instructions
  • a non-transitory computer-readable storage medium wherein a series of computer-executable instructions are stored on the non-transitory computer-readable storage medium.
  • the one or more computing devices are caused to perform the method according to any one of claims 1-17.

Abstract

A question assistance method, comprising: acquiring, by an image acquisition device, an image including at least a first question presented on a first surface (S11); identifying, by a first computing device and a pre-trained first neural network model, based on the image, a first region in the image where the first question is located (S12); recognizing, by a second computing device and a pre-trained second neural network model, based on the first region, the characters in the first region, thereby obtaining the first question (S13); determining, by a third computing device and a pre-trained third neural network model, based on the first question, the type of the first question (S14); and, if the type of the first question is a calculation question: generating, by fourth and fifth computing devices respectively, a first answer to the calculation question and a step-by-step problem-solving process (S151); and displaying, by a display device, the question, the first answer, and the step-by-step problem-solving process of the calculation question (S152).

Description

Question Assistance Method and System — Technical Field
The present disclosure relates to the field of artificial intelligence, and in particular to a question assistance method and system.
Background
In recent years, artificial intelligence has been applied to everyday teaching and learning, for example, using electronic devices such as smart terminals to correct the questions in test papers or homework.
Therefore, there is a need for new techniques.
Summary
An object of the present disclosure is to provide a question assistance method and system.
According to a first aspect of the present disclosure, a question assistance method is provided, comprising: acquiring, by an image acquisition device, an image including at least a first question presented on a first surface; identifying, by a first computing device and a pre-trained first neural network model, based on the image, a first region in the image where the first question is located; recognizing, by a second computing device and a pre-trained second neural network model, based on the first region, the characters in the first region, thereby obtaining the first question; determining, by a third computing device and a pre-trained third neural network model, based on the first question, the type of the first question; and, if the type of the first question is a calculation question: generating, by fourth and fifth computing devices respectively, a first answer to the calculation question and a step-by-step problem-solving process; and displaying, by a display device, the question, the first answer, and the step-by-step problem-solving process of the calculation question.
According to a second aspect of the present disclosure, a question assistance system is provided, comprising: one or more pre-trained neural network models; one or more electronic devices having an image acquisition function and a display function, configured to acquire an image including at least a first question presented on a first surface; and one or more computing devices configured to: identify, based on the neural network models and the image, a first region in the image where the first question is located; recognize, based on the neural network models and the first region, the characters in the first region, thereby obtaining the first question; determine, based on the neural network models and the first question, the type of the first question; and, if the type of the first question is a calculation question, generate a first answer to the calculation question and a step-by-step problem-solving process, wherein the one or more electronic devices are further configured to display the question, the first answer, and the step-by-step problem-solving process of the calculation question.
According to a third aspect of the present disclosure, a question assistance system is provided, comprising: one or more processors; and one or more memories configured to store a series of computer-executable instructions and computer-accessible data associated with the series of computer-executable instructions, wherein, when the series of computer-executable instructions are executed by the one or more processors, the one or more processors are caused to perform the method described above.
According to a fourth aspect of the present disclosure, a non-transitory computer-readable storage medium is provided, on which a series of computer-executable instructions are stored; when the series of computer-executable instructions are executed by one or more computing devices, the one or more computing devices are caused to perform the method described above.
Other features and advantages of the present disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief Description of the Drawings
The accompanying drawings, which constitute a part of the specification, describe embodiments of the present disclosure and, together with the specification, serve to explain its principles.
The present disclosure can be understood more clearly from the following detailed description with reference to the accompanying drawings, in which:
FIGS. 1A and 1B are schematic diagrams of display screens of a display device on which a question assistance method according to embodiments of the present disclosure is based.
FIG. 2 is a flowchart schematically showing at least a part of a question assistance method according to an embodiment of the present disclosure.
FIG. 3 is a flowchart schematically showing at least a part of a question assistance method according to an embodiment of the present disclosure.
FIG. 4 is a structural diagram schematically showing at least a part of a question assistance system according to an embodiment of the present disclosure.
FIG. 5 is a structural diagram schematically showing at least a part of a question assistance system according to an embodiment of the present disclosure.
Note that in the embodiments described below, the same reference numeral is sometimes used across different drawings to denote the same part or parts having the same function, and repeated description thereof is omitted. In this specification, similar reference numerals and letters denote similar items; therefore, once an item is defined in one drawing, it need not be discussed further in subsequent drawings.
Detailed Description
Various exemplary embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. Note that unless otherwise specified, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present disclosure. In the following description, numerous details are set forth to better explain the present disclosure; it should be understood, however, that the present disclosure may be practiced without these details.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present disclosure or its application or use. In all examples shown and discussed herein, any specific value should be interpreted as merely exemplary and not as limiting.
Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods, and devices should be regarded as part of the specification.
The present disclosure provides a question assistance method that can be used, for example, in teaching and learning. A user can photograph or film a question requiring assistance with a first electronic device having an image acquisition function to obtain an image of the question; a second electronic device having a display function (the first and second electronic devices may be the same device or different devices) can then display the question (either the recognized characters of the question or the acquired image of the question), the answer to the question, and the problem-solving process. In some embodiments, the problem-solving process is a step-by-step process, as shown in FIG. 1A, through which the user can easily understand the solution method. In some embodiments, the problem-solving process is a graphical process, as shown in FIG. 1B, through which the user can understand the solution from another angle. In some embodiments, the method of the present disclosure can assist with a single question; in other embodiments, it can assist with multiple questions of an entire test paper.
A question assistance method according to an embodiment of the present disclosure, and the steps it includes, are described below with reference to FIG. 2.
Step S11: an image including at least a first question presented on a first surface is acquired by an image acquisition device in a first electronic device. The image may be any form of visual presentation, such as a photograph or a video. The image acquisition device may include a camera, an imaging module, and an image processing module, and may also include a communication module for receiving or downloading images. Accordingly, acquiring an image may include taking a photograph or video, or receiving or downloading one. The first surface may be paper (for example, a test paper, book, or booklet), a whiteboard, a chalkboard, a display screen (for example, a television, computer, tablet, or learning-machine screen), or various other surfaces.
Step S12: through a first computing device and a pre-trained first neural network model, the first region in the image where the first question is located is identified based on the image. The input of the first neural network model is the image including the first question, and the output is the first region in the image where the first question is located.
The first neural network model may be pre-trained by any known method using a large number of training samples with the above inputs and outputs. For example, it may be trained as follows: build an image sample training set in which each image sample includes at least one question; annotate each image sample to mark the position of the region where the at least one question is located; and train a first neural network on the annotated training set to obtain the first neural network model. The first neural network may be any known neural network, such as a deep residual network or a recurrent neural network.
Training the first neural network may further include: testing the output accuracy of the trained first neural network on an image sample test set; if the output accuracy is less than a predetermined first threshold, increasing the number of image samples in the training set, each added sample undergoing the above annotation; and retraining the first neural network on the enlarged training set. The output accuracy of the retrained network is then tested again on the test set, until it meets the requirement, i.e., is not less than the predetermined first threshold. A trained first neural network whose output accuracy meets the requirement can then be used as the pre-trained first neural network model of step S12. Those skilled in the art will understand that, as needed, one or more image samples may be moved from the training set to the test set, or from the test set to the training set.
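The train-test-augment loop just described can be sketched abstractly; `train`, `evaluate`, and `get_more_samples` below are placeholders standing in for the real training, testing, and annotation routines, not anything specified by the patent:

```python
# Sketch of the training loop: retrain with additional annotated samples
# until the test-set accuracy reaches the predetermined threshold.

def train_until_accurate(train_set, test_set, threshold,
                         train, evaluate, get_more_samples):
    """Keep enlarging the training set and retraining until
    evaluate(model, test_set) >= threshold."""
    model = train(train_set)
    while evaluate(model, test_set) < threshold:
        train_set = train_set + get_more_samples()  # newly annotated samples
        model = train(train_set)
    return model

# Toy demonstration: "accuracy" grows with training-set size.
toy_model = train_until_accurate(
    train_set=[1, 2, 3], test_set=None, threshold=0.5,
    train=len,                                # "model" = sample count
    evaluate=lambda m, _: m / 10,             # fake accuracy
    get_more_samples=lambda: [0, 0])          # two more samples per round
```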
Step S13: through a second computing device and a pre-trained second neural network model, the characters in the first region are recognized based on the first region, thereby obtaining the first question. The input of the second neural network model is the first region where the first question is located (for example, the first region cropped out of the complete image), and the output is the characters in the first region. It should be understood that characters, as used herein, include text (body text, graphic text, letters, numbers, symbols, etc.) as well as pictures.
The second neural network model may likewise be pre-trained by any known method using a large number of training samples with the above inputs and outputs. For example: build an image sample training set in which each sample is the image of a region containing one question; annotate each sample to mark the characters in its region; and train a second neural network on the annotated set to obtain the second neural network model. The second neural network may be any known neural network. In addition, as described above for the first neural network, training may further include validating the model's output accuracy on a test set and, if the accuracy does not meet the requirement, enlarging the sample set and retraining.
Step S14: through a third computing device and a pre-trained third neural network model, the type of the first question is determined based on the first question. Question types may include calculation questions, word problems, fill-in-the-blank questions, multiple-choice questions, operation questions, etc. The input of the third neural network model is the first question, and the output is its type. The third neural network model may be obtained by pre-training a third neural network, by any known method, on a large number of training samples with the above inputs and outputs. The third neural network may be any known neural network, such as a deep convolutional neural network.
If the type of the first question identified in step S14 is a calculation question, steps S151 and S152 are performed. Step S151: a first answer to the calculation question and a step-by-step problem-solving process are generated by fourth and fifth computing devices, respectively. The first answer is the reference answer for question assistance provided by applying the method of the present invention; the fourth computing device used to generate it may be any known computation engine.
Generating the step-by-step problem-solving process by the fifth computing device includes: obtaining a corresponding rule from a preset rule base according to the formal features of the calculation question (for example, the number of unknowns, the powers, their positions, and the operators); and generating the step-by-step problem-solving process of the calculation question according to the corresponding rule. A concrete example follows.
For example, if the recognized calculation question is

(x + 4)/3 = (x + 5)/5,

then the formal feature of the question is determined to be a linear equation in one unknown with denominators. The solving rule for such an equation is obtained from the preset rule base; the obtained rule may, for example, consist of five steps in order: clear the denominators, remove the brackets, move terms, combine like terms, and reduce the coefficient to 1. According to this five-step rule, the following step-by-step problem-solving process can be generated:
1. Clear the denominators: 5(x + 4) = 3(x + 5);
2. Remove the brackets: 5x + 20 = 3x + 15;
3. Move terms: 5x - 3x = 15 - 20;
4. Combine like terms: 2x = -5;
5. Reduce the coefficient to 1: x = -5/2.
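The move-terms, combine-like-terms, and coefficient-to-1 steps above are mechanical enough to sketch in code. A minimal illustration for equations already cleared of denominators and brackets, i.e., of the form a1·x + b1 = a2·x + b2 (the function name and step wording are ours, not the patent's rule base):

```python
# Sketch of the last three rule steps for a linear equation in one unknown.
# Input: a1*x + b1 = a2*x + b2, e.g. 5x + 20 = 3x + 15 from the example.
from fractions import Fraction

def solve_linear(a1, b1, a2, b2):
    """Return the named steps and the exact root of a1*x + b1 = a2*x + b2."""
    steps = []
    a = a1 - a2                       # move x-terms to the left side
    b = b2 - b1                       # move constants to the right side
    steps.append(f"combine like terms: {a}x = {b}")
    x = Fraction(b, a)                # reduce the coefficient to 1
    steps.append(f"coefficient to 1: x = {x}")
    return steps, x

steps, x = solve_linear(5, 20, 3, 15)
# x is Fraction(-5, 2), matching the worked example above
```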
Note that, as is well known, in the above step-by-step example the clear-denominators step usually multiplies both sides of the equation by the least common multiple of the two denominators (in the example above, the LCM of 3 and 5 is 15). If a denominator is itself a fraction (including a decimal), the clear-denominators step may comprise two sub-steps: first eliminate the fraction in the denominator (for example, by multiplying the numerator and denominator by the reciprocal of the denominator), and then multiply both sides of the equation by the least common multiple of the two denominators.
Take the equation

x/(1/5) = (x + 1)/(3/4)

as an example: to eliminate the fractions in the denominators, the numerator and denominator on the left side are multiplied by 5, the reciprocal of the left denominator, and the numerator and denominator on the right side are multiplied by 4/3, the reciprocal of the right denominator, turning the equation into

5x = 4(x + 1)/3.

Multiplying both sides by 3, the least common multiple of the two denominators, then gives 15x = 4(x + 1). This yields the result of the clear-denominators step of the step-by-step process for this example.
Step S152: the question of the calculation question and/or the recognized first region are displayed by a display device in a second electronic device, together with the first answer and the step-by-step problem-solving process. The first and second electronic devices may be the same device or different devices; that is, the image acquisition device and the display device may be located in the same electronic device or in different ones. A schematic example of the display screen (screen 100) is shown in FIG. 1A.
Screen 100 includes a title 106, the question 101 of the calculation question recognized by the second computing device and second neural network model, the image region 107 where the question is located as identified by the first computing device and first neural network model, the answer 102 generated by the fourth computing device, and the step-by-step problem-solving process 108, 109 (including 109-1 and 109-2) generated by the fifth computing device. Although in the example of FIG. 1A both the question 101 and its image region 107 are displayed on screen 100, those skilled in the art will understand that displaying only one of them suffices, and even neither need be displayed.
In some embodiments, for teaching/learning reasons, the step-by-step problem-solving process is displayed only upon a first trigger. For example, after obtaining the first answer (the reference answer) to the calculation question from the display device, the user may first think through the solution steps alone, and only when the user wants to see them trigger their display (for example, by operating a specific operating means of the second electronic device, or a specific area of the display screen). For example, the method of the present invention may by default display only the question 101 and the first answer 102; the step-by-step process 108, 109 is displayed only when the user performs a specified first operation (for example, a tap, double tap, long press, deep press, or swipe) on the area of screen 100 where the question 101 is located, the area of the image region 107, the area of the first answer 102, a blank area 103, and/or another specified area (for example, the area of the local title 105 or of the title 106). It should be understood that the marking of other specified areas in the drawings of this application is merely schematic; such areas may obviously include other areas not marked in the drawings.
The step-by-step problem-solving process may include one or more steps, each corresponding to an operation that typically has a name 108 ("subtract 2 from both sides" in the example of FIG. 1A), a process 109-1 (the content in the box marked "How?" in FIG. 1A), and a result 109-2 ("x=1" in FIG. 1A). Although not shown in the drawings, those skilled in the art will understand that the name 108, process 109-1, and result 109-2 need not all be displayed; any one or any two of them may be displayed. As an example, upon the first trigger, screen 100 may by default display the name 108 and the result 109-2 of each step as assistance to the user. When the user wants to know more about an operation, for example how its result 109-2 is obtained, the user may operate (for example, tap) a specified area (for example, the area of the special mark 104) to trigger display of that operation's process 109-1.
在一些实施例中,若步骤S14中识别出的第一题目的类型为计算题,则可以通过第六计算装置生成计算题的图形化的解题过程,并且在第二触发时,通过显示装置显示计算题的题目和/或识别到的第一区域,并且显示第一答案以及计算题的步骤化和/或图形化的解题过程。显示装置的显示画面的一个示意性的例子(画面200)可以参考图1B。由于图形化的解题过程204更直观和更容易理解,所以显示图形化的解题过程更有助于题目辅助的效果。出于与以上步骤化的解题过程类似的考虑,图形化的解题过程可以在第二触发时才被显示,例如,在显示装置的显示画面200中的计算题的题目201所在的区域、计算题的第一答案202所在的区域、特定的操作区域(例如区域标题205所在的区域、标题206所在的区域等)、和/或空白区域203等被用户进行指定的第二操作(例如轻触、连续两次轻触、长按、深按、轻扫等)时。
在一些实施例中,本发明的方法可以默认只显示计算题的题目和第一答案,在第一触发时显示步骤化的解题过程,并且在第二触发时显示图形化的解题过程。在一些实施例中,本发明的方法可以默认只显示计算题的题目、第一答案和步骤化的解题过程,并且在第二触发时显示图形化的解题过程。在一些实施例中,本发明的方法可以默认只显示计算题的题目、第一答案和图形化的解题过程,并且在第一触发时显示步骤化的解题过程。
通过第六计算装置生成计算题的图形化的解题过程可以包括:基于plotly库或pm算法模型将计算题转换为函数图;以及根据函数图生成计算题的图形化的解题过程。下面以一些具体的例子来说明图形化的解题过程。
例如,如图1B所示的例子中,计算题的题目为x+2=3。可以先根据该题目建立二元一次方程组,即y=x+2和y=3两个方程。然后利用plotly库或pm算法模型分别将这两个方程转换为直角坐标系中的函数图。例如,将y=x+2转换为斜率为1、截距为2的一条直线,将y=3转换为平行于x轴的、截距为3的一条直线。从直角坐标系中的函数图中可以看出,题目的解即为两条直线的交点,即x=1。再例如,对于二元二次方程,已知的是其函数曲线为抛物线,该抛物线与x轴的交点即为方程的解。因此,本方法可以先确定方程的解,然后再确定函数曲线。例如,对于方程y=2x²-5x+2,已知的是因变量y是自变量x的函数;本方法可以先通过十字相乘法求得方程的两个解为x=0.5和x=2,因此可以确定该抛物线与x轴的两个交点为0.5和2;并根据二次项系数的正负得知该抛物线的开口向上,因此,可以容易地利用plotly库或pm算法模型确定和绘制函数曲线。
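作为一个示意性的草图(假设使用plotly库绘图;若环境中未安装plotly,则仅做计算、跳过绘图),下述代码展示了如何把方程x+2=3视为两条直线并求其交点,并用求根公式验证2x²-5x+2=0的两个解:

```python
# 将方程 x+2=3 视为 y=x+2 与 y=3 两条直线,其交点横坐标即为解
xs = [x * 0.5 for x in range(-8, 9)]
y1 = [x + 2 for x in xs]          # 斜率1、截距2的直线
y2 = [3] * len(xs)                # 平行于x轴、截距3的直线
x_cross = (3 - 2) / (1 - 0)       # 交点 x = 1

# 验证 2x^2 - 5x + 2 = 0 的两个根(十字相乘:(2x-1)(x-2)=0)
roots = sorted([(5 - 3) / 4, (5 + 3) / 4])   # 判别式 25-16=9,开方得3
assert roots == [0.5, 2.0]

try:
    import plotly.graph_objects as go   # 假设使用plotly绘制函数图
    fig = go.Figure([go.Scatter(x=xs, y=y1, name="y=x+2"),
                     go.Scatter(x=xs, y=y2, name="y=3")])
    fig.write_html("solution.html")
except ImportError:
    pass  # 未安装plotly时仅计算、不绘图
```

其中曲线名称与输出文件名均为假设值,只用于说明"方程转函数图"的思路。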
在一些实施例中,根据本发明实施例的题目辅助方法还可以对呈现在第一表面的与第一题目相关联的第二答案(例如,可以是用户对第一题目的作答答案)进行批改。在这些情况下,通过第一计算装置和第一神经网络模型,基于包括呈现在第一表面的第一题目和相关联的第二答案的影像,识别出影像中的第一题目所在的第一区域以及第二答案所在的第二区域。通过第二计算装置和预先训练的第二神经网络模型识别出第一区域中的字符,从而得到第一题目;并通过第七计算装置和预先训练的第四神经网络模型识别出第二区域中的字符,从而得到第二答案。通过第八计算装置比较第一和第二答案,以得到相同或不同的结果。通过显示装置显示计算题的题目、第一答案、第二答案、第一和第二答案相同或不同的结果、以及 步骤化的解题过程。第一和第二答案相同或不同的结果可以通过特定的符号(例如“√”或“×”)来显示,也可以通过特定的标记来标示出与第一答案(参考答案)不同的第二答案(作答答案)来显示。
第四神经网络模型的训练方式可以类似于第二神经网络模型的训练方式。在一些实施例中,考虑到通常第一题目的字体为印刷体,而第二答案的字体为手写体(因为其可能为用户手写的答案),因此用于识别第一区域中的字符的第二神经网络模型和用于识别第二区域中的字符的第四神经网络模型可以是分别训练的不同的模型。应当理解,第二神经网络模型和第四神经网络模型也可以是同一个模型。
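第一答案(参考答案)与第二答案(作答答案)的比较逻辑可以简单示意如下(假设两个答案已被识别为字符串;实际系统中的比较可能更复杂,例如还需要对等价写法进行归一化):

```python
def grade(reference: str, answered: str) -> str:
    """比较参考答案与作答答案,返回用于显示的批改符号。"""
    same = reference.strip() == answered.strip()
    return "√" if same else "×"

assert grade("x=1", " x=1 ") == "√"   # 忽略首尾空白后相同
assert grade("x=1", "x=2") == "×"     # 作答与参考答案不同
```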
若步骤S14中识别出的第一题目的类型为应用题,则进行步骤S161至S164。步骤S161为:通过第九计算装置和预先训练的第五神经网络模型,对应用题进行特征提取以生成二维特征向量。二维特征向量可以是特征图(feature map),其可以用本领域已知的任何方法来生成,例如可以利用深度卷积神经网络对应用题所在的影像区域进行处理来提取。其中,对应用题中的文字生成第一二维特征向量,并对应用题中的图片生成第二二维特征向量;以及将第一和第二二维特征向量拼接以得到二维特征向量。第五神经网络模型的输入为第一题目(包括文字和图片),输出为第一题目所对应的二维特征向量(为第一和第二二维特征向量拼接而成)。第五神经网络模型可以使用大量的训练样本,按照上述的输入输出,通过任何已知的方法对第五神经网络进行预先训练而得到。第五神经网络可以是任何已知的神经网络,例如深度卷积神经网络等。
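文字与图片两个二维特征向量的拼接可以示意如下(以嵌套列表代表特征图、沿列方向拼接;这只是概念性草图,真实实现通常在深度学习框架中按通道拼接张量):

```python
# 假设文字分支与图片分支各输出一个 2x2 的特征图
text_feat = [[0.1, 0.2], [0.3, 0.4]]
img_feat = [[0.5, 0.6], [0.7, 0.8]]

# 逐行拼接,得到 2x4 的组合特征图
combined = [t + i for t, i in zip(text_feat, img_feat)]
assert combined == [[0.1, 0.2, 0.5, 0.6], [0.3, 0.4, 0.7, 0.8]]
```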
步骤S162为:通过第十计算装置,从预先设置的向量索引库中搜索与二维特征向量相匹配的题目向量(例如,与第一题目最相近的题目的向量)。向量索引库包括多个组,每个组包括一个或多个向量。这些向量都是对已知的应用题的题目(例如,预先搜集的应用题的试题库中的题目)进行特征提取而生成的二维特征向量。来自同一组的任意两个向量具有相同的长度,来自不同组的任意两个向量具有不同的长度。
从向量索引库中搜索题目向量可以包括:先根据二维特征向量的长度, 在向量索引库中找到与二维特征向量的长度匹配的组;然后在这个长度匹配的组中进行搜索,以找到题目向量。如此,能够更快速地搜索到与二维特征向量相匹配的题目向量。在一些实施例中,每个组具有各自的索引,该索引与该组中的各个向量的长度相匹配(例如相等),在向量索引库中找到与二维特征向量的长度匹配的组包括:根据二维特征向量的长度索引到匹配的组。
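按长度分组的向量索引库及其"先索引到组、再组内搜索"的检索过程可以用如下草图说明(距离度量此处假设为欧氏距离,math.dist需要Python 3.8+;示例向量与题目标识均为假设数据):

```python
import math
from collections import defaultdict

class VectorIndex:
    """按向量长度分组的简易向量索引(示意性实现)。"""

    def __init__(self):
        self.groups = defaultdict(list)  # 长度 -> [(向量, 题目标识)]

    def add(self, vec, tag):
        self.groups[len(vec)].append((vec, tag))

    def search(self, query):
        # 先根据长度索引到匹配的组,再在组内找距离最近的向量
        group = self.groups.get(len(query), [])
        return min(group, key=lambda item: math.dist(query, item[0]),
                   default=None)

idx = VectorIndex()
idx.add([1.0, 2.0, 3.0], "题目A")
idx.add([1.0, 2.1, 2.9], "题目B")
idx.add([0.0, 0.0], "题目C")          # 长度不同,落入另一个组
vec, tag = idx.search([1.0, 2.0, 2.8])
assert tag == "题目B"                 # 组内距离最近的向量
```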
步骤S163为:通过第十一计算装置,根据预先设置的与题目向量相关联的第三答案,生成应用题的第四答案(即参考答案);以及步骤S164为:通过显示装置显示应用题的第四答案。其中,第三答案也可以来自于预先搜集的应用题的试题库,例如,该试题库中包括题目和与题目对应的参考答案。在步骤S162中找到与第一题目最相近的题目(即与上述题目向量相匹配的题目)之后,从试题库中提取该题目相关联的答案,即为第三答案。然后以第三答案作为母板,根据第一题目与该最相近的题目之间的差异,来对第三答案进行变形以得到第四答案。
上述预先训练的第一至第五神经网络模型中的每一个可以整体存储在以下各项中的任意一项中的一个或多个存储介质上,也可以第一部分存储在以下各项中的任意一项中的一个或多个存储介质上、并且第二部分存储在以下各项中的任意一项中的一个或多个存储介质上:第一和/或第二电子设备、一个或多个远程服务器、第一至第十一计算装置中的一个或多个。
进行上述各步骤处理的第一至第十一计算装置中的任意两者可以为相同的计算装置,也可以为不同的计算装置。第一至第十一计算装置中的每一个可以包括一个或多个处理器,属于一个计算装置的一个或多个处理器可以:全部位于第一和/或第二电子设备的物理壳体内、全部位于一个或多个远程服务器的物理壳体内、或者第一部分位于第一和/或第二电子设备的物理壳体内并且第二部分位于一个或多个远程服务器的物理壳体内。应当理解,第一至第十一计算装置中的每一个还可以包括一个或多个存储器,以存储上述一个或多个处理器能够执行的指令、以及执行指令所需要的数据,例如上述一个或多个神经网络模型的至少一部分。
上述实施例描述了本发明的题目辅助方法对单独一道题目(一道计算题或一道应用题)进行处理的过程。本发明的题目辅助方法还可以针对整张试卷中的多道题目共同进行处理。应当理解,上述实施例中的针对单独一道题目进行处理的过程也同样适用于对多道题目共同进行处理的过程。为简明起见,在对以下实施例进行描述时,适用上述过程的方法不再重复描述。
通过第一电子设备中的影像获取装置获取基本上整张试卷的影像,整张试卷中包括多个题目,多个题目的类型可以相同也可以不同。题目的类型可以包括计算题、应用题、填空题、选择题、操作题等。通过第一计算装置和第一神经网络模型,识别出影像中的多个题目所在的多个各自的区域。通过第二计算装置和第二神经网络模型,分别识别上述多个区域中的字符,从而得到整张试卷的影像中包括的多个题目。通过第三计算装置和第三神经网络模型,判断多个题目中每个题目的类型。对于识别出的整张试卷中的计算题,针对每道计算题,可以进行如上所述的步骤S151和S152的操作。对于识别出的整张试卷中的应用题,针对每道应用题,可以进行如上所述的步骤S161至S164的操作。
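对整张试卷中多道题目按题型分发处理的流程可以概括为如下草图(其中题型由第三神经网络模型判断,此处直接以已判定的题型代替;两个处理函数均为假设的占位实现,分别对应计算题与应用题的处理步骤):

```python
def solve_calculation(q):
    """计算题处理的占位函数,对应步骤S151-S152。"""
    return {"题目": q, "类型": "计算题"}

def solve_word_problem(q):
    """应用题处理的占位函数,对应步骤S161-S164。"""
    return {"题目": q, "类型": "应用题"}

HANDLERS = {"计算题": solve_calculation, "应用题": solve_word_problem}

def assist_paper(questions_with_types):
    # questions_with_types: [(题目文本, 题型)]
    return [HANDLERS[t](q) for q, t in questions_with_types if t in HANDLERS]

results = assist_paper([("1+1=?", "计算题"),
                        ("小明有3个苹果……", "应用题"),
                        ("以下哪项正确?", "选择题")])
assert len(results) == 2   # 选择题不在本例的处理范围内
```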
应当理解,如果试卷上还包括作答答案的话,本方法在识别各个题目所在的区域时,还可以识别出每个题目的作答答案所在的区域。然后通过相应的模型识别出每个作答答案所在的区域中的字符,从而通过比较作答答案和参考答案来批改整张试卷中的作答答案。
在一些实施例中,判断多个题目中每个题目的类型基于每个题目(例如,题目中包括的文字和图片等)以及每个题目在整张试卷中的位置(例如,每个题目所在的区域在整张试卷的影像中的位置)。对于一些试卷来说,题目类型的分布是较为固定的,例如计算题分布在试卷的开头,接着是选择题或填空题,最后是应用题和操作题。因此,在识别题目类型时考虑题目在整张试卷中的位置,这有利于识别的准确性。位置可以是细致的位置,例如坐标;也可以是粗略的位置,例如分布在试卷的哪个部分(例如左上部分、右中部分等);还可以是题目顺序,例如位于第一道大题的部分等。在这些实施例中,第三神经网络模型的输入为每个题目、以及每个题目在整张试卷中相应的位置,输出为每个题目的类型。在用于训练第三神经网络模型的影像样本中,标记了样本中的各个题目及其答案所在区域的位置及题目类型。
在一些实施例中,利用第一神经网络模型,识别出影像中的多个题目所在的多个区域包括如下过程:利用深度卷积神经网络提取整张试卷图片的二维特征向量。对二维特征向量的每一个网格生成不同形状的锚点(anchor,也可以称作锚框,anchor box)。每个锚点包括标注框的中心坐标以及标注框的宽度和高度。因为试卷中的文字行多以长条形为主,因此,可以预先定义多个锚点,包括宽高比为2:1、3:1、4:1以及其他比例的矩形框。识别出的每个题目的区域被标注以各自合适形状的矩形框。
在对第一神经网络模型进行训练时,所用的影像样本(用于训练时模型的输入)包括标记了样本中的各个题目及其答案所在真实区域的真实框(Ground Truth Box,例如可以是通过人工标注的)。其中,对题目中的图片和文字分别标记真实框。训练的过程中,将生成的锚点与真实框做回归,以使得标注框更贴近题目的真实位置,进一步使得第一神经网络模型能够更好地识别各个题目所在的区域。
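为特征图的每个网格生成多种宽高比锚框的过程可以示意如下(基准尺寸与步长均为假设值,仅用于说明锚框的数量与形状,回归步骤不在本草图范围内):

```python
def gen_anchors(grid_w, grid_h, stride=16, ratios=((2, 1), (3, 1), (4, 1))):
    """为每个网格中心生成若干宽高比的锚框 (中心x, 中心y, 宽, 高)。"""
    anchors = []
    for gy in range(grid_h):
        for gx in range(grid_w):
            cx, cy = (gx + 0.5) * stride, (gy + 0.5) * stride
            for rw, rh in ratios:
                anchors.append((cx, cy, stride * rw / rh, stride))
    return anchors

a = gen_anchors(2, 2)
assert len(a) == 2 * 2 * 3          # 每个网格生成3个锚框
assert a[0][2] / a[0][3] == 2.0     # 第一个锚框的宽高比为2:1
```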
题目通常是打印字体,而作答答案通常是手写字体;并且尤其对于应用题来说,题目包含的字符集与作答答案包含的字符集常常是不同的,作答答案所包含的字符集通常要小于题目所包含的字符集,例如,作答答案中的字符通常为常用汉字加上数字、字母和符号。鉴于此,在一些实施例中,可以用不同的模型来识别题目和作答答案中的字符,两个模型可以是分别用不同的训练图像样本集来训练的。尽管如此,模型识别的方法均可以采用空洞卷积来对字符(包括文字和图片)进行特征提取,使得提取到的特征具有较大的感受野(receptive field)。并且采用空洞卷积可以根据手写文字的上下文进行识别;还可以间隔识别,不用逐个文字进行识别,这便于机器并行处理。然后通过注意力模型对特征进行解码,最终输出变长的文字。
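空洞卷积扩大感受野的效果可以用一个一维的朴素实现来示意(真实系统中应使用深度学习框架的二维空洞卷积;此处仅演示感受野跨度随膨胀率增大的现象):

```python
def dilated_conv1d(x, kernel, dilation=1):
    """无填充的一维空洞卷积朴素实现,返回输出序列与感受野跨度。"""
    k = len(kernel)
    span = (k - 1) * dilation + 1     # 感受野跨度
    out = [sum(kernel[j] * x[i + j * dilation] for j in range(k))
           for i in range(len(x) - span + 1)]
    return out, span

y, span = dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 1, 1], dilation=2)
assert span == 5                      # 卷积核3、膨胀率2时感受野跨度为5
assert y == [9, 12]                   # 即 1+3+5 与 2+4+6
```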
对于整张试卷中的应用题,为了使得题目搜索的结果更准确,在一些实施例中,本发明的方法还包括如图3所示的过程。步骤S21:通过第九计算装置和第五神经网络模型,分别对多个应用题{T1,T2,…,Tn}的题目区域影像进行特征提取以生成多个二维特征向量{a1,a2,…,an}。步骤S22:通过第十计算装置,从预先设置的向量索引库中搜索分别与多个二维特征向量距离最近的多个最近向量{b1,b2,…,bn}。步骤S23:根据向量索引库中各个向量被预先设置的标记(每个向量的标记为该向量所来自的试卷的识别ID),得到多个最近向量所分别对应的多个试卷{P1,P2,…,Pn}。步骤S24:将多个试卷中出现次数最多的试卷确定为匹配试卷P。步骤S25:针对多个题目中的每一个题目,判断与每个题目的二维特征向量距离最近的最近向量所对应的试卷是否为匹配试卷。以题目T1为例,判断与T1的二维特征向量a1距离最近的最近向量b1所对应的试卷P1是否为匹配试卷P。如果是,则进行步骤S261:将与题目T1的二维特征向量a1距离最近的最近向量b1确定为第一题目的题目向量t;若不是,则进行步骤S262:将题目T1的二维特征向量a1,在具有匹配试卷P的识别ID标记的多个向量中进行最短编辑距离匹配,在其中找到与题目T1的二维特征向量a1的最短编辑距离最小的向量s,将最短编辑距离最小的向量s确定为第一题目的题目向量t。步骤S27:通过第十一计算装置,根据预先设置的与题目T1的题目向量t相关联的第三答案(例如,母板答案),生成题目T1的第四答案(即参考答案)。步骤S28:通过显示装置显示这些应用题的第四答案。
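步骤S23至S26中"多数投票确定匹配试卷、否则退回最短编辑距离匹配"的逻辑可以草拟如下(此处以字符串代替特征向量来演示编辑距离;最近向量与试卷ID均为假设的示例数据):

```python
from collections import Counter

def edit_distance(a, b):
    """最短编辑距离(Levenshtein)的单行动态规划实现。"""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[-1]

# nearest[i] = (最近向量的标识, 该向量所属试卷的识别ID)
nearest = [("b1", "P1"), ("b2", "P1"), ("b3", "P2")]
paper = Counter(p for _, p in nearest).most_common(1)[0][0]
assert paper == "P1"                  # 出现次数最多的试卷即匹配试卷
assert edit_distance("kitten", "sitting") == 3
```

对于最近向量不属于匹配试卷的题目,可在带有匹配试卷标记的向量集合中用`edit_distance`选取距离最小者作为题目向量。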
图4是示意性地示出根据本公开的一个实施例的题目辅助系统400的至少一部分的结构图。本领域技术人员可以理解,系统400只是一个示例,不应将其视为限制本公开的范围或本文所描述的特征。在该示例中,系统400可以包括一个或多个神经网络模型410、一个或多个电子设备420、一个或多个计算装置430、一个或多个远程服务器440、以及网络450。其中,一个或多个神经网络模型410、一个或多个电子设备420、一个或多个计算装置430、以及一个或多个远程服务器440可以通过网络450互相连接。其中网络450可以是任何有线或无线的网络,也可以包括线缆。此外,虽然一个或多个神经网络模型410在系统400中以独立于一个或多个电子设备420、一个或多个计算装置430、一个或多个远程服务器440、以及网络450之外的单独的框示出,应当理解,一个或多个神经网络模型410可以实际存储在系统400所包括的其他实体420、430、440、450中的任何一个上。
例如,一个或多个计算装置可以包括作为负载平衡的服务器群来操作的服务器计算装置。另外,虽然以上描述的一些功能被指示为在具有单个处理器的单个计算装置上发生,但是本文所描述的主题的各个方面均可以由多个计算装置例如通过网络相互通信来实现。
一个或多个电子设备420、一个或多个计算装置430、以及一个或多个远程服务器440中的每一个可以位于网络450的不同节点处,并且能够直接地或间接地与网络450的其他节点通信。本领域技术人员可以理解,系统400还可以包括图4未示出的其他装置,其中每个不同的装置均位于网络450的不同节点处。可以使用各种协议和系统将网络450和本文所描述的系统中的组成部分互连,以使得网络450可以是互联网、万维网、特定内联网、广域网或局域网的一部分。网络450可以利用诸如以太网、WiFi和HTTP等标准通信协议、对于一个或多个公司来说是专有的协议、以及前述协议的各种组合。虽然当如上所述来传递或接收信息时获得了某些优点,但是本文所描述的主题并不限于任何特定的信息传递方式。
一个或多个电子设备420、一个或多个计算装置430、以及一个或多个远程服务器440中的每一个可以被配置为与图5所示的系统500类似,即具有一个或多个处理器510、一个或多个存储器520、以及指令和数据。一个或多个电子设备420、一个或多个计算装置430、以及一个或多个远程服务器440中的每一个可以是意在由用户使用的个人计算装置或者由企业使用的商业计算机装置,并且具有通常与个人计算装置或商业计算机装置结合使用的所有组件,诸如中央处理单元(CPU)、存储数据和指令的存储器(例如,RAM和内部硬盘驱动器)、诸如显示器(例如,具有屏幕的监视器、触摸屏、投影仪、电视或可操作来显示信息的其他装置)、鼠标、 键盘、触摸屏、麦克风、扬声器、和/或网络接口装置等的一个或多个I/O设备。一个或多个电子设备420还可以包括用于捕获静态图像或记录视频流的一个或多个相机、以及用于将这些元件彼此连接的所有组件。
虽然一个或多个电子设备420可以各自包括全尺寸的个人计算装置,但是它们可能可选地包括能够通过诸如互联网等网络与服务器无线地交换数据的移动计算装置。举例来说,一个或多个电子设备420可以是移动电话,或者是诸如带无线支持的PDA、平板PC或能够经由互联网获得信息的上网本等装置。在另一个示例中,一个或多个电子设备420可以是可穿戴式计算系统。
图5是示意性地示出根据本公开的一个实施例的题目辅助系统500的至少一部分的结构图。系统500包括一个或多个处理器510、一个或多个存储器520、以及通常存在于计算机等装置中的其他组件(未示出)。一个或多个存储器520中的每一个可以存储可由一个或多个处理器510访问的内容,包括可以由一个或多个处理器510执行的指令521、以及可以由一个或多个处理器510来检索、操纵或存储的数据522。
指令521可以是将由一个或多个处理器510直接地执行的任何指令集,诸如机器代码,或者间接地执行的任何指令集,诸如脚本。本文中的术语“指令”、“应用”、“过程”、“步骤”和“程序”可以互换使用。指令521可以存储为目标代码格式以便由一个或多个处理器510直接处理,或者存储为任何其他计算机语言,包括按需解释或提前编译的独立源代码模块的脚本或集合。指令521可以包括使得一个或多个处理器510充当本文中的各神经网络的指令。本文其他部分更加详细地解释了指令521的功能、方法和例程。
一个或多个存储器520可以是能够存储可由一个或多个处理器510访问的内容的任何临时性或非临时性计算机可读存储介质,诸如硬盘驱动器、存储卡、ROM、RAM、DVD、CD、USB存储器、可写存储器和只读存储器等。一个或多个存储器520中的一个或多个可以包括分布式存储系统,其中指令521和/或数据522可以存储在物理地位于相同或不同的地理位置处的多个不同的存储装置上。一个或多个存储器520中的一个或多个可以经由网络连接至一个或多个处理器510,和/或可以直接地连接至或并入一个或多个处理器510中的任何一个中。
一个或多个处理器510可以根据指令521来检索、存储或修改数据522。存储在一个或多个存储器520中的数据522可以包括上文所述的各种待识别的影像、各种影像样本集、以及用于各个神经网络的参数等。其他不与影像或神经网络相关联的数据也可以被存储在一个或多个存储器520中。举例来说,虽然本文所描述的主题不受任何特定数据结构限制,但是数据522还可以存储在计算机寄存器(未示出)中,或者作为具有许多不同的字段和记录的表格或XML文档存储在关系型数据库中。数据522可以被格式化为任何计算装置可读格式,诸如但不限于二进制值、ASCII或Unicode。此外,数据522可以包括足以识别相关信息的任何信息,诸如编号、描述性文本、专有代码、指针、对存储在诸如其他网络位置处等其他存储器中的数据的引用或者被函数用于计算相关数据的信息。
一个或多个处理器510可以是任何常规处理器,诸如市场上可购得的中央处理单元(CPU)、图形处理单元(GPU)等。可替换地,一个或多个处理器510还可以是专用组件,诸如专用集成电路(ASIC)或其他基于硬件的处理器。虽然不是必需的,但是一个或多个处理器510可以包括专门的硬件组件来更快或更有效地执行特定的计算过程,诸如对影像进行图像处理等。
虽然图5中示意性地将一个或多个处理器510以及一个或多个存储器520示出在同一个框内,但是系统500可以实际上包括可能存在于同一个物理壳体内或不同的多个物理壳体内的多个处理器或存储器。例如,一个或多个存储器520中的一个可以是位于与上文所述的一个或多个计算装置(未示出)中的每一个的壳体不同的壳体中的硬盘驱动器或其他存储介质。因此,引用处理器、计算机、计算装置或存储器应被理解成包括引用可能并行操作或可能非并行操作的处理器、计算机、计算装置或存储器的集合。
在说明书及权利要求中的词语“A或B”包括“A和B”以及“A或B”, 而不是排他地仅包括“A”或者仅包括“B”,除非另有特别说明。
在本公开中,对“一个实施例”、“一些实施例”的提及意味着结合该实施例描述的特征、结构或特性包含在本公开的至少一个实施例、至少一些实施例中。因此,短语“在一个实施例中”、“在一些实施例中”在本公开的各处的出现未必是指同一个或同一些实施例。此外,在一个或多个实施例中,可以任何合适的组合和/或子组合来组合特征、结构或特性。
如在此所使用的,词语“示例性的”意指“用作示例、实例或说明”,而不是作为将被精确复制的“模型”。在此示例性描述的任意实现方式并不一定要被解释为比其它实现方式优选的或有利的。而且,本公开不受在上述技术领域、背景技术、发明内容或具体实施方式中所给出的任何所表述的或所暗示的理论所限定。
如在此所使用的,词语“基本上”意指包含由设计或制造的缺陷、器件或元件的容差、环境影响和/或其它因素所致的任意微小的变化。词语“基本上”还允许由寄生效应、噪音以及可能存在于实际的实现方式中的其它实际考虑因素所致的与完美的或理想的情形之间的差异。
上述描述可以指示被“连接”或“耦合”在一起的元件或节点或特征。如在此所使用的,除非另外明确说明,“连接”意指一个元件/节点/特征与另一种元件/节点/特征在电学上、机械上、逻辑上或以其它方式直接地连接(或者直接通信)。类似地,除非另外明确说明,“耦合”意指一个元件/节点/特征可以与另一元件/节点/特征以直接的或间接的方式在机械上、电学上、逻辑上或以其它方式连结以允许相互作用,即使这两个特征可能并没有直接连接也是如此。也就是说,“耦合”意图包含元件或其它特征的直接连结和间接连结,包括利用一个或多个中间元件的连接。
另外,仅仅为了参考的目的,还可以在下面描述中使用某种术语,并且因而并非意图限定。例如,除非上下文明确指出,否则涉及结构或元件的词语“第一”、“第二”和其它此类数字词语并没有暗示顺序或次序。
还应理解,“包括/包含”一词在本文中使用时,说明存在所指出的特征、整体、步骤、操作、单元和/或组件,但是并不排除存在或增加一个或多个其它特征、整体、步骤、操作、单元和/或组件以及/或者它们的组合。
在本公开中,术语“部件”和“系统”意图是涉及一个与计算机有关的实体,或者硬件、硬件和软件的组合、软件、或执行中的软件。例如,一个部件可以是,但是不局限于,在处理器上运行的进程、对象、可执行态、执行线程、和/或程序等。通过举例说明,在一个服务器上运行的应用程序和所述服务器两者都可以是一个部件。一个或多个部件可以存在于一个执行的进程和/或线程的内部,并且一个部件可以被定位于一台计算机上和/或被分布在两台或更多计算机之间。
本领域技术人员应当意识到,在上述操作之间的边界仅仅是说明性的。多个操作可以结合成单个操作,单个操作可以分布于附加的操作中,并且操作可以在时间上至少部分重叠地执行。而且,另选的实施例可以包括特定操作的多个实例,并且在其他各种实施例中可以改变操作顺序。但是,其它的修改、变化和替换同样是可能的。因此,本说明书和附图应当被看作是说明性的,而非限制性的。
另外,本公开的实施方式还可以包括以下示例:
1.一种题目辅助方法,包括:
通过影像获取装置获取至少包括呈现在第一表面的第一题目的影像;
通过第一计算装置和预先训练的第一神经网络模型,基于所述影像,识别出所述影像中的所述第一题目所在的第一区域;
通过第二计算装置和预先训练的第二神经网络模型,基于所述第一区域,识别出所述第一区域中的字符,从而得到所述第一题目;
通过第三计算装置和预先训练的第三神经网络模型,基于所述第一题目,判断所述第一题目的类型;
若所述第一题目的类型为计算题,则:
通过第四和第五计算装置分别生成所述计算题的第一答案和步骤化的解题过程;以及
通过显示装置显示所述计算题的题目和/或所述第一区域,并且显示所述第一答案以及所述步骤化的解题过程。
2.根据权利要求1所述的题目辅助方法,其特征在于,通过所述第五计算装置生成所述计算题的步骤化的解题过程包括:
根据所述计算题的题目的形式特征,从预先设置的规则库中获取对应的规则;以及
根据所述对应的规则生成所述计算题的步骤化的解题过程。
3.根据权利要求1所述的题目辅助方法,其特征在于,所述步骤化的解题过程包括一个或多个步骤,通过显示装置显示所述步骤化的解题过程包括:按顺序显示所述一个或多个步骤所对应的操作结果。
4.根据权利要求3所述的题目辅助方法,其特征在于,通过显示装置显示所述步骤化的解题过程还包括:在所述显示装置的画面中的与所述一个或多个步骤所对应的结果相关联的区域,显示所述一个或多个步骤所对应的操作名称和/或过程。
5.根据权利要求1所述的题目辅助方法,其特征在于,所述计算题的步骤化的解题过程在第一触发时才被显示。
6.根据权利要求5所述的题目辅助方法,其特征在于,所述第一触发包括:所述显示装置的显示画面中的所述计算题的题目所在的区域、所述计算题的第一答案所在的区域、空白区域和/或指定的区域被进行指定的第一操作。
7.根据权利要求1所述的题目辅助方法,其特征在于,若所述第一题目的类型为计算题,则所述方法还包括:
通过第六计算装置生成所述计算题的图形化的解题过程;以及
在第二触发时,通过所述显示装置显示所述计算题的图形化的解题过程。
8.根据权利要求7所述的题目辅助方法,其特征在于,通过所述第六计算装置生成所述计算题的图形化的解题过程包括:
基于plotly库或pm算法模型将所述计算题转换为函数图;以及
根据所述函数图生成所述计算题的图形化的解题过程。
9.根据权利要求7所述的题目辅助方法,其特征在于,所述第二触发包括:所述显示装置的显示画面中的所述计算题的题目所在的区域、所述计算题的第一答案所在的区域、特定的操作区域和/或空白区域被进行指定的第二操作。
10.根据权利要求1所述的题目辅助方法,其特征在于,所述影像还包括呈现在所述第一表面的与所述第一题目相关联的第二答案,所述方法还包括:
通过所述第一计算装置和所述第一神经网络模型,基于所述影像,还识别出所述影像中的所述第二答案所在的第二区域;
通过第七计算装置和预先训练的第四神经网络模型,识别出所述第二区域中的字符,从而得到所述第二答案;
若所述第一题目的类型为计算题,则:
通过第八计算装置比较所述第一和第二答案,以得到相同或不同的结果;以及
通过所述显示装置还显示所述第二答案、以及所述结果。
11.根据权利要求1所述的题目辅助方法,其特征在于,还包括:
若所述第一题目的类型为应用题,则:
通过第九计算装置和预先训练的第五神经网络模型,对所述应用题进行特征提取以生成二维特征向量;
通过第十计算装置,从预先设置的向量索引库中搜索与所述二维特征向量相匹配的题目向量;
通过第十一计算装置,根据预先设置的与所述题目向量相关联的第三答案,生成所述应用题的第四答案;以及
通过显示装置显示所述应用题的第四答案。
12.根据权利要求11所述的题目辅助方法,其特征在于,对所述应用题进行特征提取以生成二维特征向量包括:
对所述应用题中的文字生成第一二维特征向量,并对所述应用题中的图片生成第二二维特征向量;以及
拼接所述第一和第二二维特征向量以得到所述二维特征向量。
13.根据权利要求11所述的题目辅助方法,其特征在于,所述向量索引库包括多个组,每个组包括一个或多个向量,其中,来自同一组的任意两个向量具有相同的长度,来自不同组的任意两个向量具有不同的长度,
其中,从所述向量索引库中搜索所述题目向量包括:
根据所述二维特征向量的长度,在所述向量索引库中找到与所述二维特征向量的长度匹配的组;
在所述组中进行搜索,以找到所述题目向量。
14.根据权利要求13所述的题目辅助方法,其特征在于,每个组具有各自的索引,所述索引与所述组中的向量的长度相匹配,在所述向量索引库中找到与所述二维特征向量的长度匹配的组包括:
根据所述二维特征向量的长度索引到所述匹配的组。
15.根据权利要求1所述的题目辅助方法,其特征在于,所述影像包括呈现在所述第一表面的所述第一题目所在的基本上整张试卷,其中,判断所述第一题目的类型还基于所述第一区域在所述整张试卷中的位置。
16.根据权利要求15所述的题目辅助方法,其特征在于,所述整张试卷还包括除所述第一题目之外的多个类型为应用题的第二题目,所述方法还包括:
通过所述第一计算装置和所述第一神经网络模型,基于所述影像,识别出所述影像中的所述多个第二题目所在的多个第三区域;
通过所述第二计算装置和所述第二神经网络模型,基于所述多个第三区域,分别识别所述多个第三区域中的字符,从而得到所述多个第二题目;
若所述第一题目的类型为应用题,则:
通过第九计算装置和预先训练的第五神经网络模型,分别对所述第一题目和所述多个第二题目进行特征提取以生成多个二维特征向量;
通过第十计算装置:
从预先设置的向量索引库中搜索分别与所述多个二维特征向量距离最近的多个最近向量;
根据所述向量索引库中各个向量被预先设置的标记,得到所述多个最近向量所分别对应的多个试卷,所述标记为所述向量所来自的试卷的识别;
将所述多个试卷中出现次数最多的试卷确定为匹配试卷;
若与所述应用题的二维特征向量距离最近的所述最近向量所对应的试卷是所述匹配试卷,则:
将与所述应用题的二维特征向量距离最近的所述最近向量确定为所述应用题的题目向量;
若与所述应用题的二维特征向量距离最近的所述最近向量所对应的试卷不是所述匹配试卷,则:
将所述应用题的二维特征向量,在来自所述匹配试卷的多个向量中进行最短编辑距离匹配,找到与所述应用题的二维特征向量的最短编辑距离最小的向量,将所述最短编辑距离最小的向量确定为所述应用题的题目向量;
通过第十一计算装置,根据预先设置的与所述题目向量相关联的第三答案,生成所述应用题的第四答案;以及
通过显示装置显示所述应用题的第四答案。
17.根据权利要求16所述的题目辅助方法,其特征在于,所述第一至第五以及第九至第十一计算装置中的任意两者为相同或不同的计算装置。
18.一种题目辅助系统,包括:
预先训练的一个或多个神经网络模型;
具有影像获取功能和显示功能的一个或多个电子设备,被配置为获取至少包括呈现在第一表面的第一题目的影像;以及
一个或多个计算装置,被配置为:
基于所述神经网络模型和所述影像,识别出所述影像中的所述第一题目所在的第一区域;
基于所述神经网络模型和所述第一区域,识别出所述第一区域中的字符,从而得到所述第一题目;
基于所述神经网络模型和所述第一题目,判断所述第一题目的类型;
若所述第一题目的类型为计算题,则生成所述计算题的第一答案和步骤化的解题过程,
其中,所述一个或多个电子设备还被配置为显示所述计算题的题目、第一答案以及步骤化的解题过程。
19.根据权利要求18所述的题目辅助系统,其特征在于,
所述一个或多个计算装置还被配置为:若所述第一题目的类型为计算题,则生成所述计算题的图形化的解题过程;以及
所述一个或多个电子设备还被配置为:显示所述计算题的图形化的解题过程。
20.根据权利要求18所述的题目辅助系统,其特征在于,
所述一个或多个计算装置还被配置为:若所述第一题目的类型为应用题,则:
基于所述神经网络模型,对所述应用题进行特征提取以生成二维特征向量;
从预先设置的向量索引库中搜索与所述二维特征向量相匹配的题目向量;
根据预先设置的与所述题目向量相关联的第三答案,生成所述应用题的第四答案;以及
    所述一个或多个电子设备还被配置为:显示所述应用题的第四答案。
21.根据权利要求20所述的题目辅助系统,其特征在于,对所述应用题进行特征提取以生成二维特征向量包括:
对所述应用题中的文字生成第一二维特征向量,并对所述应用题中的图片生成第二二维特征向量;以及
拼接所述第一和第二二维特征向量以得到所述二维特征向量。
22.根据权利要求18所述的题目辅助系统,其特征在于,所述影像包括呈现在所述第一表面的所述第一题目所在的基本上整张试卷,其中,判断所述第一题目的类型还基于所述第一区域在所述整张试卷中的位置。
23.根据权利要求22所述的题目辅助系统,其特征在于,所述整张试卷还包括除所述第一题目之外的多个类型为应用题的第二题目,
所述一个或多个计算装置还被配置为:
基于所述神经网络模型和所述影像,识别出所述影像中的所述多个第二题目所在的多个第三区域;
基于所述神经网络模型和所述多个第三区域,分别识别所述多个第三区域中的字符,从而得到所述多个第二题目;
若所述第一题目的类型为应用题,则:
基于神经网络模型,分别对所述第一题目和所述多个第二题目进行特征提取以生成多个二维特征向量;
从预先设置的向量索引库中搜索分别与所述多个二维特征向量距离最近的多个最近向量;
根据所述向量索引库中各个向量被预先设置的标记,得到所述多个最近向量所分别对应的多个试卷,所述标记为所述向量所来自的试卷的识别;
将所述多个试卷中出现次数最多的试卷确定为匹配试卷;
    若与所述应用题的二维特征向量距离最近的所述最近向量所对应的试卷是所述匹配试卷,则:
    将与所述应用题的二维特征向量距离最近的所述最近向量确定为所述应用题的题目向量;
若与所述应用题的二维特征向量距离最近的所述最近向量所对应的试卷不是所述匹配试卷,则:
将所述应用题的二维特征向量,在来自所述匹配试卷的多个向量中进行最短编辑距离匹配,找到与所述应用题的二维特征向量的最短编辑距离最小的向量,将所述最短编辑距离最小的向量确定为所述应用题的题目向量;
根据预先设置的与所述题目向量相关联的第三答案,生成所述应用题的第四答案;以及
所述一个或多个电子设备还被配置为:显示所述应用题的第四答案。
24.根据权利要求18所述的题目辅助系统,其特征在于,所述一个或多个神经网络模型中的一个或多个存储在所述一个或多个电子设备中的一个或多个存储介质上。
25.根据权利要求18所述的题目辅助系统,其特征在于,所述题目辅助系统还包括一个或多个远程服务器,所述一个或多个神经网络模型中的一个或多个存储在所述一个或多个远程服务器中的一个或多个存储介质上。
26.根据权利要求18所述的题目辅助系统,其特征在于,所述一个或多个计算装置中的一个或多个位于所述一个或多个电子设备的物理壳体内。
27.根据权利要求18所述的题目辅助系统,其特征在于,所述题目辅助系统还包括一个或多个远程服务器,所述一个或多个计算装置中的一个或多个位于所述一个或多个远程服务器的物理壳体内。
28.一种题目辅助系统,包括:
一个或多个处理器;以及
一个或多个存储器,所述一个或多个存储器被配置为存储一系列计算机可执行的指令以及与所述一系列计算机可执行的指令相关联的计算机可访问的数据,
其中,当所述一系列计算机可执行的指令被所述一个或多个处理器执行时,使得所述一个或多个处理器进行如权利要求1-17中任一项所述的方法。
29.一种非临时性计算机可读存储介质,其特征在于,所述非临时性计算机可读存储介质上存储有一系列计算机可执行的指令,当所述一系列计算机可执行的指令被一个或多个计算装置执行时,使得所述一个或多个计算装置进行如权利要求1-17中任一项所述的方法。
虽然已经通过示例对本公开的一些特定实施例进行了详细说明,但是本领域的技术人员应该理解,以上示例仅是为了进行说明,而不是为了限制本公开的范围。在此公开的各实施例可以任意组合,而不脱离本公开的精神和范围。本领域的技术人员还应理解,可以对实施例进行多种修改而不脱离本公开的范围和精神。本公开的范围由所附权利要求来限定。

Claims (20)

  1. 一种题目辅助方法,包括:
    通过影像获取装置获取至少包括呈现在第一表面的第一题目的影像;
    通过第一计算装置和预先训练的第一神经网络模型,基于所述影像,识别出所述影像中的所述第一题目所在的第一区域;
    通过第二计算装置和预先训练的第二神经网络模型,基于所述第一区域,识别出所述第一区域中的字符,从而得到所述第一题目;
    通过第三计算装置和预先训练的第三神经网络模型,基于所述第一题目,判断所述第一题目的类型;
    若所述第一题目的类型为计算题,则:
    通过第四计算装置和第五计算装置分别生成所述计算题的第一答案和步骤化的解题过程;以及
    通过显示装置显示所述计算题的题目和/或所述第一区域,并且显示所述第一答案以及所述步骤化的解题过程。
  2. 根据权利要求1所述的题目辅助方法,其特征在于,通过所述第五计算装置生成所述计算题的步骤化的解题过程包括:
    根据所述计算题的题目的形式特征,从预先设置的规则库中获取对应的规则;以及
    根据所述对应的规则生成所述计算题的步骤化的解题过程;
    其中,所述步骤化的解题过程包括一个或多个步骤,通过显示装置显示所述步骤化的解题过程包括:按顺序显示所述一个或多个步骤所对应的操作结果。
  3. 根据权利要求2所述的题目辅助方法,其特征在于,通过显示装置显示所述步骤化的解题过程还包括:在所述显示装置的画面中的与所述一个或多个步骤所对应的结果相关联的区域,显示所述一个或多个步骤所对应的操作名称和/或过程。
  4. 根据权利要求1所述的题目辅助方法,其特征在于,所述计算题的步骤化的解题过程在第一触发时才被显示,所述第一触发包括:所述显示装置的显示画面中的所述计算题的题目所在的区域、所述计算题的第一答案所在的区域、空白区域和/或指定的区域被进行指定的第一操作。
  5. 根据权利要求1所述的题目辅助方法,其特征在于,若所述第一题目的类型为计算题,则所述方法还包括:
    通过第六计算装置生成所述计算题的图形化的解题过程;以及
    在第二触发时,通过所述显示装置显示所述计算题的图形化的解题过程,所述第二触发包括:所述显示装置的显示画面中的所述计算题的题目所在的区域、所述计算题的第一答案所在的区域、特定的操作区域和/或空白区域被进行指定的第二操作。
  6. 根据权利要求5所述的题目辅助方法,其特征在于,通过所述第六计算装置生成所述计算题的图形化的解题过程包括:
    基于plotly库或pm算法模型将所述计算题转换为函数图;以及
    根据所述函数图生成所述计算题的图形化的解题过程。
  7. 根据权利要求1所述的题目辅助方法,其特征在于,所述影像还包括呈现在所述第一表面的与所述第一题目相关联的第二答案,所述方法还包括:
    通过所述第一计算装置和所述第一神经网络模型,基于所述影像,还识别出所述影像中的所述第二答案所在的第二区域;
    通过第七计算装置和预先训练的第四神经网络模型,识别出所述第二区域中的字符,从而得到所述第二答案;
    若所述第一题目的类型为计算题,则:
    通过第八计算装置比较所述第一答案和第二答案,以得到相同或不同的结果;以及
    通过所述显示装置还显示所述第二答案以及所述结果。
  8. 根据权利要求1所述的题目辅助方法,其特征在于,还包括:
    若所述第一题目的类型为应用题,则:
    通过第九计算装置和预先训练的第五神经网络模型,对所述应用题进行特征提取以生成二维特征向量;
    通过第十计算装置,从预先设置的向量索引库中搜索与所述二维特征向量相匹配的题目向量;
    通过第十一计算装置,根据预先设置的与所述题目向量相关联的第三答案,生成所述应用题的第四答案;以及
    通过显示装置显示所述应用题的第四答案。
  9. 根据权利要求8所述的题目辅助方法,其特征在于,对所述应用题进行特征提取以生成二维特征向量包括:
    对所述应用题中的文字生成第一二维特征向量,并对所述应用题中的图片生成第二二维特征向量;以及
    拼接所述第一和第二二维特征向量以得到所述二维特征向量。
  10. 根据权利要求8所述的题目辅助方法,其特征在于,所述向量索引库包括多个组,每个组包括一个或多个向量,其中,来自同一组的任意两个向量具有相同的长度,来自不同组的任意两个向量具有不同的长度,
    其中,从所述向量索引库中搜索所述题目向量包括:
    根据所述二维特征向量的长度,在所述向量索引库中找到与所述二维特征向量的长度匹配的组;
    在所述组中进行搜索,以找到所述题目向量。
  11. 根据权利要求1所述的题目辅助方法,其特征在于,所述影像包括呈现在所述第一表面的所述第一题目所在的基本上整张试卷,其中,判断所述第一题目的类型还基于所述第一区域在所述整张试卷中的位置。
  12. 根据权利要求11所述的题目辅助方法,其特征在于,所述整张试卷还包括除所述第一题目之外的多个类型为应用题的第二题目,所述方法还包括:
    通过所述第一计算装置和所述第一神经网络模型,基于所述影像,识别出所述影像中的所述多个第二题目所在的多个第三区域;
    通过所述第二计算装置和所述第二神经网络模型,基于所述多个第三区域,分别识别所述多个第三区域中的字符,从而得到所述多个第二题目;
    若所述第一题目的类型为应用题,则:
    通过第九计算装置和预先训练的第五神经网络模型,分别对所述第一题目和所述多个第二题目进行特征提取以生成多个二维特征向量;
    通过第十计算装置:
    从预先设置的向量索引库中搜索分别与所述多个二维特征向量距离最近的多个最近向量;
    根据所述向量索引库中各个向量被预先设置的标记,得到所述多个最近向量所分别对应的多个试卷,所述标记为所述向量所来自的试卷的识别;
    将所述多个试卷中出现次数最多的试卷确定为匹配试卷;
    若与所述应用题的二维特征向量距离最近的所述最近向量所对应的试卷是所述匹配试卷,则:
    将与所述应用题的二维特征向量距离最近的所述最近向量确定为所述应用题的题目向量;
    若与所述应用题的二维特征向量距离最近的所述最近向量所对应的试卷不是所述匹配试卷,则:
    将所述应用题的二维特征向量,在来自所述匹配试卷的多个向量中进行最短编辑距离匹配,找到与所述应用题的二维特征向量的最短编辑距离最小的向量,将所述最短编辑距离最小的向量确定为所述应用题的题目向量;
    通过第十一计算装置,根据预先设置的与所述题目向量相关联的第三答案,生成所述应用题的第四答案;以及
    通过显示装置显示所述应用题的第四答案。
  13. 根据权利要求12所述的题目辅助方法,其特征在于,所述第一至第五以及第九至第十一计算装置中的任意两者为相同或不同的计算装置。
  14. 一种题目辅助系统,包括:
    预先训练的一个或多个神经网络模型;
    具有影像获取功能和显示功能的一个或多个电子设备,被配置为获取至少包括呈现在第一表面的第一题目的影像;以及
    一个或多个计算装置,被配置为:
    基于所述神经网络模型和所述影像,识别出所述影像中的所述第一题目所在的第一区域;
    基于所述神经网络模型和所述第一区域,识别出所述第一区域中的字符,从而得到所述第一题目;
    基于所述神经网络模型和所述第一题目,判断所述第一题目的类型;
    若所述第一题目的类型为计算题,则生成所述计算题的第一答案和步骤化的解题过程,
    其中,所述一个或多个电子设备还被配置为显示所述计算题的题目、第一答案以及步骤化的解题过程。
  15. 根据权利要求14所述的题目辅助系统,其特征在于,
    所述一个或多个计算装置还被配置为:若所述第一题目的类型为计算题,则生成所述计算题的图形化的解题过程;以及
    所述一个或多个电子设备还被配置为:显示所述计算题的图形化的解题过程。
  16. 根据权利要求14所述的题目辅助系统,其特征在于,
    所述一个或多个计算装置还被配置为:若所述第一题目的类型为应用题,则:
    基于所述神经网络模型,对所述应用题进行特征提取以生成二维特征向量;
    从预先设置的向量索引库中搜索与所述二维特征向量相匹配的题目向量;
    根据预先设置的与所述题目向量相关联的第三答案,生成所述应用题的第四答案;以及
    所述一个或多个电子设备还被配置为:显示所述应用题的第四答案。
  17. 根据权利要求14所述的题目辅助系统,其特征在于,所述影像包括呈现在所述第一表面的所述第一题目所在的基本上整张试卷,其中,判断所述第一题目的类型还基于所述第一区域在所述整张试卷中的位置。
  18. 根据权利要求17所述的题目辅助系统,其特征在于,所述整张试卷还包括除所述第一题目之外的多个类型为应用题的第二题目,
    所述一个或多个计算装置还被配置为:
    基于所述神经网络模型和所述影像,识别出所述影像中的所述多个第二题目所在的多个第三区域;
    基于所述神经网络模型和所述多个第三区域,分别识别所述多个第三区域中的字符,从而得到所述多个第二题目;
    若所述第一题目的类型为应用题,则:
    基于神经网络模型,分别对所述第一题目和所述多个第二题目进行特征提取以生成多个二维特征向量;
    从预先设置的向量索引库中搜索分别与所述多个二维特征向量距离最近的多个最近向量;
    根据所述向量索引库中各个向量被预先设置的标记,得到所述多个最近向量所分别对应的多个试卷,所述标记为所述向量所来自的试卷的识别;
    将所述多个试卷中出现次数最多的试卷确定为匹配试卷;
    若与所述应用题的二维特征向量距离最近的所述最近向量所对应的试卷是所述匹配试卷,则:
    将与所述应用题的二维特征向量距离最近的所述最近向量确定为所述应用题的题目向量;
    若与所述应用题的二维特征向量距离最近的所述最近向量所对应的试卷不是所述匹配试卷,则:
    将所述应用题的二维特征向量,在来自所述匹配试卷的多个向量中进行最短编辑距离匹配,找到与所述应用题的二维特征向量的最短编辑距离最小的向量,将所述最短编辑距离最小的向量确定为所述应用题的题目向量;
    根据预先设置的与所述题目向量相关联的第三答案,生成所述应用题的第四答案;以及
    所述一个或多个电子设备还被配置为:显示所述应用题的第四答案。
  19. 一种题目辅助系统,包括:
    一个或多个处理器;以及
    一个或多个存储器,所述一个或多个存储器被配置为存储一系列计算机可执行的指令以及与所述一系列计算机可执行的指令相关联的计算机可访问的数据,
    其中,当所述一系列计算机可执行的指令被所述一个或多个处理器执行时,使得所述一个或多个处理器进行如权利要求1-13中任一项所述的方法。
  20. 一种非临时性计算机可读存储介质,其特征在于,所述非临时性计算机可读存储介质上存储有一系列计算机可执行的指令,当所述一系列计算机可执行的指令被一个或多个计算装置执行时,使得所述一个或多个计算装置进行如权利要求1-13中任一项所述的方法。
PCT/CN2020/075826 2019-03-04 2020-02-19 题目辅助方法及系统 WO2020177531A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910158424.3A CN109815955B (zh) 2019-03-04 2019-03-04 题目辅助方法及系统
CN201910158424.3 2019-03-04

Publications (1)

Publication Number Publication Date
WO2020177531A1 true WO2020177531A1 (zh) 2020-09-10

Family

ID=66608032




Also Published As

Publication number Publication date
CN109815955A (zh) 2019-05-28
CN109815955B (zh) 2021-09-28
US20200286402A1 (en) 2020-09-10


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20765604

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20765604

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 26.10.2021)
