WO2024098282A1 - A geometry problem-solving method, apparatus, device and storage medium - Google Patents


Info

Publication number
WO2024098282A1
Authority
WO
WIPO (PCT)
Prior art keywords
geometric
current
condition
theorem
learning model
Prior art date
Application number
PCT/CN2022/130858
Other languages
English (en)
French (fr)
Inventor
黄世锋 (Huang Shifeng)
Original Assignee
广州视源电子科技股份有限公司
广州视源人工智能创新研究院有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州视源电子科技股份有限公司 and 广州视源人工智能创新研究院有限公司
Priority to PCT/CN2022/130858
Publication of WO2024098282A1


Definitions

  • the present application relates to the technical field of machine learning, and in particular to a geometry problem-solving method, apparatus, device and storage medium.
  • Geometry is a subject in mathematics. Geometry exercises require users to solve problems based on question information and geometric figures. This type of exercise requires students to have good geometric thinking and be familiar with knowledge points in geometry, which can test students' comprehensive mathematical ability.
  • the present application provides a geometry problem-solving method, apparatus, device, and storage medium, aimed at improving the accuracy of solving geometry exercises.
  • a geometry problem-solving method is provided, which is applied to a server, and the method includes:
  • the current geometric condition is input into the reinforcement learning model for learning to obtain a geometric theorem adapted to the current geometric condition
  • the answer is sent to the client to display the geometry problem in association with the answer.
  • a geometry problem-solving method is provided, which is applied to a server, and the method includes:
  • the current geometric condition is input into the reinforcement learning model for learning to obtain a geometric theorem adapted to the current geometric condition
  • the mapping relationship between the electronic exercise and the derivation information is stored in the question bank, where the derivation information records, in the order of the reinforcement learning model's iterations, the process of applying the current geometric theorem to the current geometric conditions to infer new geometric conditions until the answer is obtained.
  • a geometry problem-solving device which is applied to a server, and the device includes:
  • a problem-solving request receiving module is used to receive a problem-solving request sent by a client for an electronic exercise in the subject of geometry
  • an exercise information extraction module for extracting known geometric conditions and geometric problems to be solved from the electronic exercises in response to the problem-solving request
  • a geometric theorem learning module used for inputting the current geometric conditions into the reinforcement learning model for learning each time the reinforcement learning model is iterated, so as to obtain a geometric theorem adapted to the current geometric conditions
  • a geometric condition inference module used for applying the current geometric theorem to the current geometric condition to infer a new geometric condition
  • An answer determination module used for determining the new geometric condition as the answer to the geometric problem when the iteration of the reinforcement learning model ends;
  • the answer sending module is used to send the answer to the client so as to associate the geometry exercise with the answer for display.
  • a geometry problem-solving device which is applied to a server, and the device includes:
  • An electronic exercise search module is used to search for electronic exercises belonging to the subject of geometry from the question bank;
  • an exercise information extraction module for extracting known geometric conditions and geometric problems to be solved from the electronic exercise if the electronic exercise lacks an answer and/or a derivation process
  • a geometric theorem learning module used for inputting the current geometric conditions into the reinforcement learning model for learning each time the reinforcement learning model is iterated, so as to obtain a geometric theorem adapted to the current geometric conditions
  • a geometric condition inference module used for applying the current geometric theorem to the current geometric condition to infer a new geometric condition
  • An answer determination module used for setting the new geometric condition as the answer to the geometric problem when the iteration of the reinforcement learning model ends;
  • the exercise storage module is used to store the mapping relationship between the electronic exercises and the derivation information in the question bank, and the derivation information is a process of applying the current geometric theorem to the current geometric conditions to infer new geometric conditions in the order of iterating the reinforcement learning model until the answer is obtained.
  • an electronic device comprising:
  • the memory stores a computer program that can be executed by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can execute the geometry problem-solving method described in any embodiment of the present application.
  • a computer-readable storage medium stores a computer program, and the computer program is used to implement the geometry problem-solving method described in any embodiment of the present application when executed by a processor.
  • the server receives a problem-solving request sent by the client for an electronic exercise belonging to the subject of geometry; in response to the problem-solving request, the server extracts known geometric conditions and geometric problems to be solved from the electronic exercise; in each iteration of the reinforcement learning model, the current geometric conditions are input into the reinforcement learning model for learning, and a geometric theorem adapted to the current geometric conditions is obtained; the current geometric theorem is applied to the current geometric conditions to infer new geometric conditions; when iteration of the reinforcement learning model ends, the new geometric conditions are determined to be the answer to the geometric problem; the answer is sent to the client to display the geometry exercise in association with the answer.
  • This embodiment proposes a geometry problem-solving framework with strong compatibility and scalability.
  • the reinforcement learning model is introduced into the geometry problem-solving framework to learn geometry theorems.
  • the logic is clear and readily interpretable.
  • the correlation between geometry theorems is maintained, and the accuracy of geometry problem-solving is improved.
  • this embodiment predicts the geometry theorem step by step for inference, and the inference process can be described, which better matches the user's learning process and lets the user know not only the result but also the reasoning behind it.
  • FIG1 is a flow chart of a method for solving a geometric problem according to Embodiment 1 of the present application.
  • FIG2 is a flow chart of a method for solving a geometric problem according to Embodiment 2 of the present application.
  • FIG3 is an example diagram of optical character recognition provided according to Embodiment 2 of the present application.
  • FIG4A is an example diagram of a geometric electronic exercise provided according to Embodiment 2 of the present application.
  • FIG4B is a flowchart illustrating a geometric problem-solving method according to Embodiment 1 of the present application.
  • FIG5 is a flow chart of a method for solving a geometric problem according to Embodiment 3 of the present application.
  • FIG6 is a schematic diagram of the structure of a geometric problem-solving device provided according to Embodiment 4 of the present application.
  • FIG7 is a schematic diagram of the structure of a geometric problem-solving device provided according to Embodiment 5 of the present application.
  • FIG8 is a schematic diagram of the structure of an electronic device that implements the geometry problem-solving method according to an embodiment of the present application.
  • some learning applications use neural networks to analyze the solution process of electronic geometry exercises. That is, the stem information of the electronic geometry exercises is input into the text feature extractor to obtain text features, and the geometric figures of the electronic geometry exercises are input into the image feature extractor to obtain image features. The text and image features are fused and passed through the decoder to obtain the predicted results.
  • interpretability can be divided into two types: intrinsic interpretability and post-hoc interpretability.
  • Intrinsic interpretability requires limiting the complexity of the model
  • post-hoc interpretability requires analyzing the model results after model training.
  • Models that satisfy the intrinsic interpretability condition include linear/logistic regression, decision trees, naive Bayes, K-nearest neighbors, etc.
  • neural networks clearly do not satisfy this condition, given the complexity of their parameters.
  • FIG1 is a flow chart of a geometry problem-solving method provided in the first embodiment of the present application.
  • the present embodiment is applicable to the case where a reinforcement learning model is used to solve a geometry electronic exercise selected by a user.
  • the method can be executed by a geometry problem-solving device, which can be implemented in the form of hardware and/or software, and can be configured in an electronic device. As shown in FIG1 , the method includes:
  • Step 101 Receive a problem-solving request sent by a client for an electronic exercise in the subject of geometry.
  • the electronic device acts as a server, stores electronic exercises, and maintains the logic for screening and solving electronic exercises, so as to save storage resources and reduce updates to the client.
  • the server provides users with a service for solving electronic geometry exercises.
  • Users log in to the client (Client).
  • the client can receive electronic geometry exercises selected by the user or receive electronic geometry exercises screened by the server, and display the electronic geometry exercises to the user for answering and practicing. In some cases, some electronic geometry exercises are missing answers.
  • the user can operate in the client and send a solution request to the server, requesting the server to solve electronic exercises belonging to the subject of geometry.
  • the server can also provide users with services such as recommending electronic exercises and solving electronic exercises of other subjects, which are not limited in this embodiment.
  • users include teachers and students.
  • the teacher can log in to the client, select some or all students based on the students' learning situation, and notify the server to select suitable electronic geometry exercises for these students, and push the electronic exercises to the client where the corresponding students are logged in, so that these students can answer and practice respectively, and after answering and practicing, the server loads the process of solving electronic geometry exercises for the students.
  • the student can log in to the client, notify the server to select electronic geometry exercises, notify the server to select suitable electronic exercises for them, and push the electronic exercises to the client where the student is logged in, so that the student can answer and practice, and request the server to solve the electronic geometry exercises.
  • the number of electronic geometry exercises can be reduced to the level of hundreds of thousands.
  • for example, electronic geometry exercises for practicing a certain grade, electronic geometry exercises for practicing a certain knowledge point, and so on.
  • the client can download electronic geometry exercises from the server, maintain the logic of screening and solving electronic geometry exercises, and provide users with services for solving electronic geometry exercises, so that users can still answer, practice, and browse the process of solving electronic geometry exercises normally in offline scenarios. This embodiment does not limit this.
  • Step 102 In response to a problem-solving request, known geometric conditions and geometric problems to be solved are extracted from the electronic exercises.
  • the electronic exercises can be preprocessed and some key information for solving the electronic exercises can be parsed from the electronic exercises, including known conditions, recorded as geometric conditions, and problems to be solved, recorded as geometric problems.
  • geometric conditions may include the length of a side, the relationship between a point and a side (such as a midpoint, etc.), the relationship between a point and a figure (such as the center of gravity of a triangle, etc.), the degree of an angle, the relationship between a side and an angle (such as an angle bisector, etc.), and so on.
  • the problems to be solved may include finding the length of a side, the relationship between a point and a side, the relationship between a point and a figure, the measure of an angle, the relationship between a side and an angle, and so on.
  • Electronic exercises include stem information, option information and other parts, and there are many types of electronic geometry exercises, for example, multiple-choice questions, fill-in-the-blank questions, question-and-answer questions (also known as answer questions), etc.
  • the information contained in electronic exercises of different types is different.
  • multiple-choice questions usually include stem information and option information
  • fill-in-the-blank questions and question-and-answer questions usually include stem information, and so on.
  • the stem information contains key information for solving electronic exercises, and the option information is several possible answers. Therefore, in this embodiment, the option information can be filtered out, and the known geometric conditions and the geometric problems to be solved can be extracted from the stem information of the electronic exercises.
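The option-filtering step described above can be sketched as follows; the "A."–"D." option markers are an assumed convention for illustration, not a format specified by the application.

```python
import re

def split_stem_and_options(exercise_text: str):
    """Split a multiple-choice exercise into stem text and option lines.

    Assumes options are lines beginning with "A."-"D." (a hypothetical
    convention; real exercises may use other markers).
    """
    stem_lines, option_lines = [], []
    for line in exercise_text.strip().splitlines():
        if re.match(r"^\s*[A-D]\s*[.、)]", line):
            option_lines.append(line.strip())
        else:
            stem_lines.append(line.strip())
    # The stem carries the key information; options are filtered out.
    return " ".join(stem_lines), option_lines

stem, options = split_stem_and_options(
    "In triangle ABC, D is the midpoint of AB. Find the length of AC.\n"
    "A. 3\nB. 4\nC. 5\nD. 6"
)
```

Extraction of geometric conditions and problems would then operate on `stem` only.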
  • formula data in mathematics and character data in English are usually recorded in specific formats so that they can be displayed correctly on the UI page, such as LaTeX (an electronic typesetting system), HTML (HyperText Markup Language), and MathML (Mathematical Markup Language); tags are generated during recording.
  • data format templates can be set.
  • the template has one or more wildcards.
  • the template is Equals(LengthOf(Line(*1)), *2), where the wildcard "*1" can be filled with the letters naming a line segment, and the wildcard "*2" can be filled with the length of that segment.
  • the template is Find(LengthOf(Line(*1))), where the wildcard "*1" can be filled with the letters naming a line segment.
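Template instantiation can be sketched as follows; because the wildcard symbol is garbled in this text, the markers "*1" and "*2" are used as stand-ins.

```python
def fill_template(template: str, values: dict) -> str:
    """Instantiate a data-format template by substituting each wildcard
    marker ("*1", "*2", ...) with its concrete value."""
    out = template
    for slot, value in values.items():
        out = out.replace(slot, str(value))
    return out

# The two example templates from the text, filled with sample values.
cond = fill_template("Equals(LengthOf(Line(*1)), *2)", {"*1": "AD", "*2": 5})
goal = fill_template("Find(LengthOf(Line(*1)))", {"*1": "AC"})
```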
  • Step 103 During each iteration of the reinforcement learning model, the current geometric conditions are input into the reinforcement learning model for learning to obtain a geometric theorem that is compatible with the current geometric conditions.
  • Electronic geometry exercises involve extensive reasoning, with geometric theorems used as auxiliary information along the way. In particular, when an exercise requires several geometric theorems, later derivation steps depend on the results of earlier theorem applications, so the logical expressions are strongly correlated.
  • neural networks are black-box models with poor analyzability
  • traditional neural-network-based problem-solving methods reason weakly and cannot incorporate geometric theorems; they fundamentally lose the correlation between geometric theorems, resulting in poor solution quality and preventing practical use.
  • a reinforcement learning model can be used to assist in solving geometric electronic exercises.
  • Using the reinforcement learning model to assist in solving geometric electronic exercises will involve one or more iterative learning processes. During each iterative learning process, all geometric conditions in this iteration can be input into the reinforcement learning model for learning, and the reinforcement learning model outputs geometric theorems that are compatible with the current geometric conditions.
  • the reinforcement learning model is a model that expresses reinforcement learning.
  • so-called reinforcement learning understands information and learns the mapping from input to output by continuously drawing on its own previous experience in solving geometric electronic exercises, thereby avoiding the need for a large number of definite labels; feedback is provided by a reward-and-punishment mechanism that evaluates the behavior of selecting geometric theorems, and the model "learns" itself through such feedback.
  • Reinforcement learning models can be described using methods such as Markov Decision Process (MDP), that is, the machine is in an environment, and each state is the machine's perception of the current environment; the machine affects the environment through actions, and when the machine performs an action, the environment will transfer to another state with a certain probability; at the same time, the environment will feedback an incentive to the machine based on the potential incentive function.
  • Step 104 Apply the current geometric theorem to the current geometric conditions to infer new geometric conditions.
  • the selected geometric theorem can be applied to the geometric conditions for logical deduction to obtain new information, which is recorded as a new geometric condition.
  • Step 105 When iteration of the reinforcement learning model ends, the new geometric condition is determined as the answer to the geometric problem.
  • the new geometric condition output by the last iterative learning can be defined as the answer to the geometric problem.
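The iterative procedure of steps 103 to 105 can be sketched as follows; the theorem-selection policy and theorem application are stubbed with hypothetical toy functions, since the actual reinforcement learning model is not specified here.

```python
def solve_geometry(known_conditions, goal, select_theorem, apply_theorem, max_iters=10):
    """Iterate: pick a theorem for the current conditions (step 103),
    apply it to infer new conditions (step 104), and stop when the goal
    is derived or nothing new can be inferred (step 105)."""
    conditions = set(known_conditions)
    for _ in range(max_iters):
        theorem = select_theorem(conditions)  # stands in for the RL model
        if theorem is None:
            break
        new_conditions = apply_theorem(theorem, conditions)
        if not new_conditions - conditions:
            break  # no progress: nothing new inferred
        conditions |= new_conditions
        if goal in conditions:
            return goal, conditions
    return None, conditions

# Toy stand-ins: a single "midpoint halves a segment" rule.
def select_theorem(conds):
    if "midpoint(D,AB)" in conds and "AD=5" not in conds:
        return "midpoint_halves"
    return None

def apply_theorem(theorem, conds):
    if theorem == "midpoint_halves" and "AB=10" in conds:
        return {"AD=5", "BD=5"}
    return set()

answer, derived = solve_geometry({"midpoint(D,AB)", "AB=10"}, "AD=5",
                                 select_theorem, apply_theorem)
```

In the method itself, `select_theorem` corresponds to the learned policy of the reinforcement learning model rather than a fixed rule.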
  • Step 106 Send the answer to the client to display the geometry exercise in association with the answer.
  • the server encapsulates the answer into a problem-solving response and sends the problem-solving response to the client.
  • when the client receives the problem-solving response, it parses the answer from the problem-solving response, displays the geometry exercise in association with the answer, and prompts the user that the geometry exercise has been solved to obtain the answer.
  • the server receives a problem-solving request sent by the client for an electronic exercise belonging to the subject of geometry; in response to the problem-solving request, the server extracts known geometric conditions and geometric problems to be solved from the electronic exercise; in each iteration of the reinforcement learning model, the current geometric conditions are input into the reinforcement learning model for learning, and a geometric theorem adapted to the current geometric conditions is obtained; the current geometric theorem is applied to the current geometric conditions to infer new geometric conditions; when iteration of the reinforcement learning model ends, the new geometric conditions are determined to be the answer to the geometric problem; the answer is sent to the client to display the geometry exercise in association with the answer.
  • This embodiment proposes a geometry problem-solving framework with strong compatibility and scalability.
  • the reinforcement learning model is introduced into the geometry problem-solving framework to learn geometry theorems.
  • the logic is clear and readily interpretable.
  • the correlation between geometry theorems is maintained, and the accuracy of geometry problem-solving is improved.
  • this embodiment predicts the geometry theorem step by step for inference, and the inference process can be described, which better matches the user's learning process and lets the user know not only the result but also the reasoning behind it.
  • FIG2 is a flow chart of a geometric problem-solving method provided in Embodiment 2 of the present application. This embodiment refines the derivation process of the reinforcement learning model based on the above embodiment. As shown in FIG2, the method includes:
  • Step 201 Receive a problem-solving request sent by a client for an electronic exercise in the subject of geometry.
  • the application can be a client that independently provides learning services, a functional module (such as an SDK (Software Development Kit)) that provides learning services inside other clients such as instant-messaging tools or industry work clients, or a client with a browsing component.
  • the client with a browsing component may include a browser, an application that configures the browsing component (such as WebView), and this embodiment does not limit this.
  • For users, login can be performed in the application using account, password, and other information, represented by identity data. If a user is not logged in, temporary identity data can be provided and bound to the device identifier, and temporary identity data bound to the same device identifier can be merged. If a temporary user subsequently registers and logs in, the temporary identity data can be converted into formal identity data.
  • the client can provide a UI (User Interface) on which users can browse geometric electronic exercises.
  • operations to solve a certain geometric electronic exercise can be triggered, such as clicking on a certain geometric electronic exercise to study and browse the solution process, clicking on a certain geometric learning task to take a test and browse the solution process of the electronic exercise, requesting the solution of a new geometric electronic exercise, and so on.
  • This embodiment is applied to the server side.
  • the client side can send a solution request for the geometry electronic exercises to the server side.
  • when the server side receives the solution request, it starts the logic of solving the geometry electronic exercises (i.e., executes steps 201 to 212).
  • the user can actively select geometry electronic exercises in the client, or geometry electronic exercises suitable for the user can be screened from the question bank on the server side and pushed to the user.
  • factors such as methods, knowledge points, novelty of question types, etc. involved in the geometry electronic exercises can be considered to select the electronic exercises for the user.
  • the method of screening electronic geometry exercises may include at least one of the following:
  • the user's mastery of various knowledge points in geometry is diagnosed, so that geometric electronic exercises of appropriate difficulty can be selected for the user.
  • the similarities between geometric electronic exercises are computed, on a per-exercise basis, from users' answer records, and these similarities then assist screening: other geometric electronic exercises similar to those the user previously answered incorrectly are selected and given to the user.
  • the user can also directly input the geometry electronic exercises into the client, operate on the client UI, and control the client to send a solution request for the geometry electronic exercises to the server.
  • a user copies an editable electronic exercise on the subject of geometry from a web page or other application, and inputs the editable electronic exercise on the subject of geometry into a client.
  • the electronic exercise on geometry contains text data, formula data, image data, and other data, wherein the text data mainly records information such as the question stem and options, the formula data is recorded in forms such as LaTeX, HTML, and MathML, and the image data mainly records geometric figures (including geometric parameters, such as the labels and angles of line segments).
  • the client can encapsulate the editable electronic exercises belonging to the subject of geometry into a problem-solving request, and send the problem-solving request to the server, and the server parses the editable electronic exercises belonging to the subject of geometry from the problem-solving request.
  • the client can call the camera to capture image data of the electronic exercises on the subject of geometry and frame the electronic exercises on the subject of geometry in the image data, that is, the image data may contain information such as the question stem, options, formulas, geometric figures (including geometric parameters, such as the labels and angles of line segments, etc.), and the client encapsulates the image data into a problem-solving request and sends the problem-solving request to the server.
  • the server receives the problem-solving request sent by the client and extracts the image data from the problem-solving request. At this time, it can perform OCR (Optical Character Recognition) operation on the image data to obtain text information, and perform normalization processing on the text information based on natural language processing to classify the text information into question stem, options, formulas, geometric figures (including geometric parameters, such as line segment numbers, angles, etc.), etc., so as to read electronic exercises belonging to the subject of geometry in the image data.
  • Step 202 In response to a problem-solving request, known geometric conditions and geometric problems to be solved are extracted from the electronic exercises.
  • the electronic exercise has first text information; in this method, a regular expression can then be determined.
  • the features of known geometric conditions or geometric problems to be solved in electronic geometry exercises can be pre-selected for analysis, and one or more regular expressions can be constructed based on these features.
  • the regular expressions are used to describe the matching patterns of geometric conditions or geometric problems.
  • the components of the regular expressions can be single characters, character sets, character ranges, selections between characters, or any combination of all these components.
  • the regular expressions may be matched with the first text information.
  • a regular expression may be used to match the first text information, check whether the first text information contains a certain substring, replace the matched substring, or extract a substring that meets a certain condition from a certain string.
  • for the first text information that is successfully matched, it may be determined that this text is a known geometric condition or a geometric problem to be solved.
  • the substring is determined to be a geometric condition.
  • the geometric condition can be converted into a data format with a specific structure for representation.
  • the geometric condition “D is the midpoint of AB” can be expressed as Equals(LengthOf(Line(AD)),LengthOf(Line(BD))).
  • the substring is determined to be a geometry problem.
  • the geometry problem can be converted into a data format with a specific structure for representation.
  • the geometric problem “find the length of AC” can be expressed as Find(LengthOf(Line(AC))).
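The regular-expression matching described above can be illustrated with a minimal sketch covering only the two example phrasings from the text ("D is the midpoint of AB", "find the length of AC"); the patterns are hypothetical, and a real system would need a much larger pattern set.

```python
import re

# Hypothetical matching patterns for the two example phrasings.
MIDPOINT = re.compile(r"(\w) is the midpoint of (\w)(\w)")
FIND_LEN = re.compile(r"find the length of (\w)(\w)", re.IGNORECASE)

def extract(stem: str):
    """Extract known geometric conditions and geometric problems from the
    first text information, converting matches into the structured format."""
    conditions, problems = [], []
    for m in MIDPOINT.finditer(stem):
        p, a, b = m.groups()
        conditions.append(f"Equals(LengthOf(Line({a}{p})),LengthOf(Line({b}{p})))")
    for m in FIND_LEN.finditer(stem):
        a, b = m.groups()
        problems.append(f"Find(LengthOf(Line({a}{b})))")
    return conditions, problems

conds, probs = extract("D is the midpoint of AB; find the length of AC.")
```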
  • the electronic exercises have image data.
  • at least one of symbols, numbers, and letters can be used as a target to be detected in the image data.
  • the image data contains a wealth of information, especially for displaying the structure of geometric figures (such as triangles, quadrilaterals, etc.), as well as labeling information of geometric figures (such as length, angle, relationship between line segments, etc.).
  • This information is mostly represented by symbols, numbers, and letters.
  • the symbols can be some commonly used mathematical symbols (such as the symbol of angle) or some specific geometric symbols (such as the symbol of perpendicular angles), which is not limited in this embodiment.
  • this embodiment can use at least one of symbols, numbers, and letters as a target, and use target detection models such as RetinaNet, YOLO, and PSENet to detect the area where the target exists in the image data.
  • optical character recognition is performed on the area to obtain second text information.
  • When the target detection model outputs an area containing at least one of symbols, numbers, and letters, the area can be cropped and an OCR operation performed on it using an optical recognition model such as CRNN, which outputs the second text information.
  • a triangle (geometric figure) is drawn, in which some angles (60°, 55°), a length (73), and an unknown quantity (X) are marked.
  • Symbols, numbers, and letters are used as targets, and four areas are detected in the triangle (i.e., the middle boxes in FIG3). Performing OCR on these areas yields four pieces of text (i.e., the boxes on the right side of FIG3).
  • geometric figures are identified in the image data, so that second text information is assigned to the geometric figures to obtain known geometric conditions.
  • geometric figures such as points, lines, and angles can be extracted from image data using geometric figure extraction tools such as OpenCV.
  • the second text information is assigned to the geometric figure so that an association is established between the second text information and the geometric figure, thereby obtaining a geometric condition.
  • the second text information is assigned to the geometric figure that is closest to it.
  • the second text information is identification information of a geometric figure, for example, the second text information is letters representing line segments, the second text information is letters identifying angles, and so on.
  • the second text information is a numerical value of a geometric figure, for example, the second text information represents the length of a line segment, the second text information represents the degree of an angle, and so on.
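The nearest-figure assignment rule described above can be sketched as follows; the coordinates and figure names are illustrative assumptions.

```python
from math import hypot

def assign_labels(labels, figures):
    """Assign each piece of recognized second text information to the
    geometric figure closest to it. `labels` maps text to its detected
    (x, y) centre; `figures` maps figure names to their centres."""
    assignment = {}
    for text, (lx, ly) in labels.items():
        nearest = min(
            figures,
            key=lambda f: hypot(figures[f][0] - lx, figures[f][1] - ly),
        )
        assignment[text] = nearest
    return assignment

# Illustrative centres: an angle label near vertex A, a length near side BC.
assignment = assign_labels(
    {"60°": (10, 12), "73": (48, 52)},
    {"angle BAC": (11, 10), "line BC": (50, 50)},
)
```

Each resulting (text, figure) pair forms a geometric condition, e.g. "angle BAC measures 60°".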
  • Step 203 During each iteration of the reinforcement learning model, the current geometric condition is set as the state of the environment in the reinforcement learning model, and the geometric theorem is set as the action in the reinforcement learning model.
  • There are four basic elements in the reinforcement learning model: Agent, Environment, Action, and Reward.
  • the intelligent agent can perceive the state of the environment, and according to the incentive provided by the environment, it selects a suitable action through learning to maximize the long-term incentive.
  • the agent learns a series of mappings from the state of the environment to the action based on the reward provided by the environment as feedback.
  • the principle of action selection is to maximize the probability of future accumulated reward.
  • the selected action not only affects the reward at the current moment, but also affects the reward at the next moment or even in the future. Therefore, the basic rule of the agent in the learning process is: if an action brings a positive reward from the environment, then this action will be strengthened, and if an action brings a negative reward from the environment, then this action will be weakened.
  • the environment receives a series of actions performed by the agent, evaluates the quality of the actions, and converts them into a quantifiable (scalar signal) reward and feeds it back to the agent.
  • the environment also provides the agent with its state.
  • Reward is a quantifiable scalar feedback signal provided by the environment to the agent, which is used to evaluate the quality of the action performed by the agent at a certain time.
  • Reinforcement learning is based on a hypothesis of maximizing cumulative rewards, that is, in reinforcement learning, the goal of the agent to select a series of actions is to maximize the future cumulative reward.
  • the state contains the information that the agent uses to select actions, and it is a function of the history.
  • the Markov decision process can be expressed as follows:
  • S represents the set of states of the environment
  • A represents the set of actions
  • P_sa represents the state transition probability, that is, the probability distribution over the states transitioned to after taking action a in state s.
  • the goal of learning is to find the optimal strategy π for the above Markov decision process:
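Collecting the elements above (with the attenuation factor γ and reward r_t that appear in the later steps), the Markov decision process and its learning objective can be written as:

```latex
% the MDP collects the state set, action set, and transition probabilities
\[
  \mathcal{M} = (S, A, P_{sa}, \gamma, R), \qquad
  P_{sa}(s') = \Pr\bigl(s_{t+1} = s' \mid s_t = s,\ a_t = a\bigr)
\]
% the optimal strategy maximizes the expected accumulated (attenuated) reward
\[
  \pi^{*} = \arg\max_{\pi}\ \mathbb{E}\Bigl[\textstyle\sum_{t \ge 0} \gamma^{t} r_{t} \Bigm| \pi\Bigr]
\]
```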
  • the electronic device (geometric problem-solving device) is the intelligent agent Agent
  • the geometric condition is the state State of the environment Environment
  • the geometric theorem selected for the geometric condition is the action Action.
  • For electronic geometry exercises, there are generally corresponding geometry knowledge points, which can be summarized into different geometry theorems, such as the Pythagorean theorem, the projection theorem, Euler's theorem, the median theorem, Stewart's theorem, Apollonius' theorem, Ptolemy's theorem, etc., so that an electronic geometry exercise is designed around one or more geometry theorems. Therefore, in order to improve the efficiency and accuracy of solving electronic geometry exercises, geometry theorems can be screened according to the knowledge points contained in the electronic geometry exercises, thereby constructing the action space.
  • Step 204 Execute the reinforcement learning model to learn the value of applying all geometric theorems to the geometric conditions as the first target value.
  • the electronic device acts as an intelligent agent, extracts geometric conditions from electronic geometry exercises as the state of the environment, executes the action of applying geometric theorems to the geometric conditions, and thus calculates the value (Q value) of the action of applying geometric theorems to the geometric conditions, which is recorded as the first target value.
  • DQN (Deep Q-Learning)
  • the action-value function Q(s_t, a_t; θ) can be defined in advance for the reinforcement learning model, where s_t represents the state of the environment at the current moment t, that is, the geometric conditions of the electronic exercise at the current moment t, a_t represents the action performed at the current moment t, that is, the geometric theorem selected at the current moment t, and θ is the parameter.
  • the action-value function can also be called Q-network.
  • Q-network can apply neural networks, such as Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Deep Neural Networks (DNN), etc., so as to convert high-dimensional, continuous state space (geometric conditions, various geometric theorems) into low-dimensional value functions through neural networks.
  • CNN Convolutional Neural Networks
  • RNN Recurrent Neural Networks
  • DNN Deep Neural Networks
  • the geometric conditions and various geometric theorems are input into the action-value function, and the first target value of each geometric theorem is output.
  • Step 205 Select a geometric theorem that is compatible with the current geometric conditions according to the first target value.
  • the first target value of each geometric theorem can be referred to and a geometric theorem that is suitable for the geometric conditions can be selected.
  • the first objective values of various geometric theorems can be compared, and the geometric theorem with the highest first objective value can be selected as the geometric theorem that matches the geometric conditions.
  • the ε-greedy method can be used to select the geometric theorem that matches the geometric conditions, that is, with a probability of ε the geometric theorem with the highest first target value is selected, and with a probability of (1-ε) a geometric theorem is selected at random, and so on. This embodiment does not limit this.
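A minimal sketch of the selection in Steps 204-205, with hypothetical first target values; following the wording above, ε is the probability of taking the theorem with the highest value:

```python
import random

def select_theorem(q_values, epsilon=0.9, rng=random):
    """epsilon-greedy selection over first target values (Q-values).

    Following the wording above: with probability epsilon, pick the theorem
    with the highest first target value; with probability (1 - epsilon),
    pick a theorem uniformly at random so that exploration continues.
    """
    if rng.random() < epsilon:
        return max(q_values, key=q_values.get)  # exploit: highest Q-value
    return rng.choice(list(q_values))           # explore: random theorem

# hypothetical first target values for three candidate theorems
q = {"Pythagorean theorem": 0.82, "projection theorem": 0.31,
     "median theorem": 0.12}
print(select_theorem(q, epsilon=1.0))  # always greedy: Pythagorean theorem
```

In practice the Q-values would come from the Q-network rather than a hand-written dictionary.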
  • Step 206 Calculate the second target value as the time-difference target under the current geometric conditions.
  • the parameters of the action-value function (such as DQN) in the reinforcement learning model can be learned using temporal difference (TD).
  • the value of the target as a time difference can be calculated under geometric conditions and recorded as the second target value.
  • the incentive Reward for the selected geometric theorem under the geometric conditions at the current time t can be determined, and the electronic device (geometric problem-solving device) executes the action of selecting the geometric theorem for the geometric conditions, with the aim of making the incentive Reward optimal.
  • the incentive reward is positively correlated with the matching degree, which indicates the degree of matching between the geometric conditions and the electronic exercises. That is, the higher the matching degree, the greater the value of the incentive reward. Conversely, the lower the matching degree, the smaller the value of the incentive reward.
  • the incentive for the geometric theorem is determined to be the first value.
  • the excitation to the geometric theorem is determined to be the second value.
  • the incentive for the geometric theorem is determined to be the third value.
  • the first value (such as 10) is greater than the second value (such as 1), and the second value is greater than the third value (such as 0).
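The three-tier incentive can be sketched as follows; the condition-string format (e.g. "AC=5") and the query form are assumptions for illustration only:

```python
def theorem_reward(new_info, query, known_conditions,
                   first=10, second=1, third=0):
    """Three-tier excitation for the theorem applied at the current moment.

    first  (e.g. 10): the derived information answers the geometric problem
    second (e.g. 1):  it is a new known geometric condition
    third  (e.g. 0):  other information, e.g. a condition already known
    """
    if new_info.startswith(query + "="):    # answers the geometric problem
        return first
    if new_info not in known_conditions:    # new known condition
        return second
    return third                            # everything else

known = {"AB=3", "BC=4"}
print(theorem_reward("AC=5", "AC", known))  # 10
```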
  • Compare all the first candidate values, select the maximum value among them, and attenuate that maximum value (i.e., calculate the product between it and a preset attenuation factor, where the attenuation factor is in (0, 1)) to obtain the second candidate value.
  • the sum of the reward and the second candidate value is calculated as the second target value of the time-difference target, expressed as r_t + γ·max_{a∈A} Q(s_{t+1}, a; θ), where r_t is the reward and γ is the attenuation factor.
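A minimal sketch of the time-difference target r_t + γ·max_{a∈A} Q(s_{t+1}, a; θ), with the first candidate values supplied as plain numbers (in practice they would come from the Q-network):

```python
def td_target(reward, next_q_values, gamma=0.9):
    """Second target value: reward plus the attenuated maximum candidate.

    reward:        reward r_t observed for the theorem chosen at moment t
    next_q_values: first candidate values Q(s_{t+1}, a) over all theorems
    gamma:         attenuation factor in (0, 1)
    """
    second_candidate = gamma * max(next_q_values)  # attenuated maximum
    return reward + second_candidate

print(td_target(1, [0.5, 2.0, 1.25], gamma=0.9))  # 1 + 0.9 * 2.0 = 2.8
```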
  • Step 207 Calculate the difference between the first target value and the second target value as the loss value.
  • the first target value and the second target value are both value estimates of the optimal action (i.e., the selection geometry theorem) by the reinforcement learning model (such as DQN), and the second target value is based on the observed incentives.
  • the second target value is closer to the actual result. Therefore, the goal of training the reinforcement learning model (such as DQN) can be set to encourage the first target value to approach the second target value.
  • the preset loss function can be called, and the first target value and the second target value can be substituted to calculate the difference between the first target value and the second target value, which is recorded as the loss value.
  • the second target value is subtracted from the first target value to obtain the value difference, and the square of the value difference is multiplied by a preset coefficient to obtain the loss value.
  • the loss function is expressed as L(θ) = c·(q_t - y_t)², where:
  • L(θ) is the loss value
  • c is the coefficient, which is generally in (0, 1), such as 1/2
  • q_t is the first target value
  • y_t is the second target value.
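The loss can be sketched directly from the formula above:

```python
def td_loss(first_target, second_target, coeff=0.5):
    """L(theta) = coeff * (q_t - y_t)^2: squared difference between the
    first target value q_t and the second target value y_t."""
    value_diff = first_target - second_target
    return coeff * value_diff ** 2

print(td_loss(3.0, 2.0))  # 0.5 * (3.0 - 2.0)^2 = 0.5
```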
  • Step 208 Update the reinforcement learning model according to the loss value.
  • the reinforcement learning model (such as DQN) can be back-propagated.
  • the loss value is substituted into optimization algorithms such as SGD (stochastic gradient descent) or Adam (adaptive moment estimation) to calculate the gradients of the parameters in the reinforcement learning model (such as DQN), and the parameters are updated according to the gradients.
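To illustrate the update rule, the following sketch applies plain SGD to a one-parameter linear Q-function; this is a deliberate simplification, since the actual action-value function would be a neural network trained by a framework's optimizer:

```python
def sgd_step(theta, feature, second_target, lr=0.1, coeff=0.5):
    """One SGD step on L = coeff * (Q - y)^2 for a linear Q = theta * feature.

    dL/dtheta = 2 * coeff * (Q - y) * feature; with coeff = 1/2 this is just
    the value difference times the feature.
    """
    q = theta * feature                      # first target value
    grad = 2 * coeff * (q - second_target) * feature
    return theta - lr * grad                 # move against the gradient

theta = 0.0
for _ in range(50):   # repeated updates pull Q toward the TD target 2.8
    theta = sgd_step(theta, feature=1.0, second_target=2.8)
print(abs(theta - 2.8) < 0.05)  # True: theta has converged near the target
```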
  • the above-mentioned DQN algorithm is only used as an example of a reinforcement learning model.
  • other reinforcement learning models can be set according to actual conditions, for example, the SARSA algorithm (a temporal-difference method), the DDPG (Deep Deterministic Policy Gradient) algorithm, the A3C (Asynchronous Advantage Actor-Critic) algorithm, the NAF (Normalized Advantage Functions) algorithm, the TRPO (Trust Region Policy Optimization) algorithm, the PPO (Proximal Policy Optimization) algorithm, etc., which are not limited in this embodiment.
  • SARSA (a temporal-difference method)
  • DDPG (Deep Deterministic Policy Gradient)
  • A3C (Asynchronous Advantage Actor-Critic)
  • NAF (Normalized Advantage Functions)
  • TRPO (Trust Region Policy Optimization)
  • PPO (Proximal Policy Optimization)
  • Step 209 Apply the current geometric theorem to the current geometric conditions to infer new geometric conditions.
  • the logic codes for implementing various geometric theorems can be pre-encapsulated as various application programming interfaces (Application Program Interface, API) on the server side, and interface specifications can be provided.
  • API Application Program Interface
  • the target interface may be queried according to the current geometric theorem, wherein the target interface is an application programming interface encapsulated by a logic code for implementing the current geometric theorem.
  • the geometric conditions applicable to the geometric theorem can be selected from the current geometric conditions according to the interface specification as the target conditions.
  • for example, when the current geometric theorem is the Pythagorean theorem, any two side lengths of a right triangle can be selected from the current geometric conditions as the target condition.
  • the target condition is packaged into an inference request according to the interface specification, and the inference request is sent to the target interface to call the logic code to calculate the target condition according to the geometric theorem and return a new geometric condition.
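A minimal in-process sketch of the interface-per-theorem idea; the registry below stands in for the server-side application programming interfaces, and the request format is an assumption of this sketch:

```python
import math

# registry standing in for the server-side application programming interfaces;
# each entry maps a geometric theorem to its encapsulated logic code
THEOREM_APIS = {
    "Pythagorean theorem": lambda a, b: math.hypot(a, b),
}

def infer(theorem, target_condition):
    """Query the target interface for the theorem, package the target
    condition into an inference request, and return the new condition."""
    api = THEOREM_APIS[theorem]                       # target interface lookup
    request = {"theorem": theorem, "args": target_condition}
    return api(*request["args"])                      # new geometric condition

# two side lengths of a right triangle as the target condition
print(infer("Pythagorean theorem", (3, 4)))  # 5.0
```

In the method itself the request would be sent over the network to the target interface; the in-process call keeps the sketch self-contained.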
  • Step 210 determine whether the new geometric condition is the answer to the geometric problem; if so, execute step 211, if not, return to execute steps 203-210.
  • Step 211 determine that the iterative reinforcement learning model ends, and output the new geometric conditions as the answer to the geometric problem.
  • the new geometric conditions can be compared with the geometric problem. If the new geometric conditions match the geometric problem, the new geometric conditions can be confirmed as the answer to the geometric problem, and the iteration of the reinforcement learning model is stopped. The new geometric conditions are output to the user as the answer to the geometric problem. If the new geometric conditions do not match the geometric problem, the new geometric conditions can be added to the known geometric conditions, and the reinforcement learning model is used to enter the next iteration of learning.
  • the current geometric theorem can be confirmed to be an incorrect geometric theorem, and other geometric theorems can be selected for inference.
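The iterate-until-answer loop of Steps 209-211 can be sketched as follows, with a toy rule standing in for both the learned theorem selection and the theorem interfaces (the condition-string format is an assumption of this sketch):

```python
def solve(known, query, select_theorem, apply_theorem, max_iters=10):
    """Iterate: select a theorem, infer a new condition, stop when it answers
    the geometric problem; otherwise add it to the known conditions."""
    for _ in range(max_iters):
        theorem = select_theorem(known)            # stands in for the model
        new_condition = apply_theorem(theorem, known)
        if new_condition.startswith(query + "="):  # answer to the problem?
            return new_condition
        known = known | {new_condition}            # enter the next iteration
    return None

def toy_apply(theorem, known):
    """Toy rule: derive AC from AB and BC via the Pythagorean theorem."""
    vals = dict(c.split("=", 1) for c in known)
    if theorem == "Pythagorean theorem" and "AB" in vals and "BC" in vals:
        return "AC=%g" % ((float(vals["AB"]) ** 2
                           + float(vals["BC"]) ** 2) ** 0.5)
    return "no new condition=0"

print(solve({"AB=3", "BC=4"}, "AC",
            lambda known: "Pythagorean theorem", toy_apply))  # AC=5
```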
  • Step 212 Encapsulate the answer into the derivation information.
  • Step 213 Send the derivation information to the client to associate the geometry exercise with the derivation information for display.
  • the process of using geometric theorems to gradually infer is instructive for the user's learning. Therefore, the answer can be encapsulated into the derivation information, where the derivation information sequentially displays the process of applying the current geometric theorem to the current geometric conditions to infer new geometric conditions in the order of the iterative reinforcement learning model until the answer is obtained.
  • the server encapsulates the derivation information into a problem-solving response and sends the problem-solving response to the client.
  • the client receives the problem-solving response, it parses the derivation information from the problem-solving response, associates the geometry exercise with the derivation information and displays it, prompting the user to solve the geometry exercise according to the derivation information to obtain the answer.
  • the electronic geometry exercise shown in FIG. 4A has stem information (text information) and a legend (image data).
  • the geometric theorem learned by DQN is the Pythagorean theorem.
  • DQN learns the geometric theorem a_{t+1} as the equal area method.
  • FIG5 is a flow chart of a geometry problem-solving method provided in the third embodiment of the present application. This embodiment is applicable to the case where a reinforcement learning model is used to solve geometry electronic exercises in the process of building a question bank.
  • the method can be executed by a geometry problem-solving device, which can be implemented in the form of hardware and/or software.
  • the geometry problem-solving device can be configured in an electronic device, which can be a server. As shown in FIG5, the method includes:
  • Step 501 Search for electronic exercises in the subject of geometry from the question bank.
  • the administrator can log in to the learning platform, select the question bank management function, and batch select and import electronic exercises belonging to the subject of geometry.
  • These electronic exercises can be in the format of word, excel, audio data, video data, etc.
  • the interface can generally be divided into an input area and a check area.
  • the input area is used to edit the electronic exercises, and the check area is for confirming that the electronic exercises are correct and can be imported.
  • the number of electronic exercises that can be successfully identified can be intelligently detected.
  • the electronic exercises that are not successfully identified can be checked and compared according to the prompts given by the system, and the template can be re-edited and imported (it can be edited directly in the input area).
  • the repeated electronic exercises can be displayed. The user can choose to remove the repeated electronic exercises or choose to import the repeated test questions.
  • Step 502 If the electronic exercise lacks answers and/or derivation processes, known geometric conditions and geometric problems to be solved are extracted from the electronic exercise.
  • the server can periodically detect whether the electronic exercises in the question bank are missing answers and/or derivation processes. When the electronic exercises are missing answers and/or derivation processes, the electronic exercises are automatically solved to obtain the answers and/or derivation processes of the electronic exercises.
  • managers can also actively request the server to solve the electronic exercises and obtain the answers and/or derivation processes of the electronic exercises when browsing and finding that a certain electronic exercise is missing answers and/or derivation processes.
  • an electronic exercise has first text information. Then, in this method, a regular expression is determined, and the regular expression is used to describe the matching pattern of known geometric conditions or geometric problems to be solved; the regular expression is matched with the first text information; and the first text information that is successfully matched is determined to be a known geometric condition or a geometric problem to be solved.
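A minimal sketch of the regular-expression matching described above; the two patterns are illustrative assumptions, not patterns defined by the method:

```python
import re

# illustrative matching patterns: a known condition like "AB = 3"
# and a geometric problem to be solved like "find AC"
CONDITION_RE = re.compile(r"([A-Z]{2})\s*=\s*(\d+(?:\.\d+)?)")
QUESTION_RE = re.compile(r"find\s+([A-Z]{2})", re.IGNORECASE)

def extract(first_text):
    """Return (known geometric conditions, geometric problem to be solved)."""
    conditions = ["%s=%s" % m for m in CONDITION_RE.findall(first_text)]
    question = QUESTION_RE.search(first_text)
    return conditions, question.group(1) if question else None

print(extract("In right triangle ABC, AB = 3 and BC = 4; find AC."))
# (['AB=3', 'BC=4'], 'AC')
```

Real question-bank text would need a family of such patterns, one per phrasing of condition or question.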
  • the electronic exercises have image data.
  • at least one of symbols, numbers, and letters can be used as a target and detected in the image data; if the area where at least one of the symbols, numbers, and letters is located is detected, optical character recognition is performed on the area to obtain second text information; geometric figures are identified in the image data; the second text information is assigned to the geometric figures to obtain known geometric conditions.
  • Step 503 During each iteration of the reinforcement learning model, the current geometric conditions are input into the reinforcement learning model for learning to obtain a geometric theorem that is compatible with the current geometric conditions.
  • the current geometric conditions are set as the state of the environment in the reinforcement learning model, and the geometric theorem is set as the action in the reinforcement learning model; the reinforcement learning model is executed, and the value of applying all geometric theorems to the learned geometric conditions is used as the first target value; and the geometric theorem that is compatible with the current geometric conditions is selected according to the first target value.
  • the second target value as the time-difference target under the current geometric conditions is calculated; the difference between the first target value and the second target value is calculated as the loss value; and the reinforcement learning model is updated according to the loss value.
  • the incentive for the geometric theorem under the geometric conditions at the current moment can be determined; the reinforcement learning model is executed to learn the value of applying all geometric theorems under the geometric conditions at the next moment as the first candidate value; the maximum value among all first candidate values is attenuated to obtain the second candidate value; the sum between the incentive and the second candidate value is calculated as the second target value of the time-difference target.
  • When determining the excitation to the geometric theorem under the geometric conditions at the current moment: if the geometric information is the answer to the geometric problem, the excitation to the geometric theorem is determined to be a first value; if the geometric information is a new known geometric condition, the excitation to the geometric theorem is determined to be a second value; if the geometric information is other information except the answer to the geometric problem and the new known geometric condition, the excitation to the geometric theorem is determined to be a third value; wherein the first value is greater than the second value, and the second value is greater than the third value.
  • When calculating the difference between the first target value and the second target value as the loss value, the second target value can be subtracted from the first target value to obtain the value difference; the square of the value difference can be multiplied by a preset coefficient to obtain the loss value.
  • Step 504 Apply the current geometric theorem to the current geometric conditions to infer new geometric conditions.
  • a target interface can be queried, which is an application programming interface that encapsulates the logic code for implementing the current geometric theorem; a geometric condition applicable to the geometric theorem is selected from the current geometric conditions as the target condition; the target condition is packaged into an inference request; and the inference request is sent to the target interface to call the logic code to operate on the target condition according to the geometric theorem and return a new geometric condition.
  • Step 505 When the iterative reinforcement learning model ends, the new geometric condition is used as the answer to the geometric problem.
  • Step 506 Store the mapping relationship between the electronic exercises and the derivation information in the question bank.
  • the answer can be encapsulated into the derivation information, wherein the derivation information sequentially displays the process of applying the current geometric theorem to the current geometric conditions to infer new geometric conditions in the order of the iterative reinforcement learning model until the answer is obtained.
  • the mapping relationship between the electronic exercises and the derivation information is stored in the question bank, indicating that the geometry exercises are solved according to the derivation information to obtain the answers.
  • the mapping relationship can be provided to the management personnel for verification.
  • the server searches for electronic exercises belonging to the subject of geometry from the question bank; if the electronic exercises lack answers and/or derivation processes, known geometric conditions and geometric problems to be solved are extracted from the electronic exercises; in each iteration of the reinforcement learning model, the current geometric conditions are input into the reinforcement learning model for learning to obtain geometric theorems that are compatible with the current geometric conditions; the current geometric theorems are applied to the current geometric conditions to infer new geometric conditions; when the iterative reinforcement learning model ends, the new geometric conditions are used as answers to the geometric problems; the mapping relationship between the electronic exercises and the derivation information is stored in the question bank, and the derivation information is a process of applying the current geometric theorem to the current geometric conditions to infer new geometric conditions in the order of the iterative reinforcement learning model until the answer is obtained.
  • This embodiment proposes a reasoning-rich geometry problem-solving framework with high compatibility and extensibility.
  • a reinforcement learning model is introduced into the geometry problem-solving framework to learn geometry theorems.
  • the framework has clear logic and strong analyzability, and the correlation between geometry theorems is maintained, thereby improving the accuracy of geometry problem-solving.
  • this embodiment can predict geometry theorems step by step for inference, and can describe the inference process, which is more in line with the user's learning process and achieves the effect of knowing not only the facts but also the reasons.
  • the derivation information of geometry problem-solving is recorded in the question bank for verification by management personnel, which can greatly reduce the operations of users in solving problems and building question banks, thereby greatly improving the efficiency of building question banks.
  • FIG6 is a schematic diagram of the structure of a geometric problem-solving device provided in Embodiment 4 of the present application. As shown in FIG6 , the device is applied to a server, and includes:
  • the problem-solving request receiving module 601 is used to receive a problem-solving request sent by a client for an electronic exercise in the subject of geometry;
  • the exercise information extraction module 602 is used to extract known geometric conditions and geometric problems to be solved from the electronic exercises in response to the problem-solving request;
  • a geometric theorem learning module 603 is used to input the current geometric conditions into the reinforcement learning model for learning each time the reinforcement learning model is iterated, so as to obtain a geometric theorem adapted to the current geometric conditions;
  • a geometric condition inference module 604 used for applying the current geometric theorem to the current geometric condition to infer a new geometric condition
  • An answer determination module 605 is used to determine the new geometric condition as the answer to the geometric problem when the iteration of the reinforcement learning model ends;
  • the answer sending module 606 is used to send the answer to the client so as to associate the geometry exercise with the answer for display.
  • the problem-solving request receiving module 601 includes:
  • the client request receiving module is used to receive the problem-solving request sent by the client;
  • An image data extraction module used to extract image data from the problem-solving request
  • the electronic exercise reading module is used to read the electronic exercises belonging to the subject of geometry in the image data.
  • the electronic exercise has first text information
  • the exercise information extraction module 602 includes:
  • a regular expression determination module used to determine a regular expression, wherein the regular expression is used to describe a matching pattern of a known geometric condition or a geometric problem to be solved;
  • a regular expression matching module used for matching the regular expression with the first text information
  • the matching success determination module is used to determine whether the first text information that has been matched successfully is a known geometric condition or a geometric problem to be solved.
  • the electronic exercise has image data
  • the exercise information extraction module 602 includes:
  • a target detection module configured to detect at least one of a symbol, a number, and a letter as a target in the image data
  • an optical character recognition module configured to perform optical character recognition on an area where at least one of the symbol, the number, and the letter is located to obtain second text information if the area is detected;
  • a geometric figure recognition module used for recognizing geometric figures in the image data
  • the text assignment module is used to assign the second text information to the geometric figure to obtain the known geometric conditions.
  • the geometric theorem learning module 603 includes:
  • a reinforcement learning module setting module used to set the current geometric condition as the state of the environment in the reinforcement learning model and the geometric theorem as the action in the reinforcement learning model in each iteration of the reinforcement learning model;
  • a reinforcement learning module execution module used for executing the reinforcement learning model, learning the value of applying all the geometric theorems to the geometric conditions as the first target value
  • a geometric theorem selection module is used to select the geometric theorem that is compatible with the current geometric conditions according to the first target value.
  • the geometric theorem learning module 603 further includes:
  • a target calculation module used for calculating a second target value as a time-difference target under the current geometric conditions
  • a loss value calculation module used for calculating the difference between the first target value and the second target value as a loss value
  • a reinforcement learning model updating module is used to update the reinforcement learning model according to the loss value.
  • the target calculation module includes:
  • An excitation determination module used to determine the excitation for the geometric theorem under the geometric conditions at the current moment
  • a first candidate value calculation module used for executing the reinforcement learning model to learn the value of applying all the geometric theorems under the geometric conditions at the next moment as the first candidate value
  • a second candidate value calculation module used for attenuating the maximum value among all the first candidate values to obtain a second candidate value
  • the target value calculation module is used to calculate the sum of the incentive and the second candidate value as the second target value of the time difference target.
  • the incentive determination module includes:
  • a first value determination module configured to determine, under the geometric condition at a current moment, if the geometric information is an answer to the geometric problem, that the stimulus for the geometric theorem is a first value
  • a second value determination module configured to determine, under the geometric condition at the current moment, if the geometric information is a new and known geometric condition, that the excitation to the geometric theorem is a second value
  • a third value determination module configured to determine, under the geometric condition at the current moment, if the geometric information is other information except the answer to the geometric problem and the new known geometric condition, that the stimulus for the geometric theorem is a third value
  • the first value is greater than the second value, and the second value is greater than the third value.
  • the loss value calculation module includes:
  • a value difference calculation module configured to obtain a value difference by subtracting the second target value from the first target value
  • the value difference processing module is used to multiply the square of the value difference by a preset coefficient to obtain a loss value.
  • the geometric condition inference module 604 includes:
  • An interface query module used for querying a target interface, wherein the target interface is an application programming interface encapsulated by a logic code for implementing the current geometric theorem;
  • a target condition selection module used for selecting the geometric condition applicable to the geometric theorem from the current geometric conditions as the target condition
  • An inference request packaging module used for packaging the target condition into an inference request
  • the interface calling module is used to send the inference request to the target interface to call the logic code to calculate the target condition according to the geometric theorem and return a new geometric condition.
  • the answer determination module 605 includes:
  • An answer judgment module is used to judge whether the new geometric condition is the answer to the geometric problem; if so, the geometric condition output module is called; if not, the geometric theorem learning module 603 and the geometric condition inference module 604 are called back;
  • the geometric condition output module is used to determine the end of iterating the reinforcement learning model and output the new geometric condition as the answer to the geometric problem.
  • the answer sending module 606 includes:
  • a derivation information encapsulation module used to encapsulate the answer into derivation information, wherein the derivation information sequentially displays the process of applying the current geometric theorem to the current geometric condition to infer a new geometric condition in the order of iterating the reinforcement learning model until the answer is obtained;
  • the derivation information sending module is used to send the derivation information to the client so as to associate the geometry exercise with the derivation information for display.
  • the geometry problem-solving device provided in the embodiments of the present application can execute the geometry problem-solving method provided in any embodiment of the present application, and has the corresponding functional modules and beneficial effects for executing the geometry problem-solving method.
  • FIG. 7 is a schematic diagram of the structure of a geometry problem-solving device provided in Embodiment 5 of the present application. As shown in FIG. 7, the device is applied to a server and includes:
  • the electronic exercise search module 701 is used to search for electronic exercises belonging to the subject of geometry from the question bank;
  • the exercise information extraction module 702 is used to extract known geometric conditions and geometric problems to be solved from the electronic exercise if the electronic exercise lacks answers and/or derivation processes;
  • a geometric theorem learning module 703 is used to input the current geometric condition into the reinforcement learning model for learning each time the reinforcement learning model is iterated, so as to obtain a geometric theorem adapted to the current geometric condition;
  • a geometric condition inference module 704 used for applying the current geometric theorem to the current geometric condition to infer a new geometric condition
  • An answer determination module 705 is used to set the new geometric condition as the answer to the geometric problem when the iteration of the reinforcement learning model ends;
  • the exercise storage module 706 is used to store the mapping relationship between the electronic exercises and the derivation information in the question bank, and the derivation information is a process of applying the current geometric theorem to the current geometric conditions to infer new geometric conditions in the order of iterating the reinforcement learning model until the answer is obtained.
  • the electronic exercise has first text information
  • the exercise information extraction module 702 includes:
  • a regular expression determination module used to determine a regular expression, wherein the regular expression is used to describe a matching pattern of a known geometric condition or a geometric problem to be solved;
  • a regular expression matching module used for matching the regular expression with the first text information
  • the matching success determination module is used to determine whether the first text information that has been matched successfully is a known geometric condition or a geometric problem to be solved.
  • the electronic exercise has image data
  • the exercise information extraction module 702 includes:
  • a target detection module configured to detect at least one of a symbol, a number, and a letter as a target in the image data
  • an optical character recognition module configured to perform optical character recognition on an area where at least one of the symbol, the number, and the letter is located to obtain second text information if the area is detected;
  • a geometric figure recognition module used for recognizing geometric figures in the image data
  • the text assignment module is used to assign the second text information to the geometric figure to obtain the known geometric conditions.
  • the geometric theorem learning module 703 includes:
  • a reinforcement learning module setting module used to set the current geometric condition as the state of the environment in the reinforcement learning model and the geometric theorem as the action in the reinforcement learning model in each iteration of the reinforcement learning model;
  • a reinforcement learning module execution module used for executing the reinforcement learning model, learning the value of applying all the geometric theorems to the geometric conditions as the first target value
  • a geometric theorem selection module is used to select the geometric theorem that is compatible with the current geometric conditions according to the first target value.
  • the geometric theorem learning module 703 further includes:
  • a target calculation module configured to calculate a second target value serving as the temporal-difference target under the current geometric conditions;
  • a loss value calculation module used for calculating the difference between the first target value and the second target value as a loss value
  • a reinforcement learning model updating module is used to update the reinforcement learning model according to the loss value.
  • the target calculation module includes:
  • a reward determination module configured to determine the reward for the geometric theorem under the geometric conditions at the current moment;
  • a first candidate value calculation module used for executing the reinforcement learning model to learn the value of applying all the geometric theorems under the geometric conditions at the next moment as the first candidate value
  • a second candidate value calculation module used for attenuating the maximum value among all the first candidate values to obtain a second candidate value
  • the target value calculation module is configured to calculate the sum of the reward and the second candidate value as the second target value, i.e., the temporal-difference target.
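Put together, the second target value of these modules is the classic temporal-difference target: the reward plus the attenuated (discounted) maximum of the first candidate values at the next moment. A sketch, with the discount factor of 0.9 as an illustrative assumption:

```python
def second_target_value(reward: float, first_candidate_values: list,
                        discount: float = 0.9) -> float:
    """TD target: reward + discount * max over next-moment candidate values.
    The discount implements the 'attenuation' of the maximum value."""
    second_candidate_value = discount * max(first_candidate_values)
    return reward + second_candidate_value
```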
  • the reward determination module includes:
  • a first value determination module configured to determine that the reward for the geometric theorem is a first value if, under the geometric condition at the current moment, the geometric information is the answer to the geometric problem;
  • a second value determination module configured to determine that the reward for the geometric theorem is a second value if, under the geometric condition at the current moment, the geometric information is a newly obtained known geometric condition;
  • a third value determination module configured to determine that the reward for the geometric theorem is a third value if, under the geometric condition at the current moment, the geometric information is information other than the answer to the geometric problem and a newly obtained known geometric condition;
  • the first value is greater than the second value, and the second value is greater than the third value.
  • the loss value calculation module includes:
  • a value difference calculation module configured to obtain a value difference by subtracting the second target value from the first target value
  • the value difference processing module is used to multiply the square of the value difference by a preset coefficient to obtain a loss value.
  • the geometric condition inference module 704 includes:
  • an interface query module configured to query a target interface, wherein the target interface is an application programming interface encapsulating the logic code that implements the current geometric theorem;
  • a target condition selection module used for selecting the geometric condition applicable to the geometric theorem from the current geometric conditions as the target condition
  • An inference request packaging module used for packaging the target condition into an inference request
  • the interface calling module is used to send the inference request to the target interface to call the logic code to calculate the target condition according to the geometric theorem and return a new geometric condition.
  • the answer determination module 705 includes:
  • an answer judgment module configured to judge whether the new geometric condition is the answer to the geometric problem; if so, the geometric condition output module is invoked; if not, the geometric theorem learning module 703 and the geometric condition inference module 704 are invoked again;
  • the geometric condition output module is used to determine the end of iterating the reinforcement learning model and output the new geometric condition as the answer to the geometric problem.
  • the geometry problem-solving device provided in the embodiments of the present application can execute the geometry problem-solving method provided in any embodiment of the present application, and has the corresponding functional modules and beneficial effects for executing the geometry problem-solving method.
  • FIG. 8 shows a block diagram of an electronic device 10 that can be used to implement an embodiment of the present application.
  • the electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • the electronic device can also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smartphones, wearable devices (such as helmets, glasses, watches, etc.), and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are merely examples and are not intended to limit the implementation of the present application described and/or required herein.
  • the electronic device 10 includes at least one processor 11, and a memory connected to the at least one processor 11, such as a read-only memory (ROM) 12, a random access memory (RAM) 13, etc., wherein the memory stores a computer program that can be executed by at least one processor, and the processor 11 can perform various appropriate actions and processes according to the computer program stored in the read-only memory (ROM) 12 or the computer program loaded from the storage unit 18 to the random access memory (RAM) 13.
  • the RAM 13 can also store various programs and data required for the operation of the electronic device 10.
  • the processor 11, ROM 12 and RAM 13 are connected to each other through a bus 14.
  • An input/output (I/O) interface 15 is also connected to the bus 14.
  • a number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16, such as a keyboard, a mouse, etc.; an output unit 17, such as various types of displays, speakers, etc.; a storage unit 18, such as a disk, an optical disk, etc.; and a communication unit 19, such as a network card, a modem, a wireless communication transceiver, etc.
  • the communication unit 19 allows the electronic device 10 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
  • the processor 11 may be a variety of general and/or special processing components with processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special artificial intelligence (AI) computing chips, various processors running machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, microcontroller, etc.
  • the processor 11 executes the various methods and processes described above, such as a geometry problem-solving method.
  • the geometric problem-solving method may be implemented as a computer program, which is tangibly contained in a computer-readable storage medium, such as a storage unit 18.
  • part or all of the computer program may be loaded and/or installed on the electronic device 10 via the ROM 12 and/or the communication unit 19.
  • the processor 11 may be configured to execute the geometric problem-solving method in any other suitable manner (e.g., by means of firmware).
  • Various implementations of the systems and techniques described above herein can be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
  • Various implementations can include: being implemented in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which can be a special purpose or general purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • the computer programs for implementing the methods of the present application may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing device, so that when the computer programs are executed by the processor, the functions/operations specified in the flow charts and/or block diagrams are implemented.
  • the computer programs may be executed entirely on the machine, partially on the machine, partially on the machine and partially on a remote machine as a stand-alone software package, or entirely on a remote machine or server.
  • a computer-readable storage medium may be a tangible medium that may contain or store a computer program for use by or in conjunction with an instruction execution system, device, or equipment.
  • a computer-readable storage medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, devices, or equipment, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be a machine-readable signal medium.
  • a more specific example of a machine-readable storage medium may include an electrical connection based on one or more lines, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • the systems and techniques described herein may be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (e.g., a mouse or trackball) through which the user can provide input to the electronic device.
  • Other types of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form (including acoustic input, voice input, or tactile input).
  • the systems and techniques described herein may be implemented in a computing system that includes backend components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes frontend components (e.g., a user computer with a graphical user interface or a web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such backend components, middleware components, or frontend components.
  • the components of the system may be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: a local area network (LAN), a wide area network (WAN), a blockchain network, and the Internet.
  • a computing system may include a client and a server.
  • the client and the server are generally remote from each other and usually interact through a communication network.
  • the client and server relationship is generated by computer programs running on the corresponding computers and having a client-server relationship with each other.
  • the server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system intended to overcome the defects of difficult management and weak business scalability in traditional physical hosts and VPS (Virtual Private Server) services.
  • An embodiment of the present application also provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, it implements the geometry problem-solving method provided in any embodiment of the present application.
  • the computer program product can be written in one or more programming languages or a combination thereof to perform the computer program code of the present application, and the programming language includes an object-oriented programming language, such as Java, Smalltalk, C++, and also includes a conventional procedural programming language, such as "C" language or similar programming language.
  • the program code can be executed entirely on the user's computer, partially on the user's computer, as an independent software package, partially on the user's computer and partially on the remote computer, or completely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (for example, using an Internet service provider to connect through the Internet).

Landscapes

  • Electrically Operated Instructional Devices (AREA)

Abstract

The present application discloses a geometry problem-solving method, apparatus, device and storage medium. The method includes: receiving a problem-solving request sent by a client for an electronic exercise whose subject is geometry; in response to the problem-solving request, extracting the known geometric conditions and the geometric problem to be solved from the electronic exercise; in each iteration of a reinforcement learning model, inputting the current geometric conditions into the reinforcement learning model for learning to obtain a geometric theorem adapted to the current geometric conditions; applying the current geometric theorem to the current geometric conditions to infer new geometric conditions; when the iteration of the reinforcement learning model ends, determining the new geometric condition as the answer to the geometric problem; and sending the answer to the client so that the geometry exercise is displayed in association with the answer. By introducing a reinforcement learning model to learn geometric theorems within the geometry problem-solving framework, this embodiment achieves clear logic and strong interpretability, preserves the associations between geometric theorems, and improves the accuracy of geometry problem solving.

Description

A geometry problem-solving method, apparatus, device and storage medium

Technical Field

The present application relates to the technical field of machine learning, and in particular to a geometry problem-solving method, apparatus, device and storage medium.
Background Art

Geometry is a subject in mathematics. Geometry exercises require users to solve problems based on the stem information and geometric figures. This type of exercise requires students to have good geometric thinking and to be familiar with the knowledge points in geometry, and can test students' comprehensive mathematical ability.
Summary of the Invention

The present application provides a geometry problem-solving method, apparatus, device and storage medium, to solve the problem of how to improve the accuracy of solving geometry exercises.

According to one aspect of the present application, a geometry problem-solving method is provided, applied to a server, the method including:

receiving a problem-solving request sent by a client for an electronic exercise whose subject is geometry;

in response to the problem-solving request, extracting known geometric conditions and a geometric problem to be solved from the electronic exercise;

in each iteration of a reinforcement learning model, inputting the current geometric conditions into the reinforcement learning model for learning, to obtain a geometric theorem adapted to the current geometric conditions;

applying the current geometric theorem to the current geometric conditions to infer new geometric conditions;

when the iteration of the reinforcement learning model ends, determining the new geometric condition as the answer to the geometric problem;

sending the answer to the client so that the geometry exercise is displayed in association with the answer.
According to another aspect of the present application, a geometry problem-solving method is provided, applied to a server, the method including:

searching a question bank for an electronic exercise whose subject is geometry;

if the electronic exercise lacks an answer and/or a derivation process, extracting known geometric conditions and a geometric problem to be solved from the electronic exercise;

in each iteration of a reinforcement learning model, inputting the current geometric conditions into the reinforcement learning model for learning, to obtain a geometric theorem adapted to the current geometric conditions;

applying the current geometric theorem to the current geometric conditions to infer new geometric conditions;

when the iteration of the reinforcement learning model ends, taking the new geometric condition as the answer to the geometric problem;

storing, in the question bank, a mapping relationship between the electronic exercise and derivation information, the derivation information being the process, presented step by step in the order of iterating the reinforcement learning model, of applying the current geometric theorem to the current geometric conditions to infer new geometric conditions until the answer is obtained.
According to another aspect of the present application, a geometry problem-solving apparatus is provided, applied to a server, the apparatus including:

a problem-solving request receiving module configured to receive a problem-solving request sent by a client for an electronic exercise whose subject is geometry;

an exercise information extraction module configured to, in response to the problem-solving request, extract known geometric conditions and a geometric problem to be solved from the electronic exercise;

a geometric theorem learning module configured to, in each iteration of a reinforcement learning model, input the current geometric conditions into the reinforcement learning model for learning, to obtain a geometric theorem adapted to the current geometric conditions;

a geometric condition inference module configured to apply the current geometric theorem to the current geometric conditions to infer new geometric conditions;

an answer determination module configured to determine, when the iteration of the reinforcement learning model ends, the new geometric condition as the answer to the geometric problem;

an answer sending module configured to send the answer to the client so that the geometry exercise is displayed in association with the answer.
According to another aspect of the present application, a geometry problem-solving apparatus is provided, applied to a server, the apparatus including:

an electronic exercise search module configured to search a question bank for an electronic exercise whose subject is geometry;

an exercise information extraction module configured to, if the electronic exercise lacks an answer and/or a derivation process, extract known geometric conditions and a geometric problem to be solved from the electronic exercise;

a geometric theorem learning module configured to, in each iteration of a reinforcement learning model, input the current geometric conditions into the reinforcement learning model for learning, to obtain a geometric theorem adapted to the current geometric conditions;

a geometric condition inference module configured to apply the current geometric theorem to the current geometric conditions to infer new geometric conditions;

an answer determination module configured to take, when the iteration of the reinforcement learning model ends, the new geometric condition as the answer to the geometric problem;

an exercise storage module configured to store, in the question bank, a mapping relationship between the electronic exercise and derivation information, the derivation information being the process, presented step by step in the order of iterating the reinforcement learning model, of applying the current geometric theorem to the current geometric conditions to infer new geometric conditions until the answer is obtained.
According to another aspect of the present application, an electronic device is provided, the electronic device including:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can execute the geometry problem-solving method described in any embodiment of the present application.

According to another aspect of the present application, a computer-readable storage medium is provided, the computer-readable storage medium storing a computer program, the computer program being configured to cause a processor, when executed, to implement the geometry problem-solving method described in any embodiment of the present application.
In this embodiment, the server receives a problem-solving request sent by a client for an electronic exercise whose subject is geometry; in response to the request, it extracts the known geometric conditions and the geometric problem to be solved from the electronic exercise; in each iteration of the reinforcement learning model, it inputs the current geometric conditions into the reinforcement learning model for learning to obtain a geometric theorem adapted to the current geometric conditions; it applies the current geometric theorem to the current geometric conditions to infer new geometric conditions; when the iteration of the reinforcement learning model ends, it determines the new geometric condition as the answer to the geometric problem; and it sends the answer to the client so that the geometry exercise is displayed in association with the answer. This embodiment proposes a reasoning-rich geometry problem-solving framework with strong compatibility and extensibility. Introducing a reinforcement learning model into the framework to learn geometric theorems yields clear logic and strong interpretability, preserves the associations between geometric theorems, and improves the accuracy of geometry problem solving. In addition, this embodiment can predict geometric theorems and draw inferences step by step, and the inference process can be described, which better matches the way users learn, achieving the effect of knowing not only the result but also the reasoning behind it.
Brief Description of the Drawings

FIG. 1 is a flowchart of a geometry problem-solving method provided in Embodiment 1 of the present application;

FIG. 2 is a flowchart of a geometry problem-solving method provided in Embodiment 2 of the present application;

FIG. 3 is an example diagram of optical character recognition provided in Embodiment 2 of the present application;

FIG. 4A is an example diagram of a geometry electronic exercise provided in Embodiment 2 of the present application;

FIG. 4B is an example flow diagram of geometry problem solving provided in Embodiment 1 of the present application;

FIG. 5 is a flowchart of a geometry problem-solving method provided in Embodiment 3 of the present application;

FIG. 6 is a schematic structural diagram of a geometry problem-solving apparatus provided in Embodiment 4 of the present application;

FIG. 7 is a schematic structural diagram of a geometry problem-solving apparatus provided in Embodiment 5 of the present application;

FIG. 8 is a schematic structural diagram of an electronic device implementing the geometry problem-solving method of the embodiments of the present application.
Detailed Description

To enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the scope of protection of the present application.

It should be noted that the terms "first", "second", etc. in the specification, claims, and drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present application described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to those steps or units clearly listed, but may include other steps or units not clearly listed or inherent to the process, method, product, or device.
To help users understand the knowledge points in geometry, some learning applications use neural networks to analyze the solving process of geometry electronic exercises: the stem information of the exercise is fed into a text feature extractor to obtain text features, the geometric figure of the exercise is fed into an image feature extractor to obtain image features, the text and image features are fused, and the predicted result is obtained after passing through a decoder.

For machine learning, interpretability can be divided into two kinds: intrinsic interpretability and post hoc interpretability. Intrinsic interpretability requires limiting the complexity of the model, while post hoc interpretability requires analyzing the model's results after training.

If the complexity of a model is limited to a sufficient degree, all the decision processes in the model and their causes can be understood completely; this is intrinsic interpretability. Models satisfying this condition include linear/logistic regression, decision trees, naive Bayes, k-nearest neighbors, and so on. Neural networks, because of the complexity of their parameters, clearly do not satisfy this condition.

For post hoc interpretability, statistics-based methods are usually used to summarize model features, for example, feature importance analysis, feature visualization, and so on; alternatively, counterfactual methods modify the data to obtain different results for explanation. These methods all provide global interpretability of the model. Owing to the complexity of neural networks, providing global interpretability requires a large amount of data and extensive statistics based on that data, and thus involves a great deal of manual work. Although laborious, global interpretability is achievable for neural networks.

In practical applications, one often analyzes why a particular data point is predicted by the model as a particular value, for example credit granting in finance; this is local interpretability. For models such as random forests and GBDT (Gradient Boosting Decision Tree), such local interpretability is intrinsic: one simply looks at which branches the data passes through in the model.

For neural networks, such local interpretability is almost impossible to achieve. Although one can pass the data through the neural network and determine which neurons are activated, the meaning of each neuron and the meaning of clusters of neurons are unstable questions. It is often found that slightly modifying the value of some feature of the data can lead to a completely different prediction. This also makes so-called counterfactual analysis impractical for local interpretability. At the same time, this problem reveals the incompleteness of neural network results: there are infinitely many possible fits for the same data, and there is no guarantee that the resulting neural network can accurately handle unseen data of the same type.

Geometry electronic exercises are highly complex and involve many geometric theorems. Since geometric theorems mostly constitute knowledge points, geometry exercises are mostly solved with the aid of geometric theorems, whereas neural networks have weak interpretability and cannot incorporate geometric theorems, resulting in low accuracy of geometry problem solving.
Embodiment 1

FIG. 1 is a flowchart of a geometry problem-solving method provided in Embodiment 1 of the present application. This embodiment is applicable to the case of using a reinforcement learning model to solve a geometry electronic exercise selected by a user. The method can be executed by a geometry problem-solving apparatus, which can be implemented in the form of hardware and/or software and can be configured in an electronic device. As shown in FIG. 1, the method includes:

Step 101: Receive a problem-solving request sent by a client for an electronic exercise whose subject is geometry.

In teaching for geometry and other subjects, electronic exercises are an important learning resource that helps users consolidate, review, and test the knowledge they have learned. A large number of electronic exercises are stored in the question bank of a teaching platform, possibly on the order of tens of millions.

Geometry is a branch of mathematics that studies spatial structures and their properties.

In general, considering that the number of electronic exercises in each subject is large (up to tens of millions) and occupies enormous storage resources, and that the logic for filtering and solving electronic exercises is extensive and occasionally updated, an electronic device acts in the role of a server, storing the electronic exercises and maintaining the logic for filtering and solving them, so as to save storage resources and reduce updates to clients.

In this embodiment, the server provides users with a service for solving geometry electronic exercises. A user logs into the client, which can receive a geometry exercise selected by the user or filtered by the server and display it for the user to answer and practice. In some cases, certain geometry exercises lack answers; the user can then operate in the client to send a problem-solving request to the server, requesting the server to solve the electronic exercise whose subject is geometry.

Of course, besides solving geometry exercises, the server can also provide services such as recommending electronic exercises and solving exercises in other subjects, which this embodiment does not limit.

Illustratively, in an education scenario, users include teachers and students. On the one hand, a teacher can log into the client and, based on students' learning situations, select some or all students and notify the server to filter suitable geometry exercises for them, push those exercises to the clients the students are logged into for answering and practice, and, after the answering and practice, load for the students the server's process of solving the exercises. On the other hand, a student can log into the client, notify the server of a selected geometry exercise or ask the server to filter suitable exercises, have the exercises pushed to the client the student is logged into for answering and practice, and request the server to solve them.

Of course, in some business scenarios, the number of geometry exercises can be reduced to the order of hundreds of thousands, for example, exercises for a certain grade or for a certain knowledge point in geometry. The client can then download the geometry exercises from the server, maintain the logic for filtering and solving them, and provide the solving service to users, so that in offline scenarios users can still answer, practice, and browse the solving process, which this embodiment does not limit.
Step 102: In response to the problem-solving request, extract the known geometric conditions and the geometric problem to be solved from the electronic exercise.

In this embodiment, the electronic exercise can be preprocessed, and information that is key to solving it can be parsed out: the known conditions, recorded as geometric conditions, and the problem to be solved, recorded as the geometric problem.

The geometric conditions may include the lengths of sides, relationships between points and sides (e.g., midpoints), relationships between points and figures (e.g., the centroid of a triangle), the degrees of angles, relationships between sides and angles (e.g., angle bisectors), and so on.

The problems to be solved may include finding the length of a side, the relationship between a point and a side, the relationship between a point and a figure, the degree of an angle, the relationship between a side and an angle, and so on.

An electronic exercise includes parts such as stem information and option information, and geometry exercises come in many question types, for example, multiple-choice, fill-in-the-blank, and open-ended (also called solution) questions. The information contained differs by question type: multiple-choice questions usually include stem and option information, while fill-in-the-blank and open-ended questions usually include only stem information.

In general, the stem information contains the key information for solving the exercise, while the option information represents several possible answers. Therefore, in this embodiment, the option information can be filtered out, and the known geometric conditions and the geometric problem to be solved can be extracted from the stem information of the exercise.

For geometry exercises, in addition to text information, the stem information often combines formula data, character data, geometric figures, and other types of data, and the ways of processing these different types of data also differ.
Illustratively, formula data in mathematics and character data in English are usually recorded in specific formats so that they display correctly on a UI page, such as LaTeX (an electronic typesetting system based on an underlying programming language), HTML (HyperText Markup Language), or MathML (Mathematical Markup Language), and tags are produced during recording.

For example, consider the formula data for solving a quadratic equation in one unknown,

x = (-b ± √(b² - 4ac)) / (2a).

When recorded in MathML, the tag <math> records the start of the document; the tag <mi> records each identifier element (representing variables, function names, constants, etc.) such as x, b, a, c; the tag <mo> records operator elements such as =, ±, -; the tag <mfrac> records the expression as fraction mode; the tag <msup> records b² as superscript mode; and so on. When preprocessing the formula data, these tags can be removed.
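A minimal sketch of the tag-removal preprocessing described above, using a regular expression that deletes MathML/HTML-style tags and keeps only the visible symbols (a simplification for illustration; a production system would parse the markup rather than regex-strip it):

```python
import re

def strip_markup_tags(formula_markup: str) -> str:
    """Remove opening and closing tags such as <math>, <mi>, <mo>, <msup>."""
    return re.sub(r"</?[a-zA-Z][^>]*>", "", formula_markup)
```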
To facilitate parsing and processing by the electronic device, both the known geometric conditions and the geometric problem to be solved can be converted into a data format with a specific structure for representation.

For geometric conditions and geometric problems of different types (such as length, angle, etc.), data-format templates can be set, each containing one or more wildcards. Writing the keywords of the geometric condition or problem into the template's wildcards yields the condition or problem expressed in the structured data format.

For example, for a geometric condition of type length, the template is Equals(LengthOf(Line(\1)),\2), where the wildcard "\1" takes the letters denoting the line segment and the wildcard "\2" takes the segment's length.

For a geometric problem of type length, the template is Find(LengthOf(Line(\1))), where the wildcard "\1" takes the letters denoting the line segment.
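The wildcard templates above can be filled mechanically; the helper below substitutes \1, \2, ... with the extracted keywords (the function name is illustrative, not from the patent):

```python
def fill_template(template: str, *keywords: str) -> str:
    """Write the keywords into the template's wildcards \\1, \\2, ..."""
    for i, keyword in enumerate(keywords, start=1):
        template = template.replace("\\" + str(i), keyword)
    return template
```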
Step 103: In each iteration of the reinforcement learning model, input the current geometric conditions into the reinforcement learning model for learning to obtain a geometric theorem adapted to the current geometric conditions.

Geometry exercises involve a great deal of reasoning, during which geometric theorems are used as auxiliary information. Especially when a single exercise involves reasoning with multiple theorems, later derivations using a theorem depend on earlier ones, and there is strong correlation between the logical expressions.

Since neural networks are black-box models with weak interpretability, traditional neural-network-based solving methods have weak reasoning ability and cannot incorporate geometric theorems, fundamentally losing the associations between theorems, which results in poor solution quality that cannot be deployed in practice.

In this embodiment, a reinforcement learning model can be used to assist in solving geometry exercises. Doing so involves one or more iterations of learning; in each iteration, all the geometric conditions of the current iteration can be input into the reinforcement learning model for learning, and the model outputs a geometric theorem adapted to the current geometric conditions.

The reinforcement learning model is a model expressing reinforcement learning. Reinforcement learning means understanding information and obtaining a mapping from input to output, continually learning from past experience of solving geometry exercises to acquire knowledge, thereby avoiding a large number of labeled ground-truth labels. Feedback is given through a reward-and-punishment mechanism that evaluates whether the behavior of selecting a geometric theorem is good or bad, and through such feedback reinforcement learning "learns" on its own.

If the feedback on the current behavior of selecting a geometric theorem is "good", development in that direction is favored in the future; if the feedback is "bad", such behavior is avoided as much as possible in the future. That is, the label is not obtained directly but is summarized by the agent itself in practice.

The reinforcement learning model can be described by means such as a Markov Decision Process (MDP): the machine is in an environment, and each state is the machine's perception of the current environment; the machine influences the environment through actions, and when the machine executes an action, the environment transfers to another state with some probability; at the same time, the environment feeds back a reward to the machine according to an underlying reward function.
Step 104: Apply the current geometric theorem to the current geometric conditions to infer new geometric conditions.

In this embodiment, the selected geometric theorem can be applied to the geometric conditions for logical inference, and the new information obtained is recorded as new geometric conditions.

Step 105: When the iteration of the reinforcement learning model ends, determine the new geometric condition as the answer to the geometric problem.

If the iterative learning with the reinforcement learning model is completed, the new geometric condition output by the last iteration can be defined as the answer to the geometric problem.

Step 106: Send the answer to the client so that the geometry exercise is displayed in association with the answer.

The server encapsulates the answer in a problem-solving response and sends the response to the client; on receiving it, the client parses out the answer, displays the geometry exercise in association with the answer, and prompts the user that solving the exercise yields this answer.

In this embodiment, the server receives a problem-solving request sent by a client for an electronic exercise whose subject is geometry; in response to the request, it extracts the known geometric conditions and the geometric problem to be solved from the exercise; in each iteration of the reinforcement learning model, it inputs the current geometric conditions into the model for learning, obtaining a geometric theorem adapted to the current conditions; it applies the current theorem to the current conditions to infer new geometric conditions; when the iteration ends, it determines the new geometric condition as the answer to the geometric problem; and it sends the answer to the client so that the exercise is displayed in association with the answer. This embodiment proposes a reasoning-rich geometry problem-solving framework with strong compatibility and extensibility; introducing a reinforcement learning model into the framework to learn geometric theorems yields clear logic and strong interpretability, preserves the associations between geometric theorems, and improves the accuracy of geometry problem solving. In addition, this embodiment can predict geometric theorems and draw inferences step by step, and the inference process can be described, which better matches the way users learn, achieving the effect of knowing not only the result but also the reasoning behind it.
Embodiment 2

FIG. 2 is a flowchart of a geometry problem-solving method provided in Embodiment 2 of the present application. On the basis of the above embodiment, this embodiment refines the derivation process of the reinforcement learning model. As shown in FIG. 2, the method includes:

Step 201: Receive a problem-solving request sent by a client for an electronic exercise whose subject is geometry.

Devices used by users, such as learning machines and mobile terminals (e.g., mobile phones, tablets, digital assistants), may run operating systems including Android, iOS, Windows, etc., and can install application programs supporting answering electronic exercises. Such an application can be a client that independently provides learning services; a functional module (such as an SDK (Software Development Kit)) providing learning services inside another client, such as an instant messaging tool or an industry work client; or a client with a browsing component, which can include a browser or an application configured with a browsing component (such as WebView). This embodiment does not limit this.

A user can log into the application with a user account, password, and other information, and thus be represented by identity data. If a user is not logged in, temporary identity data can be provided for the user and bound to the device identifier; temporary identity data bound to the same device identifier is merged, and if the temporary user later registers and logs in, the temporary identity data can be converted into formal identity data.

The client can provide a UI (User Interface) on which the user can browse geometry exercises and, in some cases, trigger an operation of solving a certain geometry exercise, such as clicking an exercise to study it and browse the solving process, clicking a geometry learning task to take a test and browse the process of solving the exercises, or encountering a new geometry exercise and requesting that it be solved.

This embodiment is applied to the server; the client can send a problem-solving request for a geometry exercise to the server, and on receiving the request the server starts the logic for solving the geometry exercise (i.e., executes Steps 201 to 212).

Further, besides the user actively selecting a geometry exercise in the client, the client can also ask the server to filter, from the question bank, geometry exercises suitable for the user and push them to the user. In this process, factors such as the methods and knowledge points involved in the exercises and the novelty of the question types can be considered when filtering exercises for the user.
Illustratively, methods for filtering geometry exercises may include at least one of the following:

1. Rule-based method

In this method, factors such as the user's learning situation and the popularity of the geometry exercises are considered, and rules or linear weighting are used to filter geometry exercises for the user.

2. Cognitive-diagnosis-based method

In this method, the user's mastery of each knowledge point in geometry is diagnosed, so that geometry exercises of suitable difficulty are filtered for the user.

3. Collaborative-filtering-based method

In this method, other users whose learning behavior and learning situation are similar to the current user's are found, and the geometry exercises those users answered poorly are filtered for the current user.

4. Content-based method

In this method, based on the user's performance in answering geometry exercises, the similarity between geometry exercises is computed on a per-exercise basis, and this similarity is used to assist filtering; that is, other geometry exercises similar to the ones the user previously answered incorrectly are filtered for the user.
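For the content-based method, the patent does not fix a similarity measure; one common, minimal choice is Jaccard similarity over the sets of knowledge points two exercises cover (an illustrative assumption, not a detail from the patent):

```python
def exercise_similarity(knowledge_points_a: set, knowledge_points_b: set) -> float:
    """Jaccard similarity between two exercises' knowledge-point sets."""
    if not knowledge_points_a and not knowledge_points_b:
        return 0.0
    shared = knowledge_points_a & knowledge_points_b
    combined = knowledge_points_a | knowledge_points_b
    return len(shared) / len(combined)
```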
In addition, the user can also input a geometry exercise directly into the client and operate on the client's UI to control the client to send a problem-solving request for that exercise to the server.

In one case, the user copies an editable geometry electronic exercise from another application such as a web page and inputs it into the client. The exercise contains data such as text data, formula data, and image data, where the text data mainly records stem and option information, the formula data is recorded in forms such as LaTeX, HTML, or MathML, and the image data mainly records geometric figures (including geometric parameters such as segment labels and angles).

In this case, the client can encapsulate the editable geometry exercise in a problem-solving request and send the request to the server, and the server parses the editable geometry exercise from the request.

In another case, for geometry exercises that cannot be edited, for example exercises in books or exercises for which copying is prohibited, the client can call a camera to capture image data of the exercise and frame the exercise in the image data; that is, the image data may contain stem, option, formula, geometric-figure (including geometric parameters such as segment labels and angles), and other information. The client encapsulates the image data in a problem-solving request and sends the request to the server.

The server receives the request sent by the client and extracts the image data from it. It can then perform an OCR (Optical Character Recognition) operation on the image data to obtain text information, normalize the text information in terms of natural language processing, and classify the text into stem, option, formula, geometric-figure (including geometric parameters such as segment labels and angles), and other information, thereby reading the geometry exercise from the image data.
Step 202: In response to the problem-solving request, extract the known geometric conditions and the geometric problem to be solved from the electronic exercise.

Since the data contained in geometry exercises differ, the ways of extracting the known geometric conditions and the geometric problem to be solved also differ.

In one way of extracting geometric conditions and problems, the electronic exercise has first text information; in this way, a regular expression can be determined.

In this way, the features of the known geometric conditions or the geometric problems to be solved in geometry exercises can be analyzed in advance, and one or more regular expressions can be constructed for these features. A regular expression describes a matching pattern of a geometric condition or geometric problem; for the features of geometry exercises, the components of a regular expression can be single characters, character sets, character ranges, alternations between characters, or any combination of these components.

For example, when learning some simple geometric theorems such as the Pythagorean theorem, geometry exercises often label segments with lengths; segments are denoted by letters and the lengths are simple numbers, such as "AB=5". For this, the regular expression can be set to "([A-Z]{2})(?:=|等于)([0-9]+)" (where 等于 is the Chinese word for "equals").

For each regular expression, the regular expression can be matched against the first text information.

In this way, a regular expression can be matched against the first text information to check whether it contains a certain substring, to replace the matched substring, or to extract a substring satisfying a certain condition.

For first text information determined to match successfully, the successfully matched first text information can be determined to be a known geometric condition or a geometric problem to be solved.

If a substring of the first text information successfully matches a regular expression for parsing geometric conditions, the substring is determined to be a geometric condition; it can then be converted into the structured data format for representation.

Illustratively, the geometric condition "∠ABC=90°" can be expressed as Equals(MeasureOf(Angle(ABC)),90), the condition "AB=5" as Equals(LengthOf(Line(AB)),5), and the condition "D is the midpoint of AB" as Equals(LengthOf(Line(AD)),LengthOf(Line(BD))).

If a substring of the first text information successfully matches a regular expression for parsing geometric problems, the substring is determined to be a geometric problem; it can then be converted into the structured data format for representation.

Illustratively, the geometric problem "find the length of AC" can be expressed as Find(LengthOf(Line(AC))).
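A minimal sketch of this regular-expression extraction, using the pattern from the example above ("AB=5"-style length conditions) and emitting the structured data format; the function and constant names are illustrative:

```python
import re

# Pattern from the example: two capital letters, "=" or "等于", then digits.
LENGTH_PATTERN = re.compile(r"([A-Z]{2})(?:=|等于)([0-9]+)")

def extract_length_conditions(stem_text: str) -> list:
    """Match the pattern against the stem text and return structured conditions."""
    return [f"Equals(LengthOf(Line({segment})),{length})"
            for segment, length in LENGTH_PATTERN.findall(stem_text)]
```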
In another way of extracting geometric conditions and problems, the electronic exercise has image data; in this way, at least one of symbols, numbers, and letters can be taken as targets to be detected in the image data.

In geometry exercises, the image data contains rich information, especially for displaying the structure of geometric figures (such as triangles and quadrilaterals) and for annotating the figures with information (such as lengths, angles, and relationships between segments). This information is mostly expressed with symbols, numbers, and letters, where the symbols can be common mathematical symbols (such as the degree symbol) or specific geometric symbols (such as the right-angle symbol); this embodiment does not limit this.

Therefore, this embodiment can take at least one of symbols, numbers, and letters as targets and use object detection models such as RetinaNet, YOLO, or PSENet to detect, in the image data, the regions where the targets are present.

For the different targets, if a region containing at least one of a symbol, a number, or a letter is detected, optical character recognition is performed on the region to obtain second text information.

When the object detection model outputs a region containing at least one of a symbol, a number, or a letter, the region can be cropped and an OCR operation performed on it using an optical recognition model such as CRNN, which outputs the second text information.

For example, as shown in FIG. 3, in a certain geometry exercise a triangle (geometric figure) is drawn, annotated with some angles (60°, 55°), a length (73), and an unknown quantity (X). With symbols, numbers, and letters as targets, four regions are detected on the triangle (the boxes in the middle of FIG. 3), and performing OCR on these regions yields four pieces of data (the boxes on the right of FIG. 3).

In addition, geometric figures are recognized in the image data, so that the second text information is assigned to the geometric figures to obtain the known geometric conditions.

In this way, geometric-figure extraction tools such as OpenCV can be used to extract geometric figures such as points, lines, and angles from the image data.

Assigning the second text information to a geometric figure associates the two, thereby yielding a geometric condition.

Considering that the information annotating a geometric figure is generally near the figure, the second text information is, in general, assigned to the geometric figure closest to it.

In some cases, the second text information is identification information of the geometric figure, for example, letters denoting a segment or letters identifying an angle.

In other cases, the second text information is a numerical value of the geometric figure, for example, the length of a segment or the degree of an angle.

Of course, the above ways of extracting geometric conditions and problems are merely examples. When implementing the embodiments of the present application, other ways of extracting geometric conditions and problems can be set according to the actual situation, and this is not limited by the embodiments of the present application. In addition, besides the above ways, those skilled in the art can also adopt other ways of extracting geometric conditions and problems according to actual needs, which is likewise not limited by the embodiments of the present application.
步骤203、在每一次迭代强化学习模型时,分别设定当前几何条件为强化学习模型中环境的状态、设定几何定理为强化学习模型中的动作。
在强化学习模型中包含四个基本元素:智能体Agent、环境Environment、动作Action以及激励Reward。
其中,智能体Agent能够感知环境Environment的状态State,并且根据环境Environment提供的激励Reward,通过学习选择一个合适的动作Action,来最大化长期的激励Reward。
简而言之,智能体Agent根据环境Environment提供的激励Reward作为反馈,学习一系列的环境Environment的状态State到动作Action的映射,动作Action选择的原则是最大化未来累积的激励Reward的概率。选择的动作Action不仅影响当前时刻的激励Reward,还会影响下一时刻甚至未来的激励Reward, 因此,智能体Agent在学习过程中的基本规则是:如果某个动作Action带来了环境Environment的正激励Reward,那么这一动作会被加强,如果某个动作Action带来了环境Environment的负激励Reward,那么这一动作会被削弱。
环境Environment会接收智能体Agent执行的一系列的动作Action，并且对这一系列的动作Action的好坏进行评价，并转换成一种可量化的(标量信号)激励Reward反馈给智能体Agent。同时，环境Environment还向智能体Agent提供它所处的状态State。
激励Reward是环境Environment提供给智能体Agent的一个可量化的标量反馈信号,用于评价智能体Agent在某一个时间所执行的动作Action的好坏。强化学习是基于一种最大化累计激励假设,即在强化学习中,智能体Agent进行一系列的动作Action选择的目标是最大化未来的累计激励Reward。
状态State包含了智能体Agent用于动作Action选择所参考的信息,它是历史History的一个函数。
则马尔科夫决策过程可表示如下:
M=(S,A,P_sa,R)
其中，S表示环境的状态的集合，A表示动作的集合，P_sa表示状态转移概率，即在状态s下采取动作a后，转移到其他状态的概率分布情况，R表示激励函数。
学习的目标即为针对上述马尔可夫决策过程,寻找最优策略π:
π(a|s)=P[A_t=a|S_t=s]
即在t时刻,对于给定状态s,寻找该状态s下执行动作a的最优策略。
在本实施例中，在每一次迭代强化学习模型时，电子设备(几何解题装置)为智能体Agent、几何条件为环境Environment的状态State、为几何条件选定几何定理为动作Action。
对于几何的电子习题,一般具有相应的几何的知识点,这些几何的知识点可归纳为不同的几何定理,例如,勾股定理、射影定理、欧拉定理、中线定理、斯图尔特定理、阿波罗尼斯定理、托勒密定理,等等,从而使用一个或多个几何定理设计几何的电子习题,因此,为了提高解答几何的电子习题的效率,提高解答几何的电子习题的准确性,可以按照几何的电子习题所包含的知识点筛选几何定理,从而构建动作Action的空间。
例如,如果某个几何的电子习题为初中考试的题目,此时,可以选择初中阶段所有的几何定理,构建动作Action的空间。
步骤204、执行强化学习模型，学习几何条件应用所有几何定理的价值，作为第一目标价值。
在执行强化学习模型进行学习时,电子设备(几何解题装置)作为智能体Agent,从几何的电子习题中提取几何条件,作为环境Environment的状态State,执行为几何条件应用几何定理这个动作Action,从而计算为几何条件应用几何定理这个动作Action的价值(Q值),记为第一目标价值。
在具体实现中，可以应用DQN(Deep Q-Network,深度Q网络)执行强化学习模型，对此，可预先对强化学习模型定义动作价值函数Q(s_t,a_t;ω)，其中，s_t表示当前时刻t的环境Environment的状态State，即当前时刻t电子习题的几何条件，a_t表示当前时刻t执行的动作Action，即当前时刻t选择的几何定理，ω为参数。
其中，动作价值函数又可称为Q网络，Q网络可以应用神经网络，例如，卷积神经网络(Convolutional Neural Networks,CNN)、循环神经网络(Recurrent Neural Network,RNN)、深度神经网络(Deep Neural Networks,DNN)，等等，从而将高维度、连续的状态空间(几何条件、各个几何定理)通过神经网络转换为低维度的价值函数。
即,将几何条件、各个几何定理输入到动作价值函数中,输出各个几何定理的第一目标价值。
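以下用Python对“将几何条件(状态)输入动作价值函数、输出各几何定理的第一目标价值”给出一个最小示意。其中状态的特征编码方式(散列)与线性网络结构均为假设，实际可替换为CNN、RNN等神经网络：

```python
import numpy as np

rng = np.random.default_rng(0)

class QNetwork:
    """线性动作价值函数Q(s,a;ω)示意：输入状态特征，输出每个几何定理(动作)的价值。"""
    def __init__(self, state_dim, num_theorems):
        self.w = rng.normal(size=(state_dim, num_theorems)) * 0.01  # 参数ω
        self.b = np.zeros(num_theorems)

    def q_values(self, state):
        # 返回该状态下所有几何定理的第一目标价值(Q值)，形状为(num_theorems,)
        return state @ self.w + self.b

def encode_conditions(conditions, state_dim=16):
    """假设的状态编码：将几何条件的结构化文本散列为固定维度的特征向量。"""
    vec = np.zeros(state_dim)
    for cond in conditions:
        vec[hash(cond) % state_dim] += 1.0
    return vec

theorems = ['勾股定理', '射影定理', '等面积法']
qnet = QNetwork(state_dim=16, num_theorems=len(theorems))
state = encode_conditions(['Equals(LengthOf(Line(AC)),3)', 'Equals(LengthOf(Line(BC)),4)'])
values = qnet.q_values(state)   # 各几何定理对应的第一目标价值
```

这里动作空间即为按知识点筛选出的几何定理列表，网络输出维度与几何定理数量一致。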
步骤205、按照第一目标价值选择与当前几何条件适配的几何定理。
在本实施例中,可以参考各个几何定理的第一目标价值,选择与几何条件适配的几何定理。
一般情况下,可以对比各个几何定理的第一目标价值,选择第一目标价值最高的几何定理为与几何条件适配的几何定理。
当然，除了选择第一目标价值最高的几何定理之外，还可以应用其他方式选择与几何条件适配的几何定理，例如，使用ε-贪婪法选择与几何条件适配的几何定理，即，有(1-ε)的概率选择第一目标价值最高的几何定理，有ε的概率随机选择几何定理，等等，本实施例对此不加以限制。
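ε-贪婪法的选择逻辑可用如下Python草图示意(ε的取值仅作演示)：

```python
import random

def epsilon_greedy_select(q_values, epsilon=0.1):
    """ε-贪婪法：以(1-ε)的概率选择第一目标价值最高的几何定理，以ε的概率随机探索。"""
    if random.random() < epsilon:
        return random.randrange(len(q_values))      # 探索：随机选择几何定理的下标
    return max(range(len(q_values)), key=lambda i: q_values[i])  # 利用：选价值最高者

best = epsilon_greedy_select([0.2, 1.5, 0.7], epsilon=0.0)  # ε=0时退化为纯贪婪选择
```

训练初期可将ε设置得较大以鼓励探索不同几何定理，随训练进行逐步衰减。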
步骤206、计算在当前几何条件下,作为时间差分的目标的第二目标价值。
在本实施例中,强化学习模型中的动作价值函数(如DQN)的参数可以使用时间差分(Temporal Difference,TD)进行学习。
此时,可以在几何条件下,计算作为时间差分的目标的价值,记为第二目标价值。
在具体实现中，可以确定在当前时刻t的几何条件下、对选定的几何定理的激励Reward，电子设备(几何解题装置)执行为几何条件选择几何定理这个动作Action，目的是使得这个激励Reward最优。
一般情况下,激励Reward与匹配度正相关,该匹配度表示几何条件与电子习题的匹配程度,即,匹配度越高,激励Reward的数值越大,反之,匹配度越低,激励Reward的数值越小。
示例性地,在当前时刻的几何条件下,若几何信息为几何问题的答案,即匹配度最高,则确定对几何定理的激励为第一值。
在当前时刻的几何条件下,若几何信息为新的、已知的几何条件,即匹配度次之,则确定对几何定理的激励为第二值。
在当前时刻的几何条件下,若几何信息为除几何问题的答案与新的已知的几何条件之外的其他信息,即匹配度最低,则确定对几何定理的激励为第三值。
其中,第一值(如10)大于第二值(如1),第二值大于第三值(如0)。
执行强化学习模型，学习在下一时刻(t+1)的几何条件下应用所有几何定理的价值，作为第一候选价值，表示为Q(s_{t+1},a;ω)，其中，s_{t+1}表示下一时刻(t+1)的几何条件(状态)，a∈A，A表示几何定理形成的空间。
对所有第一候选价值进行比较,选择所有第一候选价值中的最大值,对所有第一候选价值中的最大值进行衰减(即,计算所有第一候选价值中的最大值与预设的衰减因子之间的乘积,该衰减因子的取值为(0,1)),获得第二候选价值。
计算激励与第二候选价值之间的和值，作为时间差分的目标的第二目标价值，表示为r_t+λ·max_{a∈A} Q(s_{t+1},a;ω)，其中，r_t为激励Reward，λ为衰减因子。
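将激励的取值与时间差分目标的计算合在一起，可用如下Python草图示意。激励的具体数值沿用正文示例中的10、1、0，衰减因子取0.9仅作演示：

```python
# 按匹配度确定激励：答案记第一值，新的已知几何条件记第二值，其他信息记第三值
REWARD_BY_MATCH = {
    'answer': 10.0,          # 几何信息为几何问题的答案(匹配度最高)
    'new_condition': 1.0,    # 几何信息为新的、已知的几何条件(匹配度次之)
    'other': 0.0,            # 其他信息(匹配度最低)
}

def td_target(reward, next_q_values, decay=0.9):
    """时间差分的目标：r_t + λ·max_{a∈A} Q(s_{t+1}, a; ω)，其中λ为(0,1)内的衰减因子。"""
    return reward + decay * max(next_q_values)

# 例如：推论得到新的已知几何条件(激励取第二值1.0)，
# 下一时刻各几何定理的第一候选价值为[0.5, 2.0, 1.0]
y_t = td_target(REWARD_BY_MATCH['new_condition'], [0.5, 2.0, 1.0], decay=0.9)
```

其中max(next_q_values)对应“所有第一候选价值中的最大值”，乘以衰减因子即得第二候选价值。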
步骤207、计算第一目标价值与第二目标价值之间的差异,作为损失值。
在本实施例中,第一目标价值与第二目标价值均是强化学习模型(如DQN)对最优的动作(即选择几何定理)的价值估计,而第二目标价值的部分是基于观测到的激励,第二目标价值更加接近实际的结果,因此,可以将训练强化学习模型(如DQN)的目标设置为鼓励第一目标价值接近第二目标价值,此时,可以调用预设的损失函数,代入第一目标价值与第二目标价值,从而计算第一目标价值与第二目标价值之间的差异,记为损失值。
示例性地,将第一目标价值减去第二目标价值,获得价值差,将价值差的平方乘以预设的系数,得到损失值,则损失函数表示如下:
L(ω)=α(q_t−y_t)^2
其中，L(ω)为损失值，α为系数，取值一般为(0,1)，如1/2，q_t为第一目标价值，y_t为第二目标价值。
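该损失函数可直接写为代码。作为示意，系数α取1/2时，损失对第一目标价值的梯度恰为q_t−y_t，便于后续按梯度更新参数：

```python
def td_loss(q_t, y_t, alpha=0.5):
    """损失值 L(ω) = α·(q_t − y_t)^2。"""
    return alpha * (q_t - y_t) ** 2

def td_loss_grad_wrt_q(q_t, y_t, alpha=0.5):
    """损失对第一目标价值q_t的梯度：2α·(q_t − y_t)，可沿Q网络参数反向传播。"""
    return 2.0 * alpha * (q_t - y_t)
```

训练目标即鼓励第一目标价值q_t逼近更接近实际观测的第二目标价值y_t，两者相等时损失为零。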
步骤208、按照损失值更新强化学习模型。
在本实施例中，可以对强化学习模型(如DQN)进行反向传播，在反向传播的过程中，将损失值代入SGD(stochastic gradient descent,随机梯度下降)、Adam(Adaptive Moment Estimation,自适应矩估计)等优化算法中，计算更新强化学习模型(如DQN)中参数的梯度，分别按照该梯度更新强化学习模型(如DQN)中的参数。
当然，上述DQN算法只是作为强化学习模型的示例，在实施本实施例时，可以根据实际情况设置其他强化学习模型，例如，SARSA(State-Action-Reward-State-Action,一种时序差分法)算法、DDPG(Deep Deterministic Policy Gradient,深度确定性策略梯度)算法、A3C(Asynchronous Advantage Actor-Critic,异步优势行动者-评论家算法)算法、NAF(normalized advantage functions,归一化优势函数)算法、TRPO(Trust region policy optimization,信赖域策略优化)算法、PPO(Proximal Policy Optimization,近端策略优化)算法，等等，本实施例对此不加以限制。另外，除了上述强化学习模型外，本领域技术人员还可以根据实际需要采用其它强化学习模型，本实施例对此也不加以限制。
步骤209、将当前几何定理应用于当前几何条件推论新的几何条件。
在本实施例中,在服务端中可以预先将实现各个几何定理的逻辑代码封装为各个应用程序编程接口(Application Program Interface,API),并提供接口规范。
在确定当前几何定理与当前几何条件时,可以依据当前几何定理查询目标接口,其中,目标接口为用于实现当前几何定理的逻辑代码封装的应用程序编程接口。
不同几何定理需求的几何条件有所不同,因而可以按照接口规范从当前几何条件中选择适用于几何定理的几何条件,作为目标条件。
例如，如果几何定理为勾股定理，接口规范定义输入直角三角形的任意两条边长，那么，可以从当前几何条件中选择直角三角形的任意两条边长，作为目标条件。
按照接口规范将目标条件打包至推论请求中，以及，将推论请求发送至目标接口，以调用逻辑代码按照几何定理对目标条件进行运算并返回新的几何条件。
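以勾股定理为例，“依据当前几何定理查询目标接口、传入目标条件并返回新的几何条件”的过程可示意如下。接口形式与定理到接口的映射表均为假设的封装方式：

```python
import math

def pythagorean_api(leg_a, leg_b):
    """假设的勾股定理接口：输入直角三角形的两条直角边长，返回斜边长作为新的几何条件。"""
    return math.hypot(leg_a, leg_b)

# 几何定理名称到目标接口(API)的映射，实际部署中可为服务端的应用程序编程接口
THEOREM_APIS = {'勾股定理': pythagorean_api}

def apply_theorem(theorem_name, *target_conditions):
    """依据当前几何定理查询目标接口，并按接口规范对目标条件进行运算。"""
    return THEOREM_APIS[theorem_name](*target_conditions)

new_condition = apply_theorem('勾股定理', 3, 4)   # 返回斜边长作为新的几何条件
```

每个几何定理封装为独立接口后，新增定理只需注册新的接口，与本实施例所述的可扩展性相符。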
步骤210、判断新的几何条件是否为几何问题的答案;若是,则执行步骤211,若否,则返回执行步骤203-步骤210。
步骤211、确定迭代强化学习模型结束,将新的几何条件作为几何问题的答案输出。
在本实施例中,可以检测当前迭代中新的几何条件是否为有效的信息,例如,新的几何条件是否具有有效的几何图形的标识信息、数值等。
如果新的几何条件有效,则可以将新的几何条件与几何问题进行比较,如果新的几何条件与几何问题匹配,则可以确认新的几何条件为几何问题的答案,则停止迭代强化学习模型结束,将该新的几何条件作为几何问题的答案输出给用户,如果新的几何条件与几何问题不匹配,则可以将新的几何条件添加到已知的几何条件中,使用强化学习模型进入下一次迭代的学习。
如果新的几何条件无效,则可以确认当前几何定理为错误的几何定理,选择其他几何定理进行推论。
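步骤203至步骤211的整体迭代流程可概括为如下Python草图。其中选定几何定理、应用几何定理、判断答案均以回调函数表示，属于示意性抽象，并非本申请的实际实现：

```python
def solve(conditions, is_answer, select_theorem, apply_theorem, max_iters=50):
    """迭代示意：每轮选定几何定理、推论新的几何条件，直至得到几何问题的答案。"""
    for _ in range(max_iters):
        theorem = select_theorem(conditions)               # 步骤203-205：强化学习模型选定几何定理
        new_condition = apply_theorem(theorem, conditions)  # 步骤209：推论新的几何条件
        if new_condition is None:                          # 推论无效：视为错误的几何定理
            continue
        if is_answer(new_condition):                       # 步骤210-211：命中答案，迭代结束
            return new_condition
        conditions = conditions + [new_condition]          # 否则加入已知几何条件，进入下一次迭代
    return None                                            # 超过迭代上限仍未得到答案
```

下面用一个与几何无关的玩具回调演示调用方式：把“推论”定义为对最后一个条件加一，答案为4，三轮迭代后即可命中。

```python
result = solve([1], lambda c: c == 4, lambda conds: 'toy', lambda th, conds: conds[-1] + 1)
```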
步骤212、将答案封装至推导信息中。
步骤213、将推导信息发送至客户端、以将几何习题与推导信息关联显示。
在本实施例中,除了答案之外,使用几何定理逐步进行推论的过程对于用户的学习具有指导意义,因此,可以将答案封装至推导信息中,其中,推导信息为按照迭代强化学习模型的顺序依次显示将当前几何定理应用于当前几何条件推论新的几何条件,直至得到答案的过程。
服务端将推导信息封装至解题响应中,并将解题响应发送至客户端,客户端在接收到解题响应时,从解题响应中解析出推导信息,将几何习题与推导信息关联显示,向用户提示对几何习题按照推导信息进行解答,得到答案。
为使本领域技术人员更好地理解本实施例,以下通过具体的示例来说明本实施例中解答几何的电子习题的方法。
如图4A所示的几何的电子习题,具有题干信息(文本信息)与图例(图像数据)。
使用正则表达式对题干信息(文本信息)进行搜索,得到已知的几何条件为“RtΔABC”(即Triangle(A,B,C))、“AC=3”(即Equals(Line(A,C),3))、“BC=4”(即Equals(Line(B,C),4)),待求解的几何问题为“CD的长等于()”。
对图例(图像数据)进行搜索,得到已知的几何条件为“∠ACB=90°”(即Equals(Angle(A,C,B),90))、“∠CDB=90°”(即Equals(Angle(C,D,B),90))、“∠CDA=90°”(即Equals(Angle(C,D,A),90))。
如图4B所示,在时刻t,将如下几何条件s t输入DQN中学习:
Triangle(A,B,C)
Equals(Line(A,C),3)
Equals(Line(B,C),4)
Equals(Angle(A,C,B),90)
Equals(Angle(C,D,B),90)
Equals(Angle(C,D,A),90)
DQN学习得到几何定理a t为勾股定理。
将几何定理a t应用于几何条件s t进行推理,得到几何信息“AB=5”(即Equals(Line(A,B),5))。
由于“AB=5”并非“CD的长等于()”的答案,因此,将“AB=5”设置为新的已知的几何条件。
在时刻(t+1),将如下几何条件s t+1输入DQN中学习:
Triangle(A,B,C)
Equals(Line(A,C),3)
Equals(Line(B,C),4)
Equals(Angle(A,C,B),90)
Equals(Angle(C,D,B),90)
Equals(Angle(C,D,A),90)
Equals(Line(A,B),5)
DQN学习得到几何定理a t+1为等面积法。
将几何定理a t+1应用于几何条件s t+1进行推理,得到新的几何条件“CD=2.4”(即Equals(Line(C,D),2.4))。
由于“CD=2.4”为“CD的长等于()”的答案,此时,解题结束。
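上述两步推论可以用代码复核：先用勾股定理求斜边AB，再用等面积法(AC·BC=AB·CD)求斜边上的高CD：

```python
import math

AC, BC = 3.0, 4.0
AB = math.hypot(AC, BC)    # 时刻t应用勾股定理：AB = √(3² + 4²) = 5
CD = AC * BC / AB          # 时刻t+1应用等面积法：AC·BC/2 = AB·CD/2，故CD = 12/5 = 2.4
```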
实施例三
图5为本申请实施例三提供的一种几何解题方法的流程图，本实施例可适用于在构建题库的过程中使用强化学习模型对几何的电子习题进行求解的情况，该方法可以由几何解题装置来执行，该几何解题装置可以采用硬件和/或软件的形式实现，该几何解题装置可配置于电子设备中，该电子设备可以为服务端。如图5所示，该方法包括：
步骤501、从题库中查找学科属于几何的电子习题。
在本实施例中,管理人员可以登录学习平台,选择题库管理的功能,可批量选择导入学科属于几何的电子习题,这些电子习题可以为word、excel、音频数据、视频数据等格式。
在管理人员选择电子习题上传后，在相应界面将看到导入的试题，界面一般可分为输入区与检查区，输入区用于编辑电子习题，检查区用于显示确认无误、可导入的电子习题。
对于学习平台而言,可以智能检测可成功识别的电子习题的数量,对于未成功识别的电子习题,可根据系统给出的提示进行查看对比,重新编辑模板导入(可在输入区直接编辑)。
若导入的电子习题与已有的电子习题重复，则可以展示重复的电子习题，用户可以选择去掉重复的电子习题，也可以选择依然导入重复的试题。
以上步骤操作无误后,即可成功导入电子习题到学习平台的题库中,等待以课后练习、试卷等形式使用。
步骤502、若电子习题缺乏答案和/或推导过程,则从电子习题中提取已知的几何条件与待解答的几何问题。
由于收录电子习题的过程中可能存在缺失,使得电子习题缺乏答案和/或推导过程,即在题库中未录入电子习题的答案和/或推导过程,服务端可以定时检测题库中的电子习题是否缺失答案和/或推导过程,在电子习题缺失答案和/或推导过程时,自动对电子习题进行解答,得到电子习题的答案和/或推导过程,当然,管理人员也可以在浏览到某个电子习题缺失答案和/或推导过程时,主动请求服务端对电子习题进行解答,得到电子习题的答案和/或推导过程。
在一种提取几何条件、几何问题的方式中,电子习题具有第一文本信息,那么,在本方式中,确定正则表达式,正则表达式用于描述已知的几何条件或待解答的几何问题的匹配模式;将正则表达式与第一文本信息进行匹配;确定匹配成功的第一文本信息为已知的几何条件或待解答的几何问题。
在另一种提取几何条件、几何问题的方式中，电子习题具有图像数据，则在本方式中，可以以符号、数字、字母中的至少一者作为目标，在图像数据中进行检测；若检测到符号、数字、字母中的至少一者所处的区域，则对区域进行光学字符识别，得到第二文本信息；在图像数据中识别几何图形；将第二文本信息赋予几何图形，得到已知的几何条件。
步骤503、在每一次迭代强化学习模型时,将当前几何条件输入强化学习模型中进行学习,得到与当前几何条件适配的几何定理。
在具体实现中,在每一次迭代强化学习模型时,分别设定当前几何条件为强化学习模型中环境的状态、设定几何定理为强化学习模型中的动作;执行强化学习模型,学习几何条件应用所有几何定理的价值,作为第一目标价值;按照第一目标价值选择与当前几何条件适配的几何定理。
此外,计算在当前几何条件下,作为时间差分的目标的第二目标价值;计算第一目标价值与第二目标价值之间的差异,作为损失值;按照损失值更新强化学习模型。
在计算在当前几何条件下,作为时间差分的目标的第二目标价值时,可以确定在当前时刻的几何条件下、对几何定理的激励;执行强化学习模型,学习在下一时刻的几何条件下应用所有几何定理的价值,作为第一候选价值;对所有第一候选价值中的最大值进行衰减,获得第二候选价值;计算激励与第二候选价值之间的和值,作为时间差分的目标的第二目标价值。
在确定在当前时刻的几何条件下、对几何定理的激励时,可以在当前时刻的几何条件下,若几何信息为几何问题的答案,则确定对几何定理的激励为第一值;在当前时刻的几何条件下,若几何信息为新的、已知的几何条件,则确定对几何定理的激励为第二值;在当前时刻的几何条件下,若几何信息为除几何问题的答案与新的已知的几何条件之外的其他信息,则确定对几何定理的激励为第三值;其中,第一值大于第二值,第二值大于第三值。
在计算第一目标价值与第二目标价值之间的差异,作为损失值时,可以将第一目标价值减去第二目标价值,获得价值差;将价值差的平方乘以预设的系数,得到损失值。
步骤504、将当前几何定理应用于当前几何条件推论新的几何条件。
在具体实现中，可以查询目标接口，目标接口为用于实现当前几何定理的逻辑代码封装的应用程序编程接口；从当前几何条件中选择适用于几何定理的几何条件，作为目标条件；将目标条件打包至推论请求中；将推论请求发送至目标接口，以调用逻辑代码按照几何定理对目标条件进行运算并返回新的几何条件。
步骤505、当迭代强化学习模型结束时，将新的几何条件作为几何问题的答案。
在具体实现中,可以判断新的几何条件是否为几何问题的答案;若是,则确定迭代强化学习模型结束,将新的几何条件作为几何问题的答案输出;若否,则返回执行步骤503-步骤504。
步骤506、在题库中存储电子习题与推导信息之间的映射关系。
在本实施例中,可以将答案封装至推导信息中,其中,推导信息为按照迭代强化学习模型的顺序依次显示将当前几何定理应用于当前几何条件推论新的几何条件,直至得到答案的过程。
在题库中存储电子习题与推导信息之间的映射关系,表示对几何习题按照推导信息进行解答,得到答案,该映射关系可提供给管理人员进行校验。
在本实施例中,由于电子习题的推导过程与实施例一、实施例二的应用基本相似,所以描述的比较简单,相关之处参见实施例一、实施例二的部分说明即可,本实施例在此不加以详述。
在本实施例中，服务端从题库中查找学科属于几何的电子习题；若电子习题缺乏答案和/或推导过程，则从电子习题中提取已知的几何条件与待解答的几何问题；在每一次迭代强化学习模型时，将当前几何条件输入强化学习模型中进行学习，得到与当前几何条件适配的几何定理；将当前几何定理应用于当前几何条件推论新的几何条件；当迭代强化学习模型结束时，将新的几何条件作为几何问题的答案；在题库中存储电子习题与推导信息之间的映射关系，推导信息为按照迭代强化学习模型的顺序依次显示将当前几何定理应用于当前几何条件推论新的几何条件，直至得到答案的过程。本实施例提出了一种富有推理性的几何解题框架，兼容性强，可扩展性强，在几何解题框架中引入强化学习模型学习几何定理，逻辑清晰，可解释性强，保持了几何定理与几何定理之间的关联性，提高了几何解题的精准度，再者，本实施例可以一步步地预测几何定理进行推论，并可对推论的过程进行描述，更加符合用户学习的过程，达到知其然，更知其所以然的效果，此外，将几何解题的推导信息记录在题库中供管理人员校验，可以大量减少用户解题、构建题库的操作，大大提高了构建题库的效率。
实施例四
图6为本申请实施例四提供的一种几何解题装置的结构示意图。如图6所示,该装置应用于服务端,包括:
解题请求接收模块601,用于接收客户端针对学科属于几何的电子习题发送的解题请求;
习题信息提取模块602,用于响应于所述解题请求,从所述电子习题中提取已知的几何条件与待解答的几何问题;
几何定理学习模块603，用于在每一次迭代强化学习模型时，将当前所述几何条件输入所述强化学习模型中进行学习，得到与当前所述几何条件适配的几何定理；
几何条件推论模块604,用于将当前所述几何定理应用于当前所述几何条件推论新的几何条件;
答案确定模块605,用于当迭代所述强化学习模型结束时,确定新的所述几何条件为所述几何问题的答案;
答案发送模块606,用于将所述答案发送至所述客户端、以将所述几何习题与所述答案关联显示。
在本申请的一个实施例中,所述解题请求接收模块601包括:
客户端请求接收模块,用于接收到客户端发送的解题请求;
图像数据提取模块,用于从所述解题请求中提取图像数据;
电子习题读取模块,用于在所述图像数据中读取学科属于几何的电子习题。
在本申请的一个实施例中,所述电子习题具有第一文本信息;
所述习题信息提取模块602包括:
正则表达式确定模块,用于确定正则表达式,所述正则表达式用于描述已知的几何条件或待解答的几何问题的匹配模式;
正则表达式匹配模块,用于将所述正则表达式与所述第一文本信息进行匹配;
匹配成功确定模块,用于确定匹配成功的所述第一文本信息为已知的几何条件或待解答的几何问题。
在本申请的另一个实施例中,所述电子习题具有图像数据;
所述习题信息提取模块602包括:
目标检测模块,用于以符号、数字、字母中的至少一者作为目标,在所述图像数据中进行检测;
光学字符识别模块，用于若检测到所述符号、所述数字、所述字母中的至少一者所处的区域，则对所述区域进行光学字符识别，得到第二文本信息；
几何图形识别模块,用于在所述图像数据中识别几何图形;
文本赋予模块,用于将所述第二文本信息赋予所述几何图形,得到已知的几何条件。
在本申请的一个实施例中,所述几何定理学习模块603包括:
强化学习模块设定模块，用于在每一次迭代强化学习模型时，分别设定当前所述几何条件为所述强化学习模型中环境的状态、设定所述几何定理为强化学习模型中的动作；
强化学习模块执行模块,用于执行所述强化学习模型,学习所述几何条件应用所有所述几何定理的价值,作为第一目标价值;
几何定理选择模块,用于按照所述第一目标价值选择与当前所述几何条件适配的所述几何定理。
在本申请的一个实施例中,所述几何定理学习模块603还包括:
目标计算模块,用于计算在当前所述几何条件下,作为时间差分的目标的第二目标价值;
损失值计算模块,用于计算所述第一目标价值与所述第二目标价值之间的差异,作为损失值;
强化学习模型更新模块,用于按照所述损失值更新所述强化学习模型。
在本申请的一个实施例中,所述目标计算模块包括:
激励确定模块,用于确定在当前时刻的所述几何条件下、对所述几何定理的激励;
第一候选价值计算模块,用于执行所述强化学习模型,学习在下一时刻的所述几何条件下应用所有所述几何定理的价值,作为第一候选价值;
第二候选价值计算模块,用于对所有所述第一候选价值中的最大值进行衰减,获得第二候选价值;
目标价值计算模块,用于计算所述激励与所述第二候选价值之间的和值,作为时间差分的目标的第二目标价值。
在本申请的一个实施例中,所述激励确定模块包括:
第一值确定模块,用于在当前时刻的所述几何条件下,若所述几何信息为所述几何问题的答案,则确定对所述几何定理的激励为第一值;
第二值确定模块,用于在当前时刻的所述几何条件下,若所述几何信息为新的、已知的几何条件,则确定对所述几何定理的激励为第二值;
第三值确定模块,用于在当前时刻的所述几何条件下,若所述几何信息为除所述几何问题的答案与新的已知的几何条件之外的其他信息,则确定对所述几何定理的激励为第三值;
其中,所述第一值大于所述第二值,所述第二值大于所述第三值。
在本申请的一个实施例中,所述损失值计算模块包括:
价值差计算模块,用于将所述第一目标价值减去所述第二目标价值,获得价值差;
价值差处理模块,用于将所述价值差的平方乘以预设的系数,得到损失值。
在本申请的一个实施例中,所述几何条件推论模块604包括:
接口查询模块,用于查询目标接口,所述目标接口为用于实现当前所述几何定理的逻辑代码封装的应用程序编程接口;
目标条件选择模块,用于从当前所述几何条件中选择适用于所述几何定理的所述几何条件,作为目标条件;
推论请求打包模块，用于将所述目标条件打包至推论请求中；
接口调用模块,用于将所述推论请求发送至所述目标接口,以调用所述逻辑代码按照所述几何定理对所述目标条件进行运算并返回新的几何条件。
在本申请的一个实施例中,所述答案确定模块605包括:
答案判断模块,用于判断新的所述几何条件是否为所述几何问题的答案;若是,则调用几何条件输出模块,若否,则返回调用所述几何定理学习模块603与所述几何条件推论模块604;
几何条件输出模块,用于确定迭代所述强化学习模型结束,将新的所述几何条件作为所述几何问题的答案输出。
在本申请的一个实施例中,所述答案发送模块606包括:
推导信息封装模块,用于将所述答案封装至推导信息中,所述推导信息为按照迭代所述强化学习模型的顺序依次显示将当前所述几何定理应用于当前所述几何条件推论新的几何条件,直至得到所述答案的过程;
推导信息发送模块,用于将所述推导信息发送至所述客户端、以将所述几何习题与所述推导信息关联显示。
本申请实施例所提供的几何解题装置可执行本申请任意实施例所提供的几何解题方法,具备执行几何解题方法相应的功能模块和有益效果。
实施例五
图7为本申请实施例五提供的一种几何解题装置的结构示意图。如图7所示,该装置应用于服务端,包括:
电子习题查找模块701,用于从题库中查找学科属于几何的电子习题;
习题信息提取模块702,用于若所述电子习题缺乏答案和/或推导过程,则从所述电子习题中提取已知的几何条件与待解答的几何问题;
几何定理学习模块703,用于在每一次迭代强化学习模型时,将当前所述几何条件输入所述强化学习模型中进行学习,得到与当前所述几何条件适配的几何定理;
几何条件推论模块704,用于将当前所述几何定理应用于当前所述几何条件推论新的几何条件;
答案确定模块705，用于当迭代所述强化学习模型结束时，将新的所述几何条件作为所述几何问题的答案；
习题存储模块706,用于在所述题库中存储所述电子习题与推导信息之间的映射关系,所述推导信息为按照迭代所述强化学习模型的顺序依次显示将当前所述几何定理应用于当前所述几何条件推论新的几何条件,直至得到所述答案的过程。
在本申请的一个实施例中,所述电子习题具有第一文本信息;
所述习题信息提取模块702包括:
正则表达式确定模块,用于确定正则表达式,所述正则表达式用于描述已知的几何条件或待解答的几何问题的匹配模式;
正则表达式匹配模块,用于将所述正则表达式与所述第一文本信息进行匹配;
匹配成功确定模块,用于确定匹配成功的所述第一文本信息为已知的几何条件或待解答的几何问题。
在本申请的另一个实施例中,所述电子习题具有图像数据;
所述习题信息提取模块702包括:
目标检测模块,用于以符号、数字、字母中的至少一者作为目标,在所述图像数据中进行检测;
光学字符识别模块，用于若检测到所述符号、所述数字、所述字母中的至少一者所处的区域，则对所述区域进行光学字符识别，得到第二文本信息；
几何图形识别模块,用于在所述图像数据中识别几何图形;
文本赋予模块,用于将所述第二文本信息赋予所述几何图形,得到已知的几何条件。
在本申请的一个实施例中,所述几何定理学习模块703包括:
强化学习模块设定模块,用于在每一次迭代强化学习模型时,分别设定当前所述几何条件为所述强化学习模型中环境的状态、设定所述几何定理为强化学习模型中的动作;
强化学习模块执行模块,用于执行所述强化学习模型,学习所述几何条件应用所有所述几何定理的价值,作为第一目标价值;
几何定理选择模块,用于按照所述第一目标价值选择与当前所述几何条件适配的所述几何定理。
在本申请的一个实施例中,所述几何定理学习模块703还包括:
目标计算模块,用于计算在当前所述几何条件下,作为时间差分的目标的第二目标价值;
损失值计算模块,用于计算所述第一目标价值与所述第二目标价值之间的差异,作为损失值;
强化学习模型更新模块,用于按照所述损失值更新所述强化学习模型。
在本申请的一个实施例中,所述目标计算模块包括:
激励确定模块,用于确定在当前时刻的所述几何条件下、对所述几何定理的激励;
第一候选价值计算模块,用于执行所述强化学习模型,学习在下一时刻的所述几何条件下应用所有所述几何定理的价值,作为第一候选价值;
第二候选价值计算模块,用于对所有所述第一候选价值中的最大值进行衰减,获得第二候选价值;
目标价值计算模块,用于计算所述激励与所述第二候选价值之间的和值,作为时间差分的目标的第二目标价值。
在本申请的一个实施例中,所述激励确定模块包括:
第一值确定模块,用于在当前时刻的所述几何条件下,若所述几何信息为所述几何问题的答案,则确定对所述几何定理的激励为第一值;
第二值确定模块,用于在当前时刻的所述几何条件下,若所述几何信息为新的、已知的几何条件,则确定对所述几何定理的激励为第二值;
第三值确定模块,用于在当前时刻的所述几何条件下,若所述几何信息为除所述几何问题的答案与新的已知的几何条件之外的其他信息,则确定对所述几何定理的激励为第三值;
其中,所述第一值大于所述第二值,所述第二值大于所述第三值。
在本申请的一个实施例中,所述损失值计算模块包括:
价值差计算模块,用于将所述第一目标价值减去所述第二目标价值,获得价值差;
价值差处理模块,用于将所述价值差的平方乘以预设的系数,得到损失值。
在本申请的一个实施例中,所述几何条件推论模块704包括:
接口查询模块,用于查询目标接口,所述目标接口为用于实现当前所述几何定理的逻辑代码封装的应用程序编程接口;
目标条件选择模块,用于从当前所述几何条件中选择适用于所述几何定理的所述几何条件,作为目标条件;
推论请求打包模块，用于将所述目标条件打包至推论请求中；
接口调用模块,用于将所述推论请求发送至所述目标接口,以调用所述逻辑代码按照所述几何定理对所述目标条件进行运算并返回新的几何条件。
在本申请的一个实施例中,所述答案确定模块705包括:
答案判断模块,用于判断新的所述几何条件是否为所述几何问题的答案;若是,则调用几何条件输出模块,若否,则返回调用所述几何定理学习模块703与所述几何条件推论模块704;
几何条件输出模块,用于确定迭代所述强化学习模型结束,将新的所述几何条件作为所述几何问题的答案输出。
本申请实施例所提供的几何解题装置可执行本申请任意实施例所提供的几何解题方法,具备执行几何解题方法相应的功能模块和有益效果。
实施例六
图8示出了可以用来实施本申请的实施例的电子设备10的结构示意图。电子设备旨在表示各种形式的数字计算机,诸如,膝上型计算机、台式计算机、工作台、个人数字助理、服务器、刀片式服务器、大型计算机、和其它适合的计算机。电子设备还可以表示各种形式的移动装置,诸如,个人数字处理、蜂窝电话、智能电话、可穿戴设备(如头盔、眼镜、手表等)和其它类似的计算装置。本文所示的部件、它们的连接和关系、以及它们的功能仅仅作为示例,并且不意在限制本文中描述的和/或者要求的本申请的实现。
如图8所示,电子设备10包括至少一个处理器11,以及与至少一个处理器11通信连接的存储器,如只读存储器(ROM)12、随机访问存储器(RAM) 13等,其中,存储器存储有可被至少一个处理器执行的计算机程序,处理器11可以根据存储在只读存储器(ROM)12中的计算机程序或者从存储单元18加载到随机访问存储器(RAM)13中的计算机程序,来执行各种适当的动作和处理。在RAM 13中,还可存储电子设备10操作所需的各种程序和数据。处理器11、ROM 12以及RAM 13通过总线14彼此相连。输入/输出(I/O)接口15也连接至总线14。
电子设备10中的多个部件连接至I/O接口15,包括:输入单元16,例如键盘、鼠标等;输出单元17,例如各种类型的显示器、扬声器等;存储单元18,例如磁盘、光盘等;以及通信单元19,例如网卡、调制解调器、无线通信收发机等。通信单元19允许电子设备10通过诸如因特网的计算机网络和/或各种电信网络与其他设备交换信息/数据。
处理器11可以是各种具有处理和计算能力的通用和/或专用处理组件。处理器11的一些示例包括但不限于中央处理单元(CPU)、图形处理单元(GPU)、各种专用的人工智能(AI)计算芯片、各种运行机器学习模型算法的处理器、数字信号处理器(DSP)、以及任何适当的处理器、控制器、微控制器等。处理器11执行上文所描述的各个方法和处理,例如几何解题方法。
在一些实施例中,几何解题方法可被实现为计算机程序,其被有形地包含于计算机可读存储介质,例如存储单元18。在一些实施例中,计算机程序的部分或者全部可以经由ROM 12和/或通信单元19而被载入和/或安装到电子设备10上。当计算机程序加载到RAM 13并由处理器11执行时,可以执行上文描述的几何解题方法的一个或多个步骤。备选地,在其他实施例中,处理器11可以通过其他任何适当的方式(例如,借助于固件)而被配置为执行几何解题方法。
本文中以上描述的系统和技术的各种实施方式可以在数字电子电路系统、集成电路系统、场可编程门阵列(FPGA)、专用集成电路(ASIC)、专用标准产品(ASSP)、芯片上系统的系统(SOC)、负载可编程逻辑设备(CPLD)、计算机硬件、固件、软件、和/或它们的组合中实现。这些各种实施方式可以包括:实施在一个或者多个计算机程序中,该一个或者多个计算机程序可在包括至少一个可编程处理器的可编程系统上执行和/或解释,该可编程处理器可以是专用或者通用可编程处理器,可以从存储系统、至少一个输入装置、和至少一个输出装置接收数据和指令,并且将数据和指令传输至该存储系统、该至少一个输入装置、和该至少一个输出装置。
用于实施本申请的方法的计算机程序可以采用一个或多个编程语言的任何组合来编写。这些计算机程序可以提供给通用计算机、专用计算机或其他可编程数据处理装置的处理器，使得计算机程序当由处理器执行时使流程图和/或框图中所规定的功能/操作被实施。计算机程序可以完全在机器上执行、部分地在机器上执行，作为独立软件包部分地在机器上执行且部分地在远程机器上执行或完全在远程机器或服务器上执行。
在本申请的上下文中,计算机可读存储介质可以是有形的介质,其可以包含或存储以供指令执行系统、装置或设备使用或与指令执行系统、装置或设备结合地使用的计算机程序。计算机可读存储介质可以包括但不限于电子的、磁性的、光学的、电磁的、红外的、或半导体系统、装置或设备,或者上述内容的任何合适组合。备选地,计算机可读存储介质可以是机器可读信号介质。机器可读存储介质的更具体示例会包括基于一个或多个线的电气连接、便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM或快闪存储器)、光纤、便捷式紧凑盘只读存储器(CD-ROM)、光学储存设备、磁储存设备、或上述内容的任何合适组合。
为了提供与用户的交互，可以在电子设备上实施此处描述的系统和技术，该电子设备具有：用于向用户显示信息的显示装置(例如，CRT(阴极射线管)或者LCD(液晶显示器)监视器)；以及键盘和指向装置(例如，鼠标或者轨迹球)，用户可以通过该键盘和该指向装置来将输入提供给电子设备。其它种类的装置还可以用于提供与用户的交互；例如，提供给用户的反馈可以是任何形式的传感反馈(例如，视觉反馈、听觉反馈、或者触觉反馈)；并且可以用任何形式(包括声输入、语音输入或者触觉输入)来接收来自用户的输入。
可以将此处描述的系统和技术实施在包括后台部件的计算系统(例如,作为数据服务器)、或者包括中间件部件的计算系统(例如,应用服务器)、或者包括前端部件的计算系统(例如,具有图形用户界面或者网络浏览器的用户计算机,用户可以通过该图形用户界面或者该网络浏览器来与此处描述的系统和技术的实施方式交互)、或者包括这种后台部件、中间件部件、或者前端部件的任何组合的计算系统中。可以通过任何形式或者介质的数字数据通信(例如,通信网络)来将系统的部件相互连接。通信网络的示例包括:局域网(LAN)、广域网(WAN)、区块链网络和互联网。
计算系统可以包括客户端和服务器。客户端和服务器一般远离彼此并且通常通过通信网络进行交互。通过在相应的计算机上运行并且彼此具有客户端-服务器关系的计算机程序来产生客户端和服务器的关系。服务器可以是云服务器，又称为云计算服务器或云主机，是云计算服务体系中的一项主机产品，以解决传统物理主机与VPS服务中存在的管理难度大、业务扩展性弱的缺陷。
实施例七
本申请实施例还提供了一种计算机程序产品,该计算机程序产品包括计算机程序,该计算机程序在被处理器执行时实现如本申请任一实施例所提供的几何解题方法。
计算机程序产品在实现的过程中，可以以一种或多种程序设计语言或其组合来编写用于执行本申请操作的计算机程序代码，程序设计语言包括面向对象的程序设计语言，诸如Java、Smalltalk、C++，还包括常规的过程式程序设计语言，诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中，远程计算机可以通过任意种类的网络(包括局域网(LAN)或广域网(WAN))连接到用户计算机，或者，可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。
应该理解,可以使用上面所示的各种形式的流程,重新排序、增加或删除步骤。例如,本申请中记载的各步骤可以并行地执行也可以顺序地执行也可以不同的次序执行,只要能够实现本申请的技术方案所期望的结果,本文在此不进行限制。

Claims (17)

  1. 一种几何解题方法,应用于服务端,包括:
    接收客户端针对学科属于几何的电子习题发送的解题请求;
    响应于所述解题请求,从所述电子习题中提取已知的几何条件与待解答的几何问题;
    在每一次迭代强化学习模型时,将当前所述几何条件输入所述强化学习模型中进行学习,得到与当前所述几何条件适配的几何定理;
    将当前所述几何定理应用于当前所述几何条件推论新的几何条件;
    当迭代所述强化学习模型结束时,确定新的所述几何条件为所述几何问题的答案;
    将所述答案发送至所述客户端、以将所述几何习题与所述答案关联显示。
  2. 根据权利要求1所述的方法，其中，所述接收客户端针对学科属于几何的电子习题发送的解题请求，包括：
    接收到客户端发送的解题请求;
    从所述解题请求中提取图像数据;
    在所述图像数据中读取学科属于几何的电子习题。
  3. 根据权利要求1所述的方法,其中,所述电子习题具有第一文本信息;
    所述从所述电子习题中提取已知的几何条件与待解答的几何问题,包括:
    确定正则表达式,所述正则表达式用于描述已知的几何条件或待解答的几何问题的匹配模式;
    将所述正则表达式与所述第一文本信息进行匹配;
    确定匹配成功的所述第一文本信息为已知的几何条件或待解答的几何问题。
  4. 根据权利要求1所述的方法,其中,所述电子习题具有图像数据;
    所述从所述电子习题中提取已知的几何条件与待解答的几何问题,包括:
    以符号、数字、字母中的至少一者作为目标,在所述图像数据中进行检测;
    若检测到所述符号、所述数字、所述字母中的至少一者所处的区域，则对所述区域进行光学字符识别，得到第二文本信息；
    在所述图像数据中识别几何图形;
    将所述第二文本信息赋予所述几何图形,得到已知的几何条件。
  5. 根据权利要求1所述的方法,其中,所述在每一次迭代强化学习模型时,将当前所述几何条件输入所述强化学习模型中进行学习,得到与当前所述几何条件适配的几何定理,包括:
    在每一次迭代强化学习模型时,分别设定当前所述几何条件为所述强化学习模型中环境的状态、设定所述几何定理为强化学习模型中的动作;
    执行所述强化学习模型,学习所述几何条件应用所有所述几何定理的价值,作为第一目标价值;
    按照所述第一目标价值选择与当前所述几何条件适配的所述几何定理。
  6. 根据权利要求5所述的方法,其中,所述在每一次迭代强化学习模型时,将当前所述几何条件输入所述强化学习模型中进行学习,得到与当前所述几何条件适配的几何定理,还包括:
    计算在当前所述几何条件下,作为时间差分的目标的第二目标价值;
    计算所述第一目标价值与所述第二目标价值之间的差异,作为损失值;
    按照所述损失值更新所述强化学习模型。
  7. 根据权利要求6所述的方法,其中,所述计算在当前所述几何条件下,作为时间差分的目标的第二目标价值,包括:
    确定在当前时刻的所述几何条件下、对所述几何定理的激励;
    执行所述强化学习模型,学习在下一时刻的所述几何条件下应用所有所述几何定理的价值,作为第一候选价值;
    对所有所述第一候选价值中的最大值进行衰减,获得第二候选价值;
    计算所述激励与所述第二候选价值之间的和值,作为时间差分的目标的第二目标价值。
  8. 根据权利要求7所述的方法,其中,所述确定在当前时刻的所述几何条件下、对所述几何定理的激励,包括:
    在当前时刻的所述几何条件下,若所述几何信息为所述几何问题的答案,则确定对所述几何定理的激励为第一值;
    在当前时刻的所述几何条件下,若所述几何信息为新的、已知的几何条件,则确定对所述几何定理的激励为第二值;
    在当前时刻的所述几何条件下,若所述几何信息为除所述几何问题的答案与新的已知的几何条件之外的其他信息,则确定对所述几何定理的激励为第三值;
    其中,所述第一值大于所述第二值,所述第二值大于所述第三值。
  9. 根据权利要求6所述的方法,其中,所述计算所述第一目标价值与所述第二目标价值之间的差异,作为损失值,包括:
    将所述第一目标价值减去所述第二目标价值,获得价值差;
    将所述价值差的平方乘以预设的系数,得到损失值。
  10. 根据权利要求1-9中任一项所述的方法,其中,所述将当前所述几何定理应用于当前所述几何条件推论新的几何条件,包括:
    查询目标接口,所述目标接口为用于实现当前所述几何定理的逻辑代码封装的应用程序编程接口;
    从当前所述几何条件中选择适用于所述几何定理的所述几何条件,作为目标条件;
    将所述目标条件打包至推论请求中；
    将所述推论请求发送至所述目标接口,以调用所述逻辑代码按照所述几何定理对所述目标条件进行运算并返回新的几何条件。
  11. 根据权利要求1-9中任一项所述的方法，其中，所述当迭代所述强化学习模型结束时，确定新的所述几何条件为所述几何问题的答案，包括：
    判断新的所述几何条件是否为所述几何问题的答案;
    若是,则确定迭代所述强化学习模型结束,将新的所述几何条件作为所述几何问题的答案输出;
    若否,则返回执行所述在每一次迭代强化学习模型时,将当前所述几何条件输入所述强化学习模型中进行学习,得到与当前所述几何条件适配的几何定理、所述将当前所述几何定理应用于当前所述几何条件推论新的几何条件。
  12. 根据权利要求1-9中任一项所述的方法,其中,所述将所述答案发送至所述客户端、以将所述几何习题与所述答案关联显示,包括:
    将所述答案封装至推导信息中,所述推导信息为按照迭代所述强化学习模型的顺序依次显示将当前所述几何定理应用于当前所述几何条件推论新的几何条件,直至得到所述答案的过程;
    将所述推导信息发送至所述客户端、以将所述几何习题与所述推导信息关联显示。
  13. 一种几何解题方法,应用于服务端,包括:
    从题库中查找学科属于几何的电子习题;
    若所述电子习题缺乏答案和/或推导过程,则从所述电子习题中提取已知的几何条件与待解答的几何问题;
    在每一次迭代强化学习模型时,将当前所述几何条件输入所述强化学习模型中进行学习,得到与当前所述几何条件适配的几何定理;
    将当前所述几何定理应用于当前所述几何条件推论新的几何条件;
    当迭代所述强化学习模型结束时，将新的所述几何条件作为所述几何问题的答案；
    在所述题库中存储所述电子习题与推导信息之间的映射关系,所述推导信息为按照迭代所述强化学习模型的顺序依次显示将当前所述几何定理应用于当前所述几何条件推论新的几何条件,直至得到所述答案的过程。
  14. 一种几何解题装置,应用于服务端,包括:
    解题请求接收模块,用于接收客户端针对学科属于几何的电子习题发送的解题请求;
    习题信息提取模块,用于响应于所述解题请求,从所述电子习题中提取已知的几何条件与待解答的几何问题;
    几何定理学习模块,用于在每一次迭代强化学习模型时,将当前所述几何条件输入所述强化学习模型中进行学习,得到与当前所述几何条件适配的几何定理;
    几何条件推论模块,用于将当前所述几何定理应用于当前所述几何条件推论新的几何条件;
    答案确定模块,用于当迭代所述强化学习模型结束时,确定新的所述几何条件为所述几何问题的答案;
    答案发送模块,用于将所述答案发送至所述客户端、以将所述几何习题与所述答案关联显示。
  15. 一种几何解题装置,应用于服务端,包括:
    电子习题查找模块,用于从题库中查找学科属于几何的电子习题;
    习题信息提取模块,用于若所述电子习题缺乏答案和/或推导过程,则从所述电子习题中提取已知的几何条件与待解答的几何问题;
几何定理学习模块，用于在每一次迭代强化学习模型时，将当前所述几何条件输入所述强化学习模型中进行学习，得到与当前所述几何条件适配的几何定理；
    几何条件推论模块,用于将当前所述几何定理应用于当前所述几何条件推论新的几何条件;
答案确定模块，用于当迭代所述强化学习模型结束时，将新的所述几何条件作为所述几何问题的答案；
    习题存储模块,用于在所述题库中存储所述电子习题与推导信息之间的映射关系,所述推导信息为按照迭代所述强化学习模型的顺序依次显示将当前所述几何定理应用于当前所述几何条件推论新的几何条件,直至得到所述答案的过程。
  16. 一种电子设备,包括:
    至少一个处理器;以及
    与所述至少一个处理器通信连接的存储器;其中,
    所述存储器存储有可被所述至少一个处理器执行的计算机程序,所述计算机程序被所述至少一个处理器执行,以使所述至少一个处理器能够执行权利要求1-13中任一项所述的几何解题方法。
  17. 一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序用于使处理器执行时实现权利要求1-13中任一项所述的几何解题方法。
PCT/CN2022/130858 2022-11-09 2022-11-09 一种几何解题方法、装置、设备及存储介质 WO2024098282A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/130858 WO2024098282A1 (zh) 2022-11-09 2022-11-09 一种几何解题方法、装置、设备及存储介质


Publications (1)

Publication Number Publication Date
WO2024098282A1 true WO2024098282A1 (zh) 2024-05-16

Family

ID=91031621


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060003303A1 (en) * 2004-06-30 2006-01-05 Educational Testing Service Method and system for calibrating evidence models
CN103473224A (zh) * 2013-09-30 2013-12-25 成都景弘智能科技有限公司 基于问题求解过程的习题语义化方法
CN107038146A (zh) * 2017-05-04 2017-08-11 电子科技大学 函数分支处理方法及装置
CN109657046A (zh) * 2018-12-24 2019-04-19 上海仁静信息技术有限公司 内容分析处理方法、装置、电子设备及存储介质

