Disclosure of Invention
The invention aims to provide a multi-character structure self-adaptive input method and a layout generation method thereof, which extract the necessary character structures involved in answer data and adaptively generate an input method containing both the necessary character structures and the category character structures. A user can therefore input answers containing various character structures quickly and accurately without repeatedly switching the input method, which greatly improves input efficiency.
In order to achieve the above object, the present technical solution provides a layout generating method for a multi-character structure adaptive input method, including the following steps: generating an input method page basic template, wherein character placeholders are preset in the input method page basic template; extracting necessary character structures in the target answer data; acquiring a category character structure associated with the essential character structure; and filling the category character structure and the necessary character structure into the preset character placeholder of the basic template of the input method page.
In some embodiments, lexical analysis is performed on the target answer data according to a parsing rule to obtain a syntax tree, and each node and leaf in the syntax tree are extracted and de-duplicated to obtain a necessary character structure.
In some embodiments, the necessary character structure is obtained by removing the digits and letters from the character structures extracted from the syntax tree.
In some embodiments, the basic template of the input method page comprises at least one common input keyboard, and the characters extracted from the syntax tree that belong to the input character category of the common input keyboard are removed to obtain the necessary character structure.
In some embodiments, the degree of association between the category character structure and the essential character structure is proportional to the probability that the category character structure and the essential character structure occur simultaneously in question answers.
In some embodiments, the step of obtaining the category character structure is as follows: obtaining a sample answer data set formed by at least two sample answer data, extracting alternative characters of the sample answer data, and forming an alternative character set; selecting characters in the alternative character set to calculate PMI values pairwise, generating a PMI matrix, calculating the similarity between the characters according to the PMI matrix, and selecting the characters with high similarity with the necessary character structure as the category character structure.
According to another aspect of the present invention, there is provided a multi-character structure adaptive input method, which is obtained based on the multi-character structure adaptive input method layout generating method of any one of claims 1 to 6.
In some embodiments, the method is suitable for inputting a teaching question answer, wherein the teaching question answer comprises multi-category character structures.
According to another aspect of the present invention, there is provided an electronic apparatus including: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the multi-character structure adaptive input method of any of claims 1-6.
According to another aspect of the present invention, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the multi-character structure adaptive input method of any one of claims 1 to 8.
Compared with the prior art, the technical solution has the following characteristics and beneficial effects. The multi-character structure self-adaptive input method layout generation method is particularly suitable for generating input method layouts in teaching software products. By searching for matching category character structures, it helps the user correct error-prone answers, and it prevents the adaptively generated input method from leaking question answer information. The self-adaptive input method can analyze and extract question answer data of any previously unknown structure and adaptively generates a dedicated input method for each question, so the user can input the entire answer on one input method page without switching the input method keyboard, which improves both input efficiency and user experience. Compared with existing approaches, the self-adaptive keyboard layout differs from question to question and covers essentially all of the user's input needs, so the user can input answers quickly and accurately and answers questions more efficiently.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
The invention provides a multi-character structure self-adaptive input method and a layout generation method thereof, which are particularly suitable for education product software.
The layout generation method of the multi-character structure self-adaptive input method comprises the following steps:
step S1: generating an input method page basic template, wherein character placeholders are preset in the input method page basic template;
step S2: extracting necessary character structures in the target answer data, wherein the necessary character structures are the character structures required to input the target answer;
step S3: acquiring a category character structure associated with the essential character structure;
step S4: and filling the category character structure and the necessary character structure into a preset character placeholder of the basic template of the input method page to form the multi-character structure self-adaptive input method.
In step S1, the basic template of the input method page includes at least one common input keyboard and preset character placeholders, wherein the common input keyboard is selected from a numeric input keyboard, a Chinese input keyboard, an English input keyboard, and a handwriting input keyboard. In the embodiment of the scheme, the input method page basic template comprises a plurality of common input keyboards that can be switched between, together with preset character placeholders, wherein the preset character placeholders are arranged on the pages corresponding to the common input keyboards.
For example, as shown in FIG. 1, the input method page basic template provided by the scheme includes a selectable numeric input keyboard, Chinese input keyboard, English input keyboard, and handwriting input keyboard. Selecting an input keyboard option switches to the corresponding common input keyboard page. Each common input keyboard page is provided with a plurality of preset character placeholders, displayed as dashed boxes, which are arranged at the sides of the common input keyboard so that they do not affect its normal display.
The method further includes, before step S2, judging whether the target answer data conforms to the LaTeX specification. If it does, step S2 is executed and the target answer data is processed as described in this embodiment. If it does not, one of the following processing methods may be used:
1. Ignore target answer data that does not conform to the LaTeX specification and process only conforming data. In this case, generation of the self-adaptive input method stops once the target answer data is judged non-conforming.
2. Generate a fixed keyboard if the target answer data is judged not to conform to the LaTeX specification, and execute step S2 if it conforms.
3. Convert non-conforming target answer data into HTML format, and then convert the HTML into data conforming to the LaTeX specification with a third-party tool, for example htmltolatex.
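The branching described above can be sketched as follows. This is only an illustration: the text does not say how conformance to the LaTeX specification is judged, so the balanced-delimiter heuristic `looks_like_latex`, the `Fallback` enum, and the action names returned by `next_action` are all assumptions introduced here.

```python
from enum import Enum


class Fallback(Enum):
    SKIP = "skip"              # option 1: stop generating the adaptive input method
    FIXED_KEYBOARD = "fixed"   # option 2: show a fixed keyboard instead
    CONVERT = "convert"        # option 3: convert via HTML to LaTeX (tool not modelled here)


PAIRS = {"(": ")", "[": "]", "{": "}"}


def looks_like_latex(text: str) -> bool:
    """Assumed stand-in for the conformance check: all delimiters must be balanced."""
    stack = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it introduces, e.g. \{ or \f
            continue
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif ch in PAIRS.values():
            if not stack or stack.pop() != ch:
                return False
        i += 1
    return not stack


def next_action(answer: str, fallback: Fallback) -> str:
    """Decide what to do with the target answer data before step S2."""
    if looks_like_latex(answer):
        return "extract"  # proceed to step S2
    if fallback is Fallback.SKIP:
        return "stop"
    if fallback is Fallback.FIXED_KEYBOARD:
        return "fixed_keyboard"
    return "convert_then_extract"
```

A real deployment would likely replace the heuristic with a proper LaTeX parser; the sketch only shows where the three fallback options plug in.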
In step S2, the necessary character structures in the target answer data are extracted as follows: lexical analysis is performed on the target answer data according to a parsing rule to obtain a syntax tree, and each node and leaf in the syntax tree is extracted and de-duplicated to obtain the necessary character structures, wherein the target answer data conforms to the LaTeX specification.
Specifically, the target answer data is scanned from left to right and the characters are processed with the following set rules to obtain the syntax tree:
1. If the character \ is scanned, the consecutive letters that follow are matched with a regular expression and the result is treated as one formula structure, such as \frac.
2. When an opening symbol such as (, [ or { is scanned, it is paired with the corresponding closing symbol ), ] or }, and all enclosed content is recursively analyzed as a whole.
3. If a term consisting of several characters appears during scanning, it is split into individual characters. The finest granularity of the syntax tree is a single character, such as a digit or a letter.
For example, if the target answer data is the formula
$$\frac{\alpha}{x^{2}-1}$$
whose corresponding LaTeX data is "\frac{\alpha}{x^{2}-1}", lexical analysis with the above rules yields the syntax tree shown in FIG. 2, and the character structures extracted from the syntax tree are "\frac, \alpha, -, ^, 1, x, 2".
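The three scanning rules can be sketched as a small recursive parser. The representation below (strings for leaves, `(opener, children)` tuples for bracketed groups) and the function names `parse` and `extract` are assumptions made for illustration, as is the choice to treat braces as pure grouping while reporting round and square brackets as a visible "()" or "[]" structure.

```python
import re

COMMAND = re.compile(r"\\[A-Za-z]+")   # rule 1: backslash followed by letters
CLOSERS = {"(": ")", "[": "]", "{": "}"}


def parse(src, pos=0, closer=None):
    """Scan left to right; return (children, next_pos)."""
    children = []
    while pos < len(src):
        ch = src[pos]
        if ch == closer:
            return children, pos + 1
        if ch.isspace():
            pos += 1
        elif ch == "\\":
            match = COMMAND.match(src, pos)
            if match is None:
                raise ValueError(f"unsupported escape at position {pos}")
            children.append(match.group())
            pos = match.end()
        elif ch in CLOSERS:
            # rule 2: parse everything up to the matching closer as one unit
            inner, pos = parse(src, pos + 1, CLOSERS[ch])
            children.append((ch, inner))
        else:
            # rule 3: multi-character terms are split into single characters
            children.append(ch)
            pos += 1
    if closer is not None:
        raise ValueError(f"missing closing {closer!r}")
    return children, pos


def extract(children, seen=None):
    """Collect every node and leaf once, in order of first appearance."""
    seen = {} if seen is None else seen   # dict keys de-duplicate and keep order
    for node in children:
        if isinstance(node, tuple):
            opener, inner = node
            if opener != "{":             # assumption: braces are grouping only
                seen.setdefault(opener + CLOSERS[opener])
            extract(inner, seen)
        else:
            seen.setdefault(node)
    return list(seen)


tree, _ = parse(r"\frac{\alpha}{x^{2}-1}")
structures = extract(tree)
```

For the example formula, `structures` contains the same seven items as the list in the text, although in scan order rather than the order printed there.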
Preferably, in step S2, to avoid implicitly revealing the correct answer through the generated necessary character structures, the character structures extracted from the syntax tree are filtered to obtain the necessary character structures. The filtering rule may be chosen as follows: remove the input character categories already covered by the common input keyboards of the input method page basic template mentioned in step S1; for example, if the common input keyboards include a numeric input keyboard, the digits are removed from the character structures extracted from the syntax tree. Alternatively, both digits and letters are removed from the character structures to avoid leaking the answer.
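A minimal sketch of this filtering step, assuming the extracted character structures are plain strings and that the common input keyboards are identified by the hypothetical labels "numeric" and "english":

```python
def filter_structures(structures, keyboards):
    """Drop characters that a common input keyboard already provides."""
    kept = []
    for item in structures:
        single = len(item) == 1
        if "numeric" in keyboards and single and item.isdigit():
            continue  # digits are available on the numeric keyboard
        if "english" in keyboards and single and item.isascii() and item.isalpha():
            continue  # letters are available on the English keyboard
        kept.append(item)
    return kept


extracted = ["\\frac", "\\alpha", "x", "^", "2", "-", "1"]
necessary = filter_structures(extracted, {"numeric", "english"})
```

LaTeX commands such as \alpha are longer than one character, so they survive the letter filter even though they are spelled with letters.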
In step S3, category character structures of the same kind as the necessary character structures are extracted from the alternative character set based on the necessary character structures, where the category character structures are highly correlated with the necessary character structures. In this scheme, the correlation is judged by whether the probability that the category character structure and the necessary character structure occur simultaneously in question answers is high.
That is, the degree of association between the category character structure and the necessary character structure is proportional to the probability that the two occur simultaneously in question answers.
The selection of category character structures in this step is based mainly on the application scenario of the scheme. The multi-character structure self-adaptive input method layout generation method is particularly suitable for designing keyboards for question answers containing complex symbols in teaching product software, where the similarity between possible answers directly affects the user's answer accuracy. In other words, the higher the probability that two characters appear together in question answers, the more closely they are related, and the more easily the user confuses them.
Specifically, step S3 further includes the steps of:
S31: acquiring a sample answer data set consisting of at least two sample answer data, and extracting an alternative character set consisting of alternative characters of the sample answer data;
S32: calculating PMI values of the characters in the alternative character set pairwise to generate a PMI matrix, and calculating the similarity between the characters according to the PMI matrix;
S33: selecting the characters with high similarity to the necessary character structure as the category character structure.
In step S31, the alternative characters are extracted from the sample answer data in the same way as in step S2. The sample answer data corresponds to a sample question set, and preferably the sample question set includes the target question corresponding to the target answer data, so that the sample answer data and the target answer data are more comparable.
In step S32, the matrix values in the PMI matrix are calculated as:
$$\mathrm{PMI}(C_i, C_j) = \log \frac{P(C_i, C_j)}{P(C_i)\,P(C_j)}$$
where $C_i$ and $C_j$ represent two different characters, $P(C)$ represents the probability of a character appearing in the sample answer data, $P(C_i, C_j)$ represents the probability of the two characters appearing simultaneously, and $\mathrm{PMI}(C_i, C_j)$ is the pointwise mutual information of the two characters. A negative value indicates that the two characters are mutually exclusive, a value of 0 indicates that they are independent, and a positive value indicates that they are correlated; the larger the positive value, the stronger the correlation.
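The PMI matrix can be sketched as follows. Several choices here are assumptions rather than part of the described method: probabilities are estimated as the fraction of sample answers that contain a character (or both characters), the logarithm is base 2, and both the diagonal and pairs that never co-occur are stored as 0.0 so that the matrix stays finite. The function name `pmi_matrix` is hypothetical.

```python
import math
from itertools import combinations


def pmi_matrix(samples):
    """samples: list of sets, each holding the characters of one sample answer."""
    chars = sorted(set().union(*samples))
    total = len(samples)
    count = {c: sum(c in s for s in samples) for c in chars}
    index = {c: k for k, c in enumerate(chars)}
    matrix = [[0.0] * len(chars) for _ in chars]
    for a, b in combinations(chars, 2):
        joint = sum(a in s and b in s for s in samples)
        if joint:  # assumption: pairs that never co-occur keep the value 0.0
            p_joint = joint / total
            p_a, p_b = count[a] / total, count[b] / total
            value = math.log2(p_joint / (p_a * p_b))
            matrix[index[a]][index[b]] = value
            matrix[index[b]][index[a]] = value
    return chars, matrix


samples = [{"+", "-"}, {"+", "-"}, {"×"}, {"×", "÷"}]
chars, matrix = pmi_matrix(samples)
```

In this toy data set "+" and "-" always appear together, so their PMI is positive, while "+" and "×" never co-occur and keep the placeholder value 0.0.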
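The similarity between the characters is calculated as the cosine similarity of the PMI vectors (the rows of the PMI matrix) corresponding to the two characters:
$$\mathrm{sim}(C_i, C_j) = \frac{\sum_{k=1}^{n} \mathrm{PMI}(C_i, C_k)\,\mathrm{PMI}(C_j, C_k)}{\sqrt{\sum_{k=1}^{n} \mathrm{PMI}(C_i, C_k)^2}\,\sqrt{\sum_{k=1}^{n} \mathrm{PMI}(C_j, C_k)^2}}$$
wherein n is the total number of characters in the alternative character set, $0 < i \le n$, and $0 < j \le n$.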
In step S3, the number of category character structures selected is determined by the number of preset character placeholders.
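Steps S32 and S33 can be illustrated with a short sketch that ranks candidate characters by the cosine similarity of their PMI rows and keeps as many as there are free placeholders. The toy matrix values, the function names, and the rule of scoring a candidate by its best similarity to any necessary character are assumptions for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pick_category(chars, matrix, essential, slots):
    """Return up to `slots` characters most similar to the necessary characters."""
    index = {c: k for k, c in enumerate(chars)}
    anchors = [matrix[index[e]] for e in essential if e in index]
    scores = {
        c: max((cosine(matrix[index[c]], a) for a in anchors), default=0.0)
        for c in chars
        if c not in essential
    }
    return sorted(scores, key=scores.get, reverse=True)[:slots]


# Toy symmetric PMI matrix (made-up values) over four candidate characters.
CHARS = ["+", "-", "×", "α"]
PMI = [
    [2.0, 1.0, 0.5, 0.0],
    [1.0, 2.0, 0.5, 0.0],
    [0.5, 0.5, 2.0, 0.0],
    [0.0, 0.0, 0.0, 2.0],
]
category = pick_category(CHARS, PMI, essential={"+"}, slots=2)
```

With these values "-" ranks above "×", and "α" is left out because only two placeholders are available.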
In step S4, the category character structure and the necessary character structure are converted into a format loadable by the basic template of the input method page, and the converted structures are filled into the preset character placeholders to form the multi-character structure adaptive input method.
The character structures are converted into the loadable format of the basic template of the input method page by designing a transfer function f(φ) → Φ, where Φ includes characters and structures and the transfer function corresponds to the results of lexical analysis, e.g., f(\alpha) = α and f(\frac) = /, where α and / are character structures based on the LaTeX specification.
Illustratively, if the character structures extracted from the syntax tree are "\frac, \alpha, -, ^, 1, x, 2", the corresponding character structures loadable for template display are α and /.
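One simple way to realize such a transfer function is a lookup table. In the sketch below only the two mappings for \alpha and \frac come from the text; the entries for \theta and \pi, the name `to_display`, and the choice to pass single characters through unchanged are assumptions.

```python
DISPLAY = {
    "\\alpha": "α",   # from the text: f(\alpha) = α
    "\\frac": "/",    # from the text: f(\frac) = /
    "\\theta": "θ",   # assumed additional entry
    "\\pi": "π",      # assumed additional entry
}


def to_display(structure: str) -> str:
    """Map a lexical-analysis result to the form shown in a placeholder."""
    if structure in DISPLAY:
        return DISPLAY[structure]
    if structure.startswith("\\"):
        raise KeyError(f"no display form registered for {structure}")
    return structure  # plain characters are displayed as they are


shown = [to_display(s) for s in ["\\frac", "\\alpha", "-"]]
```

Raising on unknown commands, rather than guessing, makes gaps in the table visible while it is being built up.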
The loadable format of the basic template of the input method page is displayed as follows. On the web, if the category character structures and the necessary character structures can be displayed directly through HTML, they are loaded directly; if they cannot, pictures or SVG vector graphics are generated for them through the canvas and loaded into the corresponding preset character placeholders. On Android mobile terminals, the loadable format can be designed as icons, which are loaded directly into the preset character placeholders through a mapping relation.
In some embodiments, the category character structures and the essential character structures are loaded into the input method page base template with set rules including, but not limited to, the following:
1. For specific target answer data, designate specific characters as category character structures and load them into the input method page basic template. This suits questions with error-prone answers, where the designated characters corresponding to those error-prone answers are loaded specially.
2. Set a priority order among different categories of character structures and display them in that order, for example basic operators before relational symbols.
3. Set a priority order among character structures within the same category, for example ÷, ×, -, + for the basic operators.
4. Designate fixed display characters for certain preset character placeholders, for example the first four placeholders on the right side of the layout template always display the basic operators in the priority order ÷, ×, -, +.
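Rules 2 to 4 can be sketched as a sort with pinned leading entries. The category table, the constant names, and the function `fill_placeholders` are hypothetical; only the example priorities (basic operators before relational symbols, and ÷, ×, -, + within the basic operators) are taken from the list above.

```python
CATEGORY_PRIORITY = ["basic_operator", "relational"]    # rule 2
WITHIN_CATEGORY = {"basic_operator": ["÷", "×", "-", "+"]}  # rule 3
CATEGORY_OF = {
    "÷": "basic_operator", "×": "basic_operator",
    "-": "basic_operator", "+": "basic_operator",
    "<": "relational", ">": "relational", "≥": "relational", "≤": "relational",
}


def fill_placeholders(structures, slots, pinned=()):
    """Return the characters to show, one per placeholder, in display order."""

    def rank(item):
        category = CATEGORY_OF.get(item)
        if category in CATEGORY_PRIORITY:
            category_rank = CATEGORY_PRIORITY.index(category)
        else:
            category_rank = len(CATEGORY_PRIORITY)  # unknown categories go last
        order = WITHIN_CATEGORY.get(category, [])
        inner_rank = order.index(item) if item in order else len(order)
        return (category_rank, inner_rank)

    # rule 4: pinned characters always occupy the first placeholders
    rest = [s for s in dict.fromkeys(structures) if s not in pinned]
    layout = list(pinned) + sorted(rest, key=rank)
    return layout[:slots]


layout = fill_placeholders(["<", "+", "α", "÷", "≥"], slots=6,
                           pinned=("÷", "×", "-", "+"))
```

Because Python's sort is stable, characters with equal rank keep the order in which they were supplied, so rule 1 (specially designated characters) could be handled by placing those characters first in `structures` or in `pinned`.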
FIG. 3 shows the finally formed multi-character structure adaptive input method.
For example, the character structures that appear in the multi-character structure adaptive input method for the target answer data are: 2, °C, and, or, %, ( ), [ in ], <, >, =, etc.
It is worth mentioning that the multi-character structure self-adaptive input method layout generation method provided by the scheme is particularly suitable for teaching product software, and more particularly for answer input for mathematics and chemistry questions. Taking mathematics answers as an example, the corresponding character structures are: digits: 0 to 9; English letters: a-z, A-Z; Chinese logic expressions: and, or, etc.; Roman and Greek characters: α, θ, π, γ, etc.; basic operators: =, -, ±, etc.; relational symbols: ≥, =, ≠, ≤, <, etc.
The technical advantages of the multi-character structure self-adaptive input method layout generation method provided by the scheme are as follows. The method is particularly suitable for input method design in educational software products, especially for inputting question answers that contain several types of characters, and it can automatically extract the necessary character structures from any question answer, so it is widely applicable. In addition, the layout generation method is triggered each time a user answers a question, so a dedicated, targeted input method page is provided for each question, which improves both efficiency and accuracy. The multi-character structure self-adaptive input method designed by the scheme does not require switching the input method keyboard; only the character structures in the preset character placeholders are replaced. Because the placeholders are filled with category character structures that have been found to be highly similar to the necessary character structures, the method also provides a degree of targeted training and helps the user correct error-prone answers.
According to another aspect of the present invention, the present invention provides a multi-character structure adaptive input method, which is specifically generated according to a layout generation method of the multi-character structure adaptive input method mentioned above for the input of a teaching question answer, wherein the teaching question answer includes multi-class character structures, which can be a mathematical question, a chemical question, a physical question, and the like, and the multi-character structure adaptive input method is loaded into teaching product software for use, so that the user experience of a user can be greatly improved, and the learning efficiency of the user can be further improved.
The multi-character structure adaptive input method layout generating method according to the present invention is applied to WEB terminals and mobile terminals, wherein the mobile terminals can be implemented in various forms, and in particular, the multi-character structure adaptive input method layout generating method is loaded on an educational software product of the mobile terminal, and the mobile terminal can be, for example, a mobile phone, a notebook computer, a PDA (personal digital assistant), a PAD (tablet computer), etc.
The multi-character structure adaptive input method layout generation method may be implemented as a computer program, and embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the above multi-character structure adaptive input method layout generation method. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section, and/or installed from a removable medium. The computer program performs the above-described functions defined in the system of the present invention when executed by a Central Processing Unit (CPU).
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the mobile terminal described in the above embodiments; or may exist separately and not be assembled into the mobile terminal. The computer readable medium carries one or more computer programs, and when the one or more computer programs are executed by a mobile terminal, the mobile terminal executes the flow steps corresponding to the adaptive input method layout generation method.
The present invention is not limited to the above-mentioned preferred embodiments, and any other products in various forms can be obtained by anyone in the light of the present invention, but any changes in the shape or structure thereof, which have the same or similar technical solutions as those of the present application, fall within the protection scope of the present invention.