CN109933668B - Hierarchical evaluation modeling method for readability of simplified Chinese text - Google Patents
- Publication number
- CN109933668B (application CN201910206775.7A, also published as CN109933668A)
- Authority
- CN
- China
- Prior art keywords
- text
- readability
- difficulty
- features
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The invention belongs to the field of Chinese language data processing, and particularly relates to a hierarchical evaluation modeling method for readability of simplified Chinese texts. The method comprises the following steps: creating a standard corpus; extracting text features; and constructing a readability formula and evaluating the effect of the formula. Building on conventional Chinese readability formulas, the invention selects text features at three levels (Chinese characters, vocabulary and sentences) and constructs a graded Chinese text readability formula suited to native readers of simplified Chinese at primary school level.
Description
Technical Field
The invention belongs to the field of Chinese language data processing, and particularly relates to a hierarchical evaluation modeling method for readability of simplified Chinese texts.
Background
In the modern information society, the number of children's books is growing exponentially, and choosing good, suitable books from this vast supply troubles teachers and parents. According to the theory of the zone of proximal development, reading material for children should be slightly, but not greatly, above their current level of development, so that it trains and improves their reading ability. If the selected material is too difficult, children's reading efficiency suffers and they come to avoid reading; if it is too simple, children find it uninteresting and lose interest, so that neither reading habits nor reading ability are developed. At present, most existing book grading systems are led by publishers. They are not grounded in solid theoretical research and their effectiveness has not been verified by empirical research, so they are not scientific enough, enjoy limited public confidence and influence, and offer limited guidance for young readers. To match children's reading ability to the difficulty of books, text difficulty must be evaluated accurately alongside the children's reading ability. Developing an objective and efficient Chinese text readability formula is therefore one of the difficult and much-studied problems in current graded-reading research.
A readability formula expresses mathematically the relationship between quantifiable text features that affect reading difficulty and the difficulty of the text, that is, it determines a functional relation between the features and text difficulty. Dozens of readability formulas and grading schemes currently exist for English, such as the Lexile framework and the A-Z levelling method in the United States and the Oxford Reading Tree series in the United Kingdom. These formulas have high accuracy and a wide range of application, and the large graded-reading systems built on them have greatly promoted the development of reading ability and reading habits among English-speaking children.
Because Chinese and English differ greatly, readability formulas from the English-speaking world cannot be applied directly to Chinese text. Only 7 Chinese readability formulas with a retrievable mathematical expression exist; they mainly target learners of traditional Chinese or Chinese language teaching, most give no clear grade-division standard, and they offer limited guidance for primary school pupils in mainland China choosing reading material. Creating a text readability formula for primary-school native readers of simplified Chinese therefore remains a challenging frontier task.
Disclosure of Invention
The invention aims to provide a simplified Chinese text readability grading evaluation modeling method.
The hierarchical evaluation modeling method for readability of simplified Chinese text according to a specific embodiment of the invention comprises the following steps:
selecting suitable texts to establish a standard corpus and labeling each text with a grade;
extracting text features:
defining text difficulty features at the character, word and sentence levels, performing word segmentation and character, word and sentence labeling on the texts in the standard corpus, calculating the difficulty feature values of each text, and then selecting an optimal feature set of the text difficulty features;
constructing a text readability grading evaluation formula:
dividing the texts in the standard corpus into a training text set and a test text set,
taking the labeled grade of the training text set as the dependent variable $Y$ and the optimal feature set as the independent variables $(X_1, X_2, X_3)$, and using a linear regression model to obtain the readability grading evaluation formula:
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \mu_i$$
where $Y_i$ is the readability level (1-12) of the text; $X_{1i}$, $X_{2i}$ and $X_{3i}$ are the values of the three features of the optimal feature set for this text; $\beta_0$ is a constant representing the intercept; and $\beta_1$, $\beta_2$ and $\beta_3$ are partial regression coefficients, each giving the change in $Y$ when $X_1$, $X_2$ or $X_3$ changes by one unit while the other variables remain unchanged;
and evaluating the readability formula against the test text set.
According to the grading evaluation modeling method for the readability of the simplified Chinese text, in the step of extracting the text characteristics, an NLPIR Chinese word segmentation system is adopted to perform word segmentation and part-of-speech tagging on the text.
According to the hierarchical evaluation modeling method for readability of simplified Chinese text of a specific embodiment of the invention, the optimal feature set is selected through the following steps:
calculating the correlation between each text difficulty feature and the text difficulty grade, and sorting the text difficulty features in descending order of the absolute value of the correlation coefficient;
following that order, entering the text difficulty features one at a time into a candidate feature set and establishing a regression equation;
and deciding through a collinearity judgment which text difficulty features remain in the candidate feature set, to obtain the optimal feature set.
According to the hierarchical evaluation modeling method for readability of simplified Chinese text of a specific embodiment of the invention, the collinearity judgment that decides which text difficulty features remain in the candidate feature set comprises the following steps:
if, for the text difficulty features $X_1, X_2, \ldots, X_k$ in the candidate feature set, there exist numbers $\lambda_1, \lambda_2, \ldots, \lambda_k$, not all 0, such that $\lambda_1 X_1 + \lambda_2 X_2 + \cdots + \lambda_k X_k + \mu_i = 0$, the candidate feature set has a collinearity problem. In that case the two text difficulty features with the collinearity problem are identified, the $\Delta R^2$ obtained by adding each of them, with the other features unchanged, is compared, and the feature with the larger $\Delta R^2$ is retained in the candidate feature set. If the candidate feature set has no collinearity problem, the $\Delta R^2$ after adding the feature is calculated; the feature is retained in the candidate feature set if $\Delta R^2 > 2\%$ and deleted otherwise;
and repeating the above steps until all the text difficulty features have been traversed.
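According to the hierarchical evaluation modeling method for readability of simplified Chinese text of a specific embodiment of the invention, the readability grading evaluation formula is constructed as follows:
the labeled grade of the training text set is taken as the dependent variable $Y$ and the optimal feature set as the independent variables $(X_1, X_2, X_3)$. Let $Y$ vary with $X_1, X_2, X_3$ in a linear relationship: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \mu_i$ $(i = 1, 2, 3, \ldots, n)$. Suppose $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3$ are the estimates of the parameters $\beta_0, \beta_1, \beta_2, \beta_3$ respectively; the regression value of $Y$ can then be expressed as:
$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i}$$
According to the method of least squares, $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3$ should minimize the sum of squared deviations between all observations $Y_i$ and the regression values $\hat{Y}_i$, i.e. make
$$Q = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$
take its minimum value, where $e_i = Y_i - \hat{Y}_i$.
According to the extreme value principle for multivariate functions, the first-order partial derivatives of $Q$ with respect to $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3$ are set equal to zero, i.e.
$$\frac{\partial Q}{\partial \hat{\beta}_0} = -2\sum_{i=1}^{n} e_i = 0, \qquad \frac{\partial Q}{\partial \hat{\beta}_j} = -2\sum_{i=1}^{n} X_{ji} e_i = 0 \quad (j = 1, 2, 3),$$
which in matrix form is
$$\begin{pmatrix} 1 & 1 & \cdots & 1 \\ X_{11} & X_{12} & \cdots & X_{1n} \\ X_{21} & X_{22} & \cdots & X_{2n} \\ X_{31} & X_{32} & \cdots & X_{3n} \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix},$$
that is, $X'e = 0$, because the matrix on the left is the transpose $X'$ of the sample observation matrix $X$.
Let $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3)'$ be the vector of estimates. Multiplying both sides of the sample regression model $Y = X\hat{\beta} + e$ by the transpose $X'$ of the sample observation matrix $X$ gives $X'Y = X'X\hat{\beta} + X'e$, which yields the normal equations
$$X'X\hat{\beta} = X'Y.$$
Since there is no multicollinearity, the 4th-order square matrix $X'X$ has full rank and its inverse $(X'X)^{-1}$ exists, thus
$$\hat{\beta} = (X'X)^{-1}X'Y,$$
i.e. the OLS estimator of $\beta$, from which the readability grading evaluation formula is obtained.
According to the hierarchical evaluation modeling method for readability of simplified Chinese text of a specific embodiment of the invention, the readability grading evaluation formula is evaluated against the test text set through the following steps: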
calculating the correlation coefficient $r$ between the observed value $Y_{\text{observed}}$ computed by the readability formula and the actual value $Y_{\text{actual}}$ of the test text set;
calculating the amount of variation $R^2$ in the test text set data explained by the readability formula, $R^2 = r^2$;
calculating the adjacent accuracy: an evaluation is judged correct if $|Y_{\text{observed}} - Y_{\text{actual}}| \le 1$, and the adjacent accuracy is the proportion of correctly evaluated texts in the total number of texts in the test text set;
calculating the root mean square error:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_{\text{observed},i} - Y_{\text{actual},i}\right)^2}$$
the closer $r$ is to 1 (with $0 < r < 1$),
the closer $R^2$ is to 1 (with $0 < R^2 < 1$),
the closer the adjacent accuracy is to 1, and
the smaller the root mean square error, the more accurate the readability grading evaluation formula is judged to be.
The invention has the beneficial effects that:
based on the characteristics of Chinese, the invention provides a hierarchical evaluation modeling method that analyses the difficulty features of Chinese texts automatically at three levels (Chinese characters, vocabulary and syntax), ensuring the objectivity of text difficulty evaluation;
based on statistical principles, features are selected after a comprehensive analysis of 44 text features, which simplifies the model, avoids the multicollinearity problem, and improves the interpretability of the model while maintaining prediction accuracy;
the invention constructs a Chinese readability formula and a text grading system that can be combined with an assessment of Chinese reading ability, so that a graded ("ladder") reading system suited to Chinese can be established and promoted, students' reading ability can be matched effectively to the difficulty of books, and the reading ability of children and teenagers can be developed on a scientific basis.
Drawings
FIG. 1 shows a flow chart of a hierarchical assessment method of the present invention;
FIG. 2 shows a flow chart of optimal feature set selection.
Detailed Description
Example 1
As shown in FIG. 1, the hierarchical evaluation modeling method for readability of simplified Chinese text of the invention comprises the following steps:
1. Establishing the gold standard corpus, i.e. defining the dependent variable
1.1 selecting appropriate text
The invention is aimed mainly at reading material for primary school children in mainland China, so the texts are selected from four editions of primary school Chinese textbooks widely used in mainland China, namely those of the People's Education publisher, the Beijing University publisher, the Jiangsu Education publisher and the Southwest University publisher. Each publisher contributes one set (12 volumes), for 48 volumes in total, and each volume carries clear grade information (its volume number), which can be used as the grade of its texts.
1.2 screening text
Because ancient Chinese differs greatly from modern Chinese in syntax and word meaning, and because modern poetry has no punctuation marks, which makes sentence-level text features difficult to count, texts such as ancient poetry, ancient prose and modern poetry were removed by manual inspection. The final gold standard corpus contains 1478 texts totalling 801550 characters; the details are shown in Table 1.
TABLE 1 Standard corpus
1.3 text rating labels
Each text is labeled with a grade from 1 to 12 according to the volume of the textbook in which it appears (each of the six school grades is divided into a first and a second semester, giving 12 volumes in total).
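The labeling rule above can be sketched as a small mapping from school grade and semester to a level between 1 and 12. The function name `volume_to_level` and the encoding of semesters as 1 and 2 are assumptions made for illustration; the patent only states that six grades with two semesters each give 12 volumes.

```python
def volume_to_level(school_grade: int, semester: int) -> int:
    """Map a school grade (1-6) and semester (1 = first, 2 = second) to a level 1-12.

    Hypothetical helper: the patent only says that each text is labeled
    with the number (1-12) of the volume in which it appears.
    """
    if not 1 <= school_grade <= 6:
        raise ValueError("school_grade must be between 1 and 6")
    if semester not in (1, 2):
        raise ValueError("semester must be 1 or 2")
    return 2 * (school_grade - 1) + semester


# A text from the second-semester volume of grade 3 would receive level 6.
level = volume_to_level(3, 2)
```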
2. Extracting text features, i.e. defining the independent variables
2.1 defining text features
The invention defines 44 text difficulty features at three levels (characters, words and sentences); the names and definitions of the features are shown in Table 2:
table 2 text feature summary
2.2 text preprocessing
The NLPIR Chinese word segmentation system (from NLPIR.org, the Natural Language Processing and Information Retrieval sharing platform) is used to perform word segmentation and part-of-speech tagging on the text; the segmentation and tagging accuracy of the system reaches 98.45%.
2.3 text feature computation
2.3.1 counting the number of characters, the number of words, the number of character types and word types, and the number of punctuation marks in the text;
2.3.2 looking up the characters and words in a Chinese character stroke-count table, a word difficulty level table and similar tables to obtain the relevant information for each character and word;
2.3.3 counting the part-of-speech distribution of the vocabulary;
2.3.4 obtaining the 44 feature values for each text from the operational definitions of the 44 features in Table 2 and the results of 2.3.1 to 2.3.3.
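As a rough illustration of steps 2.3.1 to 2.3.3, the sketch below computes a few surface counts from text that has already been segmented and tagged. It does not call NLPIR; the input is assumed to be a list of (word, tag) pairs. The set of tags treated as function words, the punctuation set, and the choice of dividing by the number of words are all assumptions, since Table 2 with the operational definitions is not reproduced here.

```python
# Assumption: tags counted as function words (preposition, conjunction,
# auxiliary, modal particle, interjection). The patent does not list them.
FUNCTION_WORD_TAGS = {"p", "c", "u", "y", "e"}
# Assumption: a small illustrative set of Chinese punctuation marks.
PUNCTUATION = set("，。！？；：、“”‘’（）《》…")


def is_punctuation(token: str) -> bool:
    """True when every character of the token is a punctuation mark."""
    return all(ch in PUNCTUATION for ch in token)


def basic_features(tagged_tokens):
    """Compute a few surface features from (word, pos_tag) pairs."""
    words = [(w, t) for w, t in tagged_tokens if not is_punctuation(w)]
    chars = [ch for w, _ in words for ch in w]
    n_words = len(words)
    n_function = sum(1 for _, t in words if t in FUNCTION_WORD_TAGS)
    return {
        "char_count": len(chars),
        "char_types": len(set(chars)),          # distinct characters
        "word_count": n_words,
        "word_types": len({w for w, _ in words}),
        "punctuation_count": len(tagged_tokens) - n_words,
        "function_word_ratio": n_function / n_words if n_words else 0.0,
    }


# Toy sentence with hand-written tags, for illustration only.
sample = [("我", "r"), ("在", "p"), ("学校", "n"), ("读书", "v"), ("。", "w")]
features = basic_features(sample)
```

Features that depend on external tables, such as the average difficulty of character types according to the literacy table, would need those tables as additional inputs and are left out of this sketch.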
2.4 selecting an optimal feature set
2.4.1 calculating the correlation coefficient ($r$) between each of the 44 features $(X_1, X_2, X_3, \ldots, X_{44})$ and the text difficulty level ($Y$), specifically
$$r_j = \frac{\sum_{i=1}^{n}\left(X_{ji} - \bar{X}_j\right)\left(Y_i - \bar{Y}\right)}{n\,\sigma_{X_j}\sigma_Y}$$
where $j = 1, 2, 3, \ldots, 44$; $n = 1478$; $\sigma_{X_j}$ and $\sigma_Y$ are the standard deviations of $X_j$ and $Y$; $X_{ji}$ is the score of the $i$-th text on the $j$-th text feature; $Y_i$ is the text difficulty level of the $i$-th text; $\bar{X}_j$ is the mean score of all texts on the $j$-th text feature; and $\bar{Y}$ is the mean of the $Y$ values of all texts.
2.4.2 sorting the 44 features in descending order of the absolute value of the correlation coefficient ($r$), entering them one at a time in that order into the candidate feature set, and establishing the regression equation $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \mu_i$;
where $Y_i$ is the difficulty level of the $i$-th text; $X_{1i}, X_{2i}, \ldots, X_{ki}$ are the scores of that text on the $k$ features of the candidate feature set; $\beta_0$ is a constant representing the intercept; and $\beta_1, \beta_2, \ldots, \beta_k$ are partial regression coefficients, each giving the change in $Y$ when the corresponding variable $X_1, X_2, \ldots, X_k$ changes by one unit while the other variables remain unchanged.
2.4.3 making collinearity decisions
If, for the features $X_1, X_2, \ldots, X_k$ currently in the candidate feature set, there exist constants $\lambda_1, \lambda_2, \ldots, \lambda_k, \mu$, not all 0, such that $\lambda_1 X_1 + \lambda_2 X_2 + \cdots + \lambda_k X_k + \mu = 0$, the candidate feature set is judged to have a collinearity problem. Conversely, if the expression has no solution, i.e. no constants $\lambda_1, \lambda_2, \ldots, \lambda_k, \mu$, not all 0, make the equation hold, there is no collinearity problem.
When the candidate feature set has a collinearity problem, the correlation coefficient between each pair of the $k$ features $X_1, X_2, \ldots, X_k$ is calculated; if the correlation coefficient between two features is greater than 0.75, those two features are determined to have the collinearity problem.
Suppose features $X_{k-1}$ and $X_k$ have the collinearity problem. First, a regression model $M_0$ that includes neither of the two features is established: $Y_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_{k-2} X_{k-2,i} + \mu_i$ (the parameters have the same meaning as in 2.4.2), and the multiple coefficient of determination of the model is calculated:
$$R_{M_0}^2 = \frac{\sum_{i=1}^{n}\left(\hat{Y}_i - \bar{Y}\right)^2}{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2}$$
where $\hat{Y}_i$ is the value of $Y$ calculated for each text by the regression model, $Y_i$ is the actual value of $Y$, and $\bar{Y}$ is the mean of the $Y$ values;
then features $X_{k-1}$ and $X_k$ are added separately to model $M_0$ to establish model $M_1$: $Y_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_{k-2} X_{k-2,i} + \beta_{k-1} X_{k-1,i} + \mu_i$ and model $M_2$: $Y_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_{k-2} X_{k-2,i} + \beta_k X_{ki} + \mu_i$ (the parameters have the same meaning as in 2.4.2), and the multiple coefficients of determination $R_{M_1}^2$ and $R_{M_2}^2$ of models $M_1$ and $M_2$ are obtained in the same way. Finally, the increase in $R^2$ of models $M_1$ and $M_2$ relative to model $M_0$ is calculated: $\Delta R_{M_1}^2 = R_{M_1}^2 - R_{M_0}^2$; $\Delta R_{M_2}^2 = R_{M_2}^2 - R_{M_0}^2$, and all features of the model with the larger $\Delta R^2$ are retained in the candidate feature set.
If the candidate feature set has no collinearity problem, the $\Delta R^2$ after adding the feature is calculated; if $\Delta R^2 > 2\%$, the feature is retained in the candidate feature set, otherwise it is deleted.
2.4.4 repeating steps 2.4.2 to 2.4.3 until all features have been traversed; the flow chart is shown in FIG. 2.
2.4.5 finally obtaining the optimal feature set, which comprises three features: the number of character types, the average difficulty of the character types according to the literacy table, and the proportion of function words.
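The selection loop of steps 2.4.1 to 2.4.4 can be sketched roughly as follows. This is a simplified reading, not the patented implementation: the helper names, the use of the absolute pairwise correlation against the 0.75 threshold, and the handling of only the first clashing feature are assumptions, and the toy data are invented so that the outcome is easy to follow.

```python
import math


def pearson_r(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


def r_squared(cols, y):
    """R^2 of an OLS fit of y on an intercept plus the given feature columns."""
    design = [[1.0] * len(y)] + [list(c) for c in cols]
    k = len(design)
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    a = [[sum(p * q for p, q in zip(design[i], design[j])) for j in range(k)]
         + [sum(p * v for p, v in zip(design[i], y))] for i in range(k)]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [v - factor * p for v, p in zip(a[r], a[i])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(b * col[t] for b, col in zip(beta, design)) for t in range(len(y))]
    mean_y = sum(y) / len(y)
    rss = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1.0 - rss / sum((v - mean_y) ** 2 for v in y)


def select_features(columns, y, r_limit=0.75, min_gain=0.02):
    # 2.4.1 / 2.4.2: visit features in descending order of |r| with the grade.
    order = sorted(columns, key=lambda n: abs(pearson_r(columns[n], y)), reverse=True)
    kept = []
    for name in order:
        # 2.4.3: a pairwise |r| above the limit is treated as collinearity.
        clash = next((s for s in kept
                      if abs(pearson_r(columns[s], columns[name])) > r_limit), None)
        if clash is None:
            gain = (r_squared([columns[f] for f in kept + [name]], y)
                    - r_squared([columns[f] for f in kept], y))
            if gain > min_gain:          # keep only if delta R^2 > 2%
                kept.append(name)
        else:
            # Same base model M0 for both, so comparing R^2 compares delta R^2.
            base = [s for s in kept if s != clash]
            if (r_squared([columns[f] for f in base + [name]], y)
                    > r_squared([columns[f] for f in base + [clash]], y)):
                kept = base + [name]
    return kept


# Invented toy data: levels = 2 * feat_a + feat_b; feat_a_noisy tracks feat_a.
levels = [3, 4, 5, 6, 7, 8, 9, 10]
columns = {
    "feat_a": [1, 1, 2, 2, 3, 3, 4, 4],
    "feat_a_noisy": [1, 2, 2, 2, 3, 3, 4, 5],
    "feat_b": [1, 2, 1, 2, 1, 2, 1, 2],
}
selected = select_features(columns, levels)
```

In this toy run `feat_a_noisy` clashes with `feat_a` and explains less variance, so it is dropped, while `feat_b` adds more than 2% to $R^2$ and is kept. A perfectly collinear pair would make the normal equations singular, which this sketch does not guard against.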
3. Establishing readability formula and evaluating formula effect
3.1 determining training and test text sets
Randomly dividing the texts in each book of the Chinese teaching material into a training text set and a test text set, and ensuring that the number ratio of the texts in the training text set to the texts in the test text set in each version and each book is 1: 1.
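A minimal sketch of such a 1:1 split within every edition and volume is shown below. The record layout (`edition`, `volume`, `id` keys), the fixed random seed and the rule that an odd group sends its extra text to the test set are assumptions for illustration.

```python
import random
from collections import defaultdict


def split_half(texts, seed=0):
    """Randomly split texts 1:1 within each (edition, volume) group."""
    groups = defaultdict(list)
    for text in texts:
        groups[(text["edition"], text["volume"])].append(text)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for key in sorted(groups):
        members = groups[key][:]
        rng.shuffle(members)
        half = len(members) // 2
        train.extend(members[:half])
        test.extend(members[half:])  # an odd group puts its extra text here
    return train, test


# Hypothetical corpus index: 4 editions x 12 volumes x 4 texts each.
texts = []
for edition in "ABCD":
    for volume in range(1, 13):
        for _ in range(4):
            texts.append({"id": len(texts), "edition": edition, "volume": volume})

train_set, test_set = split_half(texts)
```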
3.2 establishing readability formulas
The labeled grade of the training text set is taken as the dependent variable $Y$, and the optimal feature set determined in step 2.4 (character types, average difficulty of character types according to the literacy table, and proportion of function words) as the independent variables $(X_1, X_2, X_3)$; a readability formula is constructed with a linear regression model, as follows:
let $Y$ vary with $X_1, X_2, X_3$ in a linear relationship, formulated as:
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \mu_i,$$
where $Y_i$ is the readability level of the text; $X_{1i}, X_{2i}, X_{3i}$ are the text's values for character types, average difficulty of character types according to the literacy table, and proportion of function words; $\beta_0$ is a constant representing the intercept; and $\beta_1, \beta_2, \beta_3$ are partial regression coefficients, each giving the change in $Y$ when $X_1$, $X_2$ or $X_3$ changes by one unit while the other variables remain unchanged.
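Suppose $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3$ are the estimates of the parameters $\beta_0, \beta_1, \beta_2, \beta_3$ respectively; the regression value of $Y$ can then be expressed as:
$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i}$$
According to the method of least squares, $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3$ should minimize the sum of squared deviations between all observations $Y_i$ and the regression values $\hat{Y}_i$, i.e. make
$$Q = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$
take its minimum value, where $e_i = Y_i - \hat{Y}_i$.
According to the extreme value principle for multivariate functions, the first-order partial derivatives of $Q$ with respect to $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3$ are set equal to zero, i.e.
$$\frac{\partial Q}{\partial \hat{\beta}_0} = -2\sum_{i=1}^{n} e_i = 0, \qquad \frac{\partial Q}{\partial \hat{\beta}_j} = -2\sum_{i=1}^{n} X_{ji} e_i = 0 \quad (j = 1, 2, 3).$$
After rearrangement and simplification, the matrix form is
$$\begin{pmatrix} 1 & 1 & \cdots & 1 \\ X_{11} & X_{12} & \cdots & X_{1n} \\ X_{21} & X_{22} & \cdots & X_{2n} \\ X_{31} & X_{32} & \cdots & X_{3n} \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix},$$
that is, $X'e = 0$, because the matrix on the left is the transpose $X'$ of the sample observation matrix $X$.
Let $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3)'$ be the vector of estimates. Multiplying both sides of the sample regression model $Y = X\hat{\beta} + e$ by the transpose $X'$ of the sample observation matrix $X$ gives $X'Y = X'X\hat{\beta} + X'e$, which yields the normal equations
$$X'X\hat{\beta} = X'Y.$$
Since there is no multicollinearity, the 4th-order square matrix $X'X$ has full rank and its inverse $(X'X)^{-1}$ exists, thus
$$\hat{\beta} = (X'X)^{-1}X'Y,$$
i.e. the OLS estimator of $\beta$.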
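To make the normal equations concrete, the sketch below builds $X'X$ and $X'Y$ for a small design matrix with an intercept column and three features, and solves for $\hat{\beta}$ by Gauss-Jordan elimination rather than an explicit matrix inverse. The function name and the toy data are assumptions; the data are generated without noise from known coefficients so that the recovered estimate can be checked.

```python
def solve_normal_equations(X, y):
    """Return beta_hat solving (X'X) beta = X'y by Gauss-Jordan elimination."""
    n, k = len(X), len(X[0])
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)] for i in range(k)]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    a = [row + [rhs] for row, rhs in zip(xtx, xty)]  # augmented matrix [X'X | X'y]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        if abs(a[pivot][i]) < 1e-12:
            raise ValueError("X'X is singular: the features are perfectly collinear")
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [v - factor * p for v, p in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Invented toy data generated exactly from Y = 1 + 2*X1 - 1*X2 + 0.5*X3.
rows = [(1, 2, 0), (2, 1, 4), (3, 5, 2), (4, 3, 6), (5, 8, 4), (6, 4, 10)]
X = [[1.0, x1, x2, x3] for x1, x2, x3 in rows]   # first column is the intercept
y = [1 + 2 * x1 - x2 + 0.5 * x3 for x1, x2, x3 in rows]

beta_hat = solve_normal_equations(X, y)
residuals = [yi - sum(b * v for b, v in zip(beta_hat, row)) for row, yi in zip(X, y)]
# First-order conditions: X'e should be (numerically) the zero vector.
xt_e = [sum(X[r][j] * residuals[r] for r in range(len(X))) for j in range(4)]
```

The singularity check mirrors the full-rank condition in the text: without it, perfectly collinear features would lead to a division by zero instead of a clear error.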
The resulting readability formula is:
Grade = -4.84 + 0.01 × character types + 3.34 × average difficulty of character types according to the literacy table + 7.83 × proportion of function words.
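Applying the formula to a single text might look like the sketch below, reading the intercept as -4.84. The function names, the input values, and the rounding and clamping of the raw score to a level between 1 and 12 are assumptions; the patent does not say how a fractional result is converted into a level.

```python
def readability_score(char_types, avg_char_difficulty, function_word_ratio):
    """Raw grade value from the regression formula (intercept read as -4.84)."""
    return (-4.84
            + 0.01 * char_types
            + 3.34 * avg_char_difficulty
            + 7.83 * function_word_ratio)


def to_level(score):
    """Assumption: round to the nearest integer and clamp to the 1-12 scale."""
    return min(12, max(1, round(score)))


# Hypothetical feature values for one text, for illustration only.
score = readability_score(char_types=300, avg_char_difficulty=1.5, function_word_ratio=0.2)
level = to_level(score)
```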
3.3 readability formula evaluation
And evaluating the readability formula by taking the test text set as a reference, wherein the method specifically comprises the following steps:
3.3.1 calculating the $r$ value: the correlation coefficient between the observed value ($Y_{\text{observed}}$) computed by the readability formula and the actual value ($Y_{\text{actual}}$) of the test text set is calculated (with the same formula as in 2.4.1), specifically
$$r = \frac{\sum_{i=1}^{n}\left(Y_{\text{observed},i} - \bar{Y}_{\text{observed}}\right)\left(Y_{\text{actual},i} - \bar{Y}_{\text{actual}}\right)}{n\,\sigma_{Y_{\text{observed}}}\sigma_{Y_{\text{actual}}}}$$
where $n = 1478$; $\sigma_{Y_{\text{observed}}}$ and $\sigma_{Y_{\text{actual}}}$ are the standard deviations of $Y_{\text{observed}}$ and $Y_{\text{actual}}$; $Y_{\text{observed},i}$ is the difficulty level of the $i$-th text computed by the readability formula; $Y_{\text{actual},i}$ is the actual difficulty level of the $i$-th text; $\bar{Y}_{\text{observed}}$ is the mean of all observed difficulty levels; and $\bar{Y}_{\text{actual}}$ is the mean of all actual difficulty levels. The value of $r$ ranges from 0 to 1, and the closer it is to 1, the better the readability formula.
3.3.2 calculating $R^2$: $R^2$ is an important index for measuring the regression result; it represents the amount of variation in the difficulty values of the test text set explained by the readability formula, $R^2 = r^2$.
$R^2$ ranges from 0 to 1, and the closer it is to 1, the better the readability formula.
3.3.3 calculating the adjacent accuracy: a prediction counts as correct when the observed value is within one level of the actual value. For example, if the actual value of a text is 3, an observed value of 2, 3 or 4 is correct. The adjacent accuracy is the proportion of texts with $|Y_{\text{observed}} - Y_{\text{actual}}| \le 1$; it ranges from 0 to 1, and the closer it is to 1, the better the readability formula.
3.3.4 calculating the root mean square error: the root mean square error is the square root of the mean squared deviation between the observed values and the actual values, calculated as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_{\text{observed},i} - Y_{\text{actual},i}\right)^2}$$
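The four indices can be computed together, as in the sketch below. The function name `evaluate` and the toy predictions are assumptions; the adjacent accuracy uses a tolerance of one level, matching the example in 3.3.3.

```python
import math


def evaluate(predicted, actual):
    """Return r, R^2 (= r^2), adjacent accuracy and RMSE for paired level lists."""
    n = len(actual)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    r = cov / math.sqrt(sum((p - mean_p) ** 2 for p in predicted)
                        * sum((a - mean_a) ** 2 for a in actual))
    # A prediction within one level of the actual value counts as correct.
    adjacent = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= 1) / n
    rmse = math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)
    return {"r": r, "r2": r * r, "adjacent_accuracy": adjacent, "rmse": rmse}


# Invented toy levels, for illustration only.
predicted = [1, 3, 3, 6, 5, 8]
actual = [1, 2, 3, 4, 5, 6]
metrics = evaluate(predicted, actual)
```

In this toy case four of the six predictions fall within one level of the actual value, so the adjacent accuracy is 4/6, and the squared errors sum to 9, giving an RMSE of the square root of 1.5.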
The indexes of the readability formula constructed by the invention are shown in table 3:
TABLE 3 readability formula indices
According to the result, the Chinese readability formula constructed by the method can be used for predicting the difficulty of Chinese texts in the primary school stage and carrying out 1-12-grade difficulty calibration.
Claims (4)
1. The hierarchical evaluation modeling method for the readability of the simplified Chinese text is characterized by comprising the following steps of:
selecting a proper text to establish a standard corpus and carrying out grade marking on the text;
extracting text features;
defining text difficulty features at the character, word and sentence levels, performing word segmentation and character, word and sentence labeling on the texts in a standard corpus, calculating the difficulty feature values of each text, and then selecting an optimal feature set of the text difficulty features;
a text readability grading evaluation formula is constructed,
the text in the standard corpus is divided into a training text set and a test text set,
the labeled grade of the training text set is used as the dependent variable $Y$, and the optimal feature set is used as the independent variables $(X_1, X_2, X_3)$; a linear regression model is adopted to obtain the readability grading evaluation formula:
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \mu_i,$$
wherein $\beta_0$ is a constant representing the intercept, and $\beta_1$, $\beta_2$ and $\beta_3$ are partial regression coefficients, each representing the change in $Y$ when $X_1$, $X_2$ or $X_3$ changes by one unit while the other variables remain unchanged,
evaluating the readability grading evaluation formula by taking the test text set as a reference,
wherein,
selecting an optimal feature set by:
respectively calculating correlation coefficients of the text difficulty features and the text difficulty grades, and sequencing the text difficulty features according to absolute values of the correlation coefficients;
according to the sorting, sequentially selecting the difficulty features to enter an alternative feature set, and establishing a regression equation;
selecting the text difficulty characteristics left in the candidate characteristic set through collinearity judgment to obtain an optimal characteristic set,
wherein,
the method for selecting, through collinearity judgment, the text difficulty features that remain in the candidate feature set comprises the following steps:
if, for the text difficulty features $X_1, X_2, \ldots, X_k$ in the candidate feature set, there exist numbers $\lambda_1, \lambda_2, \ldots, \lambda_k$, not all 0, such that $\lambda_1 X_1 + \lambda_2 X_2 + \cdots + \lambda_k X_k + \mu_i = 0$, the candidate feature set has a collinearity problem; in that case the two text difficulty features with the collinearity problem are identified, the $\Delta R^2$ obtained by adding each of the two features, with the other features kept unchanged, is compared, and the feature with the larger $\Delta R^2$ is retained in the candidate feature set; if the candidate feature set has no collinearity problem, the $\Delta R^2$ after adding the feature is calculated, and if $\Delta R^2 > 2\%$ the text difficulty feature is retained in the candidate feature set, otherwise the text difficulty feature is deleted;
and circulating the steps until all the text difficulty features in the alternative feature set are traversed.
2. The hierarchical evaluation modeling method for readability of simplified Chinese text according to claim 1, wherein in the step of extracting the text features, word segmentation and part-of-speech tagging are performed on the text using the NLPIR Chinese word segmentation system.
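3. The hierarchical evaluation modeling method for readability of simplified Chinese text according to claim 1, wherein the readability grading evaluation formula is constructed as follows:
the labeled grade of the training text set is used as the dependent variable $Y$, and the optimal feature set is used as the independent variables $(X_1, X_2, X_3)$; let $Y$ vary with $X_1, X_2, X_3$ in a linear relationship: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \mu_i$ $(i = 1, 2, 3, \ldots, n)$; suppose $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3$ are the estimates of the parameters $\beta_0, \beta_1, \beta_2, \beta_3$ respectively; the regression value of $Y$ can then be expressed as:
$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i}$$
According to the method of least squares, $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3$ should minimize the sum of squared deviations between all observations $Y_i$ and the regression values $\hat{Y}_i$, i.e. such that
$$Q = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$
takes its minimum value, where $e_i = Y_i - \hat{Y}_i$,
according to the extreme value principle for multivariate functions, the first-order partial derivatives of $Q$ with respect to $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3$ are set equal to zero, i.e.
$$\frac{\partial Q}{\partial \hat{\beta}_0} = -2\sum_{i=1}^{n} e_i = 0, \qquad \frac{\partial Q}{\partial \hat{\beta}_j} = -2\sum_{i=1}^{n} X_{ji} e_i = 0 \quad (j = 1, 2, 3),$$
which in matrix form is
$$\begin{pmatrix} 1 & 1 & \cdots & 1 \\ X_{11} & X_{12} & \cdots & X_{1n} \\ X_{21} & X_{22} & \cdots & X_{2n} \\ X_{31} & X_{32} & \cdots & X_{3n} \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix},$$
that is, $X'e = 0$, because the matrix on the left is the transpose $X'$ of the sample observation matrix $X$.
Let $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3)'$ be the vector of estimates; multiplying both sides of the sample regression model $Y = X\hat{\beta} + e$ by the transpose $X'$ of the sample observation matrix $X$ gives $X'Y = X'X\hat{\beta} + X'e$, which yields the normal equations
$$X'X\hat{\beta} = X'Y.$$
Since there is no multicollinearity, the 4th-order square matrix $X'X$ has full rank and its inverse $(X'X)^{-1}$ exists, thus
$$\hat{\beta} = (X'X)^{-1}X'Y,$$
i.e. the OLS estimator of $\beta$.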
4. The hierarchical evaluation modeling method for readability of simplified Chinese text according to claim 1, wherein the readability grading evaluation formula is evaluated with reference to the test text set by the following steps:
calculating the correlation coefficient $r$ between the observed value $Y_{\text{observed}}$ computed by the readability formula and the actual value $Y_{\text{actual}}$ of the test text set;
calculating the amount of variation $R^2$ in the test text set data explained by the readability formula, $R^2 = r^2$;
calculating the adjacent accuracy: an evaluation is judged correct if $|Y_{\text{observed}} - Y_{\text{actual}}| \le 1$, and the adjacent accuracy is the proportion of correctly evaluated texts in the total number of texts in the test text set;
calculating the root mean square error:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_{\text{observed},i} - Y_{\text{actual},i}\right)^2}$$
the closer $r$ is to 1 (with $0 < r < 1$),
the closer $R^2$ is to 1 (with $0 < R^2 < 1$),
the closer the adjacent accuracy is to 1, and
the smaller the root mean square error, the more accurate the readability grading evaluation formula is judged to be.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910206775.7A CN109933668B (en) | 2019-03-19 | 2019-03-19 | Hierarchical evaluation modeling method for readability of simplified Chinese text |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910206775.7A CN109933668B (en) | 2019-03-19 | 2019-03-19 | Hierarchical evaluation modeling method for readability of simplified Chinese text |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109933668A CN109933668A (en) | 2019-06-25 |
CN109933668B true CN109933668B (en) | 2021-03-26 |
Family
ID=66987605
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910206775.7A Active CN109933668B (en) | 2019-03-19 | 2019-03-19 | Hierarchical evaluation modeling method for readability of simplified Chinese text |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109933668B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110472236A (en) * | 2019-07-23 | 2019-11-19 | 浙江大学城市学院 | A kind of two-way GRU text readability appraisal procedure based on attention mechanism |
CN111797499B (en) * | 2020-06-02 | 2023-12-15 | 黑龙江省农业科学院绥化分院 | Crop breeding multi-objective optimization method |
CN112115701B (en) * | 2020-09-07 | 2021-07-09 | 北京语言大学 | News reading text readability evaluation method and system |
CN112836275B (en) * | 2021-02-08 | 2023-03-14 | 哈尔滨工业大学 | Stadium emergency evacuation sign readability evaluation system based on fuzzy theory and control method thereof |
CN113408295B (en) * | 2021-06-22 | 2023-02-28 | 深圳证券信息有限公司 | Text readability evaluation method, computer device and computer storage medium |
CN113569556B (en) * | 2021-07-28 | 2024-04-02 | 怀化学院 | Grading method for children reading test text difficulty based on Ross model |
CN113934850B (en) * | 2021-11-02 | 2022-06-17 | 北京语言大学 | Chinese text readability evaluation method and system fusing text distribution law characteristics |
CN115147013B (en) * | 2022-08-31 | 2023-07-18 | 南京复保科技有限公司 | Insurance product readability calculating method, apparatus, computer device and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103207854A (en) * | 2012-01-11 | 2013-07-17 | 宋曜廷 | Chinese text readability measuring system and method thereof |
CN105068993A (en) * | 2015-07-31 | 2015-11-18 | 成都思戴科科技有限公司 | Method for evaluating text difficulty |
CN106951406A (en) * | 2017-03-13 | 2017-07-14 | 广西大学 | Method for grading Chinese reading ability based on text language variables |
CN107609591A (en) * | 2017-09-13 | 2018-01-19 | 深圳市悦好教育科技有限公司 | Book grading method and system |
CN107977449A (en) * | 2017-12-14 | 2018-05-01 | 广东外语外贸大学 | Linear model method for estimating simplified Chinese readability |
CN108389147A (en) * | 2018-02-26 | 2018-08-10 | 浙江创课教育科技有限公司 | Item difficulty hierarchical processing method and system |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103544393B (en) * | 2013-10-23 | 2017-05-24 | 北京师范大学 | Method for tracking development of language abilities of children |
CN103530523B (en) * | 2013-10-23 | 2017-01-04 | 北京师范大学 | Child linguistic competence development evaluation modeling method |
US10162864B2 (en) * | 2015-06-07 | 2018-12-25 | Apple Inc. | Reader application system utilizing article scoring and clustering |
US10503829B2 (en) * | 2016-10-13 | 2019-12-10 | Booxby Inc. | Book analysis and recommendation |
CN106601041A (en) * | 2016-12-15 | 2017-04-26 | 邵宏锋 | Reading information grading analysis processing system |
CN107657559A (en) * | 2017-08-25 | 2018-02-02 | 北京享阅教育科技有限公司 | Chinese reading ability comparison method and system |
CN107977362B (en) * | 2017-12-11 | 2021-05-04 | 中山大学 | Method for grading Chinese text and calculating Chinese text difficulty score |
CN108984531A (en) * | 2018-07-23 | 2018-12-11 | 深圳市悦好教育科技有限公司 | Books reading difficulty method and system based on language teaching material |
Non-Patent Citations (2)
Title |
---|
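Improvements in predicting children's overall reading ability by modeling variability in evaluators' subjective judgments; Matthew P. Black et al.; 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 20120831; pp. 5069-5072 * |
Research on Chinese text readability prediction methods based on linear regression (基于线性回归的中文文本可读性预测方法研究); Sun Gang (孙刚); China Master's Theses Full-text Database, Information Science and Technology; 20160315; pp. 23-50 * |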
Also Published As
Publication number | Publication date |
---|---|
CN109933668A (en) | 2019-06-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109933668B (en) | Hierarchical evaluation modeling method for readability of simplified Chinese text | |
CN106503055B (en) | Method for generating image descriptions from structured text | |
CN107967318A (en) | Automatic scoring method and system for Chinese short-text subjective questions using LSTM neural networks | |
Van Hout et al. | Comparing measures of lexical richness | |
CN106528656A (en) | Student history and real-time learning state parameter-based course recommendation realization method and system | |
CN107943784A (en) | Relation extraction method based on generative adversarial network | |
CN109299380A (en) | Exercise personalized recommendation method in online education platform based on multidimensional characteristic | |
CN107977362A (en) | Method for grading Chinese text and calculating Chinese text difficulty score | |
CN107832781A (en) | Software defect representation learning method for multi-source data | |
CN102279844A (en) | Method and system for automatically testing Chinese composition | |
CN114913729B (en) | Question selecting method, device, computer equipment and storage medium | |
Fuge et al. | Automatically inferring metrics for design creativity | |
CN105786898B (en) | Method and device for constructing a domain ontology | |
KR102201709B1 (en) | Method and system for estimating a reading index using automatic analysis program for text of korean language | |
Mizumoto et al. | Modeling a prototypical use of language learning strategies | |
Rokade et al. | Automated grading system using natural language processing | |
CN108280065B (en) | Foreign text evaluation method and device | |
CN112015862A (en) | User abnormal comment detection method and system based on hierarchical multichannel attention | |
Dascalu et al. | Age of exposure: A model of word learning | |
Tack et al. | Human and automated CEFR-based grading of short answers | |
CN113486645A (en) | Text similarity detection method based on deep learning | |
Agarwal et al. | Autoeval: A nlp approach for automatic test evaluation system | |
CN112115701B (en) | News reading text readability evaluation method and system | |
CN111553821B (en) | Automatic problem solving method for application problems based on teacher-student network and multi-head decoder | |
CN112528011A (en) | Open type mathematic operation correction method, system and equipment driven by multiple data sources |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CB03 | Change of inventor or designer information | Inventor after: Li Hong, Liu Miaomiao, Li Yan; Inventor before: Li Hong, Li Miaomiao, Li Yan |