CN104778160A - Analysis method for subject relevance of English composition contents - Google Patents

Info

Publication number
CN104778160A
Authority
CN
China
Prior art keywords
composition
theme
training
reply
model essay
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510204370.1A
Other languages
Chinese (zh)
Other versions
CN104778160B (en
Inventor
黄桂敏
杨国花
周娅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology filed Critical Guilin University of Electronic Technology
Priority to CN201510204370.1A priority Critical patent/CN104778160B/en
Publication of CN104778160A publication Critical patent/CN104778160A/en
Application granted granted Critical
Publication of CN104778160B publication Critical patent/CN104778160B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses an analysis method for subject relevance of English composition contents. The analysis method is characterized by comprising the following steps of firstly, performing training processing on a composition model essay set and a training composition set by an English composition subject-relevant analysis training module to construct analysis standards of a English composition subject-relevant degree; secondly, performing analysis processing on compositions to be corrected by an English composition subject-relevant analyzing and scoring module, and judging whether the compositions to be corrected are relevant to the subject or not by calculating the subject-relevant degree of the compositions to be corrected according to the analysis standards of the English composition subject-relevant degree.

Description

A method for analyzing the subject relevance of English composition content
(1) Technical field
The present invention relates to natural language processing and English composition content analysis, and specifically to a method for analyzing whether the content of an English composition is relevant to its subject.
(2) Background art
Traditional text analysis methods mainly include latent semantic analysis, probabilistic latent semantic analysis, and latent Dirichlet allocation. Latent semantic analysis can analyze the inherent semantic relations between words; it adds a semantic dimension between texts and words. With the emergence of probabilistic methods, probabilistic latent semantic analysis replaced latent semantic analysis as the newer method of text analysis. However, probabilistic latent semantic analysis has difficulty producing accurate results for texts outside the training text set. Latent Dirichlet allocation was therefore proposed on the basis of probabilistic latent semantic analysis. Latent Dirichlet allocation is a supervised theme analysis method. When analyzing the relation between text content and theme, it requires the training texts to share the same theme; when training texts of one theme are used to analyze texts of other themes, it is difficult to obtain an accurate result on whether the text content is relevant to the theme. Therefore, to analyze in actual English teaching whether English composition content is relevant to its subject, a method for analyzing the subject relevance of English composition content is needed, one that determines both whether the content is relevant and how relevant it is. This is of practical significance for improving the level of automatic grading of English compositions.
(3) Summary of the invention
English composition content is the written expression in which an author, following the composition title and writing requirements, sets out his or her own thoughts and viewpoints in correct English. The composition theme is the collective term for the composition title and the writing requirements, that is, the thoughts and discussion that the composition content is required to express. The object of the invention is to provide a method for analyzing the subject relevance of English composition content, that is, for analyzing whether the composition content sets out the author's thoughts around the composition theme. The method comprises an English composition subject-relevance analysis training module and an English composition subject-relevance analysis and scoring module; its overall flow is shown in Fig. 1. The processing flow is as follows. First, the training module performs training processing on the model essay set and the training composition set to construct the analysis standard for the subject-relevance degree of English compositions. Second, the analysis and scoring module analyzes the composition to be corrected and, according to that analysis standard, judges whether the composition is relevant to the subject by calculating its subject-relevance degree. The calculation formulas of the two modules are defined as follows:
(1) Formula for the training composition content theme probability distribution
The training composition content theme probability distribution is the probability distribution of the training composition content over its themes. Its calculation formula is as follows:
In formula (1), |number of feature words in training composition i assigned to theme j + theme sampling number|_ij is a matrix of i rows and j columns, and the accompanying term is a matrix of i rows, where i = 1, 2, ..., n and j = 1, 2, ..., k. A feature word is a word in the composition content that is related to the composition theme. Training composition i is the i-th training composition in the training composition set, which contains n training compositions in total. Theme j is the j-th composition theme in the training composition set and the model essay set; the number of themes is the total number of composition themes in these two sets, with value k. The theme sampling number is the parameter of the symmetric Dirichlet distribution of the training composition content theme probability distribution, with value 0.1.
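The image of formula (1) is not available in this text. The following Python sketch therefore rests on an assumption: it uses the standard smoothed estimate from latent Dirichlet allocation, in which each count plus the theme sampling number is divided by the row total plus k times the theme sampling number. The function name and the toy counts are hypothetical and are not taken from the patent.

```python
def content_theme_distribution(counts, theme_sampling_number=0.1):
    """counts[i][j]: number of feature words in composition i assigned to theme j.

    Assumed form: (count + a) / (row total + k * a), with a = theme sampling number.
    """
    k = len(counts[0])  # number of themes
    distribution = []
    for row in counts:
        denominator = sum(row) + k * theme_sampling_number
        distribution.append([(c + theme_sampling_number) / denominator for c in row])
    return distribution


# Two hypothetical compositions and three hypothetical themes.
toy_counts = [[8, 1, 1], [0, 5, 5]]
theta = content_theme_distribution(toy_counts)
```

Under this assumption every row of `theta` sums to 1, so each row can be read as one composition's probability distribution over the k themes.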
(2) Formula for the training composition theme feature word probability distribution
The training composition theme feature word probability distribution is the probability distribution of the training composition themes over the feature words. Its calculation formula is as follows:
In formula (2), |number of times training composition feature word i is assigned to theme j + feature word sampling number|_ij is a matrix of i rows and j columns, and the accompanying term is a matrix of j columns, where i = 1, 2, ..., m and j = 1, 2, ..., k. A feature word is a word in the composition content that is related to the composition theme. Training composition feature word i is the i-th feature word, among those of the training composition set and the model essay set, that occurs in the training compositions; the total number of feature words in the training composition set and the model essay set is m. The feature word number is the total number of feature words in these two sets, with value m. The feature word sampling number is the parameter of the symmetric Dirichlet distribution of the training composition theme feature word probability distribution, with value 0.01. Theme j is the j-th composition theme in the training composition set and the model essay set; the number of themes is the total number of composition themes in these two sets, with value k.
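Since the image of formula (2) is also missing, the sketch below again assumes the usual smoothed estimate from latent Dirichlet allocation: each entry of the m-by-k count matrix plus the feature word sampling number, divided by the column total plus m times the feature word sampling number. The function name and the toy matrix are hypothetical.

```python
def theme_feature_word_distribution(counts, feature_word_sampling_number=0.01):
    """counts[i][j]: number of times feature word i is assigned to theme j.

    Assumed form: (count + b) / (column total + m * b), with b = feature word
    sampling number, so that each theme (column) sums to 1 over the feature words.
    """
    m = len(counts)     # number of feature words
    k = len(counts[0])  # number of themes
    b = feature_word_sampling_number
    column_totals = [sum(counts[i][j] for i in range(m)) for j in range(k)]
    return [
        [(counts[i][j] + b) / (column_totals[j] + m * b) for j in range(k)]
        for i in range(m)
    ]


# Three hypothetical feature words and two hypothetical themes.
toy_word_counts = [[4, 0], [1, 3], [0, 2]]
phi = theme_feature_word_distribution(toy_word_counts)
```

The normalization runs down each column because the source describes the accompanying term as a matrix of j columns, which suggests one normalizing value per theme.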
(3) Formula for the model essay content theme probability distribution
The model essay content theme probability distribution is the probability distribution of the model essay content over its themes. Its calculation formula is as follows:
In formula (3), |number of feature words in model essay i assigned to theme j + theme sampling number|_ij is a matrix of i rows and j columns, and the accompanying term is a matrix of i rows, where i = 1, 2, ..., z and j = 1, 2, ..., k. A feature word is a word in the composition content that is related to the composition theme. Model essay i is the i-th model essay in the model essay set, which contains z model essays in total. If the input is the training composition set and the model essay set, theme j is the j-th composition theme in the training composition set and the model essay set, and the number of themes is the total number of composition themes in these two sets, with value k. If the input is the set of compositions to be corrected and the model essay set, theme j is the j-th composition theme in the set of compositions to be corrected and the model essay set, and the number of themes is the total number of composition themes in these two sets, with value k. The theme sampling number is the parameter of the symmetric Dirichlet distribution of the model essay content theme probability distribution, with value 0.1.
(4) Formula for the model essay theme feature word probability distribution based on training compositions
The model essay theme feature word probability distribution based on training compositions is the probability distribution of the model essay themes over the feature words, calculated from the feature word counts of the training compositions and the model essays. Its calculation formula is as follows:
In formula (4), |number of times model essay feature word i is assigned to theme j + feature word sampling number|_ij is a matrix of i rows and j columns, and the accompanying term is a matrix of j columns, where i = 1, 2, ..., r and j = 1, 2, ..., k. Model essay feature word i is the i-th feature word, among those of the training composition set and the model essay set, that occurs in the model essays; the total number of feature words in the training composition set and the model essay set is r. Theme j is the j-th composition theme in the training composition set and the model essay set; the total number of composition themes in these two sets is k. A feature word is a word in the composition content that is related to the composition theme. The feature word number is the total number of feature words in the training composition set and the model essay set, with value r. The feature word sampling number is the parameter of the symmetric Dirichlet distribution of the model essay content theme probability distribution, with value 0.01.
(5) Formula for the training composition subject-relevance judgment value
The training composition subject-relevance judgment value is obtained by finding the maximum theme of the training composition in the training composition content theme probability distribution, and it is used to judge whether the training composition content sets out the author's thoughts around the composition theme. Its calculation formula is as follows:
In formula (5), the maximum theme of the training composition is the training composition theme with the largest probability in the training composition content theme probability distribution calculated by formula (1), and the maximum theme of the model essay is the model essay theme with the largest probability in the model essay content theme probability distribution calculated by formula (3).
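The comparison of maximum themes can be sketched in Python as follows. The embodiment reports a judgment value of 1 when the two maximum themes are identical; returning 0 when they differ is an assumption, because the image of formula (5) is not available in this text. The function names and the sample distributions are hypothetical.

```python
def maximum_theme(theme_distribution):
    """Return the index of the theme with the largest probability."""
    return max(range(len(theme_distribution)), key=lambda j: theme_distribution[j])


def relevance_judgment_value(composition_distribution, model_essay_distribution):
    """1 if both maximum themes coincide, otherwise 0 (the 0 case is assumed)."""
    same = maximum_theme(composition_distribution) == maximum_theme(model_essay_distribution)
    return 1 if same else 0


# Hypothetical content theme probability distributions over three themes.
on_subject = relevance_judgment_value([0.1, 0.7, 0.2], [0.2, 0.5, 0.3])
off_subject = relevance_judgment_value([0.6, 0.3, 0.1], [0.2, 0.5, 0.3])
```

Note that `max` returns the first index when several themes tie for the largest probability; how the patent resolves ties is not stated.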
(6) Formula for the training composition subject-relevance degree
The training composition subject-relevance degree is the degree of closeness between the training composition content and the requirements of its composition theme. Its calculation formula is as follows:
In formula (6), theme j is the j-th composition theme in the training composition set and the model essay set, and model essay i is the i-th model essay in the model essay set. The total number of composition themes in the training composition set and the model essay set is k, and the total number of model essays in the model essay set is z. The training composition subject-relevance degree ranges from 0 to 2: the more relevant the training composition content is to the subject, the larger the value. If the training composition content is completely irrelevant to the subject, the value is 0; if it is completely relevant, the value is 2.
(7) Formula for the model essay theme feature word probability distribution based on compositions to be corrected
The model essay theme feature word probability distribution based on compositions to be corrected is the probability distribution of the model essay themes over the feature words, calculated from the feature word counts of the compositions to be corrected and the model essays. Its calculation formula is as follows:
In formula (7), |number of times model essay feature word i is assigned to theme j + feature word sampling number|_ij is a matrix of i rows and j columns, and the accompanying term is a matrix of j columns, where i = 1, 2, ..., r and j = 1, 2, ..., k. Model essay feature word i is the i-th feature word, among those of the set of compositions to be corrected and the model essay set, that occurs in the model essays; the total number of feature words in these two sets is r. Theme j is the j-th composition theme in the set of compositions to be corrected and the model essay set; the total number of composition themes in these two sets is k. A feature word is a word in the composition content that is related to the composition theme. The feature word number is the total number of feature words in the set of compositions to be corrected and the model essay set, with value r. The feature word sampling number is the parameter of the symmetric Dirichlet distribution of the model essay content theme probability distribution, with value 0.01.
(8) Formula for the content theme probability distribution of compositions to be corrected
The content theme probability distribution of compositions to be corrected is the probability distribution of the content of the compositions to be corrected over their themes. Its calculation formula is as follows:
In formula (8), |number of feature words in composition to be corrected i assigned to theme j + theme sampling number|_ij is a matrix of i rows and j columns, and the accompanying term is a matrix of i rows, where i = 1, 2, ..., u and j = 1, 2, ..., k. Composition to be corrected i is the i-th composition in the set of compositions to be corrected, which contains u compositions in total. Theme j is the j-th composition theme in the set of compositions to be corrected and the model essay set; the number of themes is the total number of composition themes in these two sets, with value k. A feature word is a word in the composition content that is related to the composition theme. The theme sampling number is the parameter of the symmetric Dirichlet distribution of the content theme probability distribution of compositions to be corrected, with value 0.1.
(9) Formula for the theme feature word probability distribution of compositions to be corrected
The theme feature word probability distribution of compositions to be corrected is the probability distribution of the themes of the compositions to be corrected over the feature words. Its calculation formula is as follows:
In formula (9), |number of times feature word i of the compositions to be corrected is assigned to theme j + feature word sampling number|_ij is a matrix of i rows and j columns, and the accompanying term is a matrix of j columns, where i = 1, 2, ..., v and j = 1, 2, ..., k. Feature word i of the compositions to be corrected is the i-th feature word, among those of the set of compositions to be corrected and the model essay set, that occurs in the compositions to be corrected; the total number of feature words in these two sets is v. Theme j is the j-th composition theme in the set of compositions to be corrected and the model essay set; the total number of composition themes in these two sets is k. A feature word is a word in the composition content that is related to the composition theme. The feature word number is the total number of feature words in the set of compositions to be corrected and the model essay set, with value v. The feature word sampling number is the parameter of the symmetric Dirichlet distribution of the theme feature word probability distribution of compositions to be corrected, with value 0.01.
(10) Formula for the subject-relevance judgment value of the composition to be corrected
The subject-relevance judgment value of the composition to be corrected is obtained by finding the maximum theme of the composition to be corrected in its content theme probability distribution, and it is used to judge whether the content of the composition to be corrected sets out the author's thoughts around the composition theme. Its calculation formula is as follows:
In formula (10), the maximum theme of the composition to be corrected is the theme with the largest probability in the content theme probability distribution of the composition to be corrected calculated by formula (8), and the maximum theme of the model essay is the model essay theme with the largest probability in the model essay content theme probability distribution calculated by formula (3).
(11) Formula for the subject-relevance degree of the composition to be corrected
The subject-relevance degree of the composition to be corrected is the degree of closeness between the content of the composition to be corrected and its composition theme. Its calculation formula is as follows:
In formula (11), theme j is the j-th composition theme in the set of compositions to be corrected and the model essay set, and model essay i is the i-th model essay in the model essay set. The total number of composition themes in the set of compositions to be corrected and the model essay set is k, and the total number of model essays in the model essay set is z. The subject-relevance degree of the composition to be corrected ranges from 0 to 2: the more relevant the content is to the subject, the larger the value. If the content of the composition to be corrected is completely irrelevant to the subject, the value is 0; if it is completely relevant, the value is 2.
(4) Specific steps
The processing flows of the English composition subject-relevance analysis training module and the English composition subject-relevance analysis and scoring module of the method are described below.
As shown in Fig. 2, the processing flow of the English composition subject-relevance analysis training module is as follows:
S0201 Start;
S0202 Read in the model essay set;
S0203 Read in the training composition set;
S0204 Remove the stop words, punctuation, and abbreviations from the model essay set and the training composition set;
S0205 Calculate the theme probability distribution of the feature words in the training composition set and the model essay set;
S0206 Set the maximum number of iterations;
S0207 If the iteration count is greater than the maximum number of iterations, go to S0211;
S0208 Calculate the training composition content theme probability distribution according to formula (1), the training composition theme feature word probability distribution according to formula (2), the model essay content theme probability distribution according to formula (3), and the model essay theme feature word probability distribution based on training compositions according to formula (4);
S0209 Calculate the product of the training composition content theme probability distribution and the training composition theme feature word probability distribution, and calculate the product of the model essay content theme probability distribution and the model essay theme feature word probability distribution;
S0210 Increase the iteration count by 1 and go to S0207;
S0211 Save the training composition content theme probability distribution, the training composition theme feature word probability distribution, the model essay content theme probability distribution, and the model essay theme feature word probability distribution based on training compositions;
S0212 Find the maximum theme of the training composition in the training composition content theme probability distribution, and find the maximum theme of the model essay in the model essay content theme probability distribution;
S0213 Calculate the training composition subject-relevance judgment value according to formula (5);
S0214 Calculate the training composition subject-relevance degree according to formula (6);
S0215 Analyze the calculated training composition subject-relevance judgment values and subject-relevance degrees together with the manual subject-relevance judgment values of the training compositions, to obtain an English composition subject-relevance degree analysis standard that is consistent with the manual subject-relevance judgments of the training compositions;
S0216 Output the English composition subject-relevance degree analysis standard;
S0217 End.
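The iterative part of this flow (S0205 to S0211) can be illustrated with a short Python sketch. It assumes collapsed Gibbs sampling for latent Dirichlet allocation, a technique the patent text does not name; the product of the two smoothed terms in the sampling weight loosely corresponds to the product computed in S0209. The function name, the default parameter values other than 0.1 and 0.01, and the toy documents are all hypothetical.

```python
import random


def train_themes(docs, k, alpha=0.1, beta=0.01, max_iter=50, seed=0):
    """Assumed collapsed Gibbs sampler; docs is a list of feature word lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    m = len(vocab)
    doc_theme = [[0] * k for _ in docs]        # feature words of doc d assigned to theme j
    word_theme = [[0] * k for _ in range(m)]   # times feature word i is assigned to theme j
    theme_total = [0] * k
    assignment = []
    for d, doc in enumerate(docs):             # random initial assignment (cf. S0205)
        themes = [rng.randrange(k) for _ in doc]
        assignment.append(themes)
        for w, t in zip(doc, themes):
            doc_theme[d][t] += 1
            word_theme[word_id[w]][t] += 1
            theme_total[t] += 1
    for _ in range(max_iter):                  # iteration loop (cf. S0206 to S0210)
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                wi, t = word_id[w], assignment[d][n]
                doc_theme[d][t] -= 1           # remove the current assignment
                word_theme[wi][t] -= 1
                theme_total[t] -= 1
                weights = [
                    (doc_theme[d][j] + alpha)
                    * (word_theme[wi][j] + beta) / (theme_total[j] + m * beta)
                    for j in range(k)
                ]
                t = rng.choices(range(k), weights=weights)[0]
                assignment[d][n] = t           # record the resampled theme
                doc_theme[d][t] += 1
                word_theme[wi][t] += 1
                theme_total[t] += 1
    theta = [
        [(c + alpha) / (len(doc) + k * alpha) for c in row]
        for row, doc in zip(doc_theme, docs)
    ]
    return theta, vocab, word_theme


toy_docs = [["job", "people", "view", "job"], ["haste", "waste", "haste"]]
theta, vocab, word_theme = train_themes(toy_docs, k=2, max_iter=20)
```

The returned `theta` plays the role of the content theme probability distribution saved in S0211; with only two tiny documents the themes it finds are not meaningful, so the sketch shows the control flow rather than realistic results.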
As shown in Fig. 3, the processing flow of the English composition subject-relevance analysis and scoring module is as follows:
S0301 Start;
S0302 Read in the model essay set;
S0303 Read in the composition to be corrected;
S0304 Remove the stop words, punctuation, and abbreviations from the model essay set and the composition to be corrected;
S0305 Calculate the theme probability distribution of the feature words in the composition to be corrected and the model essay set;
S0306 Set the maximum number of iterations;
S0307 If the iteration count is greater than the maximum number of iterations, go to S0311;
S0308 Calculate the content theme probability distribution of the composition to be corrected according to formula (8), the theme feature word probability distribution of the composition to be corrected according to formula (9), the model essay content theme probability distribution according to formula (3), and the model essay theme feature word probability distribution based on compositions to be corrected according to formula (7);
S0309 Calculate the product of the content theme probability distribution and the theme feature word probability distribution of the composition to be corrected, and calculate the product of the model essay content theme probability distribution and the model essay theme feature word probability distribution;
S0310 Increase the iteration count by 1 and go to S0307;
S0311 Save the content theme probability distribution of the composition to be corrected, the theme feature word probability distribution of the composition to be corrected, the model essay content theme probability distribution, and the model essay theme feature word probability distribution based on compositions to be corrected;
S0312 Find the maximum theme of the composition to be corrected in its content theme probability distribution, and find the maximum theme of the model essay in the model essay content theme probability distribution;
S0313 Calculate the subject-relevance judgment value of the composition to be corrected according to formula (10);
S0314 Calculate the subject-relevance degree of the composition to be corrected according to formula (11);
S0315 Output the subject-relevance result of the composition to be corrected;
S0316 End.
(4) Brief description of the drawings
Fig. 1 is the overall processing flowchart of the method of the invention;
Fig. 2 is the processing flowchart of the English composition subject-relevance analysis training module of the method of the invention;
Fig. 3 is the processing flowchart of the English composition subject-relevance analysis and scoring module of the method of the invention.
(5) Embodiment
The embodiment of the method for analyzing the subject relevance of English composition content according to the invention is divided into the following two steps.
First step: execute the "English composition subject-relevance analysis training module"
One. The input model essay set and training composition set are English compositions drawn from the Chinese Learner English Corpus (CLEC). The title of the model essays in this embodiment is "My View on Job-Hopping"; this is not a limitation of the invention, and model essays with other titles may also be used. The titles of the training compositions input in this embodiment include "My View on Job-Hopping" and "Haste Makes Waste". The composition themes in this embodiment are:
Theme 1: view, job-hopping, people, enjoy, taking
Theme 2: perseverance, child, view, job-hopping, people
Theme 3: view, job-hopping, exercise, work, confidence
Theme 4: view, job-hopping, people, enjoy, taking
Theme 5: changing, excellently, view, job-hopping, people
Theme 6: job, people, view, change, job-hopping
Theme 7: job, devote, feel, view, job-hopping
Theme 8: job, challenges, good, view, job-hopping
Theme 9: life, jobs, people, likes, whatever
Theme 10: makes, haste, waste, reason, quickly
When the title of the input training composition is "My View on Job-Hopping", the results of the embodiment are as follows:
(1) Input the model essay set and the training composition. The content of one of the English compositions is as follows:
My View on Job-Hopping
Some people enjoy taking up one job all their life. Because they think that it can exercise their perseverance. Another reason is that someone has a wish that he want to devote himself to one job which he likes best from a child. Others do the work all the time only because of their characters.
However, some people like changing their jobs because that they like challenges. They always have confidence that they can finish any work by their efforts.
My view on job-hopping is that whatever jobs you do, you should like them. If you want to do a job excellently, you must be interested in it at first. Without interests, you can not devote yourself on it. Then, you certainly can not do it well. But, when you put your hearts on the job, you will find it so good, and you will feel that your life is also lively.
(2) After the stop words, punctuation, and abbreviations are removed from the input English composition, the resulting composition content is as follows:
view job-hopping people enjoy taking job life exercise perseverance reason wish devote job likes best child work time characters people changing jobs challenges confidence finish work efforts view job-hopping whatever jobs job excellently interested interests devote well put hearts job find good feel life lively
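The preprocessing step can be sketched in Python as follows. The stop-word list here is a small hypothetical sample, not the list used in the patent, and treating contractions such as "can't" as the abbreviations to be removed is an assumption. The function name is also hypothetical.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"my", "on", "some", "up", "one", "all", "their"}


def extract_feature_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, drop punctuation, contractions, and stop words."""
    # Keep hyphenated words such as "job-hopping" as single tokens.
    tokens = re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())
    # Assumption: tokens containing an apostrophe count as abbreviations.
    tokens = [t for t in tokens if "'" not in t]
    return [t for t in tokens if t not in stop_words]


sample = "My View on Job-Hopping. Some people enjoy taking up one job all their life."
feature_words = extract_feature_words(sample)
print(" ".join(feature_words))
```

With this sample stop-word list the output is "view job-hopping people enjoy taking job life", which matches the opening words of the processed composition in the embodiment.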
(3) Iterative processing is performed on the input model essay set and training composition after removal of the stop words, punctuation, and abbreviations. The resulting training composition content theme probability distribution and model essay content theme probability distribution are as follows:
The resulting training composition theme feature word probability distribution and model essay theme feature word probability distribution are as follows. Because these distributions are too large to list in full, only part of each is shown below; the remainder is replaced by ellipses:
(3) The maximum theme of the training composition is found from the training composition content theme probability distribution, and the maximum theme of the model essay is found from the model essay content theme probability distribution. The results are as follows:
Maximum theme of the training composition: theme 6
Maximum theme of the model essay: theme 6
(4) Calculate the training composition subject-relevance judgment value
The training composition subject-relevance judgment value is calculated according to formula (5). Because the maximum theme of the training composition is the same as the maximum theme of the model essay, the judgment value is 1, that is, the training composition is relevant to the subject.
(5) According to formula (6), the training composition subject-relevance degree is calculated from the training composition content theme probability distribution and the model essay content theme probability distribution. The result is:
Training composition subject-relevance degree: 1.6458646966570719
Two. When the title of the input training composition is "Haste Makes Waste", the results of the embodiment are as follows:
(1) Input the model essay set and the training composition. The content of one of the English compositions is as follows:
Haste Makes Waste
As a proverb say: Haste Makes Waste. It's quite clear that a haste people can't make achievement because he hasn't prepared enough. It is known to all of us. No one can deny the proverb. Haste makes waste. For example: a very young baby, as we all know, can't walk very well. He walks slowly. He throws himself to the ground now and then. However, his mother let him run to her. He can't reach to her without any help. Every one learns to walk in childhood. No one can deny it cost him many time to walk well, much more time to run. From the above we can conclude that without preparing can't make a success. I have the opinion that haste makes waste. So we should think it over before we begin it. Don't you think so?
(2) After the stop words, punctuation, and abbreviations are removed from the input English composition, the resulting composition content is as follows:
haste makes waste proverb say haste makes waste quite clear haste people make achievement because prepared enough known deny the proverb haste makes waste example young baby walk walks slowly throws ground however mother let run reach without help learns walk childhood deny cost time walk well more time run conclude without preparing make success opinion haste makes waste think begin think
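The preprocessing in step (2) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the stop-word list `STOP_WORDS` and the function name `preprocess` are hypothetical, and treating "abbreviations" as apostrophe contractions such as "can't" is an assumption based on the sample output above.

```python
import re

# Hypothetical stop-word list for illustration only; the actual list is not given in the text.
STOP_WORDS = {"a", "as", "it", "is", "that", "to", "of", "he", "his", "him", "her",
              "we", "us", "all", "no", "one", "very", "can", "for", "i", "so"}

def preprocess(text):
    """Lowercase the text, drop contractions and punctuation, then remove stop words."""
    text = text.lower()
    # Assumption: "abbreviations" are contractions such as "it's" or "can't".
    text = re.sub(r"\b\w+'\w+\b", " ", text)
    # Replace every remaining non-letter character (punctuation, digits) with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    return [word for word in text.split() if word not in STOP_WORDS]

tokens = preprocess("As a proverb say: Haste Makes Waste. It's quite clear.")
print(tokens)
```

The returned token list is the kind of input that the later iterative topic-distribution step would consume.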
(3) Iterative processing is performed on the input model essay set and training composition after stop words, punctuation, and abbreviations have been removed. The generated content-topic probability distributions of the training composition and of the model essays are as follows:
The generated topic-feature-word probability distributions of the training composition and of the model essays are as follows. These distributions are too large to list in full, so only part of each is shown below and the remainder is replaced by ellipses:
(4) The dominant topic (the topic with the highest probability) of the training composition is found from the training composition content-topic probability distribution, and the dominant topic of the model essays is found from the model essay content-topic probability distribution. The results are as follows:
Dominant topic of the training composition: topic 10
Dominant topic of the model essays: topic 7
(5) Calculate the topic-relevance judgment value of the training composition
The topic-relevance judgment value of the training composition is calculated according to formula (5). Because the dominant topic of the training composition differs from the dominant topic of the model essays, the judgment value is 0; that is, the training composition is off topic.
(6) According to formula (6), the topic-relevance degree of the training composition is calculated from the training composition content-topic probability distribution and the model essay content-topic probability distribution. The result is:
Topic-relevance degree of the training composition: 0.025421879261034
Three. The topic-relevance judgment value and topic-relevance degree calculated for every training composition in the training composition set are analyzed together with the manual topic-relevance judgment values of the training compositions, to obtain an English composition topic-relevance degree analysis standard that is consistent with the manual topic-relevance judgments.
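The text does not say how consistency with the manual judgments is measured. As one possible illustration only, the sketch below computes the fraction of training compositions whose calculated judgment value equals the manual one. The function name `agreement_rate` and the sample values are hypothetical and are not taken from the patent.

```python
def agreement_rate(computed, manual):
    """Fraction of compositions whose computed judgment value equals the manual one."""
    if len(computed) != len(manual) or not computed:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for c, m in zip(computed, manual) if c == m)
    return matches / len(computed)

# Hypothetical judgment values for four training compositions (1 = on topic, 0 = off topic).
computed_values = [1, 0, 1, 1]
manual_values = [1, 0, 0, 1]
rate = agreement_rate(computed_values, manual_values)
print(rate)  # 0.75
```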
Second step: execute the "English composition topic-relevance analysis and scoring module"
The English composition topic-relevance analysis and scoring module uses the English composition topic-relevance degree analysis standard generated in the first step to perform topic-relevance analysis on a composition to be graded, and finally outputs the topic-relevance analysis result for that composition.
(1) Input a composition to be graded. The following is one titled "My View on Job-Hopping":
My View on Job-Hopping
In these days, we may change our jobs constantly for all kinds of reasons. But do people like it? Here are some news.
Someone like do one job all along. They think that doing one job for long time, they may get lots of experience from it and do it better and better. More important is that workmates are familiar to each other. However, someone change their jobs constantly. They think that only do many jobs, can they find which one they like most and they may have more skills, meet more people and know more.
I think if you like your jobs. You may go on with it, it is good for your future. If you disgust it, you may change it and look for better ones. But be careful, you must do everything from the very beginning when you get a new one.
Topic-relevance analysis is performed on the composition to be graded. The analysis results are as follows:
Dominant topic of the composition to be graded: topic 6
Dominant topic of the model essays: topic 6
Topic-relevance judgment value of the composition to be graded: 1
Topic-relevance degree of the composition to be graded: 1.7093883624062147.
(2) Input a composition to be graded. The following is one titled "Haste Makes Waste":
Haste Makes Waste
In China there is a proverb: Haste makes waste. It means if you want something to be done quickly, however, it would work slowly; if you want to make something done better, but it would be worse. Why people think haste makes waste? The reason is that, when someone plans to do something, he always hopes to do it as quickly as possible, which may result in failure, so he must do it from the beginning, leading to waste time.
For example, in winter, students get up late. Because we are too late to catch the bus on time, so we want to save time. We may hurry to carry our books and notes to the classroom, but when we reach the classroom, we would find the pen left in the dormitory, or we find we got the wrong notes. This is a good example of haste makes waste.
In order to avoid of haste makes waste, we should do everything carefully, we should not wonder the result but wonder be careful. So, we can do everything well rather than haste makes waste.
Topic-relevance analysis is performed on the composition to be graded. The analysis results are as follows:
Dominant topic of the composition to be graded: topic 9
Dominant topic of the model essays: topic 7
Topic-relevance judgment value of the composition to be graded: 0
Topic-relevance degree of the composition to be graded: 0.142576948213569.

Claims (4)

1. A method for analyzing the topic relevance of English composition content, characterized in that: first, an English composition topic-relevance analysis training module performs training on a model essay set and a training composition set to construct an English composition topic-relevance degree analysis standard; second, an English composition topic-relevance analysis and scoring module analyzes a composition to be graded and, according to the English composition topic-relevance degree analysis standard, judges whether the composition to be graded is on topic by calculating its topic-relevance degree.
2. The method according to claim 1, characterized in that the calculation formulas of the English composition topic-relevance analysis training module and the English composition topic-relevance analysis and scoring module are as follows:
(1) Formula for the content-topic probability distribution of a training composition
The content-topic probability distribution of a training composition is the probability distribution of the training composition's content over its topics. It is calculated as follows:
In formula (1), |number of feature words of training composition_i assigned to topic_j + topic sampling number|_ij is a matrix with i rows and j columns, and the corresponding normalizing term is a matrix with i rows, where i = 1, 2, ..., n and j = 1, 2, ..., k. A feature word is a word in the composition content that is related to the composition topic. Training composition_i is the i-th training composition in the training composition set, which contains n training compositions in total. Topic_j is the j-th composition topic in the training composition set and the model essay set. The number of topics is the total number of composition topics in the training composition set and the model essay set, and its value is k. The topic sampling number is the parameter of the symmetric Dirichlet distribution for the training composition content-topic probability distribution, and its value is 0.1;
(2) Formula for the topic-feature-word probability distribution of a training composition
The topic-feature-word probability distribution of a training composition is the probability distribution of the training composition's topics over feature words. It is calculated as follows:
In formula (2), |number of times training composition feature word_i is assigned to topic_j + feature-word sampling number|_ij is a matrix with i rows and j columns, and the corresponding normalizing term is a matrix with j columns, where i = 1, 2, ..., m and j = 1, 2, ..., k. A feature word is a word in the composition content that is related to the composition topic. Training composition feature word_i is the i-th feature word, among the feature words of the training composition set and the model essay set, that occurs in the training composition; the training composition set and the model essay set contain m feature words in total. The number of feature words is the total number of feature words in the training composition set and the model essay set, and its value is m. The feature-word sampling number is the parameter of the symmetric Dirichlet distribution for the training composition topic-feature-word probability distribution, and its value is 0.01. Topic_j is the j-th composition topic in the training composition set and the model essay set. The number of topics is the total number of composition topics in the training composition set and the model essay set, and its value is k;
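The definitions given for formulas (1) and (2) resemble the standard smoothed estimates used with latent Dirichlet allocation, in which a count matrix plus a Dirichlet parameter is divided by its row or column total. The sketch below works under that assumption; the formulas themselves are not reproduced in this text, so the exact normalizing terms, the function names, and the sample counts are all hypothetical.

```python
ALPHA = 0.1   # topic sampling number stated in the text
BETA = 0.01   # feature-word sampling number stated in the text

def doc_topic_distribution(doc_topic_counts, alpha=ALPHA):
    """Row-normalize counts[i][j] = feature words of composition i assigned to topic j."""
    result = []
    for row in doc_topic_counts:
        k = len(row)
        denom = sum(row) + k * alpha  # assumed normalizing term, one value per row i
        result.append([(count + alpha) / denom for count in row])
    return result

def topic_word_distribution(word_topic_counts, beta=BETA):
    """Column-normalize counts[i][j] = times feature word i was assigned to topic j."""
    m = len(word_topic_counts)
    k = len(word_topic_counts[0])
    # Assumed normalizing term, one value per column (topic) j.
    col_sums = [sum(word_topic_counts[i][j] for i in range(m)) for j in range(k)]
    return [[(word_topic_counts[i][j] + beta) / (col_sums[j] + m * beta)
             for j in range(k)] for i in range(m)]

# Hypothetical counts: 2 compositions x 2 topics, and 3 feature words x 2 topics.
theta = doc_topic_distribution([[3, 1], [0, 4]])
phi = topic_word_distribution([[2, 0], [1, 1], [0, 4]])
```

Under this reading, each row of `theta` sums to 1 (a distribution over topics for one composition) and each column of `phi` sums to 1 (a distribution over feature words for one topic).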
(3) Formula for the content-topic probability distribution of a model essay
The content-topic probability distribution of a model essay is the probability distribution of the model essay's content over its topics. It is calculated as follows:
In formula (3), |number of feature words of model essay_i assigned to topic_j + topic sampling number|_ij is a matrix with i rows and j columns, and the corresponding normalizing term is a matrix with i rows, where i = 1, 2, ..., z and j = 1, 2, ..., k. A feature word is a word in the composition content that is related to the composition topic. Model essay_i is the i-th model essay in the model essay set, which contains z model essays in total. If the input is the training composition set and the model essay set, topic_j is the j-th composition topic in the training composition set and the model essay set, and the number of topics is the total number of composition topics in those two sets, with value k. If the input is the set of compositions to be graded and the model essay set, topic_j is the j-th composition topic in the set of compositions to be graded and the model essay set, and the number of topics is the total number of composition topics in those two sets, with value k. The topic sampling number is the parameter of the symmetric Dirichlet distribution for the model essay content-topic probability distribution, and its value is 0.1;
(4) Formula for the model essay topic-feature-word probability distribution based on the training compositions
The model essay topic-feature-word probability distribution based on the training compositions is the probability distribution of the model essay topics over feature words, calculated from the feature-word counts of the training compositions and the model essays. It is calculated as follows:
In formula (4), |number of times model essay feature word_i is assigned to topic_j + feature-word sampling number|_ij is a matrix with i rows and j columns, and the corresponding normalizing term is a matrix with j columns, where i = 1, 2, ..., r and j = 1, 2, ..., k. Model essay feature word_i is the i-th feature word, among the feature words of the training composition set and the model essay set, that occurs in the model essays; the training composition set and the model essay set contain r feature words in total. Topic_j is the j-th composition topic in the training composition set and the model essay set, which contain k composition topics in total. A feature word is a word in the composition content that is related to the composition topic. The number of feature words is the total number of feature words in the training composition set and the model essay set, and its value is r. The feature-word sampling number is the parameter of the symmetric Dirichlet distribution for the model essay topic-feature-word probability distribution, and its value is 0.01;
(5) Formula for the topic-relevance judgment value of a training composition
The topic-relevance judgment value of a training composition is obtained by finding the dominant topic of the training composition from the training composition content-topic probability distribution, and it is used to judge whether the content of the training composition develops the author's ideas around the composition topic. It is calculated as follows:
In formula (5), the dominant topic of the training composition is the training composition topic with the highest probability in the training composition content-topic probability distribution calculated by formula (1), and the dominant topic of the model essays is the model essay topic with the highest probability in the model essay content-topic probability distribution calculated by formula (3). The judgment value is 1 when the two dominant topics are the same and 0 when they differ;
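The judgment described for formula (5) can be sketched as below: find the highest-probability topic in each distribution and compare the two. The function names and sample distributions are hypothetical, and representing the model essays by a single topic distribution is a simplifying assumption.

```python
def dominant_topic(topic_probs):
    """Return the 1-based index of the topic with the highest probability."""
    return max(range(len(topic_probs)), key=lambda j: topic_probs[j]) + 1

def judgment_value(composition_probs, model_probs):
    """Return 1 if both dominant topics coincide (on topic), otherwise 0 (off topic)."""
    return 1 if dominant_topic(composition_probs) == dominant_topic(model_probs) else 0

# Hypothetical distributions over k = 3 topics.
on_topic = judgment_value([0.1, 0.7, 0.2], [0.2, 0.6, 0.2])
off_topic = judgment_value([0.8, 0.1, 0.1], [0.2, 0.6, 0.2])
print(on_topic, off_topic)  # 1 0
```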
(6) Formula for the topic-relevance degree of a training composition
The topic-relevance degree of a training composition is the degree of closeness between the content of the training composition and the requirements of its composition topic. It is calculated as follows:
In formula (6), topic_j is the j-th composition topic in the training composition set and the model essay set, and model essay_i is the i-th model essay in the model essay set. The training composition set and the model essay set contain k composition topics in total, and the model essay set contains z model essays in total. The topic-relevance degree of a training composition ranges from 0 to 2: the more on topic the content of the training composition is, the larger the value. If the content is completely off topic, the value is 0; if the content is completely on topic, the value is 2;
(7) Formula for the model essay topic-feature-word probability distribution based on the compositions to be graded
The model essay topic-feature-word probability distribution based on the compositions to be graded is the probability distribution of the model essay topics over feature words, calculated from the feature-word counts of the compositions to be graded and the model essays. It is calculated as follows:
In formula (7), |number of times model essay feature word_i is assigned to topic_j + feature-word sampling number|_ij is a matrix with i rows and j columns, and the corresponding normalizing term is a matrix with j columns, where i = 1, 2, ..., r and j = 1, 2, ..., k. Model essay feature word_i is the i-th feature word, among the feature words of the set of compositions to be graded and the model essay set, that occurs in the model essays; the set of compositions to be graded and the model essay set contain r feature words in total. Topic_j is the j-th composition topic in the set of compositions to be graded and the model essay set, which contain k composition topics in total. A feature word is a word in the composition content that is related to the composition topic. The number of feature words is the total number of feature words in the set of compositions to be graded and the model essay set, and its value is r. The feature-word sampling number is the parameter of the symmetric Dirichlet distribution for the model essay topic-feature-word probability distribution, and its value is 0.01;
(8) Formula for the content-topic probability distribution of a composition to be graded
The content-topic probability distribution of a composition to be graded is the probability distribution of that composition's content over its topics. It is calculated as follows:
In formula (8), |number of feature words of composition to be graded_i assigned to topic_j + topic sampling number|_ij is a matrix with i rows and j columns, and the corresponding normalizing term is a matrix with i rows, where i = 1, 2, ..., u and j = 1, 2, ..., k. Composition to be graded_i is the i-th composition in the set of compositions to be graded, which contains u compositions in total. Topic_j is the j-th composition topic in the set of compositions to be graded and the model essay set. The number of topics is the total number of composition topics in the set of compositions to be graded and the model essay set, and its value is k. A feature word is a word in the composition content that is related to the composition topic. The topic sampling number is the parameter of the symmetric Dirichlet distribution for the content-topic probability distribution of the compositions to be graded, and its value is 0.1;
(9) Formula for the topic-feature-word probability distribution of a composition to be graded
The topic-feature-word probability distribution of a composition to be graded is the probability distribution of that composition's topics over feature words. It is calculated as follows:
In formula (9), |number of times feature word_i of the composition to be graded is assigned to topic_j + feature-word sampling number|_ij is a matrix with i rows and j columns, and the corresponding normalizing term is a matrix with j columns, where i = 1, 2, ..., v and j = 1, 2, ..., k. Feature word_i of the composition to be graded is the i-th feature word, among the feature words of the set of compositions to be graded and the model essay set, that occurs in the composition to be graded; the set of compositions to be graded and the model essay set contain v feature words in total. Topic_j is the j-th composition topic in the set of compositions to be graded and the model essay set, which contain k composition topics in total. A feature word is a word in the composition content that is related to the composition topic. The number of feature words is the total number of feature words in the set of compositions to be graded and the model essay set, and its value is v. The feature-word sampling number is the parameter of the symmetric Dirichlet distribution for the topic-feature-word probability distribution of the compositions to be graded, and its value is 0.01;
(10) Formula for the topic-relevance judgment value of a composition to be graded
The topic-relevance judgment value of a composition to be graded is obtained by finding the dominant topic of that composition from its content-topic probability distribution, and it is used to judge whether the content of the composition develops the author's ideas around the composition topic. It is calculated as follows:
In formula (10), the dominant topic of the composition to be graded is the topic with the highest probability in the content-topic probability distribution of the composition to be graded calculated by formula (8), and the dominant topic of the model essays is the model essay topic with the highest probability in the model essay content-topic probability distribution calculated by formula (3);
(11) Formula for the topic-relevance degree of a composition to be graded
The topic-relevance degree of a composition to be graded is the degree of closeness between the content of that composition and its composition topic. It is calculated as follows:
In formula (11), topic_j is the j-th composition topic in the set of compositions to be graded and the model essay set, and model essay_i is the i-th model essay in the model essay set. The set of compositions to be graded and the model essay set contain k composition topics in total, and the model essay set contains z model essays in total. The topic-relevance degree of a composition to be graded ranges from 0 to 2: the more on topic the content of the composition is, the larger the value. If the content is completely off topic, the value is 0; if the content is completely on topic, the value is 2.
3. The method according to claim 1, characterized in that the processing flow of the English composition topic-relevance analysis training module is as follows:
S0201 Start;
S0202 Read in the model essay set;
S0203 Read in the training composition set;
S0204 Remove stop words, punctuation, and abbreviations from the model essay set and the training composition set;
S0205 Calculate the topic probability distribution of the feature words in the training composition set and the model essay set;
S0206 Set the maximum number of iterations;
S0207 If the iteration count is greater than the maximum number of iterations, go to S0211;
S0208 Calculate the training composition content-topic probability distribution according to formula (1), the training composition topic-feature-word probability distribution according to formula (2), the model essay content-topic probability distribution according to formula (3), and the model essay topic-feature-word probability distribution based on the training compositions according to formula (4);
S0209 Calculate the product of the training composition content-topic probability distribution and the training composition topic-feature-word probability distribution, and calculate the product of the model essay content-topic probability distribution and the model essay topic-feature-word probability distribution;
S0210 Increase the iteration count by 1 and go to S0207;
S0211 Save the training composition content-topic probability distribution, the training composition topic-feature-word probability distribution, the model essay content-topic probability distribution, and the model essay topic-feature-word probability distribution based on the training compositions;
S0212 Find the dominant topic of the training composition from the training composition content-topic probability distribution, and find the dominant topic of the model essays from the model essay content-topic probability distribution;
S0213 Calculate the topic-relevance judgment value of the training composition according to formula (5);
S0214 Calculate the topic-relevance degree of the training composition according to formula (6);
S0215 Analyze the calculated topic-relevance judgment values and topic-relevance degrees of the training compositions together with their manual topic-relevance judgment values, to obtain an English composition topic-relevance degree analysis standard that is consistent with the manual topic-relevance judgments;
S0216 Output the English composition topic-relevance degree analysis standard;
S0217 End.
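The iteration in S0206 to S0210, where a content-topic term is multiplied by a topic-feature-word term on every pass, resembles collapsed Gibbs sampling for latent Dirichlet allocation. The text does not name the sampler, so the sketch below is an assumption rather than the claimed procedure; `gibbs_lda`, `max_iter`, `seed`, and the toy documents are hypothetical.

```python
import random

def gibbs_lda(docs, k, alpha=0.1, beta=0.01, max_iter=50, seed=0):
    """Minimal collapsed Gibbs sampler; docs is a list of feature-word lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    m = len(vocab)
    doc_topic = [[0] * k for _ in docs]        # feature words of doc d assigned to topic j
    word_topic = [[0] * k for _ in range(m)]   # times word i was assigned to topic j
    topic_total = [0] * k                      # feature words assigned to topic j overall
    assign = []
    # Initial step: give every feature word a random topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(k)
            zs.append(z)
            doc_topic[d][z] += 1
            word_topic[word_id[w]][z] += 1
            topic_total[z] += 1
        assign.append(zs)
    # Iterate until the maximum iteration count is reached.
    for _ in range(max_iter):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                wi, z = word_id[w], assign[d][n]
                # Remove the current assignment before resampling it.
                doc_topic[d][z] -= 1
                word_topic[wi][z] -= 1
                topic_total[z] -= 1
                # Weight of topic j: content-topic term times topic-feature-word term.
                weights = [(doc_topic[d][j] + alpha)
                           * (word_topic[wi][j] + beta) / (topic_total[j] + m * beta)
                           for j in range(k)]
                z = rng.choices(range(k), weights=weights)[0]
                assign[d][n] = z
                doc_topic[d][z] += 1
                word_topic[wi][z] += 1
                topic_total[z] += 1
    # Smoothed content-topic distribution for each document.
    return [[(c + alpha) / (len(doc) + k * alpha) for c in row]
            for row, doc in zip(doc_topic, docs)]

docs = [["haste", "waste", "haste"], ["job", "change", "job"]]
theta = gibbs_lda(docs, k=2)
```

Each row of the returned `theta` is one document's content-topic distribution, from which a dominant topic can then be read off as in S0212.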
4. The method according to claim 1, characterized in that the processing flow of the English composition topic-relevance analysis and scoring module is as follows:
S0301 Start;
S0302 Read in the model essay set;
S0303 Read in the composition to be graded;
S0304 Remove stop words, punctuation, and abbreviations from the model essay set and the composition to be graded;
S0305 Calculate the topic probability distribution of the feature words in the composition to be graded and the model essay set;
S0306 Set the maximum number of iterations;
S0307 If the iteration count is greater than the maximum number of iterations, go to S0311;
S0308 Calculate the content-topic probability distribution of the composition to be graded according to formula (8), the topic-feature-word probability distribution of the composition to be graded according to formula (9), the model essay content-topic probability distribution according to formula (3), and the model essay topic-feature-word probability distribution based on the compositions to be graded according to formula (7);
S0309 Calculate the product of the content-topic probability distribution and the topic-feature-word probability distribution of the composition to be graded, and calculate the product of the model essay content-topic probability distribution and the model essay topic-feature-word probability distribution;
S0310 Increase the iteration count by 1 and go to S0307;
S0311 Save the content-topic probability distribution of the composition to be graded, the topic-feature-word probability distribution of the composition to be graded, the model essay content-topic probability distribution, and the model essay topic-feature-word probability distribution based on the compositions to be graded;
S0312 Find the dominant topic of the composition to be graded from its content-topic probability distribution, and find the dominant topic of the model essays from the model essay content-topic probability distribution;
S0313 Calculate the topic-relevance judgment value of the composition to be graded according to formula (10);
S0314 Calculate the topic-relevance degree of the composition to be graded according to formula (11);
S0315 Output the topic-relevance result for the composition to be graded;
S0316 End.
CN201510204370.1A, priority date 2015-04-27, filing date 2015-04-27: Analysis method for subject relevance of English composition contents. Active. CN104778160B (en)

Publications (2)

Publication Number Publication Date
CN104778160A (en) 2015-07-15
CN104778160B (en) 2017-10-24

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183713A (en) * 2015-08-27 2015-12-23 北京时代焦点国际教育咨询有限责任公司 English composition automatic correcting method and system
CN106776551A (en) * 2016-12-06 2017-05-31 桂林电子科技大学 A kind of analysis method of english composition emotion viewpoint
CN106776549A (en) * 2016-12-06 2017-05-31 桂林电子科技大学 A kind of rule-based english composition syntax error correcting method
CN107301169A (en) * 2017-06-16 2017-10-27 科大讯飞股份有限公司 Digress from the subject composition detection method, device and terminal device
CN109508460A (en) * 2018-12-04 2019-03-22 广东外语外贸大学 Unsupervised composition based on Subject Clustering is digressed from the subject detection method and system
CN110264792A (en) * 2019-06-17 2019-09-20 上海元趣信息技术有限公司 One kind is for pupil's composition intelligent tutoring system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3708749B2 (en) * 1999-04-28 2005-10-19 日本電信電話株式会社 Operation-type learning scoring method, processing apparatus therefor, and recording medium recording a program for executing the method
CN1700200A (en) * 2005-05-30 2005-11-23 梁茂成 English composition automatic scoring system
CN102779220A (en) * 2011-05-10 2012-11-14 李德霞 English test paper scoring system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3708749B2 (en) * 1999-04-28 2005-10-19 日本電信電話株式会社 Operation-type learning scoring method, processing apparatus therefor, and recording medium recording a program for executing the method
CN1700200A (en) * 2005-05-30 2005-11-23 梁茂成 English composition automatic scoring system
CN102779220A (en) * 2011-05-10 2012-11-14 李德霞 English test paper scoring system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YA ZHOU 等: "An Automatic English Composition Scoring Model Based on Neural Network Algorithm", 《IEEE》 *
YU SHIWEN 等: "Automatic Evaluation of Output Quality for Machine Translation Systems", 《PROCEEDINGS OF THE EVALUATORS’ FORUM》 *
朱正才 等: "一个基于综合印象评分法的作文分事后调整模型", 《心理科学》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183713A (en) * 2015-08-27 2015-12-23 北京时代焦点国际教育咨询有限责任公司 English composition automatic correcting method and system
CN106776551A (en) * 2016-12-06 2017-05-31 桂林电子科技大学 A kind of analysis method of english composition emotion viewpoint
CN106776549A (en) * 2016-12-06 2017-05-31 桂林电子科技大学 A kind of rule-based english composition syntax error correcting method
CN106776549B (en) * 2016-12-06 2020-04-24 桂林电子科技大学 English composition grammar error correction method based on rules
CN106776551B (en) * 2016-12-06 2020-05-08 桂林电子科技大学 Method for analyzing emotion viewpoints of English composition
CN107301169A (en) * 2017-06-16 2017-10-27 科大讯飞股份有限公司 Off-topic composition detection method, device and terminal device
CN109508460A (en) * 2018-12-04 2019-03-22 广东外语外贸大学 Unsupervised off-topic composition detection method and system based on topic clustering
CN110264792A (en) * 2019-06-17 2019-09-20 上海元趣信息技术有限公司 Intelligent tutoring system for composition of pupils
CN110264792B (en) * 2019-06-17 2021-11-09 上海元趣信息技术有限公司 Intelligent tutoring system for composition of pupils

Also Published As

Publication number Publication date
CN104778160B (en) 2017-10-24

Similar Documents

Publication Publication Date Title
CN104778160A (en) Analysis method for subject relevance of English composition contents
Zhang Necessity of grammar teaching.
CN110516245A (en) Fine granularity sentiment analysis method, apparatus, computer equipment and storage medium
CN112084299B (en) Reading comprehension automatic question-answering method based on BERT semantic representation
Larsen-Freeman Complex dynamic systems theory
CN105426361A (en) Keyword extraction method and device
CN107291694A (en) Method and apparatus, storage medium and terminal for automatically reviewing compositions
CN106033462A (en) Neologism discovering method and system
Omer et al. The Effect of Culture Integrated Language Courses on Foreign Language Education.
Gomaa et al. Arabic short answer scoring with effective feedback for students
CN112132536A (en) Post recommendation method, system, computer equipment and storage medium
Joundy Hazar et al. Automated scoring for essay questions in e-learning
Nugraha et al. Teaching grammar through data-driven learning (DDL) approach
CN107092593A (en) Sentence semantic role recognition method and system for elementary mathematics stratified sampling word problems
Otilia English for Specific Purposes: Past and Present.
Chen et al. A study of interlanguage fossilization in second language acquisition and its teaching implications
Charnine et al. Optimal automated method for collaborative development of university curricula
Crosthwaite Learner corpus linguistics in the EFL classroom
Li An English Writing Grammar Error Correction Technology Based on Similarity Algorithm
Barabanova Bilingualism, multicultural and comparative law in engineering education
Barker et al. ChatGPT as a text simplification tool to remove bias
Huang The role of L1 in Chinese college students' English learning: A study of Kellerman's theory of language transfer
Sabitzer et al. Brain-based programming: a new concept for computer science education
Hauksdóttir An Innovative World Language Centre: Challenges for the Use of Language Technology.
Hutchinson Conclusion: The Case for Non-Erasure in Actor Training

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by SIPO to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20150715
Assignee: Guilin Ruisen Education Service Co., Ltd.
Assignor: Guilin University of Electronic Technology
Contract record no.: X2022450000186
Denomination of invention: A Method of Analyzing the Content of English Composition
Granted publication date: 20171024
License type: Common License
Record date: 20221125

Application publication date: 20150715
Assignee: Guilin ruiweisaide Technology Co., Ltd.
Assignor: Guilin University of Electronic Technology
Contract record no.: X2022450000190
Denomination of invention: A Method of Analyzing the Content of English Composition
Granted publication date: 20171024
License type: Common License
Record date: 20221125