CN109785698A - Method, apparatus, electronic equipment and medium for spoken language proficiency evaluation and test - Google Patents

Info

Publication number
CN109785698A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711111300.7A
Other languages
Chinese (zh)
Other versions
CN109785698B (en)
Inventor
林晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI LIULISHUO INFORMATION TECHNOLOGY Co Ltd
Original Assignee
SHANGHAI LIULISHUO INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI LIULISHUO INFORMATION TECHNOLOGY Co Ltd
Priority to CN201711111300.7A
Publication of CN109785698A
Application granted
Publication of CN109785698B
Legal status: Active
Anticipated expiration

Abstract

Embodiments of the present invention provide a method for spoken language proficiency evaluation, the method comprising: randomly selecting a test question from a question bank; collecting speech data to be evaluated for the test question; obtaining corresponding text data to be evaluated and pronunciation features to be evaluated from the speech data to be evaluated; obtaining a first semantic relevance between the text data to be evaluated and the test question; and obtaining a scoring result according to the first semantic relevance and the pronunciation features to be evaluated. This solves the prior-art problem that semantic relevance cannot be calculated directly from the question text and the speech data. The method enables users to take oral tests or examinations over the Internet, which greatly improves test and examination efficiency and improves user experience. In addition, embodiments of the present invention further provide a medium, an apparatus, and an electronic device for spoken language proficiency evaluation.

Description

Method, apparatus, electronic equipment and medium for spoken language proficiency evaluation and test
Technical field
Embodiments of the present invention relate to the field of computer-assisted instruction, and more specifically to a method, apparatus, electronic device, and medium for spoken language proficiency evaluation.
Background
This section is intended to provide background or context for the embodiments of the present invention recited in the claims. The description herein is not admitted to be prior art merely by inclusion in this section.
At present, spoken language assessment is mostly carried out by human raters, but human assessment has the following disadvantages:
1. Subjective scoring: scores depend mainly on the examiner's personal judgment and vary considerably from one examiner to another.
2. High labor cost: most human spoken assessments must be booked in advance and held in centralized sessions, and are heavily constrained by time, location, headcount, and monetary cost.
3. Low professionalism: the professional qualifications and competence of a testing organization's raters and tutors are difficult to guarantee.
4. Low efficiency and poor repeatability: human assessment is mostly one-to-many, or a small number of raters face a large number of examinees, so the proportion of time spent on actual assessment is low; moreover, examinees cannot repeatedly review their own answers and compare them against the evaluation results.
There are also methods that evaluate and analyze a user's speech by means of a program.
Summary of the invention
However, existing program-based assessment has the following characteristics or deficiencies:
1. Insufficient accuracy: program-based assessment on the market is affected by the recording device, the environment, the user's accent, and so on, so the recognition success rate (that is, the accuracy) for user speech is very low. Most spoken mock-exam software even relies solely on user likes to surface high-quality answers (as shown in Fig. 8) without providing any evaluation.
2. Single evaluation dimension: most tools evaluate only speech length and fluency, and cannot evaluate the user's pronunciation, grammar, pauses, vocabulary, semantic relevance, and so on.
3. Low scoring efficiency: the scoring process is inefficient, and a long time elapses between the start of scoring and the generation of the evaluation report.
4. Thin analysis: only a spoken score is provided; an overall proficiency evaluation, a cross-comparison of spoken proficiency, per-dimension evaluations, error analysis, standard pronunciation, and directions for improvement are all missing.
On the other hand, prior-art methods mainly score test or examination questions that have a standard reference answer. However, speaking tests (such as IELTS) contain a large number of subjective questions that have no standard reference answer, and how to score them by machine is a technical problem that urgently needs to be solved.
Therefore, in the prior art, how to machine-score subjective speaking questions that have no standard reference answer, and how to score and evaluate an examinee's spoken language comprehensively across multiple dimensions, are very troublesome technical problems.
Thus, an improved technical solution for spoken language proficiency evaluation is highly desirable. Embodiments of the present invention randomly select a test question from a question bank; collect speech data to be evaluated for the test question; obtain corresponding text data to be evaluated from the speech data to be evaluated; and obtain a first semantic relevance between the text data to be evaluated and the test question, so as to obtain a scoring result according to the first semantic relevance.
In this context, embodiments of the present invention are intended to provide a method, medium, apparatus, and electronic device for spoken language proficiency evaluation.
In a first aspect of embodiments of the present invention, a method for spoken language proficiency evaluation is provided, comprising: randomly selecting a test question from a question bank; collecting speech data to be evaluated for the test question; obtaining corresponding text data to be evaluated and pronunciation features to be evaluated from the speech data to be evaluated; obtaining a first semantic relevance between the text data to be evaluated and the test question; and obtaining a scoring result according to the first semantic relevance and the pronunciation features to be evaluated.
In one embodiment of the present invention, the method further comprises: obtaining corresponding evaluation dimensions and scoring criteria according to the type of the test question.
In yet another embodiment of the present invention, the evaluation dimensions include a grammar evaluation dimension and/or a vocabulary evaluation dimension and/or a pronunciation evaluation dimension and/or a fluency evaluation dimension, and the corresponding scoring criteria include grammar scoring criteria and/or vocabulary scoring criteria and/or pronunciation scoring criteria and/or fluency scoring criteria. The method further comprises: obtaining a grammar score according to the text data to be evaluated and the grammar scoring criteria; and/or obtaining a vocabulary score according to the text data to be evaluated and the vocabulary scoring criteria; and/or obtaining a pronunciation score according to the pronunciation features to be evaluated and the pronunciation scoring criteria; and/or obtaining a fluency score according to the pronunciation features to be evaluated and the fluency scoring criteria.
In yet another embodiment of the present invention, the method further comprises: obtaining the scoring result according to the grammar score and/or the vocabulary score and/or the pronunciation score and/or the fluency score.
In yet another embodiment of the present invention, the method further comprises: obtaining a second semantic relevance between the text data to be evaluated and the standard answer to the test question; and obtaining the scoring result according to the second semantic relevance.
In yet another embodiment of the present invention, the method further comprises: analyzing the scoring result to obtain an analysis result; and generating a comprehensive evaluation report according to the scoring result and the analysis result.
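As a non-limiting illustration of the embodiment in which the scoring result is obtained from the grammar, vocabulary, pronunciation, and/or fluency scores, the following Python sketch combines whichever dimension scores are available into one result using a weighted mean. The function name `overall_score`, the dimension keys, and the equal weights are assumptions made for this example only; the text does not specify how the dimension scores are combined.

```python
from typing import Dict, Optional

# Hypothetical equal weights; the combination rule is an assumption of this sketch.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "grammar": 0.25,
    "vocabulary": 0.25,
    "pronunciation": 0.25,
    "fluency": 0.25,
}


def overall_score(dimension_scores: Dict[str, float],
                  weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted mean over the dimensions that were actually scored.

    Only the dimensions present in ``dimension_scores`` contribute, which
    mirrors the "and/or" wording: any subset of dimensions may be used.
    """
    weights = weights or DEFAULT_WEIGHTS
    # Keep the weights of the dimensions that have a score.
    used = {name: weights[name] for name in dimension_scores if name in weights}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no scorable dimension supplied")
    # Renormalize so that missing dimensions do not drag the result down.
    return sum(dimension_scores[name] * w for name, w in used.items()) / total_weight
```

With the equal default weights, `overall_score({"grammar": 80.0, "fluency": 60.0})` evaluates to 70.0, because only the two supplied dimensions are averaged.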
In a second aspect of embodiments of the present invention, a medium is provided on which a program is stored; when executed by a processor, the program implements the steps of the above method embodiments, for example: randomly selecting a test question from a question bank; collecting speech data to be evaluated for the test question; obtaining corresponding text data to be evaluated and pronunciation features to be evaluated from the speech data to be evaluated; obtaining a first semantic relevance between the text data to be evaluated and the test question; and obtaining a scoring result according to the first semantic relevance and the pronunciation features to be evaluated.
In a third aspect of embodiments of the present invention, an apparatus for spoken language proficiency evaluation is provided, comprising: a question extraction module for randomly selecting a test question from a question bank; a speech acquisition module for collecting speech data to be evaluated for the test question; a speech recognition module for obtaining corresponding text data to be evaluated and pronunciation features to be evaluated from the speech data to be evaluated; a first relevance calculation module for obtaining a first semantic relevance between the text data to be evaluated and the test question; and a scoring module for obtaining a scoring result according to the first semantic relevance and the pronunciation features to be evaluated.
In one embodiment of the present invention, the apparatus further comprises: a dimension and criteria acquisition module for obtaining corresponding evaluation dimensions and scoring criteria according to the type of the test question.
In yet another embodiment of the present invention, the evaluation dimensions include a grammar evaluation dimension and/or a vocabulary evaluation dimension and/or a pronunciation evaluation dimension and/or a fluency evaluation dimension, and the corresponding scoring criteria include grammar scoring criteria and/or vocabulary scoring criteria and/or pronunciation scoring criteria and/or fluency scoring criteria. The scoring module further includes a grammar scoring unit and/or a vocabulary scoring unit and/or a pronunciation scoring unit and/or a fluency scoring unit.
The grammar scoring unit is configured to obtain a grammar score according to the text data to be evaluated and the grammar scoring criteria.
The vocabulary scoring unit is configured to obtain a vocabulary score according to the text data to be evaluated and the vocabulary scoring criteria.
The pronunciation scoring unit is configured to obtain a pronunciation score according to the pronunciation features to be evaluated and the pronunciation scoring criteria.
The fluency scoring unit is configured to obtain a fluency score according to the pronunciation features to be evaluated and the fluency scoring criteria.
In yet another embodiment of the present invention, the scoring module further includes an overall scoring unit. The overall scoring unit is configured to obtain the scoring result according to the grammar score and/or the vocabulary score and/or the pronunciation score and/or the fluency score.
In yet another embodiment of the present invention, the apparatus further includes a second relevance calculation module. The second relevance calculation module is configured to obtain a second semantic relevance between the text data to be evaluated and the standard answer to the test question. The scoring module is further configured to obtain the scoring result according to the second semantic relevance.
In yet another embodiment of the present invention, the apparatus further includes an analysis module and a report generation module. The analysis module is configured to analyze the scoring result to obtain an analysis result. The report generation module is configured to generate a comprehensive evaluation report according to the scoring result and the analysis result.
In a fourth aspect of embodiments of the present invention, an electronic device is provided, mainly comprising: a memory for storing a computer program; and a processor for executing the computer program stored in the memory, wherein, when the computer program is executed, the following instructions are run: randomly selecting a test question from a question bank; collecting speech data to be evaluated for the test question; obtaining corresponding text data to be evaluated and pronunciation features to be evaluated from the speech data to be evaluated; obtaining a first semantic relevance between the text data to be evaluated and the test question; and obtaining a scoring result according to the first semantic relevance and the pronunciation features to be evaluated.
According to the method, medium, apparatus, and electronic device for spoken language proficiency evaluation provided by embodiments of the present invention, a test question is randomly selected from a question bank; speech data to be evaluated is collected for the test question; corresponding text data to be evaluated and pronunciation features to be evaluated are obtained from the speech data to be evaluated; and a first semantic relevance between the text data to be evaluated and the test question is obtained. In this way, embodiments of the present invention can use the first semantic relevance and the pronunciation features to be evaluated to obtain the user's scoring result, so that users can take speaking tests over the Internet, which greatly improves examination efficiency and improves user experience.
In addition, according to certain embodiments of the present invention, the method for spoken language proficiency evaluation obtains the user's score from the pronunciation features to be evaluated and the first semantic relevance computed between the test question and the speech data to be evaluated. This solves the prior-art problem that semantic relevance cannot be calculated directly from the question text and the speech data, and allows the method to be applied to machine scoring of subjective questions that have no standard reference answer, which broadens the applicability of machine scoring, improves its accuracy, and facilitates its adoption in all kinds of speaking tests and assessments. Meanwhile, according to other embodiments of the present invention, the method also evaluates and scores the user's speech comprehensively across several dimensions such as pronunciation, fluency, grammar, and vocabulary, which addresses several long-standing difficulties of spoken proficiency testing: vague standards, highly subjective evaluation, a low degree of intelligent automation, and scoring algorithms that lack scientific rigor. It represents an important breakthrough in the field of spoken language assessment.
Brief description of the drawings
The above and other objects, features, and advantages of exemplary embodiments of the present invention will become readily understood by reading the following detailed description with reference to the accompanying drawings. In the drawings, several embodiments of the present invention are shown by way of example and not limitation, in which:
Fig. 1 schematically shows an application scenario in which embodiments of the present invention can be implemented;
Fig. 2 schematically shows a flowchart of a method for spoken language proficiency evaluation according to an embodiment of the present invention;
Fig. 3 schematically shows a flowchart of a method for spoken language proficiency evaluation according to another embodiment of the present invention;
Fig. 4 schematically shows an architecture diagram for spoken language proficiency evaluation according to an embodiment of the present invention;
Fig. 5 schematically shows a structural diagram of an apparatus for spoken language proficiency evaluation according to an embodiment of the present invention;
Fig. 6 schematically shows a structural diagram of an electronic device according to an embodiment of the present invention;
Fig. 7 schematically shows a diagram of a medium according to an embodiment of the present invention;
Fig. 8 is a schematic diagram of an interface for spoken language proficiency evaluation in the prior art.
In the drawings, identical or corresponding reference numerals denote identical or corresponding parts.
Detailed description
The principles and spirit of the present invention are described below with reference to several exemplary embodiments. It should be understood that these embodiments are provided only to enable those skilled in the art to better understand and thereby implement the present invention, and not to limit the scope of the present invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Those skilled in the art will appreciate that embodiments of the present invention can be implemented as a device, a method, or a computer program product. Accordingly, the present disclosure may be embodied in the following forms: entirely hardware, entirely software (including firmware, resident software, microcode, and so on), or a combination of hardware and software.
According to embodiments of the present invention, a method, apparatus, device, and medium for spoken language proficiency evaluation are proposed.
Herein, it should be understood that the number of any elements in the drawings is for illustration rather than limitation, and any naming is used only for distinction and carries no limiting meaning. The principles and spirit of the present invention are explained in detail below with reference to several representative embodiments.
Overview of the invention
The inventors have found that, in the prior art, speaking tests such as oral English examinations are scored mainly by converting the recording of the examinee's answer into text and then comparing the converted text with the text of a standard reference answer provided in advance. Typically, some keywords are extracted from the standard reference answer beforehand and matched against the converted text; in general, the more keywords are matched, the higher the score. Such methods have at least the following problems. On the one hand, many speaking tests, such as the IELTS examination, contain a large number of subjective questions that have no standard reference answer, so the above prior-art scheme cannot be used to machine-score them. On the other hand, even for examination questions that do have a standard reference answer, scoring is generally performed by keyword matching, so an examinee may well recite some general-purpose paragraphs memorized in advance that happen to contain the keywords of the standard reference answer; the answer does not actually fit the scenario set by the question, yet the examinee may still obtain a high score, which departs from the original intention of setting subjective questions in a speaking test. Meanwhile, the evaluation dimensions and scoring criteria of prior-art machine assessment are too limited, and evaluation and scoring across multiple dimensions are not possible.
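The weakness of keyword matching described above can be illustrated with a minimal Python sketch. The function `keyword_match_score`, the keyword list, and both sample answers are hypothetical and serve only to show the failure mode; they are not taken from any actual prior-art system.

```python
import re
from typing import Iterable


def keyword_match_score(answer_text: str, keywords: Iterable[str]) -> float:
    """Fraction of reference keywords that occur in the answer (0.0 to 1.0)."""
    tokens = set(re.findall(r"[a-z']+", answer_text.lower()))
    wanted = [k.lower() for k in keywords]
    if not wanted:
        return 0.0
    return sum(1 for k in wanted if k in tokens) / len(wanted)


# Hypothetical keywords extracted from a reference answer.
keywords = ["project", "team", "deadline", "success"]

# An answer that addresses the question.
on_topic = "My team finished the project before the deadline, which was a real success."
# A memorized, off-topic passage that merely contains the same keywords.
memorized = "Success, team, project, deadline: these words matter in any modern society."

on_topic_score = keyword_match_score(on_topic, keywords)
memorized_score = keyword_match_score(memorized, keywords)
```

Both answers match every keyword, so both receive the maximum score of 1.0 even though only the first one responds to the question. This is the behavior that motivates scoring on semantic relevance to the question rather than on keyword overlap.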
Therefore, to address the technical problem of inaccurate scoring in the prior art, the present invention provides a method, medium, apparatus, and electronic device for spoken language proficiency evaluation, which randomly select a test question from a question bank; collect speech data to be evaluated for the test question; obtain corresponding text data to be evaluated and pronunciation features to be evaluated from the speech data to be evaluated; and obtain a first semantic relevance between the text data to be evaluated and the test question. In this way, embodiments of the present invention can obtain a scoring result from the first semantic relevance and the pronunciation features to be evaluated, so that users can take speaking tests over the Internet, which greatly improves examination efficiency and improves user experience. At the same time, the method can be applied to machine scoring of subjective questions that have no standard reference answer, which broadens the applicability of machine scoring, improves its accuracy, and facilitates its adoption in all kinds of speaking tests and assessments. In addition, according to other embodiments of the present invention, the method also evaluates and scores the user's speech comprehensively across several dimensions such as pronunciation, fluency, grammar, and vocabulary.
Having introduced the basic principles of the present invention, various non-limiting embodiments of the present invention are described in detail below.
Application scenarios overview
Referring first to Fig. 1, an application scenario in which embodiments of the present invention can be implemented is schematically shown.
In Fig. 1, terminal device 1, terminal device 2, ..., terminal device n each have installed an application capable of accessing the pages provided by an online spoken language proficiency evaluation provider (for example, English Liulishuo). For example, where terminal device 1 is a desktop or notebook computer, an application such as a client application or a browser capable of accessing the provider's pages is installed on terminal device 1; as another example, where terminal device 2 is a smartphone or tablet computer, an application such as an app or a browser capable of accessing the provider's pages is installed on terminal device 2. Different users can use the corresponding application installed on their terminal devices to access the pages that the provider hosts on its servers, so that a user can view information provided by the provider, such as the test question randomly selected from the question bank, the speech data to be evaluated collected for the test question, and the corresponding text data to be evaluated and pronunciation features to be evaluated obtained from that speech data. Further, different users can, according to their actual needs and the information they have learned about the relevant speaking test, carry out the corresponding evaluation workflow on the pages provided by the provider, so as to obtain the corresponding scoring result from the provider. However, those skilled in the art will appreciate that the applicable scenarios of embodiments of the present invention are not limited in any respect by this framework.
A scenario in which the method for spoken language proficiency evaluation of embodiments of the present invention is applied may include clients (for example, terminal device 1, terminal device 2, ..., terminal device n shown in Fig. 1) and a server connected by (wireless and/or wired) communication links.
A computer, tablet computer, high-end smartphone, or the like can serve as the client of the present invention; the client must have independent audio and video playback capability and an independent audio input device. The client is mainly responsible for interaction between the user and the system. Its functions include collecting speech (for example, a web page may invoke a recording plug-in to record and generate an audio file in WAV format), playing the test speech stored locally on the client and the standard speech stored on the server, transmitting the WAV audio file to the server, and displaying the corpus text, the scoring result, and the comprehensive evaluation report, such as feedback and guidance on pronunciation. The client can be used by examinees to take the spoken language proficiency evaluation, including releasing questions, conducting the evaluation, and collecting papers, and to process and transmit the examinee's answer audio to the server; the client may also perform format conversion and feature extraction on the answer audio. After the evaluation, the examinee's result, that is, the scoring result (optionally together with the comprehensive evaluation report), can also be published on the client. The examinee's answers uploaded by the client may include one or both of the spoken answers to read-aloud questions (objective questions) and the spoken answers to spontaneous spoken statement questions (subjective questions).
The server is mainly responsible for organizing and collecting evaluation results, distributing test papers, and automatic machine scoring. It sends evaluation information to the client through a communication module, provides the test paper to the client at a specified time and controls the evaluation time, collects examinees' answer audio from the client, recognizes, decodes, and scores the answers, and, once scoring is complete, promptly feeds the evaluation results back to the client through the communication module. The server provides functions such as corpus collection, speech signal preprocessing, speech recognition, and pronunciation quality scoring. Depending on the number of examinees and the computational load, the server may take the form of a computer cluster built from multiple high-performance computers in order to speed up scoring and decoding. After the evaluation, the server analyzes and processes the examinees' answer information and scores centrally, compiles statistics such as each examinee's total score, per-item scores, and ranking, and may also allow teachers and students to query this information at any time.
The system may include three roles with different permissions: examinee, teacher, and administrator. The examinee is mainly responsible for answering the evaluation. The teacher is mainly responsible for composing test papers, releasing evaluations, managing evaluations, and viewing evaluation results. The administrator is mainly responsible for managing evaluations, controlling the timing of the evaluation software, and maintaining the whole evaluation system.
Exemplary method
With reference to the application scenario shown in Fig. 1, a method for spoken language proficiency evaluation according to exemplary embodiments of the present invention is described below with reference to Figs. 2 to 4. It should be noted that the above application scenario is shown merely to facilitate understanding of the spirit and principles of the present invention, and embodiments of the present invention are not limited in this respect. On the contrary, embodiments of the present invention can be applied to any applicable scenario.
The method of embodiments of the present invention may include steps S200, S210, S220, S230, and S240; optionally, it may further include steps S300, S310, and S320.
Fig. 2 schematically shows a flowchart of the method for spoken language proficiency evaluation according to an embodiment of the present invention. The method is usually executed on a device capable of running computer programs, for example a desktop computer or a server; of course, it can also be executed on a device such as a notebook computer or even a tablet computer.
In step S200, a test question is randomly selected from the question bank.
As an example, the spoken language proficiency evaluation of embodiments of the present invention can target any language, for example English, Chinese, French, German, or Russian. The evaluation may be a mock spoken proficiency test taken through an online website or an application, or it may be a formal spoken proficiency test. The following embodiments are illustrated using spoken English proficiency evaluation, such as the IELTS examination, as an example, but the present disclosure is not limited thereto. Correspondingly, different languages and different types of speaking test can have different question banks; for example, the IELTS examination has an IELTS question bank, and when an examinee or user logs in to the system, the test question is randomly selected from that question bank.
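Step S200 can be sketched as follows, under the assumption that each type of speaking test has its own question bank held in memory. The names `QUESTION_BANKS` and `draw_question` and the sample questions are hypothetical and only illustrate random selection per examination type.

```python
import random
from typing import Dict, List, Optional

# Hypothetical question banks keyed by examination type; contents are illustrative.
QUESTION_BANKS: Dict[str, List[str]] = {
    "IELTS": [
        "Describe something you did that you consider very successful.",
        "Describe a place you would like to visit.",
        "Talk about a book that influenced you.",
    ],
}


def draw_question(exam_type: str, rng: Optional[random.Random] = None) -> str:
    """Randomly select one test question from the bank of the given exam type."""
    rng = rng or random.Random()
    bank = QUESTION_BANKS.get(exam_type)
    if not bank:
        raise KeyError(f"no question bank for exam type {exam_type!r}")
    return rng.choice(bank)
```

Passing a seeded `random.Random` instance makes a draw reproducible, which may be convenient when testing; in normal use the default unseeded generator gives each examinee an independent random question.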
In step S210, acquisition is directed to the voice data to be evaluated of the topic to be measured.
In the embodiment of the present invention, or by taking IELTS examination as an example, when system has randomly choosed this currently from corresponding exam pool Perhaps examinee or user start to answer after the topic to be measured of examination for examinee or user this test, are recorded by client The examinee or user are made for the voice data to be evaluated of each topic to be measured and server can be uploaded to.
In one preferred embodiment, the method can also include: to acquisition for the topic to be measured The voice data to be evaluated is pre-processed, and the voice data to be evaluated is processed into the data for meeting machine points-scoring system requirement Format.
In step S220, the corresponding text data to be evaluated and pronunciation features to be evaluated are obtained from the speech data to be evaluated.
In the embodiment of the present invention, automatic speech recognition technology may be used to convert the speech data to be evaluated into the corresponding text data to be evaluated and pronunciation features to be evaluated. For specific automatic speech recognition techniques, reference may be made to the prior art; they are not elaborated here.
In step S230, the first semantic relevance between the text data to be evaluated and the question to be tested is obtained.
As an example, the method for embodiment of the present invention can also include: by the topic to be measured according to topic class Type is divided into subjectivity examination question and objective test.Wherein, the objective test refers to the examination with canonical reference answer The corresponding canonical reference answer of the text data to be evaluated when examination question, i.e. user or examinee are answered needs completely the same Can just get full marks or high score, such as speaking test in read aloud topic, i.e., given one section of English material, examinee is read aloud out Come.And the subjectivity examination question refers to the examination examination question of no canonical reference answer, can by each examinee or user into Row freely plays the examination question of statement, either provides a or more parts of Key for References, but answering for examinee does not need and one It causes, such as examinee is required to state that oneself is thought a very successful thing in English in speaking test.
In one preferred embodiment, which comprises be directed to the subjectivity examination question, obtain the subjectivity The text data to be evaluated of examination question;Calculate between the text data to be evaluated of the subjectivity examination question and its corresponding topic to be measured One semantic relevancy.
As an example, first semantic relevancy can be prepared by the following: calculating the subjectivity examination question The semantic relevancy score of each word in each of text data to be evaluated word and corresponding topic to be measured;Calculate institute State the semantic phase of each sentence in each of the text data to be evaluated of subjectivity examination question word and corresponding topic to be measured Pass degree score;Calculate each word and each in corresponding topic to be measured in the text data to be evaluated of the subjectivity examination question Semantic relevancy score maximum value/average value in sentence is as the semantic relevancy between word and sentence;Calculate the master The first semantic relevancy score between the text data to be evaluated and corresponding topic to be measured of the property seen examination question.
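The word-to-word, word-to-sentence and text-to-question steps above can be sketched as follows. This is only an illustrative sketch: the patent does not say how the word-level relevance is computed or how the word-level scores are combined into the final score, so the cosine similarity over toy word vectors (`VECTORS`), the function names and the final averaging are all assumptions.

```python
import math

# Toy 3-dimensional word vectors; hypothetical values for illustration only.
VECTORS = {
    "success": [0.9, 0.1, 0.0],
    "achievement": [0.8, 0.2, 0.1],
    "describe": [0.1, 0.9, 0.1],
    "won": [0.7, 0.1, 0.3],
    "banana": [0.0, 0.1, 0.9],
}


def word_relevance(a, b):
    """Assumed word-to-word relevance: cosine similarity (0.0 for unknown words)."""
    va, vb = VECTORS.get(a), VECTORS.get(b)
    if va is None or vb is None:
        return 0.0
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norm if norm else 0.0


def word_sentence_relevance(word, sentence, use_max=True):
    """Maximum (or average) of the word-to-word scores within one question sentence."""
    scores = [word_relevance(word, w) for w in sentence]
    if not scores:
        return 0.0
    return max(scores) if use_max else sum(scores) / len(scores)


def first_semantic_relevance(answer_words, question_sentences, use_max=True):
    """Assumed aggregation: mean over answer words of their best sentence score."""
    if not answer_words or not question_sentences:
        return 0.0
    per_word = [
        max(word_sentence_relevance(w, s, use_max) for s in question_sentences)
        for w in answer_words
    ]
    return sum(per_word) / len(per_word)


question = [["describe", "success"]]
on_topic = first_semantic_relevance(["won", "achievement"], question)
off_topic = first_semantic_relevance(["banana"], question)
```

With these toy vectors an answer about winning scores higher against the question than an answer about bananas; a real system would substitute trained embeddings or another relevance measure.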
In step S240, a scoring result is obtained from the first semantic relevance and the pronunciation features to be evaluated.
As an example, in the embodiment of the present invention it may be arranged that the higher the first semantic relevance, the higher the corresponding score; conversely, the lower the first semantic relevance, the lower the corresponding score. The present disclosure is, however, not limited in this respect.
The method for spoken language proficiency evaluation provided by the embodiment of the present invention obtains the examinee's spoken score by calculating the relevance score between the question to be tested and the speech data of the examinee's answer. The invention can therefore be applied to machine assessment of subjective examination questions that have no standard reference answer, allowing users to take online simulated evaluations or examinations over the internet. This greatly improves the efficiency and accuracy of spoken language proficiency evaluation, and avoids the prior-art phenomenon of examinees sitting the examination with rote-memorized texts interspersed with various keywords.
It should be noted that, although the above embodiment is described using subjective examination questions as an example, the method of the embodiment of the present invention can in fact also be applied to objective examination questions. In that case, even if part of the examinee's answer matches keywords given in the marking criteria, the examinee still cannot score, or cannot score highly, if the answer as a whole does not fit the question.
As an example, the pronunciation features to be evaluated may include the examinee's pronunciation accuracy, pronunciation fluency and the like; the present disclosure is not limited in this respect.
Further, the pronunciation features to be evaluated may include fundamental frequency features, formants, speech rate, average energy and the like.
In another embodiment of the present invention, the method may further include: obtaining the corresponding evaluation dimensions and scoring criteria according to the type of the question to be tested.
For example, the questions to be tested may be divided into types such as grammar test questions, vocabulary test questions, pronunciation test questions and fluency test questions. Correspondingly, different evaluation dimensions and scoring criteria are set according to the type of question.
In yet another embodiment of the present invention, the evaluation dimensions may include a grammar evaluation dimension and/or a vocabulary evaluation dimension and/or a pronunciation evaluation dimension and/or a fluency evaluation dimension, and the corresponding scoring criteria may include grammar scoring criteria and/or vocabulary scoring criteria and/or pronunciation scoring criteria and/or fluency scoring criteria.
In a preferred embodiment, the method may further include: obtaining a grammar score from the text data to be evaluated and the grammar scoring criteria; and/or obtaining a vocabulary score from the text data to be evaluated and the vocabulary scoring criteria; and/or obtaining a pronunciation score from the pronunciation features to be evaluated and the pronunciation scoring criteria; and/or obtaining a fluency score from the pronunciation features to be evaluated and the fluency scoring criteria.
For example, the grammar score may be based on how many types of tense appear in the text data to be evaluated, whether each tense is used correctly, and so on. In the question above, which asks the examinee to describe in English something he or she considers a great success, the main tense used should be the past tense. As another example, the vocabulary score may be based on the richness of the vocabulary in the text data to be evaluated, whether the wording is appropriate, and so on. In general, the richer the vocabulary, the higher the vocabulary score, but the present disclosure is not limited thereto.
The pronunciation score mainly examines whether the content of the spoken sentences is complete and accurate, whether the pronunciation is clear and fluent, and whether there are pronunciation errors. Specifically, the pronunciation score may be obtained by calculating pronunciation accuracy; for methods of calculating pronunciation accuracy, reference may be made to the prior art, and they are not described in detail here. For example, a deep learning algorithm may be used to evaluate paragraph-level pronunciation accuracy and thereby obtain the pronunciation score of the speech data to be evaluated.
Specifically, the fluency score may be obtained from a speech rate feature, a short-pause duration feature and the like. The speech rate feature may be obtained by the following steps: counting, from the pronunciation features to be evaluated, the number of frames corresponding to each phoneme in the speech data to be evaluated; and obtaining the speech rate feature as the ratio of the total number of phonemes to the total duration of all phonemes. The short-pause duration feature may be obtained by the following steps: counting, from the pronunciation features to be evaluated, the number of frames corresponding to each phoneme in the speech data to be evaluated and the total number of frames of the audio; and obtaining the short-pause duration feature as the ratio of the total duration of all short pauses in the audio to the total pronunciation duration.
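The two fluency features can be illustrated with a small sketch. The input format (a phoneme alignment given as `(label, frame_count)` pairs), the 10 ms frame shift, the `"sil"` pause label and the 0.5 s threshold for a "short" pause are all assumptions made for the example; the patent only specifies the two ratios. The total pronunciation duration is taken here to be the summed phoneme duration, which is also an assumption.

```python
FRAME_SHIFT_S = 0.01           # assumed 10 ms per frame
PAUSE_LABEL = "sil"            # assumed label for silence in the alignment
MAX_SHORT_PAUSE_FRAMES = 50    # assumed: pauses of at most 0.5 s count as short


def fluency_features(alignment):
    """Compute speech rate and short-pause ratio from (label, frame_count) pairs."""
    phone_frames = [n for label, n in alignment if label != PAUSE_LABEL]
    short_pause_frames = [
        n for label, n in alignment
        if label == PAUSE_LABEL and n <= MAX_SHORT_PAUSE_FRAMES
    ]
    speaking_time = sum(phone_frames) * FRAME_SHIFT_S
    if speaking_time == 0:
        return {"speech_rate": 0.0, "short_pause_ratio": 0.0}
    return {
        # number of phonemes divided by the total duration of all phonemes
        "speech_rate": len(phone_frames) / speaking_time,
        # total short-pause duration divided by the total pronunciation duration
        "short_pause_ratio": sum(short_pause_frames) * FRAME_SHIFT_S / speaking_time,
    }


features = fluency_features([("h", 10), ("sil", 20), ("ay", 30), ("sil", 200)])
```

In this example the 200-frame silence is longer than the assumed threshold and is therefore not counted as a short pause.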
In yet another embodiment of the present invention, the method may further include: obtaining the scoring result from the grammar score and/or the vocabulary score and/or the pronunciation score and/or the fluency score.
In yet another embodiment of the present invention, the method may further include: obtaining a second semantic relevance between the text data to be evaluated and the standard answer to the question to be tested; and obtaining the scoring result from the second semantic relevance.
As an example, the second semantic relevance may include a semantic similarity and a syntactic structure similarity.
Specifically, the semantic similarity may be obtained by the following steps: calculating a semantic similarity score between each word in the text data to be evaluated and each word in the standard reference answer; calculating a semantic similarity score between each word in the text data to be evaluated and each sentence in the standard reference answer; taking, for each word in the text data to be evaluated and each sentence in the standard reference answer, the maximum or the average of the semantic similarity scores within that sentence as the similarity score between the word and the sentence; and calculating the similarity score between the text data to be evaluated and the standard reference answer.
Specifically, the syntactic structure similarity may be obtained as follows: establishing a syntax vector sequence for each sentence of the text data to be evaluated; finding the syntactic structure similarity score between each sentence in the text data to be evaluated and each sentence in the standard reference answer, and taking the maximum of these scores for a given sentence as that sentence's syntactic structure similarity score; and calculating the syntactic structure similarity feature between the examinee's answer and the standard reference answer as a weighted average of the sentence-level syntactic structure similarity scores of the text data to be evaluated.
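The "maximum per sentence, then weighted average" computation can be sketched as below. The patent does not define the syntax vector sequence, the sentence-level similarity measure or the weights, so this sketch assumes that each sentence is represented by a sequence of part-of-speech tags, that two sequences are compared with `difflib.SequenceMatcher`, and that sentences are weighted by their length. All function names are hypothetical.

```python
from difflib import SequenceMatcher


def sentence_similarity(tags_a, tags_b):
    """Assumed similarity of two tag sequences, in the range 0.0 to 1.0."""
    return SequenceMatcher(None, tags_a, tags_b).ratio()


def syntactic_structure_similarity(answer, reference):
    """Weighted average of each answer sentence's best match in the reference."""
    if not answer or not reference:
        return 0.0
    # For every answer sentence, keep the maximum score over reference sentences.
    best = [max(sentence_similarity(s, r) for r in reference) for s in answer]
    weights = [len(s) for s in answer]  # assumed weighting: sentence length
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(b * w for b, w in zip(best, weights)) / total


reference = [["PRON", "VERB", "DET", "NOUN"], ["PRON", "VERB", "ADJ"]]
same = syntactic_structure_similarity([["PRON", "VERB", "DET", "NOUN"]], reference)
different = syntactic_structure_similarity([["NUM", "NUM"]], reference)
```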
Compared with traditional spoken assessment of read-aloud questions, the method for spoken language proficiency evaluation of the embodiment of the present invention can assess not only read-aloud questions but also spontaneous spoken-statement questions. Scoring is more comprehensive and fair: the examinee's pronunciation accuracy and fluency can be examined while the examinee speaks spontaneously, which better reflects the examinee's actual spoken proficiency. The text of the examinee's answer is no longer restricted, and automatically scored question types are no longer limited to read-aloud questions, so the examinee's understanding, use and expression of the language in spontaneous speech can be examined. The semantic relevance of the examinee's spoken language use can be examined, as can the grammatical level of the examinee's speech. Efficient operation of the evaluation can be ensured while making full use of the resources of the whole system, which greatly improves the efficiency of organizing oral evaluations and saves considerable manpower and material resources. In addition, the scoring dimensions of the method of the embodiment of the present invention are scientific and diverse, and the scoring process is efficient.
In yet another embodiment of the present invention, the method may further include: analyzing the scoring result to obtain an analysis result; and generating a comprehensive evaluation report from the scoring result and the analysis result.
As an example, the comprehensive evaluation report of the embodiment of the present invention may include an overall proficiency assessment and horizontal comparison based on big data, together with a detailed analysis of specific problems (for example, comparison of incorrect answers with example speech, cause analysis and suggestions for improvement).
A specific example of the method for spoken language proficiency evaluation according to the embodiment of the present invention is described below with reference to Figs. 3 and 4.
Step S300 is the recording processing and resource configuration stage. With reference to Fig. 4, this stage may include the following steps.
S301: After the user's recording is collected, the user's input speech data is obtained. ASR (Automatic Speech Recognition) technology can be used to quickly recognize the content of the user's speech and output the corresponding text, such as the TEXT answer text, together with the user's pronunciation features.
S302: After question extraction, a corresponding configuration file is generated according to the question category and scoring criteria. The configuration file determines the number of analyzers (Analyzer) required, in other words the scoring dimensions, and the number of features (Feature) and the feedback (Feedback) content corresponding to each analyzer.
As an example, the configuration file in the embodiment of the present invention is derived from the question type; the dimensions for which feedback is needed can be defined in advance for every question type. For example, for a question dedicated to grammar practice, no score is fed back for the pronunciation dimension. For each question type, the configuration file specifies which features are to be extracted.
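A configuration of this kind might look like the following sketch, which maps each question type to the analyzers (scoring dimensions) it needs and the features each analyzer extracts. The question-type names, feature names and the dictionary layout are hypothetical; the patent does not disclose the actual file format.

```python
# Hypothetical per-question-type configuration: analyzer name -> feature names.
QUESTION_TYPE_CONFIG = {
    "grammar_practice": {
        "grammar": ["tense_variety", "tense_errors"],
    },
    "read_aloud": {
        "pronunciation": ["pronunciation_accuracy"],
        "fluency": ["speech_rate", "short_pause_ratio"],
    },
    "free_statement": {
        "grammar": ["tense_variety", "tense_errors"],
        "vocabulary": ["vocabulary_richness"],
        "pronunciation": ["pronunciation_accuracy"],
        "fluency": ["speech_rate", "short_pause_ratio"],
    },
}


def feedback_dimensions(question_type):
    """Return the dimensions for which feedback is produced for a question type."""
    return sorted(QUESTION_TYPE_CONFIG[question_type])


def required_features(question_type):
    """Return every feature that must be extracted for a question type."""
    analyzers = QUESTION_TYPE_CONFIG[question_type]
    return sorted({name for features in analyzers.values() for name in features})
```

Under this layout a grammar-practice question produces no pronunciation feedback simply because no pronunciation analyzer is listed for it.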
S303: Configure the resource manager (resource manager). Its content may include four parts: Model (the scoring model and other algorithm models), Data (external resources used in the assessment process, such as dictionaries), Question Database (the question bank) and Other Resources (other resources).
The scoring model is a statistical model containing a large number of parameters; it may use, for example, any one or more of linear regression, case similarity assessment learning, association rule learning, a neural network or a support vector machine. The other algorithm models may include, for example, audio quality detection. Other resources include, for example, IELTS vocabulary lists and question banks.
Specifically, the scoring model may be obtained as follows: several examinees are selected and put through the process described in the following five steps, and the resulting features are then combined with teachers' scores to train an automatic scoring model, forming the scoring model. The steps are: collecting the audio of the examinees' answers; extracting acoustic features from the answer audio to obtain an acoustic model, and obtaining a language model from the question information and training text; decoding the answer audio using the established acoustic model and language model to obtain a recognition result; extracting features from the recognition result; and training the scoring model on the extracted features. The scoring model obtained by training can then be used for automatic scoring.
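As a minimal illustration of the final training step, the sketch below fits a linear regression (one of the model types listed above) that maps a single extracted feature to teachers' scores using the closed-form least-squares solution. The single-feature setup, the function names and the sample numbers are assumptions made for the example and are not data from the patent; a practical scoring model would use many features.

```python
def fit_line(feature_values, teacher_scores):
    """Least-squares fit of score = slope * feature + intercept."""
    n = len(feature_values)
    if n == 0 or n != len(teacher_scores):
        raise ValueError("need equally many feature values and scores")
    mean_x = sum(feature_values) / n
    mean_y = sum(teacher_scores) / n
    sxx = sum((x - mean_x) ** 2 for x in feature_values)
    if sxx == 0:
        raise ValueError("feature values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(feature_values, teacher_scores))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, feature_value):
    """Score a new answer with a fitted (slope, intercept) model."""
    slope, intercept = model
    return slope * feature_value + intercept


# Made-up training pairs: feature value per examinee and the teacher's score.
model = fit_line([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
```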
Step S310 is the analysis processing stage, which may include the following steps.
S311: The text corresponding to the speech and the user's pronunciation features generated in step S300 are preprocessed (Pre Process): the information is organized and filtered, and the filtered result is submitted to the analyzers (Analyzer) for analysis. Examples include sentence segmentation and syntactic analysis of the recognized text.
The preprocessing may include: randomly dividing the spoken English audio file to be evaluated into slices of equal length, for example 5 seconds; then performing pre-emphasis, framing, windowing and endpoint detection on all audio slices; and, for all audio slices, carrying out in sequence time-domain analysis (analyzing and extracting the time-domain feature parameters of the audio slice, which may include short-time energy, short-time average magnitude, short-time average zero-crossing rate, short-time autocorrelation coefficient and short-time average magnitude difference function), frequency-domain analysis (extracting the spectrum, power spectrum, cepstrum and spectral envelope of the audio slice by means of the band-pass filter bank method, the short-time Fourier transform method, frequency-domain pitch detection or time-frequency representation methods) and cepstral-domain analysis (analyzing and extracting the cepstral-domain feature parameters of the audio slice by homomorphic processing, which effectively separates the glottal excitation information from the vocal tract response information; the glottal excitation information is used to distinguish voiced from unvoiced sounds and to find the pitch period, and the vocal tract response information is used to find formants and for speech coding, synthesis and recognition). The acoustic parameters of the audio slices are then analyzed and calculated; these include Mel-frequency cepstral coefficients, linear prediction cepstral coefficients and line spectral pair coefficients.
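The first part of this preprocessing chain (slicing, pre-emphasis, framing, windowing and two of the time-domain parameters) can be sketched in plain Python as follows. The pre-emphasis coefficient 0.97, the Hamming window and the function names are common choices assumed for the example; the patent does not specify them. Endpoint detection and the frequency-domain and cepstral-domain analyses are omitted.

```python
import math


def slice_audio(samples, sample_rate, slice_seconds=5.0):
    """Cut the signal into consecutive slices; the last one may be shorter."""
    size = int(sample_rate * slice_seconds)
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def pre_emphasis(samples, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1]; boosts high frequencies."""
    if not samples:
        return []
    return [samples[0]] + [samples[i] - alpha * samples[i - 1]
                           for i in range(1, len(samples))]


def split_frames(samples, frame_len, hop):
    """Overlapping frames of frame_len samples, advancing by hop samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def hamming(frame):
    """Apply a Hamming window to one frame."""
    n = len(frame)
    if n < 2:
        return list(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def short_time_energy(frame):
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# One second of a 100 Hz sine wave sampled at 8 kHz, cut into 0.5 s slices.
rate = 8000
signal = [math.sin(2 * math.pi * 100 * t / rate) for t in range(rate)]
slices = slice_audio(signal, rate, slice_seconds=0.5)
frames = [hamming(f) for f in split_frames(pre_emphasis(slices[0]), 200, 80)]
energies = [short_time_energy(f) for f in frames]
```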
S312: Using the resource manager from step S300 and the configuration file generated according to the question type, the corresponding analyzers and the features they need to analyze are generated.
With reference to Fig. 4, for example: analyzer 1 (analyzer 1) has feature instance 1 (feature instance 1), feature instance 2 (feature instance 2), ..., feature instance n (feature instance n); analyzer 2 (analyzer 2) has feature instance 1, feature instance 2, ..., feature instance n; ...; analyzer m (analyzer m) has feature instance 1, feature instance 2, ..., feature instance n. Here m and n are positive integers greater than or equal to 1.
S313: The preprocessed speech file and features are fed into the algorithm analysis system for feature analysis, with the different features of multiple questions analyzed and processed in parallel (multithreading). After processing is complete, the resulting feedback features are placed in the feature manager (Feature Manager) and organized.
Feature analysis is the process of scoring by means of the algorithm models. The feature manager may organize features by feedback type, for example by which kind of grammatical error a feature belongs to.
For example, as shown in Fig. 4: feature 1 (Feature 1): name, value, info; feature 2 (Feature 2): name, value, info; ...; feature n (Feature n): name, value, info.
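The interplay of analyzers, multithreaded analysis and the feature manager can be sketched as follows. The `Feature` record mirrors the name/value/info triple of Fig. 4; the two toy analyzers, the use of the `info` field as the feedback type and the `FeatureManager` class layout are assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    value: float
    info: str  # used here as the feedback type (an assumption)


def vocabulary_analyzer(text):
    """Toy analyzer: ratio of distinct words to all words."""
    words = text.lower().split()
    ratio = len(set(words)) / len(words) if words else 0.0
    return [Feature("type_token_ratio", ratio, "vocabulary")]


def length_analyzer(text):
    """Toy analyzer: number of words in the answer."""
    return [Feature("word_count", float(len(text.split())), "fluency")]


class FeatureManager:
    """Collects features and groups them by feedback type."""

    def __init__(self):
        self.by_type = {}

    def add(self, feature):
        self.by_type.setdefault(feature.info, []).append(feature)


def analyze(text, analyzers):
    """Run every analyzer on the text in its own thread and collect the results."""
    manager = FeatureManager()
    with ThreadPoolExecutor() as pool:
        for features in pool.map(lambda analyzer: analyzer(text), analyzers):
            for feature in features:
                manager.add(feature)
    return manager


manager = analyze("I won I won a prize", [vocabulary_analyzer, length_analyzer])
```

Results are added to the manager in the calling thread, so the manager itself needs no locking in this sketch.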
Step S320 is the scoring and evaluation feedback stage, which may include the following steps.
S321: Scoring is performed using the configuration file from step S300 and the feature manager from step S310, and an overall score (Scoring) for the user's answer is computed from the scores of the individual scoring dimensions according to the corresponding algorithm.
Specifically, all scoring is based on the features extracted in step S310.
For example, for each IELTS dimension, the corresponding feature values are fed into the corresponding scoring model to calculate the result for that dimension. A scoring model is simply a set of parameters, and may be an artificial neural network or another statistical learning model. After the scores for the four dimensions are obtained, the total score may be the average of the four dimension scores. For example, if Fluency & Coherence is 5, Lexical Resource is 6, Grammar is 6 and Pronunciation is 4, the four scores average 5.25 and the Overall score is given as 5.0.
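The worked example can be reproduced with a few lines of code. The four scores average to 5.25, while the text reports an Overall of 5.0; the sketch below assumes that the average is rounded down to the nearest half band, which is one rule consistent with that example but is not stated in the patent.

```python
import math


def overall_band(dimension_scores):
    """Average the dimension scores, then round down to a half band (assumed rule)."""
    mean = sum(dimension_scores.values()) / len(dimension_scores)
    return math.floor(mean * 2) / 2


scores = {
    "Fluency & Coherence": 5,
    "Lexical Resource": 6,
    "Grammar": 6,
    "Pronunciation": 4,
}
overall = overall_band(scores)  # mean is 5.25, reported as 5.0
```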
S322: The feedback manager (Feedback Manager) extracts evaluations from the evaluation feedback table according to the configuration file, the feature manager and the user's score in each dimension, and extracts an overall evaluation and the corresponding analysis according to the scores and the feature manager.
S323: The final evaluation report, i.e. the final feedback (feedbacks), is generated from the configuration file, the scores and the evaluations.
The method of the embodiment of the present invention can bring the following benefits to the products in which it is applied (for example, IELTS Liulishuo and Zhongkao Liulishuo):
1. Providing a professional, detailed oral evaluation report:
By applying the method of the embodiment of the present invention in the relevant product workflow, a unique, dedicated evaluation report can be produced for each user. The report gives a scientific and complete evaluation covering the user's overall pronunciation assessment and horizontal comparison, the score and evaluation for each dimension, playback of specific erroneous recordings compared with model answers, and error type analysis and improvement suggestions, so that every user can fully understand each aspect of his or her own speaking ability.
2. Greatly improving spoken assessment efficiency:
A. Assessing every dimension of long speech data in a very short time:
In the process of the method of the embodiment of the present invention, the assessment engine can simultaneously run feature analyzers that analyze multiple dimensions, so that assessment, scoring and evaluation feedback for multiple dimensions are completed at the same time. For long recordings, such as the recording of an entire IELTS mock examination or all the recordings for any set of senior high school or college entrance examination questions, assessment can be completed and the analysis report presented within 30 minutes.
B. High process compatibility and iterability:
The method of the embodiment of the present invention is compatible with spoken question types of all kinds and answer recordings of all lengths, and can be updated at any time when the scoring model and scoring criteria are iterated, avoiding the inefficiency and increased cost of having to replace the scoring process and algorithms because of differences in question type and scoring criteria.
C. Improving user experience and forming core product competitiveness:
The professionalism, efficiency and ease of operation of the assessment, practice, mock examination and other content provided by products applying the method of the embodiment of the present invention significantly improve the user experience, clearly differentiate the product from others on the market, and become an important part of the product's core competitiveness.
Exemplary apparatus
Having described the method of the exemplary embodiment of the present invention, the apparatus for spoken language proficiency evaluation of the exemplary embodiment is described next with reference to Fig. 5.
Fig. 5 schematically shows a structural diagram of the apparatus for spoken language proficiency evaluation according to an embodiment of the present invention. The apparatus is usually provided in a device capable of running computer programs; for example, the apparatus in the embodiment of the present invention may be provided in a desktop computer or a server, and may of course also be provided in a notebook computer or even a tablet computer.
The apparatus of the embodiment of the present invention mainly includes: a question extraction module 500, a speech acquisition module 510, a speech recognition module 520, a first relevance calculation module 530 and a scoring module 540. The modules included in the apparatus are described in turn below.
The question extraction module 500 may be used to randomly select a question to be tested from the question bank.
The speech acquisition module 510 may be used to collect the speech data to be evaluated for the question to be tested.
The speech recognition module 520 may be used to obtain the corresponding text data to be evaluated and pronunciation features to be evaluated from the speech data to be evaluated.
The first relevance calculation module 530 may be used to obtain the first semantic relevance between the text data to be evaluated and the question to be tested.
The scoring module 540 may be used to obtain the scoring result from the first semantic relevance and the pronunciation features to be evaluated.
In a preferred embodiment of the present invention, the apparatus may further include a dimension and criteria acquisition module, used to obtain the corresponding evaluation dimensions and scoring criteria according to the type of the question to be tested.
In another preferred embodiment of the present invention, the evaluation dimensions may include a grammar evaluation dimension and/or a vocabulary evaluation dimension and/or a pronunciation evaluation dimension and/or a fluency evaluation dimension, and the corresponding scoring criteria include grammar scoring criteria and/or vocabulary scoring criteria and/or pronunciation scoring criteria and/or fluency scoring criteria. The scoring module 540 may further include a grammar scoring unit and/or a vocabulary scoring unit and/or a pronunciation scoring unit and/or a fluency scoring unit.
The grammar scoring unit may be used to obtain the grammar score from the text data to be evaluated and the grammar scoring criteria.
The vocabulary scoring unit may be used to obtain the vocabulary score from the text data to be evaluated and the vocabulary scoring criteria.
The pronunciation scoring unit may be used to obtain the pronunciation score from the pronunciation features to be evaluated and the pronunciation scoring criteria.
The fluency scoring unit may be used to obtain the fluency score from the pronunciation features to be evaluated and the fluency scoring criteria.
In another preferred embodiment of the present invention, the scoring module 540 may further include an overall scoring unit. The overall scoring unit may be used to obtain the scoring result from the grammar score and/or the vocabulary score and/or the pronunciation score and/or the fluency score.
In another preferred embodiment of the present invention, the apparatus may further include a second relevance calculation module. The second relevance calculation module may be used to obtain the second semantic relevance between the text data to be evaluated and the standard answer to the question to be tested. The scoring module 540 may also be used to obtain the scoring result from the second semantic relevance.
In another preferred embodiment of the present invention, the apparatus may further include an analysis module and a report generation module. The analysis module may be used to analyze the scoring result to obtain an analysis result. The report generation module may be used to generate a comprehensive evaluation report from the scoring result and the analysis result.
For the specific operations performed by each module and/or unit, reference may be made to the description of the corresponding steps in the above method embodiment; they are not repeated here.
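One way to picture how modules 500 to 540 cooperate is the following sketch, in which each module is an injected callable and the apparatus simply chains them in the order of steps S200 to S240. The class name, the callable signatures and the stub implementations (including the equal weighting in the stub scorer) are hypothetical and serve only to show the data flow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class EvaluationApparatus:
    extract_question: Callable[[], str]                          # module 500
    acquire_speech: Callable[[str], bytes]                       # module 510
    recognize: Callable[[bytes], Tuple[str, Dict[str, float]]]   # module 520
    relevance: Callable[[str, str], float]                       # module 530
    score: Callable[[float, Dict[str, float]], float]            # module 540

    def run(self) -> float:
        question = self.extract_question()
        speech = self.acquire_speech(question)
        text, pronunciation = self.recognize(speech)
        first_relevance = self.relevance(text, question)
        return self.score(first_relevance, pronunciation)


# Stub modules standing in for real implementations.
apparatus = EvaluationApparatus(
    extract_question=lambda: "Describe a success you are proud of.",
    acquire_speech=lambda question: b"raw-audio-bytes",
    recognize=lambda speech: ("I won a prize last year", {"accuracy": 0.6}),
    relevance=lambda text, question: 0.8,
    score=lambda rel, pron: 0.5 * rel + 0.5 * pron["accuracy"],
)
result = apparatus.run()
```

Injecting the modules as callables reflects the remark in the description that the division into modules is illustrative and may be merged or split further.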
Fig. 6 shows a block diagram of an exemplary computer system/server 60 suitable for implementing embodiments of the present invention. The computer system/server 60 shown in Fig. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 6, the computer system/server 60 takes the form of a general-purpose electronic device. The components of the computer system/server 60 may include, but are not limited to: one or more processors or processing units 601, a system memory 602, and a bus 603 connecting the different system components (including the system memory 602 and the processing unit 601).
The computer system/server 60 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer system/server 60, including volatile and non-volatile media and removable and non-removable media.
The system memory 602 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 6021 and/or cache memory 6022. The computer system/server 60 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, ROM 6023 may be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in Fig. 6, commonly called a "hard disk drive"). Although not shown in Fig. 6, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (for example a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (for example a CD-ROM, DVD-ROM or other optical medium) may be provided. In these cases, each drive may be connected to the bus 603 through one or more data media interfaces. The system memory 602 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 6025 having a set of (at least one) program modules 6024 may be stored, for example, in the system memory 602. Such program modules 6024 include, but are not limited to, an operating system, one or more application programs, other program modules and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 6024 generally perform the functions and/or methods of the embodiments described in the present invention.
The computer system/server 60 may also communicate with one or more external devices 604 (for example a keyboard, a pointing device or a display). This communication may take place through an input/output (I/O) interface 605. The computer system/server 60 may also communicate with one or more networks (for example a local area network (LAN), a wide area network (WAN) and/or a public network such as the internet) through a network adapter 606. As shown in Fig. 6, the network adapter 606 communicates with the other modules of the computer system/server 60 (such as the processing unit 601) through the bus 603. It should be understood that, although not shown in Fig. 6, other hardware and/or software modules may be used in conjunction with the computer system/server 60.
Processing unit 601 runs the computer program stored in system memory 602 and thereby performs various functional applications and data processing, for example executing instructions that implement the steps of the method embodiments above. Specifically, processing unit 601 can execute the computer program stored in system memory 602, and when the computer program is executed, the following instructions are run: randomly selecting a question to be tested from a question bank; collecting speech data to be evaluated for the question to be tested; obtaining corresponding text data to be evaluated and pronunciation features to be evaluated from the speech data to be evaluated; obtaining a first semantic relevance between the text data to be evaluated and the question to be tested; and obtaining a scoring result according to the first semantic relevance and the pronunciation features to be evaluated.
For the specific operations performed by each instruction, refer to the description of the corresponding step in the method embodiments above; they are not repeated here.
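The five instructions listed above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the dictionary standing in for captured speech, the use of word-count cosine similarity as the "first semantic relevance", and the equal weighting of relevance and pronunciation are all hypothetical choices made for the example.

```python
import math
import random
from collections import Counter
from dataclasses import dataclass


@dataclass
class Recognition:
    text: str                   # text data to be evaluated
    pronunciation_score: float  # pronunciation feature, reduced here to one 0..1 value


def pick_question(question_bank, rng=random):
    """Step 1: randomly select a question to be tested from the question bank."""
    return rng.choice(question_bank)


def recognize(speech):
    """Step 3: stand-in for a speech recognizer (hypothetical).

    `speech` is already a dict carrying a transcript and a pronunciation
    score; a real system would derive both from recorded audio.
    """
    return Recognition(speech["transcript"], speech["pronunciation_score"])


def semantic_relevance(text_a, text_b):
    """Step 4: cosine similarity of word counts, a simple proxy for relevance."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def score(relevance, pronunciation_score, w_relevance=0.5):
    """Step 5: weighted combination; the weight is an illustrative assumption."""
    return w_relevance * relevance + (1.0 - w_relevance) * pronunciation_score


def evaluate(question_bank, capture_speech, rng=random):
    """Run the five steps in order and return (question, scoring result)."""
    question = pick_question(question_bank, rng)    # step 1
    speech = capture_speech(question)               # step 2
    rec = recognize(speech)                         # step 3
    rel = semantic_relevance(rec.text, question)    # step 4
    return question, score(rel, rec.pronunciation_score)  # step 5
```

In this sketch `capture_speech` is any callable that takes the selected question and returns the captured response, so a recording front end or a test stub can be substituted without touching the scoring steps.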
A specific example of the medium of an embodiment of the present invention is shown in Fig. 7.
The medium of Fig. 7 is an optical disc 700 on which a computer program (i.e., a program product) is stored. When executed by a processor, the program implements the steps described in the method embodiments above, for example: randomly selecting a question to be tested from a question bank; collecting speech data to be evaluated for the question to be tested; obtaining corresponding text data to be evaluated and pronunciation features to be evaluated from the speech data to be evaluated; obtaining a first semantic relevance between the text data to be evaluated and the question to be tested; and obtaining a scoring result according to the first semantic relevance and the pronunciation features to be evaluated. The specific implementation of each step is not repeated here.
It should be noted that although several modules and/or units of the device for spoken language proficiency evaluation are mentioned in the detailed description above, this division is merely exemplary and not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more of the modules and/or units described above may be embodied in a single module and/or unit. Conversely, the features and functions of one module and/or unit described above may be further divided and embodied in multiple modules and/or units.
Furthermore, although the operations of the method of the present invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that order, or that all of the illustrated operations must be performed, to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
Although the spirit and principles of the present invention have been described with reference to several specific embodiments, it should be understood that the invention is not limited to the specific embodiments disclosed, and the division into aspects does not mean that features in those aspects cannot be combined to advantage; the division is made only for convenience of presentation. The present invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (9)

1. A method for spoken language proficiency evaluation, comprising:
randomly selecting a question to be tested from a question bank;
collecting speech data to be evaluated for the question to be tested;
obtaining corresponding text data to be evaluated and pronunciation features to be evaluated from the speech data to be evaluated;
obtaining a first semantic relevance between the text data to be evaluated and the question to be tested;
obtaining a scoring result according to the first semantic relevance and the pronunciation features to be evaluated.
2. The method of claim 1, further comprising: obtaining corresponding evaluation dimensions and scoring criteria according to the type of the question to be tested.
3. The method of claim 2, wherein the evaluation dimensions include a grammar evaluation dimension and/or a vocabulary evaluation dimension and/or a pronunciation evaluation dimension and/or a fluency evaluation dimension, the corresponding scoring criteria include grammar scoring criteria and/or vocabulary scoring criteria and/or pronunciation scoring criteria and/or fluency scoring criteria, and the method further comprises:
obtaining a grammar score according to the text data to be evaluated and the grammar scoring criteria; and/or
obtaining a vocabulary score according to the text data to be evaluated and the vocabulary scoring criteria; and/or
obtaining a pronunciation score according to the pronunciation features to be evaluated and the pronunciation scoring criteria; and/or
obtaining a fluency score according to the pronunciation features to be evaluated and the fluency scoring criteria.
4. The method of claim 3, further comprising: obtaining the scoring result according to the grammar score and/or the vocabulary score and/or the pronunciation score and/or the fluency score.
5. The method of any one of claims 1 to 4, further comprising:
obtaining a second semantic relevance between the text data to be evaluated and a model answer to the question to be tested;
obtaining the scoring result according to the second semantic relevance.
6. The method of claim 1, further comprising:
analyzing the scoring result to obtain an analysis result;
generating a comprehensive evaluation report according to the scoring result and the analysis result.
7. A computer-readable storage medium having a program stored thereon, wherein the program, when executed by a processor, implements the method of any one of claims 1 to 6.
8. A device for spoken language proficiency evaluation, comprising:
a question selection module, configured to randomly select a question to be tested from a question bank;
a speech collection module, configured to collect speech data to be evaluated for the question to be tested;
a speech recognition module, configured to obtain corresponding text data to be evaluated and pronunciation features to be evaluated from the speech data to be evaluated;
a first relevance calculation module, configured to obtain a first semantic relevance between the text data to be evaluated and the question to be tested;
a scoring module, configured to obtain a scoring result according to the first semantic relevance and the pronunciation features to be evaluated.
9. An electronic device, comprising:
a memory, configured to store a computer program;
a processor, configured to execute the computer program stored in the memory, wherein when the computer program is executed, the following instructions are run:
randomly selecting a question to be tested from a question bank;
collecting speech data to be evaluated for the question to be tested;
obtaining corresponding text data to be evaluated and pronunciation features to be evaluated from the speech data to be evaluated;
obtaining a first semantic relevance between the text data to be evaluated and the question to be tested;
obtaining a scoring result according to the first semantic relevance and the pronunciation features to be evaluated.
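As an informal illustration of claims 2 to 5, the sketch below looks up the evaluation dimensions for a question type, keeps whichever per-dimension scores are available, optionally adds a second semantic relevance against a model answer, and combines them into one scoring result. The question-type names, the rubric table, the 0..1 score scale, and the weighted-average combination are all assumptions made for this example; the claims do not specify how the scores are combined.

```python
# Hypothetical mapping from question type to the dimensions it is scored on.
RUBRICS = {
    "read_aloud": ("pronunciation", "fluency"),
    "open_question": ("grammar", "vocabulary", "pronunciation", "fluency"),
}


def dimensions_for(question_type):
    """Look up the evaluation dimensions for a question type (claim 2)."""
    return RUBRICS[question_type]


def combine_scores(dimension_scores, weights=None):
    """Weighted average over whichever dimensions were actually scored.

    dimension_scores: dict such as {"grammar": 0.7, "fluency": 0.9}; any
    subset of dimensions is allowed, mirroring the "and/or" wording.
    weights: optional per-dimension weights (default weight is 1.0).
    """
    if not dimension_scores:
        raise ValueError("at least one dimension score is required")
    weights = weights or {}
    total_weight = sum(weights.get(d, 1.0) for d in dimension_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(s * weights.get(d, 1.0) for d, s in dimension_scores.items())
    return weighted / total_weight


def scoring_result(question_type, available_scores, second_relevance=None):
    """Keep only the dimensions the rubric asks for, then combine them.

    second_relevance: optional similarity to a model answer (claim 5),
    treated here as one more equally weighted component.
    """
    selected = {d: available_scores[d]
                for d in dimensions_for(question_type)
                if d in available_scores}
    if second_relevance is not None:
        selected["model_answer_relevance"] = second_relevance
    return combine_scores(selected)
```

Passing the per-dimension scores as a dictionary keeps every dimension optional, so a read-aloud question can ignore a grammar score that happens to be present without any special-case code.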
CN201711111300.7A 2017-11-13 2017-11-13 Method, device, electronic equipment and medium for oral language level evaluation Active CN109785698B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711111300.7A CN109785698B (en) 2017-11-13 2017-11-13 Method, device, electronic equipment and medium for oral language level evaluation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711111300.7A CN109785698B (en) 2017-11-13 2017-11-13 Method, device, electronic equipment and medium for oral language level evaluation

Publications (2)

Publication Number Publication Date
CN109785698A true CN109785698A (en) 2019-05-21
CN109785698B CN109785698B (en) 2021-11-23

Family

ID=66485327

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711111300.7A Active CN109785698B (en) 2017-11-13 2017-11-13 Method, device, electronic equipment and medium for oral language level evaluation

Country Status (1)

Country Link
CN (1) CN109785698B (en)

Also Published As

Publication number Publication date
CN109785698B (en) 2021-11-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant