CN106598957A - Data analysis method and system of translated sentence - Google Patents

Data analysis method and system of translated sentence

Info

Publication number
CN106598957A
Authority
CN
China
Prior art keywords
err
sentence
error type
scoring
translating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611186449.7A
Other languages
Chinese (zh)
Inventor
张芃
蔺伟
郭凤梅
周露义
刘丽颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Language Network (wuhan) Information Technology Co Ltd
Original Assignee
Language Network (wuhan) Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Language Network (wuhan) Information Technology Co Ltd
Priority to CN201611186449.7A
Publication of CN106598957A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/51 Translation evaluation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a data analysis method and system for a translated sentence, and belongs to the technical field of translation. The analysis method comprises the steps of extracting a translated sentence from a translated document; extracting the source sentence corresponding to the translated sentence from the source document corresponding to the translated document; pushing the translated sentence and the source sentence to at least one evaluation user; obtaining the error marks that the at least one evaluation user applies to the translated sentence; and counting the error marks and determining the translation quality of the translated sentence according to the error marks. The analysis method decomposes a full-text evaluation task into fragmentary single-sentence tasks and obtains an objective evaluation result for the translated sentence through statistical analysis of the marked samples. It makes effective use of translators' fragmentary time and expands the number of people who meet the requirements. The analysis method performs very well in improving the efficiency of translation assessment, reducing cost, and reducing dependence on expert resources.

Description

Data analysis method and system for a translated sentence
Technical field
The present invention relates to the field of translation technology, and more particularly to a data analysis method and system for a translated sentence.
Background
In a traditional translation service workflow, the final output of the service is the final-draft translation. Whether its content is accurate, whether its expression is fluent, and whether its formatting and punctuation are correct all bear on customer satisfaction and on the fee charged. Because each translator is limited by their own translation ability, different translators understand the same source text differently. To improve accuracy, actual translation work therefore adds a manual review step: a translation expert reads through the translation and marks the problematic passages, and if problems are found the translation is returned to the translator for revision.
In actual translation projects, however, the review is limited by short turnaround times, small budgets, and the scarcity of experts. The final-draft translation therefore receives few quality evaluations, and it is difficult to evaluate the translation quality of the translated document accurately.
Summary of the invention
Embodiments of the present invention provide a data analysis method and system for a translated sentence. A brief summary is given below to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview, nor is it intended to identify key or critical components or to delineate the scope of protection of these embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the detailed description that follows.
According to one aspect of the invention, a data analysis method for a translated sentence is provided, comprising: extracting a translated sentence from a translated document; extracting, from the source document corresponding to the translated document, the source sentence corresponding to the translated sentence; pushing the translated sentence and the source sentence to at least one evaluation user; obtaining the error marks that the at least one evaluation user applies to the translated sentence; and counting the error marks and determining the translation quality of the translated sentence according to the error marks.
Further, determining the translation quality of the translated sentence according to the error marks includes: collecting the first-class error score and the second-class error score that the at least one evaluation user gives the translated sentence, and determining each evaluation user's total error score for the translated sentence. The total error score is calculated as follows:
Err = k1·Err_LC + k2·Err_GE,
where Err is the total error score of the translated sentence, Err_LC is the first-class error score, k1 is the weight coefficient of the first-class error score, Err_GE is the second-class error score, and k2 is the weight coefficient of the second-class error score.
Further, before the total error score Err is determined, the analysis method also includes determining the weight coefficient k1 of the first-class error score and the weight coefficient k2 of the second-class error score, including: extracting from a corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including each sample's total error score ErrS, first-class error score Err_LCS, and second-class error score Err_GES; and building a linear equation, which is:
ErrS = k1·Err_LCS + k2·Err_GES;
Based on this linear equation, the weight coefficient k1 of the first-class error score and the weight coefficient k2 of the second-class error score are determined by a parameter estimation method for multiple linear regression, where the parameter estimation method includes the least squares method or the gradient descent method.
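The gradient descent option mentioned here can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `fit_weights_gd`, the learning rate, the iteration count, and the sample values are all hypothetical, and the model has no intercept because the linear equation given in the text has none.

```python
def fit_weights_gd(samples, lr=0.05, epochs=2000):
    """Estimate k1, k2 in ErrS = k1*Err_LCS + k2*Err_GES by batch gradient descent.

    samples: list of (err_lcs, err_ges, err_s) tuples taken from sentence
    samples whose quality was scored beforehand (hypothetical layout).
    """
    k1, k2 = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad1 = grad2 = 0.0
        for err_lcs, err_ges, err_s in samples:
            residual = k1 * err_lcs + k2 * err_ges - err_s
            grad1 += residual * err_lcs
            grad2 += residual * err_ges
        # Gradient of the mean squared error with respect to k1 and k2.
        k1 -= lr * 2 * grad1 / n
        k2 -= lr * 2 * grad2 / n
    return k1, k2


# Made-up samples that follow ErrS = 0.7*Err_LCS + 0.3*Err_GES exactly.
samples = [(1, 0, 0.7), (0, 1, 0.3), (1, 1, 1.0), (2, 1, 1.7)]
k1, k2 = fit_weights_gd(samples)
```

The learning rate has to be small enough for the data at hand; a value that works for these toy samples may diverge on scores with a larger range.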
Further, the error marks include the first-class error options that each evaluation user selects for the translated sentence. The first-class error options are: whole-sentence mistranslation, lack of fluency, grammatical error, proper noun error, fixed collocation error, wording error, and expression that does not conform to target-language usage. The analysis method also includes determining the first-class error score Err_LC according to the error marks; Err_LC is calculated as follows:
Err_LC = k11·Err_LC1 + k12·Err_LC2 + k13·Err_LC3 + k14·Err_LC4 + k15·Err_LC5 + k16·Err_LC6 + k17·Err_LC7,
where Err_LC1 to Err_LC7 are the option scores corresponding one-to-one to the first-class error options, and k11 to k17 are the weight coefficients corresponding one-to-one to the first-class error options.
Further, before the first-class error score Err_LC is determined, the analysis method also includes determining the weight coefficients k11 to k17 of the first-class error options, including: extracting from a corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including each sample's first-class error score Err_LCS and the option scores Err_LCS1 to Err_LCS7 corresponding one-to-one to the first-class error options; and building a linear equation, which is:
Err_LCS = k11·Err_LCS1 + k12·Err_LCS2 + k13·Err_LCS3 + k14·Err_LCS4 + k15·Err_LCS5 + k16·Err_LCS6 + k17·Err_LCS7;
Based on this linear equation, the weight coefficients k11 to k17 of the first-class error options are determined by a parameter estimation method for multiple linear regression, where the parameter estimation method includes the least squares method or the gradient descent method.
Further, the error marks include the second-class error options that each evaluation user selects for the translated sentence. The second-class error options are: omitted words, spelling errors, numeric errors, and punctuation errors. The analysis method also includes determining the second-class error score Err_GE according to the error marks; Err_GE is calculated as follows:
Err_GE = k21·Err_GE1 + k22·Err_GE2 + k23·Err_GE3 + k24·Err_GE4,
where Err_GE1 to Err_GE4 are the option scores corresponding one-to-one to the second-class error options, and k21 to k24 are the weight coefficients corresponding one-to-one to the second-class error options.
Further, before the second-class error score Err_GE is determined, the analysis method also includes determining the weight coefficients k21 to k24 of the second-class error options, including: extracting from a corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including each sample's second-class error score Err_GES and the option scores Err_GES1 to Err_GES4 corresponding one-to-one to the second-class error options; and building a linear equation, which is:
Err_GES = k21·Err_GES1 + k22·Err_GES2 + k23·Err_GES3 + k24·Err_GES4;
Based on this linear equation, the weight coefficients k21 to k24 of the second-class error options are determined by a parameter estimation method for multiple linear regression, where the parameter estimation method includes the least squares method or the gradient descent method.
Further, the analysis method also includes: aggregating the total error scores given by all evaluation users to determine the overall score ScoreST of the translated sentence; ScoreST is calculated as:
where n is the number of evaluation users who gave feedback on the translation quality evaluation task, F is the full-mark value of the evaluation, Err_i is the total error score given by the i-th evaluation user, and C_i is the ability coefficient C of the i-th evaluation user.
Further, the analysis method also includes: before the overall score of the translated sentence is determined, determining the ability coefficient C of each evaluation user; determining the ability coefficient C includes:
where T is the self-assessed level of each evaluation user, t is the translation level required for the translated sentence, and α and β are adjustment coefficients, with α set to 1.8 and β set to 1/3.
According to a second aspect of the invention, a data analysis system for a translated sentence is provided, comprising: an extraction unit, configured to extract a translated sentence from a translated document and to extract, from the source document corresponding to the translated document, the source sentence corresponding to the translated sentence; a push unit, configured to push the translated sentence and the source sentence to at least one evaluation user; an acquiring unit, configured to obtain the error marks that the at least one evaluation user applies to the translated sentence; and a determining unit, configured to count the error marks and determine the translation quality of the translated sentence according to the error marks.
Further, the determining unit is configured to: collect the first-class error score and the second-class error score that the at least one evaluation user gives the translated sentence, and determine each evaluation user's total error score for the translated sentence. The total error score is calculated as follows:
Err = k1·Err_LC + k2·Err_GE,
where Err is the total error score of the translated sentence, Err_LC is the first-class error score, k1 is the weight coefficient of the first-class error score, Err_GE is the second-class error score, and k2 is the weight coefficient of the second-class error score.
Further, the determining unit is also configured to determine, before the total error score Err is determined, the weight coefficient k1 of the first-class error score and the weight coefficient k2 of the second-class error score, including: extracting from a corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including each sample's total error score ErrS, first-class error score Err_LCS, and second-class error score Err_GES; and building a linear equation, which is:
ErrS = k1·Err_LCS + k2·Err_GES;
Based on this linear equation, the weight coefficient k1 of the first-class error score and the weight coefficient k2 of the second-class error score are determined by a parameter estimation method for multiple linear regression, where the parameter estimation method includes the least squares method or the gradient descent method.
Further, the error marks include the first-class error options that each evaluation user selects for the translated sentence. The first-class error options are: whole-sentence mistranslation, lack of fluency, grammatical error, proper noun error, fixed collocation error, wording error, and expression that does not conform to target-language usage. The determining unit is also configured to determine the first-class error score Err_LC according to the error marks; Err_LC is calculated as follows:
Err_LC = k11·Err_LC1 + k12·Err_LC2 + k13·Err_LC3 + k14·Err_LC4 + k15·Err_LC5 + k16·Err_LC6 + k17·Err_LC7,
where Err_LC1 to Err_LC7 are the option scores corresponding one-to-one to the first-class error options, and k11 to k17 are the weight coefficients corresponding one-to-one to the first-class error options.
Further, the determining unit is also configured to determine, before the first-class error score Err_LC is determined, the weight coefficients k11 to k17 of the first-class error options, including: extracting from a corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including each sample's first-class error score Err_LCS and the option scores Err_LCS1 to Err_LCS7 corresponding one-to-one to the first-class error options; and building a linear equation, which is:
Err_LCS = k11·Err_LCS1 + k12·Err_LCS2 + k13·Err_LCS3 + k14·Err_LCS4 + k15·Err_LCS5 + k16·Err_LCS6 + k17·Err_LCS7;
Based on this linear equation, the weight coefficients k11 to k17 of the first-class error options are determined by a parameter estimation method for multiple linear regression, where the parameter estimation method includes the least squares method or the gradient descent method.
Further, the error marks include the second-class error options that each evaluation user selects for the translated sentence. The second-class error options are: omitted words, spelling errors, numeric errors, and punctuation errors. The determining unit is also configured to determine the second-class error score Err_GE according to the error marks; Err_GE is calculated as follows:
Err_GE = k21·Err_GE1 + k22·Err_GE2 + k23·Err_GE3 + k24·Err_GE4,
where Err_GE1 to Err_GE4 are the option scores corresponding one-to-one to the second-class error options, and k21 to k24 are the weight coefficients corresponding one-to-one to the second-class error options.
Further, the determining unit is also configured to determine, before the second-class error score Err_GE is determined, the weight coefficients k21 to k24 of the second-class error options, including: extracting from a corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including each sample's second-class error score Err_GES and the option scores Err_GES1 to Err_GES4 corresponding one-to-one to the second-class error options; and building a linear equation, which is:
Err_GES = k21·Err_GES1 + k22·Err_GES2 + k23·Err_GES3 + k24·Err_GES4;
Based on this linear equation, the weight coefficients k21 to k24 of the second-class error options are determined by a parameter estimation method for multiple linear regression, where the parameter estimation method includes the least squares method or the gradient descent method.
Further, the determining unit is configured to: aggregate the total error scores given by all evaluation users to determine the overall score ScoreST of the translated sentence; ScoreST is calculated as:
where n is the number of evaluation users who gave feedback on the translation quality evaluation task, F is the full-mark value of the evaluation, Err_i is the total error score given by the i-th evaluation user, and C_i is the ability coefficient C of the i-th evaluation user.
Further, the determining unit is configured to: before the overall score of the translated sentence is determined, determine the ability coefficient C of each evaluation user; determining the ability coefficient C includes:
where T is the self-assessed level of each evaluation user, t is the translation level required for the translated sentence, and α and β are adjustment coefficients, with α set to 1.8 and β set to 1/3.
The analysis method of the invention pushes evaluation tasks to a group of users for evaluation. Each user can take part in the evaluation at any time, which effectively guarantees the processing speed of the evaluation tasks. An evaluation task is a fragmentary part-time task and is cheaper than a review by a translation expert. An evaluation task also places relatively low demands on the language ability of the participating users, which effectively expands the number of people who meet the requirements. Taken together, the analysis method performs very well in improving the efficiency of translation assessment, reducing cost, and reducing dependence on expert resources.
It should be understood that the general description above and the detailed description below are exemplary and explanatory only and do not limit the invention.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a flow chart of the analysis method of the invention.
Detailed description
The following description and drawings illustrate specific embodiments of the invention sufficiently to enable those skilled in the art to practice them. Other embodiments may include structural, logical, electrical, process, and other changes. The embodiments represent only possible variations. Unless explicitly required, individual components and functions are optional, and the order of operations may vary. Portions and features of some embodiments may be included in, or substituted for, those of other embodiments. The scope of the embodiments of the invention includes the full scope of the claims and all available equivalents of the claims. Herein, the embodiments may be referred to individually or collectively by the term "invention" merely for convenience; if more than one invention is in fact disclosed, this is not intended to automatically limit the scope of the application to any single invention or inventive concept. Herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprising", "including", and any other variants are intended to cover a non-exclusive inclusion, so that a process, method, or device comprising a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, or device. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, or device that comprises the element. The embodiments herein are described progressively: each embodiment focuses on its differences from the others, and the identical or similar parts of the embodiments may be referred to one another. Because the methods, products, and the like disclosed in the embodiments correspond to the method part disclosed in the embodiments, their description is relatively brief, and the relevant details can be found in the description of the method part.
As shown in Fig. 1, the invention provides a data analysis method for a translated sentence, whose main steps include:
S101: extracting a translated sentence from the translated document;
Every step of the technical solution in the embodiments belongs to the sentence analysis flow for one and the same translated document under evaluation. Therefore, once a translator has finished translating a document, translated sentences can be extracted from that translated document for translation quality evaluation, and the translation quality of the whole translated document can then be judged;
The sentence is the smallest unit of extraction, and the extracted translated sentences may be one or several sentences of the translated document. The number of translated samples to extract is determined by the translator's translation ability or by the translation precision required of the translated document. For example, translation ability can be graded as primary, intermediate, or senior, and more translated sentences are extracted for a primary translator than for a senior translator;
Alternatively, the translation precision required of a translated document can be graded as general, accurate, or precise. For a translated document required to meet the precise standard, more translated sentences are extracted than for the general and accurate standards;
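The rule that less experienced translators get more sentences sampled can be sketched as below. The source gives only the ordering, not any numbers, so the percentages in `RATE_BY_LEVEL`, the level names used as keys, and the function name `sample_sentences` are all hypothetical placeholders.

```python
import random

# Hypothetical sampling rates (percent of sentences to extract). The source
# only states that lower translator levels require more sampled sentences.
RATE_BY_LEVEL = {"primary": 50, "intermediate": 30, "senior": 10}


def sample_sentences(sentences, level, seed=None):
    """Pick translated sentences to evaluate, more for lower translator levels."""
    pct = RATE_BY_LEVEL[level]
    # Ceiling division, with at least one sentence and never more than exist.
    wanted = max(1, -(-len(sentences) * pct // 100))
    count = min(len(sentences), wanted)
    return random.Random(seed).sample(sentences, count)


sentences = [f"sentence {i}" for i in range(10)]
picked = sample_sentences(sentences, "primary", seed=0)
```

A table keyed by the required precision grade (general, accurate, precise) could be handled the same way.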
S102: extracting, from the source document corresponding to the translated document, the source sentence corresponding to the translated sentence;
In this step, the correspondence between the translated document and the source document is established in advance. For example, when the embodiment uses the sentence as the smallest unit of extraction, the paragraphs of the translated document and of the source document can be marked, and the position of the translated sentence within its paragraph can be recorded, such as the 2nd sentence on the third line of a given paragraph. After a translated sample has been selected from the translated document, the source sentence at the corresponding paragraph and sentence position is extracted from the source document according to the paragraph in which the translated sentence sits and its position there;
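The position-based lookup in step S102 can be sketched as follows. It assumes that both documents have already been segmented into paragraphs and sentences and that the two are aligned one-to-one; that layout and the function names `locate` and `source_sentence_for` are illustrative assumptions rather than anything the source specifies.

```python
def locate(translated_doc, sentence):
    """Return (paragraph index, sentence index) of a translated sentence."""
    for p_idx, paragraph in enumerate(translated_doc):
        for s_idx, candidate in enumerate(paragraph):
            if candidate == sentence:
                return p_idx, s_idx
    raise ValueError("sentence not found in translated document")


def source_sentence_for(source_doc, translated_doc, sentence):
    """Look up the source sentence at the same paragraph and sentence position."""
    p_idx, s_idx = locate(translated_doc, sentence)
    return source_doc[p_idx][s_idx]


# Each document is a list of paragraphs; each paragraph is a list of sentences.
source_doc = [["原句一。", "原句二。"], ["原句三。"]]
translated_doc = [["Sentence one.", "Sentence two."], ["Sentence three."]]
original = source_sentence_for(source_doc, translated_doc, "Sentence two.")
```

Real documents rarely align perfectly sentence by sentence, so a production system would need an alignment step before this lookup.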
S103: pushing the translated sentence and the source sentence to at least one evaluation user;
In the embodiment, the translation quality evaluation task is pushed through a crowdsourcing platform to the evaluation users registered on that platform. The evaluation users receive the task through the crowdsourcing platform and evaluate the translation quality based on the source sentence and the translated sentence;
S104: obtaining the error marks that the at least one evaluation user applies to the translated sentence;
In the embodiment, the translation quality evaluation task pushed to an evaluation user asks the user to mark, against the source sentence, the translation errors present in the translated sentence. All error marks for each translated sentence are taken as the evaluation data fed back by that evaluation user;
The analysis method fragments the translation evaluation task and pushes the translation quality evaluation task to a large base of evaluation users. By then collecting translation quality evaluation samples from some of those users, it can greatly reduce dependence on translation expert resources, increase the number of evaluations of translation quality, and improve the accuracy of the evaluation;
S105: counting the error marks, and determining the translation quality of the translated sentence according to the error marks.
Unlike conventional analysis methods, which score the translation quality of a translation directly, the technical solution of the invention is a negative, error-based analysis method. The actual translation quality of the translated document can likewise be determined from the number and types of error marks in the samples, and the error marks of the evaluation users let translators see exactly which translation errors occurred during translation, which helps improve their translation ability.
By using a crowdsourcing platform, the analysis method of the invention enlarges the pool of evaluation users and increases the number of evaluations of a translated document. This overcomes the evaluation error caused by too small an evaluation base and improves the accuracy of translation evaluation.
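The counting in step S105 can be illustrated with a small tally of how many evaluation users selected each error option for one translated sentence. The input layout (a mapping from user id to the options that user marked), the option labels, and the function name `tally_error_marks` are assumptions made for this sketch.

```python
from collections import Counter


def tally_error_marks(marks_by_user):
    """Count how many evaluation users selected each error option."""
    counts = Counter()
    for options in marks_by_user.values():
        # A set ensures each user counts at most once per option.
        counts.update(set(options))
    return counts


# Hypothetical feedback for a single translated sentence.
marks_by_user = {
    "user_a": ["grammar error", "punctuation error"],
    "user_b": ["grammar error"],
    "user_c": [],
}
counts = tally_error_marks(marks_by_user)
```

The counts show which error types evaluators agree on; the weighted scoring described in the text builds on the same per-user selections.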
In one embodiment of the invention, obtaining the error marks that the at least one evaluation user applies to the translated sentence specifically includes:
After the translated sentence and its corresponding source sentence have been pushed as a translation quality evaluation task, obtaining the evaluation data of all evaluation users who give feedback on the task within a preset time limit. For example, after the translation quality evaluation task has been published on the crowdsourcing platform or pushed to the group of evaluation users, the time limit for valid evaluation is set to 24 hours, and the translation quality evaluations fed back by all evaluation users within those 24 hours are the valid evaluation data;
Translation quality evaluations received after the 24 hours are not considered. In this way, the timeliness of translation quality evaluation is effectively guaranteed, the processing of large batches of document translation quality evaluation tasks is sped up, and the crowdsourcing platform resources occupied by a backlog of evaluation tasks are reduced;
The evaluation data are taken as the error marks of the translated sentence, and the number of error marks equals the total number of evaluation users who gave feedback within the preset time limit.
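The time-limit rule of this embodiment can be sketched as a simple filter. The record layout (dictionaries with a `submitted_at` timestamp) and the function name `effective_feedback` are hypothetical; the 24-hour default comes from the example in the text, and feedback arriving exactly at the deadline is treated as valid here, which is an assumption.

```python
from datetime import datetime, timedelta


def effective_feedback(feedback, published_at, limit=timedelta(hours=24)):
    """Keep only feedback submitted no later than published_at + limit."""
    deadline = published_at + limit
    return [item for item in feedback if item["submitted_at"] <= deadline]


published_at = datetime(2016, 12, 20, 9, 0)
feedback = [
    {"user": "a", "submitted_at": published_at + timedelta(hours=1)},
    {"user": "b", "submitted_at": published_at + timedelta(hours=30)},
]
valid = effective_feedback(feedback, published_at)
```

The number of error marks for the sentence is then `len(valid)`, matching the statement that it equals the number of users who responded in time.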
In another embodiment of the invention, obtaining the error marks that the at least one evaluation user applies to the translated sentence specifically includes:
After the translated sentence and its corresponding source sentence have been pushed as a translation quality evaluation task, obtaining, in the order in which evaluation users give feedback on the task, the evaluation data of as many evaluation users as the preset number of translation quality evaluation samples. For example, if a translation quality evaluation task requires at least 10 samples, then, following the order in which the evaluation users on the crowdsourcing platform give feedback on the task, the translation quality evaluations of the first 10 evaluation users are taken as the valid evaluation data;
The evaluation data are taken as the error marks of the translated sentence, and the number of error marks equals the preset number of translation quality evaluation samples.
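The sample-count rule of this embodiment amounts to ordering feedback by submission time and keeping the earliest entries. The record layout and the function name `first_n_feedback` are hypothetical; the default of 10 follows the example in the text.

```python
def first_n_feedback(feedback, sample_count=10):
    """Return the earliest `sample_count` feedback items by submission time."""
    ordered = sorted(feedback, key=lambda item: item["submitted_at"])
    return ordered[:sample_count]


# Timestamps are shown as plain numbers; any comparable type would work.
feedback = [
    {"user": "c", "submitted_at": 30},
    {"user": "a", "submitted_at": 10},
    {"user": "b", "submitted_at": 20},
]
valid = first_n_feedback(feedback, sample_count=2)
```

If fewer users respond than the preset count, this sketch simply returns everything received; how the method handles that case is not stated in the source.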
In one embodiment of the invention, determining the translation quality of the translated sentence according to the error marks specifically includes:
Collecting the first-class error score and the second-class error score that the at least one evaluation user gives the translated sentence, and determining each evaluation user's total error score for the translated sentence, where the first-class and second-class error scores are divided mainly according to the type and severity of the mistranslation problems in the translated sentence;
The total error score Err is calculated as follows:
Err = k1·Err_LC + k2·Err_GE,
where Err_LC is the first-class error score, k1 is the weight coefficient of the first-class error score, Err_GE is the second-class error score, and k2 is the weight coefficient of the second-class error score.
Because the two weight coefficients k1 and k2 affect the accuracy of the total error score Err, the weight coefficient k1 of the first-class error score and the weight coefficient k2 of the second-class error score are calculated in advance, before Err is determined. The calculation procedure disclosed in the embodiment includes:
Extracting from a corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined. The evaluation data include each sample's total error score ErrS, first-class error score Err_LCS, and second-class error score Err_GES. The ErrS, Err_LCS, and Err_GES values stored in the corpus come from manual translation quality scoring; that is, they are mutually independent translation quality analysis data whose scoring was completed beforehand;
Building the linear equation that relates a sample's total error score ErrS, first-class error score Err_LCS, and second-class error score Err_GES:
ErrS = k1·Err_LCS + k2·Err_GES;
Based on this linear equation, the weight coefficient k1 of the first-class error score and the weight coefficient k2 of the second-class error score are determined by a parameter estimation method for multiple linear regression, where the parameter estimation method includes the least squares method or the gradient descent method. The calculated k1 and k2 can then be combined with the first-class error score Err_LC and the second-class error score Err_GE of the translated sentence under analysis to obtain its total error score Err.
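For the two-coefficient case, the least squares option has a closed form obtained from the normal equations. The sketch below is one way to do it, not necessarily how the patent's system does: the function names `fit_weights_least_squares` and `total_error_score` and all sample values are hypothetical, and the fit has no intercept because the linear equation in the text has none.

```python
def fit_weights_least_squares(samples):
    """Solve the 2x2 normal equations for ErrS = k1*Err_LCS + k2*Err_GES.

    samples: list of (err_lcs, err_ges, err_s) tuples from pre-scored samples.
    """
    s11 = sum(lc * lc for lc, _, _ in samples)
    s12 = sum(lc * ge for lc, ge, _ in samples)
    s22 = sum(ge * ge for _, ge, _ in samples)
    b1 = sum(lc * y for lc, _, y in samples)
    b2 = sum(ge * y for _, ge, y in samples)
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("samples do not determine k1 and k2 uniquely")
    k1 = (b1 * s22 - b2 * s12) / det
    k2 = (b2 * s11 - b1 * s12) / det
    return k1, k2


def total_error_score(err_lc, err_ge, k1, k2):
    """Err = k1*Err_LC + k2*Err_GE for the sentence under analysis."""
    return k1 * err_lc + k2 * err_ge


# Made-up corpus samples, then a score for a new sentence.
samples = [(1, 0, 0.7), (0, 1, 0.3), (1, 1, 1.0), (2, 1, 1.7)]
k1, k2 = fit_weights_least_squares(samples)
err = total_error_score(err_lc=2.0, err_ge=1.0, k1=k1, k2=k2)
```

The same approach extends to the seven first-class and four second-class option weights, although a general linear solver is more practical than hand-written formulas at that size.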
In the above embodiment, the acquired error marks for the translated sentence include the first-type error options selected by each evaluating user according to how the sentence was translated. The first-type error options include: whole-sentence mistranslation, unsmooth reading, grammatical error, proper-noun error, fixed-collocation error, wording error, and expression that does not conform to target-language usage. Based on their own understanding of the original sentence, evaluating users can assess the translation and its quality by choosing one or several of the above first-type error options as their evaluation of the translated sentence.
Accordingly, the first-type error score Err_LC can be determined from the first-type error options in the error marks. Err_LC is calculated as follows:
Err_LC=k11·Err_LC1+k12·Err_LC2+k13·Err_LC3+k14·Err_LC4+k15·Err_LC5+k16·Err_LC6+k17·Err_LC7,
where Err_LC1~Err_LC7 are the option scores corresponding one-to-one to the first-type error options, and k11~k17 are the weight coefficients corresponding one-to-one to the first-type error options. In the embodiment, the option score of a first-type error option selected by the evaluating user is 1, and the option score of a first-type error option that was not selected is 0.
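Because each option score is either 1 or 0, Err_LC reduces to the sum of the weights of the selected options. The sketch below illustrates this; the option labels and the weight values are hypothetical placeholders, since the embodiment obtains k11~k17 by regression on corpus data.

```python
# Illustrative labels for the seven first-type error options, in the order
# of Err_LC1..Err_LC7. The label strings themselves are assumptions.
FIRST_TYPE_OPTIONS = (
    "whole_sentence_mistranslation",
    "unsmooth_reading",
    "grammar_error",
    "proper_noun_error",
    "fixed_collocation_error",
    "wording_error",
    "non_idiomatic_expression",
)


def first_type_error_score(selected, weights):
    """Err_LC = sum_j k1j * Err_LCj, with Err_LCj = 1 if selected else 0."""
    if len(weights) != len(FIRST_TYPE_OPTIONS):
        raise ValueError("one weight is required per first-type error option")
    unknown = set(selected) - set(FIRST_TYPE_OPTIONS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    total = 0.0
    for option, weight in zip(FIRST_TYPE_OPTIONS, weights):
        option_score = 1 if option in selected else 0
        total += weight * option_score
    return total


# Hypothetical weights standing in for fitted k11..k17.
example_weights = [30.0, 10.0, 15.0, 10.0, 8.0, 7.0, 5.0]
score = first_type_error_score({"grammar_error", "wording_error"}, example_weights)
```

The same pattern applies to the second-type error score Err_GE with its four options and weights k21~k24.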
Because the weight coefficients k11~k17 affect the accuracy of the first-type error score Err_LC, the weight coefficients k11~k17 of the first-type error options are calculated in advance, before Err_LC is determined. The calculation flow disclosed in the embodiment includes:
Extracting from the corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined. The evaluation data includes, for each sample, the first-type error score Err_LCS and the option scores Err_LCS1~Err_LCS7 corresponding one-to-one to the first-type error options. The values Err_LCS and Err_LCS1~Err_LCS7 stored in the corpus were obtained by manual translation-quality scoring; that is, they are mutually independent translation-quality data whose scoring was completed beforehand;
Constructing a linear equation between the first-type error score Err_LCS and its option scores Err_LCS1~Err_LCS7. The linear equation is:
Err_LCS=k11·Err_LCS1+k12·Err_LCS2+k13·Err_LCS3+k14·Err_LCS4+k15·Err_LCS5+k16·Err_LCS6+k17·Err_LCS7;
According to the linear equation, the weight coefficients k11~k17 of the first-type error options are determined by a multiple linear regression parameter calculation method, which includes the least squares method or the gradient descent method. The calculated k11~k17 can then be combined with the option scores Err_LC1~Err_LC7 of the translated sentence under analysis to obtain its first-type error score Err_LC.
In some embodiments, the acquired error marks for the translated sentence include the second-type error options selected by each evaluating user for the translated sentence. The second-type error options include: omitted vocabulary, spelling error, numeric error and punctuation error. Based on their own understanding of the original sentence, evaluating users can assess the translation quality of the translated sentence by choosing one or several of the above second-type error options as their evaluation of it.
Accordingly, the second-type error score Err_GE can be determined from the second-type error options in the error marks. Err_GE is calculated as follows:
Err_GE=k21·Err_GE1+k22·Err_GE2+k23·Err_GE3+k24·Err_GE4,
where Err_GE1~Err_GE4 are the option scores corresponding one-to-one to the second-type error options, and k21~k24 are the weight coefficients corresponding one-to-one to the second-type error options. In the embodiment, the option score of a second-type error option selected by the evaluating user is 1, and the option score of a second-type error option that was not selected is 0.
Because the weight coefficients k21~k24 affect the accuracy of the second-type error score Err_GE, the weight coefficients k21~k24 of the second-type error options are calculated in advance, before Err_GE is determined. The calculation flow disclosed in the embodiment includes:
Extracting from the corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined. The evaluation data includes, for each sample, the second-type error score Err_GES and the option scores Err_GES1~Err_GES4 corresponding one-to-one to the second-type error options. The values Err_GES and Err_GES1~Err_GES4 stored in the corpus were obtained by manual translation-quality scoring; that is, they are mutually independent translation-quality data whose scoring was completed beforehand;
Constructing a linear equation between the second-type error score Err_GES and its option scores Err_GES1~Err_GES4. The linear equation is:
Err_GES=k21·Err_GES1+k22·Err_GES2+k23·Err_GES3+k24·Err_GES4;
According to the linear equation, the weight coefficients k21~k24 of the second-type error options are determined by a multiple linear regression parameter calculation method, which includes the least squares method or the gradient descent method. The calculated k21~k24 can then be combined with the option scores Err_GE1~Err_GE4 of the translated sentence under analysis to obtain its second-type error score Err_GE.
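In the above embodiments of the present invention, the process of determining the weight coefficients from the linear equation by a multiple linear regression parameter calculation method is as follows:
Taking the calculation of the weight coefficients of the first-type error options by the least squares method as an example, let
Y=Err_LC, X1=Err_LC1, X2=Err_LC2, X3=Err_LC3, X4=Err_LC4, X5=Err_LC5, X6=Err_LC6, X7=Err_LC7.
For the n groups of collected sample data:
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{17} \\ x_{21} & x_{22} & \cdots & x_{27} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{n7} \end{pmatrix}$$
and the corresponding language-competence error scores of these n samples in the corpus:
$$Y = (y_1, y_2, \ldots, y_n)'$$
the following system of linear equations is obtained:
$$y_i = k_{11}x_{i1} + k_{12}x_{i2} + \cdots + k_{17}x_{i7}, \quad i = 1, 2, \ldots, n$$
The multiple linear regression coefficients can be obtained by the least squares method:
$$\hat{K} = (X'X)^{-1}X'Y$$
where $\hat{K} = (k_{11}, k_{12}, \ldots, k_{17})'$ and X' is the transpose of X.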
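The least-squares step can be sketched as follows, assuming no intercept term and that X'X is invertible. The sketch forms the normal equations X'X·K = X'Y and solves them by Gauss-Jordan elimination in plain Python; the function name and the data layout (a list of rows, one per sample) are assumptions made for illustration.

```python
def least_squares_weights(X, Y):
    """Solve (X'X) K = X'Y for the weight vector K.

    X: list of n rows, each holding the option scores of one sample.
    Y: list of n manually assigned error scores.
    """
    m = len(X[0])
    # Normal equations: A = X'X (m x m), b = X'Y (length m).
    A = [[sum(row[p] * row[q] for row in X) for q in range(m)] for p in range(m)]
    b = [sum(row[p] * y for row, y in zip(X, Y)) for p in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; collect more varied samples")
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(m):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return [b[i] / A[i][i] for i in range(m)]
```

Because the option scores are 0/1 indicators, X'X is invertible only if the samples cover every option in sufficiently different combinations; otherwise the sketch raises an error instead of returning arbitrary weights.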
In another embodiment of the present invention, taking the calculation of the weight coefficients of the first-type error options by the gradient descent method as an example, let
$$h_k(X) = k_{11}X_1 + k_{12}X_2 + \cdots + k_{17}X_7$$
Set up the cost function:
$$J(k) = \frac{1}{2n}\sum_{i=1}^{n}\left(h_k(x_i) - y_i\right)^2$$
where $x_i = (x_{i1}, \ldots, x_{i7})$ holds the option scores of the i-th sample and $y_i$ is its first-type error score. The weight coefficients {k11, k12, k13, k14, k15, k16, k17} that minimize J(k) are the regression coefficients calculated from the translated-sentence samples;
The specific calculation process is as follows:
Repeat the following calculation until convergence {
$$k_{1j} := k_{1j} - \alpha\,\frac{\partial J(k)}{\partial k_{1j}}$$
}
where 1≤i≤n and 1≤j≤7; the upper limit 7 of j is the total number of weight coefficients corresponding to the first-type error options in this embodiment, and
$$\frac{\partial J(k)}{\partial k_{1j}} = \frac{1}{n}\sum_{i=1}^{n}\left(h_k(x_i) - y_i\right)x_{ij}$$
where α is the convergence coefficient (step size), which is set by hand: a value that is too small makes the algorithm converge too slowly, and a value that is too large makes the steps overshoot the region of convergence;
The convergence condition of the algorithm is: the change between two successive values of J(k) is smaller than a set threshold.
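A minimal gradient-descent sketch follows. It assumes a conventional mean-squared-error cost J(k) = (1/2n)·Σ(h_k(x_i) − y_i)², zero initial weights, and a hand-picked step size; these choices, the function names and the default parameter values are assumptions for illustration rather than values taken from the embodiment. The loop stops when two successive values of J(k) differ by less than a threshold, matching the convergence condition described above.

```python
def predict(k, row):
    """Linear hypothesis h_k(x) = sum_j k_j * x_j."""
    return sum(kj * xj for kj, xj in zip(k, row))


def cost(X, Y, k):
    """Assumed cost: J(k) = 1/(2n) * sum_i (h_k(x_i) - y_i)^2."""
    n = len(X)
    return sum((predict(k, row) - y) ** 2 for row, y in zip(X, Y)) / (2 * n)


def gradient_descent(X, Y, alpha=0.5, threshold=1e-14, max_iter=100000):
    """Fit weights by batch gradient descent.

    alpha is the convergence coefficient (step size); threshold is the
    minimum change in J(k) between two successive iterations.
    """
    n, m = len(X), len(X[0])
    k = [0.0] * m
    previous = cost(X, Y, k)
    for _ in range(max_iter):
        residuals = [predict(k, row) - y for row, y in zip(X, Y)]
        # Partial derivative of J with respect to each weight k_j.
        gradient = [
            sum(r * row[j] for r, row in zip(residuals, X)) / n for j in range(m)
        ]
        # Simultaneous update of all weights.
        k = [kj - alpha * g for kj, g in zip(k, gradient)]
        current = cost(X, Y, k)
        if abs(previous - current) < threshold:
            break
        previous = current
    return k
```

The `max_iter` cap is a safeguard added in this sketch so that a badly chosen step size cannot loop forever.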
In some embodiments of the present invention, the analysis method further includes: compiling the total error scores given by all evaluating users and determining the total score ScoreST of the translated sentence;
The total score ScoreST of the translated sentence is calculated as:
$$ScoreST = \frac{\sum_{i=1}^{n}\left(F - Err_i\right)\cdot C_i}{\sum_{i=1}^{n} C_i}$$
where n is the number of evaluating users who have returned feedback on the translation quality evaluation task, F is the full-mark value of the evaluation, Err_i is the total error score given to the translated sentence by the i-th evaluating user, and C_i is the ability coefficient C of the i-th evaluating user.
In the embodiment, the evaluating users can be ordered by the time at which they returned feedback on the translation quality evaluation task and numbered 1, 2, ..., n accordingly.
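The ScoreST formula of claim 8 is a weighted average of (F − Err_i) with the coefficients C_i as weights. It can be sketched as below; the function name and the default full-mark value of 100 are assumptions, since the text does not fix a value for F.

```python
def sentence_total_score(error_scores, coefficients, full_marks=100.0):
    """ScoreST = sum((F - Err_i) * C_i) / sum(C_i).

    error_scores: Err_i for each evaluating user, in feedback order.
    coefficients: C_i for the same users, in the same order.
    full_marks:   F, the full-mark value (100 is an assumed default).
    """
    if len(error_scores) != len(coefficients) or not error_scores:
        raise ValueError("need one coefficient per evaluating user, and at least one user")
    weight_sum = sum(coefficients)
    if weight_sum <= 0:
        raise ValueError("the coefficients C_i must sum to a positive value")
    weighted = sum((full_marks - e) * c for e, c in zip(error_scores, coefficients))
    return weighted / weight_sum
```

For example, two users reporting Err values of 10 and 30 with coefficients 0.9 and 0.3 give (90·0.9 + 70·0.3) / 1.2 = 85, so the more capable evaluator pulls the result toward their own judgment.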
In some embodiments of the present invention, the analysis method further includes: before the total score of the translated sentence is determined, determining the ability coefficient C of each evaluating user;
The ability coefficient C is calculated as:
$$C = \frac{\alpha}{1 + e^{-\beta\,(T - t)}}$$
where T is the evaluating user's own assessed grade, t is the translation level required for the translated sentence, and α and β are adjustment coefficients; optionally, α is 1.8 and β is 1/3.
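The coefficient C = α / (1 + e^(−β(T − t))) given in claim 9 is a logistic function of the gap between the evaluator's grade T and the required level t. A short sketch, using the optional values α = 1.8 and β = 1/3 as defaults (the function name is an assumption):

```python
import math


def ability_coefficient(T, t, alpha=1.8, beta=1.0 / 3.0):
    """C = alpha / (1 + exp(-beta * (T - t))).

    T: the evaluating user's assessed grade.
    t: the translation level required for the translated sentence.
    """
    return alpha / (1.0 + math.exp(-beta * (T - t)))
```

With these defaults an evaluator whose grade equals the required level receives C = 0.9, an evaluator above the required level approaches the upper bound α = 1.8, and an evaluator below it approaches 0.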
In some embodiments of the present invention, the analysis method further includes determining the translation quality of the translated document according to the translation quality evaluation samples. The specific flow further includes:
The total scores of all translated sentences are used as the translation-quality score of the translated document, and the translation-quality score of the translator of the translated document is determined according to that score. In the embodiment, the total score ScoreST can be divided into multiple score intervals, each of which corresponds to one translation-quality score of the translator. For example, ScoreST is divided into 5 score intervals whose corresponding translator scores, from high to low, are 1 to 5. In this way, a single evaluation task on the translated document yields both the translation-quality score of the document and the translation-quality score of the translator.
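The interval mapping in the example can be sketched as follows. The text states only that there are 5 intervals and that the highest interval maps to 1; the equal-width intervals over [0, F], the default F = 100 and the function name are assumptions of this sketch.

```python
def translator_quality_score(score_st, full_marks=100.0, intervals=5):
    """Map ScoreST to a translator score of 1 (best) to `intervals` (worst).

    Assumes equal-width intervals over [0, full_marks]; the actual interval
    boundaries are not specified in the text.
    """
    clamped = min(max(score_st, 0.0), full_marks)
    width = full_marks / intervals
    # Index of the interval counted from the lowest scores upward.
    index_from_bottom = min(int(clamped // width), intervals - 1)
    return intervals - index_from_bottom
```

Under these assumptions a ScoreST of 95 falls in the top interval and yields 1, while a ScoreST of 10 falls in the bottom interval and yields 5.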
The present invention also provides a data analysis system for translated sentences. The analysis system uses the analysis method disclosed in the above embodiments and mainly includes:
An extraction unit, configured to extract a translated sentence from a translated document, and to extract, from the source document corresponding to the translated document, the original sentence corresponding to the translated sentence;
A push unit, configured to push the translated sentence and the original sentence to at least one evaluating user;
An acquiring unit, configured to acquire the error marks made on the translated sentence by the at least one evaluating user;
A determining unit, configured to compile the error marks and determine the translation quality of the translated sentence according to the error marks.
In the embodiment, the determining unit is configured to: compile the first-type error score and the second-type error score given to the translated sentence by the at least one evaluating user, and determine each evaluating user's total error score for the translated sentence. The total error score is calculated as follows:
Err=k1·Err_LC+k2·Err_GE,
where Err is the total error score of the translated sentence, Err_LC is the first-type error score, k1 is the weight coefficient of the first-type error score, Err_GE is the second-type error score, and k2 is the weight coefficient of the second-type error score.
In the embodiment, the determining unit is further configured to determine, before the total error score Err is determined, the weight coefficient k1 of the first-type error score and the weight coefficient k2 of the second-type error score, including: extracting from the corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including the total error score ErrS, the first-type error score Err_LCS and the second-type error score Err_GES of each sample; and constructing a linear equation, the linear equation being:
ErrS=k1·Err_LCS+k2·Err_GES;
According to the linear equation, the weight coefficient k1 corresponding to the first-type error score and the weight coefficient k2 corresponding to the second-type error score are determined by a multiple linear regression parameter calculation method, which includes the least squares method or the gradient descent method.
In the embodiment, the error marks include the first-type error options selected by each evaluating user for the translated sentence. The first-type error options include: whole-sentence mistranslation, unsmooth reading, grammatical error, proper-noun error, fixed-collocation error, wording error, and expression that does not conform to target-language usage. The analysis method further includes: determining the first-type error score Err_LC according to the error marks, where Err_LC is calculated as follows:
Err_LC=k11·Err_LC1+k12·Err_LC2+k13·Err_LC3+k14·Err_LC4+k15·Err_LC5+k16·Err_LC6+k17·Err_LC7,
where Err_LC1~Err_LC7 are the option scores corresponding one-to-one to the first-type error options, and k11~k17 are the weight coefficients corresponding one-to-one to the first-type error options.
In the embodiment, the determining unit is further configured to determine, before the first-type error score Err_LC is determined, the weight coefficients k11~k17 of the first-type error options, including: extracting from the corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including the first-type error score Err_LCS of each sample and the option scores Err_LCS1~Err_LCS7 corresponding one-to-one to the first-type error options; and constructing a linear equation, the linear equation being:
Err_LCS=k11·Err_LCS1+k12·Err_LCS2+k13·Err_LCS3+k14·Err_LCS4+k15·Err_LCS5+k16·Err_LCS6+k17·Err_LCS7;
According to the linear equation, the weight coefficients k11~k17 of the first-type error options are determined by a multiple linear regression parameter calculation method, which includes the least squares method or the gradient descent method.
In the embodiment, the error marks include the second-type error options selected by each evaluating user for the translated sentence. The second-type error options include: omitted vocabulary, spelling error, numeric error and punctuation error. The determining unit is further configured to: determine the second-type error score Err_GE according to the error marks, where Err_GE is calculated as follows:
Err_GE=k21·Err_GE1+k22·Err_GE2+k23·Err_GE3+k24·Err_GE4,
where Err_GE1~Err_GE4 are the option scores corresponding one-to-one to the second-type error options, and k21~k24 are the weight coefficients corresponding one-to-one to the second-type error options.
In the embodiment, the determining unit is further configured to determine, before the second-type error score Err_GE is determined, the weight coefficients k21~k24 of the second-type error options, including: extracting from the corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including the second-type error score Err_GES of each sample and the option scores Err_GES1~Err_GES4 corresponding one-to-one to the second-type error options; and constructing a linear equation, the linear equation being:
Err_GES=k21·Err_GES1+k22·Err_GES2+k23·Err_GES3+k24·Err_GES4;
According to the linear equation, the weight coefficients k21~k24 of the second-type error options are determined by a multiple linear regression parameter calculation method, which includes the least squares method or the gradient descent method.
In the embodiment, the determining unit is configured to: compile the total error scores given by all evaluating users and determine the total score ScoreST of the translated sentence. The total score is calculated as:
$$ScoreST = \frac{\sum_{i=1}^{n}\left(F - Err_i\right)\cdot C_i}{\sum_{i=1}^{n} C_i}$$
where n is the number of evaluating users who have returned feedback on the translation quality evaluation task, F is the full-mark value of the evaluation, Err_i is the total error score given to the translated sentence by the i-th evaluating user, and C_i is the ability coefficient C of the i-th evaluating user.
In the embodiment, the determining unit is configured to: determine the ability coefficient C of each evaluating user before the total score of the translated sentence is determined. Determining the ability coefficient C includes:
$$C = \frac{\alpha}{1 + e^{-\beta\,(T - t)}}$$
where T is the evaluating user's own assessed grade, t is the translation level required for the translated sentence, and α and β are adjustment coefficients; α is 1.8 and β is 1/3.
In the embodiment, the determining unit is further configured to use the total scores of the translated sentences as the translation-quality score of the translated document and, according to that score, determine the translation-quality score of the translator of the translated document.
It should be understood that the present invention is not limited to the flows and structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.

Claims (18)

1. A data analysis method for a translated sentence, characterized by comprising:
Extracting a translated sentence from a translated document;
Extracting, from the source document corresponding to the translated document, the original sentence corresponding to the translated sentence;
Pushing the translated sentence and the original sentence to at least one evaluating user;
Acquiring the error marks made on the translated sentence by the at least one evaluating user;
Compiling the error marks, and determining the translation quality of the translated sentence according to the error marks.
2. The analysis method according to claim 1, characterized in that determining the translation quality of the translated sentence according to the error marks comprises:
Compiling the first-type error score and the second-type error score given to the translated sentence by the at least one evaluating user, and determining each evaluating user's total error score for the translated sentence;
The total error score is calculated as follows:
Err=k1·Err_LC+k2·Err_GE,
where Err is the total error score of the translated sentence, Err_LC is the first-type error score, k1 is the weight coefficient of the first-type error score, Err_GE is the second-type error score, and k2 is the weight coefficient of the second-type error score.
3. The analysis method according to claim 2, characterized in that, before the total error score Err is determined, the analysis method further comprises determining the weight coefficient k1 of the first-type error score and the weight coefficient k2 of the second-type error score, including:
Extracting from the corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including the total error score ErrS, the first-type error score Err_LCS and the second-type error score Err_GES of each translated-sentence sample;
Constructing a linear equation, the linear equation being:
ErrS=k1·Err_LCS+k2·Err_GES;
According to the linear equation, determining, by a multiple linear regression parameter calculation method, the weight coefficient k1 corresponding to the first-type error score and the weight coefficient k2 corresponding to the second-type error score, wherein the multiple linear regression parameter calculation method includes the least squares method or the gradient descent method.
4. The analysis method according to claim 2, characterized in that the error marks include the first-type error options selected by each evaluating user for the translated sentence, the first-type error options including: whole-sentence mistranslation, unsmooth reading, grammatical error, proper-noun error, fixed-collocation error, wording error, and expression that does not conform to target-language usage;
The analysis method further comprises:
Determining the first-type error score Err_LC according to the error marks, the first-type error score Err_LC being calculated as follows:
Err_LC=k11·Err_LC1+k12·Err_LC2+k13·Err_LC3+k14·Err_LC4+k15·Err_LC5+k16·Err_LC6+k17·Err_LC7,
where Err_LC1~Err_LC7 are the option scores corresponding one-to-one to the first-type error options, and k11~k17 are the weight coefficients corresponding one-to-one to the first-type error options.
5. The analysis method according to claim 4, characterized in that, before the first-type error score Err_LC is determined, the analysis method further comprises determining the weight coefficients k11~k17 of the first-type error options, including:
Extracting from the corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including the first-type error score Err_LCS of each translated-sentence sample and the option scores Err_LCS1~Err_LCS7 corresponding one-to-one to the first-type error options;
Constructing a linear equation, the linear equation being:
Err_LCS=k11·Err_LCS1+k12·Err_LCS2+k13·Err_LCS3+k14·Err_LCS4+k15·Err_LCS5+k16·Err_LCS6+k17·Err_LCS7;
According to the linear equation, determining, by a multiple linear regression parameter calculation method, the weight coefficients k11~k17 of the first-type error options, wherein the multiple linear regression parameter calculation method includes the least squares method or the gradient descent method.
6. The analysis method according to claim 2, characterized in that the error marks include the second-type error options selected by each evaluating user for the translated sentence, the second-type error options including: omitted vocabulary, spelling error, numeric error and punctuation error;
The analysis method further comprises:
Determining the second-type error score Err_GE according to the error marks, the second-type error score Err_GE being calculated as follows:
Err_GE=k21·Err_GE1+k22·Err_GE2+k23·Err_GE3+k24·Err_GE4,
where Err_GE1~Err_GE4 are the option scores corresponding one-to-one to the second-type error options, and k21~k24 are the weight coefficients corresponding one-to-one to the second-type error options.
7. The analysis method according to claim 6, characterized in that, before the second-type error score Err_GE is determined, the analysis method further comprises determining the weight coefficients k21~k24 of the second-type error options, including:
Extracting from the corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including the second-type error score Err_GES of each translated-sentence sample and the option scores Err_GES1~Err_GES4 corresponding one-to-one to the second-type error options;
Constructing a linear equation, the linear equation being:
Err_GES=k21·Err_GES1+k22·Err_GES2+k23·Err_GES3+k24·Err_GES4;
According to the linear equation, determining, by a multiple linear regression parameter calculation method, the weight coefficients k21~k24 of the second-type error options, wherein the multiple linear regression parameter calculation method includes the least squares method or the gradient descent method.
8. The analysis method according to claim 2, characterized in that the analysis method further comprises:
Compiling the total error scores given by all the evaluating users, and determining the total score ScoreST of the translated sentence;
The total score ScoreST of the translated sentence is calculated as:
$$ScoreST = \frac{\sum_{i=1}^{n}\left(F - Err_i\right)\cdot C_i}{\sum_{i=1}^{n} C_i},$$
where n is the number of evaluating users who have returned feedback on the translation quality evaluation task, F is the full-mark value of the evaluation, Err_i is the total error score given to the translated sentence by the i-th evaluating user, and C_i is the ability coefficient C of the i-th evaluating user.
9. The analysis method according to claim 8, characterized by further comprising:
Before the total score of the translated sentence is determined, determining the ability coefficient C of each evaluating user;
Determining the ability coefficient C includes:
$$C = \frac{\alpha}{1 + e^{-\beta\,(T - t)}},$$
where T is the evaluating user's own assessed grade, t is the translation level required for the translated sentence, and α and β are adjustment coefficients; α is 1.8 and β is 1/3.
10. A data analysis system for a translated sentence, characterized by comprising:
An extraction unit, configured to extract a translated sentence from a translated document, and to extract, from the source document corresponding to the translated document, the original sentence corresponding to the translated sentence;
A push unit, configured to push the translated sentence and the original sentence to at least one evaluating user;
An acquiring unit, configured to acquire the error marks made on the translated sentence by the at least one evaluating user;
A determining unit, configured to compile the error marks and determine the translation quality of the translated sentence according to the error marks.
11. The analysis system according to claim 10, characterized in that the determining unit is configured to:
Compile the first-type error score and the second-type error score given to the translated sentence by the at least one evaluating user, and determine each evaluating user's total error score for the translated sentence;
The total error score is calculated as follows:
Err=k1·Err_LC+k2·Err_GE,
where Err is the total error score of the translated sentence, Err_LC is the first-type error score, k1 is the weight coefficient of the first-type error score, Err_GE is the second-type error score, and k2 is the weight coefficient of the second-type error score.
12. The analysis system according to claim 11, characterized in that the determining unit is further configured to determine, before the total error score Err is determined, the weight coefficient k1 of the first-type error score and the weight coefficient k2 of the second-type error score, including:
Extracting from the corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including the total error score ErrS, the first-type error score Err_LCS and the second-type error score Err_GES of each translated-sentence sample;
Constructing a linear equation, the linear equation being:
ErrS=k1·Err_LCS+k2·Err_GES;
According to the linear equation, determining, by a multiple linear regression parameter calculation method, the weight coefficient k1 corresponding to the first-type error score and the weight coefficient k2 corresponding to the second-type error score, wherein the multiple linear regression parameter calculation method includes the least squares method or the gradient descent method.
13. The analysis system according to claim 11, characterized in that the error marks include the first-type error options selected by each evaluating user for the translated sentence, the first-type error options including: whole-sentence mistranslation, unsmooth reading, grammatical error, proper-noun error, fixed-collocation error, wording error, and expression that does not conform to target-language usage;
The analysis method further comprises:
Determining the first-type error score Err_LC according to the error marks, the first-type error score Err_LC being calculated as follows:
Err_LC=k11·Err_LC1+k12·Err_LC2+k13·Err_LC3+k14·Err_LC4+k15·Err_LC5+k16·Err_LC6+k17·Err_LC7,
where Err_LC1~Err_LC7 are the option scores corresponding one-to-one to the first-type error options, and k11~k17 are the weight coefficients corresponding one-to-one to the first-type error options.
14. The analysis system according to claim 13, characterized in that the determining unit is further configured to determine, before the first-type error score Err_LC is determined, the weight coefficients k11~k17 of the first-type error options, including:
Extracting from the corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including the first-type error score Err_LCS of each translated-sentence sample and the option scores Err_LCS1~Err_LCS7 corresponding one-to-one to the first-type error options;
Constructing a linear equation, the linear equation being:
Err_LCS=k11·Err_LCS1+k12·Err_LCS2+k13·Err_LCS3+k14·Err_LCS4+k15·Err_LCS5+k16·Err_LCS6+k17·Err_LCS7;
According to the linear equation, determining, by a multiple linear regression parameter calculation method, the weight coefficients k11~k17 of the first-type error options, wherein the multiple linear regression parameter calculation method includes the least squares method or the gradient descent method.
15. The analysis system according to claim 11, characterized in that the error marks include the second-type error options selected by each evaluating user for the translated sentence, the second-type error options including: omitted vocabulary, spelling error, numeric error and punctuation error;
The determining unit is further configured to:
Determine the second-type error score Err_GE according to the error marks, the second-type error score Err_GE being calculated as follows:
Err_GE=k21·Err_GE1+k22·Err_GE2+k23·Err_GE3+k24·Err_GE4,
where Err_GE1~Err_GE4 are the option scores corresponding one-to-one to the second-type error options, and k21~k24 are the weight coefficients corresponding one-to-one to the second-type error options.
16. The analysis system according to claim 15, characterized in that the determining unit is further configured to determine, before the second-type error score Err_GE is determined, the weight coefficients k21~k24 of the second-type error options, including:
Extracting from the corpus the evaluation data of multiple translated-sentence samples whose translation quality has already been determined, the evaluation data including the second-type error score Err_GES of each translated-sentence sample and the option scores Err_GES1~Err_GES4 corresponding one-to-one to the second-type error options;
Constructing a linear equation, the linear equation being:
Err_GES=k21·Err_GES1+k22·Err_GES2+k23·Err_GES3+k24·Err_GES4;
According to the linear equation, determining, by a multiple linear regression parameter calculation method, the weight coefficients k21~k24 of the second-type error options, wherein the multiple linear regression parameter calculation method includes the least squares method or the gradient descent method.
17. The analysis system according to claim 11, characterized in that the determining unit is configured to:
aggregate the overall translated-sentence error scores given by all the evaluating users and determine the overall translated-sentence score ScoreST of the translated sentence;
the overall translated-sentence score ScoreST is calculated as:
$$\mathrm{ScoreST} = \frac{\sum_{i=1}^{n} (F - \mathrm{Err}_i) \cdot C_i}{\sum_{i=1}^{n} C_i},$$
where n is the number of evaluating users who returned feedback on the translation quality evaluation task, F is the full-score value of the evaluation, Err_i is the overall translated-sentence error score given by the i-th evaluating user, and C_i is the ability coefficient C of the i-th evaluating user.
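The ability-weighted average in claim 17 can be illustrated with a short Python sketch. The function name, the default full-score value of 100, and the three example evaluators are assumptions made for the illustration; this excerpt does not fix a value for F.

```python
def overall_sentence_score(error_scores, ability_coeffs, full_score=100.0):
    """Return ScoreST = sum((F - Err_i) * C_i) / sum(C_i) over all evaluators."""
    if not error_scores or len(error_scores) != len(ability_coeffs):
        raise ValueError("need one ability coefficient per error score")
    total_weight = sum(ability_coeffs)
    if total_weight <= 0:
        raise ValueError("ability coefficients must sum to a positive value")
    weighted = sum(
        (full_score - err) * c for err, c in zip(error_scores, ability_coeffs)
    )
    return weighted / total_weight


# Three hypothetical evaluators: overall error scores Err_i and coefficients C_i.
example_errors = [10.0, 20.0, 5.0]
example_coeffs = [0.9, 1.2, 0.6]
score_st = overall_sentence_score(example_errors, example_coeffs)
print(f"ScoreST = {score_st:.2f}")
```

Because each term is weighted by C_i, an evaluator with a higher ability coefficient pulls the overall score further toward their own judgment.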
18. The analysis system according to claim 17, characterized in that the determining unit is configured to:
determine the ability coefficient C of each evaluating user before determining the overall translated-sentence score of the translated sentence;
determining the ability coefficient C includes:
$$C = \frac{\alpha}{1 + e^{-\beta (T - t)}},$$
where T is the grade of each evaluating user, t is the required translation level of the translated sentence, and α and β are adjustment coefficients, with α set to 1.8 and β set to 1/3.
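The ability coefficient is a scaled logistic function of the gap between the evaluator's grade and the level the sentence requires. A minimal Python sketch follows, using the α = 1.8 and β = 1/3 values given in the claim; the function name and the example grades are hypothetical.

```python
import math


def ability_coefficient(user_level, required_level, alpha=1.8, beta=1.0 / 3.0):
    """Return C = alpha / (1 + exp(-beta * (T - t)))."""
    return alpha / (1.0 + math.exp(-beta * (user_level - required_level)))


# Hypothetical grades: the sentence requires level 3.
for grade in (1, 3, 5):
    print(grade, round(ability_coefficient(grade, 3), 4))
```

When T equals t the exponent is zero and C is α/2 = 0.9. C rises toward α = 1.8 as the evaluator's grade exceeds the required level and falls toward 0 as it drops below.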
CN201611186449.7A 2016-12-21 2016-12-21 Data analysis method and system of translated sentence Pending CN106598957A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611186449.7A CN106598957A (en) 2016-12-21 2016-12-21 Data analysis method and system of translated sentence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611186449.7A CN106598957A (en) 2016-12-21 2016-12-21 Data analysis method and system of translated sentence

Publications (1)

Publication Number Publication Date
CN106598957A true CN106598957A (en) 2017-04-26

Family

ID=58602004

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611186449.7A Pending CN106598957A (en) 2016-12-21 2016-12-21 Data analysis method and system of translated sentence

Country Status (1)

Country Link
CN (1) CN106598957A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108537246A (en) * 2018-02-28 2018-09-14 成都优译信息技术股份有限公司 A kind of method and system that parallel corpora is classified by translation quality
CN109166594A (en) * 2018-07-24 2019-01-08 北京搜狗科技发展有限公司 A kind of data processing method, device and the device for data processing

Similar Documents

Publication Publication Date Title
CN104391942B (en) Short essay eigen extended method based on semantic collection of illustrative plates
CN106570179B (en) A kind of kernel entity recognition methods and device towards evaluation property text
US8301640B2 (en) System and method for rating a written document
CN104063387B (en) Apparatus and method of extracting keywords in the text
CN102662930B (en) Corpus tagging method and corpus tagging device
CN104756100A (en) Intent estimation device and intent estimation method
CN106598959A (en) Method and system for determining intertranslation relationship of bilingual sentence pairs
CN105045778A (en) Chinese homonym error auto-proofreading method
CN105678327A (en) Method for extracting non-taxonomy relations between entities for Chinese patents
CN105279252A (en) Related word mining method, search method and search system
CN103399901A (en) Keyword extraction method
CN101866337A (en) Part-or-speech tagging system, and device and method thereof for training part-or-speech tagging model
CN103678272B (en) The disposal route of unregistered word in the interdependent treebank of Chinese
CN105550170A (en) Chinese word segmentation method and apparatus
Jahangir et al. N-gram and gazetteer list based named entity recognition for urdu: A scarce resourced language
CN106776555B (en) A kind of comment text entity recognition method and device based on word model
Zhang et al. HANSpeller++: A unified framework for Chinese spelling correction
CN107463711A (en) A kind of tag match method and device of data
CN102760121B (en) Dependence mapping method and system
CN102646091A (en) Dependence relationship labeling method, device and system
CN103678288A (en) Automatic proper noun translation method
CN106598957A (en) Data analysis method and system of translated sentence
CN106779455A (en) The methods of risk assessment and system of a kind of translation project
CN113157860A (en) Electric power equipment maintenance knowledge graph construction method based on small-scale data
CN104317783A (en) SRC calculation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170426
