CN106844354A - Web page word-capture Chinese-minority translation method and device - Google Patents
- Publication number
- CN106844354A
- Authority
- CN
- China
- Prior art keywords
- translation
- chinese
- word
- module
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Abstract
The web page word-capture Chinese-minority translation device of the invention combines machine translation with data retrieval. Text captured by the web page text acquisition module is first looked up in the Chinese-minority mutual-translation database module; if it is found, the corresponding translation is returned directly. Otherwise, the machine translation module is called to parse the captured content step by step, from paragraph to sentence to word, translate it, and display the final result. The user may then re-edit the translation result to provide a better translation. The invention is not limited to translating single words: it can translate whole sentences and whole paragraphs, which preserves the integrity of the translation result. Because machine translation is combined with data retrieval, the machine translation module need not be called every time, which can greatly improve translation speed. The translation re-editing module improves translation results, and the set of Chinese-minority translation pairs keeps growing as the number of uses increases.
Description
Technical Field
The invention relates to the field of computer application technology, and in particular to a web page word-capture Chinese-minority translation method and device that combine machine translation and data retrieval.
Background
With the development of the Internet, more and more knowledge is disseminated through web pages. China is a unified multi-ethnic country, and in some areas where ethnic minorities are concentrated there are still many minority compatriots who have difficulty using Chinese. Existing translation software mostly targets major languages such as Chinese and English and lacks translation functions for minority languages. In addition, some screen word-capture software, such as Kingsoft PowerWord, can only translate words and cannot translate at the chapter, paragraph, or sentence level, so users sometimes find it hard to understand the meaning of a whole paragraph or sentence. Obtaining specified content on a web page and translating it into the required minority language is therefore of practical significance. In recent years, natural language processing technology, and machine translation in particular, has developed steadily. Minority-language informatization has also made considerable progress and accumulated a certain amount of minority-language resources, which provides the language basis and technical support for Chinese-minority translation using machine translation.
Summary of the Invention
Addressing the practical need for minority-language informatization, the invention provides a web page word-capture Chinese-minority translation method and device that combine machine translation and data retrieval. The method obtains text from a Chinese web page and translates it level by level, from paragraph to sentence to word, combining machine translation with data retrieval. This effective integration of machine translation and data retrieval improves the speed and accuracy of Chinese-minority translation.
The invention is achieved through the following technical solution:
A web page word-capture Chinese-minority translation method combining machine translation and data retrieval comprises the following steps:
Step S1: Establish a language translation model, a decoder, a Chinese-minority font library, and a Chinese-minority input method;
Step S2: Establish a Chinese-minority bilingual parallel corpus, stored in one-to-one form;
Step S3: Establish a Chinese-minority bilingual comparison database, stored in one-to-one form;
Step S4: In non-body parts of the web page, such as navigation bars, menus, and titles, obtain the text content of the entire web page element; in the body of the web page, with the paragraph as the upper limit, identify and obtain the text content at the mouse position in maximum-length mode;
Step S5: Compare the obtained text content with the data in the Chinese-minority bilingual parallel corpus. If a translation pair matching the obtained text content is found, return the corresponding translation data; if not, have the decoder parse the obtained text content step by step into paragraphs, sentences, and words, compare the parsed units with the corresponding data in the Chinese-minority bilingual comparison database, and return the parsed data after comparison;
Step S6: Collate the returned translation data or parsed data using the language translation model, submit the collated translation result, call the Chinese-minority font library according to the translation language and encoding identifier, and display the final translation result;
Step S7: Re-edit the final translation result: the user may call the Chinese-minority input method to edit and modify the translation, and the obtained web page text and the modified translation are added to the Chinese-minority bilingual parallel corpus as a translation pair.
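The lookup-then-parse behaviour of step S5 can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `PARALLEL_CORPUS`, `COMPARISON_DB`, `split_sentences`, `split_words`, and `translate` are hypothetical, the target strings are placeholders rather than real translations, the greedy longest-match word segmentation is an assumed stand-in for the decoder (whose internals the source does not describe), and joining the pieces with spaces merely stands in for the collation performed by the language translation model in step S6.

```python
import re

# Hypothetical one-to-one stores; the "T[...]" values are placeholders, not real translations.
PARALLEL_CORPUS = {"欢迎访问": "T[欢迎访问]"}
COMPARISON_DB = {"你好。": "T[你好。]", "今天": "T[今天]", "天气": "T[天气]", "很好": "T[很好]"}


def split_sentences(paragraph):
    """Split a paragraph after sentence-final punctuation (requires Python 3.7+)."""
    parts = re.split(r"(?<=[。！？!?])", paragraph)
    return [p.strip() for p in parts if p.strip()]


def split_words(sentence, vocabulary):
    """Greedy longest-match segmentation against the database keys (an assumption)."""
    max_len = max(1, max((len(w) for w in vocabulary), default=1))
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            # Fall back to a single character when nothing longer matches.
            if size == 1 or piece in vocabulary:
                words.append(piece)
                i += size
                break
    return words


def translate(text):
    """Return (result, origin): exact corpus hit first, otherwise level-by-level parsing."""
    if text in PARALLEL_CORPUS:
        return PARALLEL_CORPUS[text], "corpus"
    pieces = []
    for sentence in split_sentences(text):
        if sentence in COMPARISON_DB:          # sentence-level match
            pieces.append(COMPARISON_DB[sentence])
            continue
        for word in split_words(sentence, COMPARISON_DB):  # word-level match
            pieces.append(COMPARISON_DB.get(word, word))   # unknown units pass through
    return " ".join(pieces), "parsed"


if __name__ == "__main__":
    print(translate("欢迎访问"))
    print(translate("你好。今天天气很好。"))
```

In this sketch an exact corpus hit short-circuits everything else, which is where the claimed speed benefit would come from; only unmatched text pays the cost of parsing.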
A web page word-capture Chinese-minority translation device combining machine translation and data retrieval comprises a web page text acquisition module, a Chinese-minority mutual-translation database module, a machine translation module, a display module, and a translation re-editing module. The Chinese-minority mutual-translation database module comprises a data retrieval module, a Chinese-minority bilingual parallel corpus, and a Chinese-minority bilingual comparison database; the machine translation module comprises a language translation model, a decoder, a Chinese-minority font library, and a Chinese-minority input method.
The web page word-capture Chinese-minority translation device provided by the invention combines machine translation with data retrieval. Text captured by the web page text acquisition module is first looked up in the Chinese-minority mutual-translation database module; if it is found, the corresponding translation is returned directly. Otherwise, the machine translation module is called to parse the captured content step by step, from paragraph to sentence to word, translate it, and display the final result, after which the user may re-edit the translation result to provide a better translation. The invention is not limited to translating single words: it can translate whole sentences and whole paragraphs, which preserves the integrity of the translation result. Because machine translation is combined with data retrieval, the machine translation module need not be called every time, which can greatly improve translation speed. The translation re-editing module improves translation results, and the set of Chinese-minority translation pairs keeps growing as the number of uses increases.
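The effect of re-editing on later lookups can be illustrated with a small sketch. The class `TranslationMemory`, its methods, and the `mt_calls` counter are hypothetical names introduced only for this example, and the lambda passed in is a placeholder for the machine translation module, not a real translator.

```python
class TranslationMemory:
    """Minimal sketch of a one-to-one pair store that grows through user edits."""

    def __init__(self, machine_translate):
        self.pairs = {}                      # source text -> translation (one-to-one)
        self.machine_translate = machine_translate
        self.mt_calls = 0                    # counts how often the fallback is used

    def translate(self, source):
        # A stored pair is returned directly, without calling machine translation.
        if source in self.pairs:
            return self.pairs[source]
        self.mt_calls += 1
        return self.machine_translate(source)

    def save_edit(self, source, edited_translation):
        # Storing the user's edit keeps the mapping one-to-one: a later edit replaces it.
        self.pairs[source] = edited_translation


if __name__ == "__main__":
    memory = TranslationMemory(lambda s: f"MT[{s}]")   # placeholder translator
    print(memory.translate("网页"))                     # falls back to machine translation
    memory.save_edit("网页", "EDITED[网页]")
    print(memory.translate("网页"), memory.mt_calls)    # served from the stored pair
```

Under these assumptions, each saved edit turns a future machine translation call into a plain lookup, which matches the claim that the pair store expands with use.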
Brief Description of the Drawings
Fig. 1 is a flow chart of the web page word-capture Chinese-minority translation method of the invention.
Fig. 2 is a structure diagram of the web page word-capture Chinese-minority translation device of the invention.
Detailed Description
The technical solution of the invention is described in detail below with reference to Fig. 1 and Fig. 2.
As shown in Fig. 1 and Fig. 2, the web page word-capture Chinese-minority translation device of the invention comprises a web page text acquisition module, a Chinese-minority mutual-translation database module, a machine translation module, a display module, and a translation re-editing module. The Chinese-minority mutual-translation database module comprises a data retrieval module, a Chinese-minority bilingual parallel corpus, and a Chinese-minority bilingual comparison database; the machine translation module comprises a language translation model, a decoder, a Chinese-minority font library, and a Chinese-minority input method. The translation pairs in the Chinese-minority bilingual parallel corpus and the Chinese-minority bilingual comparison database are stored in one-to-one form.
When a translation is needed, the web page text acquisition module is started. In non-body parts of the web page, such as navigation bars, menus, and titles, it obtains the text content of the entire web page element; in the body of the web page, with the paragraph as the upper limit, it identifies and obtains the text content at the mouse position in maximum-length mode. The obtained text content is then compared with the data in the Chinese-minority bilingual parallel corpus of the Chinese-minority mutual-translation database module. If a translation pair matching the obtained text content is found, the corresponding translation data is returned. If not, the decoder of the machine translation module parses the obtained text content step by step into paragraphs, sentences, and words, the parsed units are compared with the corresponding data in the Chinese-minority bilingual comparison database of the Chinese-minority mutual-translation database module, and the parsed data after comparison is returned. The language translation model of the machine translation module then collates the returned translation data or parsed data and submits the collated translation result. The display module calls the Chinese-minority font library according to the translation language and encoding identifier and displays the final translation result. The user can call the Chinese-minority input method of the machine translation module to edit and modify the translation, and the obtained web page text content and the modified translation are added as a translation pair to the Chinese-minority bilingual parallel corpus of the Chinese-minority mutual-translation database module.
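The text-acquisition rule described above (whole element outside the body, at most one paragraph inside it) might be modelled as in the sketch below. It is not the patented implementation: the `Element` type, the `NON_BODY_KINDS` set, the `capture_text` function, the use of newlines as paragraph boundaries, and the character offset standing in for the mouse position are all assumptions made for illustration. "Maximum-length mode" is interpreted here as taking the longest span around the mouse position that does not exceed the paragraph.

```python
from dataclasses import dataclass

# Hypothetical labels for non-body page regions named in the description.
NON_BODY_KINDS = {"nav", "menu", "title"}


@dataclass
class Element:
    kind: str   # "nav", "menu", "title", or "body"
    text: str   # text content; body paragraphs are assumed to be newline-separated


def capture_text(element, mouse_offset=0):
    """Return the text to translate for the element under the mouse."""
    if element.kind in NON_BODY_KINDS:
        # Outside the body, the whole element's text is taken.
        return element.text.strip()
    # In the body, take the paragraph containing the mouse offset (the upper limit).
    start = 0
    for paragraph in element.text.split("\n"):
        end = start + len(paragraph)
        if start <= mouse_offset <= end:
            return paragraph.strip()
        start = end + 1   # skip the newline separator
    return ""


if __name__ == "__main__":
    menu = Element("menu", " 首页 新闻 联系我们 ")
    body = Element("body", "第一段。\n第二段第一句。第二段第二句。")
    print(capture_text(menu))
    print(capture_text(body, mouse_offset=6))
```

Capping the capture at one paragraph keeps the unit handed to the lookup step small enough to match stored pairs while still covering whole sentences.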
The above is only a preferred embodiment of the invention. The scope of protection of the invention is not limited to the above embodiment; all technical solutions falling under the concept of the invention belong to its scope of protection. It should be noted that, for those of ordinary skill in the art, improvements and modifications made without departing from the principle of the invention shall also be regarded as falling within the scope of protection of the invention.
Claims (2)
1. A web page word-capture Chinese-minority translation method, characterized by comprising the following steps:
Step S1: Establish a language translation model, a decoder, a Chinese-minority font library, and a Chinese-minority input method;
Step S2: Establish a Chinese-minority bilingual parallel corpus, stored in one-to-one form;
Step S3: Establish a Chinese-minority bilingual comparison database, stored in one-to-one form;
Step S4: In non-body parts of the web page, such as navigation bars, menus, and titles, obtain the text content of the entire web page element; in the body of the web page, with the paragraph as the upper limit, identify and obtain the text content at the mouse position in maximum-length mode;
Step S5: Compare the obtained text content with the data in the Chinese-minority bilingual parallel corpus; if a translation pair matching the obtained text content is found, return the corresponding translation data; if not, have the decoder parse the obtained text content step by step into paragraphs, sentences, and words, compare the parsed units with the corresponding data in the Chinese-minority bilingual comparison database, and return the parsed data after comparison;
Step S6: Collate the returned translation data or parsed data using the language translation model, submit the collated translation result, call the Chinese-minority font library according to the translation language and encoding identifier, and display the final translation result;
Step S7: Re-edit the final translation result: the user may call the Chinese-minority input method to edit and modify the translation, and the obtained web page text and the modified translation are added to the Chinese-minority bilingual parallel corpus as a translation pair.
2. A translation device for the web page word-capture Chinese-minority translation method according to claim 1, characterized by comprising a web page text acquisition module, a Chinese-minority mutual-translation database module, a machine translation module, a display module, and a translation re-editing module, wherein the Chinese-minority mutual-translation database module comprises a data retrieval module, a Chinese-minority bilingual parallel corpus, and a Chinese-minority bilingual comparison database, and the machine translation module comprises a language translation model, a decoder, a Chinese-minority font library, and a Chinese-minority input method.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710019958.9A CN106844354A (en) | 2017-01-11 | 2017-01-11 | Web page word-capture Chinese-minority translation method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710019958.9A CN106844354A (en) | 2017-01-11 | 2017-01-11 | Web page word-capture Chinese-minority translation method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106844354A true CN106844354A (en) | 2017-06-13 |
Family
ID=59118115
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710019958.9A Pending CN106844354A (en) | 2017-01-11 | 2017-01-11 | Web page word-capture Chinese-minority translation method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106844354A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112487791A (en) * | 2020-11-27 | 2021-03-12 | 江苏省舜禹信息技术有限公司 | Multi-language hybrid intelligent translation method |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1661593A (en) * | 2004-02-24 | 2005-08-31 | 北京中专翻译有限公司 | Method for translating computer language and translation system |
CN102270198A (en) * | 2011-08-16 | 2011-12-07 | 上海交通大学出版社有限公司 | Computer assisted translation system |
CN102662933A (en) * | 2012-03-28 | 2012-09-12 | 成都优译信息技术有限公司 | Distributive intelligent translation method |
CN103020044A (en) * | 2012-12-03 | 2013-04-03 | 江苏乐买到网络科技有限公司 | Machine-aided webpage translation method and system thereof |
CN103631773A (en) * | 2013-12-16 | 2014-03-12 | 哈尔滨工业大学 | Statistical machine translation method based on field similarity measurement method |
CN103885939A (en) * | 2012-12-19 | 2014-06-25 | 新疆信息产业有限责任公司 | Uyghur-Chinese bi-directional translation memory system construction method |
Non-Patent Citations (1)
Title |
---|
Pan Xuequan (ed.): "《计算机辅助翻译教程》" (Computer-Aided Translation Course), 30 June 2016 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101197849B (en) | Method for commuting internet page into wireless application protocol page | |
CN102184189B (en) | Webpage core block determining method based on DOM (Document Object Model) node text density | |
KR101393794B1 (en) | Terminal and method for determining a type of input method editor | |
CN102065114A (en) | Method and device for mobile terminal to access webpage | |
US20130339840A1 (en) | System and method for logical chunking and restructuring websites | |
US20080172219A1 (en) | Foreign language translator in a document editor | |
Müller et al. | Multi-level annotation in MMAX | |
CN103064827A (en) | Method and device for extracting webpage content | |
Way et al. | On the Role of Translations in State‐of‐the‐Art Statistical Machine Translation | |
CN104142985B (en) | A kind of semi-automatic vertical reptile Core Generator and method | |
CN102467497A (en) | Method and system for text translation in verification program | |
CN102141868A (en) | Method for quickly operating information interaction page, input method system and browser plug-in | |
US9811505B2 (en) | Techniques to provide processing enhancements for a text editor in a computing environment | |
RU2579888C2 (en) | Universal presentation of text to support various formats of documents and text subsystem | |
Roudaki et al. | A classification of web browsing on mobile devices | |
CN106202066A (en) | The interpretation method of website and device | |
WO2013148351A1 (en) | System and method for analyzing an electronic documents | |
CN111831384A (en) | Language switching method and device, equipment and storage medium | |
CN110309457B (en) | Webpage data processing method, device, computer equipment and storage medium | |
US10198408B1 (en) | System and method for converting and importing web site content | |
CN103455572A (en) | Method and device for acquiring movie and television subjects from web pages | |
CN106844354A (en) | Web page word-capture Chinese-minority translation method and device | |
US9594737B2 (en) | Natural language-aided hypertext document authoring | |
CN101425087A (en) | Method and system for constructing dictionary | |
KR102095703B1 (en) | An apparatus, method and recording medium for Markup parsing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170613 |