CN106372065A - Method and system for developing multi-language website - Google Patents
Method and system for developing multi-language website
- Publication number
- CN106372065A CN201610958116.5A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/443—Optimisation
- G06F8/4441—Reducing the execution time required by the program code
- G06F8/4442—Reducing the number of cache misses; Data prefetching
Abstract
The invention relates to the technical field of natural language processing, and in particular to a method and system for developing a multi-language website. The method comprises: step a, developing the static web pages of the multi-language website; step b, calling a machine translation interface to translate the Chinese data dynamically added to the multi-language website into multiple languages; and step c, reading the translated data, and loading and rendering the dynamic web pages of the multi-language website according to the translated data. Machine translation is combined with manual correction, which greatly reduces translation errors and makes the displayed pages more accurate; choosing the UTF-8 encoding of Unicode avoids garbled characters when pages are rendered; and a caching mechanism for dynamic loading removes the resource consumption and loading delay caused by calling the machine translation interface on every real-time translated load, while also reducing manual intervention.
Description
Technical field
The present invention relates to the technical field of natural language processing, and in particular to a multi-language website development method and system.
Background art
With the rapid commercialization of the Internet, e-commerce websites have appeared in large numbers and market competition has grown increasingly fierce. In recent years e-commerce in China has developed quickly: its applications in every field keep expanding and deepening, transaction volumes repeatedly reach new highs, related industries flourish, the supporting systems are continuously optimized, and the drive and capacity for innovation keep growing.
It is well known that Uighur is a written language with a long history, and a large number of books, documents, and historical materials are written in it. They record a great deal of information about the Uighur people and their life, and their historical and cultural value is considerable. Uighur information processing technology is therefore closely tied to the future development of the Uighur language. As the cultural and technical level of the Uighur people rises, the number of people able to build Uighur web pages keeps increasing. Many individuals and groups have already built Uighur websites of various types to publish information. Like ordinary Chinese websites, these sites offer functions such as news browsing and information download. However, because the Uighur software used to build them relies on different encodings, Uighur web pages have remained fragmented and mutually incompatible: most Uighur web content cannot be shared, and converting between the different encodings consumes a great deal of working time and research funding.
The Xinjiang Uygur Autonomous Region is home to many ethnic groups and many languages. Online shopping has become a popular trend, and the success of Taobao confirms that this trend will continue. Within Xinjiang, however, most shopping platforms are ordinary Chinese-language websites, which are difficult to use for the many Uighur residents who are not familiar with Chinese. A standardized Uighur-Chinese bilingual shopping platform is therefore urgently needed. Developing a standardized Uighur-language e-commerce platform is not simply a matter of rendering static web pages in Uighur: a complete shopping mechanism needs real-time dynamic management, with data being created, deleted, updated, and queried dynamically. Manual translation cannot keep up with such a volume of changing data, so machine translation is needed to help the platform handle these dynamic changes.
Machine translation is the process of using a computer to convert one natural language into another. Over the course of its development, many machine translation systems based on different principles have appeared. By method, they can be roughly divided into four classes: rule-based machine translation, example-based machine translation, statistical machine translation, and hybrid machine translation. Each has its own strengths. For example, a rule-based system is good at translating well-formed sentences and produces higher-quality output, whereas a statistical system is general-purpose and learns linguistic knowledge automatically from a corpus.
To address Uighur-Chinese machine translation, Chinese Patent Application No. 201310740830.3 discloses a Uighur translation engine method applied to self-service electricity bill payment terminals. In that patent the user selects a display language, such as Chinese or Uighur, on the terminal. If Chinese is selected, no machine translation is needed; if Uighur is selected, the translation engine is started, translates the information in the database, and displays it on the terminal interface, which greatly reduces the cost and time of manual Chinese-Uighur translation. The shortcoming of that patent is that machine translation is performed in real time whenever Uighur is selected: although the cost and time of manual translation fall sharply, there is no caching mechanism and no Uighur text stored in the database in advance to reduce the delay when a page loads.
Another application, Chinese Patent Application No. 201310197369.1, discloses an enterprise integrated information management system. The client submits an information management request to an internationalization synchronization module, and the request contains the selected language and application mode. The internationalization synchronization module receives the request, manages it by language, and passes it to a unified information management module. The unified information management module evaluates and classifies the different pieces of information in the language-managed request and passes the classified information to a history module, which receives it and sends it to the client. That patent solves the problem of synchronously updating pages in different language environments, so that users can follow the details of an enterprise's personnel, wages, archives, tasks, and property. Every user operation is also saved synchronously in the history module, so it can be restored and reviewed at any time without obstruction. Its shortcoming is that the internationalization synchronization module manages languages module by module: when the client updates a large amount of data, every module must be updated synchronously in real time. On the one hand there is no preprocessing, so returned data suffers a refresh delay; on the other hand, data updates may contain errors, and there is no manual correction step.
In summary, existing Uighur-Chinese bilingual machine translation techniques use a fairly uniform translation mode, generally dynamic real-time machine translation with no caching mechanism or data preprocessing, so that in B/C mode web page rendering may suffer from garbled characters and loading delays.
Summary of the invention
The invention provides a multi-language website development method and system, intended to solve, at least to some extent, one of the above technical problems in the prior art.
To solve the above problems, the technical solution of the present invention is a multi-language website development method, comprising:
Step a: developing the static web pages of a multi-language website;
Step b: calling a machine translation interface to perform multilingual translation on the Chinese data dynamically added to the multi-language website;
Step c: reading the translation data, and loading and rendering the dynamic web pages of the multi-language website according to the translation data.
The technical solution adopted by the embodiment of the present invention further includes: in step a, the multi-language website includes at least Chinese, Uighur, and/or Kazakh; developing the static web pages of the multi-language website specifically means developing them in the UTF-8 encoding of the Unicode character set.
The technical solution adopted by the embodiment of the present invention further includes: in step b, performing multilingual translation on the Chinese data dynamically added to the multi-language website specifically includes:
Step b1: encapsulating the translation interface, taking the dynamically added Chinese data out of the website database in batches, storing the Chinese data in a document, reading the Chinese data in the document line by line, and calling the machine translation interface to translate each line automatically as it is read;
Step b2: manually correcting the stored translation data;
Step b3: storing the manually corrected translation data in the website database in the corresponding format.
The technical solution adopted by the embodiment of the present invention further includes: in step c, loading and rendering the dynamic web pages of the multi-language website according to the translation data specifically includes: when the translation data is stored, converting the code of each Uighur or Kazakh character into a four-digit hexadecimal character string; and when the web page is rendered, converting the Uighur or Kazakh text read from the website database back again.
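The conversion described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `to_hex4` and `from_hex4` are hypothetical, and it assumes that "the code of each character" means the UTF-16 code unit, which yields exactly four hexadecimal digits for every Uighur and Kazakh Arabic-script letter.

```python
def to_hex4(text: str) -> str:
    """Encode text as four hexadecimal digits per UTF-16 code unit."""
    # Characters in the Basic Multilingual Plane (including Arabic-script
    # letters) occupy one 16-bit unit, so each becomes exactly four digits.
    # Characters outside it would take two units, i.e. eight digits.
    return text.encode("utf-16-be").hex()


def from_hex4(hex_text: str) -> str:
    """Reverse conversion, applied when the page is rendered."""
    return bytes.fromhex(hex_text).decode("utf-16-be")


# A few Arabic-script letters, written as escapes to keep the source ASCII.
sample = "\u0626\u06c7\u064a\u063a\u06c7\u0631"
stored = to_hex4(sample)      # plain ASCII, safe under any driver encoding
restored = from_hex4(stored)  # identical to the original text
print(stored)
```

Because the stored value contains only the characters 0-9 and a-f, it survives a database driver that silently re-encodes strings, at the cost of doubling or quadrupling the storage size and making the column unsearchable as text.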
The technical solution adopted by the embodiment of the present invention further includes: step c also includes caching the loaded web page; the web page caching includes file caching and memory caching.
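A two-level cache of this kind might look like the sketch below. The patent does not specify the cache layout, so the class name `PageCache`, the JSON file format, and the `load` callback (standing in for "read translation data from the database and render") are all assumptions made for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class PageCache:
    """Memory cache in front of a file cache in front of a loader."""

    def __init__(self, cache_dir):
        self.memory = {}
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        # Hash the key so that any page identifier maps to a safe file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.json"

    def get(self, key, load):
        # 1. Memory cache: fastest, lost when the process restarts.
        if key in self.memory:
            return self.memory[key]
        # 2. File cache: survives restarts.
        path = self._path(key)
        if path.exists():
            value = json.loads(path.read_text(encoding="utf-8"))
        else:
            # 3. Miss on both levels: call the expensive loader once.
            value = load(key)
            path.write_text(json.dumps(value, ensure_ascii=False), encoding="utf-8")
        self.memory[key] = value
        return value


cache = PageCache(tempfile.mkdtemp())
page = cache.get("ug/home", lambda key: {"html": f"rendered page for {key}"})
```

A real deployment would also need an invalidation rule, for example clearing an entry when the corrected translation for that page is written back to the database.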
Another technical solution adopted by the embodiment of the present invention is a multi-language website development system, comprising:
Static web page development module: used to develop the static web pages of the multi-language website;
Machine translation module: used to call the machine translation interface and perform multilingual translation on the Chinese data dynamically added to the multi-language website;
Web page rendering module: used to read the translation data, and to load and render the dynamic web pages of the multi-language website according to the translation data.
The technical solution adopted by the embodiment of the present invention further includes: the multi-language website includes at least Chinese, Uighur, and/or Kazakh; the static web page development module develops the static web pages of the multi-language website in the UTF-8 encoding of the Unicode character set.
The technical solution adopted by the embodiment of the present invention further includes a website database module, which stores the Chinese data dynamically added to the multi-language website; the machine translation module further includes:
Translation unit: used to encapsulate the translation interface, take the dynamically added Chinese data out of the website database module in batches, store the Chinese data in a document, read the Chinese data in the document line by line, and call the machine translation interface to translate each line automatically as it is read;
Correction unit: used to manually correct the stored translation data;
Storage unit: used to store the manually corrected translation data in the website database module in the corresponding format.
The technical solution adopted by the embodiment of the present invention further includes: the web page rendering module loading and rendering the dynamic web pages of the multi-language website according to the translation data specifically includes: when the translation data is stored, converting the code of each Uighur or Kazakh character into a four-digit hexadecimal character string; and when the web page is rendered, converting the Uighur or Kazakh text read from the website database module back again.
The technical solution adopted by the embodiment of the present invention further includes a data cache module, which caches the loaded web page; the web page caching includes file caching and memory caching.
Compared with the prior art, the embodiment of the present invention has the following beneficial effects. The multi-language website development method and system combine template-based development of static web pages with machine translation of dynamic data, which greatly reduces the cost and time of manual translation. Machine translation is combined with manual correction, which greatly reduces translation errors and makes the displayed pages more accurate. Choosing the UTF-8 encoding of Unicode avoids garbled characters when pages are rendered. A caching mechanism for dynamic loading removes the resource consumption and loading delay caused by calling the machine translation interface again on every real-time translated load, while also reducing manual intervention.
Brief description of the drawings
Fig. 1 is a flow chart of the multi-language website development method of the embodiment of the present invention;
Fig. 2 is an overall framework diagram of the multi-language website of the embodiment of the present invention;
Fig. 3 is a training flow chart of statistical machine translation;
Fig. 4 is a flow chart of the manually assisted multilingual translation of the embodiment of the present invention;
Fig. 5 is a structural diagram of the multi-language website development system of the embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it.
Refer to Fig. 1 and Fig. 2. Fig. 1 is a flow chart of the multi-language website development method of the embodiment of the present invention, and Fig. 2 is an overall framework diagram of the multi-language website of the embodiment. The multi-language website development method of the embodiment of the present invention includes the following steps:
Step 10: develop the template themes of the multi-language website, and develop its static web pages in the UTF-8 encoding of the Unicode character set;
In step 10, the multi-language website of the embodiment of the present invention includes at least Chinese, Uighur, Kazakh, and so on. In developing Uighur and Kazakh web pages, unified encoding is a key technique. Uighur and Kazakh belong to the Altaic family, and both scripts borrow the Arabic letters and some Persian letters; Uighur has 32 letters and Kazakh has 33. Both are cursive scripts: depending on its position in a word, each letter takes one of four forms (isolated, initial, medial, or final), and the position of the character in the word determines which form is displayed. Uighur and Kazakh text therefore has some special properties in input and editing:
(1) The writing direction is right to left, with lines running top to bottom. During input the cursor moves in the opposite direction to Chinese and English, which makes mixed editing of Uighur or Kazakh with Chinese or English technically more complex.
(2) Kazakh has 33 letters, of which 9 are vowels and 24 are consonants. Kazakh vowel harmony is fairly strict and consonant assimilation is common. Uighur has 32 letters and more than 120 character forms. Each letter has four different written forms: the initial form, whose tail joins the next letter; the medial form, which joins the adjacent letters on both sides; the final form, whose head joins the previous letter; and the isolated form, which joins neither neighbor. Which form is used depends on the position of the letter in the word.
(3) Uighur punctuation marks such as the comma and question mark face the opposite direction to their Chinese and English counterparts.
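The four positional forms and the right-to-left direction are recorded in the Unicode Character Database and can be inspected with Python's standard `unicodedata` module. The sketch below uses the letter beh (U+0628) purely as an example of an Arabic-script letter; the helper name `form_of` is hypothetical, and a modern page would normally store the base letter and let the text-shaping engine select the form.

```python
import unicodedata

BEH = "\u0628"  # base letter as it is normally stored
# Arabic Presentation Forms-B code points for the same letter.
PRESENTATION_FORMS = ["\ufe8f", "\ufe90", "\ufe91", "\ufe92"]


def form_of(ch):
    """Return (positional form, base letter) for a presentation-form character."""
    # The decomposition field looks like "<initial> 0628".
    tag, code = unicodedata.decomposition(ch).split()
    return tag.strip("<>"), chr(int(code, 16))


for ch in PRESENTATION_FORMS:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), form_of(ch)[0])

# "AL" marks a right-to-left Arabic letter in the bidirectional algorithm.
direction = unicodedata.bidirectional(BEH)
```

Compatibility normalization (`unicodedata.normalize("NFKC", ...)`) maps each presentation form back to the base letter, which is one way to unify text produced by tools that stored the positional forms directly.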
Many different character sets are in use in the computer field, and users browsing web pages in another language often see garbled characters because the character sets differ. Chinese website systems generally use the Simplified Chinese (GB2312) character set, which does not support Uighur or Kazakh. A website offering Uighur, Kazakh, and Chinese versions should therefore choose a character set that supports all three languages. The Unicode character set assigns every character of every language a single unified code represented in two bytes (or in some cases four bytes), which meets the need for cross-language, cross-platform text conversion and processing. However, this fixed-width representation is not compatible with the ISO 8859-1 character set and takes more space (even an English letter needs two bytes), which led to the UTF encodings. There are two of them, UTF-8 and UTF-16. UTF-16 follows the same coding rule as Unicode itself, whereas UTF-8 is different: it defines a "range rule" that stays compatible with the single-byte ASCII portion of ISO 8859-1 while still being able to represent the characters of all languages. UTF-8 is therefore the best choice for designing and developing a website with Uighur, Kazakh, and Chinese versions. With UTF-8, the static web pages of the multi-language website display normally with no garbled characters, and when the client selects the Uighur version the system loads the Uighur template theme.
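The encoding trade-offs discussed above can be checked directly with Python's built-in codecs. This is only an illustrative sketch: the helper names `byte_lengths` and `encodable` are hypothetical, and the three sample characters were picked arbitrarily.

```python
SAMPLES = {
    "latin": "A",
    "uighur": "\u0626",   # an Arabic-script letter used in Uighur
    "chinese": "\u4e2d",  # a common Chinese character
}


def byte_lengths(ch):
    """Return (UTF-8 length, UTF-16 length) in bytes for one character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


def encodable(ch, codec):
    """Report whether the codec can represent the character at all."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for name, ch in SAMPLES.items():
    utf8_len, utf16_len = byte_lengths(ch)
    print(name, utf8_len, utf16_len,
          encodable(ch, "gb2312"), encodable(ch, "iso-8859-1"))
```

UTF-8 is variable-width: the Latin letter takes one byte, the Arabic-script letter two, and the Chinese character three, while UTF-16 uses two bytes for each of them. GB2312 can encode the Latin and Chinese samples but not the Arabic-script letter, and ISO 8859-1 can encode only the Latin one, which is why neither can serve a site that mixes all three scripts.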
Step 20: the client selects a language version mode, and the static web page of the corresponding template theme is displayed according to that mode;
Step 30: encapsulate the machine translation interface as a web API (webapi), call the machine translation interface, and perform multilingual translation on the Chinese data dynamically added to the website database;
In step 30, the template development of the multi-language website in the embodiment of the present invention covers both the development of static web pages and the calls made to the machine translation interface when dynamic data is loaded. Developing a multilingual e-commerce platform, for example in Uighur and Kazakh, requires more than static web pages: content added dynamically in batches must also be translated into each language version. The embodiment encapsulates the machine translation interface and calls it so that multilingual translation of batch additions is handled efficiently. After the encoding of the static web pages has been handled, the dynamically added Chinese data is stored in the website database, and a Uighur field or Kazakh field corresponding to each piece of Chinese data is added to the database to hold the translated Uighur or Kazakh text for rendering and loading in dynamic web pages.
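One way to picture the paired fields is the small schema sketch below, using SQLite from the Python standard library. The patent names neither a database product nor a schema, so the table name `product`, the column names `name_zh`, `name_ug`, and `name_kk`, and the helper functions are all hypothetical, and the stored translations are placeholders rather than real Uighur or Kazakh text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        id      INTEGER PRIMARY KEY,
        name_zh TEXT NOT NULL,  -- Chinese text entered by the site operator
        name_ug TEXT,           -- Uighur translation, filled in later
        name_kk TEXT            -- Kazakh translation, filled in later
    )
    """
)
conn.execute("INSERT INTO product (name_zh) VALUES (?)", ("手机",))


def pending(conn):
    """Rows whose translation fields are still empty."""
    return conn.execute(
        "SELECT id, name_zh FROM product "
        "WHERE name_ug IS NULL OR name_kk IS NULL ORDER BY id"
    ).fetchall()


def store_translation(conn, row_id, uighur, kazakh):
    """Write corrected translations back into the matching fields."""
    conn.execute(
        "UPDATE product SET name_ug = ?, name_kk = ? WHERE id = ?",
        (uighur, kazakh, row_id),
    )


todo = pending(conn)
store_translation(conn, todo[0][0], "<uighur text>", "<kazakh text>")
```

Keeping the source text and its translations in one row means the rendering step can select the column that matches the language version chosen by the client without calling the translation interface again.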
At present, machine translation approaches include the rule-based method and the statistical method.
First, the rule-based method: the text is analyzed according to linguistic rules and then translated by a computer program. Most commercial machine translation systems use the rule-based method. A rule-based system works in three successive stages: analysis, transfer, and generation. According to the complexity of these stages, it is divided into three levels:
1. Direct translation: word-for-word translation;
2. Transfer translation: the translation process refers to and takes into account the morphology, syntax, and semantics of the source text. Because the range of information sources is very broad and the grammar rules are numerous and may contradict one another, transfer translation is complex and error-prone;
3. Interlingua translation.
Second, the statistical method (SMT): Fig. 3 shows the training flow of statistical machine translation. A large parallel corpus is analyzed statistically to build a statistical translation model (vocabulary, alignment, or language patterns), and this model is then used for translation, generally by choosing the entry with the highest probability as the translation; the probability calculation is based on Bayes' theorem. Suppose an English sentence a is to be translated into Chinese: every Chinese sentence b is a possible, or impossible, candidate translation of a. Pr(a) is the probability that an expression like a occurs, and Pr(b|a) is the probability that a is translated as b. Finding the maximum over these two parameters narrows the search range for the sentence and its corresponding translation, and so identifies the most suitable translation. By the level of text analysis, SMT is divided into two kinds, word-based SMT and phrase-based SMT. The latter is in common use at present, and Google uses it. The text to be translated is automatically divided into word sequences of fixed length, and each word sequence is analyzed statistically against the corpus to find the translation with the highest probability.
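The selection rule can be illustrated with a toy example. The sketch assumes the standard noisy-channel reading of Bayes' theorem, in which the best translation b of a maximizes Pr(b) * Pr(a|b); Pr(a) is the same for every candidate and can be dropped. This formulation is a common textbook one and is not spelled out in the text above, the function name `best_translation` is hypothetical, and every probability below is invented purely for illustration.

```python
def best_translation(candidates):
    """Pick the candidate b that maximizes Pr(b) * Pr(a | b).

    `candidates` maps each candidate translation b to a pair
    (Pr(b), Pr(a | b)): the language-model probability of b and the
    probability that b would be rendered as the source sentence a.
    """
    return max(candidates, key=lambda b: candidates[b][0] * candidates[b][1])


# Invented numbers for three hypothetical candidate translations.
toy_candidates = {
    "candidate_1": (0.020, 0.30),  # product 0.0060
    "candidate_2": (0.050, 0.20),  # product 0.0100
    "candidate_3": (0.001, 0.90),  # product 0.0009
}
chosen = best_translation(toy_candidates)
print(chosen)
```

The third candidate matches the source most closely but is very unlikely as a sentence in the target language, so the product favors the second candidate; this balance between fidelity and fluency is the point of combining the two probabilities.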
Specifically, see also Fig. 4, a flow chart of the manually assisted multilingual translation of the embodiment of the present invention. Multilingual translation of the Chinese data dynamically added to the website database includes the following steps:
Step 31: take the dynamically added Chinese data (for example, commodity data) out of the website database in batches, store it in a document, read the Chinese data in the document line by line, call the machine translation interface for each line read to translate it automatically into Uighur or Kazakh, and store the translated Uighur or Kazakh data in a result document in a Unicode encoding;
In step 31, the translation interface is first encapsulated so that it accepts character strings one at a time. The Chinese data taken from the website database is stored in a document and read line by line, with the translation interface called for each line in turn; the returned list of results is stored in the result document and then inserted, in order, into the corresponding fields of the website database. This is an automated process that runs periodically, and the character encoding must be kept uniform throughout data transfer and every call to the translation interface.
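The line-by-line loop of step 31 might be sketched as follows. The real machine translation interface is not described in enough detail to call it, so `fake_translate` is a hypothetical stand-in that only tags its input, and the file names and the helper `translate_file` are likewise assumptions. Both files are opened explicitly as UTF-8 to keep the encoding uniform.

```python
import tempfile
from pathlib import Path


def translate_file(source_path, result_path, translate):
    """Translate a document line by line and return the number of lines."""
    count = 0
    with open(source_path, encoding="utf-8") as src, \
            open(result_path, "w", encoding="utf-8") as dst:
        for line in src:
            text = line.rstrip("\n")
            # Keep one output line per input line so that the results can be
            # inserted back into the database in the original order.
            dst.write((translate(text) if text else "") + "\n")
            count += 1
    return count


def fake_translate(text):
    """Hypothetical stand-in for the encapsulated translation interface."""
    return f"[ug] {text}"


workdir = Path(tempfile.mkdtemp())
source = workdir / "chinese.txt"
result = workdir / "result.txt"
source.write_text("手机\n电脑\n", encoding="utf-8")
translated = translate_file(source, result, fake_translate)
```

Because the output keeps the input order, row identifiers exported alongside the Chinese text can be zipped with the result lines when the corrected translations are written back.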
Step 32: manually correct the automatically translated Uighur or Kazak data;
In step 32, because the Uighur and Kazak dictionaries are limited in size, not every Chinese phrase can be translated accurately. To improve translation accuracy, the present invention has a human correct the small number of errors in the translation process, which greatly reduces translation errors and makes the displayed webpage more accurate.
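One simple way to picture the manual correction of step 32 is as an overlay of reviewer fixes on top of the machine output. The sketch below assumes records are keyed by an integer id; that keying and the function name `apply_corrections` are illustrative assumptions, not details taken from the source.

```python
from typing import Dict, List, Tuple


def apply_corrections(
    machine_output: Dict[int, str],
    corrections: Dict[int, str],
) -> Tuple[Dict[int, str], List[int]]:
    """Overlay reviewer corrections on machine output, keyed by record id."""
    final = dict(machine_output)
    corrected_ids = []
    for record_id, fixed_text in corrections.items():
        # Ignore corrections for unknown records or ones that change nothing.
        if record_id in final and final[record_id] != fixed_text:
            final[record_id] = fixed_text
            corrected_ids.append(record_id)
    return final, sorted(corrected_ids)
```

Returning the list of corrected ids makes it easy to see how many records the reviewer actually had to change.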
Step 33: read the manually corrected Uighur or Kazak data in the corresponding format and store it in the corresponding fields of the website database; the whole workflow is processed automatically on a periodic basis, completing the two-way rendering process for all static and dynamic content.
Step 40: read the translation data from the website database, load and render the dynamic webpage of the corresponding template theme according to the translation data, and cache the loaded webpage;
In step 40, because of the limited support that computers offer for Uighur, rendering a Uighur website raises the problem of converting, on input and output, between Unicode and the encoding formats supported by the operating system, the browser, and the website database. Almost all website database drivers by default use the ISO-8859-1 encoding format when transferring data between the program and the website database. Consequently, when the website platform stores Uighur data in the website database, the database driver converts the stored Unicode data into ISO-8859-1 format, and when the webpage is rendered, the Uighur data read back from the website database appears garbled. To solve the garbling caused by the incompatible ways in which Uighur, Kazak, and Chinese are read, written, and stored, the embodiment of the present invention proposes a code conversion method for Uighur and Kazak: when translation data is stored, the code of each Uighur or Kazak character is converted into a four-digit hexadecimal string (for example, after code conversion: "062a 0648 064a"); when the webpage is rendered, the Uighur or Kazak data read from the website database is converted once more, so that no garbling occurs.
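The code conversion described above can be sketched in a few lines. The function names `to_hex_codes` and `from_hex_codes` and the space separator are assumptions made for illustration; the source only states that each character is converted to a four-digit hexadecimal string on storage and converted back on rendering.

```python
def to_hex_codes(text: str) -> str:
    """Encode each character as a four-digit lowercase hex code point."""
    for ch in text:
        if ord(ch) > 0xFFFF:
            # Four hex digits cover only U+0000 to U+FFFF.
            raise ValueError("character outside the four-hex-digit range")
    return " ".join(format(ord(ch), "04x") for ch in text)


def from_hex_codes(codes: str) -> str:
    """Decode a space-separated string of hex code points back to text."""
    return "".join(chr(int(code, 16)) for code in codes.split())
```

Because the stored form contains only ASCII digits, letters, and spaces, it passes unchanged through a driver that assumes ISO-8859-1, which is the point of the conversion.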
Caching generally includes the following:
(1) Data caching: a PHP caching mechanism for website database queries. On every page access, the system first checks whether the corresponding cached data exists. If it does not, the system connects to the database, fetches the data, serializes the query result, and saves it to a file; afterwards, the same query result is obtained directly from the cache table or file.
(2) Page caching: on every page access, the system first checks whether the corresponding cached page file exists. If it does not, the system connects to the database, fetches the data, displays the page, and at the same time generates the cached page file, which then serves the next access (template engines and some PHP caching classes commonly found online usually provide this function).
(3) Time-triggered caching: the system checks whether the file exists and whether its timestamp is still within the configured expiry time. If the file's modification timestamp is greater than the current timestamp minus the expiry time, the cache is used; otherwise the cache is updated.
(4) Content-triggered caching: when data is inserted or updated, the PHP cache is forcibly updated.
(5) Static caching: static generation, that is, directly generating text files such as HTML or XML and regenerating them whenever there is an update; it suits pages that rarely change.
(6) Memory caching: Memcached is a high-performance distributed memory object caching system used to reduce database load in dynamic applications and improve access speed.
(7) PHP caching.
(8) MySQL caching.
(9) Web caching based on a reverse proxy.
(10) DNS round-robin.
The caching in the embodiment of the present invention mainly includes file caching and memory caching. The main purpose of caching is to reduce the load on the database and on PHP computation, to reduce the delay incurred when machine translation of website database data is invoked during webpage rendering, and to solve the resource consumption caused by having to call the machine translation interface again on every real-time translation load; it also reduces manual intervention. Queried data is stored directly in the cache, so the website database need not be queried repeatedly and the load on MySQL is eased. For PHP computation, the benefit is that, for example, the result of a complex recursive operation can be cached so that the CPU is not wasted on repeating the complex computation every time. During caching, the encoding of Uighur still follows the Unicode encoding rule described above.
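The time-triggered file cache (items (1) and (3) above) can be sketched as follows. The source describes a PHP mechanism; this Python version is only an illustration, and the names `cached_query`, `cache_file`, and `ttl_seconds`, as well as the choice of JSON for serialization, are assumptions.

```python
import json
import time
from pathlib import Path
from typing import Any, Callable


def cached_query(
    cache_file: Path,
    ttl_seconds: float,
    run_query: Callable[[], Any],
) -> Any:
    """Return cached data while the file is fresh; otherwise refresh it."""
    if cache_file.exists():
        # Fresh means: modification time is later than (now - expiry time).
        if cache_file.stat().st_mtime > time.time() - ttl_seconds:
            return json.loads(cache_file.read_text(encoding="utf-8"))
    data = run_query()
    cache_file.write_text(json.dumps(data, ensure_ascii=False), encoding="utf-8")
    return data
```

The freshness test is the same comparison the text states: the cache is used only when the file's modification timestamp exceeds the current timestamp minus the expiry time.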
The embodiment of the present invention is not limited to solving the encoding compatibility problem between Uighur and Chinese or between Kazak and Chinese; a similar encoding method can likewise be used between Kirgiz and Chinese, and the whole process of calling the machine translation interface on website database data with human participation is equally applicable to other multilingual work.
Refer to Fig. 5, a structural diagram of the multi-language website development system of the embodiment of the present invention. The multi-language website development system of the embodiment of the present invention includes a static webpage development module, a static webpage display module, a website database module, a machine translation module, a webpage rendering module, and a data cache module;
The static webpage development module is used to develop the multi-language website template themes, carrying out static webpage development for the multi-language website in the UTF-8 encoding format of the Unicode character set. The multi-language website in the embodiment of the present invention includes at least Chinese, Uighur, and Kazak. Unified encoding is a key technology in developing Uighur and Kazak webpages. Uighur and Kazak belong to the Altaic language family, and their scripts both borrow Arabic letters and some Persian letters; Uighur has 32 letters and Kazak has 33. The Uighur and Kazak scripts are cursive scripts: each letter takes a different shape depending on its position in the word, with four forms of presentation (isolated, initial, medial, and final), and when writing, the form displayed is determined by the character's position in the word. Uighur and Kazak characters therefore have some particularities in input and editing, specifically: (1) text is written from right to left with lines running from top to bottom, so the cursor moves in the opposite direction from Chinese and English during input, which makes processing more complex when Uighur or Kazak is edited together with Chinese or English; (2) Kazak has 33 letters, of which 9 are vowels and 24 are consonants; Kazak vowel harmony is quite regular, and consonant assimilation is common. Uighur consists of 32 letters with more than 120 character forms; each letter has four different written forms: an initial form joined at its tail to the next letter, a medial form joined at both ends to adjacent letters, a final form joined at its head to the previous letter, and an isolated form joined to neither neighbor; which form is used depends on the letter's position in the word; (3) Uighur punctuation marks, such as the comma and question mark, face the opposite direction from their Chinese and English counterparts.
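The four positional forms and the right-to-left direction can be observed in the Unicode character database. The sketch below uses Python's standard `unicodedata` module; the Arabic letter BEH (U+0628) and its four presentation-form code points are chosen here purely as a sample and are not taken from the source.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, used only as a sample letter
# Presentation-form code points for BEH: isolated, final, initial, medial.
PRESENTATION_FORMS = ["\ufe8f", "\ufe90", "\ufe91", "\ufe92"]


def describe_forms(forms):
    """Map each presentation-form character to (form tag, base letter)."""
    result = []
    for ch in forms:
        # The decomposition looks like "<initial> 0628".
        tag, base_hex = unicodedata.decomposition(ch).split()
        result.append((tag.strip("<>"), chr(int(base_hex, 16))))
    return result


FORMS = describe_forms(PRESENTATION_FORMS)
# Bidirectional class "AL" marks a right-to-left Arabic-script letter.
DIRECTION = unicodedata.bidirectional(BEH)
```

In practice text is stored using the base letters only, and the rendering engine selects the positional form from context, which is why a single stored code point can appear in four shapes on screen.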
Many different character sets exist in the computing field, and users of different languages often encounter garbled text when browsing webpages in other languages because the character sets in use differ. Chinese website systems generally use the Simplified Chinese (GB2312) character set, which does not support Uighur or Kazak. A website offering Uighur, Kazak, and Chinese versions should therefore choose a character set that supports all three languages. The Unicode character set assigns every character of every language a unified and unique code represented by two bytes (in some cases four), which meets the needs of cross-language, cross-platform character encoding, conversion, and processing. However, because the Unicode character set is incompatible with the ISO-8859-1 character set and takes up a lot of space (even an English letter needs two bytes in Unicode), the UTF encodings were created. There are two kinds, UTF-8 and UTF-16. The encoding rule of UTF-16 is consistent with Unicode itself, whereas UTF-8 differs: it defines a "range rule" that stays compatible with ISO-8859-1 encoding to the greatest possible extent while still being able to represent the characters of all languages. UTF-8 is therefore the best choice for designing and developing Uighur, Kazak, and Chinese multi-language websites. With the UTF-8 encoding format, the static webpages of the multi-language website display normally without garbling; when the client selects the Uighur version, the system loads the Uighur template theme.
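The space and compatibility trade-offs discussed above can be checked directly with Python's built-in codecs. The three sample characters and the helper names `byte_lengths` and `fits_iso_8859_1` are illustrative choices, not part of the source.

```python
SAMPLES = {
    "latin": "A",          # U+0041
    "uighur": "\u062a",    # an Arabic-script letter
    "chinese": "\u4e2d",   # a CJK ideograph
}


def byte_lengths(text: str) -> dict:
    """Return the encoded size of text in UTF-8 and UTF-16 (no byte order mark)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


def fits_iso_8859_1(text: str) -> bool:
    """Report whether text can be represented in ISO-8859-1 at all."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


LENGTHS = {name: byte_lengths(ch) for name, ch in SAMPLES.items()}
```

For these samples, UTF-8 uses one byte for the Latin letter, two for the Arabic-script letter, and three for the Chinese character, while UTF-16 uses two bytes for each. The single-byte UTF-8 form of the Latin letter is identical to its ASCII byte, which is the compatibility property the text relies on, and the Arabic-script letter cannot be encoded in ISO-8859-1 at all.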
The static webpage display module is used to display the static webpage of the corresponding template theme according to the language version selected by the client;
The website database module is used to store the Chinese data dynamically added to the multi-language website;
The machine translation module is used to encapsulate the machine translation interface as a web API and call it to perform multilingual translation on the Chinese data dynamically added to the website database module. Multi-language website template development in the embodiment of the present invention comprises static webpage development and the process by which dynamic data loading calls the machine translation interface. Developing a multilingual e-commerce platform for Uighur, Kazak, and other languages requires not only developing static webpages; content dynamically added in batches must likewise be translated into the multiple language versions. The embodiment of the present invention encapsulates the machine translation interface and calls it when content is added in batches, solving multilingual translation efficiently. After the encoding preprocessing of the static webpages, dynamically added Chinese data is stored in the database module, and a Uighur field or Kazak field corresponding to the Chinese data is added in the database module to store the translated Uighur or Kazak text for loading when dynamic webpages are rendered.
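The idea of pairing each Chinese field with a Uighur field and a Kazak field can be sketched with an in-memory SQLite database. The source does not give a schema, so the table name `product`, the column names `name_zh`, `name_ug`, and `name_kk`, and the helper functions below are all hypothetical.

```python
import sqlite3

TRANSLATION_COLUMNS = ("name_ug", "name_kk")


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a table with one Chinese column and two translation columns."""
    conn.execute(
        """
        CREATE TABLE product (
            id      INTEGER PRIMARY KEY,
            name_zh TEXT NOT NULL,  -- dynamically added Chinese data
            name_ug TEXT,           -- Uighur translation, filled in later
            name_kk TEXT            -- Kazak translation, filled in later
        )
        """
    )


def pending_rows(conn: sqlite3.Connection, column: str) -> list:
    """Return (id, Chinese text) for rows whose translation column is empty."""
    if column not in TRANSLATION_COLUMNS:
        raise ValueError("unknown translation column")
    cur = conn.execute(
        f"SELECT id, name_zh FROM product WHERE {column} IS NULL ORDER BY id"
    )
    return cur.fetchall()


def store_translation(conn: sqlite3.Connection, column: str, row_id: int, text: str) -> None:
    """Write a translated string into the matching field of one record."""
    if column not in TRANSLATION_COLUMNS:
        raise ValueError("unknown translation column")
    conn.execute(f"UPDATE product SET {column} = ? WHERE id = ?", (text, row_id))
```

A periodic job could call `pending_rows` to find newly added Chinese records, translate them, and call `store_translation` once the text has been reviewed. The column name is checked against a fixed whitelist before it is placed in the SQL string.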
Specifically, the machine translation module includes a translation unit, an error correction unit, and a storage unit;
The translation unit is used to extract in batches the Chinese data in the database module, store it in a document, and run a program that reads the Chinese data in the document line by line; for each line read, it calls the machine translation interface to translate the line automatically into Uighur or Kazak data, and the translated Uighur or Kazak data is stored in a result document in Unicode encoding. The translation interface is first encapsulated so that it accepts character strings one at a time; the Chinese data extracted from the database module is stored in a document and read line by line, the translation interface is called for each line in turn, the returned result list is stored in the result document, and the results are then inserted in order into the corresponding fields of the database module. This is an automated process that is executed periodically, and a unified character encoding format is enforced throughout all data transmission and all calls to the translation interface.
The error correction unit is used to manually correct the automatically translated Uighur or Kazak data. Because the Uighur and Kazak dictionaries are limited in size, not every Chinese phrase can be translated accurately. To improve translation accuracy, the present invention has a human correct the small number of errors in the translation process, which greatly reduces translation errors and makes the displayed webpage more accurate.
The storage unit is used to read the manually corrected Uighur or Kazak data in the corresponding format and store it in the corresponding fields of the database module; the whole workflow is processed automatically on a periodic basis, completing the two-way rendering process for all static and dynamic content.
The webpage rendering module is used to read the translation data from the website database module and to load and render the dynamic webpage of the corresponding template theme according to the translation data. Because of the limited support that computers offer for Uighur, rendering a Uighur website raises the problem of converting, on input and output, between Unicode and the encoding formats supported by the operating system, the browser, and the database. Almost all database drivers by default use the ISO-8859-1 encoding format when transferring data between the program and the database. Consequently, when the website platform stores Uighur data in the database module, the database driver converts the stored Unicode data into ISO-8859-1 format, and when the webpage is rendered, the Uighur data read back from the database module appears garbled. To solve the garbling caused by the incompatible ways in which Uighur, Kazak, and Chinese are read, written, and stored, the embodiment of the present invention proposes a code conversion method for Uighur and Kazak: when translation data is stored, the code of each Uighur or Kazak character is converted into a four-digit hexadecimal string (for example, after code conversion: "062a 0648 064a"); when the webpage is rendered, the Uighur or Kazak data read from the database module is converted once more, so that no garbling occurs.
The data cache module is used to cache the loaded webpage. Caching generally includes the following:
(1) Data caching: a PHP caching mechanism for database queries. On every page access, the system first checks whether the corresponding cached data exists. If it does not, the system connects to the database, fetches the data, serializes the query result, and saves it to a file; afterwards, the same query result is obtained directly from the cache table or file.
(2) Page caching: on every page access, the system first checks whether the corresponding cached page file exists. If it does not, the system connects to the database, fetches the data, displays the page, and at the same time generates the cached page file, which then serves the next access (template engines and some PHP caching classes commonly found online usually provide this function).
(3) Time-triggered caching: the system checks whether the file exists and whether its timestamp is still within the configured expiry time. If the file's modification timestamp is greater than the current timestamp minus the expiry time, the cache is used; otherwise the cache is updated.
(4) Content-triggered caching: when data is inserted or updated, the PHP cache is forcibly updated.
(5) Static caching: static generation, that is, directly generating text files such as HTML or XML and regenerating them whenever there is an update; it suits pages that rarely change.
(6) Memory caching: Memcached is a high-performance distributed memory object caching system used to reduce database load in dynamic applications and improve access speed.
(7) PHP caching.
(8) MySQL caching.
(9) Web caching based on a reverse proxy.
(10) DNS round-robin.
The caching in the embodiment of the present invention mainly includes file caching and memory caching. The main purpose of caching is to reduce the load on the database and on PHP computation, to reduce the delay incurred when machine translation of database data is invoked during webpage rendering, and to solve the resource consumption caused by having to call the machine translation interface again on every real-time translation load; it also reduces manual intervention. Queried data is stored directly in the cache, so the database need not be queried repeatedly and the load on MySQL is eased. For PHP computation, the benefit is that, for example, the result of a complex recursive operation can be cached so that the CPU is not wasted on repeating the complex computation every time. During caching, the encoding of Uighur still follows the Unicode encoding rule described above.
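The combination of memory caching with content-triggered invalidation can be sketched with a small in-process cache. A plain dictionary stands in for Memcached here, and the class name `TranslationCache` and its methods are hypothetical; the sketch only shows how a cache avoids calling the translation interface again for text it has already seen.

```python
from typing import Callable, Dict


class TranslationCache:
    """In-process cache in front of a translation callable (illustrative only)."""

    def __init__(self, translate: Callable[[str], str]) -> None:
        self._translate = translate
        self._store: Dict[str, str] = {}
        self.misses = 0  # number of times the translation interface was called

    def get(self, text: str) -> str:
        """Return the cached translation, translating only on a cache miss."""
        if text not in self._store:
            self.misses += 1
            self._store[text] = self._translate(text)
        return self._store[text]

    def invalidate(self, text: str) -> None:
        """Content-triggered update: drop an entry when its source data changes."""
        self._store.pop(text, None)
```

Calling `invalidate` whenever a record is inserted or updated corresponds to the content-triggered caching of item (4), while repeated `get` calls for unchanged text never reach the translation interface.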
The multi-language website development method and system of the embodiment of the present invention combine template development of static webpages with calls to the machine translation interface for dynamic data, greatly reducing the cost and time of manual translation; machine translation combined with manual intervention and correction greatly reduces translation errors and makes the displayed webpage more accurate; choosing the UTF-8 encoding format of Unicode avoids garbled text when webpages are rendered; and the caching mechanism for dynamic loading solves both the resource consumption caused by calling the machine translation interface again on every real-time translation load and the loading delay, while also reducing manual intervention.
The above description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A multi-language website development method, characterized by comprising:
Step a: developing static webpages of a multi-language website;
Step b: calling a machine translation interface to perform multilingual translation on Chinese data dynamically added to the multi-language website;
Step c: reading translation data, and loading and rendering dynamic webpages of the multi-language website according to the translation data.
2. The multi-language website development method according to claim 1, characterized in that, in step a, the multi-language website includes at least Chinese and Uighur and/or Kazak; developing the static webpages of the multi-language website is specifically: carrying out static webpage development for the multi-language website in the UTF-8 encoding format of the Unicode character set.
3. The multi-language website development method according to claim 2, characterized in that, in step b, performing multilingual translation on the Chinese data dynamically added to the multi-language website specifically includes:
Step b1: encapsulating the translation interface, extracting in batches the Chinese data dynamically added to the website database, storing the Chinese data in a document, reading the Chinese data in the document line by line, and calling the machine translation interface for automatic translation each time a line is read;
Step b2: manually correcting the stored translation data;
Step b3: storing the manually corrected translation data in the website database in the corresponding format.
4. The multi-language website development method according to claim 2, characterized in that, in step c, loading and rendering the dynamic webpages of the multi-language website according to the translation data specifically includes: when the translation data is stored, converting the code of each Uighur or Kazak character into a four-digit hexadecimal string, and when the webpage is rendered, performing code conversion once more on the Uighur or Kazak data read from the website database.
5. The multi-language website development method according to claim 4, characterized in that step c further includes: caching the loaded webpage; the webpage caching includes file caching and memory caching.
6. A multi-language website development system, characterized by comprising:
a static webpage development module, used to develop static webpages of a multi-language website;
a machine translation module, used to call a machine translation interface and perform multilingual translation on Chinese data dynamically added to the multi-language website;
a webpage rendering module, used to read translation data and to load and render dynamic webpages of the multi-language website according to the translation data.
7. The multi-language website development system according to claim 6, characterized in that the multi-language website includes at least Chinese and Uighur and/or Kazak; the static webpage development module develops the static webpages of the multi-language website specifically by: carrying out static webpage development for the multi-language website in the UTF-8 encoding format of the Unicode character set.
8. The multi-language website development system according to claim 7, characterized by further comprising a website database module, the website database module being used to store the Chinese data dynamically added to the multi-language website; the machine translation module further includes:
a translation unit, used to encapsulate the translation interface, extract in batches the Chinese data dynamically added to the website database module, store the Chinese data in a document, read the Chinese data in the document line by line, and call the machine translation interface for automatic translation each time a line is read;
an error correction unit, used to manually correct the stored translation data;
a storage unit, used to store the manually corrected translation data in the website database module in the corresponding format.
9. The multi-language website development system according to claim 7, characterized in that the webpage rendering module loading and rendering the dynamic webpages of the multi-language website according to the translation data specifically includes: when the translation data is stored, converting the code of each Uighur or Kazak character into a four-digit hexadecimal string, and when the webpage is rendered, performing code conversion once more on the Uighur or Kazak data read from the website database module.
10. The multi-language website development system according to claim 9, characterized by further comprising a data cache module, the data cache module being used to cache the loaded webpage; the webpage caching includes file caching and memory caching.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610958116.5A CN106372065B (en) | 2016-10-27 | 2016-10-27 | Multi-language website development method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106372065A true CN106372065A (en) | 2017-02-01 |
CN106372065B CN106372065B (en) | 2020-07-21 |
Family
ID=57893794
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610958116.5A Expired - Fee Related CN106372065B (en) | 2016-10-27 | 2016-10-27 | Multi-language website development method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106372065B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108021423A (en) * | 2017-12-15 | 2018-05-11 | 语联网(武汉)信息技术有限公司 | A kind of Multilingual website generating method, system and computer-readable recording medium |
CN108280219A (en) * | 2018-02-07 | 2018-07-13 | 深圳壹账通智能科技有限公司 | Text interpretation method, device, computer equipment and storage medium |
CN108563645A (en) * | 2018-04-24 | 2018-09-21 | 成都智信电子技术有限公司 | The metadata interpretation method and device of HIS systems |
CN108664247A (en) * | 2018-04-26 | 2018-10-16 | 微梦创科网络科技(中国)有限公司 | A kind of method and device of Page Template data interaction |
CN109088995A (en) * | 2018-10-17 | 2018-12-25 | 永德利硅橡胶科技(深圳)有限公司 | Support the method and mobile phone of global languages translation |
CN109684096A (en) * | 2018-12-29 | 2019-04-26 | 北京超图软件股份有限公司 | A kind of software program recycling processing method and device |
CN109783579A (en) * | 2019-01-22 | 2019-05-21 | 南京焦点领动云计算技术有限公司 | A kind of method of quick copy and translation web site |
CN109828775A (en) * | 2018-12-06 | 2019-05-31 | 中国电子进出口有限公司 | A kind of WEB management system and method for multilingual translation content of text |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000330992A (en) * | 1999-05-17 | 2000-11-30 | Nec Software Shikoku Ltd | Multilinguistic www server system and its processing method |
CN1295292A (en) * | 1999-11-05 | 2001-05-16 | 国际商业机器公司 | Method and system for multi-language wide world web service device thereof |
CN101957815A (en) * | 2009-07-13 | 2011-01-26 | 白劲实 | Automatic translation method and system based on correct translation result and corresponding relation |
CN102193914A (en) * | 2011-05-26 | 2011-09-21 | 中国科学院计算技术研究所 | Computer aided translation method and system |
CN102508878A (en) * | 2011-10-18 | 2012-06-20 | 深圳市共进电子股份有限公司 | Method for generating standard foreign language page by means of machine translation system |
CN102567384A (en) * | 2010-12-29 | 2012-07-11 | 盛乐信息技术(上海)有限公司 | Webpage multi-language dynamic switching method and system based on webpage browser engine |
CN102929865A (en) * | 2012-10-12 | 2013-02-13 | 广西大学 | PDA (Personal Digital Assistant) translation system for inter-translating Chinese and languages of ASEAN (the Association of Southeast Asian Nations) countries |
CN103823796A (en) * | 2014-02-25 | 2014-05-28 | 武汉传神信息技术有限公司 | System and method for translation |
CN104375808A (en) * | 2013-07-11 | 2015-02-25 | 携程计算机技术(上海)有限公司 | Method and device for displaying interfaces |
Non-Patent Citations (3)
Title |
---|
XML中国论坛: "《XML实用进阶教程》", 31 March 2001 * |
王业 等: "一种多语言网站解决方案", 《计算机系统应用》 * |
黄河清 等: "基于动态数据库的多国语言网站开发", 《计算机工程》 * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108021423A (en) * | 2017-12-15 | 2018-05-11 | 语联网(武汉)信息技术有限公司 | A kind of Multilingual website generating method, system and computer-readable recording medium |
CN108021423B (en) * | 2017-12-15 | 2021-05-04 | 语联网(武汉)信息技术有限公司 | Multilingual website generation method and system and computer readable storage medium |
CN108280219A (en) * | 2018-02-07 | 2018-07-13 | 深圳壹账通智能科技有限公司 | Text interpretation method, device, computer equipment and storage medium |
CN108280219B (en) * | 2018-02-07 | 2021-06-22 | 深圳壹账通智能科技有限公司 | Text translation method and device, computer equipment and storage medium |
CN108563645A (en) * | 2018-04-24 | 2018-09-21 | 成都智信电子技术有限公司 | The metadata interpretation method and device of HIS systems |
CN108664247A (en) * | 2018-04-26 | 2018-10-16 | 微梦创科网络科技(中国)有限公司 | A kind of method and device of Page Template data interaction |
CN108664247B (en) * | 2018-04-26 | 2022-02-01 | 微梦创科网络科技(中国)有限公司 | Page template data interaction method and device |
CN109088995B (en) * | 2018-10-17 | 2020-11-13 | 永德利硅橡胶科技(深圳)有限公司 | Method and mobile phone for supporting global language translation |
CN109088995A (en) * | 2018-10-17 | 2018-12-25 | 永德利硅橡胶科技(深圳)有限公司 | Support the method and mobile phone of global languages translation |
CN109828775A (en) * | 2018-12-06 | 2019-05-31 | 中国电子进出口有限公司 | A kind of WEB management system and method for multilingual translation content of text |
CN109828775B (en) * | 2018-12-06 | 2021-12-07 | 中国电子进出口有限公司 | WEB management system and method for multilingual translation text content |
CN109684096A (en) * | 2018-12-29 | 2019-04-26 | 北京超图软件股份有限公司 | A kind of software program recycling processing method and device |
CN109783579B (en) * | 2019-01-22 | 2020-06-02 | 南京焦点领动云计算技术有限公司 | Method for quickly copying and translating website |
CN109783579A (en) * | 2019-01-22 | 2019-05-21 | 南京焦点领动云计算技术有限公司 | A kind of method of quick copy and translation web site |
Also Published As
Publication number | Publication date |
---|---|
CN106372065B (en) | 2020-07-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20200721 |