CN101916248A - Method for translating internet webpage - Google Patents
Method for translating internet webpage
- Publication number
- CN101916248A (application number CN 201010271775)
- Authority
- CN
- China
- Prior art keywords
- text
- content
- language
- translated
- webpage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Machine Translation (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a method for translating an internet webpage. The method comprises the following steps: analyzing a webpage whose content follows a standardized structure and separating it into a frame part and a content part, wherein the frame part comprises items such as the attribute names of the webpage content units, and the content part comprises items such as the contents of those units; translating the frame part only once, storing the original text in a text A and the translated text in a text B; for the content part, establishing two tables in a database, a table C for the source language and a table D for the target language; storing the content of each specific webpage content unit record in the corresponding entry of the source-language table; and translating the contents filled into the source-language table one by one and filling the results into the corresponding entries of the target-language table. The method avoids the distortion of meaning that differing word orders between languages cause when long sentences are translated, and because the frame part is translated only once, it saves part of the translation processing time and improves efficiency.
Description
Technical field
The present invention relates to a method for translating internet webpages, applicable to the fields of the Internet and Internet of Things information technology.
Background technology
Among existing webpage translation technologies, Google's web page translation system is widely used on today's Internet, mainly for translating pasted text and whole webpages. The relationship between its parts is fairly intuitive, and it has three basic parts. The first is a blank box in which the user pastes or types the content to be translated, or a web address. The second is background processing, which translates the pasted content or the full text of the webpage at the given address. The third is the display part, which shows the translation result passed on by the second part. Google's translation system is simple and practical. Its weakness is that it translates the content as one undifferentiated block and returns it, so that in longer passages, because word-order habits differ between languages, meanings are often jumbled together or distorted. In particular, many webpages with a standardized content structure, such as product introductions on e-commerce websites, follow a frame structure in which the frame (Frame) is usually constant while the specific content (Content) changes, so translating such pages in full translates the frame repeatedly.
Summary of the invention
To address the problems of current automatic translation, the present invention provides a method for translating internet webpages: a translation system in which the framework is separated from the content, designed to improve the accuracy and efficiency of automatic translation.
The present invention concerns translating a webpage from one language into another, and provides a technical solution especially for webpages with a standardized content structure, such as product introductions on e-commerce websites. The steps are as follows:
A. Analyze the webpage and separate the page text into framework text and content text;
B. Translate the framework text only once, storing the original in a first text and the translation in a second text;
C. For the content text, set up two tables in a database: a source-language table and a target-language table;
D. Read the content text of a specific webpage and store each unit of the content text in the corresponding entry of the source-language table;
E. Translate the contents filled into the source-language table one by one, and fill each translation result into the corresponding entry of the target-language table;
F. Through a database connection, the first text fetches a record from the source-language table and combines it with the webpage format framework to form the original webpage text; through a database connection, the second text fetches a record from the target-language table and combines it with the webpage format framework to form the translated webpage.
The first text and the second text correspond one to one in structure.
The first text and the second text each have a database connection mechanism and can fetch the record with the same sequence number from the source-language table and the target-language table respectively.
The source-language table and the target-language table are two pre-designed tables in the database. They have the same structure, and each record has a unique sequence number. In the source-language table the sequence number is generated by automatic increment; in the target-language table it is copied from the source-language table when the corresponding content is translated.
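The sequence-number scheme described above can be sketched with two SQLite tables. This is a minimal illustration rather than the patent's implementation: the table names `source_lang` and `target_lang`, the two columns, and the French sample translation are all assumptions made for the example.

```python
import sqlite3

# Hypothetical column set; the text only requires that both tables share one
# structure and that every record has a unique sequence number.
COLUMNS = "trade_name TEXT, purchase_mode TEXT"

conn = sqlite3.connect(":memory:")
# Source-language table: the sequence number is generated by auto-increment.
conn.execute(f"CREATE TABLE source_lang (seq INTEGER PRIMARY KEY AUTOINCREMENT, {COLUMNS})")
# Target-language table: same structure; seq is always supplied by add_translation.
conn.execute(f"CREATE TABLE target_lang (seq INTEGER PRIMARY KEY, {COLUMNS})")


def add_source_record(trade_name, purchase_mode):
    """Insert a source-language record and return its generated sequence number."""
    cur = conn.execute(
        "INSERT INTO source_lang (trade_name, purchase_mode) VALUES (?, ?)",
        (trade_name, purchase_mode),
    )
    return cur.lastrowid


def add_translation(seq, trade_name, purchase_mode):
    """Store a translated record under the sequence number of its source record."""
    conn.execute(
        "INSERT INTO target_lang (seq, trade_name, purchase_mode) VALUES (?, ?, ?)",
        (seq, trade_name, purchase_mode),
    )


# Illustrative data: the French strings stand in for any target language.
seq = add_source_record("knitted underwear set", "direct deal")
add_translation(seq, "ensemble de sous-vêtements tricotés", "transaction directe")
```

Because the target table never generates its own numbers in this sketch, a record and its translation can always be fetched with the same `seq` value.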
When the framework text and content text are separated, the attribute names of the content units in the page text are taken as the framework text, and the specific contents of the content units are taken as the content text.
The advantages of the present invention are:
1. Because of features such as word order and word ambiguity in each language, automatic translation is difficult to make accurate. For information with a standard format and content, handling the framework and the content separately conveys meaning more accurately than ordinary full-text translation and reduces the misattribution errors of automatic translation systems, especially for text with a standard format.
2. As noted above, once framework and content are separated, the framework needs to be translated only once, which improves efficiency.
Description of drawings
Fig. 1 is a schematic diagram of the principle of the invention.
Embodiment
The invention is further described below with reference to the drawing and an embodiment. The present invention uses software engineering, database technology, and Internet technology from the modern information field. Applied to translation between different languages, it completes the translation of information faster, more accurately, and more effectively, and is particularly suited to information items whose content follows a standard.
The present invention can be used in Internet systems and in other automatic translation systems, and is particularly suited to information with a standard format and content, such as product introductions in an e-commerce system, where attribute names and contents are processed separately. For example, keywords such as "commodity object" and "purchase mode" are attribute names, that is, the framework, while their attribute contents, such as "the whole world" and "direct deal", are variable: whenever the contents are changed or replaced, a new product introduction results. Accordingly, a translation system adopting the method handles framework and content separately. The framework is translated only once and stored in a framework translation result memory, whereas the content differs from product to product and must be translated from the source language each time. A result compositor then combines the framework result with the content result for display to the user.
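The efficiency argument can be made concrete with a small sketch that counts translator calls. The function name `translate_pages`, the stand-in translator, and the second sample page are hypothetical; only the idea of translating attribute names once and contents every time comes from the text.

```python
def translate_pages(pages, translate):
    """Translate pages given as dicts of attribute name -> attribute content.

    `translate` is a caller-supplied function (hypothetical here). Attribute
    names are translated once and reused; contents are translated every time.
    """
    frame_memory = {}  # plays the role of the framework translation result memory
    results = []
    for page in pages:
        translated_page = {}
        for name, content in page.items():
            if name not in frame_memory:
                frame_memory[name] = translate(name)
            translated_page[frame_memory[name]] = translate(content)
        results.append(translated_page)
    return results


calls = []


def fake_translate(text):
    """Stand-in translator that records each call and upper-cases the text."""
    calls.append(text)
    return text.upper()


# Illustrative pages; the second one is invented for the example.
pages = [
    {"purchase mode": "direct deal", "commodity object": "the whole world"},
    {"purchase mode": "auction", "commodity object": "North America"},
]
translated = translate_pages(pages, fake_translate)
```

With two pages of two attributes each, the stand-in translator is called six times rather than eight, because each attribute name is translated only the first time it appears.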
As shown in Fig. 1, the method has been successfully applied to the language translation of a new e-commerce system. In the overall scheme, attribute contents are placed in a pre-designed database table. A web user interface connected to this table lets a translator enter or revise the translation results of the content part. The framework attributes are translated in advance and placed in the translation result compositor; the compositor fetches a record of the content translation results from the database, combines the two by correspondence, and passes the result to the end-user interface. Framework and content follow separate translation flows, and the labels in Fig. 1 correspond to the steps below.
For example, one page from a group of webpages with a standardized structure contains the following text:
Trade name: knitted crew-neck underwear couples' set
Commodity object: mainland; Hong Kong, Macao and Taiwan region; North America; the whole world
Purchase mode: direct deal
Supplier: so-and-so company
Commercial specification:
Area: Shandong, Zibo
Place: so-and-so road, so-and-so number, Zibo, Shandong
Issuing time: 2009-11-9
The specific processing steps are then:
1. Analyze the webpage and separate the page text into framework text and content text. The framework text comprises: "trade name", "commodity object", "purchase mode", "supplier", "commercial specification", "area", "place", "issuing time". The content text comprises: "knitted crew-neck underwear couples' set", "mainland; Hong Kong, Macao and Taiwan region; North America; the whole world", "direct deal", "so-and-so company", "Shandong, Zibo", "so-and-so road, so-and-so number, Zibo, Shandong", "2009-11-9".
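Step 1 can be sketched as follows, assuming the page has already been parsed into ordered pairs of attribute name and content. How a real page is parsed is not specified in the text, and the function name `split_page` and the shortened sample data are hypothetical.

```python
def split_page(units):
    """Split ordered (attribute name, content) pairs into framework and content text."""
    framework_text = [name for name, _ in units]
    content_text = [content for _, content in units]
    return framework_text, content_text


# A shortened version of the example page. The empty string marks an
# attribute that has no content on this page.
page_units = [
    ("trade name", "knitted crew-neck underwear set"),
    ("purchase mode", "direct deal"),
    ("commercial specification", ""),
    ("issuing time", "2009-11-9"),
]
framework_text, content_text = split_page(page_units)
```

Keeping an empty string for an attribute without content is a choice of this sketch: it keeps the two lists aligned by position, so the item at index `i` of the content text always belongs to the attribute at index `i` of the framework text.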
2. Translate the framework text only once, storing the original in the first text A and the translation in the second text B. For example, the first text A is saved as:
"Trade name
Commodity object
Purchase mode
Supplier
Commercial specification
Area
Place
Issuing time"
The second text B stores the corresponding translations of the lines above. The formats of texts A and B can be defined freely, provided that software can write to them and read a given item from them.
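Since the format of texts A and B is left open, one possible choice is an ordered JSON list, sketched below. The helper names `dump_frame` and `read_label` and the French labels are assumptions for illustration, not part of the described method.

```python
import json


def dump_frame(labels):
    """Serialize an ordered list of framework labels to a JSON string."""
    return json.dumps(labels, ensure_ascii=False, indent=2)


def read_label(serialized, index):
    """Read back the label stored at a given position."""
    return json.loads(serialized)[index]


# Text A holds the original labels, text B their translations, in the same order.
text_a = dump_frame(["trade name", "purchase mode", "issuing time"])
text_b = dump_frame(["nom commercial", "mode d'achat", "date de publication"])

# One-to-one structural correspondence: the same number of entries in both texts.
same_length = len(json.loads(text_a)) == len(json.loads(text_b))
```

Storing both texts as lists in the same order means a label and its translation share one index, which is the simplest way to satisfy the one-to-one correspondence between the two texts.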
3. For the content text of the page, set up two tables in the database: the source-language table C and the target-language table D. Tables C and D contain one entry for the content text under each framework item: trade name, commodity object, purchase mode, supplier, commercial specification, area, place, issuing time.
4. Read the content text of a specific webpage and store each unit of the content text in the corresponding entry of the source-language table C. The content text of one page forms one unit: the content text described in step 1 is filled into the corresponding entries from step 3 as one record of the table.
5. Translate the contents filled into the source-language table C one by one, and fill each translation result into the corresponding entry of the target-language table D.
6. Through the database connection, the first text A fetches a record from the source-language table C and combines it with the webpage framework part to form the original webpage text; through the database connection, the second text B fetches a record from the target-language table D and combines it with the webpage framework part to form the translated webpage.
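The composition in step 6 can be sketched with SQLite as below. The table names `table_c` and `table_d`, the two columns, the `label: value` page layout, and the French strings are all assumptions of this example; the text does not prescribe a page format.

```python
import sqlite3

FIELDS = ("trade_name", "purchase_mode")  # hypothetical column names
TABLES = ("table_c", "table_d")           # source- and target-language tables


def compose_page(conn, table, frame_labels, seq):
    """Combine framework labels with the record `seq` fetched from `table`."""
    if table not in TABLES:  # table names cannot be bound as SQL parameters
        raise ValueError(f"unknown table: {table}")
    row = conn.execute(
        f"SELECT {', '.join(FIELDS)} FROM {table} WHERE seq = ?", (seq,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no record {seq} in {table}")
    return "\n".join(f"{label}: {value}" for label, value in zip(frame_labels, row))


conn = sqlite3.connect(":memory:")
for table in TABLES:
    conn.execute(
        f"CREATE TABLE {table} (seq INTEGER PRIMARY KEY, trade_name TEXT, purchase_mode TEXT)"
    )
conn.execute("INSERT INTO table_c VALUES (1, 'knitted underwear set', 'direct deal')")
conn.execute(
    "INSERT INTO table_d VALUES (1, 'ensemble de sous-vêtements tricotés', 'transaction directe')"
)

text_a = ["trade name", "purchase mode"]     # original framework labels
text_b = ["nom commercial", "mode d'achat"]  # translated framework labels

original_page = compose_page(conn, "table_c", text_a, 1)
translated_page = compose_page(conn, "table_d", text_b, 1)
```

The same function serves both languages: only the label list and the table change, while the sequence number selects matching records.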
The contents of the target-language table D are produced in step with the records in the source-language table C. They may come from machine translation, for example the common dictionary look-up method, with a translation dictionary placed beforehand in the database or in another connectable electronic record. Alternatively, a user interface connected to tables C and D may present values from the source-language table C for human translation, with the results then stored in the target-language table D.
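The two routes described above, dictionary look-up and human translation, can be combined in a short sketch. Whole-phrase look-up, the function name `translate_record`, the field names, and the sample dictionary entry are assumptions for illustration only.

```python
def translate_record(record, dictionary):
    """Translate a source record field by field using dictionary look-up.

    Returns the translated record and the names of the fields that still
    need a human translator (their value is left as None).
    """
    translated = {"seq": record["seq"]}  # the sequence number is copied unchanged
    pending = []
    for field, text in record.items():
        if field == "seq":
            continue
        if text == "":
            translated[field] = ""  # nothing to translate
        elif text in dictionary:
            translated[field] = dictionary[text]
        else:
            translated[field] = None
            pending.append(field)
    return translated, pending


dictionary = {"direct deal": "transaction directe"}  # illustrative entry
record = {
    "seq": 7,
    "purchase_mode": "direct deal",
    "supplier": "so-and-so company",
    "specification": "",
}
translated, pending = translate_record(record, dictionary)
```

Fields listed in `pending` are the ones a translator would fill in through the user interface before the record is stored in table D.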
With the translation method provided by the invention, the distortion of meaning caused by differing word orders between languages when long sentences are translated is avoided. At the same time the framework part is translated only once, which saves part of the translation processing time and improves efficiency.
Claims (6)
1. A method for translating internet webpages, characterized in that the method comprises the following steps:
A. Analyze the webpage and separate the page text into framework text and content text;
B. Translate the framework text only once, storing the original in a first text and the translation in a second text;
C. For the content text, set up two tables in a database: a source-language table and a target-language table;
D. Read the content text of a specific webpage and store each unit of the content text in the corresponding entry of the source-language table;
E. Translate the contents filled into the source-language table one by one, and fill each translation result into the corresponding entry of the target-language table;
F. Through a database connection, the first text fetches a record from the source-language table and combines it with the webpage format framework to form the original webpage text; through a database connection, the second text fetches a record from the target-language table and combines it with the webpage format framework to form the translated webpage.
2. The method for translating internet webpages according to claim 1, characterized in that the first text and the second text correspond one to one in structure.
3. The method for translating internet webpages according to claim 1, characterized in that the first text and the second text each have a database connection mechanism and can fetch the record with the same sequence number from the source-language table and the target-language table respectively.
4. The method for translating internet webpages according to claim 1, characterized in that the source-language table and the target-language table are two pre-designed tables in the database, have the same structure, and give each record a unique sequence number.
5. The method for translating internet webpages according to claim 4, characterized in that the sequence numbers are generated by automatic increment in the source-language table and, in the target-language table, are copied from the source-language table when the corresponding content is translated.
6. The method for translating internet webpages according to claim 1, characterized in that the attribute names of the content units in the page text are taken as the framework text, and the specific contents of the content units are taken as the content text.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201010271775 CN101916248A (en) | 2010-09-03 | 2010-09-03 | Method for translating internet webpage |
Publications (1)
Publication Number | Publication Date |
---|---|
CN101916248A (en) | 2010-12-15 |
Family
ID=43323762
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 201010271775 Pending CN101916248A (en) | 2010-09-03 | 2010-09-03 | Method for translating internet webpage |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101916248A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102982127A (en) * | 2012-11-15 | 2013-03-20 | 深圳市共进电子股份有限公司 | Method of replacing characters in batch to achieve multi-language version and batch processing device |
CN104978683A (en) * | 2015-07-24 | 2015-10-14 | 沈阳云鼎科技有限公司 | Usage method of China and Russia wood e-commerce platform |
CN105760542A (en) * | 2016-03-15 | 2016-07-13 | 腾讯科技(深圳)有限公司 | Display control method, terminal and server |
CN107329958A (en) * | 2017-06-08 | 2017-11-07 | 努比亚技术有限公司 | Language transfer method and device based on webpage |
CN109783579A (en) * | 2019-01-22 | 2019-05-21 | 南京焦点领动云计算技术有限公司 | A kind of method of quick copy and translation web site |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101023425A (en) * | 2004-06-07 | 2007-08-22 | 株式会社日本英柏斯 | WEB page translation device and WEB page translation method |
CN101470705A (en) * | 2007-12-29 | 2009-07-01 | 英业达股份有限公司 | Dynamic web page translation system and method |
CN101615181A (en) * | 2008-06-27 | 2009-12-30 | 国际商业机器公司 | Create the system and method for internationalization network application |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102982127A (en) * | 2012-11-15 | 2013-03-20 | 深圳市共进电子股份有限公司 | Method of replacing characters in batch to achieve multi-language version and batch processing device |
CN102982127B (en) * | 2012-11-15 | 2015-10-21 | 深圳市共进电子股份有限公司 | Batch substitute character string realizes method and the batch-processed devices of multi-lingual version |
CN104978683A (en) * | 2015-07-24 | 2015-10-14 | 沈阳云鼎科技有限公司 | Usage method of China and Russia wood e-commerce platform |
CN105760542A (en) * | 2016-03-15 | 2016-07-13 | 腾讯科技(深圳)有限公司 | Display control method, terminal and server |
CN107329958A (en) * | 2017-06-08 | 2017-11-07 | 努比亚技术有限公司 | Language transfer method and device based on webpage |
CN107329958B (en) * | 2017-06-08 | 2021-03-26 | 努比亚技术有限公司 | Language conversion method and device based on webpage |
CN109783579A (en) * | 2019-01-22 | 2019-05-21 | 南京焦点领动云计算技术有限公司 | A kind of method of quick copy and translation web site |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C12 | Rejection of a patent application after its publication | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20101215 |