WO2017159906A1 - Data structure for determining a translation order of words included in a source language text, program for generating the data structure, and computer-readable storage medium storing same - Google Patents

Data structure for determining a translation order of words included in a source language text, program for generating the data structure, and computer-readable storage medium storing same

Info

Publication number
WO2017159906A1
WO2017159906A1 PCT/KR2016/002909 KR2016002909W WO2017159906A1 WO 2017159906 A1 WO2017159906 A1 WO 2017159906A1 KR 2016002909 W KR2016002909 W KR 2016002909W WO 2017159906 A1 WO2017159906 A1 WO 2017159906A1
Authority
WO
WIPO (PCT)
Prior art keywords
translation
order
small
units
unit
Prior art date
Application number
PCT/KR2016/002909
Other languages
English (en)
Korean (ko)
Inventor
이시용
Original Assignee
이시용
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 이시용
Publication of WO2017159906A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/04 Segmentation; Word boundary detection
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • the present invention relates to determining a translation word order between an original sentence and a translated sentence when the original language sentence is translated into a target language sentence.
  • More specifically, the present invention relates to translation word order pattern data that divides the original sentence into small translation units, which are translation subunits, and determines the translation word order among those small translation units.
  • The invention concerns a translation order pattern data structure for determining a translation order among the small translation units that constitute the translation subunits of an original sentence, a computer-readable storage medium storing computer-executable instructions for generating that data structure, and a translation program using the instructions.
  • A human translator can render the same original text as multiple translations that convey the same meaning, whereas existing automatic translation outputs only one generic result for a given sentence, sufficient merely for understanding the original.
  • As a result, a professional translator cannot have his or her own translation style reflected in the automatic translation process.
  • Before an original sentence in the source language is translated into a translated sentence in the target language, the present invention morphologically analyzes the original sentence and divides it into small translation units, which are translation subunits, so that the translation word order is determined in advance.
  • the purpose is to provide a predetermined translation order.
  • A further purpose of the present invention is to determine the translation order so that the accuracy of automatic translation between languages with different word orders rises to the level achieved between languages of the same family, such as Japanese and Korean.
  • According to an aspect of the present invention, a translation word order pattern data structure, stored in a computer-readable storage medium and used in a translation apparatus for translating an original sentence into a translated sentence, includes small translation unit division pattern data for dividing the original sentence, from its beginning to its end, into a plurality of small translation units.
  • The small translation unit division pattern data includes one or more parts of speech of the morphemes obtained by morphological analysis of the original sentence, and the data structure further includes translation order number data specified for each of the plurality of small translation unit division pattern data.
  • The translation order pattern data structure stored in the computer-readable storage medium may further include output data in which the small translation units divided from the original sentence by the small translation unit division pattern data are translated, and the resulting small translations are sorted according to the translation order number data to output the translated sentence of the original sentence.
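  • As a minimal sketch of the data structure described above (not the patent's actual implementation), the Python below pairs each small translation unit division pattern with a translation order number and sorts the small translations by that number. The names `DivisionPattern`, `WordOrderPattern`, and `assemble_translation`, the part-of-speech tags, and the placeholder strings are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class DivisionPattern:
    """One small translation unit division pattern (hypothetical layout)."""
    pos_tags: Tuple[str, ...]  # part-of-speech tags of the unit's morpheme sequence
    order: int                 # translation order number assigned to the unit


@dataclass(frozen=True)
class WordOrderPattern:
    """Division patterns listed in original-sentence order."""
    divisions: Tuple[DivisionPattern, ...]


def assemble_translation(pattern: WordOrderPattern,
                         small_translations: Sequence[str]) -> List[str]:
    """Arrange the small translations by their translation order numbers."""
    if len(small_translations) != len(pattern.divisions):
        raise ValueError("one small translation is needed per division pattern")
    ranked = sorted(zip(pattern.divisions, small_translations),
                    key=lambda pair: pair[0].order)
    return [text for _, text in ranked]


# Hypothetical three-unit pattern: the second unit is translated last.
demo_pattern = WordOrderPattern((
    DivisionPattern(("N", "V"), 1),
    DivisionPattern(("P", "N"), 3),
    DivisionPattern(("A", "N"), 2),
))
demo_output = assemble_translation(demo_pattern, ["t1", "t2", "t3"])
```

  • Here `t1`, `t2`, and `t3` are placeholder small translations given in original-sentence order; sorting by order number places `t3` before `t2`.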
  • Each of the small translation unit division pattern data includes a division pattern start part indicating where division into a small translation unit begins, a division pattern middle part, and a division pattern end part indicating where division into that small translation unit ends.
  • The division pattern start part comprises one or more parts of speech located at the beginning of the morpheme sequence of the small translation unit division pattern data, the division pattern end part comprises one or more parts of speech located at the end of that morpheme sequence, and the middle part may be omitted.
  • In one form, the division pattern start part is composed of parts of speech located at the front of the morpheme sequence of the small translation unit division pattern data, the division pattern end part is composed of parts of speech located last in that morpheme sequence, and the middle part may be omitted.
  • In another form, the division pattern start part consists of the first one or two parts of speech of the morpheme sequence of the small translation unit division pattern data, the division pattern end part consists of the parts of speech from the noun to the end of that morpheme sequence, and the middle part may be omitted.
  • A sentence translation, including a particle (postposition), ending, or preposition-equivalent word to be placed at the end of the small translation of a small translation unit divided by the small translation unit division pattern data, is specified in one or more of the plurality of small translation unit division pattern data.
  • According to an aspect of the present invention, there is provided a computer-readable storage medium storing computer-executable instructions for generating a translation order pattern data structure that is used in a translation apparatus for translating an original sentence (1) into a translated sentence (2), and that serves to divide the original sentence into small translation units, which are translation subunits, and to determine the translation order of the small translation units.
  • The computer-executable instructions comprise: dividing the original sentence, from beginning to end, into a plurality of small translation units and generating small translation unit division pattern data (300) from those units, wherein the small translation unit division pattern data (300) includes at least one part of speech of the morpheme sequence obtained by morphological analysis of the original sentence; and generating translation order number data (5) specified for each of the plurality of small translation unit division pattern data (300).
  • In the computer-readable storage medium including instructions for generating a translation order pattern data structure, the instructions may further comprise translating the small translation units divided from the original sentence by the small translation unit division pattern data, sorting the small translations, which are the translations of each of the small translation units, according to the translation order number data, and outputting a translated sentence of the original sentence.
  • According to an aspect of the present invention, a computer-readable storage medium stores computer-executable instructions for generating a translation order pattern data structure used in a translation apparatus for translating an original sentence into a translated sentence. The instructions comprise: generating small translation unit division pattern data for dividing the original sentence, from beginning to end, into a plurality of small translation units, wherein the small translation unit division pattern data includes at least one part of speech of the morpheme sequence obtained by morphological analysis of the original sentence; and generating translation order number data specified for each of the plurality of small translation unit division pattern data.
  • In a computer-readable storage medium storing computer-executable instructions for generating a translation pattern data structure, the computer-executable instructions comprise: a) retrieving an original sentence; and b) morphologically analyzing the original sentence and tagging the analyzed morphemes with parts of speech to form a morpheme sequence, wherein the morphological analysis and part-of-speech tagging also cover the symbols contained in the original sentence.
  • According to an aspect of the present invention, there is provided a computer-readable storage medium storing computer-executable instructions that are used in a translation apparatus for translating an original sentence into a translated sentence, divide the original text into small translation units, which are translation subunits, and determine the translation order of the small translation units.
  • The computer-executable instructions comprise: a) retrieving an original sentence; b) morphologically analyzing the original sentence and tagging the analyzed morphemes with parts of speech to form a morpheme sequence, wherein the morphological analysis and part-of-speech tagging also cover the symbols contained in the original sentence; and generating a translation order pattern data structure that includes a plurality of small translation unit division pattern data, each including one or more parts of speech, and a translation order number assigned to each of the plurality of small translation unit division pattern data.
  • According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer-executable instructions that are used in a translation apparatus for translating an original sentence into a translated sentence, divide the original text into small translation units, which are translation subunits, and determine the translation order of the small translation units.
  • The computer-executable instructions comprise: a) retrieving an original sentence and displaying it to a user; b) morphologically analyzing the original sentence and tagging the analyzed morphemes with parts of speech, wherein the morphological analysis and part-of-speech tagging also cover the symbols contained in the original sentence; c) receiving a first signal that includes specific position information in the original sentence and indicates a sequential translation order, in ascending order, for the front and rear small translation units divided at that position, and a second signal that includes specific position information in the original sentence or in a small translation unit and indicates a reverse translation order, in descending order, for the front and rear small translation units divided at that position; and generating a translation word order pattern data structure that includes a plurality of small translation unit division pattern data, each including at least one part of speech, and a translation order number assigned to each of the plurality of small translation unit division pattern data.
  • The step of dividing the original text into small translation units and determining the translation order of the small translation units may proceed as follows. According to the first signal, the small translation unit (or the original sentence) to which the specific position indicated by the first signal belongs is divided into a front and a rear small translation unit at that position, and the two units are given sequential (ascending) order using the translation order number originally held by the undivided unit and that number increased by 1. For every other small translation unit whose translation order number is equal to or greater than the increased number, the translation order number is increased by 1.
  • According to the second signal, the small translation unit or original sentence to which the specific position indicated by the second signal belongs is likewise divided into a front and a rear small translation unit at that position, and the two units are given reverse (descending) order using the originally held translation order number and that number increased by 1. Again, the translation order number of every other small translation unit whose number is equal to or greater than the increased number is increased by 1.
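  • The renumbering rule for the first and second signals can be sketched as follows. This is one possible reading of the passage above, not the patent's implementation: the `Unit` class, the `split_unit` function, and the use of token indices as "specific positions" are assumptions made for illustration. An ascending split gives the front unit the original number n and the rear unit n + 1; a reverse split swaps them; every other unit numbered n + 1 or higher is shifted up by one.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Unit:
    start: int  # index of the unit's first token (inclusive)
    end: int    # index just past the unit's last token (exclusive)
    order: int  # translation order number


def split_unit(units: List[Unit], position: int, reverse: bool = False) -> List[Unit]:
    """Split the unit containing `position` so that the rear unit starts there.

    reverse=False models the first signal (ascending order),
    reverse=True models the second signal (descending order).
    """
    idx = next((i for i, u in enumerate(units) if u.start < position < u.end), None)
    if idx is None:
        raise ValueError("position must fall strictly inside an existing unit")
    n = units[idx].order
    result: List[Unit] = []
    for i, u in enumerate(units):
        if i == idx:
            front_order, rear_order = (n + 1, n) if reverse else (n, n + 1)
            result.append(Unit(u.start, position, front_order))
            result.append(Unit(position, u.end, rear_order))
        else:
            # Units already numbered n + 1 or higher make room for the new unit.
            result.append(Unit(u.start, u.end, u.order + 1 if u.order > n else u.order))
    return result


# Hypothetical 12-token sentence and a hypothetical sequence of user signals.
units = [Unit(0, 12, 1)]
units = split_unit(units, 2)                 # first signal at position 2
units = split_unit(units, 4, reverse=True)   # second signal at position 4
units = split_unit(units, 8)                 # first signal at position 8
units = split_unit(units, 10, reverse=True)  # second signal at position 10
units = split_unit(units, 6, reverse=True)   # second signal at position 6
orders = [u.order for u in units]
```

  • With this particular sequence of hypothetical signals, the six resulting units carry the order numbers 1, 6, 3, 2, 5, 4, and the numbers always remain a permutation of 1 to the number of units.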
  • According to an aspect of the present invention, there is provided a computer-readable storage medium storing computer-executable instructions for dividing an original text into small translation units, which are translation subunits, and determining the translation order of the small translation units.
  • The computer-executable instructions comprise: a) retrieving the original sentence; b) morphologically analyzing the original sentence and tagging the analyzed morphemes with parts of speech to form a morpheme sequence, wherein the morphological analysis and part-of-speech tagging also cover the symbols contained in the original sentence; and c) comparing that morpheme sequence with the morpheme sequences of the small translation unit division pattern data of the translation word order pattern data stored in a translation word order pattern DB, which holds the translation word order pattern data structure, to retrieve matching word order pattern data.
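  • Step c) can be illustrated with a small lookup sketch. It assumes, purely for illustration, that a stored pattern is a list of (part-of-speech tags, order number) pairs and that a pattern matches only when its division patterns, concatenated, equal the sentence's tag sequence exactly; the function names and tags are hypothetical and the patent may match differently.

```python
from typing import List, Optional, Sequence, Tuple

Division = Tuple[Tuple[str, ...], int]  # (part-of-speech tags, translation order number)
Span = Tuple[int, int, int]             # (start index, end index, translation order number)


def match_pattern(sentence_tags: Sequence[str],
                  pattern: Sequence[Division]) -> Optional[List[Span]]:
    """Return the unit spans if the pattern covers the whole tag sequence, else None."""
    spans: List[Span] = []
    pos = 0
    for tags, order in pattern:
        end = pos + len(tags)
        if tuple(sentence_tags[pos:end]) != tuple(tags):
            return None
        spans.append((pos, end, order))
        pos = end
    return spans if pos == len(sentence_tags) else None


def find_in_db(sentence_tags: Sequence[str],
               pattern_db: Sequence[Sequence[Division]]) -> Optional[List[Span]]:
    """Return the spans of the first stored pattern that matches, else None."""
    for pattern in pattern_db:
        spans = match_pattern(sentence_tags, pattern)
        if spans is not None:
            return spans
    return None


# Hypothetical pattern DB with two stored word order patterns.
pattern_db = [
    [(("A", "N"), 1), (("V",), 2)],
    [(("N", "V"), 2), (("P", "N"), 1)],
]
found = find_in_db(["N", "V", "P", "N"], pattern_db)
missing = find_in_db(["N", "V"], pattern_db)
```

  • In this sketch the four-tag sentence matches the second stored pattern, so it is divided after the second tag and the rear unit is translated first; the two-tag sentence matches nothing and would fall back to other division rules.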
  • A translation program according to an aspect of the present invention performs translation using the computer-executable instructions, stored in the computer-readable storage medium described above, for dividing the original text into small translation units, which are translation subunits, and determining the translation order of the small translation units.
  • The translation order pattern data may determine the translation order between heterogeneous languages, such as inflectional, agglutinative, and isolating languages, as well as between homogeneous languages (e.g., Korean and Japanese). Because the translation order pattern data of the present invention lets languages of different families, such as English and Korean or English and Japanese, be handled as if they shared the same word order, the accuracy of automatic translation between such languages can be raised toward that of automatic translation between languages of the same family.
  • heterogeneous languages, such as inflectional, agglutinative, and isolating languages
  • homogeneous languages (e.g., Korean and Japanese)
  • In addition, when a translator divides the original text into small translation units, which are translation subunits, and determines the translation order among them, that translation order is stored and is available to the user in later translations.
  • a translation beginner may complete a translation sentence according to a translation word order divided into small translation units before translation using a translation word order DB accumulated by another person such as a translation expert.
  • As used herein, the translation order pattern data structure includes the small translation unit division pattern data and the translation order number data, which refer, respectively, to data holding information on each small translation unit division pattern and data holding information on the translation order number within the translation order pattern structure.
  • The terms “translation order pattern data”, “small translation unit division pattern data”, and “translation order number data” are used herein interchangeably with “translation order pattern”, “small translation unit division pattern”, and “word order number”, respectively.
  • The translation word order pattern of the present invention can be used regardless of the target language, because the translation order is assigned to the division patterns that divide the original text into small translation units, which are translation subunits. For example, if the original language is English and the target language is Korean, the original English sentence is analyzed and the Korean translation order is specified for the English original. The translation order pattern obtained when translating English into Korean can then be applied as it is to translation from English into Japanese.
  • FIG. 1A is a diagram illustrating an example of a translation apparatus for translating an original text into a translation, which includes a storage medium storing computer-executable instructions for dividing the original text into small translation units, which are translation subunits, and determining the translation order of the small translation units, according to an embodiment of the present invention.
  • FIG. 1B is an enlarged view of a small translation unit and a translation word order determining unit of FIG. 1A.
  • FIG. 1C is an enlarged view of a small translation and a translation generation unit of FIG. 1A.
  • FIG. 1D is an enlarged view of the user interface unit of FIG. 1A.
  • FIG. 2A illustrates the original and translated sentences used for embodiments of the present invention.
  • FIG. 2B illustrates one embodiment of a compound sentence division word order according to the present invention.
  • FIG. 2C illustrates one embodiment of a short division word order rule in accordance with the present invention.
  • FIG. 2D illustrates an embodiment morphologically analyzing the original text of FIG. 2A and tagging the part-of-speech.
  • FIGS. 3A to 3H illustrate an example of a process of dividing the original text of FIG. 2A into a plurality of small translation units according to a division word order rule, determining a translation order number for each small translation unit, and then generating small translations.
  • FIG. 3I illustrates an embodiment of dividing an original sentence into three small translation units in FIG. 3C, determining a translation order, and translating a small translation for each small translation unit to complete a translation.
  • FIG. 4A illustrates an embodiment of translation word order pattern data according to the present invention.
  • FIG. 4B illustrates exemplary embodiments of word order pattern data and of expanded word order pattern data, derived from it, in which the middle part of the division pattern is omitted, according to an embodiment of the present invention.
  • FIG. 4C illustrates an embodiment in which the translation order pattern data including the split pattern data and the order data specified therein further includes a sentence.
  • FIG. 4D illustrates embodiments of translation word pattern data generated after division into small translation units of FIGS. 3B to 3F according to an embodiment of the present invention.
  • FIG. 4E illustrates embodiments of translation order pattern data extendable from the translation order pattern of FIG. 4A in accordance with an embodiment of the present invention.
  • FIGS. 5A to 5M illustrate how, according to an embodiment of the present invention, the small translation unit and translation order determination unit divides the original text into small translation units and determines the translation order for each small translation unit.
  • FIG. 5N illustrates how, according to an embodiment of the present invention, after the original text is divided into small translation units and a translation order number is determined for each small translation unit, a translation is generated by automatic translation in the small translation and translation generation unit.
  • FIGS. 6A to 6H illustrate a process of generating a translation sentence based on the stored translation order pattern data after the original text of FIG. 2A is stored in the translation order pattern DB as the translation order patterns of FIGS. 4A to 4D.
  • FIG. 7A illustrates an embodiment of a method for dividing an original text into small translation units, which are translation subunits, and determining a translation word order of the small translation units, according to an exemplary embodiment of the present invention.
  • FIG. 7B illustrates an embodiment of a method and a translation apparatus using the same for dividing an original text into small translation units, which are translation subunits, and determining the translation word order of the small translation units, according to an embodiment of the present invention.
  • FIG. 7C illustrates another embodiment of a method for dividing an original text into small translation units, which are translation subunits, and determining a translation word order of the small translation units, and of a translation apparatus using the same, according to an embodiment of the present invention.
  • The translation order pattern data structure, stored in a computer-readable storage medium and used in a translation apparatus for translating an original sentence (1) into a translated sentence (2), includes small translation unit division pattern data 300 for dividing the original sentence, from its beginning to its end, into a plurality of small translation units, and translation order number data 5 specified for each of the plurality of small translation unit division patterns 300.
  • It may further include output data in which the small translation units divided from the original sentence 1 by the small translation unit division pattern data 300 are translated, and the small translations (4), which are the translations of each of the small translation units, are arranged according to the translation order to output the translated sentence 2 of the original sentence.
  • The output data may be stored in a computer-readable storage medium, for example a memory or a hard disk, and displayed to the user on the display unit 141, or it may be transmitted to the translation word order pattern generation unit 150 or to the small translation memory and translation memory DB in order to generate translation order pattern data.
  • FIG. 4A illustrates, according to an embodiment of the present invention, translation word order patterns (401, 405) that specify a translation order number for each of the small translation unit division patterns obtained by morphologically analyzing the original text and converting the small translation units divided in FIG. 3F into morpheme sequences, together with exemplary expanded translation order patterns (402-404 and 406-408) derived from them by omitting the middle part of the division pattern.
  • In FIG. 4A, AR AANN 301, VV AR AN 321, R AR DAN PP N 361, PL PP VAN PR (371), V PV SY 381 and D PP N RV SY 391 are the six small translation unit division patterns 300 for the small translation units 30, 32, 36, 37, 38, 39, and a translation word order number 5 is assigned to each of the small translation unit division patterns 300.
  • When the small translation units (3) divided by the small translation unit division patterns (300) are translated in the translation apparatus or translation program of the present invention, and the small translations (4), which are the translations of each of the small translation units (3), are arranged according to the translation order numbers (5), the translated sentence (2) of the original sentence (1) is obtained.
  • Through morphological analysis and part-of-speech tagging, each of the small translation units (3) can be expressed as a small translation unit division pattern (300) that includes one or more parts of speech of the unit's morpheme sequence, for use in dividing other original texts into small translation units.
  • Each of the small translation units 3 may also be represented by a small translation unit division pattern 300 that includes one or more of a source word, a symbol, and the parts of speech obtained by analyzing them (for example, the pattern including 'for' in the second row of FIG. 4B).
  • Each of the small translation unit division patterns 300 is applied as a division rule for dividing another sentence into small translation units when the morpheme sequence obtained by morphological analysis of that sentence contains a sequence that matches the pattern.
  • the translation word order of the divided sub-translation units is determined according to the relations with the parts of speech of the front-rear sub-translation units forming a boundary with the sub-translation unit division pattern.
  • the translation word order may be determined by designating the word order between the small translation units that are divided while the user divides one original text into a plurality of small translation units.
  • Alternatively, the small translation units divided from the original text and the translation order assigned to them are displayed to the user, and when the user corrects an incorrect word order and inputs the small translation units and the translation order assigned to each of them, the translation order is determined from the small translation units and translation order numbers entered by the user.
  • In the illustrated example, the original text is divided into small translation units (3), which are translation subunits, and translation word order numbers (5) of /1, /6, /3, /2, /5, and /4 are specified for the respective morphologically analyzed division patterns 300.
  • These translation order patterns are stored in the translation order pattern DB 160, and when the same small translation unit division pattern appears while another sentence is being translated, that sentence is divided into small translation units by the same small translation unit division pattern. If all six small translation unit division patterns match, the same translation order pattern is applied, so that the sentence can be divided into small translation units and the translation order among the units determined from the translation order pattern alone, without the mechanically applicable division word order determination rules.
  • the translation order patterns 402 to 404 and 406 to 408 derived based on the translation order patterns 401 and 405 may be generated in the translation order pattern generating unit 150.
  • FIG. 4A shows that a small translation unit division pattern included in the translation word order pattern consists of a division pattern start part including one or more parts of speech of the division pattern's morpheme sequence, a division pattern middle part, and a division pattern end part including one or more parts of speech of that morpheme sequence.
  • the division pattern start part and the division pattern end part may be composed of one or more parts of speech included in the small translation unit division pattern, and the middle part of the division pattern may be omitted.
  • By omitting the middle part of the division pattern, the translation word order pattern generating unit 150 can generate a translation word order pattern that includes small translation unit division patterns consisting of a division pattern start part and a division pattern end part, together with the translation order numbers specified for those small translation unit division patterns.
  • the beginning of the division pattern includes one or more parts of speech located at the beginning of the morphological sequence of the sub-translational unit, from which the division into the sub-translational units from the beginning to the end of the original sentence begins.
  • the segmentation pattern end part includes one or more parts of speech located last in the morphological sequence of the small translation unit, in which the division into the small translation units is terminated.
  • The division point lies between the preceding part of speech and the following part of speech.
  • the original text can be divided into sub-translational units by the sub-translational unit division pattern.
  • the beginning of the sub-translational unit division pattern corresponds to the beginning of the division pattern of the sub-translational unit including post-parts of the front and rear sub-translation units divided by the divisional word order division pair.
  • the end portion of the sub-translational unit division pattern corresponds to the end portion of the sub-translational unit division pattern including prepositions among the front and rear sub-translation units divided by the divisional word order division pair.
  • The first small translation unit division pattern AR *** N has a division pattern start part (AR) and a division pattern end part (N) of one part of speech each, and the middle part of the division pattern (***) indicates that three parts of speech are omitted.
  • In the first row 403 of expanded translation order pattern 2, in which the middle part of the division pattern is omitted, the first small translation unit division pattern AR A ** N has two parts of speech (AR A) in its division pattern start part, its division pattern end part (N) consists of the parts of speech from the noun N to the end of the pattern's morpheme sequence, and the middle part (**) indicates that two parts of speech are omitted.
  • The division pattern end part N PR of the fourth division pattern consists of the parts of speech from the noun N to the final part of speech PR (bracket).
  • The first row 404 of expanded translation order pattern 3, in which the middle part of the division pattern is omitted, includes the division pattern start parts (AR, VR, PL, V, D) 302 and the division pattern end parts (N, N, N, PR, SY, RV) 303, with the middle part of each division pattern written as (-) to indicate that all of its parts of speech are omitted.
  • each division pattern is represented by a number counting from the beginning to the part-of-speech position 395 where the division unit ends.
  • the translation word order pattern may be represented by the division unit 395 and the order number 5 assigned to the division unit. This notation is useful for retrieving translation order patterns with morphological sequences that match the morphological sequences of the original text.
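  • The start part / middle part / end part notation described above can be illustrated with a minimal Python sketch. The class name DivisionPattern, the functions parse_division_pattern and matches, and the sample tag sequences are hypothetical and do not come from the patent; only the notation "AR *** N" and the "-" form are taken from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DivisionPattern:
    start: List[str]        # division pattern start part
    end: List[str]          # division pattern end part
    omitted: Optional[int]  # omitted middle parts of speech; None for the "-" form
    order: int              # translation order number assigned to the pattern


def parse_division_pattern(text: str, order: int) -> DivisionPattern:
    """Parse notation such as 'AR *** N' or 'AR - N' (assumed syntax)."""
    start, end, omitted, seen_middle = [], [], 0, False
    for tok in text.split():
        if tok == "-":                 # middle part omitted entirely
            omitted, seen_middle = None, True
        elif set(tok) == {"*"}:        # one '*' per omitted part of speech
            omitted, seen_middle = len(tok), True
        elif seen_middle:
            end.append(tok)
        else:
            start.append(tok)
    return DivisionPattern(start, end, omitted, order)


def matches(pattern: DivisionPattern, tags: List[str]) -> bool:
    """Check whether a unit's part-of-speech tags fit the pattern."""
    fixed = len(pattern.start) + len(pattern.end)
    if pattern.omitted is None:
        if len(tags) < fixed:
            return False
    elif len(tags) != fixed + pattern.omitted:
        return False
    head_ok = tags[:len(pattern.start)] == pattern.start
    tail_ok = not pattern.end or tags[-len(pattern.end):] == pattern.end
    return head_ok and tail_ok


first = parse_division_pattern("AR *** N", order=1)
loose = parse_division_pattern("AR - N", order=1)
```

  • In this sketch the "***" form fixes the unit length (one start tag, three omitted tags, one end tag), whereas the "-" form accepts any unit that begins and ends with the given parts of speech.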
  • the original text can be divided into sub-translational units.
  • FIG. 4B shows embodiments of extended translation order patterns, with the division pattern middle part omitted, derived from the translation word order pattern that comprises the small translation unit division patterns and translation word order obtained by the division into small translation units in FIG. 3F, according to an embodiment of the present invention.
  • FIG. 4B shows translation order patterns 409-416 in which the last small translation unit division pattern D PP N RV SY 391 of the translation word order pattern of FIG. 4A is changed to a small translation unit division pattern including the preposition "for" and one or more parts of speech.
  • the translation pattern including one or more original words and parts of speech may be a small translation unit division pattern because the translation subunit of the original sentence may be a translation unit.
  • FIG. 4C illustrates embodiments of a segmented word order pattern further including a lateral translation in a translation word order pattern generated after translation of small translations of each of the small translation units in FIG. 3H according to an exemplary embodiment of the present invention.
  • In Korean, grammatical translations may include postpositional particles such as '은', verb endings, or translations of prepositions such as "for" and "about".
  • the literary translation means the translation of the small translations, which are the translations of each of the small translation units.
  • the translated word order is defined by the translation order, and the translation order of the small translation units is determined.
  • Unlike the word order, the grammatical translation 396 need not be specified for every division pattern, and it may be omitted when it is identical to the automatically translated grammatical translation. The grammatical translation reveals the sentence component of a small translation unit division pattern. For example, if the grammatical translation 396 in Korean is a topic or subject particle such as '은', the small translation unit is the subject, and if it is an object particle such as '을', the unit is the object.
  • FIG. 4E illustrates embodiments of the derived translation word order pattern, such as the split pattern middle portion omitted extended word order patterns 402-404, 406-408 of FIG. 4A.
  • the extended translation word order pattern generating unit 150 may generate the extended translation word order pattern.
  • These extended translation order patterns are stored in the translation order patterns DB 160.
  • the translation order pattern DB 160 may be analyzed to find a regularity in the morpheme sequences of the division pattern and the translation order, thereby expanding the translation order pattern.
  • An extended translation order pattern can be generated by analyzing the translation order patterns of the translation order pattern DB 160 for regularities, merging division patterns of an already determined translation order pattern, and then decreasing by one each order number larger than the order number that went missing, so that the order numbers of the small translation unit division patterns remain consecutive.
  • A mechanically divided translation order pattern with six small translation units can be reduced, by repeatedly merging two small translation units, into patterns with five, four, three, and two small translation unit division patterns.
  • FIG. 4D illustrates translation word order patterns 430-441 having two to six division patterns derived from the original sentence of FIG. 2A, and extended translation order patterns 442-453 with the division pattern middle part omitted.
  • FIG. 4E illustrates that the translation order is again determined by omitting the parentheses in FIG. 4A that do not affect the translation order.
  • When the parenthesized small translation unit division pattern PL PP V A N PR / 2 is omitted, translation order numbers larger than 2 are reduced by one, thereby determining the new translation order numbers.
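  • The renumbering step can be sketched in Python as follows. This is only an illustration: the function name omit_unit is hypothetical, and apart from "PL PP V A N PR / 2" the division patterns and order numbers in the example are invented rather than taken from FIG. 4A or FIG. 4E.

```python
def omit_unit(pattern, order_to_omit):
    """Drop the division pattern with the given order number and close the gap.

    pattern: list of (division_pattern, order_number) in source order.
    """
    kept = [(tags, order) for tags, order in pattern if order != order_to_omit]
    # Every order number larger than the omitted one is reduced by one.
    return [(tags, order - 1 if order > order_to_omit else order)
            for tags, order in kept]


# Hypothetical pattern; only the second entry appears in the text.
pattern = [("AR A N", 1), ("PL PP V A N PR", 2), ("V AR A N", 4), ("R N", 3)]
reduced = omit_unit(pattern, 2)
```

  • After the call, the remaining order numbers are again consecutive (1, 3, 2 in source order), which is the property the description relies on.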
  • the fifth row (480-484) of FIG. 4E shows the AR AN / 1, V AR AN / by merging the relative rounds of the translation order sequential / 2, / 3 and / 4 from the translation order sequence pattern of FIG. 4A into one small translation unit.
  • An extended translation order pattern consisting of 3 and R Clause / 3 is shown.
  • The extended translation order pattern can be extended not only for relative clauses but also according to regularities of the translation order patterns found in small translation unit division patterns that include specific words, such as that-clauses, which-clauses, noun clauses, adverb phrases, prepositional phrases, and verb phrases.
  • FIG. 1A is a diagram illustrating a computer executable instruction for dividing a source sentence into sub-translation units, which are translation sub-units, and for determining translation order of the sub-translation units in an apparatus or program for translating the original sentence into a translation sentence, according to an embodiment of the present invention.
  • One embodiment is a storage medium, program, or device.
  • The storage medium storing computer executable instructions for dividing the original text into small translation units, which are translation sub-units, and for determining the translation order of the small translation units, as well as the translation program, method, and apparatus, may be used on a personal computer, a mobile device, a server, or a network combining two or more of these devices.
  • the translation order pattern data structure of the present invention, a storage medium storing the computer executable instructions, or a translation program may be distributed through a server in a downloadable form.
  • a computer readable storage medium refers to a computer readable physical component or material in which data is stored optically, magnetically, or by other means, such as a computer hard disk, memory, SSD, USB, or the like.
  • FIG. 2A shows the original sentence 1 used in the present invention and the translated sentence 1 for the original.
  • the original sentence receiving unit 100 calls the original sentence 1 to be translated.
  • the morpheme analysis and part-of-speech tagging unit 110 receives the original sentence from the original sentence receiving unit 100 to perform morphological analysis and tag the part-of-speech.
  • Morphological analysis is the first step in the analysis of natural language.
  • the morphological analysis is the process of converting the original text into a morphological sequence.
  • a morpheme is the smallest unit of meaning, the smallest semantic element that can no longer be analyzed.
  • Morphemes include words or parts of words that express grammatical or relational meanings, as well as the roots, base forms, particles, prefixes, and suffixes of simple words.
  • The text symbols that make up the original sentence, such as commas (',') and semicolons (';'), are likewise tagged as comma and semicolon and are treated like parts of speech.
  • When Korean is the original language, the parts of speech of Korean morphemes can be divided into nouns, pronouns, verbs, adjectives, determiners, adverbs, interjections, particles, endings, affixes, roots, symbols, and non-Korean items, and each can be tagged accordingly.
  • The parts of speech of Chinese morphemes can be divided into nouns, pronouns, numerals, measure words, verbs, adjectives, adverbs, prepositions, particles, symbols, and non-Chinese items, and each can be tagged accordingly.
  • The morpheme analysis and part-of-speech tagging unit 110 may include a chunking unit that treats several parts of speech as one, for example by processing a morpheme sequence of consecutive nouns as a single noun, or by tying a morpheme sequence of a preposition and a relative into a single relative.
  • the morpheme sequence obtained by morphological analysis of the original sentence and part-of-speech tagging is inputted into the small translation unit and the translation word order determination unit 120, and is divided into the small translation units, and the translation order is determined in each small translation unit.
  • the small translation unit and the translation word order determination unit 120 include a translation word order pattern matching unit 121 and a mechanical segmentation and word order determination unit 122.
  • The translation word order pattern matching unit 121 searches the translation word order pattern DB 160 to check whether there is a translation word order pattern (401-484) including small translation unit division patterns that match the morpheme sequence of the original sentence. If a translation word order pattern 401-484 matches in whole or in part, the original text is divided into small translation units according to the small translation unit division patterns included in the matched translation word order pattern 401-484, and the translation order of each small translation unit is determined.
  • the translation order pattern DB 160 is one of computer-readable storage media recording the translation order patterns 401-484.
  • A whole match means that the morpheme sequences of all the small translation unit division patterns of the translation word order pattern match the morpheme sequence of the original text.
  • A partial match means a translation word order pattern in which the morpheme sequences of all but one of its small translation unit division patterns match. Even if one division pattern differs, the remaining division patterns and their order numbers match, so the non-matching division pattern and its order number can be inferred from the matching ones.
  • A partially matched translation word order pattern may also be one in which, when the original text is divided into small translation units, each small translation unit matches a division pattern start part, consisting of one or more parts of speech at the beginning of the unit's morpheme sequence, and a division pattern end part, consisting of one or more parts of speech at the end of that sequence.
  • Even when a translation word order pattern made up of such start and end parts, together with the order number assigned to each division pattern, does not match the morpheme sequence of the original sentence as a whole, the original text can be divided into small translation units as long as the start and end parts of each division pattern match, and the translation order of the small translation units can be determined from the order numbers assigned to those division patterns.
  • Therefore, when the beginning and end of the morpheme sequence of every small translation unit of the original sentence match the division pattern start parts and end parts of a translation order pattern, that pattern can be used as a division and word order determination pattern for dividing the original sentence and determining the translation order.
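  • Matching by start and end parts alone can be sketched as a small backtracking search. This is an illustrative reading of the description, not the patent's implementation: the function name split_by_patterns, the tuple layout (start part, end part, order number), and the example tag sequence are assumptions.

```python
from typing import List, Optional, Tuple

Pattern = Tuple[List[str], List[str], int]   # (start part, end part, order number)
Unit = Tuple[List[str], int]                 # (tags of the unit, order number)


def split_by_patterns(tags: List[str], patterns: List[Pattern]) -> Optional[List[Unit]]:
    """Divide a tag sequence so each unit begins and ends as its pattern requires.

    Returns None when the start/end parts cannot cover the whole sequence.
    """
    def rec(pos: int, idx: int) -> Optional[List[Unit]]:
        if idx == len(patterns):
            return [] if pos == len(tags) else None
        start, end, order = patterns[idx]
        if tags[pos:pos + len(start)] != start:
            return None
        # Try every possible end position for this unit, shortest first.
        for stop in range(pos + len(start) + len(end), len(tags) + 1):
            if tags[stop - len(end):stop] == end:
                rest = rec(stop, idx + 1)
                if rest is not None:
                    return [(tags[pos:stop], order)] + rest
        return None

    return rec(0, 0)


# Hypothetical tagged sentence and two-pattern translation order pattern.
tags = ["AR", "A", "N", "V", "AR", "A", "N", "SY"]
patterns = [(["AR"], ["N"], 1), (["V"], ["SY"], 2)]
units = split_by_patterns(tags, patterns)
```

  • The middle of each unit is never compared, so the same pattern covers sentences whose units differ in length, which is the benefit the description attributes to omitting the division pattern middle part.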
  • The mechanical division and word order determination unit 122 divides the original text into a plurality of small translation units according to the division word order rules shown in FIGS. 2B and 2C and determines the translation word order number of each divided small translation unit (see FIG. 3F). Mechanical division and translation ordering are further described in the section describing FIGS. 3A-3I.
  • The translation word order pattern matching unit 121 and the mechanical segmentation and word order determination unit 122 may be applied in combination. That is, when the translation order pattern matching unit 121 searches the translation order pattern DB 160 and finds no matching translation order pattern, the original sentence can be mechanically divided into small translation units by the mechanical segmentation and word order determination unit 122, and the translation order of each small translation unit can be determined. Conversely, after the mechanical segmentation and word order determination unit 122 has mechanically divided the original text and determined the translation order of each small translation unit, the translation order pattern matching unit 121 may search the translation order pattern DB 160; if a matching translation order pattern is found, the small translation units and order numbers determined by the mechanical segmentation and word order determination unit 122 are reset and the retrieved translation order pattern is applied.
  • Alternatively, the original text may be mechanically divided into small translation units by the mechanical segmentation and word order determination unit 122, and the translation order of the divided small translation units may be determined by the translation word order pattern matching unit 121.
  • In that case the translation word order pattern matching unit 121 searches the translation word order pattern DB 160, and if a translation word order pattern with matching morpheme sequences exists, the division into small translation units is performed by the mechanical division and word order determination unit 122, while the translation word order of the divided small translation units is taken from the translation word order pattern retrieved by the translation word order pattern matching unit 121.
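  • The first way of combining the two units (pattern matching first, mechanical division as the fallback) might be wired together as in the sketch below. The function divide_and_order and the two callables passed to it are hypothetical stand-ins for units 121 and 122; their internals are not modeled here.

```python
from typing import Callable, List, Optional, Tuple

Unit = Tuple[List[str], int]   # (tags of the small translation unit, order number)


def divide_and_order(
    tags: List[str],
    match_pattern: Callable[[List[str]], Optional[List[Unit]]],
    mechanical_divide: Callable[[List[str]], List[Unit]],
) -> List[Unit]:
    """Use a stored translation order pattern if one matches, else divide mechanically."""
    matched = match_pattern(tags)          # stands in for matching unit 121
    if matched is not None:
        return matched
    return mechanical_divide(tags)         # stands in for mechanical unit 122


# Stub callables for illustration only.
no_match = lambda tags: None
whole_sentence = lambda tags: [(tags, 1)]
result = divide_and_order(["N", "V"], no_match, whole_sentence)
```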
  • The small translation unit and translation word order determination unit 120 outputs the plurality of small translation units into which the original sentence is divided and the translation word order number designated for each divided small translation unit.
  • The plurality of small translation units output from the small translation unit and translation word order determining unit 120, and the translation word order designated for each, are input to the user interface unit 140 and displayed to the user on the display unit 141.
  • the translation word order determining unit 120 or the translation word order pattern generating unit 150 may receive a small translation unit and a translation word order as the input items, respectively, or generate a translation word order pattern.
  • the translated word order pattern generating unit 150 may store the translated word order pattern modified by the user in the translated word order pattern DB 160.
  • The small translation unit and translation order determination unit 120 may mechanically divide one original sentence or one small translation unit into two small translation units in the mechanical division and word order determination unit 122.
  • the display unit 141 may display the determined small translation units and the order assigned to each of them.
  • A small translation unit and a translation order number input through the input unit 142 may take precedence over the small translation units mechanically determined by the mechanical division and word order determination unit 122 and the order numbers assigned to them, and may be adopted as the small translation units of the original text and their translation order.
  • the user can directly input the signal for determining the order number for each of the small translation units and the divided small translation units through the input unit 142, as well as the location information of the divided small translation units and the divided small translations.
  • Whenever the original sentence or a small translation unit is divided into two small translation units by the first signal or the second signal, the small translation unit and translation word order determining unit 120 may display the determined small translation units and the order assigned to each of them to the user on the display unit 141.
  • the first signal is mouse left click and the second signal is consecutive mouse left click and right click.
  • the plurality of small translation units output from the small translation unit and the translation word order determining unit 120 and the translation word order designated for each small translation unit are input to the user interface unit 140.
  • the small translation and the translation generation unit 130 may include a translation memory matching unit 131 and an automatic translation unit 132, and further include a sentence translation processing unit 133.
  • the translation memory matching unit 131 may search for the translation memory DB 170 and retrieve a translation memory matching the small translation units to generate a small translation.
  • the small translation units which failed to load the translation memory because there is no matching translation memory are automatically translated through the automatic translation unit 132 (see FIG. 3G).
  • the automatic translation unit 132 can use a commercially available automatic translation engine, such as an automatic translation engine of Google Inc.
  • The small translation and translation generation unit 130 may apply the translation memory matching unit 131 and the automatic translation unit 132 in combination.
  • The small translation and translation generation unit 130 may generate small translations for all the small translation units by automatic translation in the automatic translation unit 132 and then search the translation memory through the translation memory matching unit 131; for any small translation unit that has a matching translation memory, the small translation from the translation memory may be used in preference to the automatically translated small translation.
  • Determination of the small translation units and translation order and generation of the small translations and translated sentence may proceed together: each time the small translation unit and translation word order determination unit 120 divides the original sentence into small translation units and determines the translation order, the small translation and translation generation unit 130 may generate small translations for the divided small translation units and generate the whole translated sentence by sorting them according to the translation order assigned to the small translation units.
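  • Generating the translated sentence from the small translation units can be sketched as below, assuming that a translation memory hit is preferred over automatic translation and that the small translations are joined with spaces. The function generate_translation, the dictionary-based translation memory, and the placeholder automatic translator are all hypothetical.

```python
def generate_translation(units, translation_memory, auto_translate):
    """units: list of (source_text, order_number) in source order.

    Returns the small translations (still in source order) and the
    translated sentence obtained by sorting them by order number.
    """
    small = []
    for source, order in units:
        if source in translation_memory:       # translation memory takes precedence
            target = translation_memory[source]
        else:                                  # otherwise fall back to automatic translation
            target = auto_translate(source)
        small.append((target, order))
    sentence = " ".join(text for text, _ in sorted(small, key=lambda p: p[1]))
    return small, sentence


# Placeholder data: the "translations" are markers, not real target-language text.
units = [("the boy", 1), ("reads", 3), ("a book", 2)]
memory = {"a book": "[TM] a book"}
small, sentence = generate_translation(units, memory, lambda s: f"<{s}>")
```

  • Because the order numbers are kept alongside each small translation, the sentence can be regenerated cheaply every time the user divides a unit further or changes an order number.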
  • After generating the small translations and the translated sentence, the small translation and translation generation unit 130 transmits the small translation units, the translation order, the generated small translations, and the translated sentence to the user interface unit 140.
  • The display unit 141 of the user interface unit 140 displays to the user the determined small translation units and the order designated for each, the small translation of each small translation unit, and the completed translated sentence obtained by sorting the small translations according to the translation order assigned to the small translation units.
  • the user modifies and inputs the divided translation order and the translation order specified for the divided small translation unit into the input unit 142, the small translation unit and the translation order, the small translation unit. You can save the translations in descending order. Whenever the user divides the original text into small translation units and determines the translation order, the user can determine the further division of the small translations, checking the translated small translations and the translated sentences for each small translation unit.
  • the translation order matching unit 121 may search the translation order pattern DB 160 for each small translation unit, and the user may match the translated translation. According to the order of the translations, the small translations and the translations generated through the translation unit 130 and the translation sentences in which the small translations are sorted, are further divided, or no longer divided, the translated small translations and translations You can decide whether to use.
  • the sentence translation unit 133 may include the small translation sentences, which are translations of the small translation units, wherein the small translation units of the original text matching the translation sequence pattern of the translation sequence pattern DB 160 are included in the translation sequence pattern.
  • The translated sentence is completed by processing particles, endings, or preposition translations in accordance with the translation order.
  • The sentence translation processing unit 133 may generate the particles, endings, preposition translations, and the like assigned to the divided small translation units when the translation word order is determined, according to the relationship between the preceding part of speech and the following part of speech of the division pair in the division word order rule.
  • A subject particle may be created in the front small translation unit, to which the noun as the preceding part of speech belongs, and a verb ending may be created in the rear small translation unit.
  • it may be predetermined that adjectives or affiliates modify the nouns in reverse sequential order. You can create and specify a formula ending such as'.
  • The sentence translation processing unit 133 may also automatically translate each of the small translation units, sort the results according to the translation order, and then complete the translated sentence by modifying or adding the necessary qualifiers between the small translations generated by automatic translation and correcting the spelling.
  • the small translation and translation generation unit 130 outputs the small translations and the translated sentences for the small translation units.
  • The small translations and the translated sentences output from the small translation and translation generation unit 130 are input to the user interface unit 140 and displayed to the user through the display unit 141, and the user is provided with respect to the plurality of small translation units.
  • the small translation units and the small translation sentences for the small translation units are input through the small translation memory generation unit 130. It includes and stores the translation memory including the translation order number between each small translation unit in the translation memory DB (170).
  • the correspondence of the small translations for each of the small translation units is indicated by the subunit correspondence indicators.
  • The small translation unit position information of the subunit correspondence indicators may indicate the location of each small translation unit within the original text, and the small translation position information of the subunit correspondence indicators may indicate the location of each small translation within the translated sentence.
  • The translation word order pattern generating unit 150 may also generate an extended translation word order pattern based on the translation word order patterns stored in the translation word order pattern DB 160 and store it in the translation word order pattern DB 160.
  • The extended translation order pattern can be used to specify the division patterns and the translation order by matching against the morpheme sequence of the part-of-speech-tagged original sentence.
  • the expanded translation order pattern will be described with reference to FIGS. 4A to 4E.
  • the small translation and the translation generation unit 130 analyze the small translation memories that are pairs of the small translation unit and the small translation from the translation memory DB 170 to generate an extended translation memory and store the translation memory in the translation memory DB 170.
  • FIG. 2B shows one embodiment of a compound sentence division word order rule 6 according to the present invention.
  • FIG. 2C shows an embodiment of a short sentence division word order rule 6 according to the present invention.
  • The original sentence is divided into small translation units by applying the compound sentence division word order rule first and the short sentence division word order rule second. The clauses and the conjunctions connecting them must be divided first so that the sentence is split into short sentences, and each short sentence is then divided in turn, because the translation word order is determined within each short sentence. Among the short sentence division word order rules, the noun-verb division pair is divided first, because the subject part of a short sentence is translated before the predicate part.
  • The division word order rule includes a list of division pairs, each consisting of a preceding part of speech and a following part of speech, at which the morpheme sequence of the original text is divided into two small translation units.
  • The division is made between the preceding and following parts of speech specified in the rule: the preceding part of speech becomes the last part of speech of the front small translation unit, and the following part of speech becomes the first part of speech of the rear small translation unit.
  • For each division pair, the division word order rule may include information on the translation order between the small translation unit to which the preceding part of speech belongs and the small translation unit to which the following part of speech belongs, and may also include priority information among the division pairs.
  • FIG. 2D shows the result 20 of morphological analysis and part-of-speech tagging for the original text of FIG. 2A.
  • The symbols included in the original sentence (left parentheses, right parentheses, commas, and periods) are treated the same as parts of speech in morphological analysis and are tagged as parts of speech.
  • '(' and ')' are tagged as left and right parentheses and marked PL and PR, and commas and periods are tagged as punctuation and marked SY.
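  • The symbol tagging just described can be sketched as a lookup table applied alongside an ordinary word tagger. In the sketch below only the PL, PR, and SY tags for symbols follow the text; the function tag_tokens, the toy lexicon, the assignment of AR, N, and V to these words, and the UNK fallback tag are assumptions made for illustration.

```python
# Symbol tags as stated in the description.
SYMBOL_TAGS = {"(": "PL", ")": "PR", ",": "SY", ".": "SY"}


def tag_tokens(tokens, lexicon):
    """Tag each token; symbols are handled exactly like parts of speech."""
    tags = []
    for tok in tokens:
        if tok in SYMBOL_TAGS:
            tags.append(SYMBOL_TAGS[tok])
        else:
            # "UNK" is a hypothetical tag for words missing from the toy lexicon.
            tags.append(lexicon.get(tok.lower(), "UNK"))
    return tags


# Toy lexicon standing in for a real morphological analyzer.
lexicon = {"the": "AR", "a": "AR", "cat": "N", "pet": "N", "sleeps": "V"}
tokens = ["The", "cat", "(", "a", "pet", ")", "sleeps", "."]
tagged = tag_tokens(tokens, lexicon)
```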
  • the division word order decision rule includes a list for the partitioning pairs of the two specific parts of speech that are to be divided between the pre-part of speech and the post-part of speech.
  • Dividing the original text into a plurality of small translation units according to the division word order determination rule and determining the translation word order of each small translation unit proceeds as follows. The morpheme sequence of parts of speech and symbols obtained for the original sentence by the morphological analysis and part-of-speech tagging unit 110 is examined from beginning to end, and wherever an arrangement of parts of speech matches a division pair of the division word order determination rule, the sequence is divided into two small translation units at that position.
  • The translation order of the two divided small translation units is determined according to the sequential translation order or the reverse translation order specified for the division pair in the division word order determination rule.
  • This matching against the preceding and following parts of speech in the division pair list is carried out over the entire morpheme sequence of the original text, from the beginning to the end, until the whole text has been divided.
  • FIG. 2B includes a list of five partition pairs
  • FIG. 2C includes a list of fourteen partition pairs.
  • the division pairs not checked in the sequential order column indicate the reverse translation order.
  • In sequential translation order, the front unit of the two divided small translation units keeps the translation order number that the undivided small translation unit originally had, and the rear unit receives that number increased by 1. If any small translation unit other than the divided ones has a translation order number equal to or greater than the increased number, its translation order number is increased by 1.
  • In reverse translation order, the two divided small translation units are numbered in reverse: the front unit receives the original translation order number increased by 1, and the rear unit keeps the original number. Again, if any small translation unit other than the divided ones has a translation order number equal to or greater than the increased number, its translation order number is increased by 1.
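  • The order numbering applied when one small translation unit is divided in two can be sketched as follows. This is a reading of the description above rather than the patent's code; the function split_unit, the list-of-tuples layout, and the example tags are hypothetical.

```python
def split_unit(units, index, cut, sequential):
    """Divide units[index] at tag position `cut` and renumber the orders.

    units: list of (tags, order_number) in source order.
    sequential=True  -> front keeps n, rear gets n + 1
    sequential=False -> front gets n + 1, rear keeps n (reverse order)
    """
    tags, n = units[index]
    front, rear = tags[:cut], tags[cut:]
    result = []
    for i, (other_tags, order) in enumerate(units):
        if i == index:
            front_order, rear_order = (n, n + 1) if sequential else (n + 1, n)
            result.append((front, front_order))
            result.append((rear, rear_order))
        else:
            # Other units at or above the increased number shift up by one.
            result.append((other_tags, order + 1 if order >= n + 1 else order))
    return result


# Hypothetical three-tag sentence, first split sequentially, then in reverse.
step1 = split_unit([(["N", "V", "N"], 1)], index=0, cut=1, sequential=True)
step2 = split_unit(step1, index=1, cut=1, sequential=False)
```

  • In the example, the second split is made in reverse order, so the object noun is translated before the verb while the subject keeps order number 1.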
  • FIG. 2B is an example of a compound sentence division word order rule, and illustrates the translation order numbers for each division pair.
  • The conjunction (AND / OR) can be internally programmed to represent only conjunctions that connect one clause to another, rather than all coordinating conjunctions. That is, when an AND / OR is found in the morpheme sequence of the original sentence, the parts of speech before and after it are examined, and it is identified as the conjunction 'AND / OR' only if a verb exists before the AND / OR and a verb also exists between the AND / OR and the next conjunction (including a second AND / OR) or the end of the original sentence.
  • If no verb exists before the AND / OR, or no verb exists after it up to the next conjunction (including the next AND / OR) or the end of the original sentence, it is not treated as a conjunction (AND / OR) in the division word order rule.
  • If the division rule includes both a division pair of all parts of speech followed by the conjunction (AND / OR) and a division pair of the conjunction (AND / OR) followed by all parts of speech, the AND / OR linking clause to clause is divided on both sides and forms a small translation unit by itself.
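  • The verb check for clause-linking conjunctions might be sketched as below. The sketch simplifies the description in two labeled ways: only tokens tagged "AND/OR" are treated as conjunctions, and "before" is measured from the previous AND/OR (or the sentence start). The function name clause_conjunctions and the tag names are hypothetical.

```python
def clause_conjunctions(tags, conj="AND/OR", verb="V"):
    """Return positions of AND/OR tags that join two clauses.

    An AND/OR counts only if a verb occurs before it and another verb
    occurs between it and the next AND/OR or the end of the sentence.
    """
    positions = [i for i, tag in enumerate(tags) if tag == conj]
    found = []
    for k, pos in enumerate(positions):
        # Assumption: look back only as far as the previous AND/OR.
        left = positions[k - 1] + 1 if k > 0 else 0
        right = positions[k + 1] if k + 1 < len(positions) else len(tags)
        if verb in tags[left:pos] and verb in tags[pos + 1:right]:
            found.append(pos)
    return found


# "N V N and N V N": the conjunction joins two clauses.
joined = clause_conjunctions(["N", "V", "N", "AND/OR", "N", "V", "N"])
# "N and N V N": the conjunction only joins two nouns.
not_joined = clause_conjunctions(["N", "AND/OR", "N", "V", "N"])
```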
  • FIG. 2C is an example of a short division word order rule, and illustrates a translation word order number for a division pair.
  • The division word order rule of FIG. 2C is an embodiment of the short-sentence division word order rule. When division pairs for sequential translation and for reverse translation coexist, the resulting translation order can differ according to the order in which the pairs are applied, so priorities can be set among the division pairs. For example, the noun-verb, past participle-verb, present participle-verb, and noun-pronoun division pairs can be given priority when dividing the analyzed morpheme sequence of the original sentence, so that the subject part and the predicate part of the original sentence are divided first.
  • Division pairs without priority are matched sequentially from the beginning to the end of the morpheme sequence of the analyzed original sentence; the sentence is divided at the matching pairs in the division word order rule, and the translation order is determined. If a division pair has priority, it takes precedence over the sequential division from beginning to end. For example, if priorities 1 and 2 are assigned to the conjunction-all parts of speech pair and the noun-verb pair, then even if the conjunction-all parts of speech pair is located after the noun-verb pair, the conjunction-all parts of speech pair is divided first and the noun-verb pair is divided next, and the noun-verb pair is divided before the other division pairs in the division pair list.
  • The priorities among the noun-verb, past participle-verb, present participle-verb, and noun-pronoun division pairs can be set to 1, 2, 3, and 4, respectively, and the remaining division pairs are divided from beginning to end according to the division word order rule.
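The priority scheme above can be illustrated with a short sketch. It assumes the morpheme sequence is reduced to a list of part-of-speech tags and that a division pair is a (front tag, rear tag) tuple; the tag names, the `PRIORITY` table, and the function name `ordered_split_points` are hypothetical. Wildcard pairs such as "all parts of speech" are left out for brevity.

```python
# Hypothetical priorities: lower number means the pair is divided earlier.
PRIORITY = {
    ("NOUN", "VERB"): 1,
    ("PAST_PART", "VERB"): 2,
    ("PRES_PART", "VERB"): 3,
    ("NOUN", "PRON"): 4,
}

def ordered_split_points(tags, pairs):
    """Return (position, pair) tuples in the order the splits are applied.

    position is the index of the rear tag, i.e. the split falls between
    tags[position - 1] and tags[position].
    """
    points = []
    for i in range(len(tags) - 1):
        pair = (tags[i], tags[i + 1])
        if pair in pairs:
            points.append((i + 1, pair))
    # Pairs with a priority go first, ordered by priority and then position.
    prioritized = sorted(
        (p for p in points if p[1] in PRIORITY),
        key=lambda p: (PRIORITY[p[1]], p[0]),
    )
    # Remaining pairs are applied from the beginning to the end.
    rest = [p for p in points if p[1] not in PRIORITY]
    return prioritized + rest
```

For the tag sequence `["DET", "NOUN", "PREP", "NOUN", "VERB", "NOUN", "PRON"]` with the pairs noun-verb, noun-preposition, and noun-pronoun, the noun-verb split is applied first even though the noun-preposition pair occurs earlier in the sentence.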
  • The front and rear parts of speech of the division word order rule may vary according to the language, and the translator may adjust the resulting sub-translation units by adding division pairs to, or deleting them from, the division pair list.
  • The division word order rule may be determined such that a coordinating conjunction is the front or rear part of speech, in which case the translation order of the divided sub-translation units is determined to increase sequentially.
  • Sequentially increasing means that the front sub-translation unit takes the translation order number originally held by the small translation unit before division, and the rear sub-translation unit takes that number increased by 1.
  • The division word order rule may also be determined such that the rear part of speech is a subordinating conjunction, in which case the translation order of the divided sub-translation units is determined to decrease in reverse order. Decreasing in reverse order means that the rear sub-translation unit takes the translation order number originally held by the small translation unit before division, and the front sub-translation unit takes that number increased by 1.
  • A noun is, for example, a front part of speech, and division is made so that the noun falls at the end of the preceding sub-translation unit. Dividing on the basis of the noun allows the translation order to be determined mechanically after the mechanical division by the division word order rule.
  • The translator can extend or shorten the list of division pairs in the division word order rule as needed. As the list of division pairs grows, the number of sub-translation units constituting the original sentence increases, so more units must be arranged in translation order, while the number of words in each small translation unit to be automatically translated decreases; however, the time the translator needs to check that the translation is correct can increase. As the list of division pairs shrinks, the number of small translation units decreases and so does the number of translation order numbers assigned to them.
  • FIGS. 3A to 3H show an embodiment of a series of processes in which the mechanical segmentation and translation word order determining unit 122 mechanically divides the original text 1 of FIG. 2A by a program in accordance with the division word order rules of FIGS. 2B and 2C, then determines the translation order among the divided small translation units and translates each small translation unit.
  • The predetermined division word order rule between specific parts of speech may further include order information specifying whether the front and rear sub-translation units divided for a division pair of front and rear parts of speech are in ascending translation order or descending translation order.
  • the translation order number between two divided sub-translation units is determined by the translation order number originally possessed by the original text or the sub-translation unit and the translation order number increased by 1 to the translation order.
  • the original sentence 1 before splitting in FIG. 3A splits between nouns-verbs according to the short segmentation ordering rule 6 of FIG. 3B to FIG.
  • The translation order numbers of the two units (30, 31) are determined by the translation order number 1 that the original text originally had and the translation order number 2 obtained by increasing it by 1.
  • The second small translation unit 31 has translation order number 2. When the second small translation unit 31 is divided between all parts of speech and a relative according to the short-sentence division word order rule (6) ('at which' is predetermined to be analyzed as a relative in the morphological analysis), the translation order of the two small translation units 32 and 33 divided in FIG. 3C is determined by the translation order number 2 originally held by the small translation unit 31 before the division and the translation order number 3 obtained by increasing it by 1. Because the translation order of the all parts of speech-relative division pair in the division word order rule of FIG. 2C is set to the reverse order,
  • the rear small translation unit 33 of the translation units 32, 33 is assigned translation order number 2, and the front small translation unit 32 is assigned translation order number 3.
  • If the translation order number of a small translation unit other than the two divided sub-translation units is equal to or greater than the increased translation order number, the translation order number assigned to that other sub-translation unit is increased by 1.
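The division and renumbering rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: a small translation unit is modeled as a text string plus a translation order number, the split position is a character offset inside the unit, and the names `Unit` and `split_unit` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Unit:
    text: str
    order: int  # translation order number

def split_unit(units, index, position, reverse=False):
    """Split units[index] at a character offset and renumber in place.

    Sequential order: front keeps the original number, rear gets +1.
    Reverse order: rear keeps the original number, front gets +1.
    Every other unit whose number is >= the increased number is shifted by 1.
    """
    target = units[index]
    original, increased = target.order, target.order + 1
    for i, unit in enumerate(units):
        if i != index and unit.order >= increased:
            unit.order += 1
    front_order, rear_order = (increased, original) if reverse else (original, increased)
    front = Unit(target.text[:position].strip(), front_order)
    rear = Unit(target.text[position:].strip(), rear_order)
    units[index:index + 1] = [front, rear]
    return units

units = [Unit("A B C D", 1)]
split_unit(units, 0, 2)                 # sequential: A=1, "B C D"=2
split_unit(units, 1, 2, reverse=True)   # reverse:    B=3, "C D"=2
split_unit(units, 2, 2)                 # sequential: C=2, D=3, and B shifts to 4
```

The placeholder text "A B C D" stands in for an original sentence. After the three splits the units carry the numbers 1, 4, 2, 3, so the translation is assembled in the order A, C, D, B.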
  • the third small translation unit 34 of FIG. 3D is divided by all the parts-sequence pairs of parentheses in the division word order rule, the third small translation unit 36 and the fourth small translation unit of FIG.
  • The translation order numbers of the small translation unit (36) and the fourth small translation unit (37) are determined by the number 2 that the small translation unit (34) had before the division and the number 3 obtained by increasing it by 1. Since the all parts of speech-left parenthesis division pair is defined as reverse order in the division word order rule, the translation order numbers of the small translation unit (36) and the fourth small translation unit (37) become 3 and 2, respectively. At this time, the translation order numbers of the small translation units 32 and 35, which are not the small translation unit 34 divided in FIG. 3D, were 4 and 3, respectively, before the division. Since these numbers are equal to or greater than the number 3 of the small translation unit (36), each is increased by 1, and they become 5 and 4, respectively.
  • small translations 42 are generated by automatic translation.
  • the small translation 42 may be generated by searching for a translation memory matching the small translation units.
  • Generation of the small translations and of the translation sentence can be performed in the small translation and translation generation unit 130. The user may decide whether to divide further by viewing the translation sentence each time the division into small translation units and the translation order of each small translation unit are determined in FIGS. 3B to 3H.
  • FIG. 3H shows the result of processing the small translations, arranged in translation order, through the sentence translation processing unit 133.
  • When the original text is divided into small translation units by the small translation unit division patterns included in a translation order pattern, the sentence translation processing unit 133 adds the translation specified in the small translation unit division pattern to the automatically translated or imported small translation, and deletes or modifies duplicated postpositional particles, endings, prepositional translations, and the like according to the translation order.
  • 3i shows that after the original sentence (1) is applied to only the noun-verb and all parts-related verbs in the short segmentation order, the translation order is assigned to the small translations and the respective small translations, and then for each small translation unit.
  • FIGS. 5A to 5N show an embodiment in which the small translation unit and translation word order determination unit 120 receives signals from the input unit for the original sentence of FIG. 2A, divides the original sentence into small translation units, which are translation subunits, and determines the translation order of the small translation units.
  • The input unit 142 included in the user interface unit 140 of FIG. 1 receives a first signal, which includes specific position information in the original sentence or a sub-translation unit and indicates a sequential translation order, in ascending order, for the front and rear sub-translation units divided at the specific position, and a second signal, which includes specific position information in the original sentence or a sub-translation unit and indicates a reverse translation order, in descending order, for the front and rear sub-translation units divided at the specific position.
  • the input unit 142 receives a first signal specifying a specific position 51 and a sequential word order in the original sentence of FIG. 5A. Accordingly, as shown in FIG.
  • the first small translation unit 30 and the second small translation unit 31 are divided at the specific position 51, and their translation order numbers are determined as 1 and 2, respectively, by the sequential word order.
  • the input unit 142 may generate the information about the specific position 51 and the sequential order as, for example, 29-1, and transmit the information to the small translation unit and the translation word order determination unit 120.
  • Here, 29 denotes that the specific position 51 is the 29th position counted from the beginning of the original sentence, including blanks, and 1 denotes the sequential word order. In '29-2', the 2 denotes the reverse order.
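A signal in the form given above can be decoded with a small helper. The sketch below assumes the signal is a plain string such as `"29-1"`; the function name `parse_split_signal` and the returned labels `"sequential"` and `"reverse"` are hypothetical.

```python
def parse_split_signal(signal):
    """Decode a 'position-order' signal such as '29-1' or '29-2'.

    Returns (position, order), where order is 'sequential' for code 1
    (first signal) and 'reverse' for code 2 (second signal).
    """
    position_text, order_code = signal.replace(" ", "").split("-")
    order = {"1": "sequential", "2": "reverse"}[order_code]
    return int(position_text), order
```

An unknown order code raises `KeyError` in this sketch, which keeps malformed signals from being silently treated as a valid split.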
  • the small translation unit and the translation order determination unit 120 of FIG. 1 receive the first signal or the second signal from the input unit, and divide the original text into a plurality of small translation units according to the first signal or the second signal. Then, the translation order of the divided small translation units is determined.
  • When the small translation unit and translation word order determination unit 120 receives the first signal or the second signal from the input unit, any plurality of small translation unit division patterns 300, and any translation order number 5 assigned to each of them, that have already been determined in the translation word order pattern matching unit 121 or the mechanical segmentation and translation order determination unit 122 are reset, and the original text is divided anew according to the first signal or the second signal.
  • According to the first signal, the small translation unit and translation word order determination unit 120 divides the small translation unit or original sentence to which the specific position (51, 52, 53, 54, 55) indicated by the first signal belongs into two front and rear sub-translation units at that specific position, and determines the sequential order, in ascending order, of the two divided sub-translation units.
  • The sequential order between the divided front and rear sub-translation units is determined by the translation order number originally held by the small translation unit and that number increased by 1.
  • If the translation order number of a small translation unit other than the one to which the specific position belongs is equal to or greater than the increased number, the translation order number assigned to that other small translation unit is increased by one.
  • For example, the sequential order (numbers 2 and 3) between the divided front and rear sub-translation units 34 and 35 is determined by the translation order number (2) originally held by the small translation unit 33 before the division (see FIG. 5E) and that number increased by 1 (3).
  • Because the translation order number (3) of the small translation unit 32, which is not the small translation unit 33 to which the specific position 53 indicated by the first signal belongs, is equal to the increased number (3) (see FIG. 5F),
  • the translation order number (3) assigned to the small translation unit 32 is increased by one (to 4, see FIG. 5G).
  • The small translation unit and translation word order determination unit 120 transmits the divided front and rear small translation units and the translation order numbers assigned to them to the display unit 141.
  • According to the second signal, the small translation unit or original sentence to which the specific position indicated by the second signal belongs is divided into two front and rear small translation units at that specific position, and the reverse order, in descending order, of the two divided units is determined.
  • The reverse order between the divided front and rear sub-translation units is determined by the translation order number originally held by the small translation unit before the division and that number increased by 1.
  • If the translation order number of a small translation unit other than the one to which the specific position indicated by the second signal belongs is equal to or greater than the increased number, the translation order number assigned to that other small translation unit is increased by 1.
  • The reverse order numbers (3 and 2) of the front and rear sub-translation units 36 and 37 divided in FIG. 5J are determined by the translation order number (2, see FIG. 5H) originally held by the small translation unit 34 before the division and that number increased by 1 (3).
  • Because the translation order numbers (4 and 3) of the small translation units 32 and 35, which are not the small translation unit 34 to which the specific position 54 indicated by the second signal belongs, are greater than (4) or equal to (3) the increased number (3) (see FIG. 5I),
  • the translation order numbers assigned to the small translation units 32 and 35 are each increased by one.
  • the reverse order number (order 5 and number 4) between the front and rear sub translation units 38, 39 divided in FIG. It is determined by the translation order number (number 5) and the original number of translation word number (number 4) that 1 is increased to the translation word sequence number (number 4, see FIG. 5K), and the specific position indicated by the second signal ( 55) if the translation sequence number (sequence 5) of the small translation unit 32 other than the small translation unit 35 to which it belongs is equal to the increased sequence number (sequence 5) (see FIG.
  • the same sequence (sequence 5) Increase the translation sequence number specified in the other small translation unit (32) having 1) by 1.
  • The small translation unit and translation word order determination unit 120 transmits the divided front and rear small translation units and the translation order numbers assigned to them to the display unit 141.
  • The small translation and translation generation unit 130 receives the small translation units divided by the small translation unit and translation word order determination unit 120 together with the translation order numbers assigned to them, generates a small translation for each small translation unit, arranges the small translations according to the translation order to generate a translation sentence, and transmits it to the display unit 141.
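The arrangement step can be sketched as a sort on the translation order number. This assumes each small translation is available as a (small translation, translation order number) pair and that joining with a single space is acceptable for illustration; the function name `assemble_translation` is hypothetical, and the later particle and ending adjustments are not modeled.

```python
def assemble_translation(small_translations):
    """Arrange small translations by translation order number and join them.

    small_translations is a list of (text, order) pairs given in
    source-sentence order.
    """
    ordered = sorted(small_translations, key=lambda item: item[1])
    return " ".join(text for text, _ in ordered)
```

For units carrying the numbers 1, 4, 2, 3 in source order, the first, third, fourth, and second small translations are emitted in that sequence.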
  • The sentence translation processing unit 133 included in the small translation and translation generation unit 130 receives information on the morpheme sequences of the division patterns divided by the small translation unit and translation word order determination unit 120 and on the translation order, and can process the small translations of the small translation units to complete the translation.
  • The user views, on the display unit 141, the small translation units divided according to the first signal or the second signal for the original text and their translation order, and may input a new first signal or second signal through the input unit 142.
  • After inputting the first signal or the second signal for a specific position, the user views the translation order determined accordingly and the displayed result of the automatic translation of the original text.
  • The user may then additionally input a first signal or a second signal, or may directly input a portion of a small translation or of the translation sentence through the direct input unit 141.
  • The sub-translation units and the translation order numbers assigned to each of them may be displayed within the original text as shown in FIGS. 5A to 5N, or, as shown in FIGS. 3A to 3I, each sub-translation unit
  • and the order number 5 assigned to each sub-translation unit may be displayed in the display area of the small translations.
  • FIG. 5B corresponds to FIG. 3B, FIG. 5D to FIG. 3C, FIG. 5G to FIG. 3D, FIG. 5J to FIG. 3E, and FIG. 5M to FIG. 3F.
  • FIG. 5N shows that, after the division into small translation units and the translation order of each small translation unit are determined by the small translation unit and translation word order determination unit 120 in FIGS. 5A to 5M, the small translation and translation generation unit 130
  • generates a small translation for each small translation unit through a search of the translation memory DB 170 or through automatic translation, and arranges the small translations according to the translation order.
  • Division into small translation units, determination of translation order, and generation of a translation using the translation order pattern stored in the translation order pattern DB
  • FIGS. 6A to 6H illustrate a series of processes for generating a translation sentence based on the stored translation order pattern after the original text of FIG. 2A has been stored in the translation order pattern DB 160 as the translation order patterns of FIGS. 4A to 4D.
  • FIG. 6A is a text sentence 601 to be newly translated
  • FIG. 6B is a result 650 of morphological analysis and tagging of the text sentence 601.
  • FIG. 6C illustrates a translation word order pattern 462, found by searching the translation word order pattern DB 160, that matches the morpheme sequence 651 resulting from morphological analysis and tagging of the original sentence 601.
  • The morpheme sequence 651 is the result of morphological analysis and tagging of the original sentence 601.
  • FIG. 6D shows a translation order pattern 464, found in the translation order pattern DB 160, for which one or more parts of the morpheme sequence of the original sentence 601 match the beginning and end of each of its small translation unit division patterns.
  • FIG. 6E is a result of dividing the original sentence into small translation units 530 by the small translation unit division patterns of the matched translation word order pattern 462 or translation word order pattern 464 and displaying the translation order numbers 5.
  • FIG. 6F is a result 642 of automatic translation of each of the small translation units 530 through the small translation and translation generation unit 130.
  • FIG. 6G shows a result 643 obtained by the sentence translation processing unit 133 adding the translations specified in the translation order patterns 462 and 464.
  • FIG. 6H shows the result of the sentence translation processing unit 133 processing the small translations in consideration of connective translations such as postpositional particles, endings, or prepositional translations, which are determined from the pair formed by the end of one small translation unit division pattern and the beginning of the next division pattern.
  • After this processing, the translated text of the original text is completed.
  • FIG. 7A illustrates an example of a method, or functions, for generating a translation word order pattern data structure for dividing an original text into small translation units, which are translation subunits, and determining the translation order of the small translation units, as used in the translation apparatus or program according to the embodiment of FIG. 1A.
  • FIG. 7B illustrates an embodiment of a method of dividing an original text into small translation units, which are translation subunits, determining the translation order of the small translation units, and generating a translation, as used in the translation apparatus or program according to the embodiment of the present invention of FIG. 1A.
  • These methods are performed by computer-executable instructions, and the instructions are stored on a computer-readable recording medium.
  • The method according to the present invention for generating a translation order pattern data structure, which divides an original sentence from its beginning to its end into a plurality of small translation units and is used for translation in a translation apparatus for translating an original sentence (1) into a translation sentence (2), includes:
  • tagging parts of speech (1010 and 2010); dividing the original text into small translation units and determining the translation order of the small translation units (1020 and 2020); displaying the divided small translation units and the determined translation order to the user and receiving input from the user (1030 and 2030); comparing the small translation units and translation order displayed to the user with the small translation units and translation order numbers input by the user, and generating a translation order pattern including the small translation units and the translation order numbers assigned to them (1040 and 2040); and storing the translation word order pattern in the translation word order pattern DB 160 (1050 and 2050).
  • When another original sentence is retrieved after step 1050 of storing the translation order pattern in the translation order pattern DB 160, the series of steps is performed again from step 1000.
  • the translation order pattern stored in the translation order pattern DB 160 in step 1050 is then retrieved in step 1022 to determine the translation order pattern when the original text is translated.
  • The method or function of generating the translated sentence of FIG. 7B may further include, after the translation word order pattern is determined and stored as in FIG. 7A, a small translation and translation sentence generation step 2050 and
  • a step 2060 of storing the small translations and the translation sentence in the small translation memory and translation memory DB 170.
  • In FIG. 7B, when another original text is loaded after the small translations and the translation are stored in the small translation memory and translation memory DB 170 (2060), the series of steps is performed again.
  • Steps 1020 and 2020 of dividing the original sentence into small translation units and determining the translation word order of the small translation units may include steps (1022 and 2022) of dividing the original text and determining the translation order according to a translation word order pattern whose small translation unit division patterns match, in whole or in part, the morpheme sequence obtained by morphological analysis of the original sentence, and steps (1024 and 2024) of dividing according to a division word order rule. The division word order rule contains predetermined division pairs for dividing between specific parts of speech in the tagged morpheme sequence of an original sentence, and specifies whether the front and rear sub-translation units divided by a division pair are in ascending or descending translation order.
  • Either or both of steps 1022 and 1024 may be performed.
  • Either or both of steps 2022 and 2024 may be performed.
  • FIG. 7C illustrates a method or function for dividing an original text into small translation units, which are translation subunits, according to signals input by a user and determining the translation order of the small translation units, and a method or function for translating using the same.
  • The method or function for dividing the original text into small translation units, which are translation subunits, and determining the translation order of the small translation units includes: retrieving the original text and displaying it to the user (3000); morphologically analyzing the original text and tagging parts of speech on the analyzed morphemes (3010);
  • receiving a first signal, which includes specific position information in the original sentence or a sub-translation unit and indicates a sequential translation order, in ascending order, for the front and rear small translation units divided at the specific position, or a second signal, which includes specific position information in the original sentence or a sub-translation unit and indicates a reverse translation order, in descending order, for the front and rear small translation units divided at the specific position (3020); and, upon receiving the first signal or the second signal from the input unit, dividing the original text into small translation units according to the first signal or the second signal and determining the translation order numbers of the divided small translation units (3030).
  • A method or function for generating a translation order pattern including designated translation order numbers (5) may, in addition to the above method of dividing the original sentence into small translation units, which are translation subunits, and determining their translation order,
  • further include generating a translation word order pattern including a plurality of small translation unit division patterns 300 and a translation order number 5 assigned to each of the plurality of small translation unit division patterns 300,
  • the plurality of small translation unit division patterns 300 serving to divide an original sentence, converted through morphological analysis and part-of-speech tagging, from its beginning to its end into a plurality of small translation units.
  • The method may further include: generating a small translation for each of the small translation units and generating a translation sentence by arranging the small translations according to the translation order assigned to the small translation units (3040); displaying the original sentence, the small translation units, the translation order number assigned to each of the small translation units, and the translation sentence to the user, and receiving an input from the user (3050); and, upon receiving the input from the user, performing steps 3030, 3040, and 3050 if the input is the first signal or the second signal, and, if the input is neither the first signal nor the second signal,
  • generating, from the divided small translation units and the translation order assigned to them, a translation word order pattern including a plurality of small translation unit division patterns 300, each including parts of speech in the morpheme sequence obtained by morphological analysis of the original sentence, and a translation order number 5 assigned to each of the plurality of small translation unit division patterns 300.
  • In the translation apparatus for translating the original sentence 1 into the translation sentence 2, the method or function of dividing the original sentence into small translation units, which are translation subunits, and determining the translation order of the small translation units
  • may further include, between steps 3010 and 3020 of FIG. 7C, a step 3012 of dividing the original text into small translation units and determining the translation order of the small translation units, and
  • a step 3018 of displaying the determined translation order to the user.
  • Step 3012 of dividing the original sentence into small translation units and determining the translation order of the small translation units may include a step 3014 of determining them by searching the translation order pattern DB and a step 3016 of determining them in accordance with the division word order rule.
  • Either or both of steps 3014 and 3016 may be performed. If the first signal or the second signal is received from the input unit after the determined translation order is displayed to the user (3018), the plurality of small translation unit division patterns 300 determined in step 3012, and the translation order numbers 5 assigned to each of them, are reset.
  • The small translation and translation sentence generation steps 2050 and 3040 may be omitted, depending on the embodiment.
  • The translation word order pattern according to the present invention has been described using English as the original language and Korean as the target language.
  • The present invention is not limited to English and Korean, and may be used, for example, between other original and target languages, including agglutinative, inflectional, and isolating languages.
  • The translation word order pattern of the present invention can be applied to translation not only between Japanese and Chinese but also between other languages, such as German and Spanish.
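
The sorting in step 3040 can be illustrated with a minimal sketch. It assumes that each small translation unit already carries its translation order number and its small translation. The class name `SmallUnit`, the function name `assemble`, and the English-to-Korean example sentence are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SmallUnit:
    source: str       # source-language text of the small translation unit
    order: int        # translation order number assigned to the unit
    translation: str  # small translation produced for the unit


def assemble(units: List[SmallUnit]) -> str:
    """Sort the small translations by translation order and join them
    into one translation sentence (compare step 3040)."""
    ordered = sorted(units, key=lambda u: u.order)
    return " ".join(u.translation for u in ordered)


# Illustrative example only: units are listed in source order,
# and the order numbers place the verb last in the Korean output.
units = [
    SmallUnit("I", 1, "나는"),
    SmallUnit("go", 3, "간다"),
    SmallUnit("to school", 2, "학교에"),
]
print(assemble(units))
```

Keeping the order number on the unit, rather than reordering the units in place, leaves the source order available for display next to the translation, as step 3050 describes.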

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a translation order pattern data structure stored in a computer-readable medium, which is used for translation in a translation apparatus for translating a source-language sentence into a target-language sentence, and which comprises: small translation unit segmentation pattern data for segmenting an entire source-language sentence, from beginning to end, into a plurality of small translation units, the small translation unit segmentation pattern data comprising one or more parts of speech in a morpheme string obtained by morphological analysis of the source-language sentence; and translation order number data, each specified for a respective one of the plurality of small translation unit segmentation pattern data.
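
One possible in-memory form of such a pattern is sketched below, assuming a sentence that has already been morphologically analyzed and part-of-speech tagged. The names `UnitPattern` and `apply_pattern`, the Penn Treebank-style tags, and the example sentence are assumptions made for illustration; the abstract does not prescribe a concrete representation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class UnitPattern:
    pos_tags: Tuple[str, ...]  # parts of speech that make up one small translation unit
    order: int                 # translation order number specified for this unit


def apply_pattern(
    tagged: List[Tuple[str, str]], pattern: List[UnitPattern]
) -> Optional[List[Tuple[str, int]]]:
    """Segment a POS-tagged sentence with `pattern`.

    Returns (unit text, order number) pairs, or None when the pattern does
    not cover the sentence exactly from beginning to end.
    """
    units = []
    i = 0
    for up in pattern:
        n = len(up.pos_tags)
        chunk = tagged[i:i + n]
        # A short or mismatching chunk means the pattern does not apply.
        if tuple(pos for _, pos in chunk) != up.pos_tags:
            return None
        units.append((" ".join(word for word, _ in chunk), up.order))
        i += n
    return units if i == len(tagged) else None


# Illustrative input: (word, part-of-speech) pairs for "I go to school".
tagged = [("I", "PRP"), ("go", "VBP"), ("to", "TO"), ("school", "NN")]
pattern = [
    UnitPattern(("PRP",), 1),
    UnitPattern(("VBP",), 3),
    UnitPattern(("TO", "NN"), 2),
]
print(apply_pattern(tagged, pattern))
```

Requiring the pattern to consume every tag reflects the abstract's condition that the segmentation covers the whole sentence from beginning to end.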
PCT/KR2016/002909 2016-03-16 2016-03-23 Data structure for determining a translation order of words included in a source-language text, program for generating a data structure, and computer-readable storage medium storing same WO2017159906A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
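KR1020160031588A KR20170107808A (ko) 2016-03-16 2016-03-16 Translation word order pattern data structure for dividing an original sentence into small translation units and determining the translation word order of the small translation units, computer-readable storage medium storing instructions for generating the same, and translation program stored in a computer-readable storage medium for performing translation using the same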
KR10-2016-0031588 2016-03-16

Publications (1)

Publication Number Publication Date
WO2017159906A1 (fr) 2017-09-21

Family

ID=59851030

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2016/002909 WO2017159906A1 (fr) 2016-03-16 2016-03-23 Data structure for determining a translation order of words included in a source-language text, program for generating a data structure, and computer-readable storage medium storing same

Country Status (2)

Country Link
KR (1) KR20170107808A (fr)
WO (1) WO2017159906A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019107623A1 (fr) * 2017-11-30 2019-06-06 주식회사 시스트란인터내셔널 Machine translation method and apparatus
KR102592630B1 (ko) * 2018-11-21 2023-10-23 한국전자통신연구원 Simultaneous interpretation system and method using a translation-unit parallel corpus
KR102181677B1 (ko) * 2018-12-18 2020-11-24 (주)아이브릭스 Method and apparatus for structuring patent claims

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020152081A1 (en) * 2001-04-16 2002-10-17 Mihoko Kitamura Apparatus and method for adding information to a machine translation dictionary
US20050060160A1 (en) * 2003-09-15 2005-03-17 Roh Yoon Hyung Hybrid automatic translation apparatus and method employing combination of rule-based method and translation pattern method, and computer-readable medium thereof
US20070150260A1 (en) * 2005-12-05 2007-06-28 Lee Ki Y Apparatus and method for automatic translation customized for documents in restrictive domain
KR20100069119A (ko) * 2008-12-16 2010-06-24 한국전자통신연구원 Syntactic parsing method and apparatus therefor
KR20120046414A (ko) * 2010-11-02 2012-05-10 에스케이플래닛 주식회사 Apparatus and method for providing intermediate translation processing results

Also Published As

Publication number Publication date
KR20170107808A (ko) 2017-09-26

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16894637

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 16894637

Country of ref document: EP

Kind code of ref document: A1