CN109657244A - A kind of English long sentence automatic segmentation method and system - Google Patents

Method and system for automatic segmentation of long English sentences

Info

Publication number
CN109657244A
Authority
CN
China
Prior art keywords
sequence
english
sentence
neural network
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811549280.6A
Other languages
Chinese (zh)
Other versions
CN109657244B (en)
Inventor
张睦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Language Network (wuhan) Information Technology Co Ltd
Original Assignee
Language Network (wuhan) Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Language Network (wuhan) Information Technology Co Ltd
Priority to CN201811549280.6A
Publication of CN109657244A
Application granted
Publication of CN109657244B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the invention provide a method and system for automatic segmentation of long English sentences. The method comprises: obtaining a long English sentence to be segmented; and inputting the long English sentence to be segmented into a trained neural network model based on the sequence-to-sequence framework, which outputs two short English sentences. By using a sequence-to-sequence neural network model for pattern recognition, the method and system automatically split a long English sentence into two short sentences, greatly reducing the human labor required.

Description

Method and system for automatic segmentation of long English sentences
Technical field
Embodiments of the present invention relate to the field of translation technology, and more particularly to a method and system for automatic segmentation of long English sentences.
Background art
After an English translator from a native Chinese-speaking country completes a Chinese-to-English translation, translation companies often invite language experts from native English-speaking countries to review and revise the translator's work in order to further ensure translation quality. Comparing a batch of translators' translations with the versions revised by the experts shows that, apart from simple corrections of grammar, spelling, and editing errors, the most common type of revision made by the foreign experts is to replace one long English sentence with two short sentences that preserve the original meaning and are logically coherent.
Because of the linguistic differences between Chinese and English, when a concise Chinese text is translated into English, conveying the complete information generally requires a longer English text. The translations revised by foreign experts are more reasonably organized linguistically, and this style of writing gives readers a better reading experience. However, the labor cost of inviting foreign experts to perform review and revision is very high, which makes it a considerable waste of human resources.
Therefore, a method for automatic segmentation of long English sentences is needed to solve the above problems.
Summary of the invention
To solve the above problems, embodiments of the present invention provide a method and system for automatic segmentation of long English sentences that overcome, or at least partially solve, the above problems.
In a first aspect, an embodiment of the present invention provides a method for automatic segmentation of long English sentences, comprising:
obtaining a long English sentence to be segmented;
inputting the long English sentence to be segmented into a trained neural network model based on the sequence-to-sequence framework, and outputting two short English sentences.
In a second aspect, an embodiment of the present invention provides a system for automatic segmentation of long English sentences, comprising:
an acquisition module, configured to obtain a long English sentence to be segmented;
an automatic segmentation module, configured to input the long English sentence to be segmented into a trained neural network model based on the sequence-to-sequence framework, and to output two short English sentences.
In a third aspect, an embodiment of the present invention provides an electronic device, comprising:
a processor, a memory, a communication interface, and a bus; wherein the processor, the memory, and the communication interface communicate with one another through the bus; the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the above method for automatic segmentation of long English sentences.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the above method for automatic segmentation of long English sentences.
By using a sequence-to-sequence neural network model for pattern recognition, the method and system for automatic segmentation of long English sentences provided by the embodiments of the present invention automatically split a long English sentence into two short sentences, greatly reducing the human labor required.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for automatic segmentation of long English sentences provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the encoders provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the first short-sentence decoder provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the second short-sentence decoder provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a system for automatic segmentation of long English sentences provided by an embodiment of the present invention;
Fig. 6 is a structural block diagram of an electronic device provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
At present, after an English translator from a native Chinese-speaking country completes a Chinese-to-English translation, translation companies often invite language experts from native English-speaking countries to review and revise the translator's work in order to further ensure translation quality. Comparing a batch of translators' translations with the versions revised by the experts shows that, apart from simple corrections of grammar, spelling, and editing errors, the most common type of revision made by the foreign experts is to replace one long English sentence with two short sentences that preserve the original meaning and are logically coherent. The main cause of this is not the translator's own level of translation skill, but rather the linguistic differences between Chinese and English: when a concise Chinese text is translated into English, conveying the complete information generally requires a longer English text.
Table 1: Typical translation example provided by an embodiment of the present invention
Table 1 shows a typical translation example provided by an embodiment of the present invention. As shown in Table 1, measured by string edit distance, the difference between the two translations is very small. Nevertheless, the version revised by the foreign expert is clearly more reasonably organized linguistically, and this style of writing gives readers a better reading experience. However, the labor cost of inviting foreign experts to perform review and revision is very high.
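The edit-distance comparison mentioned above can be illustrated with a minimal sketch. It assumes the standard Levenshtein distance computed over word tokens; the patent does not state which edit-distance variant or tokenization is used, and the function name `edit_distance` is hypothetical. The two example strings are the translator's translation and the two reviewed short sentences quoted later in this description.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x with y
        prev = curr
    return prev[-1]


# Example sentences taken from this description (assumed whitespace tokenization).
translator = ("What I said before does not mean denying the disciples protection "
              "but distracting Tiger using my old and weak body so you all have "
              "a chance to escape.")
reviewed = ("What I said before does not mean denying the disciples protection. "
            "Rather, I'm willing to use my old and frail body to distract Tiger "
            "so you all have a chance to escape.")

if __name__ == "__main__":
    distance = edit_distance(translator.split(), reviewed.split())
    print("word-level edit distance:", distance)
```

A small distance relative to the sentence length is consistent with the observation that the reviewer mostly restructures the sentence rather than rewriting it.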
To address the above problems, Fig. 1 is a schematic flowchart of a method for automatic segmentation of long English sentences provided by an embodiment of the present invention. As shown in Fig. 1, the method comprises:
101. Obtain a long English sentence to be segmented;
102. Input the long English sentence to be segmented into a trained neural network model based on the sequence-to-sequence framework, and output two short English sentences.
It can be understood that, to achieve automatic segmentation of long English sentences, the embodiment of the present invention provides a trained sequence-to-sequence neural network model that automatically segments any long English sentence requiring segmentation. Inputting the long English sentence to be segmented into the trained sequence-to-sequence neural network model completes the segmentation automatically. The output is two short English sentences, referred to in the embodiments of the present invention as the first short sentence and the second short sentence.
Specifically, in step 101, the embodiment of the present invention obtains one or more long English sentences to be segmented. It should be noted that the embodiment places no restriction on the specific length or type of the long English sentence.
Then, in step 102, the long English sentence to be segmented is input into the trained sequence-to-sequence neural network model. This model is trained in advance by the embodiment of the present invention and can automatically segment a long sentence into short sentences. The training process requires accumulating historical source texts, translators' translations, and reviewed translations as a training set, each labeled accordingly, so that the neural network model can learn the writing style of the reviewed translations and thereby complete the final segmentation.
By using a sequence-to-sequence neural network model for pattern recognition, the method and system for automatic segmentation of long English sentences provided by the embodiments of the present invention automatically split a long English sentence into two short sentences, greatly reducing the human labor required.
On the basis of the above embodiment, before the long English sentence to be segmented is input into the trained sequence-to-sequence neural network model and two short English sentences are output, the method further comprises:
obtaining a corpus data set, the corpus data set comprising source texts, translators' translations, and reviewed translations;
training a preset sequence-to-sequence neural network model using the corpus data set as a training sample set, to obtain the trained sequence-to-sequence neural network model.
As described in the above embodiment, the embodiment of the present invention provides a trained sequence-to-sequence neural network model; this model must be trained on a training sample set before it can perform automatic segmentation.
In the embodiment of the present invention, the source texts, translators' translations, and reviewed translations serve as the corpus data set and, in one-to-one correspondence, as the training set for the sequence-to-sequence neural network model. It should be noted that the selected corpus data set consists of historical translation data. In addition, during training, the latest Chinese and English monolingual Wikipedia corpora are downloaded and word-segmented. The Skip-Gram algorithm is then used to train Chinese and English word vectors separately, where the main training hyperparameters may preferably be set as follows: a word vector dimension of 300 and a context window of 5.
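To make the role of the context window concrete, the following minimal sketch shows how Skip-Gram forms (center word, context word) training pairs from a tokenized sentence. It is an illustration under assumptions, not the patent's training code: the patent names no toolkit, the function name `skipgram_pairs` and the constants are hypothetical, and a fixed symmetric window is assumed. Learning the 300-dimensional vectors from these pairs is not shown.

```python
# Hyperparameters stated in the description; the constant names are illustrative.
EMBEDDING_DIM = 300   # dimension of each word vector
CONTEXT_WINDOW = 5    # number of words considered on each side of the center word


def skipgram_pairs(tokens, window=CONTEXT_WINDOW):
    """Return the (center, context) pairs that Skip-Gram would be trained on."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


if __name__ == "__main__":
    sentence = "what i said before does not mean denying the disciples protection"
    for center, context in skipgram_pairs(sentence.split())[:5]:
        print(center, "->", context)
```

Each pair asks the model to predict the context word from the center word, so words that occur in similar contexts end up with similar vectors.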
Finally, based on the training sample set and the pre-trained Chinese and English word vectors, training of the sequence-to-sequence neural network model can be completed.
On the basis of the above embodiment, before the preset sequence-to-sequence neural network model is trained using the corpus data set as the training sample set, the method further comprises:
preprocessing the text in the corpus data set by word segmentation and sentence splitting.
As described in the above embodiment, the embodiment of the present invention accumulates source texts, translators' translations, and reviewed translations as the corpus data set, and then preprocesses the text in the corpus data set by word segmentation and sentence splitting.
Specifically, N triples of the form (source sentence, translator's translation sentence, (first reviewed translation sentence, second reviewed translation sentence)) are selected from the data for model training and validation testing.
The data set is denoted $D = \{D_1, D_2, D_3, \ldots, D_N\}$, where $D_i = (\mathrm{SRC}_i, \mathrm{TRAS}_i, (\mathrm{REVIEW}_{i1}, \mathrm{REVIEW}_{i2}))$. 20% of $D$ is randomly extracted as the validation test set $D_{test}$, and the remaining 80% is used as the training set $D_{train}$.
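The triple structure and the random 80/20 split can be sketched as follows. This is a minimal illustration under assumptions: the names `Sample` and `split_dataset` are hypothetical, and the fixed random seed is added only to make the example reproducible; the patent specifies only that 20% is drawn at random.

```python
import random
from typing import List, NamedTuple, Sequence, Tuple


class Sample(NamedTuple):
    """One element D_i of the data set D."""
    src: str                 # SRC_i: source sentence
    tras: str                # TRAS_i: translator's translation sentence
    review: Tuple[str, str]  # (REVIEW_i1, REVIEW_i2): the two reviewed short sentences


def split_dataset(data: Sequence[Sample], test_ratio: float = 0.2,
                  seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Randomly split D into (D_train, D_test) without modifying the input."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    # Placeholder strings stand in for real corpus sentences.
    data = [Sample(f"src {i}", f"tras {i}", (f"review {i}.1", f"review {i}.2"))
            for i in range(10)]
    d_train, d_test = split_dataset(data)
    print(len(d_train), "training samples,", len(d_test), "validation test samples")
```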
On the basis of the above embodiment, the sequence-to-sequence neural network model comprises:
a source-text encoder, a translation encoder, a first short-sentence decoder, and a second short-sentence decoder.
Training the preset sequence-to-sequence neural network model using the corpus data set as the training sample set comprises:
based on the source-text encoder and the translation encoder, combining the source-text vector and the translation vector of the training sample set into a first vector;
based on the first short-sentence decoder and the first vector, generating the first short sentence and a second vector;
based on the second short-sentence decoder and the second vector, generating the second short sentence.
The sequence-to-sequence neural network model provided by the embodiment of the present invention mainly comprises four components: a source-text encoder, a translation encoder, a first short-sentence decoder, and a second short-sentence decoder.
Fig. 2 is a schematic structural diagram of the encoders provided by an embodiment of the present invention. As shown in Fig. 2, the source-text encoder and the translation encoder use an LSTM recurrent neural network to encode the source text and the translation into a source-text vector $C_{src}$ and a translation vector $C_{trans}$, respectively, which are concatenated into a new vector $C$, i.e., the first vector in the embodiment of the present invention.
For example, the Chinese source text (which means roughly "What I said before was not that I do not wish to preserve our lineage, but that I am willing to use this old and frail body of mine to hold off the tiger and create a chance for you to escape.") and the translator's translation "What I said before does not mean denying the disciples protection but distracting Tiger using my old and weak body so You all have a chance to escape." are each encoded into a vector by the corresponding encoder, and the two vectors are concatenated into a new vector $C$.
Next, Fig. 3 is a schematic structural diagram of the first short-sentence decoder provided by an embodiment of the present invention. As shown in Fig. 3, word vectors are taken as the input of the first decoder and combined with the first vector, so that the first short sentence can be generated and a new vector $C_{review1}$ obtained, i.e., the second vector in the embodiment of the present invention.
For example, the first short-sentence decoder can generate the first short sentence: "What I said before does not mean denying the disciples protection."
Finally, Fig. 4 is a schematic structural diagram of the second short-sentence decoder provided by an embodiment of the present invention. As shown in Fig. 4, the vector $C$ and the vector $C_{review1}$ are combined as the input of the second short-sentence decoder to generate the second short sentence.
For example, using the vector $C$ produced by the encoders and the vector $C_{review1}$ produced by the decoder that generates the first short sentence, the second decoder generates the second short sentence: "Rather, I'm willing to use my old and frail body to distract Tiger so you all have a chance to escape."
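The data flow through the four components described above can be sketched as follows. This shows only the wiring, under stated assumptions: the encoders and decoders are passed in as plain callables, and the `toy_*` functions are hypothetical stand-ins rather than LSTM networks; the second decoder receives the concatenation of the first vector and the second vector, following the description of Fig. 4.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def concat(u: Sequence[float], v: Sequence[float]) -> Vector:
    """Concatenate two vectors into one."""
    return list(u) + list(v)


def split_long_sentence(
    src_tokens: Sequence[str],
    trans_tokens: Sequence[str],
    encode_src: Callable[[Sequence[str]], Vector],
    encode_trans: Callable[[Sequence[str]], Vector],
    decode_first: Callable[[Vector], Tuple[str, Vector]],
    decode_second: Callable[[Vector], str],
) -> Tuple[str, str]:
    c_src = encode_src(src_tokens)        # source-text vector
    c_trans = encode_trans(trans_tokens)  # translation vector
    c = concat(c_src, c_trans)            # first vector C
    first, c_review1 = decode_first(c)    # first short sentence and second vector
    second = decode_second(concat(c, c_review1))  # second short sentence
    return first, second


# Toy stand-ins (assumptions for illustration only, not trained networks).
def toy_encoder(tokens: Sequence[str]) -> Vector:
    return [float(len(tokens))]


def toy_first_decoder(c: Vector) -> Tuple[str, Vector]:
    return "first short sentence", [sum(c)]


def toy_second_decoder(v: Vector) -> str:
    return "second short sentence"


if __name__ == "__main__":
    print(split_long_sentence(["源", "文"], ["a", "long", "translation"],
                              toy_encoder, toy_encoder,
                              toy_first_decoder, toy_second_decoder))
```

Passing the components as callables keeps the sketch independent of any deep learning library; in a real implementation each callable would wrap a trained network.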
Fig. 5 is a schematic structural diagram of a system for automatic segmentation of long English sentences provided by an embodiment of the present invention. As shown in Fig. 5, the system comprises an acquisition module 501 and an automatic segmentation module 502, wherein:
the acquisition module 501 is configured to obtain a long English sentence to be segmented;
the automatic segmentation module 502 is configured to input the long English sentence to be segmented into a trained neural network model based on the sequence-to-sequence framework, and to output two short English sentences.
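One possible way to organize the two modules in software is sketched below. The class and method names are hypothetical, and `placeholder_model` merely stands in for the trained sequence-to-sequence model by cutting the sentence at its middle word; it is not the segmentation strategy of the patent.

```python
from typing import Callable, Iterable, List, Tuple

# A model maps one long sentence to (first short sentence, second short sentence).
Model = Callable[[str], Tuple[str, str]]


class AcquisitionModule:
    """Corresponds to module 501: obtains the long sentences to be segmented."""

    def __init__(self, sentences: Iterable[str]) -> None:
        self._sentences = list(sentences)

    def get(self) -> List[str]:
        # Drop empty entries and surrounding whitespace.
        return [s.strip() for s in self._sentences if s.strip()]


class AutomaticSegmentationModule:
    """Corresponds to module 502: feeds each sentence to the trained model."""

    def __init__(self, model: Model) -> None:
        self._model = model

    def segment(self, long_sentence: str) -> Tuple[str, str]:
        first, second = self._model(long_sentence)
        return first, second


def placeholder_model(sentence: str) -> Tuple[str, str]:
    """Stand-in for the trained model (assumption): split at the middle word."""
    words = sentence.split()
    mid = len(words) // 2
    return " ".join(words[:mid]), " ".join(words[mid:])


if __name__ == "__main__":
    acquisition = AcquisitionModule(["  one two three four  ", ""])
    segmenter = AutomaticSegmentationModule(placeholder_model)
    for sentence in acquisition.get():
        print(segmenter.segment(sentence))
```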
For how the acquisition module 501 and the automatic segmentation module 502 perform automatic segmentation of long English sentences, refer to the technical solution of the method embodiment shown in Fig. 1; the implementation principles and technical effects are similar and are not repeated here.
By using a sequence-to-sequence neural network model for pattern recognition, the method and system for automatic segmentation of long English sentences provided by the embodiments of the present invention automatically split a long English sentence into two short sentences, greatly reducing the human labor required.
On the basis of the above embodiment, the system further comprises:
a training module, configured to obtain a corpus data set, the corpus data set comprising source texts, translators' translations, and reviewed translations;
and to train a preset sequence-to-sequence neural network model using the corpus data set as a training sample set, to obtain the trained sequence-to-sequence neural network model.
On the basis of the above embodiment, the system further comprises:
a preprocessing module, configured to preprocess the text in the corpus data set by word segmentation and sentence splitting.
On the basis of the above embodiment, the sequence-to-sequence neural network model comprises:
a source-text encoder, a translation encoder, a first short-sentence decoder, and a second short-sentence decoder.
On the basis of the above embodiment, the training module comprises:
an encoding unit, configured to combine, based on the source-text encoder and the translation encoder, the source-text vector and the translation vector of the training sample set into a first vector;
a first decoding unit, configured to generate, based on the first short-sentence decoder and the first vector, the first short sentence and a second vector;
a second decoding unit, configured to generate, based on the second short-sentence decoder and the second vector, the second short sentence.
An embodiment of the present invention provides an electronic device, comprising: at least one processor; and at least one memory communicatively connected to the processor, wherein:
Fig. 6 is a structural block diagram of an electronic device provided by an embodiment of the present invention. Referring to Fig. 6, the electronic device comprises a processor 601, a communication interface 602, a memory 603, and a bus 604, wherein the processor 601, the communication interface 602, and the memory 603 communicate with one another through the bus 604. The processor 601 can invoke logic instructions in the memory 603 to perform the following method: obtaining a long English sentence to be segmented; inputting the long English sentence to be segmented into a trained neural network model based on the sequence-to-sequence framework, and outputting two short English sentences.
An embodiment of the present invention discloses a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the methods provided by the above method embodiments, for example: obtaining a long English sentence to be segmented; inputting the long English sentence to be segmented into a trained neural network model based on the sequence-to-sequence framework, and outputting two short English sentences.
An embodiment of the present invention provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, for example: obtaining a long English sentence to be segmented; inputting the long English sentence to be segmented into a trained neural network model based on the sequence-to-sequence framework, and outputting two short English sentences.
From the description of the above embodiments, those skilled in the art can clearly understand that each embodiment may be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the essence of the above technical solutions, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in each embodiment or in certain parts of an embodiment.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A method for automatic segmentation of long English sentences, characterized by comprising:
obtaining a long English sentence to be segmented;
inputting the long English sentence to be segmented into a trained neural network model based on the sequence-to-sequence framework, and outputting two short English sentences.
2. The method according to claim 1, characterized in that, before the long English sentence to be segmented is input into the trained sequence-to-sequence neural network model and two short English sentences are output, the method further comprises:
obtaining a corpus data set, the corpus data set comprising source texts, translators' translations, and reviewed translations;
training a preset sequence-to-sequence neural network model using the corpus data set as a training sample set, to obtain the trained sequence-to-sequence neural network model.
3. The method according to claim 2, characterized in that, before the preset sequence-to-sequence neural network model is trained using the corpus data set as the training sample set, the method further comprises:
preprocessing the text in the corpus data set by word segmentation and sentence splitting.
4. The method according to claim 2, characterized in that the sequence-to-sequence neural network model comprises:
a source-text encoder, a translation encoder, a first short-sentence decoder, and a second short-sentence decoder.
5. The method according to claim 4, characterized in that training the preset sequence-to-sequence neural network model using the corpus data set as the training sample set comprises:
based on the source-text encoder and the translation encoder, combining the source-text vector and the translation vector of the training sample set into a first vector;
based on the first short-sentence decoder and the first vector, generating the first short sentence and a second vector;
based on the second short-sentence decoder and the second vector, generating the second short sentence.
6. A system for automatic segmentation of long English sentences, characterized by comprising:
an acquisition module, configured to obtain a long English sentence to be segmented;
an automatic segmentation module, configured to input the long English sentence to be segmented into a trained neural network model based on the sequence-to-sequence framework, and to output two short English sentences.
7. An electronic device, characterized by comprising a memory and a processor, wherein the processor and the memory communicate with each other through a bus; the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the method according to any one of claims 1 to 5.
8. A non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer instructions that cause a computer to perform the method according to any one of claims 1 to 5.
CN201811549280.6A 2018-12-18 2018-12-18 English long sentence automatic segmentation method and system Active CN109657244B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811549280.6A CN109657244B (en) 2018-12-18 2018-12-18 English long sentence automatic segmentation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811549280.6A CN109657244B (en) 2018-12-18 2018-12-18 English long sentence automatic segmentation method and system

Publications (2)

Publication Number Publication Date
CN109657244A true CN109657244A (en) 2019-04-19
CN109657244B CN109657244B (en) 2023-04-18

Family

ID=66114558

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811549280.6A Active CN109657244B (en) 2018-12-18 2018-12-18 English long sentence automatic segmentation method and system

Country Status (1)

Country Link
CN (1) CN109657244B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353281A (en) * 2020-02-24 2020-06-30 百度在线网络技术(北京)有限公司 Text conversion method and device, electronic equipment and storage medium
CN112506949A (en) * 2020-12-03 2021-03-16 北京百度网讯科技有限公司 Method and device for generating query statement of structured query language and storage medium

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070282594A1 (en) * 2006-06-02 2007-12-06 Microsoft Corporation Machine translation in natural language application development
CN101458681A (en) * 2007-12-10 2009-06-17 株式会社东芝 Voice translation method and voice translation apparatus
CN106527756A (en) * 2016-10-26 2017-03-22 长沙军鸽软件有限公司 Method and device for intelligently correcting input information
WO2018097022A1 (en) * 2016-11-24 2018-05-31 国立研究開発法人情報通信研究機構 Automatic translation pattern learning device, automatic translation preprocessing device, and computer program
CN106919646A (en) * 2017-01-18 2017-07-04 南京云思创智信息科技有限公司 Chinese text summarization generation system and method
WO2018177334A1 (en) * 2017-03-30 2018-10-04 华为技术有限公司 Content explanation method and device
CN107590132A (en) * 2017-10-17 2018-01-16 语联网(武汉)信息技术有限公司 Method for automatically correcting segment words by judging English part of speech
CN107797995A (en) * 2017-11-20 2018-03-13 语联网(武汉)信息技术有限公司 Chinese and English fragment corpus generation method
CN108491372A (en) * 2018-01-31 2018-09-04 华南理工大学 Chinese word segmentation method based on a seq2seq model
CN108647207A (en) * 2018-05-08 2018-10-12 上海携程国际旅行社有限公司 Natural language modification method, system, equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
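任众; 侯宏旭; 武静; 王洪彬; 李金廷; 樊文婷; 申志鹏: "Research on Mongolian-Chinese machine translation based on statistics and neural networks" *
包乌格德勒; 赵小兵: "Research on Mongolian-Chinese neural machine translation based on RNN and CNN" *
王蕾; 谢云; 周俊生; 顾彦慧; 曲维光: "Segment-level Chinese named entity recognition based on neural networks" *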

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353281A (en) * 2020-02-24 2020-06-30 百度在线网络技术(北京)有限公司 Text conversion method and device, electronic equipment and storage medium
CN112506949A (en) * 2020-12-03 2021-03-16 北京百度网讯科技有限公司 Method and device for generating query statement of structured query language and storage medium
CN112506949B (en) * 2020-12-03 2023-07-25 北京百度网讯科技有限公司 Method, device and storage medium for generating structured query language query statement

Also Published As

Publication number Publication date
CN109657244B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
US11238232B2 (en) Written-modality prosody subsystem in a natural language understanding (NLU) framework
WO2018028077A1 (en) Deep learning based method and device for chinese semantics analysis
CN108549637A (en) Method for recognizing semantics, device based on phonetic and interactive system
CN105243055B (en) Multilingual-based word segmentation method and device
CN109670180B (en) Method and device for translating individual characteristics of vectorized translator
CN106897439A (en) Text emotion identification method, device, server and storage medium
CN109871955A (en) Aviation safety accident causality extraction method
CN107301170A (en) Method and apparatus for segmenting sentences based on artificial intelligence
US11157707B2 (en) Natural language response improvement in machine assisted agents
CN111832318B (en) Single sentence natural language processing method and device, computer equipment and readable storage medium
KR101948257B1 (en) Multi-classification device and method using lsp
CN110532575A (en) Text interpretation method and device
CN112560510A (en) Translation model training method, device, equipment and storage medium
CN111339772B (en) Russian text emotion analysis method, electronic device and storage medium
CN112541070A (en) Method and device for excavating slot position updating corpus, electronic equipment and storage medium
CN112597307A (en) Extraction method, device and equipment of figure action related data and storage medium
CN110309513B (en) Text dependency analysis method and device
CN109657244A (en) English long sentence automatic segmentation method and system
JP5911931B2 (en) Predicate term structure extraction device, method, program, and computer-readable recording medium
CN110851572A (en) Session labeling method and device, storage medium and electronic equipment
CN115017271A (en) Method and system for intelligently generating RPA flow component block
JP2014215920A (en) Case analysis model parameter learning apparatus, case analyzer, method and program
CN114861628A (en) System, method, electronic device and storage medium for training machine translation model
CN109934347A (en) Device for extending a question-and-answer knowledge base
CN113283218A (en) Semantic text compression method and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant