CN115796194A - English translation system based on machine learning

Info

Publication number: CN115796194A
Application number: CN202211439914.9A
Authority: CN
Inventors: 张芳舟
Original assignee: Jilin Agricultural Science and Technology College
Current assignee: Jilin Agricultural Science and Technology College
Priority date: 2022-11-17
Filing date: 2022-11-17
Publication date: 2023-03-14

Abstract

The invention relates to the field of translation and discloses an English translation system based on machine learning, comprising: a database for storing a translation set, the translation set comprising a mapping set of English-standard language-common language; a voice receiving module for receiving the user's speech to be processed and performing noise reduction and removal of extraneous sounds on it; a translation module for translating between English and the standard language; a selection module for selecting, from the database, the common language corresponding to the standard Chinese; an output module for outputting the common language; and a learning module that receives the user's speech, learns the user's common language habits, establishes a correspondence with the standard language, and stores the correspondence in the database.

Description

English translation system based on machine learning

Technical Field

The invention relates to the field of translation, in particular to an English translation system based on machine learning.

Background

Machine translation, also known as automatic translation, is the process of using a computer to convert one natural language (the source language) into another natural language (the target language). It is a branch of computational linguistics, one of the ultimate goals of artificial intelligence, and has important scientific research value.

Machine translation also has important practical value. With the rapid development of economic globalization and the internet, machine translation technology plays an increasingly important role in promoting political, economic, and cultural exchange.

Many kinds of machine translation systems exist, and English-to-Chinese systems are numerous, but their output is stiff: they cannot translate according to a user's habitual speech and manner of speaking, and so cannot meet practical translation needs.

Disclosure of Invention

The invention provides an English translation system based on machine learning, which comprises:

a database for storing a translation set, the translation set comprising a mapping set of English-standard language-common language;

a voice receiving module for receiving the user's speech to be processed and performing noise reduction and removal of extraneous sounds on it;

a translation module for translating between English and the standard language;

a selection module for selecting, from the database, the common language corresponding to the standard Chinese;

an output module for outputting the common language;

a learning module that receives the user's speech, learns the user's common language habits, establishes a correspondence with the standard language, and stores the correspondence in the database.
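As an illustration only, the English-standard language-common language mapping set could be held in a structure like the following. This is a minimal sketch under stated assumptions: the class name `TranslationSet`, its methods, and the sample words are hypothetical and do not come from the disclosure, which does not specify how the database is organized.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TranslationSet:
    """Hypothetical in-memory stand-in for the database's translation set."""

    # English phrase -> standard-language phrase
    en_to_std: Dict[str, str] = field(default_factory=dict)
    # standard-language phrase -> the user's common (habitual) phrase
    std_to_common: Dict[str, str] = field(default_factory=dict)

    def add(self, english: str, standard: str, common: str) -> None:
        """Store one English / standard language / common language triple."""
        self.en_to_std[english] = standard
        self.std_to_common[standard] = common

    def select_common(self, standard: str) -> str:
        """Selection step: fall back to the standard phrase if no habit is stored."""
        return self.std_to_common.get(standard, standard)

    def translate(self, english: str) -> Optional[str]:
        """Translate English to the standard language, then pick the common wording."""
        standard = self.en_to_std.get(english)
        if standard is None:
            return None
        return self.select_common(standard)


# Illustrative data: a regional word for "potato" replaces the standard term.
ts = TranslationSet()
ts.add("potato", "马铃薯", "洋芋")
print(ts.translate("potato"))
```

Keeping the two mappings separate mirrors the split between the translation module (English to standard language) and the selection module (standard language to common language).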

Further, the method by which the translation module translates Chinese into English comprises the following steps:

S1: acquiring voice data information and segmenting the data to obtain a segmented data information table;

S2: acquiring the input data information table, and retrieving the corresponding data sets in the database according to keywords;

S3: extracting the corresponding data from the data sets and combining it according to rules to obtain standard Chinese sentences;

S4: screening the obtained standard Chinese sentences and selecting the one with the highest probability;

S5: outputting the selected standard Chinese sentence to the selection module.
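Steps S1 to S5 can be sketched as a small pipeline. This is a simplified sketch under stated assumptions, not the patented implementation: the `DATABASE` contents, the per-candidate probabilities, the whitespace segmentation, and the rule of scoring a sentence by the product of its candidates' probabilities are all assumptions introduced for illustration.

```python
from itertools import product
from math import prod

# Hypothetical database: keyword -> list of (standard-language phrase, probability).
DATABASE = {
    "i": [("我", 1.0)],
    "love": [("爱", 0.6), ("喜欢", 0.4)],
    "you": [("你", 0.9), ("您", 0.1)],
}


def segment(text):
    """S1: split the recognized text into a segmented table (naive whitespace split)."""
    return text.lower().split()


def retrieve(tokens, database):
    """S2: use each token as a keyword and fetch its candidate data set."""
    return [database[t] for t in tokens if t in database]


def combine(candidate_sets):
    """S3: combine one candidate per keyword into sentences, scored by the product."""
    if not candidate_sets:
        return []
    sentences = []
    for combo in product(*candidate_sets):
        text = "".join(phrase for phrase, _ in combo)
        score = prod(p for _, p in combo)
        sentences.append((text, score))
    return sentences


def select_best(sentences):
    """S4: keep the sentence with the highest probability."""
    return max(sentences, key=lambda s: s[1])[0] if sentences else None


def translate(text, database=DATABASE):
    """S5: return the selected sentence, ready to hand to the selection module."""
    return select_best(combine(retrieve(segment(text), database)))


print(translate("I love you"))
```

The combination rule here simply concatenates candidates in input order; the "rules" mentioned in step S3 are not specified in the disclosure and would in practice have to handle word order.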

Further, in step S1, after the voice data is acquired, it is converted into text information, and the text information is then segmented into words as follows:

S11: analyzing the language type of the text information to obtain a language type analysis result, wherein the language type analysis result comprises at least one pre-stored standard language type;

S12: dividing the text information into characters and/or words according to the language types obtained by the analysis, wherein each language type yields one group of characters and/or words;

S13: compiling each group of characters and/or words into a word information table.
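Steps S11 to S13 can be illustrated as follows. This is a minimal sketch assuming only two pre-stored language types that are told apart by script alone; the `PATTERNS` table and the function names are hypothetical, and a practical system would need a proper segmenter for Chinese words.

```python
import re

# Assumption: two pre-stored language types, distinguished by script alone.
PATTERNS = {
    "chinese": re.compile(r"[\u4e00-\u9fff]"),            # one CJK ideograph at a time
    "english": re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?"),   # whole Latin-script words
}


def analyze_language_types(text):
    """S11: report which pre-stored language types occur in the text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]


def split_by_type(text, types):
    """S12: produce one group of characters and/or words per detected type."""
    return {name: PATTERNS[name].findall(text) for name in types}


def build_word_table(groups):
    """S13: compile the groups into rows of (language type, token)."""
    return [(name, token) for name, tokens in groups.items() for token in tokens]


text = "我想 order pizza"
types = analyze_language_types(text)
table = build_word_table(split_by_type(text, types))
print(table)
```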

Further, in step S2, a keyword is extracted from each item of word information, and the corresponding data set is retrieved from the database according to the keyword, wherein the word information table is weighted: the more frequently a keyword occurs in the word information table, the higher its weight.

Further, in step S2, the data set consists of corresponding English and Chinese words and phrases.
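The frequency-based weighting can be sketched as below. The disclosure only states that a higher frequency gives a higher weight; using the relative frequency as the weight is an assumption made here for illustration, and `keyword_weights` is a hypothetical name.

```python
from collections import Counter


def keyword_weights(word_table):
    """Weight each keyword by its relative frequency in the word information table."""
    counts = Counter(word_table)
    total = sum(counts.values())
    # Assumption: weight = occurrences / total entries, so weights sum to 1.
    return {word: count / total for word, count in counts.items()}


weights = keyword_weights(["tea", "milk", "tea", "tea"])
# Keywords can then be looked up in descending order of weight.
ranked = sorted(weights, key=weights.get, reverse=True)
print(ranked)
```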

Further, the learning module comprises a selection unit and a learning unit, wherein the selection unit is used for selecting the region to which the user's speech habits belong and then downloading a corpus for that region from the master server to the database, the corpus comprising a mapping set of English-standard language-common language;

the learning unit is used for learning the user's common language habits, establishing a correspondence with the standard language, and storing it in and updating the database.

Further, the learning method of the learning unit comprises the following steps:

S101: receiving language data input by the user, analyzing and processing it, and searching the database for the same data; if it exists, executing S102, and if not, executing S104;

S102: extracting the data set in the database that contains the language data, and selecting and extracting the corresponding standard language data from that data set;

S103: inputting the standard language data into the translation module;

S104: segmenting the language data into words according to the language rules of the user's region to obtain an analysis result, wherein the analysis result comprises a plurality of word segmentation data tables produced according to those rules;

S104: finding the corresponding standard Chinese word segments according to the obtained word segmentation data tables, and recombining them into sentences according to the correspondence between the language rules of the user's region and the rules of standard Chinese;

S105: selecting the standard Chinese sentence with the highest probability, establishing a correspondence between the language data and that sentence, and storing both in the database;

S106: inputting the standard Chinese sentence into the translation module.
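The branching between a database hit and a newly learned expression can be sketched as follows. This is a minimal sketch under stated assumptions: the function `learn`, the `regional_lexicon` structure with per-word probabilities, the whitespace segmentation, and the sample dialect words are hypothetical, and choosing the most probable standard word for each token stands in for the sentence-level selection of step S105.

```python
def learn(utterance, database, regional_lexicon, send_to_translator):
    """Return the standard-language sentence for `utterance`, learning it if new."""
    # S101: search the database for the same data.
    if utterance in database:
        # S102: extract the stored standard-language data.
        standard = database[utterance]
    else:
        # S104 (segmentation): a whitespace split stands in for the regional rules.
        tokens = utterance.split()
        # S104 (recombination) and S105: for each token take the most probable
        # standard word; tokens missing from the lexicon pass through unchanged.
        words = []
        for token in tokens:
            candidates = regional_lexicon.get(token, [(token, 1.0)])
            words.append(max(candidates, key=lambda c: c[1])[0])
        standard = "".join(words)
        # S105: store the new correspondence in the database.
        database[utterance] = standard
    # S103 / S106: hand the standard sentence to the translation module.
    send_to_translator(standard)
    return standard


# Illustrative regional lexicon: dialect word -> [(standard word, probability)].
lexicon = {"俺": [("我", 0.9)], "稀罕": [("喜欢", 0.8), ("稀奇", 0.2)]}
db, sent = {}, []
print(learn("俺 稀罕 你", db, lexicon, sent.append))
```

After the first call the pair is stored, so a repeated utterance follows the S102 and S103 branch without being segmented again.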

Further, in step S101, data is retrieved from the database by keyword search.

Further, in step S105, the standard Chinese sentence with the highest probability is selected according to the semantics and intonation of the language data input by the user.

Further, the output module comprises a voice playing module, and the voice playing module is used for playing the translated speech.

The beneficial effects of the invention are as follows: the English translation system based on machine learning learns from the user's speech habits, so that the translated language is semantically accurate and vividly expressed.

Drawings

FIG. 1 is a block diagram of the English translation system based on machine learning according to the present invention;

FIG. 2 is a schematic flowchart of a method for translating Chinese into English by a translation module in an English translation system based on machine learning according to the present invention;

FIG. 3 is a schematic flowchart of a learning method of a learning unit in an English translation system based on machine learning according to the present invention.

Detailed Description

The subject matter described herein will now be discussed with reference to example embodiments. It is to be understood that these embodiments are discussed to enable those skilled in the art to better understand and thereby implement the subject matter described herein. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as necessary. In addition, features described with respect to some examples may also be combined in other examples.

Example 1

Referring to FIG. 1, this embodiment provides an English translation system based on machine learning, including:

a translation module for translating between English and the standard language;

an output module for outputting the common language.

Example 2

Referring to FIG. 2, in this embodiment, the method by which the translation module translates Chinese into English includes the following steps:

In step S1, after the voice data is acquired, it is converted into text information, and the text information is then segmented into words as follows:

S13: each group of characters and/or words is compiled into a word information table.

In step S2, a keyword is extracted from each item of word information, and the corresponding data set is retrieved from the database according to the keyword, wherein the word information table is weighted: the more frequently a keyword occurs in the word information table, the higher its weight.

In step S2, the data set consists of corresponding English and Chinese words and phrases.

Example 3

Referring to FIG. 3, in this embodiment, the learning module includes a selection unit and a learning unit, wherein the selection unit is used for selecting the region to which the user's speech habits belong and then downloading a corpus for that region from the master server to the database, the corpus comprising a mapping set of English-standard language-common language;

the learning unit is used for learning the user's common language habits, establishing a correspondence with the standard language, and storing it in and updating the database.

The learning method of the learning unit comprises the following steps:

S103: inputting the standard language data into the translation module;

S106: inputting the standard Chinese sentence into the translation module.

In step S101, data is retrieved from the database by keyword search.

In step S105, the standard Chinese sentence with the highest probability is selected according to the semantics and intonation of the language data input by the user.

The output module comprises a voice playing module, and the voice playing module is used for playing the translated speech.

The English translation system based on machine learning provided by the invention learns from the user's speech habits, so that the translated language is semantically accurate and vividly expressed.

The embodiments of the present invention have been described with reference to the drawings, but the present invention is not limited to the specific embodiments described above, which are illustrative rather than restrictive. Those skilled in the art may devise many variations without departing from the spirit of the present invention and the scope of protection of the claims.

Claims

1. An English translation system based on machine learning, comprising:

a translation module for translating between English and the standard language;

an output module for outputting the common language;

a learning module that receives the user's speech, learns the user's common language habits, establishes a correspondence with the standard language, and stores the correspondence in the database.

2. The English translation system based on machine learning according to claim 1, wherein the method by which the translation module translates Chinese into English comprises the following steps:

3. The English translation system based on machine learning according to claim 2, wherein in step S1, after the voice data is acquired, it is converted into text information, and the text information is then segmented into words as follows:

S11: analyzing the language type of the text information to obtain a language type analysis result, wherein the language type analysis result comprises at least one pre-stored standard language type;

S12: dividing the text information into characters and/or words according to the language types obtained by the analysis, wherein each language type yields one group of characters and/or words;

4. The English translation system based on machine learning according to claim 3, wherein in step S2, a keyword is extracted from each item of word information, and the corresponding data set is retrieved from the database according to the keyword, wherein the word information table is weighted: the more frequently a keyword occurs in the word information table, the higher its weight.

5. The English translation system based on machine learning according to claim 4, wherein in step S2, the data set consists of corresponding English and Chinese words and phrases.

6. The English translation system based on machine learning according to claim 5, wherein the learning module comprises a selection unit and a learning unit, the selection unit being used for selecting the region to which the user's speech habits belong and then downloading a corpus for that region from the master server to the database, the corpus comprising a mapping set of English-standard language-common language;

7. The English translation system based on machine learning according to claim 6, wherein the learning method of the learning unit comprises the following steps:

S103: inputting the standard language data into the translation module;

S104: segmenting the language data into words according to the language rules of the user's region to obtain an analysis result, wherein the analysis result comprises a plurality of word segmentation data tables produced according to those rules;

S104: finding the corresponding standard Chinese word segments according to the obtained word segmentation data tables, and recombining them into sentences according to the correspondence between the language rules of the user's region and the rules of standard Chinese;

8. The English translation system based on machine learning according to claim 7, wherein in step S101, data is retrieved from the database by keyword search.

9. The English translation system based on machine learning according to claim 8, wherein in step S105, the standard Chinese sentence with the highest probability is selected according to the semantics and intonation of the language data input by the user.

10. The English translation system based on machine learning according to claim 9, wherein the output module comprises a voice playing module, and the voice playing module is used for playing the translated speech.