CN117010417A - Statement processing method, device, computer equipment and storage medium - Google Patents

Info

Publication number
CN117010417A
Authority
CN
China
Prior art keywords
sentence
translated
sentences
domain knowledge
keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310852266.8A
Other languages
Chinese (zh)
Inventor
王星
何志威
梁添
焦文祥
涂兆鹏
杨余久
王瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202310852266.8A
Publication of CN117010417A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a sentence processing method, a sentence processing device, computer equipment and a storage medium, belongs to the technical field of computers, and can be applied to natural language processing in artificial intelligence. The method comprises the following steps: learning sample sentences and labels of the sample sentences through a large language model, and determining the relations between a sentence and its keywords, example sentences and domain knowledge; acquiring a sentence to be translated that belongs to a source language; analyzing the sentence to be translated through the large language model, based on the relations between a sentence and its keywords, example sentences and domain knowledge, to obtain the keywords, example sentences and domain knowledge of the sentence to be translated; and translating the sentence to be translated into a sentence belonging to a target language based on the keywords, the example sentences and the domain knowledge. The technical scheme enables the large language model to fully understand the sentence to be translated and to translate it on that basis, thereby improving the accuracy and efficiency of translation.

Description

Statement processing method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a sentence processing method, a sentence processing device, a computer device, and a storage medium.
Background
With the development of technology, communication between speakers of different languages has become increasingly frequent, which has created a demand for translation services. Because manual translation is time-consuming, translation services have gradually shifted from human translators to machines. How to implement machine translation is therefore an important research topic in the art.
Currently, the commonly adopted approach is neural machine translation (Neural Machine Translation, NMT). Specifically, a neural network is trained on a large bilingual parallel corpus to automatically learn the mapping from the source language to the target language, so that fluent translation results can be generated.
However, the above scheme is trained on matched sentence pairs, so neural machine translation mainly captures the statistical relationship between the source language and the target language and does not truly understand the inherent semantics of the language, which results in low translation accuracy. For example, the translated sentence may not match the meaning of the original text.
Disclosure of Invention
The embodiment of the application provides a sentence processing method, a sentence processing device, computer equipment and a storage medium, which can translate sentences to be translated on the basis of fully understanding the sentences to be translated, and improve the accuracy of translation. The technical scheme is as follows:
In one aspect, a sentence processing method is provided, the method includes:
through a large language model, learning a sample sentence and labels of the sample sentence, determining relations between the sentence and keywords, example sentences and domain knowledge, wherein the labels of the sample sentence are used for representing the keywords, example sentences and domain knowledge of the sample sentence, the keywords are used for representing core information of the corresponding sentence, the example sentences are used for representing context information of the corresponding sentence, and the domain knowledge is used for representing a theme of the corresponding sentence;
acquiring sentences to be translated belonging to source languages;
analyzing the sentence to be translated based on the relation between the sentence and the keyword, the example sentence and the domain knowledge through the large language model to obtain the keyword, the example sentence and the domain knowledge of the sentence to be translated;
and translating the sentence to be translated into a sentence belonging to a target language based on the keyword, the example sentence and the domain knowledge.
In another aspect, there is provided a sentence processing apparatus, the apparatus including:
the determining module is used for learning sample sentences and labels of the sample sentences through a large language model, and determining relations between a sentence and keywords, example sentences and domain knowledge, wherein the labels of the sample sentences are used for representing the keywords, the example sentences and the domain knowledge of the sample sentences, the keywords are used for representing core information of the corresponding sentences, the example sentences are used for representing context information of the corresponding sentences, and the domain knowledge is used for representing topics of the corresponding sentences;
the acquisition module is used for acquiring a sentence to be translated belonging to a source language;
the analysis module is used for analyzing the sentence to be translated based on the relation between the sentence and the keyword, the example sentence and the domain knowledge through the large language model to obtain the keyword, the example sentence and the domain knowledge of the sentence to be translated;
and the translation module is used for translating the sentence to be translated into a sentence belonging to a target language based on the keyword, the example sentence and the domain knowledge.
In some embodiments, the analysis module is configured to analyze, through the large language model, the sentence to be translated based on a relationship between the sentence and the keyword, to obtain the keyword of the sentence to be translated; analyzing the sentence to be translated based on the relation between the sentence and the example sentence through the large language model to obtain the example sentence of the sentence to be translated; and analyzing the sentence to be translated based on the relation between the sentence and the domain knowledge through the large language model to obtain the domain knowledge of the sentence to be translated.
In some embodiments, the translation module comprises:
the first determining unit is used for determining a keyword group based on the keywords, wherein the keyword group comprises the keywords belonging to the source language and target words belonging to the target language, and the semantics of the keywords are the same as those of the target words;
A second determining unit, configured to determine an example sentence combination based on the example sentences, where the example sentence combination includes the example sentences belonging to the source language and target example sentences belonging to the target language, and semantics of the example sentences are the same as those of the target example sentences;
and the translation unit is used for translating the sentence to be translated into a sentence belonging to the target language based on the key word group, the example sentence combination and the domain knowledge.
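The keyword group and the example sentence combination described above are essentially source/target pairs. A minimal sketch of how they might be held in memory and rendered into a translation prompt is shown here; the class names, the field names, and the prompt wording are all hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KeywordPair:
    """A source-language keyword and a target-language word with the same meaning."""
    source: str
    target: str


@dataclass
class ExamplePair:
    """A source-language example sentence and its target-language counterpart."""
    source: str
    target: str


@dataclass
class TranslationKnowledge:
    """Hypothetical container for the three kinds of knowledge."""
    keyword_group: List[KeywordPair] = field(default_factory=list)
    example_combination: List[ExamplePair] = field(default_factory=list)
    domain_knowledge: str = ""


def build_translation_prompt(sentence: str, knowledge: TranslationKnowledge,
                             source_lang: str, target_lang: str) -> str:
    """Render whatever knowledge is present, then the translation request."""
    lines = []
    if knowledge.keyword_group:
        pairs = ", ".join(f"{k.source}={k.target}" for k in knowledge.keyword_group)
        lines.append(f"Keyword pairs: {pairs}")
    for ex in knowledge.example_combination:
        lines.append(f"Example: {ex.source} => {ex.target}")
    if knowledge.domain_knowledge:
        lines.append(f"Topic: {knowledge.domain_knowledge}")
    lines.append(f"Translate from {source_lang} to {target_lang}: {sentence}")
    return "\n".join(lines)
```

Keeping the three kinds of knowledge in separate fields makes it straightforward to build a prompt from only one of them, or from all three together.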
In some embodiments, the translation unit is configured to obtain related knowledge of the sentence to be translated, where the related knowledge is used to represent content of the sentence to be translated; filtering the keyword group, the example sentence combination and the domain knowledge based on the related knowledge; and translating the sentence to be translated into a sentence belonging to the target language based on the filtered keyword group, the example sentence combination and the domain knowledge.
In some embodiments, the translation unit is configured to predict relevant knowledge of the sentence to be translated based on content of the sentence to be translated and context information of the sentence to be translated.
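The application does not state how the filtering against the related knowledge is carried out. The sketch below is one possible reading only: it assumes a simple word-overlap test as the relevance criterion, and the function names are hypothetical. Word-level splitting of this kind suits space-delimited languages; a real system would more plausibly ask the large language model or use a semantic similarity measure.

```python
import re
from typing import List, Optional, Tuple

Pair = Tuple[str, str]  # (source-language text, target-language text)


def words(text: str) -> set:
    """Lower-cased word set; a crude stand-in for a real relevance model."""
    return set(re.findall(r"\w+", text.lower()))


def is_relevant(text: str, related_knowledge: str) -> bool:
    """Assumed criterion: the text shares at least one word with the related knowledge."""
    return bool(words(text) & words(related_knowledge))


def filter_knowledge(
    keyword_group: List[Pair],
    example_combination: List[Pair],
    domain_knowledge: Optional[str],
    related_knowledge: str,
) -> Tuple[List[Pair], List[Pair], Optional[str]]:
    """Keep only the knowledge items whose source side matches the related knowledge."""
    kept_keywords = [p for p in keyword_group if is_relevant(p[0], related_knowledge)]
    kept_examples = [p for p in example_combination if is_relevant(p[0], related_knowledge)]
    kept_domain = (
        domain_knowledge
        if domain_knowledge and is_relevant(domain_knowledge, related_knowledge)
        else None
    )
    return kept_keywords, kept_examples, kept_domain
```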
In some embodiments, the translation unit is configured to translate the sentence to be translated based on the keyword group and the target language, to obtain a first sentence; translating the sentence to be translated based on the example sentence combination and the target language to obtain a second sentence; translating the sentence to be translated based on the domain knowledge and the target language to obtain a third sentence; evaluating the first sentence, the second sentence and the third sentence to obtain evaluation scores corresponding to the sentences, wherein the evaluation scores are used for representing the accuracy of translation; and taking the sentence with the highest evaluation score as a final translation result of the sentence to be translated.
In some embodiments, the translation module is further configured to translate the sentence to be translated based on the keyword group, the example sentence combination, and an integration result of the domain knowledge, to obtain a fourth sentence; and evaluating the fourth statement.
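The selection among the first, second, third and fourth sentences can be sketched as follows. This is an illustration under assumptions: the candidate names and the `evaluate` callable are hypothetical, and the scoring function is supplied by the caller rather than being the evaluation method of the application.

```python
from typing import Callable, Mapping, Tuple


def select_best_translation(
    candidates: Mapping[str, str],
    evaluate: Callable[[str], float],
) -> Tuple[str, str, float]:
    """Score every candidate translation and return (name, text, score) of the best one.

    `candidates` maps a label (for example "keywords", "examples", "topic",
    "integrated") to the sentence produced from that kind of knowledge.
    """
    if not candidates:
        raise ValueError("at least one candidate translation is required")
    scores = {name: evaluate(text) for name, text in candidates.items()}
    best = max(scores, key=scores.get)  # on a tie, the first candidate wins
    return best, candidates[best], scores[best]
```

Because the scorer is passed in, the same selection step works whether the evaluation score comes from a reference-based metric or from a learned quality estimator.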
In another aspect, a computer device is provided, where the computer device includes a processor and a memory, where the memory is configured to store at least one segment of a computer program, and the at least one segment of the computer program is loaded and executed by the processor to implement the sentence processing method in the embodiments of the present application.
In another aspect, a computer readable storage medium is provided, in which at least one segment of a computer program is stored, the at least one segment of the computer program being loaded and executed by a processor to implement a sentence processing method as in an embodiment of the present application.
In another aspect, there is provided a computer program product comprising a computer program stored in a computer readable storage medium, the computer program being read from the computer readable storage medium by a processor of a computer device, the computer program being executed by the processor to cause the computer device to perform the sentence processing method provided in each of the above aspects or in various alternative implementations of each of the aspects.
The embodiment of the application provides a sentence processing method. By learning sample sentences and the labels of the sample sentences, a large language model can learn the relations between a sentence and the keywords, example sentences and domain knowledge recorded in the labels. The large language model then analyzes the sentence to be translated, so that the keywords, example sentences and domain knowledge of the sentence to be translated can be determined based on the previously learned relations; in this way, knowledge about the sentence to be translated is obtained through prompt learning on the sample sentences. Next, drawing on these three kinds of information (keywords, example sentences and domain knowledge), the large language model can fully understand the sentence to be translated and translate it on that basis, so that the translated sentence is semantically consistent with the original sentence and the accuracy of translation is improved. In addition, the large language model has a rich language foundation, so analyzing the sentence to be translated through the large language model allows the translation result to be obtained quickly, which improves translation efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an implementation environment of a sentence processing method according to an embodiment of the present application;
FIG. 2 is a flowchart of a sentence processing method provided in accordance with an embodiment of the present application;
FIG. 3 is a flow chart of another sentence processing method provided in accordance with an embodiment of the present application;
FIG. 4 is a schematic diagram of a sentence processing method according to an embodiment of the present application;
FIG. 5 is a block diagram of a sentence processing device provided according to an embodiment of the present application;
FIG. 6 is a block diagram of another sentence processing device provided in accordance with an embodiment of the present application;
FIG. 7 is a block diagram of a terminal according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
The terms "first," "second," and the like in this disclosure are used for distinguishing between identical or similar items having substantially the same function and effect. It should be understood that there is no logical or chronological dependency among "first," "second," and "n-th," and that these terms place no limitation on quantity or order of execution.
The term "at least one" in the present application means one or more, and the meaning of "a plurality of" means two or more.
It should be noted that the information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals related to the present application are all authorized by the user or fully authorized by the parties concerned, and the collection, use, and processing of the related data must comply with the relevant laws, regulations and standards of the relevant countries and regions. For example, the sentence to be translated and the sample sentences referred to in the present application are obtained with sufficient authorization.
In order to facilitate understanding, terms related to the present application are explained below.
Artificial intelligence (Artificial Intelligence, AI): refers to theories, methods, techniques and application systems that use digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
Artificial intelligence is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, and other directions.
Natural language processing (Natural Language Processing, NLP): refers to an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. Research in this field involves natural language, that is, the language people use every day, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering by robots, knowledge graphs, and the like.
GPT (Generative Pre-trained Transformer): refers to one kind of large language model. The model uses deep learning with artificial neural networks, so that a chatbot can converse and communicate like a human and can create content.
BLEU (Bilingual Evaluation Understudy): refers to a standard metric for evaluating machine translation. The higher the BLEU value, the better the translation quality. The evaluation score in embodiments of the present application may be the BLEU value.
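To make the BLEU definition concrete, the following is a simplified sentence-level sketch: clipped n-gram precision for n = 1 to 4, combined by geometric mean and multiplied by a brevity penalty, against a single reference. It assumes whitespace tokenization and applies no smoothing, so any candidate with no matching 4-gram scores 0. It is an illustration only and not the evaluation code of the application; for real evaluation an established tool such as sacreBLEU is the usual choice.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified single-reference BLEU without smoothing."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: punish candidates that are not longer than the reference.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)
```

An exact match scores 1.0, a candidate sharing no words with the reference scores 0.0, and partial matches fall in between.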
The sentence processing method provided by the embodiment of the application can be executed by the computer equipment. In some embodiments, the computer device is a terminal or a server. In the following, an implementation environment of the sentence processing method according to the embodiment of the present application is first described by taking a computer device as an example of a server, and fig. 1 is a schematic diagram of an implementation environment of the sentence processing method according to the embodiment of the present application. Referring to fig. 1, the implementation environment includes a terminal 101 and a server 102. The terminal 101 and the server 102 can be directly or indirectly connected through wired or wireless communication, and the present application is not limited herein.
In some embodiments, terminal 101 is, but is not limited to, a smart phone, tablet, notebook, desktop, smart speaker, smart watch, smart voice-interactive device, smart home appliance, vehicle-mounted terminal, etc. The terminal 101 installs and runs an application program supporting translation. The application may be a translation-type application, a communication-type application, a conference-type application, or a multimedia-type application, to which embodiments of the present application are not limited. Illustratively, the terminal 101 is a terminal used by a user. The user may input a sentence to be translated at the terminal 101. The terminal 101 may then send the sentence to be translated to the server 102, and the sentence to be translated is translated by the server 102.
Those skilled in the art will recognize that the number of terminals may be greater or lesser. Such as the above-mentioned terminals may be only one, or the above-mentioned terminals may be several tens or hundreds, or more. The embodiment of the application does not limit the number of terminals and the equipment type.
In some embodiments, the server 102 is an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms. The server 102 is used to provide background services for applications that support translation. The server 102 may translate the sentence to be translated and then transmit the translation result to the terminal 101. The terminal 101 may display the translation result of the sentence to be translated. In some embodiments, the server 102 takes on primary computing work and the terminal 101 takes on secondary computing work; alternatively, the server 102 takes on secondary computing work and the terminal 101 takes on primary computing work; alternatively, a distributed computing architecture is used for collaborative computing between the server 102 and the terminal 101.
Fig. 2 is a flowchart of a sentence processing method according to an embodiment of the present application. Referring to fig. 2, the embodiment of the present application is described by taking execution by a server as an example. The sentence processing method comprises the following steps:
201. the server learns the sample sentences and the labels of the sample sentences through a large language model, determines the relation between the sentences and key words, example sentences and domain knowledge, wherein the labels of the sample sentences are used for representing the key words, example sentences and domain knowledge of the sample sentences, the key words are used for representing core information of the corresponding sentences, the example sentences are used for representing context information of the corresponding sentences, and the domain knowledge is used for representing the subjects of the corresponding sentences.
In embodiments of the application, the large language model (Large Language Model, LLM) may be a GPT model, PaLM (Pathways Language Model), or LLaMA (Large Language Model Meta AI), which is not limited in the embodiments of the application. Each sample sentence is annotated with a label. The label represents information such as the keywords, example sentences and domain knowledge of the sample sentence. The server may input the sample sentences and their labels to the large language model, so that the large language model can learn the relations between a sentence and its keywords, example sentences and domain knowledge. That is, the large language model can learn how to obtain the core information, context information, topic, and the like of a sentence from the sentence itself.
202. The server acquires the sentences to be translated belonging to the source language.
In the embodiment of the application, the server can acquire the sentence to be translated from another computer device such as a terminal. The embodiment of the application does not limit the way in which the server acquires the sentence to be translated. The sentence to be translated belongs to the source language. The source language may be Chinese, English, or German, which is not limited in the embodiments of the application. The execution order of step 201 and step 202 is not limited in the embodiment of the present application. That is, the server may first learn the sample sentences and their labels and then acquire the sentence to be translated in the source language; or the server may first acquire the sentence to be translated in the source language and then learn the sample sentences and their labels; or the server may acquire the sentence to be translated and learn the sample sentences and their labels at the same time.
203. The server analyzes the sentence to be translated based on the relation between the sentence and the keyword, the example sentence and the domain knowledge through the large language model to obtain the keyword, the example sentence and the domain knowledge of the sentence to be translated.
In the embodiment of the application, the server inputs the sentence to be translated into the large language model and analyzes it through the large language model. During this analysis, the server uses the relations that the large language model has previously learned between a sentence and its keywords, example sentences and domain knowledge to obtain the keywords, example sentences and domain knowledge of the sentence to be translated. The keywords are used to represent core information of the sentence to be translated. The example sentence is used to represent the context information of the sentence to be translated. Domain knowledge is used to represent the subject matter of the sentence to be translated.
204. The server translates the sentence to be translated into a sentence belonging to the target language based on the keyword, the example sentence and the domain knowledge.
In the embodiment of the application, the server can determine the core information, the context information and the theme of the sentence to be translated based on the keyword, the example sentence and the domain knowledge of the sentence to be translated through a large language model. Correspondingly, the server translates the sentence to be translated into a sentence belonging to the target language based on the core information, the context information and the theme of the sentence to be translated through the large language model. The target language is different from the source language. The embodiment of the application does not limit the target language. Then, the server may send the translated sentence of the target language to the terminal, and the terminal displays the sentence of the target language.
The embodiment of the application provides a sentence processing method. By learning sample sentences and the labels of the sample sentences, a large language model can learn the relations between a sentence and the keywords, example sentences and domain knowledge recorded in the labels. The large language model then analyzes the sentence to be translated, so that the keywords, example sentences and domain knowledge of the sentence to be translated can be determined based on the previously learned relations; in this way, knowledge about the sentence to be translated is obtained through prompt learning on the sample sentences. Next, the sentence to be translated is translated using these three kinds of information (keywords, example sentences and domain knowledge), that is, using its core information, context information and topic, so that the large language model can fully understand the sentence to be translated and translate it on that basis, thereby improving the accuracy of translation. In addition, the large language model has a rich language foundation, so analyzing the sentence to be translated through the large language model allows the translation result to be obtained quickly, which improves translation efficiency.
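Steps 201 to 204 can be sketched end to end as below. The sketch rests on several assumptions: the large language model is represented by a hypothetical callable `llm` that takes a prompt string and returns text, the learning of step 201 is realized by placing labelled samples in the prompt, and the prompt wording and function names are invented for illustration. It is not the implementation of the application.

```python
from typing import Callable, Dict, List, Tuple

LLM = Callable[[str], str]   # hypothetical interface: prompt in, completion out
Demo = Tuple[str, str]       # (sample sentence, label text)


def ask_with_demos(llm: LLM, task: str, demos: List[Demo], sentence: str) -> str:
    """Steps 201 and 203: show labelled samples, then query the new sentence."""
    parts = [task]
    for sample, label in demos:
        parts.append(f"Sentence: {sample}\nAnswer: {label}")
    parts.append(f"Sentence: {sentence}\nAnswer:")
    return llm("\n\n".join(parts)).strip()


def process_sentence(
    llm: LLM,
    demos_by_aspect: Dict[str, List[Demo]],
    sentence: str,          # step 202: the acquired sentence to be translated
    source_lang: str,
    target_lang: str,
) -> str:
    """Analyze the sentence for each aspect, then translate using the results."""
    knowledge = {
        aspect: ask_with_demos(llm, f"Give the {aspect} of the sentence.", demos, sentence)
        for aspect, demos in demos_by_aspect.items()
    }
    # Step 204: translate with the extracted knowledge placed before the request.
    context = "\n".join(f"{aspect}: {value}" for aspect, value in knowledge.items())
    prompt = f"{context}\nTranslate from {source_lang} to {target_lang}: {sentence}"
    return llm(prompt).strip()
```

With three aspects (keywords, example sentences, topic) this makes four model calls per sentence: one per aspect and one for the translation itself.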
Fig. 3 is a flowchart of another sentence processing method according to an embodiment of the present application. Referring to fig. 3, the embodiment of the present application is described by taking execution by a server as an example. The sentence processing method comprises the following steps:
301. the server learns the sample sentences and the labels of the sample sentences through a large language model, determines the relation between the sentences and key words, example sentences and domain knowledge, wherein the labels of the sample sentences are used for representing the key words, example sentences and domain knowledge of the sample sentences, the key words are used for representing core information of the corresponding sentences, the example sentences are used for representing context information of the corresponding sentences, and the domain knowledge is used for representing the subjects of the corresponding sentences.
In the embodiment of the present application, the sample sentence may be a sentence belonging to the same field as the subsequent sentence to be translated, or may be any sentence in other fields, which is not limited in the embodiment of the present application. The sample statement includes a tag. The labels include keywords (keywords), example sentences (example sentences), and domain knowledge (topics) of sample sentences. The server inputs the sample sentences and labels of the sample sentences into a large language model, and learns the relations between the sentences and the keywords, the example sentences and the domain knowledge through the large language model. The embodiment of the application does not limit the number of sample sentences. Keywords may be words or terms representing entities in a sentence, etc., and embodiments of the present application are not limited in this respect. The example sentence may be another sentence having a correlation with the sentence, and the embodiment of the present application is not limited thereto. The correlation between sentences may refer to sentences belonging to the same domain; or the same words in the sentence reach the target number; or semantic similarity of sentences, etc., which is not limiting in the embodiments of the present application. The domain knowledge may be a domain to which the background technology corresponding to the sentence belongs, which is not limited by the embodiment of the present application.
In the process of learning the relationship between a sentence and its keywords, example sentences and domain knowledge, the server can learn the three relationships separately through the large language model. Accordingly, the server acquires three kinds of sample sentences, namely a first sample sentence, a second sample sentence and a third sample sentence. The label of the first sample sentence represents the keywords of the first sample sentence. The label of the second sample sentence represents the example sentences of the second sample sentence. The label of the third sample sentence represents the domain knowledge of the third sample sentence. The server learns the first sample sentence and its label through the large language model, and determines the relationship between a sentence and its keywords. The server learns the second sample sentence and its label through the large language model, and determines the relationship between a sentence and its example sentences. The server learns the third sample sentence and its label through the large language model, and determines the relationship between a sentence and its domain knowledge. Alternatively, the server may learn the three relationships at the same time through the large language model. In that case, the label of each sample sentence records all three kinds of information, namely the keywords, the example sentences and the domain knowledge of the sample sentence. The server then learns the sample sentences and their labels through the large language model, and determines the relationships between a sentence and its keywords, example sentences and domain knowledge simultaneously.
According to the scheme provided by the embodiment of the application, prompt learning is performed on the large language model with the sample sentences and their labels, so as to determine the relationship between a sentence and its keywords, example sentences and domain knowledge. This achieves the aim of learning to mine sentence information from three angles: the word level (word-level), the sentence level (sentence-level) and the chapter level (document-level). The prompt learning in the embodiment of the present application may be regarded as in-context learning.
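The in-context learning described above can be pictured as placing labeled sample sentences in front of the new sentence as demonstrations. The following is a minimal sketch under stated assumptions: the `LabeledSample` type, the `build_few_shot_prompt` function, the field names and the prompt wording are all hypothetical and are not specified by the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LabeledSample:
    """A sample sentence with its label (hypothetical structure)."""
    sentence: str
    keywords: List[str]
    example_sentences: List[str]
    domain: str


def build_few_shot_prompt(samples: List[LabeledSample], sentence: str) -> str:
    """Format labeled samples as demonstrations, then leave the labels
    of the new sentence open for the language model to complete."""
    blocks = []
    for s in samples:
        blocks.append(
            f"Sentence: {s.sentence}\n"
            f"Keywords: {', '.join(s.keywords)}\n"
            f"Example sentences: {' | '.join(s.example_sentences)}\n"
            f"Domain knowledge: {s.domain}"
        )
    # The final block ends at "Keywords:" so the model continues from there.
    blocks.append(f"Sentence: {sentence}\nKeywords:")
    return "\n\n".join(blocks)


# Illustrative data only; not taken from the application.
demo = [
    LabeledSample(
        sentence="Company X released a new messaging app.",
        keywords=["Company X", "messaging app"],
        example_sentences=["Messaging apps let users exchange text and voice messages."],
        domain="science and technology",
    )
]
prompt = build_few_shot_prompt(demo, "Company X is an Internet company.")
print(prompt)
```

The resulting string would be sent to whatever large language model the server uses; that call is deliberately left out because the application does not name a specific model or interface.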
302. The server acquires the sentences to be translated belonging to the source language.
In the embodiment of the application, the sentence to be translated may be text input by the user through the terminal or speech input by the user through the terminal, which is not limited in the embodiment of the application. In response to a sentence input operation triggered by the user, the terminal uploads the sentence to be translated input by the user to the server. The embodiment of the application does not limit the source language to which the sentence to be translated belongs.
For example, the sentence to be translated is "X company is an Internet company that enriches the lives of Internet users through technology and helps enterprises upgrade digitally". The source language to which the sentence to be translated belongs is Chinese.
The execution order of step 301 and step 302 is not limited in the embodiment of the present application. That is, the server may perform step 301 first and then perform step 302. Specifically, the server learns the sample sentences and their labels, and then obtains the sentence to be translated in the source language. Alternatively, the server may perform step 302 first and then perform step 301. Specifically, the server first acquires the sentence to be translated in the source language, then acquires the sample sentences and learns the sample sentences and their labels. Alternatively, the server performs step 301 and step 302 simultaneously. Specifically, the server acquires the sentence to be translated in the source language while learning the sample sentences and their labels.
303. The server analyzes the sentence to be translated based on the relation between the sentence and the keyword, the example sentence and the domain knowledge through the large language model to obtain the keyword, the example sentence and the domain knowledge of the sentence to be translated.
In the embodiment of the application, after acquiring the sentence to be translated, the server inputs the sentence to be translated into the large language model, so as to mine knowledge from the sentence to be translated through the large language model. Accordingly, the server mines the information of the sentence to be translated by using the relationship, previously learned by the large language model, between a sentence and its keywords, example sentences and domain knowledge, and obtains the keywords, the example sentences and the domain knowledge of the sentence to be translated. The scheme provided by the embodiment of the application thus mines sentence information from three angles: the vocabulary level, the sentence level and the chapter level. The example sentence of the sentence to be translated may be a sentence generated by the large language model based on the sentence to be translated, or may be a sentence retrieved by the server from a corpus database, which is not limited in the embodiment of the present application. When example sentences are generated by the large language model, the server only needs to maintain one language model, namely the large language model, and does not need to maintain other resources such as a corpus database, which saves storage cost.
In some embodiments, the process of analyzing the sentence to be translated by the server to obtain the keyword, the example sentence and the domain knowledge of the sentence to be translated includes: the server analyzes the sentence to be translated based on the relation between the sentence and the keyword through a large language model to obtain the keyword of the sentence to be translated. The server analyzes the sentence to be translated based on the relation between the sentence and the example sentence through the large language model to obtain the example sentence of the sentence to be translated. The server analyzes the sentence to be translated based on the relation between the sentence and the domain knowledge through a large language model to obtain the domain knowledge of the sentence to be translated. According to the scheme provided by the embodiment of the application, the sentence to be translated is analyzed through the large language model, so that the keyword, the example sentence and the domain knowledge of the sentence to be translated can be determined based on the relationship between the previously learned sentence and the keyword, the example sentence and the domain knowledge, the purpose of learning the knowledge of the sentence to be translated based on the prompt learning of the sample sentence is realized, the sentence to be translated can be more accurately understood from three angles of the keyword, the example sentence and the domain knowledge, and better assistance is provided for subsequent translation.
For example, the server analyzes the sentence to be translated, "X company is an Internet company that enriches the lives of Internet users through technology and helps enterprises upgrade digitally". The server then determines that the keywords of the sentence to be translated are "X company" and "Internet user"; that the example sentences of the sentence to be translated include example sentence one, "the business of company X includes e-commerce, software development and other contents", and example sentence two, "an Internet company is a company that provides services by using a network platform based on computer network technology"; and that the domain knowledge of the sentence to be translated is the "science and technology, Internet" domain.
In some embodiments, in addition to the foregoing prompt learning based on sample sentences, the server may also perform prompt learning on the sentence to be translated through prompts. Accordingly, the process of determining the keywords, example sentences and domain knowledge of the sentence to be translated by the server includes the following steps. The server analyzes the sentence to be translated based on a first prompt through the large language model to obtain the keywords of the sentence to be translated. The server analyzes the sentence to be translated based on a second prompt through the large language model to obtain the example sentences of the sentence to be translated. The server analyzes the sentence to be translated based on a third prompt through the large language model to obtain the domain knowledge of the sentence to be translated. According to the scheme provided by the embodiment of the application, the sentence to be translated is analyzed through the large language model, so that the keywords, example sentences and domain knowledge of the sentence to be translated can be determined based on the prompts. This achieves the purpose of learning the knowledge of the sentence to be translated through prompt-based prompt learning, so that the sentence to be translated can be understood more accurately from the three angles of keywords, example sentences and domain knowledge, which provides better assistance for the subsequent translation.
The first prompt indicates a keyword extraction strategy. The keyword extraction strategy may be to take at least one of the subject, predicate, object or time in the sentence to be translated as a keyword of the sentence to be translated; or to take the entities or terms in the sentence to be translated as keywords of the sentence to be translated, which is not limited by the embodiment of the present application. The second prompt indicates an example sentence acquisition strategy. The example sentence acquisition strategy may be to acquire example sentences whose similarity to the sentence to be translated satisfies a condition; or to acquire sentences belonging to the same field as the sentence to be translated; or to acquire sentences with the same subject as the sentence to be translated, which is not limited by the embodiment of the present application. The third prompt indicates a topic acquisition strategy. The topic acquisition strategy may be to acquire the topic of the article to which the sentence to be translated belongs; or to acquire the domain to which the subject of the sentence to be translated belongs, which is not limited by the embodiment of the present application.
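As an illustration of the three prompts, the sketch below fills one template per strategy with the sentence to be translated. The template wording, the `PROMPT_TEMPLATES` dictionary and the `build_prompts` function are assumptions made for this example; the application does not disclose the actual prompt text.

```python
from typing import Dict

# Hypothetical prompt templates, one per strategy described above.
PROMPT_TEMPLATES: Dict[str, str] = {
    # First prompt: keyword extraction strategy (entities and terms).
    "keywords": (
        "Extract the entities and terms in the following sentence as keywords.\n"
        "Sentence: {sentence}\nKeywords:"
    ),
    # Second prompt: example sentence acquisition strategy (same field).
    "example_sentences": (
        "Write sentences that belong to the same field as the following sentence.\n"
        "Sentence: {sentence}\nExample sentences:"
    ),
    # Third prompt: topic acquisition strategy (domain of the subject).
    "domain_knowledge": (
        "Name the domain to which the subject of the following sentence belongs.\n"
        "Sentence: {sentence}\nDomain:"
    ),
}


def build_prompts(sentence: str) -> Dict[str, str]:
    """Return the three filled-in prompts for one sentence to be translated."""
    return {name: tpl.format(sentence=sentence) for name, tpl in PROMPT_TEMPLATES.items()}


prompts = build_prompts("X company is an Internet company.")
for name, text in prompts.items():
    print(f"[{name}]\n{text}\n")
```

Each prompt would be submitted to the large language model separately, so that the three kinds of knowledge are obtained independently of one another.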
In the case where the sentence to be translated is prompt-learned by the prompt, the server does not need to execute step 301. That is, step 301 is an optional step.
In some embodiments, the server may also adjust domain knowledge based on user needs. The user demand is the field indicated by the user. The scheme provided by the embodiment of the application can adjust the domain knowledge according to the user requirement, so that the subsequent translation result meets the user requirement, namely, the translation is more targeted and practical.
304. The server determines a keyword group based on the keywords, wherein the keyword group comprises keywords belonging to source languages and target words belonging to target languages, and the semantics of the keywords are the same as those of the target words.
In the embodiment of the application, the server acquires the target word according to the keyword of the sentence to be translated. Then, the server constructs a keyword group based on the keyword and the target word. The target word is a word in the target language that has the same meaning as the key word. The target word may be obtained based on keyword translation or may be retrieved from a corpus database, which is not limited in the embodiment of the present application. The server can translate the keywords through a large language model to obtain target words; the server may also translate the keywords through other translation models to obtain the target word, which is not limited in this embodiment of the present application.
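For example, the keywords of the sentence to be translated are "X company", "Internet company" and "Internet user"; the keyword groups, each written as "source-language keyword - target-language word", are "X company - X company", "Internet company - Internet company" and "Internet user - Internet users", where the first element of each pair is in the source language (Chinese in this example) and the second element is in the target language.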
In some embodiments, the server may also adjust the target word of the keyword according to the user's needs. The user demand may be a domain indicated by the user; or, the keywords indicated by the user to be reserved, that is, keywords that do not need to be translated, which is not limited by the embodiment of the present application. According to the scheme provided by the embodiment of the application, the target word of the keyword can be adjusted according to the user requirement, so that the subsequent translation result meets the user requirement, namely, the translation is more targeted and practical.
For example, if the user requirement is that the keyword to be reserved indicated by the user is "company X", the keyword is reserved in the translation result.
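The construction of keyword groups, including keywords that the user asks to keep untranslated, might be sketched as follows. The `build_keyword_groups` function, the `translate` callable and the toy lexicon are hypothetical; the Chinese strings are assumed source-language forms of the keywords in the running example, and in practice the target word could come from the large language model, another translation model or a corpus database.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def build_keyword_groups(
    keywords: Iterable[str],
    translate: Callable[[str], str],
    reserved: FrozenSet[str] = frozenset(),
) -> List[Tuple[str, str]]:
    """Pair each source-language keyword with a target-language word.

    Keywords listed in `reserved` are kept as-is instead of being translated.
    """
    groups = []
    for keyword in keywords:
        target = keyword if keyword in reserved else translate(keyword)
        groups.append((keyword, target))
    return groups


# Toy lexicon standing in for a real translation step (assumption).
toy_lexicon = {
    "X公司": "X company",
    "互联网公司": "Internet company",
    "互联网用户": "Internet users",
}

groups = build_keyword_groups(
    ["X公司", "互联网公司", "互联网用户"],
    translate=toy_lexicon.__getitem__,
    reserved=frozenset({"X公司"}),  # the user asked to keep this keyword untranslated
)
print(groups)
```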
305. The server determines example sentence combinations based on example sentences, wherein the example sentence combinations comprise example sentences belonging to source languages and target example sentences belonging to target languages, and the semantics of the example sentences are the same as those of the target example sentences.
In the embodiment of the application, the server acquires the target example sentence according to the example sentence of the sentence to be translated. Then, the server constructs an example sentence combination based on the example sentence and the target example sentence. The target example sentence is a sentence in the target language that has the same semantics as the example sentence. The target example sentence may be obtained by translating the example sentence or may be retrieved from a corpus database, which is not limited in the embodiment of the application. The server may translate the example sentence through the large language model to obtain the target example sentence; or the server may translate the example sentence through another translation model to obtain the target example sentence, which is not limited in the embodiment of the present application.
For example, the example sentences of the sentence to be translated include example sentence one, "the business of company X includes e-commerce, software development and other contents", and example sentence two, "an Internet company is a company that provides services by using a network platform based on computer network technology". The example sentence combination corresponding to example sentence one pairs the source-language sentence "the business of company X includes e-commerce, software development and other contents" with the target-language sentence "X Company's business includes e-commerce and software development". The example sentence combination corresponding to example sentence two pairs the source-language sentence "an Internet company is a company that provides services by using a network platform based on computer network technology" with the target-language sentence "Internet companies are based on computer network technology and utilize network platforms to provide services".
306. The server translates the sentence to be translated into a sentence belonging to the target language based on the key phrase, the example sentence combination and the domain knowledge.
In the embodiment of the application, the server can integrate the keyword group, the example sentence combination and the domain knowledge. Then, the server translates the sentence to be translated into a sentence belonging to the target language based on the integrated knowledge. That is, after acquiring the keyword group, the example sentence combination and the domain knowledge, the server may integrate them into the translation process as contextual background information to achieve higher-quality translation. The keyword group, the example sentence combination and the domain knowledge can all be regarded as explicit knowledge of the sentence to be translated. This explicit knowledge provides richer context information, which can help the large language model understand the meaning and underlying intent of the sentence to be translated more deeply.
The server performs knowledge integration through prompt learning on the large language model. For example, before translation begins, the knowledge involved in steps 304 through 306 is consolidated to form a prompt, which can be as follows:
Keyword groups: X company - X company, Internet company - Internet company, Internet user - Internet users;
Example sentence combinations: The business of company X includes e-commerce, software development, etc. X Company's business includes e-commerce and software development; An Internet company is a company that provides services by using a network platform based on computer network technology. Internet companies are based on computer network technology and utilize network platforms to provide services;
Domain knowledge: science and technology, the Internet.
Then, the server can directly translate the sentence to be translated according to the obtained keyword group, example sentence combination and domain knowledge.
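A minimal sketch of how the three kinds of knowledge might be consolidated into a single translation prompt is given below. The `build_translation_prompt` function and the exact line layout are assumptions modeled on the prompt shown above; any kind of knowledge that is not supplied is simply left out.

```python
from typing import List, Optional, Tuple

Pair = Tuple[str, str]  # (source-language text, target-language text)


def build_translation_prompt(
    sentence: str,
    target_language: str,
    keyword_groups: Optional[List[Pair]] = None,
    example_combinations: Optional[List[Pair]] = None,
    domain_knowledge: Optional[str] = None,
) -> str:
    """Consolidate the available knowledge into one prompt for translation."""
    lines = []
    if keyword_groups:
        lines.append("Keyword groups: " + ", ".join(f"{s} - {t}" for s, t in keyword_groups))
    if example_combinations:
        lines.append(
            "Example sentence combinations: "
            + "; ".join(f"{s} {t}" for s, t in example_combinations)
        )
    if domain_knowledge:
        lines.append(f"Domain knowledge: {domain_knowledge}")
    # The instruction itself always comes last.
    lines.append(f"Translate the following sentence into {target_language}: {sentence}")
    return "\n".join(lines)


translation_prompt = build_translation_prompt(
    "<sentence to be translated>",
    "English",
    keyword_groups=[("<source keyword>", "Internet company")],
    domain_knowledge="science and technology, the Internet",
)
print(translation_prompt)
```

Making each argument optional means the same helper can produce prompts that use only one kind of knowledge, two kinds, or all three.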
In some embodiments, the obtained keyword groups, example sentence combinations and domain knowledge may play a key role in the translation process. However, they may also introduce noise during translation, because different sentences to be translated require different knowledge, and irrelevant knowledge degrades the translation performance of a large language model. On this basis, the application provides a method in which the keyword groups, example sentence combinations and domain knowledge are filtered before translation. Correspondingly, the process of translating the sentence to be translated into a sentence belonging to the target language by the server based on the keyword group, the example sentence combination and the domain knowledge includes the following steps. The server obtains the related knowledge of the sentence to be translated. The related knowledge represents the content of the sentence to be translated. Then, the server filters the keyword groups, the example sentence combinations and the domain knowledge based on the related knowledge. Then, the server translates the sentence to be translated into a sentence belonging to the target language based on the filtered keyword groups, example sentence combinations and domain knowledge. According to the scheme provided by the embodiment of the application, because the related knowledge of the sentence to be translated represents the content of the sentence to be translated, filtering the keyword groups, example sentence combinations and domain knowledge with the related knowledge removes content that is irrelevant to the translation and reduces noise in the translation process; translating the sentence to be translated afterwards therefore improves the accuracy of the translation.
The process of obtaining the related knowledge of the sentence to be translated by the server comprises the following steps: the server predicts relevant knowledge of the sentence to be translated based on the content of the sentence to be translated and the context information of the sentence to be translated. According to the scheme provided by the embodiment of the application, before translation, the content of the statement to be translated and the context information of the statement to be translated are analyzed to obtain the related knowledge of the statement to be translated, so that the related knowledge can accurately reflect the information of the statement to be translated, the unrelated information can be accurately filtered out later, and the guarantee is provided for the accuracy of subsequent translation. According to the scheme provided by the embodiment of the application, the keyword group, the example sentence combination and the domain knowledge are filtered through the related knowledge acquired before translation, so that a mode of selecting the prior knowledge is realized.
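The application does not specify the criterion used to filter knowledge against the related knowledge. Purely as an illustration, the sketch below keeps a knowledge item only if it shares enough words with the predicted related knowledge; the `token_overlap` measure, the `filter_knowledge` function and the threshold value are all assumptions, and a real system might instead ask the large language model to judge relevance.

```python
from typing import Callable, List


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase word sets of two strings."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def filter_knowledge(
    related_knowledge: str,
    items: List[str],
    relevance: Callable[[str, str], float] = token_overlap,
    threshold: float = 0.2,
) -> List[str]:
    """Keep only the knowledge items judged relevant to the related knowledge."""
    return [item for item in items if relevance(related_knowledge, item) >= threshold]


related = "internet company technology users"
candidates = ["Internet company", "weather forecast"]
kept = filter_knowledge(related, candidates)
print(kept)
```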
In some embodiments, the server may translate the sentence to be translated with the filtered keyword group, the example sentence combination and the domain knowledge separately. After the translations are completed, the server can evaluate the translation results and select the sentence with the best translation quality as the final translation result. Correspondingly, the process of translating the sentence to be translated into a sentence belonging to the target language by the server based on the keyword group, the example sentence combination and the domain knowledge includes the following steps. The server translates the sentence to be translated based on the keyword group and the target language to obtain a first sentence. The server translates the sentence to be translated based on the example sentence combination and the target language to obtain a second sentence. The server translates the sentence to be translated based on the domain knowledge and the target language to obtain a third sentence. Then, the server evaluates the first sentence, the second sentence and the third sentence to obtain an evaluation score for each sentence. The evaluation score represents the accuracy of the translation. Then, the server takes the sentence with the highest evaluation score as the final translation result of the sentence to be translated. The server may perform the evaluation with the wmt20-comet-qe-da tool; the embodiment of the application does not limit the specific way of evaluating.
According to the scheme provided by the embodiment of the application, the sentences to be translated are translated respectively through the key word groups, the example sentence combinations and the domain knowledge, and the noise in the translation process is reduced because a proper amount of knowledge (such as only the key word groups) is used in each translation process; and after the translation is completed, the translation results corresponding to the keyword groups, the example sentence combinations and the domain knowledge are respectively evaluated, so that a sentence with the highest evaluation score is selected as a final translation result, and the translation accuracy is ensured.
In some embodiments, the server may translate the sentence to be translated based on the keyword group, the example sentence combination, and the integration result of the domain knowledge, to obtain the fourth sentence. The server then evaluates the fourth statement. Then, the server selects the sentence with the highest evaluation score from the first sentence, the second sentence, the third sentence and the fourth sentence as a final translation result of the sentence to be translated. Or, the server can also select any two kinds of knowledge from the key word group, the example sentence combination and the domain knowledge to translate the sentence to be translated, and then select the sentence with the highest evaluation score as the final translation result of the sentence to be translated. The embodiment of the present application is not limited thereto.
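The selection among candidate translations can be sketched as scoring each candidate and keeping the highest-scoring one. In the sketch below, `select_best_translation` and the `score` callable are hypothetical; the toy scorer merely stands in for a reference-free quality estimator such as the wmt20-comet-qe-da tool mentioned above, whose actual interface is not shown here, and the candidate texts and scores are made up for illustration.

```python
from typing import Callable, Dict, Tuple


def select_best_translation(
    source: str,
    candidates: Dict[str, str],
    score: Callable[[str, str], float],
) -> Tuple[str, str, Dict[str, float]]:
    """Score every candidate against the source and return the best one.

    `candidates` maps a label (which knowledge was used) to a translation.
    """
    if not candidates:
        raise ValueError("at least one candidate translation is required")
    scores = {label: score(source, text) for label, text in candidates.items()}
    best_label = max(scores, key=scores.get)
    return best_label, candidates[best_label], scores


# Toy scores standing in for a quality-estimation model (assumption).
toy_scores = {
    "translation A": 0.41,
    "translation B": 0.38,
    "translation C": 0.35,
    "translation D": 0.47,
}
candidates = {
    "keyword_groups": "translation A",         # first sentence
    "example_combinations": "translation B",   # second sentence
    "domain_knowledge": "translation C",       # third sentence
    "all_knowledge": "translation D",          # fourth sentence
}
best_label, best_text, scores = select_best_translation(
    "<sentence to be translated>", candidates, lambda src, hyp: toy_scores[hyp]
)
print(best_label, best_text)
```

Because the scorer is passed in as a callable, the same selection step works whether three, four or more candidates are produced, including candidates built from any two kinds of knowledge.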
In order to describe the sentence processing method provided by the embodiment of the present application more clearly, the method is further described below with reference to the accompanying drawings. Fig. 4 is a schematic diagram of a sentence processing method according to an embodiment of the present application. Referring to fig. 4, the server learns the sample sentences and their labels through a large language model, and determines the relationship between a sentence and its keywords, example sentences and domain knowledge. The server acquires the sentence to be translated belonging to the source language. The server analyzes the sentence to be translated through the large language model, based on the relationship between a sentence and its keywords, example sentences and domain knowledge, to obtain the keywords, example sentences and domain knowledge of the sentence to be translated, so that sentence information is mined from three angles: the vocabulary level, the sentence level and the chapter level. The server then determines the keyword group based on the keywords. The server determines the example sentence combination based on the example sentences. Then, the server integrates the keyword group, the example sentence combination and the domain knowledge. The server may predict the related knowledge of the sentence to be translated based on the content of the sentence to be translated and the context information of the sentence to be translated. The server may then filter the keyword group, the example sentence combination and the domain knowledge. Then, the server translates the sentence to be translated into a sentence belonging to the target language based on the filtered keyword group, example sentence combination and domain knowledge.
Then, the server can translate the sentences to be translated through the filtered key word groups, example sentence combinations and domain knowledge. After the translation is completed, the server can evaluate the translation result to select a sentence with the highest translation performance as a final translation result.
The embodiment of the application provides a sentence processing method, which is characterized in that through learning sample sentences and labels of the sample sentences, a large language model can learn the relations among the sentences and keywords, example sentences and domain knowledge in the labels, then the large language model is used for analyzing the sentences to be translated, so that the keywords, the example sentences and the domain knowledge of the sentences to be translated can be determined based on the relations among the previously learned sentences and the keywords, the example sentences and the domain knowledge, and the purpose of learning the knowledge of the sentences to be translated based on prompt learning of the sample sentences is realized; then, translating the sentence to be translated according to three angles of information such as keywords, example sentences and domain knowledge, namely, through core information, context information and subjects of the sentence to be translated, so that the large-scale language model can fully understand the sentence to be translated, and the sentence to be translated is translated on the basis, thereby improving the accuracy of translation; in addition, the large language model has rich language basis, and the sentence to be translated is analyzed through the large language model, so that the translation result can be obtained quickly, and the translation efficiency is improved.
Fig. 5 is a block diagram of a sentence processing device according to an embodiment of the present application. The sentence processing apparatus is for executing the steps when the sentence processing method described above is executed, and referring to fig. 5, the sentence processing apparatus includes:
a determining module 501, configured to learn, through a large language model, a sample sentence and a label of the sample sentence, and determine a relationship between the sentence and a keyword, an example sentence, and domain knowledge, where the label of the sample sentence is used to represent the keyword, the example sentence, and the domain knowledge of the sample sentence, the keyword is used to represent core information of the corresponding sentence, the example sentence is used to represent context information of the corresponding sentence, and the domain knowledge is used to represent a subject of the corresponding sentence;
the obtaining module 502 is configured to obtain a sentence to be translated belonging to a source language;
an analysis module 503, configured to analyze, through a large language model, a sentence to be translated based on a relationship between the sentence and a keyword, an example sentence, and domain knowledge, to obtain the keyword, the example sentence, and the domain knowledge of the sentence to be translated;
and a translation module 504, configured to translate the sentence to be translated into a sentence belonging to the target language based on the keyword, the example sentence and the domain knowledge.
In some embodiments, fig. 6 is a block diagram of another sentence processing device provided according to an embodiment of the present application. Referring to fig. 6, the analysis module 503 is configured to analyze the sentence to be translated based on the relationship between a sentence and its keywords through the large language model, so as to obtain the keywords of the sentence to be translated; analyze the sentence to be translated based on the relationship between a sentence and its example sentences through the large language model, so as to obtain the example sentences of the sentence to be translated; and analyze the sentence to be translated based on the relationship between a sentence and its domain knowledge through the large language model, so as to obtain the domain knowledge of the sentence to be translated.
In some embodiments, with continued reference to fig. 6, translation module 504 includes:
a first determining unit 5041 configured to determine a keyword group based on a keyword, where the keyword group includes a keyword belonging to a source language and a target word belonging to a target language, and the semantics of the keyword and the target word are the same;
a second determining unit 5042 configured to determine an example sentence combination based on example sentences, where the example sentence combination includes example sentences belonging to a source language and target example sentences belonging to a target language, and the semantics of the example sentences are the same as those of the target example sentences;
a translation unit 5043 for translating the sentence to be translated into a sentence belonging to the target language based on the keyword group, the example sentence combination and the domain knowledge.
In some embodiments, with continued reference to fig. 6, the translation unit 5043 is configured to obtain related knowledge of the sentence to be translated, where the related knowledge is used to represent the content of the sentence to be translated; based on the related knowledge, filtering the key word group, the example sentence combination and the domain knowledge; and translating the sentence to be translated into a sentence belonging to the target language based on the filtered key phrase, example sentence combination and domain knowledge.
In some embodiments, with continued reference to fig. 6, a translation unit 5043 is configured to predict relevant knowledge of the sentence to be translated based on the content of the sentence to be translated and the context information of the sentence to be translated.
In some embodiments, with continued reference to fig. 6, the translation unit 5043 is configured to translate the sentence to be translated based on the keyword group and the target language, to obtain a first sentence; translating the sentence to be translated based on the example sentence combination and the target language to obtain a second sentence; translating the sentence to be translated based on the domain knowledge and the target language to obtain a third sentence; evaluating the first sentence, the second sentence and the third sentence to obtain evaluation scores corresponding to the sentences, wherein the evaluation scores are used for representing the accuracy of translation; and taking the sentence with the highest evaluation score as a final translation result of the sentence to be translated.
In some embodiments, with continued reference to fig. 6, the translation module 504 is further configured to translate the sentence to be translated based on an integration result of the keyword group, the example sentence combination and the domain knowledge to obtain a fourth sentence, and to evaluate the fourth sentence.
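The selection among candidate translations described above can be sketched as follows. This is a minimal illustration, assuming that each candidate has already been produced and that some evaluation function returns a higher score for a more accurate translation; the function name `select_translation`, the candidate labels and the toy scores are hypothetical and are not taken from the embodiment.

```python
from typing import Callable, Mapping


def select_translation(candidates: Mapping[str, str],
                       evaluate: Callable[[str], float]) -> tuple[str, str, float]:
    """Return (label, text, score) of the candidate with the highest evaluation score."""
    if not candidates:
        raise ValueError("at least one candidate translation is required")
    # Score every candidate once, then pick the label with the largest score.
    scores = {label: evaluate(text) for label, text in candidates.items()}
    best = max(scores, key=scores.__getitem__)
    return best, candidates[best], scores[best]


# Hypothetical candidates and scores, for illustration only.
candidates = {
    "first (keyword group)": "I want to open an account at the bank.",
    "second (example sentence combination)": "I want to open a bank account.",
    "third (domain knowledge)": "I would like to open an account on the shore.",
    "fourth (integrated knowledge)": "I would like to open a bank account.",
}
toy_scores = {
    "I want to open an account at the bank.": 0.81,
    "I want to open a bank account.": 0.86,
    "I would like to open an account on the shore.": 0.35,
    "I would like to open a bank account.": 0.84,
}
label, text, score = select_translation(candidates, toy_scores.__getitem__)
```

In practice the lookup table would be replaced by whatever evaluation the system uses; the selection logic itself does not depend on how the scores are computed.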
The embodiment of the application provides a sentence processing device. The device learns sample sentences and the labels of the sample sentences, so that a large language model learns the relations between a sentence and the keywords, example sentences and domain knowledge in its labels. The sentence to be translated is then analyzed through the large language model, so that the keywords, example sentences and domain knowledge of the sentence to be translated are determined based on the previously learned relations; in this way, knowledge of the sentence to be translated is learned through prompt learning on the sample sentences. The sentence to be translated is then translated according to three kinds of information, namely the keywords, the example sentences and the domain knowledge, which correspond to the core information, the context information and the theme of the sentence to be translated, so that the large language model can fully understand the sentence to be translated before translating it, which improves the accuracy of translation. In addition, the large language model has a rich language foundation, and analyzing the sentence to be translated through the large language model allows the translation result to be obtained quickly, which improves translation efficiency.
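One way the three kinds of information could be supplied to a large language model is through a text prompt. The following is a minimal sketch under that assumption; the prompt wording, the function name `build_prompt` and its parameters are hypothetical and are not specified by the embodiment.

```python
from typing import Sequence


def build_prompt(sentence: str, source_lang: str, target_lang: str,
                 keyword_pairs: Sequence[tuple[str, str]] = (),
                 example_pairs: Sequence[tuple[str, str]] = (),
                 domain_knowledge: str = "") -> str:
    """Assemble a translation prompt from whichever kinds of knowledge are available."""
    lines: list[str] = []
    if keyword_pairs:
        # Keyword group: source-language keywords with their target-language words.
        lines.append("Keyword pairs:")
        lines.extend(f"- {src} = {tgt}" for src, tgt in keyword_pairs)
    if example_pairs:
        # Example sentence combination: source examples with target examples.
        lines.append("Example sentence pairs:")
        lines.extend(f"- {src} => {tgt}" for src, tgt in example_pairs)
    if domain_knowledge:
        lines.append(f"Domain knowledge: {domain_knowledge}")
    lines.append(f"Translate the following {source_lang} sentence into {target_lang}:")
    lines.append(sentence)
    return "\n".join(lines)
```

Calling the function with only one kind of knowledge yields a prompt for a single-aspect translation, while passing all three yields a prompt based on the integrated knowledge.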
It should be noted that, when the sentence processing device provided in the foregoing embodiment processes a sentence, the division into the above functional modules is used only as an example. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the sentence processing device and the sentence processing method provided in the foregoing embodiments belong to the same concept; the specific implementation process of the device is detailed in the method embodiments and is not described herein again.
In the embodiment of the present application, the computer device can be configured as a terminal or a server. When the computer device is configured as a terminal, the technical solution provided by the embodiment of the present application may be implemented with the terminal as the execution body. When the computer device is configured as a server, the technical solution may be implemented with the server as the execution body, or may be implemented through interaction between the terminal and the server, which is not limited by the embodiment of the present application.
Fig. 7 is a block diagram of a terminal 700 according to an embodiment of the present application. The terminal 700 may be a portable mobile terminal, such as a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 700 may also be referred to by other names, such as user equipment, portable terminal, laptop terminal, or desktop terminal.
In general, the terminal 700 includes: a processor 701 and a memory 702.
The processor 701 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 701 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, also referred to as a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 701 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen is required to display. In some embodiments, the processor 701 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 702 may include one or more computer-readable storage media, which may be non-transitory. The memory 702 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 702 is used to store at least one computer program for execution by processor 701 to implement the sentence processing method provided by the method embodiments of the present application.
In some embodiments, the terminal 700 may further optionally include a peripheral interface 703 and at least one peripheral device. The processor 701, the memory 702, and the peripheral interface 703 may be connected by a bus or signal lines. Each peripheral device may be connected to the peripheral interface 703 via a bus, a signal line, or a circuit board. Specifically, the peripheral device includes at least one of a radio frequency circuit 704, a display screen 705, a camera assembly 706, an audio circuit 707, and a power supply 708.
The peripheral interface 703 may be used to connect at least one I/O (Input/Output) related peripheral device to the processor 701 and the memory 702. In some embodiments, the processor 701, the memory 702, and the peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 701, the memory 702, and the peripheral interface 703 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.
The radio frequency circuit 704 is configured to receive and transmit RF (Radio Frequency) signals, also referred to as electromagnetic signals. The radio frequency circuit 704 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 704 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. In some embodiments, the radio frequency circuit 704 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 704 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: the World Wide Web, metropolitan area networks, intranets, mobile communication networks of each generation (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 704 may also include NFC (Near Field Communication) related circuitry, which is not limited in the present application.
The display screen 705 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 705 is a touch display screen, the display screen 705 also has the ability to collect touch signals at or above its surface. The touch signal may be input to the processor 701 as a control signal for processing. In this case, the display screen 705 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 705, disposed on the front panel of the terminal 700; in other embodiments, there may be at least two display screens 705, respectively disposed on different surfaces of the terminal 700 or in a folded design; in still other embodiments, the display screen 705 may be a flexible display screen disposed on a curved surface or a folded surface of the terminal 700. The display screen 705 may even be arranged in a non-rectangular irregular pattern, that is, as a special-shaped screen. The display screen 705 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The camera assembly 706 is used to capture images or video. In some embodiments, the camera assembly 706 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth camera can be fused to implement a background blurring function, and the main camera and the wide-angle camera can be fused to implement panoramic shooting and Virtual Reality (VR) shooting functions or other fusion shooting functions. In some embodiments, the camera assembly 706 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.
The audio circuit 707 may include a microphone and a speaker. The microphone is used for collecting sound waves of users and environments, converting the sound waves into electric signals, and inputting the electric signals to the processor 701 for processing, or inputting the electric signals to the radio frequency circuit 704 for voice communication. For the purpose of stereo acquisition or noise reduction, a plurality of microphones may be respectively disposed at different portions of the terminal 700. The microphone may also be an array microphone or an omni-directional pickup microphone. The speaker is used to convert electrical signals from the processor 701 or the radio frequency circuit 704 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to humans, but also the electric signal can be converted into a sound wave inaudible to humans for ranging and other purposes. In some embodiments, the audio circuit 707 may also include a headphone jack.
The power supply 708 is used to power the various components in the terminal 700. The power source 708 may be an alternating current, a direct current, a disposable battery, or a rechargeable battery. When the power source 708 comprises a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the terminal 700 further includes one or more sensors 709. The one or more sensors 709 include, but are not limited to: acceleration sensor 710, gyro sensor 711, pressure sensor 712, optical sensor 713, and proximity sensor 714.
The acceleration sensor 710 may detect the magnitudes of accelerations on three coordinate axes of a coordinate system established with the terminal 700. For example, the acceleration sensor 710 may be used to detect components of gravitational acceleration in three coordinate axes. The processor 701 may control the display screen 705 to display a user interface in a landscape view or a portrait view according to the gravitational acceleration signal acquired by the acceleration sensor 710. Acceleration sensor 710 may also be used for the acquisition of motion data of a game or user.
The gyro sensor 711 may detect a body direction and a rotation angle of the terminal 700, and the gyro sensor 711 may collect a 3D motion of the user on the terminal 700 in cooperation with the acceleration sensor 710. The processor 701 may implement the following functions according to the data collected by the gyro sensor 711: motion sensing (e.g., changing UI according to a tilting operation by a user), image stabilization at shooting, game control, and inertial navigation.
The pressure sensor 712 may be disposed at a side frame of the terminal 700 and/or at a lower layer of the display screen 705. When the pressure sensor 712 is disposed at a side frame of the terminal 700, a grip signal of the user to the terminal 700 may be detected, and the processor 701 performs a left-right hand recognition or a shortcut operation according to the grip signal collected by the pressure sensor 712. When the pressure sensor 712 is disposed at the lower layer of the display screen 705, the processor 701 controls the operability control on the UI interface according to the pressure operation of the user on the display screen 705. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The optical sensor 713 is used to collect the intensity of ambient light. In one embodiment, the processor 701 may control the display brightness of the display screen 705 based on the ambient light intensity collected by the optical sensor 713. Specifically, when the intensity of the ambient light is high, the display brightness of the display screen 705 is turned up; when the ambient light intensity is low, the display brightness of the display screen 705 is turned down. In another embodiment, the processor 701 may also dynamically adjust the shooting parameters of the camera assembly 706 based on the ambient light intensity collected by the optical sensor 713.
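The brightness adjustment described above can be illustrated with a small sketch. The embodiment only states that the display brightness is turned up when the ambient light intensity is high and turned down when it is low; the logarithmic mapping, the lux limits, the brightness range and the function name `display_brightness` are assumptions chosen for this example.

```python
import math


def display_brightness(lux: float, min_level: float = 0.1, max_level: float = 1.0,
                       lux_floor: float = 1.0, lux_ceiling: float = 10000.0) -> float:
    """Map an ambient light reading (lux) to a brightness level in [min_level, max_level].

    The reading is clamped to [lux_floor, lux_ceiling] and interpolated on a
    logarithmic scale, so brighter surroundings never yield a dimmer screen.
    """
    clamped = min(max(lux, lux_floor), lux_ceiling)
    fraction = math.log10(clamped / lux_floor) / math.log10(lux_ceiling / lux_floor)
    return min_level + fraction * (max_level - min_level)
```

A logarithmic scale is used here because ambient light spans several orders of magnitude; a linear or table-based mapping would satisfy the same description.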
The proximity sensor 714, also known as a distance sensor, is typically provided on the front panel of the terminal 700. The proximity sensor 714 is used to collect the distance between the user and the front surface of the terminal 700. In one embodiment, when the proximity sensor 714 detects that the distance between the user and the front surface of the terminal 700 gradually decreases, the processor 701 controls the display screen 705 to switch from the screen-on state to the screen-off state; when the proximity sensor 714 detects that the distance between the user and the front surface of the terminal 700 gradually increases, the processor 701 controls the display screen 705 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the structure shown in fig. 7 does not constitute a limitation on the terminal 700, which may include more or fewer components than shown, may combine certain components, or may employ a different arrangement of components.
Fig. 8 is a schematic structural diagram of a server according to an embodiment of the present application. The server 800 may vary considerably depending on its configuration or performance, and may include one or more processors (Central Processing Units, CPUs) 801 and one or more memories 802, where at least one computer program is stored in the memories 802, and the at least one computer program is loaded and executed by the processors 801 to implement the sentence processing method provided in the above method embodiments. Of course, the server may also have a wired or wireless network interface, a keyboard, an input/output interface, and other components for implementing the functions of the device, which are not described herein.
The embodiment of the application also provides a computer-readable storage medium storing at least one computer program, the at least one computer program being loaded and executed by a processor of a computer device to implement the operations performed by the computer device in the sentence processing method of the above embodiment. For example, the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
The embodiments of the present application also provide a computer program product comprising a computer program stored on a computer readable storage medium. The processor of the computer device reads the computer program from the computer-readable storage medium, and the processor executes the computer program so that the computer device executes the sentence processing method provided in the above-described various alternative implementations.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing description of the preferred embodiments of the present application is not intended to limit the application, but rather, the application is to be construed as limited to the appended claims.

Claims (11)

1. A sentence processing method, the method comprising:
through a large language model, learning a sample sentence and labels of the sample sentence, determining relations between the sentence and keywords, example sentences and domain knowledge, wherein the labels of the sample sentence are used for representing the keywords, example sentences and domain knowledge of the sample sentence, the keywords are used for representing core information of the corresponding sentence, the example sentences are used for representing context information of the corresponding sentence, and the domain knowledge is used for representing a theme of the corresponding sentence;
acquiring a sentence to be translated belonging to a source language;
analyzing the sentence to be translated based on the relation between the sentence and the keyword, the example sentence and the domain knowledge through the large language model to obtain the keyword, the example sentence and the domain knowledge of the sentence to be translated;
and translating the sentence to be translated into a sentence belonging to a target language based on the keyword, the example sentence and the domain knowledge.
2. The method of claim 1, wherein the analyzing, by the large language model, the sentence to be translated based on the relationship between the sentence and the keyword, the example sentence, and the domain knowledge to obtain the keyword, the example sentence, and the domain knowledge of the sentence to be translated includes:
analyzing the sentence to be translated based on the relation between the sentence and the keywords through the large language model to obtain the keywords of the sentence to be translated;
analyzing the sentence to be translated based on the relation between the sentence and the example sentence through the large language model to obtain the example sentence of the sentence to be translated;
and analyzing the sentence to be translated based on the relation between the sentence and the domain knowledge through the large language model to obtain the domain knowledge of the sentence to be translated.
3. The method of claim 1, wherein the translating the sentence to be translated into a sentence belonging to a target language based on the keyword, the example sentence, and the domain knowledge comprises:
determining a keyword group based on the keywords, wherein the keyword group comprises the keywords belonging to the source language and target words belonging to the target language, and the semantics of the keywords are the same as those of the target words;
determining an example sentence combination based on the example sentence, wherein the example sentence combination comprises the example sentence belonging to the source language and a target example sentence belonging to the target language, and the example sentence and the target example sentence have the same semantics;
and translating the sentence to be translated into a sentence belonging to the target language based on the keyword group, the example sentence combination and the domain knowledge.
4. The method of claim 3, wherein the translating the sentence to be translated into the sentence belonging to the target language based on the keyword group, the example sentence combination, and the domain knowledge comprises:
acquiring related knowledge of the sentence to be translated, wherein the related knowledge is used for representing the content of the sentence to be translated;
filtering the keyword group, the example sentence combination and the domain knowledge based on the related knowledge;
and translating the sentence to be translated into a sentence belonging to the target language based on the filtered keyword group, the example sentence combination and the domain knowledge.
5. The method of claim 4, wherein the acquiring the related knowledge of the sentence to be translated comprises:
predicting the related knowledge of the sentence to be translated based on the content of the sentence to be translated and the context information of the sentence to be translated.
6. The method of claim 3, wherein the translating the sentence to be translated into the sentence belonging to the target language based on the keyword group, the example sentence combination, and the domain knowledge comprises:
translating the sentence to be translated based on the keyword group and the target language to obtain a first sentence;
translating the sentence to be translated based on the example sentence combination and the target language to obtain a second sentence;
translating the sentence to be translated based on the domain knowledge and the target language to obtain a third sentence;
evaluating the first sentence, the second sentence and the third sentence to obtain evaluation scores corresponding to the sentences, wherein the evaluation scores are used for representing the accuracy of translation;
and taking the sentence with the highest evaluation score as a final translation result of the sentence to be translated.
7. The method of claim 6, wherein the method further comprises:
translating the sentence to be translated based on an integration result of the keyword group, the example sentence combination and the domain knowledge to obtain a fourth sentence;
and evaluating the fourth sentence.
8. A sentence processing apparatus, the apparatus comprising:
a determining module, configured to learn a sample sentence and labels of the sample sentence through a large language model, and determine relations between a sentence and keywords, example sentences and domain knowledge, wherein the labels of the sample sentence are used for representing the keywords, example sentences and domain knowledge of the sample sentence, the keywords are used for representing core information of the corresponding sentence, the example sentences are used for representing context information of the corresponding sentence, and the domain knowledge is used for representing a theme of the corresponding sentence;
an acquisition module, configured to acquire a sentence to be translated belonging to a source language;
an analysis module, configured to analyze the sentence to be translated based on the relations between the sentence and the keywords, example sentences and domain knowledge through the large language model, to obtain the keyword, the example sentence and the domain knowledge of the sentence to be translated;
and a translation module, configured to translate the sentence to be translated into a sentence belonging to a target language based on the keyword, the example sentence and the domain knowledge.
9. A computer device, characterized in that it comprises a processor and a memory, the memory being used for storing at least one computer program, and the at least one computer program being loaded and executed by the processor to implement the sentence processing method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing at least one piece of computer program for executing the sentence processing method according to any one of claims 1 to 7.
11. A computer program product comprising a computer program which, when executed by a processor, implements the sentence processing method according to any one of claims 1 to 7.
CN202310852266.8A 2023-07-11 2023-07-11 Statement processing method, device, computer equipment and storage medium Pending CN117010417A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310852266.8A CN117010417A (en) 2023-07-11 2023-07-11 Statement processing method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310852266.8A CN117010417A (en) 2023-07-11 2023-07-11 Statement processing method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117010417A true CN117010417A (en) 2023-11-07

Family

ID=88575477

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310852266.8A Pending CN117010417A (en) 2023-07-11 2023-07-11 Statement processing method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117010417A (en)

Similar Documents

Publication Publication Date Title
CN110471858B (en) Application program testing method, device and storage medium
CN110807325B (en) Predicate identification method, predicate identification device and storage medium
CN110852100A (en) Keyword extraction method, keyword extraction device, electronic equipment and medium
CN112269853A (en) Search processing method, search processing device and storage medium
CN111428522B (en) Translation corpus generation method, device, computer equipment and storage medium
CN113269279B (en) Multimedia content classification method and related device
CN114333774A (en) Speech recognition method, speech recognition device, computer equipment and storage medium
CN112764600B (en) Resource processing method, device, storage medium and computer equipment
CN117454954A (en) Model training method, device, computer equipment and storage medium
CN110990549B (en) Method, device, electronic equipment and storage medium for obtaining answer
CN115116437B (en) Speech recognition method, device, computer equipment, storage medium and product
CN116956814A (en) Punctuation prediction method, punctuation prediction device, punctuation prediction equipment and storage medium
CN111597823B (en) Method, device, equipment and storage medium for extracting center word
CN115130456A (en) Sentence parsing and matching model training method, device, equipment and storage medium
CN111428523B (en) Translation corpus generation method, device, computer equipment and storage medium
CN114281937A (en) Training method of nested entity recognition model, and nested entity recognition method and device
CN114328815A (en) Text mapping model processing method and device, computer equipment and storage medium
CN113822084A (en) Statement translation method and device, computer equipment and storage medium
CN117010417A (en) Statement processing method, device, computer equipment and storage medium
CN113569043A (en) Text category determination method and related device
CN113515943A (en) Natural language processing method and method, device and storage medium for acquiring model thereof
CN111737415A (en) Entity relationship extraction method, and method and device for acquiring entity relationship learning model
CN116431838B (en) Document retrieval method, device, system and storage medium
CN111368556B (en) Performance determination method and confidence determination method and device of translation model
CN116776898A (en) Text translation model acquisition method, text translation method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication