WO2022148087A1 - Training method, apparatus, device and storage medium for a programming language translation model - Google Patents

Training method, apparatus, device and storage medium for a programming language translation model

Info

Publication number
WO2022148087A1
Authority
WO
WIPO (PCT)
Prior art keywords: code, solution, word, programming language, answer
Application number
PCT/CN2021/124418
Other languages
English (en)
French (fr)
Inventor
刘玉
徐国强
Original Assignee
深圳壹账通智能科技有限公司
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2022148087A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 - Arrangements for software engineering
    • G06F 8/40 - Transformation of program code
    • G06F 8/51 - Source to source

Definitions

  • the present application relates to the field of artificial intelligence (AI), and in particular to a training method, apparatus, device and storage medium for a programming language translation model.
  • the embodiments of the present application provide a training method, apparatus, device and storage medium for a programming language translation model, which can improve the efficiency of model construction.
  • a first aspect of the present application provides a training method for a programming language translation model. The programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language. The method includes:
  • for each first solution code in the first solution code set, inputting the first solution code into the first encoding layer, so that each word in the first solution code is encoded by the first encoding layer to obtain a first feature vector corresponding to each word in the first solution code;
  • for each second solution code in the second solution code set, inputting the second solution code into the second encoding layer, so that each word in the second solution code is encoded by the second encoding layer to obtain a second feature vector corresponding to each word in the second solution code;
  • inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict a code translation result corresponding to the first solution code;
  • adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, so as to train the programming language translation model.
  • a second aspect of the present application provides a training apparatus for a programming language translation model. The programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language. The training apparatus includes a first input module, a second input module, a third input module and a processing module:
  • the first input module is configured to, for each first solution code in the first solution code set, input the first solution code into the first encoding layer, so that each word in the first solution code is encoded by the first encoding layer to obtain a first feature vector corresponding to each word in the first solution code;
  • the second input module is configured to, for each second solution code in the second solution code set, input the second solution code into the second encoding layer, so that each word in the second solution code is encoded by the second encoding layer to obtain a second feature vector corresponding to each word in the second solution code;
  • the third input module is configured to input the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the first solution code;
  • the processing module is configured to adjust the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, so as to train the programming language translation model.
  • a third aspect of the present application provides an electronic device, comprising a processor, a memory, a communication interface and one or more programs, wherein the one or more programs are stored in the memory and executed by the processor to implement the above training method for a programming language translation model. The programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language. The method includes:
  • for each first solution code in the first solution code set, inputting the first solution code into the first encoding layer, so that each word in the first solution code is encoded by the first encoding layer to obtain a first feature vector corresponding to each word in the first solution code;
  • for each second solution code in the second solution code set, inputting the second solution code into the second encoding layer, so that each word in the second solution code is encoded by the second encoding layer to obtain a second feature vector corresponding to each word in the second solution code;
  • inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the first solution code;
  • adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, so as to train the programming language translation model.
  • a fourth aspect of the present application provides a computer-readable storage medium, where the computer-readable storage medium is used to store a computer program, and the stored computer program is executed by a processor to implement the above training method for a programming language translation model. The programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language. The method includes:
  • for each first solution code in the first solution code set, inputting the first solution code into the first encoding layer, so that each word in the first solution code is encoded by the first encoding layer to obtain a first feature vector corresponding to each word in the first solution code;
  • for each second solution code in the second solution code set, inputting the second solution code into the second encoding layer, so that each word in the second solution code is encoded by the second encoding layer to obtain a second feature vector corresponding to each word in the second solution code;
  • inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the first solution code;
  • adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, so as to train the programming language translation model.
  • it can be seen that, in the above technical solutions, codes written in two programming languages are encoded by two separate encoding layers, and the encoded feature vectors are input into the decoding layer to train the programming language translation model, which avoids the low model-construction efficiency caused by the need to construct rules manually in existing schemes. Meanwhile, using two encoding layers to process different code sets in parallel improves the encoding efficiency and, in turn, the efficiency of model construction.
  • FIG. 1 is a schematic diagram of a programming language translation model provided by an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of a training method for a programming language translation model provided by an embodiment of the present application;
  • FIG. 3 is a schematic flowchart of another training method for a programming language translation model provided by an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of prediction performed by a decoding layer provided by an embodiment of the present application;
  • FIG. 5 is a schematic diagram of a training apparatus for a programming language translation model provided by an embodiment of the present application;
  • FIG. 6 is a schematic structural diagram of an electronic device in the hardware operating environment involved in an embodiment of the present application.
  • this application may involve artificial intelligence technology, for example, model training through machine learning.
  • the technical solutions of the present application can be applied to the training of programming language translation models in various scenarios, such as model training in digital medical scenarios and model training in financial technology scenarios, to improve encoding efficiency and model-construction efficiency, thereby promoting the construction of smart cities.
  • the training method for a programming language translation model may be applicable to electronic devices, which may include various handheld devices with wireless communication capability, vehicle-mounted devices, wearable devices, computing devices or other processing devices connected to a wireless modem, as well as various forms of user equipment (User Equipment, UE), mobile stations (Mobile Station, MS), terminal devices and the like, which are not limited here.
  • FIG. 1 is a schematic diagram of a programming language translation model provided by an embodiment of the present application.
  • the programming language translation model 100 includes an encoding layer 110 and a decoding layer 120
  • the encoding layer 110 includes a first encoding layer 1101 and a second encoding layer 1102 .
  • the first coding layer 1101 and the second coding layer 1102 may be the coding layers of the TransCoder model.
  • the decoding layer 120 may be the decoding layer of the TransCoder model.
  • the programming language translation model 100 may further include an attention layer 130 .
  • FIG. 2 is a schematic flowchart of a training method for a programming language translation model provided by an embodiment of the present application.
  • the programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set.
  • the first solution code set and the second solution code set are in one-to-one correspondence; the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language.
  • for example, if the first programming language is C, the second programming language may be C++; if the first programming language is C, the second programming language may be Java; if the first programming language is C, the second programming language may be Python; if the first programming language is C, the second programming language may be PHP; if the first programming language is Python, the second programming language may be Java.
  • the method includes:
  • 201: for each first solution code in the first solution code set, the first solution code is input into the first encoding layer, so that each word in the first solution code is encoded by the first encoding layer to obtain a first feature vector corresponding to each word in the first solution code.
  • each word in the first solution code can be understood as an English word, a number, a Chinese character or the like in the first solution code, which is not limited here.
  • optionally, step 201 may include: for each first solution code in the first solution code set, inserting a start symbol [CLS] at the start position of the first solution code and an end symbol [SEP] at the end position to obtain a new first solution code; and inputting the new first solution code into the first encoding layer, so that each word in the new first solution code is encoded by the first encoding layer to obtain the first feature vector corresponding to each word in the new first solution code.
  • the new first solution code thus additionally includes the start symbol [CLS] and the end symbol [SEP], which serve as the start signal and the end signal for encoding the first solution code.
  • 202: for each second solution code in the second solution code set, the second solution code is input into the second encoding layer, so that each word in the second solution code is encoded by the second encoding layer to obtain a second feature vector corresponding to each word in the second solution code.
  • each word in the second solution code can likewise be understood as an English word, a number, a Chinese character or the like, which is not limited here.
  • optionally, step 202 may include: for each second solution code in the second solution code set, inserting a start symbol [CLS] at the start position of the second solution code to obtain a new second solution code; and inputting the new second solution code into the second encoding layer, so that each word in the new second solution code is encoded by the second encoding layer to obtain the second feature vector corresponding to each word in the new second solution code.
  • the new second solution code thus additionally includes the start symbol [CLS], which serves as the start signal for encoding the second solution code.
  • it should be noted that the number of words in each first solution code in the first solution code set is the same as the number of words in the corresponding second solution code in the second solution code set. Therefore, when the start symbol [CLS] and the end symbol [SEP] are inserted into a first solution code, only the start symbol [CLS] needs to be inserted into the corresponding second solution code; no end symbol [SEP] is required.
  • 203: the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code are input into the decoding layer to predict the code translation result corresponding to the first solution code. Optionally, step 203 may include: inputting the first feature vector corresponding to each word in the new first solution code and the second feature vector of the corresponding word in the new second solution code into the decoding layer.
  • 204: the model parameters of the programming language translation model are adjusted according to the code translation result corresponding to the first solution code, so as to train the programming language translation model.
  • it can be seen that, in the above technical solution, codes written in two programming languages are encoded by two separate encoding layers, and the encoded feature vectors are input into the decoding layer to train the programming language translation model, which avoids the low model-construction efficiency caused by the need to construct rules manually in existing schemes; meanwhile, processing different code sets with two encoding layers in parallel improves the encoding efficiency and, in turn, the model-construction efficiency.
  • FIG. 3 is a schematic flowchart of another method for training a programming language translation model provided by an embodiment of the present application.
  • the programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set.
  • the first solution code set and the second solution code set are in one-to-one correspondence; the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language.
  • for example, if the first programming language is C, the second programming language may be C++, Java, Python or PHP; if the first programming language is Python, the second programming language may be Java.
  • the method includes:
  • 301: M solution codes and M tags corresponding to each of N programming questions are obtained from the Internet through a web crawler, where N and M are both integers greater than 0, the M solution codes and the M tags are in one-to-one correspondence, and each of the M tags indicates the programming language adopted by the corresponding one of the M solution codes.
  • a web crawler (also known as a web spider or web robot, and in the FOAF community more often called a web page chaser) is a program or script that automatically crawls information on the World Wide Web according to certain rules.
  • optionally, web crawlers may include general purpose web crawlers, focused web crawlers, incremental web crawlers, deep web crawlers and the like, which are not limited here.
  • different tags among the M tags indicate that different solution codes among the M solution codes adopt different programming languages.
  • for example, if a second programming question is any one of the N programming questions and corresponds to three tags, tag 1 indicates that the programming language adopted by solution code 1 of the second programming question is the first programming language, tag 2 indicates that the programming language adopted by solution code 2 is the second programming language, and tag 3 indicates that the programming language adopted by solution code 3 is a programming language different from both the first and the second programming language; solution code 1 in the first programming language and solution code 2 in the second programming language can then be determined from these three tags.
  • optionally, a tag can be represented by different bits. For example, the M tags include a first tag indicating a solution code whose programming language is the first programming language and a second tag indicating a solution code whose programming language is the second programming language; if the first tag is the bit "0", the second tag may be the bit "1", or, if the first tag is the bit "1", the second tag may be the bit "0".
  • 302: according to the M tags corresponding to each of the N programming questions, a first solution code set whose programming language is the first programming language and a second solution code set whose programming language is the second programming language are determined from the M solution codes corresponding to each of the N programming questions.
  • the first solution code set may include solution codes written in the first programming language for different ones of the N programming questions, and the second solution code set may include solution codes written in the second programming language for different ones of the N programming questions, which is not limited here.
  • optionally, one of the N programming questions is a first programming question, and step 302 may include: selecting, according to the M tags corresponding to the first programming question, a third solution code from the M solution codes corresponding to the first programming question, the third solution code being any code in the first solution code set; and selecting, according to the M-1 tags corresponding to the first programming question, a fourth solution code from the M-1 solution codes corresponding to the first programming question, where the M-1 tags are the tags among the M tags other than the tag corresponding to the third solution code, the M-1 solution codes are the solution codes among the M solution codes other than the third solution code, and the fourth solution code is the solution code in the second solution code set that corresponds to the third solution code.
  • the third solution code and the fourth solution code are solution codes written in different programming languages for the same programming question.
  • it can be seen that determining solution codes written in different programming languages based on the tags enables such codes to be identified efficiently.
  • steps 303 to 306 are the same as steps 201 to 204 in FIG. 2, and details are not repeated here.
  • optionally, the first solution code includes K words, where K is an integer greater than 0, and inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the first solution code includes:
  • S1: setting the initial value of i to 1;
  • S2: if i is less than or equal to K, executing step S3; if i is greater than K, executing step S6;
  • S3: inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the i-th word, wherein the decoding layer includes a first hidden vector and a second hidden vector, the first hidden vector being determined according to the context information of all words in the first solution code and the second hidden vector being determined according to the context information of all words in the second solution code;
  • S4: saving the code translation result corresponding to the i-th word in a code translation result library;
  • S5: setting i = i + 1 and returning to step S2;
  • S6: obtaining the code translation result corresponding to each word in the first solution code from the code translation result library, and mapping these per-word code translation results to obtain the code translation result corresponding to the first solution code;
  • S7: ending the prediction of the code translation result corresponding to the first solution code.
  • the first hidden vector is obtained from the first decoding layer, and the second hidden vector is obtained from the second decoding layer.
  • the code translation result library may be, for example, a database or a blockchain, which is not limited here.
  • it can be understood that a blockchain is a chained data structure in which data blocks are connected in chronological order, forming a tamper-proof and unforgeable distributed ledger guaranteed by cryptography. A blockchain may include a blockchain underlying platform, a platform product service layer and an application service layer.
  • further, the characteristics of a blockchain include openness, consensus, decentralization, trustlessness, transparency, anonymity of both parties, immutability and traceability. Openness and transparency mean that anyone can participate in the blockchain network: every device can act as a node, and every node is allowed to obtain a complete copy of the database. Based on a consensus mechanism, the nodes jointly maintain the entire blockchain through competitive computation, and if any node fails, the remaining nodes can still work normally. Decentralization and trustlessness mean that the blockchain is an end-to-end network composed of many nodes, with no centralized equipment or management organization; data exchange between nodes is verified by digital signature technology, so the nodes need not trust each other.
  • specifically, a blockchain is a distributed infrastructure that uses a chained block data structure to verify and store data, a distributed node consensus algorithm to generate and update data, cryptography to secure data transmission and access, and smart contracts composed of automated script code to program and operate data.
  • because smart contracts encode all terms as programs that execute automatically on the blockchain, the blockchain enforces the contract content whenever the conditions triggering a smart contract are met, without being blocked by any external force, which guarantees the validity and enforceability of the contract and can both greatly reduce cost and improve efficiency.
  • every node on the blockchain holds the same ledger, which ensures that the ledger-recording process is open and transparent.
  • blockchain technology enables point-to-point, open and transparent direct interaction, making efficient, large-scale, decentralized information exchange a reality.
  • for example, suppose the first solution code consists of [s1], [s2] and [s3], which denote three different words. Referring to FIG. 4, FIG. 4 is a schematic flowchart of prediction performed by a decoding layer according to an embodiment of the present application. The first feature vector corresponding to [s1] and the second feature vector of the corresponding word in the second solution code are first input into the decoding layer to predict the code translation result corresponding to [s1]; then the first feature vector corresponding to [s2] and the second feature vector of the corresponding word are input to predict the result for [s2]; finally, the first feature vector corresponding to [s3] and the second feature vector of the corresponding word are input to predict the result for [s3].
  • it can be seen that, by inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, solution codes written in different programming languages are fed into the decoding layer at the same time, so that the trained programming language translation model translates between programming languages better, which improves the accuracy of code translation and the generalization ability of the trained model.
  • optionally, the programming language translation model further includes an attention layer, and the method further includes: inputting the first feature vector corresponding to each word in the first solution code and the first hidden vector into the attention layer, so as to determine, through the attention layer, the similarity between the first feature vector corresponding to each word in the first solution code and the first hidden vector; and sorting the K words in descending order of that similarity to obtain the sorted K words.
  • the method may further include: inputting the first feature vectors corresponding to the start symbol and the end symbol, together with the first hidden vector, into the attention layer, so as to determine their similarities to the first hidden vector through the attention layer. It can be understood that, for every first solution code in the first solution code set, these similarities are the same.
  • the method may further include: inputting the second feature vector corresponding to the start symbol and the second hidden vector into the attention layer, so as to determine their similarity through the attention layer. It can be understood that, for every second solution code in the second solution code set, this similarity is the same.
  • it can be seen that, by determining similarities in the attention layer, the words in the first solution code can be sorted in descending order of similarity, so as to readjust the order of the words in the first solution code.
  • optionally, inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the i-th word includes: inputting the first feature vector corresponding to the i-th word among the sorted K words and the second feature vector of the corresponding word in the second solution code into the decoding layer, which avoids reversed word order when the decoding layer is used for prediction.
  • optionally, adjusting the model parameters of the programming language translation model to train the programming language translation model includes: determining a loss value corresponding to each word in the first solution code according to a preset loss function, the code translation result corresponding to each word in the first solution code and the corresponding word in the second solution code; and adjusting the model parameters according to the average of the loss values corresponding to the words in the first solution code.
  • the preset loss function may be, for example, a cross-entropy loss function, which is not limited here; training stops when the programming language translation model converges.
  • FIG. 5 is a schematic diagram of a training apparatus for a programming language translation model provided by an embodiment of the present application.
  • as shown in FIG. 5, the training apparatus 500 for the programming language translation model may include a first input module 501, a second input module 502, a third input module 503 and a processing module 504.
  • the programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language.
  • the first input module 501 is configured to, for each first solution code in the first solution code set, input the first solution code into the first encoding layer, so that each word in the first solution code is encoded by the first encoding layer to obtain a first feature vector corresponding to each word in the first solution code;
  • the second input module 502 is configured to, for each second solution code in the second solution code set, input the second solution code into the second encoding layer, so that each word in the second solution code is encoded by the second encoding layer to obtain a second feature vector corresponding to each word in the second solution code;
  • the third input module 503 is configured to input the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the first solution code;
  • the processing module 504 is configured to adjust the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, so as to train the programming language translation model.
  • it can be seen that, in the above technical solution, codes written in two programming languages are encoded by two separate encoding layers, and the encoded feature vectors are input into the decoding layer to train the programming language translation model, which avoids the low model-construction efficiency caused by the need to construct rules manually in existing schemes; meanwhile, processing different code sets with two encoding layers in parallel improves the encoding efficiency and, in turn, the model-construction efficiency.
  • optionally, the training apparatus further includes an obtaining module 505 and a determining module 506. The obtaining module 505 is configured to obtain, through a web crawler, M solution codes and M tags corresponding to each of N programming questions from the Internet, where N and M are both integers greater than 0, the M solution codes and the M tags are in one-to-one correspondence, and each of the M tags indicates the programming language adopted by the corresponding one of the M solution codes; the determining module 506 is configured to determine, according to the M tags corresponding to each of the N programming questions and from the M solution codes corresponding to each of the N programming questions, the first solution code set whose programming language is the first programming language and the second solution code set whose programming language is the second programming language.
  • optionally, the first programming question is one of the N programming questions, and in terms of determining the two solution code sets, the determining module 506 is specifically configured to: select, according to the M tags corresponding to the first programming question, a third solution code from the M solution codes corresponding to the first programming question, the third solution code being any code in the first solution code set; and select, according to the M-1 tags corresponding to the first programming question, a fourth solution code from the M-1 solution codes corresponding to the first programming question, where the M-1 tags are the tags among the M tags other than the tag corresponding to the third solution code, the M-1 solution codes are the solution codes among the M solution codes other than the third solution code, and the fourth solution code is the solution code in the second solution code set that corresponds to the third solution code.
  • the solution codes written in different programming languages are determined based on the tags, thereby realizing the efficient determination of the solution codes written in different programming languages.
  • optionally, the first solution code includes K words, where K is an integer greater than 0, and in terms of inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the first solution code, the third input module 503 is specifically configured to perform: S1: setting the initial value of i to 1; S2: executing step S3 if i is less than or equal to K, and step S6 if i is greater than K; S3: inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the i-th word, wherein the decoding layer includes a first hidden vector determined according to the context information of all words in the first solution code and a second hidden vector determined according to the context information of all words in the second solution code; S4: saving the code translation result corresponding to the i-th word in the code translation result library; S5: setting i = i + 1 and returning to step S2; S6: obtaining the code translation result corresponding to each word in the first solution code from the code translation result library and mapping these per-word results to obtain the code translation result corresponding to the first solution code; S7: ending the prediction.
  • it can be seen that, by inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, solution codes written in different programming languages are fed into the decoding layer at the same time, so that the trained programming language translation model translates between programming languages better, which improves the accuracy of code translation and the generalization ability of the trained model.
  • optionally, the programming language translation model further includes an attention layer, and the processing module 504 is further configured to: input the first feature vector corresponding to each word in the first solution code and the first hidden vector into the attention layer, so as to determine, through the attention layer, the similarity between the first feature vector corresponding to each word and the first hidden vector; and sort the K words in descending order of that similarity to obtain the sorted K words.
  • it can be seen that, by determining similarities in the attention layer, the words in the first solution code can be sorted in descending order of similarity, so as to readjust their order and avoid reversed word order when the decoding layer is used for prediction.
  • optionally, in terms of inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the i-th word, the third input module 503 is specifically configured to input the first feature vector corresponding to the i-th word among the sorted K words and the second feature vector of the corresponding word in the second solution code into the decoding layer.
  • optionally, in terms of adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, the processing module 504 is specifically configured to: determine the loss value corresponding to each word in the first solution code according to the preset loss function, the code translation result corresponding to each word in the first solution code and the corresponding word in the second solution code; and adjust the model parameters according to the average of the loss values corresponding to the words in the first solution code, so as to train the programming language translation model.
  • FIG. 6 is a schematic structural diagram of an electronic device of a hardware operating environment involved in an embodiment of the present application.
  • an embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor to execute instructions including the steps of any of the above training methods for a programming language translation model.
  • the electronic devices of the hardware operating environment involved in the embodiments of the present application may include:
  • a processor 601 such as a CPU.
  • the memory 602 may be a high-speed RAM memory, or may be a stable memory, such as a disk memory.
  • the communication interface 603 is used to realize the connection communication between the processor 601 and the memory 602 .
  • those skilled in the art can understand that the structure of the electronic device shown in FIG. 6 does not constitute a limitation on it; the device may include more or fewer components than shown, combine some components, or adopt a different component arrangement.
  • the memory 602 may include an operating system, a network communication module, and one or more programs.
  • An operating system is a program that manages and controls server hardware and software resources, and supports the operation of one or more programs.
  • the network communication module is used to realize the communication between the various components in the memory 602, as well as the communication with other hardware and software in the electronic device.
  • the programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language. In the electronic device shown in FIG. 6, the processor 601 is configured to execute the one or more programs in the memory 602 to implement the following steps:
  • for each first solution code in the first solution code set, inputting the first solution code into the first encoding layer, so that each word in the first solution code is encoded by the first encoding layer to obtain a first feature vector corresponding to each word in the first solution code;
  • for each second solution code in the second solution code set, inputting the second solution code into the second encoding layer, so that each word in the second solution code is encoded by the second encoding layer to obtain a second feature vector corresponding to each word in the second solution code;
  • inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the first solution code;
  • adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, so as to train the programming language translation model.
  • the present application also provides a computer-readable storage medium, wherein the programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set; the first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language. The computer-readable storage medium is used to store a computer program, and the stored computer program is executed by the processor to implement the same four steps as above.
  • the storage medium involved in this application, such as a computer-readable storage medium, may be non-volatile or volatile.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The present application relates to the technical field of model construction, and discloses a training method, apparatus, device and storage medium for a programming language translation model. The method includes: encoding each word in a first solution code through a first encoding layer to obtain a first feature vector corresponding to each word in the first solution code; encoding each word in a second solution code through a second encoding layer to obtain a second feature vector corresponding to each word in the second solution code; inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into a decoding layer to predict a code translation result corresponding to the first solution code; and adjusting model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, so as to train the programming language translation model. Implementing the embodiments of the present application improves the efficiency of model construction.

Description

Training method, apparatus, device and storage medium for a programming language translation model
This application claims priority to the Chinese patent application with application number 202110021389.8, filed with the China Patent Office on January 8, 2021 and entitled "Training method, apparatus, device and storage medium for a programming language translation model", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present application relates to the field of artificial intelligence (AI), and in particular to a training method, apparatus, device and storage medium for a programming language translation model.
BACKGROUND
Programming languages keep emerging, from the earliest machine languages to today's more than 2,500 high-level languages. Learning a new programming language, however, is not easy and takes a great deal of time, and some programming languages have very complex syntax and highly flexible usage. There is therefore an urgent need for a technical means capable of translating one programming language into another.
The inventors found that, in general, existing models require heuristic rules to be constructed manually in order to translate one programming language into another. Specifically, if code written in C is to be translated into code written in Python, the various rules involved in both C and Python have to be constructed by hand. The inventors realized that such models are inefficient to construct and ill-suited to the wider range of future application scenarios.
SUMMARY
The embodiments of the present application provide a training method, apparatus, device and storage medium for a programming language translation model, which can improve the efficiency of model construction.
A first aspect of the present application provides a training method for a programming language translation model. The programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language. The method includes:
for each first solution code in the first solution code set, inputting the first solution code into the first encoding layer, so that each word in the first solution code is encoded by the first encoding layer to obtain a first feature vector corresponding to each word in the first solution code;
for each second solution code in the second solution code set, inputting the second solution code into the second encoding layer, so that each word in the second solution code is encoded by the second encoding layer to obtain a second feature vector corresponding to each word in the second solution code;
inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict a code translation result corresponding to the first solution code;
adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, so as to train the programming language translation model.
A second aspect of the present application provides a training apparatus for a programming language translation model. The programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language. The training apparatus includes a first input module, a second input module, a third input module and a processing module, wherein:
the first input module is configured to, for each first solution code in the first solution code set, input the first solution code into the first encoding layer, so that each word in the first solution code is encoded by the first encoding layer to obtain a first feature vector corresponding to each word in the first solution code;
the second input module is configured to, for each second solution code in the second solution code set, input the second solution code into the second encoding layer, so that each word in the second solution code is encoded by the second encoding layer to obtain a second feature vector corresponding to each word in the second solution code;
the third input module is configured to input the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the first solution code;
the processing module is configured to adjust the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, so as to train the programming language translation model.
A third aspect of the present application provides an electronic device, including a processor, a memory, a communication interface and one or more programs, wherein the one or more programs are stored in the memory and executed by the processor to implement the above training method for a programming language translation model. The programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language. The method includes:
for each first solution code in the first solution code set, inputting the first solution code into the first encoding layer, so that each word in the first solution code is encoded by the first encoding layer to obtain a first feature vector corresponding to each word in the first solution code;
for each second solution code in the second solution code set, inputting the second solution code into the second encoding layer, so that each word in the second solution code is encoded by the second encoding layer to obtain a second feature vector corresponding to each word in the second solution code;
inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the first solution code;
adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, so as to train the programming language translation model.
A fourth aspect of the present application provides a computer-readable storage medium, where the computer-readable storage medium is used to store a computer program, and the stored computer program is executed by the processor to implement the above training method for a programming language translation model. The programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language. The method includes:
for each first solution code in the first solution code set, inputting the first solution code into the first encoding layer, so that each word in the first solution code is encoded by the first encoding layer to obtain a first feature vector corresponding to each word in the first solution code;
for each second solution code in the second solution code set, inputting the second solution code into the second encoding layer, so that each word in the second solution code is encoded by the second encoding layer to obtain a second feature vector corresponding to each word in the second solution code;
inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the first solution code;
adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, so as to train the programming language translation model.
It can be seen that, in the above technical solutions, codes written in two programming languages are encoded by two separate encoding layers, and the encoded feature vectors are input into the decoding layer to train the programming language translation model, which avoids the low model-construction efficiency caused by the need to construct rules manually in existing schemes. Meanwhile, using two encoding layers to process different code sets in parallel improves the encoding efficiency and, in turn, the efficiency of model construction.
BRIEF DESCRIPTION OF THE DRAWINGS
In order to describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and those of ordinary skill in the art can derive other drawings from them without creative effort.
In the drawings:
FIG. 1 is a schematic diagram of a programming language translation model provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a training method for a programming language translation model provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of another training method for a programming language translation model provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of prediction performed by a decoding layer provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a training apparatus for a programming language translation model provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an electronic device in the hardware operating environment involved in an embodiment of the present application.
DETAILED DESCRIPTION
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
Detailed descriptions are given below.
The terms "first" and "second" in the specification, claims and drawings of the present application are used to distinguish different objects rather than to describe a particular order. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion: a process, method, system, product or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or other steps or units inherent to the process, method, product or device.
The present application may involve artificial intelligence technology, for example, model training through machine learning. Optionally, the technical solutions of the present application can be applied to the training of programming language translation models in various scenarios, such as model training in digital medical scenarios or in financial technology scenarios, to improve encoding efficiency and model-construction efficiency, thereby promoting the construction of smart cities.
It should be understood that the training method for a programming language translation model provided by the embodiments of the present application may be applied to electronic devices, which may include various handheld devices with wireless communication capability, vehicle-mounted devices, wearable devices, computing devices or other processing devices connected to a wireless modem, as well as various forms of user equipment (User Equipment, UE), mobile stations (Mobile Station, MS), terminal devices and the like, which is not limited here.
Referring to FIG. 1, FIG. 1 is a schematic diagram of a programming language translation model provided by an embodiment of the present application. As shown in FIG. 1, the programming language translation model 100 includes an encoding layer 110 and a decoding layer 120, and the encoding layer 110 includes a first encoding layer 1101 and a second encoding layer 1102. The first encoding layer 1101 and the second encoding layer 1102 may be encoding layers of a TransCoder model, and the decoding layer 120 may be a decoding layer of a TransCoder model.
In addition, the programming language translation model 100 may further include an attention layer 130.
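As a concrete illustration of the FIG. 1 architecture, the following is a minimal sketch of how a dual-encoder, single-decoder model of this shape could be assembled in PyTorch. It is a sketch under stated assumptions, not the patented implementation: the class name, the hyperparameters and the use of generic Transformer layers in place of actual TransCoder layers are illustrative only.

```python
import torch
import torch.nn as nn

class ProgramTranslationModel(nn.Module):
    """Illustrative model: two encoding layers (cf. 1101, 1102), one decoding layer (cf. 120)."""

    def __init__(self, vocab_size=32000, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)

        def make_encoder():
            layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            return nn.TransformerEncoder(layer, num_layers)

        self.first_encoder = make_encoder()    # encodes the first solution code
        self.second_encoder = make_encoder()   # encodes the second solution code
        decoder_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(decoder_layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, first_ids, second_ids, target_ids):
        # per-word first/second feature vectors from the two encoding layers
        first_feats = self.first_encoder(self.embed(first_ids))
        second_feats = self.second_encoder(self.embed(second_ids))
        # the decoding layer receives both feature sequences at the same time
        memory = torch.cat([first_feats, second_feats], dim=1)
        decoded = self.decoder(self.embed(target_ids), memory)
        return self.out(decoded)  # per-position logits over the vocabulary
```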
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a training method for a programming language translation model provided by an embodiment of the present application. The programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language.
For example, if the first programming language is C, the second programming language may be C++; if the first programming language is C, the second programming language may be Java; if the first programming language is C, the second programming language may be Python; if the first programming language is C, the second programming language may be PHP; if the first programming language is Python, the second programming language may be Java.
As shown in FIG. 2, the method includes:
201. For each first solution code in the first solution code set, the first solution code is input into the first encoding layer, so that each word in the first solution code is encoded by the first encoding layer to obtain a first feature vector corresponding to each word in the first solution code.
It should be noted that, in the present application, each word in the first solution code can be understood as an English word, a number, a Chinese character or the like in the first solution code, which is not limited here.
Optionally, step 201 may include: for each first solution code in the first solution code set, inserting a start symbol [CLS] at the start position of the first solution code and an end symbol [SEP] at the end position to obtain a new first solution code; and inputting the new first solution code into the first encoding layer, so that each word in the new first solution code is encoded by the first encoding layer to obtain the first feature vector corresponding to each word in the new first solution code.
The new first solution code thus additionally includes the start symbol [CLS] and the end symbol [SEP]. It can be understood that the start symbol [CLS] and the end symbol [SEP] serve as the start signal and the end signal for encoding the first solution code.
202. For each second solution code in the second solution code set, the second solution code is input into the second encoding layer, so that each word in the second solution code is encoded by the second encoding layer to obtain a second feature vector corresponding to each word in the second solution code.
It should be noted that each word in the second solution code can likewise be understood as an English word, a number, a Chinese character or the like, which is not limited here.
Optionally, step 202 may include: for each second solution code in the second solution code set, inserting a start symbol [CLS] at the start position of the second solution code to obtain a new second solution code; and inputting the new second solution code into the second encoding layer, so that each word in the new second solution code is encoded by the second encoding layer to obtain the second feature vector corresponding to each word in the new second solution code.
The new second solution code thus additionally includes the start symbol [CLS], which serves as the start signal for encoding the second solution code.
It should be noted that, in the present application, the number of words in each first solution code in the first solution code set is the same as the number of words in the corresponding second solution code in the second solution code set. Therefore, when the start symbol [CLS] and the end symbol [SEP] are inserted into a first solution code, only the start symbol [CLS] needs to be inserted into the corresponding second solution code, without the end symbol [SEP]. A preprocessing sketch under these conventions is given below.
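To make this preprocessing concrete, here is a minimal sketch of steps 201 and 202 in their optional form, assuming a word-level tokenizer that maps words to integer ids; the symbol ids and helper names are illustrative assumptions, not part of the application.

```python
CLS_ID, SEP_ID = 1, 2  # illustrative integer ids for [CLS] and [SEP]

def prepare_first_solution(word_ids):
    # step 201 (optional form): frame the first solution code with [CLS] ... [SEP],
    # which act as the start and end signals for encoding
    return [CLS_ID] + list(word_ids) + [SEP_ID]

def prepare_second_solution(word_ids):
    # step 202 (optional form): the paired second solution code has the same number
    # of words, so only the start symbol [CLS] is inserted and no [SEP] is needed
    return [CLS_ID] + list(word_ids)
```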
203. The first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code are input into the decoding layer to predict the code translation result corresponding to the first solution code.
Optionally, step 203 may include: inputting the first feature vector corresponding to each word in the new first solution code and the second feature vector of the corresponding word in the new second solution code into the decoding layer to predict the code translation result corresponding to the first solution code.
204. The model parameters of the programming language translation model are adjusted according to the code translation result corresponding to the first solution code, so as to train the programming language translation model.
It can be seen that, in the above technical solution, codes written in two programming languages are encoded by two separate encoding layers, and the encoded feature vectors are input into the decoding layer to train the programming language translation model, which avoids the low model-construction efficiency caused by the need to construct rules manually in existing schemes. Meanwhile, using two encoding layers to process different code sets in parallel improves the encoding efficiency and, in turn, the efficiency of model construction.
Referring to FIG. 3, FIG. 3 is a schematic flowchart of another training method for a programming language translation model provided by an embodiment of the present application. The programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training on a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language.
For example, if the first programming language is C, the second programming language may be C++, Java, Python or PHP; if the first programming language is Python, the second programming language may be Java.
As shown in FIG. 3, the method includes:
301. M solution codes and M tags corresponding to each of N programming questions are obtained from the Internet through a web crawler, where N and M are both integers greater than 0, the M solution codes and the M tags are in one-to-one correspondence, and each of the M tags indicates the programming language adopted by the corresponding one of the M solution codes.
A web crawler (also known as a web spider or web robot, and in the FOAF community more often called a web page chaser) is a program or script that automatically crawls information on the World Wide Web according to certain rules.
Optionally, web crawlers may include general purpose web crawlers, focused web crawlers, incremental web crawlers, deep web crawlers and the like, which are not limited here.
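As an illustration of step 301, the sketch below shows one way a simple crawler might collect solution codes and their language tags for a single programming question. The URL, page layout, CSS selectors and attribute names are purely hypothetical assumptions; a real crawler would also need to respect the target site's robots.txt and terms of use.

```python
import requests
from bs4 import BeautifulSoup

def fetch_solutions(question_url):
    """Return (solution_code, language_tag) pairs for one programming question."""
    html = requests.get(question_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for block in soup.select("div.solution"):            # hypothetical container
        code = block.select_one("pre.code").get_text()   # hypothetical code element
        tag = block.get("data-language", "unknown")      # hypothetical language tag
        pairs.append((code, tag))
    return pairs
```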
Different tags among the M tags indicate that different solution codes among the M solution codes adopt different programming languages.
For example, if a second programming question is any one of the N programming questions and corresponds to three tags, then tag 1 indicates that the programming language adopted by solution code 1 of the second programming question is the first programming language, tag 2 indicates that the programming language adopted by solution code 2 is the second programming language, and tag 3 indicates that the programming language adopted by solution code 3 is a programming language different from both the first and the second programming language. Solution code 1 in the first programming language and solution code 2 in the second programming language can then be determined according to these three tags.
Optionally, in the present application, a tag can be represented by different bits. For example, the M tags include a first tag indicating a solution code whose programming language is the first programming language and a second tag indicating a solution code whose programming language is the second programming language. If the first tag is the bit "0", the second tag may be the bit "1"; or, if the first tag is the bit "1", the second tag may be the bit "0".
302. According to the M tags corresponding to each of the N programming questions, a first solution code set whose programming language is the first programming language and a second solution code set whose programming language is the second programming language are determined from the M solution codes corresponding to each of the N programming questions.
The first solution code set may include solution codes written in the first programming language for different ones of the N programming questions, and the second solution code set may include solution codes written in the second programming language for different ones of the N programming questions, which is not limited here.
Optionally, one of the N programming questions is a first programming question, and step 302 may include: selecting, according to the M tags corresponding to the first programming question, a third solution code from the M solution codes corresponding to the first programming question, the third solution code being any code in the first solution code set; and selecting, according to the M-1 tags corresponding to the first programming question, a fourth solution code from the M-1 solution codes corresponding to the first programming question, where the M-1 tags are the tags among the M tags corresponding to the first programming question other than the tag corresponding to the third solution code, the M-1 solution codes are the solution codes among the M solution codes corresponding to the first programming question other than the third solution code, and the fourth solution code is the solution code in the second solution code set that corresponds to the third solution code.
The third solution code and the fourth solution code are solution codes written in different programming languages for the same programming question.
It can be seen that determining solution codes written in different programming languages based on the tags enables such solution codes to be identified efficiently; a selection sketch under these assumptions follows.
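The selection of the third and fourth solution codes can be sketched as follows, assuming the crawled data is held as (code, tag) pairs per question and that the tags are the "0"/"1" bits described above; the data layout and function name are assumptions.

```python
def build_code_sets(questions, first_tag="0", second_tag="1"):
    """questions: for each programming question, a list of (code, tag) pairs."""
    first_set, second_set = [], []
    for solutions in questions:
        # third solution code: any code whose tag marks the first language
        third = next(p for p in solutions if p[1] == first_tag)
        # fourth solution code: chosen among the remaining M-1 codes by its tag
        rest = [p for p in solutions if p is not third]
        fourth = next(p for p in rest if p[1] == second_tag)
        first_set.append(third[0])
        second_set.append(fourth[0])  # same index, same programming question
    return first_set, second_set
```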
303-306. These steps are the same as steps 201-204 in FIG. 2 and are not repeated here.
It can be seen that, in the above technical solution, a large number of different solution codes for programming questions are obtained through a web crawler to determine solution code sets in different programming languages, so that two encoding layers can separately encode the codes written in two programming languages and the encoded feature vectors can be input into the decoding layer to train the programming language translation model. This avoids the low model-construction efficiency caused by the need to construct rules manually in existing schemes; meanwhile, using two encoding layers to process different code sets in parallel improves the encoding efficiency and, in turn, the efficiency of model construction.
Optionally, the first solution code includes K words, where K is an integer greater than 0, and inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the first solution code includes:
S1: setting the initial value of i to 1;
S2: if i is less than or equal to K, executing step S3; if i is greater than K, executing step S6;
S3: inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the i-th word, wherein the decoding layer includes a first hidden vector and a second hidden vector, the first hidden vector being determined according to the context information of all words in the first solution code and the second hidden vector being determined according to the context information of all words in the second solution code;
S4: saving the code translation result corresponding to the i-th word in a code translation result library;
S5: setting i = i + 1 and returning to step S2;
S6: obtaining the code translation result corresponding to each word in the first solution code from the code translation result library, and mapping these per-word code translation results to obtain the code translation result corresponding to the first solution code;
S7: ending the prediction of the code translation result corresponding to the first solution code.
The first hidden vector is obtained from the first decoding layer, and the second hidden vector is obtained from the second decoding layer. The loop is sketched below.
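Steps S1 to S7 amount to a word-by-word decoding loop. A minimal sketch follows, in which decode_step stands in for the decoding layer (with its first and second hidden vectors already inside) and a plain dict models the code translation result library; these names are assumptions.

```python
def predict_translation(first_feats, second_feats, decode_step, merge):
    """Predict the code translation result of a first solution code with K words."""
    K = len(first_feats)
    result_library = {}                                       # result library stand-in
    i = 1                                                     # S1
    while i <= K:                                             # S2
        result_library[i] = decode_step(first_feats[i - 1],
                                        second_feats[i - 1])  # S3: per-word result
        i += 1                                                # S4 saved above; S5
    per_word = [result_library[j] for j in range(1, K + 1)]   # S6: fetch all results
    return merge(per_word)                                    # S6: map them; S7: done
```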
The code translation result library may be, for example, a database or a blockchain, which is not limited here.
It can be understood that a blockchain is a chained data structure in which data blocks are connected in chronological order, forming a tamper-proof and unforgeable distributed ledger guaranteed by cryptography. A blockchain may include a blockchain underlying platform, a platform product service layer and an application service layer.
Further, the characteristics of a blockchain include openness, consensus, decentralization, trustlessness, transparency, anonymity of both parties, immutability and traceability. Openness and transparency mean that anyone can participate in the blockchain network: every device can act as a node, and every node is allowed to obtain a complete copy of the database. Based on a consensus mechanism, the nodes jointly maintain the entire blockchain through competitive computation, and if any node fails, the remaining nodes can still work normally. Decentralization and trustlessness mean that the blockchain is an end-to-end network composed of many nodes, with no centralized equipment or management organization; data exchange between nodes is verified by digital signature technology, so the nodes need not trust each other and, as long as they follow the established rules of the system, cannot deceive one another. Transparency with anonymity of both parties means that the operating rules of the blockchain and all data information are public, so every transaction is visible to all nodes; since the nodes are trustless toward each other, they need not disclose their identities, and every participating node is anonymous. Immutability and traceability mean that a modification of the database by one or even several nodes cannot affect the databases of the other nodes unless more than 51% of the nodes in the entire network are controlled to modify it simultaneously, which is almost impossible; moreover, every transaction in the blockchain is cryptographically linked to the two adjacent blocks, so any transaction record can be traced.
Specifically, a blockchain is a new distributed infrastructure and computing paradigm that uses a chained block data structure to verify and store data, a distributed node consensus algorithm to generate and update data, cryptography to secure data transmission and access, and smart contracts composed of automated script code to program and operate data. The immutability of blockchain technology thus fundamentally changes the centralized way of creating credit and effectively improves the immutability and security of data. Because smart contracts encode all terms as programs that execute automatically on the blockchain, the blockchain enforces the contract content whenever the trigger conditions of a smart contract are met, without being blocked by any external force, which guarantees the validity and enforceability of the contract and can both greatly reduce cost and improve efficiency. Every node on the blockchain holds the same ledger, ensuring that the ledger-recording process is open and transparent. Blockchain technology enables point-to-point, open and transparent direct interaction, making efficient, large-scale, decentralized information exchange a reality.
For example, suppose the first solution code consists of [s1], [s2] and [s3], where [s1], [s2] and [s3] denote three different words. Referring to FIG. 4, FIG. 4 is a schematic flowchart of prediction performed by a decoding layer provided by an embodiment of the present application. As shown in FIG. 4, the first feature vector corresponding to [s1] and the second feature vector of the corresponding word in the second solution code are first input into the decoding layer to predict the code translation result corresponding to [s1]; then the first feature vector corresponding to [s2] and the second feature vector of the corresponding word are input into the decoding layer to predict the code translation result corresponding to [s2]; finally, the first feature vector corresponding to [s3] and the second feature vector of the corresponding word are input into the decoding layer to predict the code translation result corresponding to [s3].
It can be seen that, by inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, solution codes written in different programming languages are fed into the decoding layer at the same time, so that the trained programming language translation model can translate between programming languages better, which improves the accuracy of code translation and the generalization ability of the trained model.
Optionally, the programming language translation model further includes an attention layer, and the method further includes:
inputting the first feature vector corresponding to each word in the first solution code and the first hidden vector into the attention layer, so as to determine, through the attention layer, the similarity between the first feature vector corresponding to each word in the first solution code and the first hidden vector;
sorting the K words in descending order of the similarity between the first feature vector corresponding to each word in the first solution code and the first hidden vector, to obtain the sorted K words.
The method may further include: inputting the first feature vectors corresponding to the start symbol and the end symbol, together with the first hidden vector, into the attention layer, so as to determine through the attention layer the similarity between each of these first feature vectors and the first hidden vector. It can be understood that, for every first solution code in the first solution code set, the similarities between the first feature vectors corresponding to the start symbol and the end symbol and the first hidden vector are the same.
The method may further include: inputting the second feature vector corresponding to the start symbol and the second hidden vector into the attention layer, so as to determine through the attention layer the similarity between the second feature vector corresponding to the start symbol and the second hidden vector. It can be understood that, for every second solution code in the second solution code set, the similarity between the second feature vector corresponding to the start symbol and the second hidden vector is the same.
It can be seen that, by determining similarities in the attention layer, the words in the first solution code can be sorted in descending order of similarity, so as to readjust the order of the words in the first solution code; a sorting sketch follows.
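The similarity-and-sort step can be pictured as scoring each word's first feature vector against the first hidden vector and reordering the words by that score. The application does not fix a similarity measure, so the dot product below is an assumption.

```python
import torch

def sort_words_by_similarity(first_feats, first_hidden):
    """first_feats: (K, d) per-word feature vectors; first_hidden: (d,) hidden vector."""
    sims = first_feats @ first_hidden             # one similarity score per word
    order = torch.argsort(sims, descending=True)  # highest to lowest similarity
    return order, first_feats[order]              # the sorted K words' features
```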
Optionally, inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the i-th word in the first solution code includes:
inputting the first feature vector corresponding to the i-th word among the sorted K words and the second feature vector of the corresponding word in the second solution code into the decoding layer to predict the code translation result corresponding to the i-th word.
It can be seen that readjusting the order of the words in the first solution code avoids reversed word order when the decoding layer is used for prediction.
Optionally, adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code to train the programming language translation model includes:
determining a loss value corresponding to each word in the first solution code according to a preset loss function, the code translation result corresponding to each word in the first solution code and the corresponding word in the second solution code;
adjusting the model parameters of the programming language translation model according to the average of the loss values corresponding to the words in the first solution code, so as to train the programming language translation model.
The preset loss function may be, for example, a cross-entropy loss function, which is not limited here.
It should be noted that training stops when the programming language translation model converges. A sketch of this objective follows.
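A minimal sketch of this objective, reusing the illustrative ProgramTranslationModel from the FIG. 1 sketch above: the per-word cross-entropy between the predicted results and the corresponding words of the second solution code is averaged (reduction="mean") before the parameters are adjusted. The optimizer, for example torch.optim.Adam, is an assumption.

```python
import torch.nn.functional as F

def training_step(model, optimizer, first_ids, second_ids):
    """One parameter update driven by the average of the per-word loss values."""
    logits = model(first_ids, second_ids, second_ids)  # per-word predictions
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),           # one row per word
        second_ids.reshape(-1),                        # corresponding words
        reduction="mean")                              # average over all words
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```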
参见图5,图5为本申请实施例提供的一种编程语言翻译模型的训练装置的示意图。其中,如图5所示,该编程语言翻译模型的训练装置500可以包括第一输入模块501,第二输入模块502、第三输入模块503和处理模块504。
其中,所述编程语言翻译模型包括编码层和解码层,所述编码层包括第一编码层和第二编码层,所述编程语言翻译模型通过第一解答代码集和第二解答代码集训练得到,所述第一解答代码集和所述第二解答代码集一一对应,所述第一解答代码集中每个第一解答代码的编程语言为第一编程语言,所述第二解答代码集中每个第二解答代码的编程语言为第二编程语言,所述第一编程语言不同于所述第二编程语言。
其中,所述第一输入模块501,用于针对所述第一解答代码集中每个第一解答代码,将所述第一解答代码输入所述第一编码层,以通过所述第一编码层对所述第一解答代码中的每个单词进行编码,得到所述第一解答代码中每个单词对应的第一特征向量;
所述第二输入模块502,用于针对所述第二解答代码集中每个第二解答代码,将所述第二解答代码输入所述第二编码层,以通过所述第二编码层对所述第二解答代码中的每个单词进行编码,得到所述第二解答代码中每个单词对应的第二特征向量;
所述第三输入模块503,用于将所述第一解答代码中每个单词对应的第一特征向量和所述第二解答代码中对应单词的第二特征向量输入所述解码层,以预测所述第一解答代码对应的代码翻译结果;
所述处理模块504,用于根据所述第一解答代码对应的代码翻译结果,调整所述编程语言翻译模型的模型参数,以对所述编程语言翻译模型进行训练。
可以看出,上述技术方案中,通过利用两个编码层分别对两种编程语言编写的代码进行编码,从而将编码后的特征向量输入解码层,以实现对编程语言翻译模型的训练,从而避免了现有方案中需要人工构建规则导致的模型构建效率低的问题。同时,通过采用两个编码层同时对不同的代码集进行处理,提高了编码效率,进而也提高了模型构建效率。
可选的,在针对所述第一解答代码集中每个第一解答代码,将所述第一解答代码输入所述第一编码层,以通过所述第一编码层对所述第一解答代码中的每个单词进行编码,得 到所述第一解答代码中每个单词对应的第一特征向量之前,所述训练装置还包括获取模块505和确定模块506,所述获取模块505,用于通过网络爬虫从互联网中获取N个编程题中每个编程题对应的M个解答代码以及M个标签,所述N和所述M均为大于0的整数,所述M个解答代码和所述M个标签一一对应,所述M个标签中的每个标签用于指示所述M个解答代码中每个解答代码所采用的编程语言;所述确定模块506,用于根据所述N个编程题中每个编程题对应的所述M个标签,从所述N个编程题中每个编程题对应的所述M个解答代码中,确定编程语言为所述第一编程语言的第一解答代码集和编程语言为所述第二编程语言的第二解答代码集。
可以看出,上述技术方案中,通过网络爬虫获取大量编程题对应的不同解答代码,以确定不同编程语言的解答代码集。
可选的,第一编程题为所述N个编程题中的一个编程题,所述根据所述N个编程题中每个编程题对应的所述M个标签,从所述N个编程题中每个编程题对应的所述M个解答代码中,确定编程语言为所述第一编程语言的第一解答代码集和编程语言为所述第二编程语言的第二解答代码集方面,所述确定模块506,具体用于根据所述第一编程题对应的M个标签,从所述第一编程题对应的所述M个解答代码中,选择第三解答代码,所述第三解答代码为所述第一解答代码集中任意一个代码;根据所述第一编程题对应的M-1个标签,从所述第一编程题对应的M-1个解答代码中,选择第四解答代码,所述第一编程题对应的M-1个标签为所述第一编程题对应的所述M个标签中除所述第三解答代码对应的标签之外的其他标签,所述第一编程题对应的所述M-1个解答代码为所述第一编程题对应的所述M个解答代码中除所述第三解答代码之外的其他解答代码,所述第四解答代码为所述第二解答代码集中与所述第三解答代码对应的解答代码。
It can be seen that, in the above technical solution, solution codes written in different programming languages are determined based on the labels, which enables efficient determination of solution codes written in different programming languages.
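By way of a non-limiting illustration, the label-based selection for one programming problem may be sketched in Python as follows; the crawled data layout and the pairing of the two resulting sets are illustrative assumptions, the disclosure only requiring that corresponding solution codes solve the same programming problem:

# M crawled (solution code, label) pairs for one programming problem.
crawled = [("def add(a, b): return a + b", "python"),
           ("int add(int a, int b) { return a + b; }", "java"),
           ("int sub(int a, int b) { return a - b; }", "java"),
           ("def sub(a, b): return a - b", "python")]

first_language, second_language = "python", "java"
first_set, second_set = [], []

for code, label in crawled:
    # Each label indicates the programming language adopted by its solution code;
    # a code labeled with the first language plays the role of a "third solution
    # code", and a remaining code labeled with the second language plays the role
    # of the corresponding "fourth solution code".
    if label == first_language:
        first_set.append(code)
    elif label == second_language:
        second_set.append(code)

pairs = list(zip(first_set, second_set))  # one-to-one correspondence of the two sets
print(pairs)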
Optionally, the first solution code includes K words, where K is an integer greater than 0. In the aspect of inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict the code translation result corresponding to the first solution code, the third input module 503 is specifically configured to perform: S1: setting an initial value of i to 1; S2: if i is less than or equal to K, performing step S3; if i is greater than K, performing step S6; S3: inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict the code translation result corresponding to the i-th word, where the decoding layer includes a first hidden vector and a second hidden vector, the first hidden vector is determined according to the context information of all words in the first solution code, and the second hidden vector is determined according to the context information of all words in the second solution code; S4: saving the code translation result corresponding to the i-th word in a code translation result library; S5: setting i = i + 1 and returning to step S2; S6: obtaining, from the code translation result library, the code translation result corresponding to each word in the first solution code, and mapping the code translation results corresponding to the words in the first solution code, to obtain the code translation result corresponding to the first solution code; S7: ending the prediction of the code translation result corresponding to the first solution code.
It can be seen that, in the above technical solution, the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code are input into the decoding layer, so that solution codes written in different programming languages are input into the decoding layer at the same time. As a result, the trained programming language translation model can better translate between programming languages, which improves the accuracy of code translation and also improves the generalization capability of the trained programming language translation model.
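By way of a non-limiting illustration, steps S1 to S7 may be written out as a plain Python loop as follows; decode_step and join_results stand in for the decoding layer and the final mapping step and are illustrative assumptions:

def decode_step(first_vec, second_vec, first_hidden, second_hidden):
    # Placeholder for S3: the decoding layer conditions on the first and second
    # hidden vectors, i.e. the context information of all words in each solution code.
    return "token(" + repr(first_vec) + ")"

def join_results(results):
    # Placeholder for S6: map the per-word results to the overall translation.
    return " ".join(results)

first_vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # K = 3 words
second_vectors = [[0.2, 0.1], [0.4, 0.3], [0.6, 0.5]]  # corresponding words
first_hidden, second_hidden = [0.9, 0.1], [0.8, 0.2]

K = len(first_vectors)
result_library = []          # the code translation result library
i = 1                        # S1: the initial value of i is 1
while i <= K:                # S2: go to S3 while i <= K, otherwise to S6
    token = decode_step(first_vectors[i - 1], second_vectors[i - 1],
                        first_hidden, second_hidden)  # S3
    result_library.append(token)  # S4: save the i-th code translation result
    i += 1                   # S5: let i = i + 1 and return to S2
translation = join_results(result_library)  # S6: map the per-word results
print(translation)           # S7: the prediction ends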
Optionally, the programming language translation model further includes an attention layer, and the processing module 504 is further configured to: input the first feature vector corresponding to each word in the first solution code and the first hidden vector into the attention layer, so as to determine, by the attention layer, the similarity between the first feature vector corresponding to each word in the first solution code and the first hidden vector; and sort the K words in descending order of the similarity between the first feature vector corresponding to each word in the first solution code and the first hidden vector, to obtain the sorted K words.
It can be seen that, in the above technical solution, by determining the similarities in the attention layer, the words in the first solution code can be sorted in descending order of similarity, so as to readjust the order of the words in the first solution code, thereby avoiding the problem of inverted word order when the decoding layer performs prediction.
Optionally, in the aspect of inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict the code translation result corresponding to the i-th word in the first solution code, the third input module 503 is specifically configured to input the first feature vector corresponding to the i-th word among the sorted K words and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict the code translation result corresponding to the i-th word.
It can be seen that, in the above technical solution, by readjusting the order of the words in the first solution code, the problem of inverted word order is avoided when the decoding layer performs prediction.
Optionally, in the aspect of adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, to train the programming language translation model, the processing module 504 is specifically configured to: determine the loss value corresponding to each word in the first solution code according to a preset loss function, the code translation result corresponding to each word in the first solution code, and the corresponding word in the second solution code; and adjust the model parameters of the programming language translation model according to the average of the loss values corresponding to the words in the first solution code, to train the programming language translation model.
Referring to FIG. 6, FIG. 6 is a schematic structural diagram of an electronic device in a hardware operating environment according to an embodiment of the present application.
An embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor to execute instructions including the steps in any one of the training methods for a programming language translation model. As shown in FIG. 6, the electronic device in the hardware operating environment according to the embodiment of the present application may include:
a processor 601, such as a CPU;
a memory 602, which may optionally be a high-speed RAM memory, or a stable memory such as a disk memory; and
a communication interface 603, configured to implement connection and communication between the processor 601 and the memory 602.
Those skilled in the art can understand that the structure of the electronic device shown in FIG. 6 does not constitute a limitation thereof, and the electronic device may include more or fewer components than shown, or combine certain components, or have a different arrangement of components.
As shown in FIG. 6, the memory 602 may include an operating system, a network communication module, and one or more programs. The operating system is a program that manages and controls the hardware and software resources of the server and supports the running of the one or more programs. The network communication module is used to implement communication between the components inside the memory 602, as well as communication with other hardware and software inside the electronic device.
The programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training with a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language. In the electronic device shown in FIG. 6, the processor 601 is configured to execute the one or more programs in the memory 602 to implement the following steps:
for each first solution code in the first solution code set, inputting the first solution code into the first encoding layer, so as to encode each word in the first solution code by the first encoding layer, to obtain the first feature vector corresponding to each word in the first solution code;
for each second solution code in the second solution code set, inputting the second solution code into the second encoding layer, so as to encode each word in the second solution code by the second encoding layer, to obtain the second feature vector corresponding to each word in the second solution code;
inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict the code translation result corresponding to the first solution code; and
adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, to train the programming language translation model.
For the specific implementation of the electronic device involved in the present application, reference may be made to the foregoing embodiments of the training method for a programming language translation model, and details are not repeated here.
The present application further provides a computer-readable storage medium, where the programming language translation model includes an encoding layer and a decoding layer, the encoding layer includes a first encoding layer and a second encoding layer, and the programming language translation model is obtained by training with a first solution code set and a second solution code set. The first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language. The computer-readable storage medium is used to store a computer program, and the stored computer program is executed by a processor to implement the following steps:
for each first solution code in the first solution code set, inputting the first solution code into the first encoding layer, so as to encode each word in the first solution code by the first encoding layer, to obtain the first feature vector corresponding to each word in the first solution code;
for each second solution code in the second solution code set, inputting the second solution code into the second encoding layer, so as to encode each word in the second solution code by the second encoding layer, to obtain the second feature vector corresponding to each word in the second solution code;
inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict the code translation result corresponding to the first solution code; and
adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, to train the programming language translation model.
For the specific implementation of the computer-readable storage medium involved in the present application, reference may be made to the foregoing embodiments of the training method for a programming language translation model, and details are not repeated here.
Optionally, the storage medium involved in the present application, such as the computer-readable storage medium, may be non-volatile or volatile.
It should be noted that, for the sake of simple description, the foregoing method embodiments are all expressed as a series of action combinations, but those skilled in the art should know that the present application is not limited by the described order of actions, because according to the present application, some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
The above embodiments are merely intended to illustrate the technical solutions of the present application, rather than to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or equivalently replace some of the technical features therein; and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (20)

  1. A training method for a programming language translation model, wherein the programming language translation model comprises an encoding layer and a decoding layer, the encoding layer comprises a first encoding layer and a second encoding layer, the programming language translation model is obtained by training with a first solution code set and a second solution code set, the first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language, the method comprising:
    for each first solution code in the first solution code set, inputting the first solution code into the first encoding layer, so as to encode each word in the first solution code by the first encoding layer, to obtain a first feature vector corresponding to each word in the first solution code;
    for each second solution code in the second solution code set, inputting the second solution code into the second encoding layer, so as to encode each word in the second solution code by the second encoding layer, to obtain a second feature vector corresponding to each word in the second solution code;
    inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict a code translation result corresponding to the first solution code; and
    adjusting model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, to train the programming language translation model.
  2. The method according to claim 1, wherein before, for each first solution code in the first solution code set, the first solution code is input into the first encoding layer so as to encode each word in the first solution code by the first encoding layer to obtain the first feature vector corresponding to each word in the first solution code, the method further comprises:
    obtaining, by a web crawler from the Internet, M solution codes and M labels corresponding to each of N programming problems, wherein N and M are both integers greater than 0, the M solution codes and the M labels are in one-to-one correspondence, and each of the M labels is used to indicate the programming language adopted by the corresponding one of the M solution codes; and
    determining, according to the M labels corresponding to each of the N programming problems, from the M solution codes corresponding to each of the N programming problems, the first solution code set whose programming language is the first programming language and the second solution code set whose programming language is the second programming language.
  3. The method according to claim 2, wherein one of the N programming problems is a first programming problem, and the determining, according to the M labels corresponding to each of the N programming problems, from the M solution codes corresponding to each of the N programming problems, the first solution code set whose programming language is the first programming language and the second solution code set whose programming language is the second programming language comprises:
    selecting, according to the M labels corresponding to the first programming problem, a third solution code from the M solution codes corresponding to the first programming problem, wherein the third solution code is any code in the first solution code set; and
    selecting, according to M-1 labels corresponding to the first programming problem, a fourth solution code from M-1 solution codes corresponding to the first programming problem, wherein the M-1 labels corresponding to the first programming problem are the labels, among the M labels corresponding to the first programming problem, other than the label corresponding to the third solution code, the M-1 solution codes corresponding to the first programming problem are the solution codes, among the M solution codes corresponding to the first programming problem, other than the third solution code, and the fourth solution code is the solution code in the second solution code set that corresponds to the third solution code.
  4. The method according to claim 1, wherein the first solution code comprises K words, K is an integer greater than 0, and the inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict the code translation result corresponding to the first solution code, comprises:
    S1: setting an initial value of i to 1;
    S2: if i is less than or equal to K, performing step S3; if i is greater than K, performing step S6;
    S3: inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict a code translation result corresponding to the i-th word, wherein the decoding layer comprises a first hidden vector and a second hidden vector, the first hidden vector is determined according to context information of all words in the first solution code, and the second hidden vector is determined according to context information of all words in the second solution code;
    S4: saving the code translation result corresponding to the i-th word in a code translation result library;
    S5: setting i = i + 1, and returning to step S2;
    S6: obtaining, from the code translation result library, the code translation result corresponding to each word in the first solution code, and mapping the code translation results corresponding to the words in the first solution code, to obtain the code translation result corresponding to the first solution code; and
    S7: ending the prediction of the code translation result corresponding to the first solution code.
  5. The method according to claim 4, wherein the programming language translation model further comprises an attention layer, and the method further comprises:
    inputting the first feature vector corresponding to each word in the first solution code and the first hidden vector into the attention layer, so as to determine, by the attention layer, a similarity between the first feature vector corresponding to each word in the first solution code and the first hidden vector; and
    sorting the K words in descending order of the similarity between the first feature vector corresponding to each word in the first solution code and the first hidden vector, to obtain the sorted K words.
  6. The method according to claim 5, wherein the inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict the code translation result corresponding to the i-th word in the first solution code, comprises:
    inputting the first feature vector corresponding to the i-th word among the sorted K words and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict the code translation result corresponding to the i-th word.
  7. The method according to any one of claims 1-6, wherein the adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, to train the programming language translation model, comprises:
    determining a loss value corresponding to each word in the first solution code according to a preset loss function, the code translation result corresponding to each word in the first solution code, and the corresponding word in the second solution code; and
    adjusting the model parameters of the programming language translation model according to an average of the loss values corresponding to the words in the first solution code, to train the programming language translation model.
  8. A training apparatus for a programming language translation model, wherein the programming language translation model comprises an encoding layer and a decoding layer, the encoding layer comprises a first encoding layer and a second encoding layer, the programming language translation model is obtained by training with a first solution code set and a second solution code set, the first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language, the training apparatus comprising a first input module, a second input module, a third input module, and a processing module, wherein
    the first input module is configured to, for each first solution code in the first solution code set, input the first solution code into the first encoding layer, so as to encode each word in the first solution code by the first encoding layer, to obtain a first feature vector corresponding to each word in the first solution code;
    the second input module is configured to, for each second solution code in the second solution code set, input the second solution code into the second encoding layer, so as to encode each word in the second solution code by the second encoding layer, to obtain a second feature vector corresponding to each word in the second solution code;
    the third input module is configured to input the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict a code translation result corresponding to the first solution code; and
    the processing module is configured to adjust model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, to train the programming language translation model.
  9. An electronic device, comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and executed by the processor to implement a training method for a programming language translation model, the programming language translation model comprises an encoding layer and a decoding layer, the encoding layer comprises a first encoding layer and a second encoding layer, the programming language translation model is obtained by training with a first solution code set and a second solution code set, the first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language, the method comprising:
    for each first solution code in the first solution code set, inputting the first solution code into the first encoding layer, so as to encode each word in the first solution code by the first encoding layer, to obtain a first feature vector corresponding to each word in the first solution code;
    for each second solution code in the second solution code set, inputting the second solution code into the second encoding layer, so as to encode each word in the second solution code by the second encoding layer, to obtain a second feature vector corresponding to each word in the second solution code;
    inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict a code translation result corresponding to the first solution code; and
    adjusting model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, to train the programming language translation model.
  10. The electronic device according to claim 9, wherein before, for each first solution code in the first solution code set, the first solution code is input into the first encoding layer so as to encode each word in the first solution code by the first encoding layer to obtain the first feature vector corresponding to each word in the first solution code, the method further comprises:
    obtaining, by a web crawler from the Internet, M solution codes and M labels corresponding to each of N programming problems, wherein N and M are both integers greater than 0, the M solution codes and the M labels are in one-to-one correspondence, and each of the M labels is used to indicate the programming language adopted by the corresponding one of the M solution codes; and
    determining, according to the M labels corresponding to each of the N programming problems, from the M solution codes corresponding to each of the N programming problems, the first solution code set whose programming language is the first programming language and the second solution code set whose programming language is the second programming language.
  11. The electronic device according to claim 10, wherein one of the N programming problems is a first programming problem, and performing the determining, according to the M labels corresponding to each of the N programming problems, from the M solution codes corresponding to each of the N programming problems, the first solution code set whose programming language is the first programming language and the second solution code set whose programming language is the second programming language comprises:
    selecting, according to the M labels corresponding to the first programming problem, a third solution code from the M solution codes corresponding to the first programming problem, wherein the third solution code is any code in the first solution code set; and
    selecting, according to M-1 labels corresponding to the first programming problem, a fourth solution code from M-1 solution codes corresponding to the first programming problem, wherein the M-1 labels corresponding to the first programming problem are the labels, among the M labels corresponding to the first programming problem, other than the label corresponding to the third solution code, the M-1 solution codes corresponding to the first programming problem are the solution codes, among the M solution codes corresponding to the first programming problem, other than the third solution code, and the fourth solution code is the solution code in the second solution code set that corresponds to the third solution code.
  12. The electronic device according to claim 9, wherein the first solution code comprises K words, K is an integer greater than 0, and performing the inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict the code translation result corresponding to the first solution code, comprises:
    S1: setting an initial value of i to 1;
    S2: if i is less than or equal to K, performing step S3; if i is greater than K, performing step S6;
    S3: inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict a code translation result corresponding to the i-th word, wherein the decoding layer comprises a first hidden vector and a second hidden vector, the first hidden vector is determined according to context information of all words in the first solution code, and the second hidden vector is determined according to context information of all words in the second solution code;
    S4: saving the code translation result corresponding to the i-th word in a code translation result library;
    S5: setting i = i + 1, and returning to step S2;
    S6: obtaining, from the code translation result library, the code translation result corresponding to each word in the first solution code, and mapping the code translation results corresponding to the words in the first solution code, to obtain the code translation result corresponding to the first solution code; and
    S7: ending the prediction of the code translation result corresponding to the first solution code.
  13. The electronic device according to claim 12, wherein the programming language translation model further comprises an attention layer, and the method further comprises:
    inputting the first feature vector corresponding to each word in the first solution code and the first hidden vector into the attention layer, so as to determine, by the attention layer, a similarity between the first feature vector corresponding to each word in the first solution code and the first hidden vector; and
    sorting the K words in descending order of the similarity between the first feature vector corresponding to each word in the first solution code and the first hidden vector, to obtain the sorted K words.
  14. The electronic device according to any one of claims 9-13, wherein performing the adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, to train the programming language translation model, comprises:
    determining a loss value corresponding to each word in the first solution code according to a preset loss function, the code translation result corresponding to each word in the first solution code, and the corresponding word in the second solution code; and
    adjusting the model parameters of the programming language translation model according to an average of the loss values corresponding to the words in the first solution code, to train the programming language translation model.
  15. A computer-readable storage medium, wherein the computer-readable storage medium is used to store a computer program, and the stored computer program is executed by a processor to implement a training method for a programming language translation model, the programming language translation model comprises an encoding layer and a decoding layer, the encoding layer comprises a first encoding layer and a second encoding layer, the programming language translation model is obtained by training with a first solution code set and a second solution code set, the first solution code set and the second solution code set are in one-to-one correspondence, the programming language of each first solution code in the first solution code set is a first programming language, the programming language of each second solution code in the second solution code set is a second programming language, and the first programming language is different from the second programming language, the method comprising:
    for each first solution code in the first solution code set, inputting the first solution code into the first encoding layer, so as to encode each word in the first solution code by the first encoding layer, to obtain a first feature vector corresponding to each word in the first solution code;
    for each second solution code in the second solution code set, inputting the second solution code into the second encoding layer, so as to encode each word in the second solution code by the second encoding layer, to obtain a second feature vector corresponding to each word in the second solution code;
    inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict a code translation result corresponding to the first solution code; and
    adjusting model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, to train the programming language translation model.
  16. The computer-readable storage medium according to claim 15, wherein before, for each first solution code in the first solution code set, the first solution code is input into the first encoding layer so as to encode each word in the first solution code by the first encoding layer to obtain the first feature vector corresponding to each word in the first solution code, the method further comprises:
    obtaining, by a web crawler from the Internet, M solution codes and M labels corresponding to each of N programming problems, wherein N and M are both integers greater than 0, the M solution codes and the M labels are in one-to-one correspondence, and each of the M labels is used to indicate the programming language adopted by the corresponding one of the M solution codes; and
    determining, according to the M labels corresponding to each of the N programming problems, from the M solution codes corresponding to each of the N programming problems, the first solution code set whose programming language is the first programming language and the second solution code set whose programming language is the second programming language.
  17. The computer-readable storage medium according to claim 16, wherein one of the N programming problems is a first programming problem, and performing the determining, according to the M labels corresponding to each of the N programming problems, from the M solution codes corresponding to each of the N programming problems, the first solution code set whose programming language is the first programming language and the second solution code set whose programming language is the second programming language comprises:
    selecting, according to the M labels corresponding to the first programming problem, a third solution code from the M solution codes corresponding to the first programming problem, wherein the third solution code is any code in the first solution code set; and
    selecting, according to M-1 labels corresponding to the first programming problem, a fourth solution code from M-1 solution codes corresponding to the first programming problem, wherein the M-1 labels corresponding to the first programming problem are the labels, among the M labels corresponding to the first programming problem, other than the label corresponding to the third solution code, the M-1 solution codes corresponding to the first programming problem are the solution codes, among the M solution codes corresponding to the first programming problem, other than the third solution code, and the fourth solution code is the solution code in the second solution code set that corresponds to the third solution code.
  18. The computer-readable storage medium according to claim 15, wherein the first solution code comprises K words, K is an integer greater than 0, and performing the inputting the first feature vector corresponding to each word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict the code translation result corresponding to the first solution code, comprises:
    S1: setting an initial value of i to 1;
    S2: if i is less than or equal to K, performing step S3; if i is greater than K, performing step S6;
    S3: inputting the first feature vector corresponding to the i-th word in the first solution code and the second feature vector of the corresponding word in the second solution code into the decoding layer, to predict a code translation result corresponding to the i-th word, wherein the decoding layer comprises a first hidden vector and a second hidden vector, the first hidden vector is determined according to context information of all words in the first solution code, and the second hidden vector is determined according to context information of all words in the second solution code;
    S4: saving the code translation result corresponding to the i-th word in a code translation result library;
    S5: setting i = i + 1, and returning to step S2;
    S6: obtaining, from the code translation result library, the code translation result corresponding to each word in the first solution code, and mapping the code translation results corresponding to the words in the first solution code, to obtain the code translation result corresponding to the first solution code; and
    S7: ending the prediction of the code translation result corresponding to the first solution code.
  19. The computer-readable storage medium according to claim 18, wherein the programming language translation model further comprises an attention layer, and the method further comprises:
    inputting the first feature vector corresponding to each word in the first solution code and the first hidden vector into the attention layer, so as to determine, by the attention layer, a similarity between the first feature vector corresponding to each word in the first solution code and the first hidden vector; and
    sorting the K words in descending order of the similarity between the first feature vector corresponding to each word in the first solution code and the first hidden vector, to obtain the sorted K words.
  20. The computer-readable storage medium according to any one of claims 15-19, wherein the adjusting the model parameters of the programming language translation model according to the code translation result corresponding to the first solution code, to train the programming language translation model, comprises:
    determining a loss value corresponding to each word in the first solution code according to a preset loss function, the code translation result corresponding to each word in the first solution code, and the corresponding word in the second solution code; and
    adjusting the model parameters of the programming language translation model according to an average of the loss values corresponding to the words in the first solution code, to train the programming language translation model.
PCT/CN2021/124418 2021-01-08 2021-10-18 Training method, apparatus, device and storage medium for programming language translation model WO2022148087A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110021389.8A CN112346737B (zh) 2021-01-08 2021-01-08 Training method, apparatus, device and storage medium for programming language translation model
CN202110021389.8 2021-01-08

Publications (1)

Publication Number Publication Date
WO2022148087A1 true WO2022148087A1 (zh) 2022-07-14

Family

ID=74427961

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/124418 WO2022148087A1 (zh) 2021-01-08 2021-10-18 Training method, apparatus, device and storage medium for programming language translation model

Country Status (2)

Country Link
CN (1) CN112346737B (zh)
WO (1) WO2022148087A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112346737B (zh) * 2021-01-08 2021-04-13 深圳壹账通智能科技有限公司 编程语言翻译模型的训练方法、装置、设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109739483A (zh) * 2018-12-28 2019-05-10 北京百度网讯科技有限公司 Method and apparatus for generating a sentence
CN110263348A (zh) * 2019-03-06 2019-09-20 腾讯科技(深圳)有限公司 Translation method and apparatus, computer device, and storage medium
CN110598224A (zh) * 2019-09-23 2019-12-20 腾讯科技(深圳)有限公司 Training method for a translation model, text processing method, apparatus, and storage medium
US20200151399A1 (en) * 2018-09-27 2020-05-14 Intuit Inc. Translating transaction descriptions using machine learning
CN112346737A (zh) * 2021-01-08 2021-02-09 深圳壹账通智能科技有限公司 Training method, apparatus, device and storage medium for programming language translation model

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012059026A (ja) * 2010-09-09 2012-03-22 Hitachi Ltd Source code conversion method and source code conversion program
CN106295343B (zh) * 2016-08-24 2019-03-12 北京奇虎测腾安全技术有限公司 Distributed source code detection system and method based on a serialized intermediate representation
US10467039B2 (en) * 2017-08-07 2019-11-05 Open Data Group Inc. Deployment and management platform for model execution engine containers
CN108595185B (zh) * 2018-04-11 2021-07-27 暨南大学 Method for converting an Ethereum smart contract into a Hyperledger smart contract
CN109857459B (zh) * 2018-12-27 2022-03-08 中国海洋大学 Automatic porting and optimization method and system for an E-class supercomputing ocean model
CN109614111B (zh) * 2018-12-28 2022-02-01 北京百度网讯科技有限公司 Method and apparatus for generating code
CN110879710B (zh) * 2019-07-24 2023-07-07 中信银行股份有限公司 Method for automatically converting an RPG program into a Java program
CN110488755A (zh) * 2019-08-21 2019-11-22 江麓机电集团有限公司 Conversion method for numerical control G-code
CN110851142A (zh) * 2019-10-18 2020-02-28 浙江大学 Method for converting a Transact-SQL program into a Java program
CN111931518A (zh) * 2020-10-15 2020-11-13 北京金山数字娱乐科技有限公司 Training method and apparatus for a translation model

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200151399A1 (en) * 2018-09-27 2020-05-14 Intuit Inc. Translating transaction descriptions using machine learning
CN109739483A (zh) * 2018-12-28 2019-05-10 北京百度网讯科技有限公司 Method and apparatus for generating a sentence
CN110263348A (zh) * 2019-03-06 2019-09-20 腾讯科技(深圳)有限公司 Translation method and apparatus, computer device, and storage medium
CN110598224A (zh) * 2019-09-23 2019-12-20 腾讯科技(深圳)有限公司 Training method for a translation model, text processing method, apparatus, and storage medium
CN112346737A (zh) * 2021-01-08 2021-02-09 深圳壹账通智能科技有限公司 Training method, apparatus, device and storage medium for programming language translation model

Also Published As

Publication number Publication date
CN112346737A (zh) 2021-02-09
CN112346737B (zh) 2021-04-13

Similar Documents

Publication Publication Date Title
US11481418B2 (en) Natural question generation via reinforcement learning based graph-to-sequence model
Fan et al. One2multi graph autoencoder for multi-view graph clustering
Strinati et al. 6G networks: Beyond Shannon towards semantic and goal-oriented communications
Yin et al. Graph-based neural sentence ordering
CN111695674B (zh) Federated learning method and apparatus, computer device, and readable storage medium
Xing et al. A new scheme of vulnerability analysis in smart contract with machine learning
CN112905187B (zh) Compiling method and apparatus, electronic device, and storage medium
CN111507070B (zh) Natural language generation method and apparatus
CN112560456A (zh) Generative summarization method and system based on an improved neural network
CN113836866B (zh) Text encoding method and apparatus, computer-readable medium, and electronic device
CN113312919A (zh) Text generation method and apparatus for a knowledge graph
Zhou et al. Learning with annotation of various degrees
CN113487024A (zh) Training method for an alternating sequence generation model, and method for extracting a graph from text
WO2022148087A1 (zh) Training method, apparatus, device and storage medium for programming language translation model
CN115762659A (zh) Molecular pre-training representation method and system fusing SMILES sequences and molecular graphs
CN113158051B (zh) Label ranking method based on information propagation and multi-layer context information modeling
Su et al. A novel strategy for minimum attribute reduction based on rough set theory and fish swarm algorithm
He et al. Purify and generate: Learning faithful item-to-item graph from noisy user-item interaction behaviors
Przewoźniczek et al. The transformation of the k-Shortest Steiner trees search problem into binary dynamic problem for effective evolutionary methods application
Li et al. A genetic algorithm enhanced automatic data flow management solution for facilitating data intensive applications in the cloud
CN112463161A (zh) Code comment generation method, system, and apparatus based on federated learning
CN111882416A (zh) Training method for a risk prediction model and related apparatus
Zheng et al. Subclass maximum margin tree error correcting output codes
Le Martelot et al. A systemic computation platform for the modelling and analysis of processes with natural characteristics
CN116737763B (zh) Structured query statement execution method and apparatus, computer device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21917129

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 201023)

122 Ep: pct application non-entry in european phase

Ref document number: 21917129

Country of ref document: EP

Kind code of ref document: A1