CN109918058B - Information processing apparatus and method, and method of recommending code in programming environment


Info

Publication number: CN109918058B
Authority: CN (China)
Prior art keywords: elements, prediction, information processing, sequence, processing apparatus
Legal status: Active
Application number: CN201711328030.5A
Other languages: Chinese (zh)
Other versions: CN109918058A
Inventors: 钟朝亮, 杨铭, 黄琦珍, 孙俊
Assignee (original and current): Fujitsu Ltd
Application filed by Fujitsu Ltd; publication of CN109918058A; application granted; publication of CN109918058B

Abstract

The present disclosure relates to an information processing apparatus and an information processing method, and a method for recommending code tokens in a programming environment. An information processing apparatus according to the present disclosure is for processing an element sequence made up of a number of elements in an element set to predict a subsequent element, there being a logical relationship between the elements in the element sequence, the information processing apparatus including: a first prediction unit that receives the sequence of elements and generates an intermediate state and a first prediction result based on the sequence of elements; one or more second prediction units, the number of which corresponds to the number of element types in the set of elements, there being one corresponding second prediction unit for each element type, each second prediction unit receiving the intermediate state and generating a second prediction result based on the intermediate state and parameters related to the respective element type; and a determination unit that receives the first prediction result and the second prediction results and determines a subsequent element based on them.

Description

Information processing apparatus and method, and method of recommending code in programming environment
Technical Field
The present disclosure relates to an information processing apparatus and an information processing method. In particular, the present disclosure relates to recommending code tokens in a programming environment.
Background
Code recommendation is one of the main functions in modern Integrated Development Environments (IDEs). For static programming languages such as Java, traditional code recommendation methods work well thanks to identifier type annotations. However, for dynamic programming languages such as Python and JavaScript, which have become widely used in recent years, the absence of identifier type annotations prevents traditional code recommendation methods from providing support comparable to that available for static languages.
Therefore, it is desirable to provide an information processing technique that can overcome the drawbacks of existing code recommendation methods and provide a good code recommendation function for dynamic programming languages.
It should be noted that the above background description is provided only for the sake of a clear and complete description of the technical solutions of the present application and to facilitate the understanding of those skilled in the art. These solutions should not be considered known to those skilled in the art merely because they are set forth in the background section of the present application.
Disclosure of Invention
A brief summary of the disclosure is provided below in order to provide a basic understanding of some aspects of the disclosure. It should be understood that this summary is not an exhaustive overview of the disclosure. It is not intended to identify key or critical elements of the disclosure or to delineate the scope of the disclosure. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is discussed later.
An object of the present disclosure is to provide an information processing technique applicable to recommending code in a programming environment for a dynamic programming language. Compared with traditional code recommendation methods, a code recommendation method realized by this information processing technique can improve the accuracy of code recommendation without increasing the computational cost.
To achieve the object of the present disclosure, according to one aspect of the present disclosure, there is provided an information processing apparatus for processing an element sequence made up of a number of elements in an element set to predict a subsequent element, there being a logical relationship between the elements in the element sequence, the information processing apparatus including: a first prediction unit that receives the sequence of elements and generates an intermediate state and a first prediction result based on the sequence of elements; one or more second prediction units, the number of the one or more second prediction units corresponding to the number of element types in the set of elements, there being one corresponding second prediction unit for each element type, the one or more second prediction units receiving the intermediate state and generating one or more second prediction results based on the intermediate state and parameters related to the respective element type; and a determination unit that receives the first prediction result and the one or more second prediction results and determines a subsequent element based on the first prediction result and the one or more second prediction results.
According to another aspect of the present disclosure, there is provided an information processing method for processing an element sequence made up of a number of elements in an element set to predict a subsequent element, there being a logical relationship between the elements in the element sequence, the information processing method including: receiving a sequence of elements and generating an intermediate state and a first prediction result based on the sequence of elements; for each element type, receiving an intermediate state and generating one or more second predicted results based on the intermediate state and parameters related to the respective element type; and receiving the first predicted outcome and the one or more second predicted outcomes and determining a subsequent element based on the first predicted outcome and the one or more second predicted outcomes.
According to another aspect of the present disclosure, there is also provided a method for recommending code tokens in a programming environment, which is implemented by an information processing apparatus according to the present disclosure.
According to another aspect of the present disclosure, there is also provided a computer program capable of implementing the information processing method according to the present disclosure. Furthermore, a computer program product in the form of at least a computer-readable medium having computer program code recorded thereon for implementing the information processing method according to the present disclosure is also provided.
With the method for recommending code tokens in a programming environment realized by the information processing technique according to the present disclosure, code can be accurately recommended to programmers without increasing the computational cost.
Drawings
The above and other objects, features and advantages of the present disclosure will be more readily understood by reference to the following description of embodiments of the present disclosure taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic diagram illustrating code recommendation in an integrated development environment;
FIG. 2 is a schematic flow diagram illustrating a training process for a code recommendation model;
fig. 3 is a block diagram showing an information processing apparatus according to a first embodiment of the present disclosure;
fig. 4 is a schematic diagram showing an internal configuration of each unit of the information processing apparatus according to the first embodiment of the present disclosure;
FIG. 5 is a schematic diagram showing statistics of distances between calls and definitions of different types of identifiers in the JavaScript programming language;
fig. 6 is a block diagram showing an information processing apparatus according to a second embodiment of the present disclosure;
fig. 7 is a schematic diagram showing an internal configuration of each unit of an information processing apparatus according to a second embodiment of the present disclosure;
fig. 8 is a schematic diagram illustrating an internal structure of a second prediction unit used in fig. 7;
fig. 9 is a flowchart illustrating an information processing method according to the present disclosure; and
fig. 10 is a block diagram showing a configuration of a general-purpose machine that can be used to implement the information processing apparatus and the information processing method according to the embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure will be described hereinafter with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual implementation are described in the specification. It will of course be appreciated that in the development of any such actual embodiment, numerous implementation-specific decisions may be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which will vary from one implementation to another.
Here, it should be further noted that, in order to avoid obscuring the present disclosure with unnecessary details, only components closely related to the scheme according to the present disclosure are shown in the drawings, and other details not so related to the present disclosure are omitted.
Generally, code recommendation can be implemented as follows: a language model is learned from a source code library of a programming language, and, according to the learned language model, the most likely next code is predicted for the input code and recommended to the programmer. In fact, all codes in a programming language can be considered to constitute a vector space, and a code segment composed of codes as elements, i.e., a sequence of elements, can be considered to be a sequence of vectors in that vector space. In other words, code recommendation may be viewed as processing an element sequence made up of a number of elements in an element set to predict a subsequent element, where there is a logical relationship between the elements in the element sequence.
For example, for a particular programming language, all code in that language is composed of tokens. Assuming that all tokens of the programming language constitute a vector space, the length of each vector is the total number of tokens. A code segment composed of several tokens can be split into a set of vectors by time step, and there is a logical relationship between the vectors in the set; to some extent, an example of this logical relationship is the syntax of the programming language. The learned language model can then predict the most likely next vector, i.e., the most likely next token following the code segment.
To this end, an information processing apparatus according to the present disclosure includes: a first prediction unit that receives the sequence of elements and generates an intermediate state and a first prediction result based on the sequence of elements; one or more second prediction units, the number of the one or more second prediction units corresponding to the number of element types in the set of elements, there being one corresponding second prediction unit for each element type, the one or more second prediction units receiving the intermediate state and generating one or more second prediction results based on the intermediate state and parameters related to the respective element type; and a determination unit that receives the first prediction result and the one or more second prediction results and determines a subsequent element based on the first prediction result and the one or more second prediction results.
The information processing technique according to the present disclosure uses a natural language model based on a sparse attention network in order to obtain long-distance interdependencies between elements in a sequence of elements, thereby improving the accuracy of prediction of subsequent elements. For example, information processing techniques according to the present disclosure may be used to recommend code to a programmer that is the most likely next based on program code that the programmer has entered in an integrated development environment.
In an integrated development environment, code recommendations can be categorized into token-level recommendations and statement-level recommendations. Examples of tokens are the various identifiers used in a programming language, and examples of statements are the statements of the programming language.
FIG. 1 is a schematic diagram illustrating code recommendation in an integrated development environment.
As shown in fig. 1, a token-level recommendation automatically recommends a complete token based on the characters the programmer has already entered, while a statement-level recommendation recommends the most likely next token based on the token sequence the programmer has already entered. For example, as shown in fig. 1, for token-level recommendation, when a programmer inputs "Re", the recommendation function of the integrated development environment may automatically recommend the tokens "Reader", "Reduce (function)", "ReferenceError", etc., beginning with "Re", as candidates for the token to be input. Also, for example, as shown in fig. 1, for statement-level recommendation, when a programmer enters a piece of code such as the token sequence "for (int i = 0; i < 10;", the recommendation function of the integrated development environment may automatically recommend the token "i++" to complete the statement.
The embodiments described herein relate only to statement-level recommendation, i.e., recommending to the programmer the most likely next token according to a previously constructed model.
FIG. 2 is a schematic flow diagram illustrating the training process of a code recommendation model.
As shown in fig. 2, source code serving as training data is first tokenized to obtain a token sequence. The token sequence is then segmented to obtain unlabeled training data (context tokens) and labeled training data (prediction tokens). A learning algorithm is then trained on the labeled training data to obtain a code recommendation model which, when applied to context tokens, produces the label, i.e., the most likely next token.
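As a minimal sketch of this preparation step, the following Python snippet tokenizes source code and slices the token sequence into context/label pairs. The regular-expression tokenizer and the fixed context length are illustrative assumptions; the patent does not prescribe either.

```python
# Sketch of the training-data preparation of Fig. 2 (tokenizer and
# window length are illustrative assumptions, not from the patent).
import re

def tokenize(source: str) -> list[str]:
    # Crude tokenizer: identifiers, numbers, and single punctuation marks.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)

def make_examples(tokens: list[str], context_len: int):
    # Slide a fixed-length window over the token sequence: the window is
    # the context (unlabeled data), the following token is the label.
    for i in range(len(tokens) - context_len):
        yield tokens[i : i + context_len], tokens[i + context_len]

source = "for (int i = 0; i < 10; i++) { total += i; }"
for context, label in make_examples(tokenize(source), context_len=5):
    print(context, "->", label)
```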
An information processing apparatus according to an embodiment of the present disclosure is described below with reference to fig. 3 to 8.
First embodiment
Fig. 3 is a block diagram illustrating an information processing apparatus 300 according to a first embodiment of the present disclosure. Fig. 4 is a schematic diagram showing an internal configuration of each unit of the information processing apparatus 300 according to the first embodiment of the present disclosure.
As shown in fig. 3, the information processing apparatus 300 includes a first prediction unit 301, a second prediction unit 302, and a determination unit 303.
The first prediction unit 301 may receive a token sequence as the element sequence and generate an intermediate state and a first prediction result based on the token sequence. As shown in fig. 4, the first prediction unit 301 may be implemented by a Long Short-Term Memory (LSTM) neural network.
The LSTM neural network is a recurrent neural network (RNN) suitable for processing and predicting significant events with very long intervals and delays in a time series. Since LSTM neural networks are well known to those skilled in the art, only their application in the embodiments of the present disclosure is described herein, and their principles are not described in further detail.
As shown in fig. 4, the first prediction unit 301 implemented by an LSTM neural network receives a sequence input $(x_0, x_1, \ldots, x_t)$ composed of elements (e.g., tokens) and outputs the intermediate states $(h_0, h_1, \ldots, h_t)$. Further, the first prediction unit 301 computes, as the first prediction result $y_t$, the distribution of the most likely next element according to the following equation (1):

$$y_t = \mathrm{softmax}\left(W^{(v)} h_t + b^{(v)}\right) \tag{1}$$

where $W^{(v)} \in \mathbb{R}^{|V| \times k}$ and $b^{(v)} \in \mathbb{R}^{|V|}$ are parameters that can be adjusted through training, and $|V|$ is the total number of elements in the element set. For example, in an integrated development environment, $|V|$ is the total number of all tokens in the programming language.
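A minimal PyTorch sketch of such a first prediction unit follows; the embedding layer and the layer sizes are assumptions for illustration, and the final projection plays the role of $W^{(v)}$ and $b^{(v)}$ in equation (1).

```python
# Sketch of the first prediction unit (Fig. 4): an LSTM over the token
# sequence plus a softmax projection to the vocabulary, cf. equation (1).
import torch
import torch.nn as nn

class FirstPredictionUnit(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 128, hidden_dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)  # W^(v), b^(v)

    def forward(self, token_ids: torch.Tensor):
        # token_ids: (batch, t+1) holding x_0, ..., x_t
        h, _ = self.lstm(self.embed(token_ids))  # intermediate states h_0..h_t
        y_t = torch.softmax(self.proj(h[:, -1]), dim=-1)  # equation (1)
        return h, y_t
```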
The second prediction unit 302 receives the intermediate state output by the first prediction unit 301 and generates a second prediction result based on the received intermediate state. As shown in fig. 4, the second prediction unit 302 may be implemented by a Sparse Attention Network (SAN) model. In recent years, sparse attention networks have been widely used in various types of deep learning tasks such as natural language processing, image recognition, and speech recognition, and are well known to those skilled in the art, and only the application thereof in the embodiments of the present disclosure will be described herein without further detailed description of the principles thereof.
Specifically, as shown in fig. 4, at time step $t$, the hidden states of the most recent previous $K$ elements are concatenated together as a context representation and stored in a memory $M_t \in \mathbb{R}^{k \times K}$, where $K$ denotes the attention length. Then, at each time step $t$, an attention distribution $\alpha_t \in \mathbb{R}^K$ and a context vector $c_t \in \mathbb{R}^k$ are generated from the memory $M_t$ according to the following equations (2) to (5):

$$M_t = [h_{t-K}, \ldots, h_{t-1}] \tag{2}$$

$$A_t = \tanh\left(W^m M_t + \left(W^h h_t\right) \mathbf{1}_K^T\right) \tag{3}$$

$$\alpha_t = \mathrm{softmax}\left(v^T A_t\right) \tag{4}$$

$$c_t = M_t \alpha_t^T \tag{5}$$

where $W^m, W^h \in \mathbb{R}^{k \times k}$ and $v \in \mathbb{R}^k$ are parameters that can be adjusted through training, $\mathbf{1}_K$ denotes a $K$-dimensional vector whose elements are all 1, $\tanh$ denotes the tanh function, and $\mathrm{softmax}$ denotes the softmax function.

Further, the second prediction unit 302 holds a vector $m_t = [\mathrm{id}_1, \mathrm{id}_2, \ldots, \mathrm{id}_K]^T \in \mathbb{N}^K$ of the symbol ids of these elements (i.e., pointers into the full element set). The sparse distribution over the entire element set, the second prediction result $i_t$, is then obtained according to the following equations (6) and (7):

$$s_t[j] = \begin{cases} \alpha_t[u], & j = \mathrm{id}_u \text{ for some } u \\ -C, & \text{otherwise} \end{cases} \tag{6}$$

$$i_t = \mathrm{softmax}(s_t) \tag{7}$$

where $-C$ is a large negative constant, e.g., $-1000$, which effectively zeroes out the probability of elements outside the attention window.
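The following sketch shows this pointer mechanism: attention weights over the last $K$ hidden states (equations (2) to (5)) are scattered into a vocabulary-sized score vector whose remaining positions are filled with $-C$ (equations (6) and (7)). The function signature and tensor shapes are assumptions for illustration.

```python
# Sketch of the sparse attention pointer of equations (2)-(7).
import torch

def sparse_attention(M_t, h_t, ids, vocab_size, Wm, Wh, v, C=1000.0):
    # M_t: (k, K) memory of the last K hidden states; h_t: (k,) current
    # state; ids: (K,) long tensor of vocabulary ids of those elements.
    A = torch.tanh(Wm @ M_t + (Wh @ h_t).unsqueeze(1))  # eq. (3), broadcast over K
    alpha = torch.softmax(v @ A, dim=-1)                # eq. (4): attention weights
    c = M_t @ alpha                                     # eq. (5): context vector
    scores = torch.full((vocab_size,), -C)              # eq. (6): -C elsewhere
    scores[ids] = alpha                                 # pointer scatter
    i_t = torch.softmax(scores, dim=-1)                 # eq. (7)
    return alpha, c, i_t
```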
The determination unit 303 receives the first prediction result $y_t$ and the second prediction result $i_t$ and determines the subsequent element based on them. For example, in an integrated development environment, the subsequent element is the most likely next token.

Specifically, as shown in fig. 4, the determination unit 303 computes, according to the following equations (8) to (10), a weight distribution $\lambda_t$ over the first prediction result $y_t$ of the first prediction unit 301 and the second prediction result $i_t$ of the second prediction unit 302, and obtains the final weighted prediction result $\hat{y}_t$:

$$g_t = [h_t; c_t] \tag{8}$$

$$\lambda_t = \mathrm{softmax}\left(W^g g_t + b^g\right) \in \mathbb{R}^2 \tag{9}$$

$$\hat{y}_t = \lambda_t[0]\, y_t + \lambda_t[1]\, i_t \tag{10}$$

where $W^g$ and $b^g$ are, respectively, a weight matrix and a bias that can be adjusted through training.

The prediction result $\hat{y}_t$ obtained by the determination unit 303 may be used to predict the subsequent element. For example, in an integrated development environment, the subsequent element is the most likely next token.
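A sketch of this mixing step follows, assuming (as a reconstruction, cf. equations (8) to (10)) that the gate takes the concatenation of $h_t$ and $c_t$ as input.

```python
# Sketch of the determination unit: a trainable gate mixes the LSTM's
# vocabulary distribution y_t with the pointer distribution i_t.
import torch
import torch.nn as nn

class DeterminationUnit(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        # W^g, b^g of equation (9): maps [h_t; c_t] to two mixture weights.
        self.gate = nn.Linear(2 * hidden_dim, 2)

    def forward(self, h_t, c_t, y_t, i_t):
        g = torch.cat([h_t, c_t], dim=-1)                  # eq. (8)
        lam = torch.softmax(self.gate(g), dim=-1)          # eq. (9)
        return lam[..., 0:1] * y_t + lam[..., 1:2] * i_t   # eq. (10)
```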
The information processing apparatus 300 according to the first embodiment of the present disclosure can improve recommendation accuracy without increasing the computational cost.
In the information processing apparatus 300 according to the first embodiment of the present disclosure, the second prediction unit 302 sets the length of attention with respect to any type of element to the same value, i.e., K. In addition, the second prediction unit 302 sets the attention start position with respect to any type of element to the same value, i.e., t-K.
However, the inventors have noted that in practical applications, for example in JavaScript source code, the distribution of the distance between an identifier's definition and its calls differs across identifier types. FIG. 5 is a schematic diagram showing statistics of distances between calls and definitions of different types of identifiers in the JavaScript programming language.
There are five types of identifiers in the JavaScript programming language, namely variable, array, function, class, and attribute. As shown in FIG. 5, over distances of 0-10 (measured in tokens), arrays have the largest proportion of definition-to-call distances and functions the smallest. Over distances of 10-20, the proportions for the five identifier types are roughly similar. Over distances of 20-30, classes have the largest proportion, and over distances of 30-40 and 40-50, functions have the largest proportion.
Based on the above statistical data, the information processing apparatus according to the first embodiment of the present disclosure may be improved to further improve the accuracy of recommendation.
Second embodiment
Fig. 6 is a block diagram illustrating an information processing apparatus 600 according to a second embodiment of the present disclosure. Fig. 7 is a schematic diagram showing an internal configuration of each unit of an information processing apparatus 600 according to a second embodiment of the present disclosure.
Similar to the information processing apparatus 300 according to the first embodiment of the present disclosure, the information processing apparatus 600 according to the second embodiment of the present disclosure also includes a first prediction unit 601, a second prediction unit 602, and a determination unit 603. However, unlike the information processing apparatus 300 according to the first embodiment of the present disclosure, the information processing apparatus 600 according to the second embodiment of the present disclosure includes a plurality of second prediction units 602.
According to the second embodiment of the present disclosure, the number of second prediction units 602 corresponds to the number of element types in the set of elements, there being one corresponding second prediction unit 602 for each element type. For example, for token-level code recommendation in a JavaScript programming environment, there are five second prediction units 602, corresponding to the identifier types variable, array, function, class, and attribute, respectively.
Similar to the first prediction unit 301 of the information processing apparatus 300 according to the first embodiment, the first prediction unit 601 of the information processing apparatus 600 according to the second embodiment may receive a token sequence as the element sequence and generate an intermediate state and a first prediction result based on the token sequence. As shown in fig. 7, the first prediction unit 601 may also be implemented by a long short-term memory neural network.
As shown in fig. 7, the first prediction unit 601 implemented by an LSTM neural network receives a sequence input $(x_0, x_1, \ldots, x_t)$ composed of elements (e.g., tokens) and outputs the intermediate states $(h_0, h_1, \ldots, h_t)$. Further, the first prediction unit 601 computes, as the first prediction result $y_{0,t}$, the distribution of the most likely next element according to the following equation (11):

$$y_{0,t} = \mathrm{softmax}\left(W^{(v)} h_t + b^{(v)}\right) \tag{11}$$

where $W^{(v)} \in \mathbb{R}^{|V| \times k}$ and $b^{(v)} \in \mathbb{R}^{|V|}$ are parameters that can be adjusted through training, and $|V|$ is the total number of elements in the element set, e.g., the total number of all tokens in the programming language.
According to the second embodiment of the present disclosure, the plurality of second prediction units 602 receive the intermediate state output by the first prediction unit 601 and generate a plurality of second prediction results based on the intermediate state and parameters related to the respective element types. As shown in fig. 7, the second prediction units 602 may also be implemented by sparse attention networks. However, since the second prediction units 602 correspond one-to-one to the element types, a different second prediction unit 602 may be configured for each element type.
As described above, in practical applications, different attention start positions and attention lengths may be set for different element types, thereby further improving recommendation accuracy. To this end, in the second embodiment according to the disclosure, respective second prediction units 602 are provided for respective element types, and the respective attention start position and the attention length of each second prediction unit 602 are individually set for the respective element types.
According to the second embodiment of the present disclosure, the parameters related to the respective element types in the plurality of second prediction units 602 may include respective attention start positions and attention lengths set for the respective element types.
As shown in fig. 7, unlike the second prediction unit 302 according to the first embodiment, the second prediction unit 602 according to the second embodiment has four inputs. The first input is the intermediate states $h_0, h_1, \ldots, h_t$ output by the first prediction unit 601. The second input is $b_{i,t-1}$, the attention start position for the $i$-th element type at time step $t-1$, where $1 \le i \le N$ and $N$ denotes the number of element types, i.e., the number of second prediction units 602; for a JavaScript programming environment, $N = 5$. The third input is $l_{i,t-1}$, the attention length for the $i$-th element type at time step $t-1$. The fourth input is $m_{i,t}$, a vector of the symbol ids for element type $i$ at time step $t$.

Further, the second prediction unit 602 according to the second embodiment has four outputs. The first output $c_{i,t}$ is the context vector of the sparse attention network for element type $i$ at time step $t$. The second output $b_{i,t}$ is the attention start position for the $i$-th element type at time step $t$. The third output $l_{i,t}$ is the attention length for the $i$-th element type at time step $t$. The fourth output $y_{i,t}$ is the sparse distribution over the entire element set generated by the respective second prediction unit 602 at time step $t$.
Fig. 8 is a schematic diagram illustrating an internal structure of the second prediction unit 602 used in fig. 7.
As shown in fig. 8, for the second prediction unit 602 corresponding to the $i$-th element type, at time step $t$ the hidden states of the elements in its attention window (starting at position $b_{i,t-1}$ and of length $l_{i,t-1}$) are concatenated together as a context representation and stored in a memory $M_{i,t} \in \mathbb{R}^{k \times l_{i,t-1}}$. Then, at each time step $t$, an attention distribution $\alpha_{i,t} \in \mathbb{R}^{l_{i,t-1}}$ and a context vector $c_{i,t} \in \mathbb{R}^k$ are generated from the memory $M_{i,t}$ according to the following equations (12) to (15):

$$M_{i,t} = \left[h_{b_{i,t-1}}, \ldots, h_{b_{i,t-1}+l_{i,t-1}-1}\right] \tag{12}$$

$$A_{i,t} = \tanh\left(W_i^m M_{i,t} + \left(W_i^h h_t\right) \mathbf{1}_{l_{i,t-1}}^T\right) \tag{13}$$

$$\alpha_{i,t} = \mathrm{softmax}\left(v_i^T A_{i,t}\right) \tag{14}$$

$$c_{i,t} = M_{i,t} \alpha_{i,t}^T \tag{15}$$

where $W_i^m, W_i^h \in \mathbb{R}^{k \times k}$ and $v_i \in \mathbb{R}^k$ are parameters that can be adjusted through training, and $\mathbf{1}_{l_{i,t-1}}$ denotes an $l_{i,t-1}$-dimensional vector whose elements are all 1.

Further, the second prediction unit 602 holds a vector $m_{i,t} = [\mathrm{id}_1, \ldots, \mathrm{id}_{l_{i,t-1}}]^T$ of the symbol ids of these elements (i.e., pointers into the full element set). The sparse distribution over the entire element set is then obtained according to the following equations (16) and (17):

$$s_{i,t}[j] = \begin{cases} \alpha_{i,t}[u], & j = \mathrm{id}_u \text{ for some } u \\ -C, & \text{otherwise} \end{cases} \tag{16}$$

$$y_{i,t} = \mathrm{softmax}(s_{i,t}) \tag{17}$$

where $-C$ is a large negative constant, e.g., $-1000$.
According to the second embodiment of the present disclosure, the second output $b_{i,t}$, representing the attention start position for the $i$-th element type at time step $t$, may be obtained by training according to the following equations (18) to (20):

$$d_{i,t} = \tanh\left(W_i^b h_t\right) \tag{18}$$

$$q_{i,t} = \mathrm{sigmoid}\left(u_i^{b\,T} d_{i,t}\right) \tag{19}$$

$$b_{i,t} = \mathrm{round}(q_{i,t} \times t) \in \mathbb{N}, \quad 0 \le b_{i,t} \le t \tag{20}$$

where $W_i^b$ and $u_i^b$ are parameters that can be adjusted through training, and $\mathrm{round}$ denotes the rounding function, i.e., taking the integer part of $q_{i,t} \times t$, so that $b_{i,t} \in [0, t]$.

Similarly, according to the second embodiment of the present disclosure, the third output $l_{i,t}$, representing the attention length for the $i$-th element type at time step $t$, may be obtained by training according to the following equations (21) to (23):

$$e_{i,t} = \tanh\left(W_i^l h_t\right) \tag{21}$$

$$v_{i,t} = \mathrm{sigmoid}\left(u_i^{l\,T} e_{i,t}\right) \tag{22}$$

$$l_{i,t} = \mathrm{round}(v_{i,t} \times t) \in \mathbb{N}, \quad 0 \le l_{i,t} \le t \tag{23}$$

where $W_i^l$ and $u_i^l$ are parameters that can be adjusted through training, and $\mathrm{round}$ denotes the rounding function, i.e., taking the integer part of $v_{i,t} \times t$, so that $l_{i,t} \in [0, t]$.
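A sketch of these two heads follows; the two-layer tanh/sigmoid structure of equations (18)-(19) and (21)-(22) is a reconstruction from the trainable parameters named in the text.

```python
# Sketch of the adaptive attention window of equations (18)-(23): per
# element type, trainable heads predict the start position b_{i,t} and
# the length l_{i,t} as fractions of the current time step t.
import torch
import torch.nn as nn

class AttentionWindowHead(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.w_b = nn.Linear(hidden_dim, hidden_dim)  # W_i^b
        self.u_b = nn.Linear(hidden_dim, 1)           # u_i^b
        self.w_l = nn.Linear(hidden_dim, hidden_dim)  # W_i^l
        self.u_l = nn.Linear(hidden_dim, 1)           # u_i^l

    def forward(self, h_t: torch.Tensor, t: int):
        q = torch.sigmoid(self.u_b(torch.tanh(self.w_b(h_t))))  # eqs. (18)-(19)
        v = torch.sigmoid(self.u_l(torch.tanh(self.w_l(h_t))))  # eqs. (21)-(22)
        b = torch.round(q * t).long()  # start position, eq. (20)
        l = torch.round(v * t).long()  # attention length, eq. (23)
        return b, l
```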
The determination unit 603 receives the first prediction result $y_{0,t}$ and the second prediction results $y_{i,t}$ and determines the subsequent element based on them. For example, in an integrated development environment, the subsequent element is the most likely next token.

Specifically, as shown in fig. 7, the determination unit 603 computes, according to the following equations (24) to (26), a weight distribution $\lambda_t$ over the first prediction result $y_{0,t}$ of the first prediction unit 601 and the second prediction results $y_{i,t}$ of the second prediction units 602, and obtains the final weighted prediction result $\hat{y}_t$:

$$g_t = [h_t; c_{1,t}; \ldots; c_{N,t}] \tag{24}$$

$$\lambda_t = \mathrm{softmax}\left(W^g g_t + b^g\right) \in \mathbb{R}^{N+1} \tag{25}$$

$$\hat{y}_t = \sum_{i=0}^{N} \lambda_t[i]\, y_{i,t} \tag{26}$$

where $W^g$ and $b^g$ are, respectively, a weight matrix and a bias that can be adjusted through training.

The prediction result $\hat{y}_t$ obtained by the determination unit 603 may be used to predict the subsequent element. For example, in an integrated development environment, the subsequent element is the most likely next token.
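A sketch of this $N+1$-way mixture follows; the gate's input features ($h_t$ concatenated with the per-type context vectors) are a reconstruction, cf. equations (24) to (26).

```python
# Sketch of the determination unit of the second embodiment: one gate
# produces N+1 weights, one for the LSTM prediction y_{0,t} and one for
# each per-type pointer prediction y_{i,t}.
import torch
import torch.nn as nn

class MultiTypeDeterminationUnit(nn.Module):
    def __init__(self, hidden_dim: int, num_types: int):
        super().__init__()
        self.gate = nn.Linear(hidden_dim * (1 + num_types), 1 + num_types)

    def forward(self, h_t, contexts, distributions):
        # contexts: list of N context vectors c_{i,t};
        # distributions: list of N+1 distributions [y_{0,t}, ..., y_{N,t}].
        g = torch.cat([h_t, *contexts], dim=-1)    # eq. (24)
        lam = torch.softmax(self.gate(g), dim=-1)  # eq. (25)
        y = torch.stack(distributions, dim=-1)     # (batch, |V|, N+1)
        return (y * lam.unsqueeze(1)).sum(-1)      # eq. (26)
```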
According to the second embodiment of the present disclosure, different attention lengths and attention start positions can be set for different element types, thereby further improving prediction accuracy without increasing the computational cost.
Furthermore, according to the second embodiment of the present disclosure, attention lengths and attention start positions for different element types may be trained in an iterative manner based on labeled training data to finally determine an attention length and an attention start position suitable for the respective element type.
Fig. 9 is a flowchart illustrating an information processing method 900 according to the present disclosure. The information processing method 900 is used for processing an element sequence composed of a number of elements in an element set to predict a subsequent element, where a logical relationship exists between the elements in the element sequence.
The information processing method 900 starts in step S901. Subsequently, in step S902, a sequence of elements is received and an intermediate state and a first prediction result are generated based on the sequence of elements. Step S902 may be implemented by the first prediction unit 601 according to the second embodiment of the present disclosure.
Subsequently, in step S903, for each element type, an intermediate state is received and one or more second predicted results are generated based on the intermediate state and parameters related to the respective element type. Step S903 may be implemented by the second prediction unit 602 according to the second embodiment of the present disclosure.
Next, in step S904, the first predicted result and the one or more second predicted results are received and the subsequent element is determined based on the first predicted result and the one or more second predicted results. Step S904 may be realized by the determination unit 603 according to the second embodiment of the present disclosure.
Finally, the information processing method 900 ends at step S905.
Although the embodiments of the present disclosure are described above in connection with code recommendation in an integrated development environment, it will be apparent to those skilled in the art that the embodiments of the present disclosure are equally applicable to other applications, such as natural language processing, speech processing, etc., that predict the next most likely element from an existing sequence of elements.
Fig. 10 is a block diagram showing a configuration of a general-purpose machine 1000 that can be used to implement an information processing apparatus and an information processing method according to an embodiment of the present disclosure. General purpose machine 1000 may be, for example, a computer system. It should be noted that the general purpose machine 1000 is only one example and is not intended to suggest any limitation as to the scope of use or functionality of the methods and apparatus of the present disclosure. Neither should the general-purpose machine 1000 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the information processing methods and apparatus described above.
In fig. 10, a Central Processing Unit (CPU)1001 executes various processes in accordance with a program stored in a Read Only Memory (ROM)1002 or a program loaded from a storage section 1008 to a Random Access Memory (RAM) 1003. In the RAM 1003, data necessary when the CPU 1001 executes various processes and the like is also stored as necessary. The CPU 1001, ROM1002, and RAM 1003 are connected to each other via a bus 1004. An input/output interface 1005 is also connected to the bus 1004.
The following components are also connected to the input/output interface 1005: an input section 1006 (including a keyboard, a mouse, and the like), an output section 1007 (including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker and the like), a storage section 1008 (including a hard disk and the like), a communication section 1009 (including a network interface card such as a LAN card, a modem, and the like). The communication section 1009 performs communication processing via a network such as the internet. A drive 1010 may also be connected to the input/output interface 1005 as necessary. A removable medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like can be mounted on the drive 1010 as needed, so that a computer program read out therefrom can be installed into the storage section 1008 as needed.
In the case where the above-described series of processes is realized by software, a program constituting the software may be installed from a network such as the internet or from a storage medium such as the removable medium 1011.
It will be understood by those skilled in the art that such a storage medium is not limited to the removable medium 1011 shown in fig. 10, in which the program is stored, distributed separately from the apparatus to provide the program to the user. Examples of the removable medium 1011 include a magnetic disk (including a flexible disk), an optical disk (including a compact disc read only memory (CD-ROM) and a Digital Versatile Disc (DVD)), a magneto-optical disk (including a mini-disk (MD) (registered trademark)), and a semiconductor memory. Alternatively, the storage medium may be the ROM1002, a hard disk included in the storage section 1008, or the like, in which programs are stored and which are distributed to users together with the device including them.
In addition, the present disclosure also provides a program product storing machine-readable instruction codes which, when read and executed by a machine, can carry out the information processing method according to the present disclosure. Accordingly, the various storage media listed above for carrying such a program product are also included within the scope of the present disclosure.
Specific embodiments of apparatus and/or methods according to the embodiments of the present disclosure have been described in detail above through block diagrams, flowcharts, and/or embodiments. When such block diagrams, flowcharts, and/or implementations contain one or more functions and/or operations, it will be apparent to those skilled in the art that each function and/or operation in them can be implemented, individually and/or collectively, by a variety of hardware, software, firmware, or virtually any combination thereof. In one embodiment, portions of the subject matter described in this specification can be implemented by application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or other integrated forms. However, those skilled in the art will recognize that some aspects of the embodiments described in this specification can be equivalently implemented, in whole or in part, in the form of one or more computer programs running on one or more computers, one or more programs running on one or more processors, as firmware, or as virtually any combination thereof, and that designing the circuitry and/or writing the code for the software and/or firmware of the present disclosure is well within the ability of those skilled in the art in light of this disclosure.
It should be emphasized that the term "comprises/comprising", as used herein, specifies the presence of stated features, elements, steps, or components, but does not preclude the presence or addition of one or more other features, elements, steps, or components. The ordinal terms "first", "second", and the like do not denote an order of execution or importance of the features, elements, steps, or components they qualify, but are used merely to distinguish among these features, elements, steps, or components for clarity of description.
In summary, in the embodiments according to the present disclosure, the present disclosure provides the following schemes, but is not limited thereto:
scheme 1. an information processing apparatus for processing an element sequence made up of a plurality of elements in an element set to predict a subsequent element, there being a logical relationship between the elements in the element sequence, the information processing apparatus comprising:
a first prediction unit that receives the sequence of elements and generates an intermediate state and a first prediction result based on the sequence of elements;
one or more second prediction units, the number of the one or more second prediction units corresponding to the number of element types in the set of elements, there being one corresponding second prediction unit for each element type, the one or more second prediction units receiving the intermediate state and generating one or more second prediction results based on the intermediate state and parameters related to the respective element type; and
a determination unit to receive the first prediction result and the one or more second prediction results and to determine the subsequent element based on the first prediction result and the one or more second prediction results.
Scheme 2. the information processing apparatus according to scheme 1, wherein the first prediction unit is implemented by a long-short term memory network.
Scheme 3. the information processing apparatus of scheme 1, wherein the one or more second prediction units are implemented by a sparse attention network.
Scheme 4. the information processing apparatus according to scheme 3, wherein the parameters related to the respective element types in the one or more second prediction units include respective attention start positions and attention lengths set for the respective element types.
Scheme 5. the information processing apparatus according to scheme 1, wherein the determination unit determines the subsequent element according to weights of the first prediction result and the one or more second prediction results.
Scheme 6. the information processing apparatus of scheme 1, wherein the information processing apparatus is capable of training to adjust parameters of the first prediction unit, the second prediction unit, and the determination unit based on an existing element sequence.
Scheme 7. The information processing apparatus according to scheme 1, wherein the elements are code tokens of a computer programming language, and the element sequence is a code segment of the computer programming language.
Scheme 8. An information processing method for processing an element sequence made up of a number of elements in an element set to predict a subsequent element, there being a logical relationship between the elements in the element sequence, the information processing method comprising:
receiving the sequence of elements and generating an intermediate state and a first prediction result based on the sequence of elements;
for each element type, receiving the intermediate state and generating one or more second predicted results based on the intermediate state and parameters related to the respective element type; and
receiving the first predicted outcome and the one or more second predicted outcomes and determining the subsequent element based on the first predicted outcome and the one or more second predicted outcomes.
Scheme 9. the information processing method according to scheme 8, wherein the parameters related to the respective element types include respective attention start positions and attention lengths set for the respective element types.
Scheme 10. A method for recommending code tokens in a programming environment, the method being implemented by using the information processing apparatus according to any one of schemes 1 to 7.
Scheme 11. A computer-readable storage medium on which a computer program is stored, the computer program, when executed by a computer, causing the computer to execute the information processing method according to scheme 8.
While the disclosure has been disclosed by the description of the specific embodiments thereof, it will be appreciated that those skilled in the art will be able to devise various modifications, improvements, or equivalents of the disclosure within the spirit and scope of the appended claims. Such modifications, improvements and equivalents are also intended to be included within the scope of the present disclosure.

Claims (6)

1. An information processing apparatus for processing an element sequence made up of a number of elements in an element set to predict a subsequent element, there being a logical relationship between the elements in the element sequence, the apparatus comprising:
a first prediction unit that receives the sequence of elements and generates an intermediate state and a first prediction result based on the sequence of elements;
one or more second prediction units, a number of the one or more second prediction units corresponding to a number of element types in the set of elements, there being one corresponding second prediction unit for each element type, the one or more second prediction units receiving the intermediate state and generating one or more second prediction results based on the intermediate state and parameters related to the respective element type; and
a determination unit that receives the first prediction result and the one or more second prediction results and determines the subsequent element based on the first prediction result and the one or more second prediction results,
wherein the one or more second prediction units are implemented by a sparse attention network,
wherein the parameters related to the respective element types in the one or more second prediction units include respective attention start positions and attention lengths set individually for the respective element types, and
wherein the elements are code tokens of a computer programming language, the element sequence is a code segment of the computer programming language, and the subsequent element is the most likely next token.
2. The information processing apparatus according to claim 1, wherein the first prediction unit is realized by a long-short term memory network.
3. The information processing apparatus according to claim 1, wherein the determination unit determines the subsequent element according to weights of the first predicted result and the one or more second predicted results.
4. The information processing apparatus according to claim 1, wherein the information processing apparatus is capable of training to adjust parameters of the first prediction unit, the second prediction unit, and the determination unit based on an existing element sequence.
5. An information processing method for processing an element sequence composed of a plurality of elements in an element set to predict a subsequent element, wherein a logical relationship exists between the elements in the element sequence, the information processing method comprising:
receiving the sequence of elements and generating an intermediate state and a first prediction result based on the sequence of elements;
for each element type, receiving, over a sparse attention network, the intermediate state and generating one or more second predicted results based on the intermediate state and parameters related to the respective element type; and
receiving the first predicted outcome and the one or more second predicted outcomes and determining the subsequent element based on the first predicted outcome and the one or more second predicted outcomes,
wherein the parameters related to the respective element types include respective attention start positions and attention lengths set individually for the respective element types, and
wherein the elements are code tokens of a computer programming language, the element sequence is a code segment of the computer programming language, and the subsequent element is the most likely next token.
6. A method for recommending code tokens in a programming environment, the method being implemented by using the information processing apparatus according to any one of claims 1 to 4.
CN201711328030.5A 2017-12-13 2017-12-13 Information processing apparatus and method, and method of recommending code in programming environment Active CN109918058B (en)

Priority Applications (1)

Application Number: CN201711328030.5A — Priority/Filing Date: 2017-12-13 — Title: Information processing apparatus and method, and method of recommending code in programming environment

Publications (2)

CN109918058A — published 2019-06-21
CN109918058B — granted 2022-08-12

Family ID: 66958735

Family Applications (1)

CN201711328030.5A (Active) — Information processing apparatus and method, and method of recommending code in programming environment

Country Status (1)

CN: CN109918058B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116303731B (en) * 2023-05-22 2023-07-21 四川互慧软件有限公司 Code matching method and device for hospital standard main data and electronic equipment

Citations (3)

Publication number Priority date Publication date Assignee Title
CN104765728A (en) * 2014-01-08 2015-07-08 富士通株式会社 Method and device for training neural network and method for determining sparse feature vector
CN106710596A (en) * 2016-12-15 2017-05-24 腾讯科技(上海)有限公司 Answer statement determination method and device
CN107066973A (en) * 2017-04-17 2017-08-18 杭州电子科技大学 A kind of video content description method of utilization spatio-temporal attention model

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
KR101382078B1 (en) * 2012-07-31 2014-04-17 강성원 Method for contextual code collection recommanation associated with user interation and program coding apparatus using the same
US9715440B2 (en) * 2012-12-19 2017-07-25 Microsoft Technology Licensing, Llc Test scope determination based on code change(s)
EP3161618A4 (en) * 2014-06-30 2017-06-28 Microsoft Technology Licensing, LLC Code recommendation
US10496921B2 (en) * 2016-05-03 2019-12-03 Fujitsu Limited Neural network mapping dictionary generation
CN107463683B (en) * 2017-08-09 2018-07-24 深圳壹账通智能科技有限公司 The naming method and terminal device of code element




Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant