JPWO2021199442A5

Info

Publication number: JPWO2021199442A5
Application number: JP2022511492A
Authority: JP
Filing date: 2020-04-03
Publication date: 2022-05-13
Anticipated expiration: 2040-04-03

Claims

An information processing apparatus comprising an inference processing unit that:
acquires a first relational value, which is a value representing the relationship between a first software artifact and a number of topics q (q is an integer from 2 to p), the number of topics q being included among a plurality of numbers of topics from 2 to p (p is an integer of 3 or more) and reflecting the characteristics of the first software artifact;
calculates, for each of the plurality of numbers of topics from 2 to p, a value representing the relationship between a second software artifact and that number of topics as a second relational value;
selects a number of topics r (r is an integer from 2 to p) from among the numbers of topics from 2 to p, based on the degree of similarity between the first relational value and each of the plurality of second relational values; and
selects a dictionary for the number of topics r, generated for converting the first software artifact into vector data, as a dictionary for converting the second software artifact into vector data.
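The selection step recited in the claim above can be illustrated with a minimal sketch. The claim does not fix a similarity measure or the shape of the relational values, so the sketch assumes that every relational value is a fixed-length numeric vector and that cosine similarity is used; the names `cosine_similarity`, `select_topic_count`, and the sample values are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_topic_count(first_value, second_values):
    """Pick the number of topics r whose second relational value is most
    similar to the first relational value.

    second_values maps each number of topics (2..p) to a vector.
    Ties are resolved in favor of the smallest number of topics.
    """
    return max(
        sorted(second_values),
        key=lambda k: cosine_similarity(first_value, second_values[k]),
    )


# Hypothetical data: first relational value (learned for the first artifact)
# and one second relational value per number of topics, here with p = 4.
first_value = [0.6, 0.3, 0.1]
second_values = {
    2: [0.1, 0.1, 0.8],
    3: [0.5, 0.4, 0.1],
    4: [0.3, 0.3, 0.4],
}

# Dictionaries generated during learning, keyed by number of topics
# (contents are placeholders for illustration only).
learned_dictionaries = {2: ["dict-2"], 3: ["dict-3"], 4: ["dict-4"]}

r = select_topic_count(first_value, second_values)
dictionary_for_second_artifact = learned_dictionaries[r]
```

With these sample values the vector for 3 topics is the closest to the first relational value, so the dictionary learned for 3 topics would be reused for the second software artifact.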

The information processing apparatus according to claim 1, wherein the inference processing unit:
acquires the first relational value from a learning processing unit that converts the first software artifact into vector data using an initial dictionary, divides the vector data obtained by the conversion into topic clusters for each of the plurality of numbers of topics from 2 to p, and generates a dictionary for each topic cluster for each of the plurality of numbers of topics from 2 to p; and
selects the plurality of dictionaries for the number of topics r generated by the learning processing unit as dictionaries for converting the second software artifact into vector data.

The information processing apparatus according to claim 1, wherein the inference processing unit:
acquires, as the first relational value, a latent vector whose elements are the topic distribution for the number of topics q and the latent probabilities, in the topic distribution for the number of topics q, of the words included in the first software artifact; and
calculates, for each of the plurality of numbers of topics from 2 to p, as the second relational value, a latent vector whose elements are the topic distribution for that number of topics and the latent probabilities, in the topic distribution for that number of topics, of the words included in the second software artifact.

The information processing apparatus according to claim 1, wherein the inference processing unit:
acquires the first relational value from a learning processing unit that generates, as the dictionary for converting the first software artifact into vector data, a dictionary for converting first source code into vector data, and that calculates a value representing the relationship between the first source code and the number of topics q as the first relational value;
calculates, for each of the plurality of numbers of topics from 2 to p, a value representing the relationship between second source code and that number of topics as the second relational value; and
selects the dictionary for the number of topics r generated by the learning processing unit as a dictionary for converting the second source code into vector data.

The information processing apparatus according to claim 1, wherein the inference processing unit:
acquires the first relational value from a learning processing unit that generates, as the dictionary for converting the first software artifact into vector data, a dictionary for converting a first software-related document into vector data, and that calculates a value representing the relationship between the first software-related document and the number of topics q as the first relational value;
calculates, for each of the plurality of numbers of topics from 2 to p, a value representing the relationship between a second software-related document and that number of topics as the second relational value; and
selects the dictionary for the number of topics r generated by the learning processing unit as a dictionary for converting the second software-related document into vector data.

An information processing apparatus comprising a learning processing unit that:
converts a first software artifact into vector data using an initial dictionary, for each of a plurality of numbers of topics from 2 to p (p is an integer of 3 or more);
divides the vector data obtained by the conversion into topic clusters for each of the plurality of numbers of topics from 2 to p;
generates, for each of the plurality of numbers of topics from 2 to p and for each topic cluster, a dictionary for converting the first software artifact into vector data;
selects, from among the numbers of topics from 2 to p, a number of topics q (q is an integer from 2 to p) that reflects the characteristics of the first software artifact; and
calculates a value representing the relationship between the first software artifact and the number of topics q as a first relational value.
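The first three steps of the learning process in the claim above (vectorize with an initial dictionary, split into topic clusters for every number of topics from 2 to p, build one dictionary per cluster) can be sketched as follows. The claim does not name a vectorization or clustering technique, so this sketch assumes that the artifact is a list of tokenized files, that vector data means bag-of-words counts, and that a tiny k-means stands in for the clustering; all function names and the sample data are hypothetical. Selecting q and computing the first relational value are left out because the claim does not say how they are done.

```python
def vectorize(tokens, dictionary):
    """Bag-of-words counts of `tokens` over the ordered word list `dictionary`."""
    index = {word: i for i, word in enumerate(dictionary)}
    vector = [0] * len(dictionary)
    for token in tokens:
        if token in index:
            vector[index[token]] += 1
    return vector


def kmeans_labels(vectors, k, iterations=10):
    """Tiny deterministic k-means (first k vectors seed the centroids).

    Returns one cluster label per input vector.
    """
    if k > len(vectors):
        raise ValueError("need at least k vectors")
    centroids = [[float(x) for x in v] for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def build_dictionaries(files, initial_dictionary, p):
    """For every number of topics k in 2..p, cluster the files into k topic
    clusters and build one sorted word list (dictionary) per cluster."""
    vectors = [vectorize(f, initial_dictionary) for f in files]
    dictionaries = {}
    for k in range(2, p + 1):
        labels = kmeans_labels(vectors, k)
        dictionaries[k] = [
            sorted({w for f, label in zip(files, labels) if label == c for w in f})
            for c in range(k)
        ]
    return dictionaries


# Hypothetical first software artifact: four tokenized source files.
files = [
    ["open", "read", "close"],
    ["add", "mul"],
    ["open", "write", "close"],
    ["add", "sub"],
]
initial_dictionary = sorted({w for f in files for w in f})
dictionaries = build_dictionaries(files, initial_dictionary, p=3)
```

In this toy input the two I/O-like files and the two arithmetic-like files end up in separate clusters when k is 2, so `dictionaries[2]` holds one word list per group.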

The information processing apparatus according to claim 6, wherein the learning processing unit:
analyzes, for each of the plurality of numbers of topics from 2 to p, the plurality of topic clusters obtained by dividing the vector data, and selects, from among the numbers of topics from 2 to p, a number of topics s (s is an integer from 2 to p) that reflects the characteristics of the first software artifact;
converts the first software artifact into a plurality of vector data using the plurality of dictionaries for the number of topics s, and combines the plurality of vector data obtained by the conversion; and
analyzes the combined vector data obtained by the combining, and selects, from among the numbers of topics from 2 to p, the number of topics q as the number of topics that reflects the characteristics of the first software artifact.
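The conversion and combining step in the claim above can be pictured with a short sketch. It assumes that each dictionary is an ordered word list, that conversion means bag-of-words counting, and that combining means concatenating the per-dictionary vectors; none of these choices is stated in the claim, and the names `vectorize` and `combined_vector` are hypothetical.

```python
from collections import Counter


def vectorize(tokens, dictionary):
    """Count how often each dictionary word occurs in `tokens`."""
    counts = Counter(tokens)
    return [counts[word] for word in dictionary]


def combined_vector(tokens, dictionaries):
    """Convert `tokens` once per dictionary and concatenate the results."""
    combined = []
    for dictionary in dictionaries:
        combined.extend(vectorize(tokens, dictionary))
    return combined


# Hypothetical example with s = 2, that is, two per-cluster dictionaries.
tokens = ["open", "add", "add"]
cluster_dictionaries = [["close", "open"], ["add", "mul"]]
joined = combined_vector(tokens, cluster_dictionaries)
```

Here `joined` has one slot per word of every dictionary, so its length is the sum of the dictionary sizes.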

The information processing apparatus according to claim 7, wherein the learning processing unit calculates, as the first relational value, a latent vector whose elements are the topic distribution for the number of topics q and the latent probabilities, in the topic distribution for the number of topics q, of the words included in the first software artifact.

The information processing apparatus according to claim 8, wherein the learning processing unit calculates a plurality of topic distributions for the number of topics q, calculates an adjacency matrix representing the distances between the plurality of topic distributions, and calculates the latent vector using the adjacency matrix.
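A matrix of distances between topic distributions, as recited in the claim above, can be sketched as follows. The claim does not specify the distance, so this sketch assumes the Hellinger distance between discrete probability distributions; the function names and the sample distributions are hypothetical. How the latent vector is then derived from the matrix is not stated in the claim and is not modeled here.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total) / math.sqrt(2.0)


def distance_matrix(distributions):
    """Symmetric matrix whose (i, j) entry is the distance between
    distribution i and distribution j."""
    n = len(distributions)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = hellinger(distributions[i], distributions[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Hypothetical topic distributions, each over the same four words.
topic_distributions = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.25, 0.25, 0.25, 0.25],
]
adjacency = distance_matrix(topic_distributions)
```

The first two sample distributions share no probability mass, so their entry is the maximum Hellinger distance of 1, while the diagonal stays at 0.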

An information processing method comprising:
acquiring a first relational value, which is a value representing the relationship between a first software artifact and a number of topics q (q is an integer from 2 to p), the number of topics q being included among a plurality of numbers of topics from 2 to p (p is an integer of 3 or more) and reflecting the characteristics of the first software artifact;
calculating, for each of the plurality of numbers of topics from 2 to p, a value representing the relationship between a second software artifact and that number of topics as a second relational value;
selecting a number of topics r (r is an integer from 2 to p) from among the numbers of topics from 2 to p, based on the degree of similarity between the first relational value and each of the plurality of second relational values; and
selecting a dictionary for the number of topics r, generated for converting the first software artifact into vector data, as a dictionary for converting the second software artifact into vector data.

An information processing method in which a computer:
converts a first software artifact into vector data using an initial dictionary, for each of a plurality of numbers of topics from 2 to p (p is an integer of 3 or more);
divides the vector data obtained by the conversion into topic clusters for each of the plurality of numbers of topics from 2 to p;
generates, for each of the plurality of numbers of topics from 2 to p and for each topic cluster, a dictionary for converting the first software artifact into vector data;
selects, from among the numbers of topics from 2 to p, a number of topics q (q is an integer from 2 to p) that reflects the characteristics of the first software artifact; and
calculates a value representing the relationship between the first software artifact and the number of topics q as a first relational value.

An information processing program that causes a computer to execute an inference process of:
acquiring a first relational value, which is a value representing the relationship between a first software artifact and a number of topics q (q is an integer from 2 to p), the number of topics q being included among a plurality of numbers of topics from 2 to p (p is an integer of 3 or more) and reflecting the characteristics of the first software artifact;
calculating, for each of the plurality of numbers of topics from 2 to p, a value representing the relationship between a second software artifact and that number of topics as a second relational value;
selecting a number of topics r (r is an integer from 2 to p) from among the numbers of topics from 2 to p, based on the degree of similarity between the first relational value and each of the plurality of second relational values; and
selecting a dictionary for the number of topics r, generated for converting the first software artifact into vector data, as a dictionary for converting the second software artifact into vector data.

An information processing program that causes a computer to execute a learning process of:
converting a first software artifact into vector data using an initial dictionary, for each of a plurality of numbers of topics from 2 to p (p is an integer of 3 or more);
dividing the vector data obtained by the conversion into topic clusters for each of the plurality of numbers of topics from 2 to p;
generating, for each of the plurality of numbers of topics from 2 to p and for each topic cluster, a dictionary for converting the first software artifact into vector data;
selecting, from among the numbers of topics from 2 to p, a number of topics q (q is an integer from 2 to p) that reflects the characteristics of the first software artifact; and
calculating a value representing the relationship between the first software artifact and the number of topics q as a first relational value.