JPWO2021199442A5 - - Google Patents
- Publication number
- JPWO2021199442A5
- Authority
- JP
- Japan
- Prior art keywords
- topics
- topic
- vector data
- software product
- dictionary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Claims (13)
The information processing apparatus according to claim 1, wherein the inference processing unit:
acquires the first relational value from a learning processing unit that converts the first software product into vector data using an initial dictionary, divides the vector data obtained by the conversion into topic clusters at each of a plurality of numbers of topics from the number of topics 2 to the number of topics p, and generates a dictionary for each topic cluster at each of those numbers of topics; and
selects the plurality of dictionaries for the number of topics r generated by the learning processing unit as dictionaries for converting the second software product into vector data.
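The dictionary-based conversion and the per-cluster dictionaries described in this claim can be illustrated with a minimal sketch. The claims do not define the dictionary format, the vectorization, or the clustering method, so everything here is an assumption: a dictionary is taken to be a word-to-index mapping, vector data a bag-of-words count vector, and the cluster labels are supplied by hand. The function names `build_dictionary`, `to_vector`, and `dictionaries_per_cluster` are hypothetical.

```python
from collections import Counter


def build_dictionary(documents):
    """Map each distinct token to a column index, in first-seen order."""
    dictionary = {}
    for tokens in documents:
        for token in tokens:
            dictionary.setdefault(token, len(dictionary))
    return dictionary


def to_vector(tokens, dictionary):
    """Bag-of-words count vector; tokens missing from the dictionary are ignored."""
    counts = Counter(t for t in tokens if t in dictionary)
    vector = [0] * len(dictionary)
    for token, n in counts.items():
        vector[dictionary[token]] = n
    return vector


def dictionaries_per_cluster(documents, cluster_labels):
    """Build one dictionary per topic cluster from the documents assigned to it."""
    clusters = {}
    for tokens, label in zip(documents, cluster_labels):
        clusters.setdefault(label, []).append(tokens)
    return {label: build_dictionary(docs) for label, docs in clusters.items()}


# Toy "software products", already tokenized (assumption).
docs = [["open", "file", "read"], ["read", "socket"], ["draw", "pixel", "draw"]]
initial = build_dictionary(docs)                    # stands in for the initial dictionary
vectors = [to_vector(d, initial) for d in docs]     # vector data from the initial dictionary
# Hypothetical cluster assignment for a number of topics of 2.
per_cluster = dictionaries_per_cluster(docs, [0, 0, 1])
```

In a fuller version, the last step would be repeated for every number of topics from 2 to p, giving one set of dictionaries per candidate number of topics.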
The information processing apparatus according to claim 1, wherein the inference processing unit:
acquires, as the first relational value, the topic distribution for the number of topics q and a latent vector whose elements are the latent probabilities, in the topic distribution for the number of topics q, of the words included in the first software product; and
calculates, for each of the plurality of numbers of topics from the number of topics 2 to the number of topics p, as the second relational value, the topic distribution for that number of topics and a latent vector whose elements are the latent probabilities, in the topic distribution for that number of topics, of the words included in the second software product.
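One possible reading of a "latent vector" whose elements are word probabilities under the topic distributions is sketched below. The claim does not give a formula, so this is an assumption for illustration only: each topic distribution is modeled as a word-to-probability mapping, and each element of the vector is the mean probability of the product's words under one topic. The function name `latent_vector` and the sample distributions are hypothetical.

```python
def latent_vector(words, topic_word_probs):
    """One element per topic: mean P(word | topic) over the given words.

    Words that a topic does not contain contribute probability 0.0.
    """
    if not words:
        return [0.0] * len(topic_word_probs)
    return [
        sum(topic.get(w, 0.0) for w in words) / len(words)
        for topic in topic_word_probs
    ]


# Hypothetical topic distributions for a number of topics q = 2.
topics_q = [
    {"read": 0.5, "file": 0.5},
    {"draw": 0.75, "pixel": 0.25},
]
vec = latent_vector(["read", "file", "draw", "draw"], topics_q)
```

Under this reading, the second relational value would be obtained by calling the same function once per number of topics from 2 to p, using the topic distributions for that number of topics and the words of the second software product.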
The information processing apparatus according to claim 1, wherein the inference processing unit:
acquires the first relational value from a learning processing unit that generates, as the dictionary for converting the first software product into vector data, a dictionary for converting first source code into vector data, and that calculates, as the first relational value, a value representing the relationship between the first source code and the number of topics q;
calculates, for each of the plurality of numbers of topics from the number of topics 2 to the number of topics p, as the second relational value, a value representing the relationship between second source code and that number of topics; and
selects the dictionary for the number of topics r generated by the learning processing unit as the dictionary for converting the second source code into vector data.
The information processing apparatus according to claim 1, wherein the inference processing unit:
acquires the first relational value from a learning processing unit that generates, as the dictionary for converting the first software product into vector data, a dictionary for converting a first software-related document into vector data, and that calculates, as the first relational value, a value representing the relationship between the first software-related document and the number of topics q;
calculates, for each of the plurality of numbers of topics from the number of topics 2 to the number of topics p, as the second relational value, a value representing the relationship between a second software-related document and that number of topics; and
selects the dictionary for the number of topics r generated by the learning processing unit as the dictionary for converting the second software-related document into vector data.
The information processing apparatus according to claim 6, wherein the learning processing unit:
analyzes, at each of the plurality of numbers of topics from the number of topics 2 to the number of topics p, the plurality of topic clusters obtained by dividing the vector data, and selects, from among the numbers of topics 2 to p, a number of topics s (s being an integer from 2 to p) that reflects the characteristics of the first software product;
converts the first software product into a plurality of pieces of vector data using the plurality of dictionaries for the number of topics s, and combines the plurality of pieces of vector data obtained by the conversion; and
analyzes the combined vector data obtained by the combining, and selects, from among the numbers of topics 2 to p, the number of topics q as the number of topics that reflects the characteristics of the first software product.
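The conversion with several dictionaries followed by combination can be sketched as below. The claim does not say how the pieces of vector data are combined; simple concatenation is assumed here, with each dictionary modeled as a word-to-index mapping and each piece as a count vector. The names `to_vector`, `combined_vector`, and `dicts_s` are hypothetical.

```python
def to_vector(tokens, dictionary):
    """Count vector over one dictionary; unknown tokens are skipped."""
    vector = [0] * len(dictionary)
    for token in tokens:
        index = dictionary.get(token)
        if index is not None:
            vector[index] += 1
    return vector


def combined_vector(tokens, dictionaries):
    """Convert with each dictionary in turn and concatenate the results."""
    combined = []
    for dictionary in dictionaries:
        combined.extend(to_vector(tokens, dictionary))
    return combined


# Hypothetical dictionaries for a number of topics s = 2 (one per topic cluster).
dicts_s = [{"open": 0, "read": 1}, {"draw": 0, "pixel": 1, "line": 2}]
joined = combined_vector(["read", "draw", "read", "line"], dicts_s)
```

With concatenation, the combined vector has one block of columns per topic cluster, so its length is the sum of the dictionary sizes.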
The information processing apparatus according to claim 7, wherein the learning processing unit calculates, as the first relational value, the topic distribution for the number of topics q and a latent vector whose elements are the latent probabilities, in the topic distribution for the number of topics q, of the words included in the first software product.
The information processing apparatus according to claim 8, wherein the learning processing unit calculates a plurality of topic distributions for the number of topics q, calculates an adjacency matrix representing the distances between the plurality of topic distributions, and calculates the latent vector using the adjacency matrix.
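An adjacency matrix of pairwise distances between topic distributions can be sketched as follows. The claim does not name a distance measure; the Hellinger distance is used here purely as an assumption, and each topic distribution is modeled as a word-to-probability mapping. The sketch stops at the matrix, because the claim does not state how the latent vector is derived from it. The names `hellinger` and `distance_matrix` are hypothetical.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two word-probability mappings (range 0 to 1)."""
    vocab = sorted(set(p) | set(q))  # fixed order keeps the float sum deterministic
    total = sum(
        (math.sqrt(p.get(w, 0.0)) - math.sqrt(q.get(w, 0.0))) ** 2 for w in vocab
    )
    return math.sqrt(total / 2.0)


def distance_matrix(distributions):
    """Square matrix whose (i, j) entry is the distance between topics i and j."""
    n = len(distributions)
    return [
        [hellinger(distributions[i], distributions[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical topic distributions for a number of topics q = 3.
topics = [
    {"read": 0.5, "file": 0.5},
    {"read": 0.5, "socket": 0.5},
    {"draw": 1.0},
]
adjacency = distance_matrix(topics)
```

The matrix is symmetric with a zero diagonal, and topics that share no words (here the first and third) are at the maximum distance of 1.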
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2020/015380 WO2021199442A1 (en) | 2020-04-03 | 2020-04-03 | Information processing device, information processing method, and information processing program |
Publications (3)
Publication Number | Publication Date |
---|---|
JPWO2021199442A1 (en) | 2021-10-07 |
JPWO2021199442A5 (en) | 2022-05-13 |
JP7113998B2 (en) | 2022-08-05 |
Family
ID=77928725
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2022511492A (Active), JP7113998B2 (en) | 2020-04-03 | 2020-04-03 | Information processing device, information processing method and information processing program |
Country Status (2)
Country | Link |
---|---|
JP (1) | JP7113998B2 (en) |
WO (1) | WO2021199442A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024069741A1 (en) * | 2022-09-27 | 2024-04-04 | 三菱電機株式会社 | Software technological field extraction device and software technological field extraction method |
2020
- 2020-04-03: JP application JP2022511492A (JP7113998B2), status: Active
- 2020-04-03: WO application PCT/JP2020/015380 (WO2021199442A1), status: Application Filing