CN111090462A - API (application program interface) matching method and device based on API document - Google Patents
- Publication number
- CN111090462A (application number CN201911239725.5A)
- Authority
- CN
- China
- Prior art keywords
- api
- information
- description
- similarity value
- document
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/73—Program documentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
Abstract
The invention discloses an API matching method and device based on API documents. The method extracts API information by parsing the description document of an API. The API information includes input information, output information, and behavior information. Similarity is then calculated separately for the input information, the output information, and the behavior information of two APIs, and the results are combined to determine whether the two APIs match. By combining input, output, and behavior information, the invention improves the accuracy of API matching.
Description
Technical Field
The present invention relates to the field of automation of software design and development.
Background
Software developers often need to rewrite a project in a different programming language in order to migrate it to a different platform. With the dramatic increase in the amount of software, relying solely on manual migration is time-consuming and laborious. Many code migration tools have been developed to speed up the migration of the same project between languages, but they all face the challenge of API matching, i.e. how to match an API in language A to an API in language B.
To address the API matching challenge, the mainstream approach is to obtain API mappings by analyzing and learning from project source code in different languages. However, this approach places strict requirements on the data set: for example, it needs multiple large-scale projects implemented identically in different languages, equivalent code fragments, and large API mapping data sets. Another current approach, based on API documents, mainly uses statistics and text similarity to obtain API mappings, but it does not make full use of the semantic information in the documents, such as formal parameter descriptions, return value descriptions, and API signatures, and therefore cannot match APIs well.
Therefore, the problem to be solved is how to avoid the strict data requirements of code-based approaches and achieve better API matching by making full use of the semantic information in API documents.
Disclosure of Invention
The problem to be solved by the invention is how to match the APIs of two languages to each other.
To solve this problem, the invention adopts the following scheme:
The API matching method based on the API document comprises the following steps:
S1: obtaining description documents of at least two APIs;
S2: extracting API information by parsing the description document of each API;
the API information includes: input information, output information, and behavior information;
S3: calculating similarity separately for the input information, the output information, and the behavior information of the two pieces of API information, and determining, after combining the results, whether the two APIs match;
the step S2 includes the following steps:
S21: extracting the API name, the input parameters, and the return type from the description document of the API;
S22: extracting keywords from the API function description text of the description document of the API as the behavior information;
S23: extracting keywords from the API parameter description text of the description document of the API, the keywords and the corresponding input parameters forming the input information;
S24: extracting keywords from the API return description text of the description document of the API, the keywords and the return type forming the output information;
the step S3 includes:
S31: calculating the similarity of the keywords in the behavior information of the two pieces of API information to obtain a first similarity value;
S32: calculating the similarity of the keywords and the input parameters in the input information of the two pieces of API information to obtain a second similarity value;
S33: calculating the similarity of the keywords and the return types in the output information of the two pieces of API information to obtain a third similarity value;
S34: taking a weighted average of the first similarity value, the second similarity value, and the third similarity value to obtain an API similarity value.
Further, in the API matching method based on API documents of the present invention, in step S1 description documents of APIs in two languages are obtained: a description document of an API of a first language and description documents of a plurality of APIs of a second language; through steps S2 and S3, the API similarity value between the description document of the first-language API and the description document of each second-language API is calculated, and the second-language description document with the highest API similarity value is selected as the matching result for the description document of the first-language API.
The API matching device based on the API document comprises the following modules:
M1, used for: obtaining description documents of at least two APIs;
M2, used for: extracting API information by parsing the description document of each API;
the API information includes: input information, output information, and behavior information;
M3, used for: calculating similarity separately for the input information, the output information, and the behavior information of the two pieces of API information, and determining, after combining the results, whether the two APIs match;
the module M2 includes:
M21, used for: extracting the API name, the input parameters, and the return type from the description document of the API;
M22, used for: extracting keywords from the API function description text of the description document of the API as the behavior information;
M23, used for: extracting keywords from the API parameter description text of the description document of the API, the keywords and the corresponding input parameters forming the input information;
M24, used for: extracting keywords from the API return description text of the description document of the API, the keywords and the return type forming the output information;
the module M3 includes:
M31, used for: calculating the similarity of the keywords in the behavior information of the two pieces of API information to obtain a first similarity value;
M32, used for: calculating the similarity of the keywords and the input parameters in the input information of the two pieces of API information to obtain a second similarity value;
M33, used for: calculating the similarity of the keywords and the return types in the output information of the two pieces of API information to obtain a third similarity value;
M34, used for: taking a weighted average of the first similarity value, the second similarity value, and the third similarity value to obtain an API similarity value.
Further, in the API matching device based on API documents of the present invention, in module M1 description documents of APIs in two languages are obtained: a description document of an API of a first language and description documents of a plurality of APIs of a second language; through modules M2 and M3, the API similarity value between the description document of the first-language API and the description document of each second-language API is calculated, and the second-language description document with the highest API similarity value is selected as the matching result for the description document of the first-language API.
The invention has the following technical effect: by combining input, output, and behavior information, it improves the accuracy of API matching.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
FIG. 2 is an example of an API description document used as input in an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
In this embodiment, the API matching method based on the API document is used to match APIs of the Java language with APIs of the Swift language. Mobile applications on the Android system are typically developed in Java, while mobile applications on the Apple system are typically developed in Swift. Specifically, given a Java API of a certain project, the method finds the corresponding Swift API in the Swift API set of the same project. First, the description document of the given Java API and the description documents of the APIs in the Swift API set are obtained. Then the API similarity value between the Java description document and each Swift description document is calculated, and the Swift API whose description document has the largest API similarity value is selected as the match for the Java API. As shown in FIG. 1, calculating the API similarity value of two description documents consists mainly of two steps: step S2 extracts the input, output, and behavior information, and step S3 calculates the similarity from that information.
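The selection of the best-matching Swift API described above can be sketched as a simple maximum search. This is a minimal illustration, not the patented implementation: the names `best_match` and `toy_similarity` are hypothetical, and `toy_similarity` is only a word-overlap stand-in for the full S2 and S3 computation.

```python
from typing import Callable, Dict, Optional, Tuple


def best_match(
    source_doc: str,
    candidate_docs: Dict[str, str],
    api_similarity: Callable[[str, str], float],
) -> Optional[Tuple[str, float]]:
    """Return (candidate name, score) for the largest API similarity value."""
    best: Optional[Tuple[str, float]] = None
    for name, doc in candidate_docs.items():
        score = api_similarity(source_doc, doc)
        if best is None or score > best[1]:
            best = (name, score)
    return best


def toy_similarity(doc_a: str, doc_b: str) -> float:
    """Hypothetical stand-in for steps S2 and S3: shared-word ratio."""
    a, b = set(doc_a.lower().split()), set(doc_b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


candidates = {
    "append": "adds elements to the end",
    "insert": "inserts elements at a position",
}
print(best_match("inserts all elements at the position", candidates, toy_similarity))
```

Passing the similarity function as a parameter keeps the selection logic independent of how the API similarity value is actually computed.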
Step S2 specifically includes the following steps:
S21: extracting the API name, the input parameters, and the return type from the description document of the API;
S22: extracting keywords from the API function description text of the description document of the API as the behavior information;
S23: extracting keywords from the API parameter description text of the description document of the API, the keywords and the corresponding input parameters forming the input information;
S24: extracting keywords from the API return description text of the description document of the API, the keywords and the return type forming the output information.
Taking FIG. 2 as an example, the API description in FIG. 2 defines an API named addAll. The description document of the API comprises four parts: the first part is the API definition text, namely "LinkedList Boolean addAll (LinkedList, int index, Collection c)"; the second part is the API function description text, i.e. the content under Description; the third part is the API parameter description text, i.e. the content under Parameters; the fourth part is the API return description text, i.e. the content under Return.
In step S21, the first part of the description document, the API definition text, is processed. The extracted API name is addAll; the input parameters are {{LinkedList, anonym}, {int, index}, {Collection, c}}; and the return type is Boolean. The input parameters are expressed as a set of {p_type, p_name} pairs, where p_type is the type of an input parameter and p_name is its name.
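Step S21 can be illustrated with a small parser for definition text of the shape used in the FIG. 2 example. This is a sketch under stated assumptions: the function name `parse_definition` is hypothetical, the last two tokens before the parenthesis are assumed to be the return type and the API name, and parameter types that themselves contain commas (such as generics) are not handled.

```python
import re
from typing import List, Tuple


def parse_definition(text: str) -> Tuple[str, List[Tuple[str, str]], str]:
    """Split definition text into (API name, [(p_type, p_name)], return type)."""
    match = re.match(r"\s*(.*?)\s*\((.*)\)\s*$", text)
    if match is None:
        raise ValueError("definition text has no parameter list")
    head, param_text = match.groups()
    head_tokens = head.split()
    if len(head_tokens) < 2:
        raise ValueError("expected at least a return type and an API name")
    name, return_type = head_tokens[-1], head_tokens[-2]

    params: List[Tuple[str, str]] = []
    for raw in param_text.split(","):
        tokens = raw.split()
        if not tokens:
            continue
        # A parameter given only as a type gets the placeholder name "anonym".
        p_name = tokens[1] if len(tokens) > 1 else "anonym"
        params.append((tokens[0], p_name))
    return name, params, return_type


print(parse_definition("LinkedList Boolean addAll (LinkedList, int index, Collection c)"))
```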
In step S22, the second part of the description document, the API function description text, is processed: stop words and numbers are removed from the first sentence, and each remaining word is stemmed to obtain the keywords. In the example of FIG. 2, the stemmed keywords "insert", "element", "spec", "collect", "list", "start", "posit" are obtained. The set of stemmed keywords is the behavior information.
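The keyword extraction of step S22 can be sketched as follows. The patent does not name a stop-word list or a stemming algorithm, so both are assumptions here: `STOP_WORDS` is a small hypothetical list, and `crude_stem` is a deliberately simple suffix stripper standing in for a real stemmer, so its stems differ slightly from those listed for FIG. 2 (for example "specifi" rather than "spec"). The sample sentence is also assumed, chosen to be consistent with the FIG. 2 keywords.

```python
import re
from typing import List

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "in", "into", "at", "to", "this", "all",
              "is", "be", "from", "which", "as", "if"}
SUFFIXES = ("ing", "ed", "ion", "s")


def crude_stem(word: str) -> str:
    """Strip one common suffix; a stand-in for a real stemming algorithm."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_keywords(text: str) -> List[str]:
    """Keep the first sentence, drop stop words and digits, stem the rest."""
    first_sentence = re.split(r"(?<=[.!?])\s+", text.strip(), maxsplit=1)[0]
    # Matching letters only discards numbers and punctuation.
    words = re.findall(r"[A-Za-z]+", first_sentence.lower())
    keywords: List[str] = []
    for word in words:
        if word in STOP_WORDS:
            continue
        stem = crude_stem(word)
        if stem not in keywords:
            keywords.append(stem)
    return keywords


sample = ("Inserts all of the elements in the specified collection into this "
          "list, starting at the specified position. Shifts the element at position 3.")
print(extract_keywords(sample))
```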
In step S23, the third part of the description document, the API parameter description text, is processed: for each input parameter, stop words and numbers are removed from the first sentence of its description, each remaining word is stemmed to obtain the keywords, and the keywords are then associated with that parameter. In the example of FIG. 2, the stemmed keywords "insert", "first", "element", "spec", "collect" are obtained for the input parameter {int, index}, and the stemmed keywords "collect", "contact", "element", "add", "list" are obtained for the input parameter {Collection, c}. The resulting input information can be divided into two parts: the first part is the input parameter type information, namely {"LinkedList", "int", "Collection"}; the second part is the input parameter semantic information, namely one keyword set per input parameter: { }, {"insert", "first", "element", "spec", "collect"}, {"collect", "contact", "element", "add", "list"}.
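The two-part input information described above maps naturally onto a small record type. The following is a minimal sketch; the names `InputInfo` and `build_input_info` are hypothetical, and a parameter without a description is assumed to get an empty keyword set, as the first parameter does in the FIG. 2 example.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Set, Tuple


@dataclass
class InputInfo:
    types: List[str]            # input parameter type information
    semantics: List[Set[str]]   # one stemmed keyword set per parameter


def build_input_info(
    params: Sequence[Tuple[str, str]],
    param_keywords: Dict[str, Sequence[str]],
) -> InputInfo:
    """Pair each (p_type, p_name) with the keywords found for p_name."""
    types = [p_type for p_type, _ in params]
    semantics = [set(param_keywords.get(p_name, ())) for _, p_name in params]
    return InputInfo(types, semantics)


info = build_input_info(
    [("LinkedList", "anonym"), ("int", "index"), ("Collection", "c")],
    {
        "index": ["insert", "first", "element", "spec", "collect"],
        "c": ["collect", "contact", "element", "add", "list"],
    },
)
print(info.types)
print(info.semantics)
```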
In step S24, the fourth part of the description document, the API return description text, is processed: stop words and numbers are removed from the first sentence, and each remaining word is stemmed to obtain the keywords. In the example of FIG. 2, the output information is {Boolean, "true", "list", "change", "result"}.
It should be noted that, in steps S22, S23, and S24, removing stop words and numbers from the text and stemming the words in the text are familiar to those skilled in the art, so the specific processing is not described in detail here.
Step S3 calculates similarity values for the behavior information, the input information, and the output information separately, and then combines them to obtain an API similarity value, specifically:
S31: calculating the similarity of the keywords in the behavior information of the two pieces of API information to obtain a first similarity value;
S32: calculating the similarity of the keywords and the input parameters in the input information of the two pieces of API information to obtain a second similarity value;
S33: calculating the similarity of the keywords and the return types in the output information of the two pieces of API information to obtain a third similarity value;
S34: taking a weighted average of the first similarity value, the second similarity value, and the third similarity value to obtain an API similarity value.
In step S31, the first similarity value is the similarity between the two sets of stemmed keywords that form the behavior information. Similarity calculation between keyword sets is well known to those skilled in the art and is not described in detail here.
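The text leaves the choice of set similarity open. One common option, used here purely as an assumption, is the Jaccard coefficient (size of the intersection divided by size of the union). The second keyword set below is invented for illustration, and treating two empty sets as fully similar is a design choice of this sketch.

```python
from typing import Iterable


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard coefficient of two keyword collections."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # assumption: two empty sets count as identical
    return len(set_a & set_b) / len(set_a | set_b)


java_behavior = {"insert", "element", "spec", "collect", "list", "start", "posit"}
other_behavior = {"insert", "element", "list", "index"}  # hypothetical second API
print(jaccard(java_behavior, other_behavior))  # 3 shared of 8 distinct -> 0.375
```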
In step S32, a similarity value is first calculated from the input parameter type information, then a similarity value is calculated from the input parameter semantic information, and finally the two are combined.
Calculating the similarity value from the input parameter type information means calculating the similarity between the input parameter type information of the two APIs. For two APIs to match, their input parameter types must be consistent, so this similarity value is either 1 or 0: if the input parameter type information of the two APIs is the same, the similarity is 1; otherwise it is 0. In the example of FIG. 2, it suffices to check whether the input parameter type information of the other API is the same as {"LinkedList", "int", "Collection"}.
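The binary type check can be sketched as below. The text does not say how type names from different languages are judged "the same", so the optional `aliases` mapping is a hypothetical addition for normalizing names before comparison; comparing the types in declaration order is likewise an assumption.

```python
from typing import Mapping, Optional, Sequence


def type_similarity(
    types_a: Sequence[str],
    types_b: Sequence[str],
    aliases: Optional[Mapping[str, str]] = None,
) -> float:
    """Return 1.0 when the parameter type lists agree, else 0.0."""
    aliases = aliases or {}
    # Hypothetical normalization step: map each type name to a canonical one.
    norm_a = [aliases.get(t, t) for t in types_a]
    norm_b = [aliases.get(t, t) for t in types_b]
    return 1.0 if norm_a == norm_b else 0.0


java_types = ["LinkedList", "int", "Collection"]
print(type_similarity(java_types, ["LinkedList", "int", "Collection"]))  # 1.0
print(type_similarity(java_types, ["LinkedList", "Collection"]))         # 0.0
```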
Calculating the similarity value from the input parameter semantic information means comparing the input parameter semantic information of the two APIs. Because there are several input parameters, each input parameter can be compared individually and the results averaged. In the example of FIG. 2, the keyword sets { }, {"insert", "first", "element", "spec", "collect"}, {"collect", "contact", "element", "add", "list"} corresponding to the input parameters are each compared with the keyword set of the corresponding input parameter of the other API, and the average of the resulting similarity values is taken. Those skilled in the art will understand that the semantic information of all input parameters may instead be merged before the similarity value is calculated. In the example of FIG. 2, merging yields the keyword set {"insert", "first", "element", "spec", "collect", "contact", "element", "add", "list"}, which is then compared with the merged keyword set of the other API's input parameters to calculate a similarity value.
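Both variants, the per-parameter average and the merged comparison, can be sketched side by side. Several points are assumptions of this sketch rather than statements from the text: Jaccard is used as the set similarity, parameters are paired by position, APIs with different parameter counts score 0 in the averaged variant, and two empty keyword sets count as identical. The second API's keyword sets are invented for illustration.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    if not a and not b:
        return 1.0  # assumption: two empty sets count as identical
    return len(a & b) / len(a | b)


def per_parameter_average(sem_a: List[Set[str]], sem_b: List[Set[str]]) -> float:
    """Compare parameters position by position, then average."""
    if len(sem_a) != len(sem_b):
        return 0.0
    if not sem_a:
        return 1.0
    return sum(jaccard(x, y) for x, y in zip(sem_a, sem_b)) / len(sem_a)


def merged_similarity(sem_a: List[Set[str]], sem_b: List[Set[str]]) -> float:
    """Merge all keyword sets of each API, then compare once."""
    return jaccard(set().union(*sem_a), set().union(*sem_b))


java_sem = [set(),
            {"insert", "first", "element", "spec", "collect"},
            {"collect", "contact", "element", "add", "list"}]
other_sem = [set(), {"insert", "element"}, {"collect", "element"}]  # hypothetical
print(per_parameter_average(java_sem, other_sem))
print(merged_similarity(java_sem, other_sem))
```

The averaged variant rewards agreement parameter by parameter, while the merged variant ignores which parameter a keyword came from, so the two can give noticeably different values for the same pair of APIs.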
The similarity value calculated from the input parameter type information and the similarity value calculated from the input parameter semantic information can generally be combined in two ways: the first is a weighted average and the second is a product. Because the input parameter types of two matching APIs must be consistent, this embodiment adopts the latter: if the similarity value calculated from the input parameter type information is a and the similarity value calculated from the input parameter semantic information is b, then the combined second similarity value is a × b.
Another preferable approach is to set the second similarity value directly to 0 if the input parameter type information of the two APIs differs, and otherwise to calculate the similarity value from the input parameter semantic information and use it as the second similarity value.
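Because the type similarity a is always 0 or 1, the product a × b and the short-circuit variant yield the same second similarity value; the short-circuit form merely skips the semantic comparison when the types differ. A minimal sketch, with hypothetical function names and the semantic similarity supplied as a callable so that it is evaluated only when needed:

```python
from typing import Callable, Sequence


def second_similarity_product(type_sim: float, semantic_sim: float) -> float:
    """Combine as a x b, where type_sim is 0.0 or 1.0."""
    return type_sim * semantic_sim


def second_similarity_short_circuit(
    types_a: Sequence[str],
    types_b: Sequence[str],
    semantic_sim: Callable[[], float],
) -> float:
    """Return 0 at once when the types differ; otherwise compute b lazily."""
    if list(types_a) != list(types_b):
        return 0.0
    return semantic_sim()


types = ["LinkedList", "int", "Collection"]
print(second_similarity_product(1.0, 0.6))
print(second_similarity_short_circuit(types, types, lambda: 0.6))
print(second_similarity_short_circuit(types, ["int"], lambda: 0.6))
```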
Step S33 is similar to step S32. First, the return types of the two APIs are compared: if they are not consistent, the third similarity value is 0; otherwise, the third similarity value is the similarity calculated from the stemmed keyword sets in the output information. In the example of FIG. 2, the similarity between the stemmed keyword set {"true", "list", "change", "result"} and the keyword set in the output information of the other API is calculated to obtain the third similarity value.
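Step S33 follows the same gate-then-compare pattern. In this sketch the output information is modeled as a (return type, keyword set) pair, and Jaccard is again an assumed choice of set similarity; the second API's output information is invented for illustration.

```python
from typing import Set, Tuple

OutputInfo = Tuple[str, Set[str]]  # (return type, stemmed keywords)


def jaccard(a: Set[str], b: Set[str]) -> float:
    if not a and not b:
        return 1.0  # assumption: two empty sets count as identical
    return len(a & b) / len(a | b)


def third_similarity(out_a: OutputInfo, out_b: OutputInfo) -> float:
    """Return 0 when return types differ, else the keyword-set similarity."""
    (type_a, kw_a), (type_b, kw_b) = out_a, out_b
    if type_a != type_b:
        return 0.0
    return jaccard(kw_a, kw_b)


java_out = ("Boolean", {"true", "list", "change", "result"})
other_out = ("Boolean", {"true", "list", "modifi"})  # hypothetical second API
print(third_similarity(java_out, other_out))  # 2 shared of 5 distinct -> 0.4
```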
Step S34 can be formulated as $s = w_1 \times s_1 + w_2 \times s_2 + w_3 \times s_3$, where $s_1$, $s_2$, $s_3$ are the first, second, and third similarity values respectively; $w_1$, $w_2$, $w_3$ are the weighting coefficients corresponding to the first, second, and third similarity values; and $s$ is the API similarity value. The weighting coefficients $w_1$, $w_2$, $w_3$ are preset and satisfy $w_1 + w_2 + w_3 = 1$.
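The weighted average of step S34 is a one-line computation once the three similarity values are known. The text only says the weights are preset and sum to 1, so the default weights and the sample similarity values below are purely illustrative assumptions.

```python
import math
from typing import Tuple


def api_similarity(
    s1: float,
    s2: float,
    s3: float,
    weights: Tuple[float, float, float] = (0.4, 0.3, 0.3),  # hypothetical preset
) -> float:
    """Compute s = w1*s1 + w2*s2 + w3*s3 with w1 + w2 + w3 = 1."""
    w1, w2, w3 = weights
    if not math.isclose(w1 + w2 + w3, 1.0):
        raise ValueError("weights must sum to 1")
    return w1 * s1 + w2 * s2 + w3 * s3


# Behavior 0.5, input 0.4, output 0.8 with weights 0.5 / 0.25 / 0.25:
# 0.25 + 0.10 + 0.20 = 0.55
print(api_similarity(0.5, 0.4, 0.8, (0.5, 0.25, 0.25)))
```

Since the second and third similarity values drop to 0 on a type mismatch, the weights also determine how strongly such a mismatch lowers the final score.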
Claims (4)
1. An API matching method based on an API document, characterized by comprising the following steps:
S1: obtaining description documents of at least two APIs;
S2: extracting API information by parsing the description document of each API;
the API information includes: input information, output information, and behavior information;
S3: calculating similarity separately for the input information, the output information, and the behavior information of the two pieces of API information, and determining, after combining the results, whether the two APIs match;
the step S2 includes:
S21: extracting the API name, the input parameters, and the return type from the description document of the API;
S22: extracting keywords from the API function description text of the description document of the API as the behavior information;
S23: extracting keywords from the API parameter description text of the description document of the API, the keywords and the corresponding input parameters forming the input information;
S24: extracting keywords from the API return description text of the description document of the API, the keywords and the return type forming the output information;
the step S3 includes:
S31: calculating the similarity of the keywords in the behavior information of the two pieces of API information to obtain a first similarity value;
S32: calculating the similarity of the keywords and the input parameters in the input information of the two pieces of API information to obtain a second similarity value;
S33: calculating the similarity of the keywords and the return types in the output information of the two pieces of API information to obtain a third similarity value;
S34: taking a weighted average of the first similarity value, the second similarity value, and the third similarity value to obtain an API similarity value.
2. The API matching method based on an API document according to claim 1, wherein in step S1 description documents of APIs in two languages are obtained: a description document of an API of a first language and description documents of a plurality of APIs of a second language; through steps S2 and S3, the API similarity value between the description document of the first-language API and the description document of each second-language API is calculated, and the second-language description document with the highest API similarity value is selected as the matching result for the description document of the first-language API.
3. An API matching device based on an API document, characterized by comprising the following modules:
M1, used for: obtaining description documents of at least two APIs;
M2, used for: extracting API information by parsing the description document of each API;
the API information includes: input information, output information, and behavior information;
M3, used for: calculating similarity separately for the input information, the output information, and the behavior information of the two pieces of API information, and determining, after combining the results, whether the two APIs match;
the module M2 includes:
M21, used for: extracting the API name, the input parameters, and the return type from the description document of the API;
M22, used for: extracting keywords from the API function description text of the description document of the API as the behavior information;
M23, used for: extracting keywords from the API parameter description text of the description document of the API, the keywords and the corresponding input parameters forming the input information;
M24, used for: extracting keywords from the API return description text of the description document of the API, the keywords and the return type forming the output information;
the module M3 includes:
M31, used for: calculating the similarity of the keywords in the behavior information of the two pieces of API information to obtain a first similarity value;
M32, used for: calculating the similarity of the keywords and the input parameters in the input information of the two pieces of API information to obtain a second similarity value;
M33, used for: calculating the similarity of the keywords and the return types in the output information of the two pieces of API information to obtain a third similarity value;
M34, used for: taking a weighted average of the first similarity value, the second similarity value, and the third similarity value to obtain an API similarity value.
4. The API matching device based on an API document according to claim 3, wherein in module M1 description documents of APIs in two languages are obtained: a description document of an API of a first language and description documents of a plurality of APIs of a second language; through modules M2 and M3, the API similarity value between the description document of the first-language API and the description document of each second-language API is calculated, and the second-language description document with the highest API similarity value is selected as the matching result for the description document of the first-language API.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911239725.5A CN111090462B (en) | 2019-12-06 | 2019-12-06 | API (application program interface) matching method and device based on API document |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111090462A (en) | 2020-05-01 |
CN111090462B (en) | 2021-04-30 |
Family
ID=70394819
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911239725.5A Active CN111090462B (en) | 2019-12-06 | 2019-12-06 | API (application program interface) matching method and device based on API document |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111090462B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112650833A (en) * | 2020-12-25 | 2021-04-13 | 哈尔滨工业大学(深圳) | API (application program interface) matching model establishing method and cross-city government affair API matching method |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070255718A1 (en) * | 2006-04-28 | 2007-11-01 | Sap Ag | Method and system for generating and employing a dynamic web services interface model |
CN101547218A (en) * | 2009-05-07 | 2009-09-30 | 南京大学 | Multi-stage semantic Web service finding method |
CN101567005A (en) * | 2009-05-07 | 2009-10-28 | 浙江大学 | Semantic service registration and query method based on WordNet |
CN102780580A (en) * | 2012-06-21 | 2012-11-14 | 东南大学 | Trust-based composite service optimization method |
CN103036931A (en) * | 2011-09-30 | 2013-04-10 | 富士通株式会社 | Generating equipment and method of semantic network service document and web ontology language (OWL) concept analysis method |
CN103473243A (en) * | 2012-06-08 | 2013-12-25 | 富士通株式会社 | Method and device for generating semantic network service document |
Also Published As
Publication number | Publication date |
---|---|
CN111090462B (en) | 2021-04-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||