CN106934254B - Analysis method and device for open source license - Google Patents
- Publication number
- CN106934254B (application CN201710081702.0A)
- Authority
- CN
- China
- Prior art keywords
- open source
- source license
- conflict
- license
- detected
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/75—Structural analysis for program understanding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
- G06F21/105—Arrangements for software license management or administration, e.g. for managing licenses at corporate level
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/03—Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
- G06F2221/033—Test or assess software
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Technology Law (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The embodiment of the invention relates to the technical field of computers, and in particular to an analysis method and device for an open source license. The method comprises the following steps: receiving a file to be tested and planning conditions; detecting an open source license related to the file to be tested; performing conflict matching on the detected open source license and the planning conditions, and determining a first conflict between the detected open source license and the planning conditions; and generating a first risk assessment report according to the first conflict. The embodiment of the invention is used for analyzing and evaluating the usage risk of the open source license.
Description
Technical Field
The invention relates to the technical field of computers, in particular to an open source license analysis method and device.
Background
An open source license is a license that is friendly to commercial applications. "Open source" means that the source code is open: the source material of the final product is made available during production and development. The term usually refers to open source software, for which the copyright holder reserves part of the rights under the terms of an agreement and allows users to study, modify, and improve the quality of the software. Open source software is not entirely free of restrictions. The most fundamental restriction is that anyone who uses or modifies the software must acknowledge the originator's copyright and the contributions of all participants. Anyone has the right to freely copy, modify, and use the source code; no restrictions may be imposed on any person, group, or field of use; and commercial use of open source software is not restricted. A license is the legal document that guarantees these terms.
The open source license specifies terms regarding modifying, copying, and redistributing the source code. Many open source licenses of many kinds exist in the industry, and the scope and extent of the rights that each grants to a licensee differ. Because the same software often involves multiple open source licenses, which may conflict with each other or with the intended goals of the user, using open source software, or performing secondary development based on it, in a commercial environment faces many potential legal issues and risks.
An open source license detection tool automatically locates and identifies a particular open source license by scanning the software source code and the like. Existing license detection and analysis tools can only perform simple detection, marking, and statistics; they cannot support further risk assessment and analysis, and their analysis of open source license content and risk needs to be strengthened.
Disclosure of Invention
The application provides an evaluation method and device of an open source license, which are used for analyzing and evaluating the use risk of the open source license.
The embodiment of the invention provides an evaluation method of an open source license, which comprises the following steps:
receiving a file to be tested and planning conditions;
detecting an open source license related to the file to be tested;
performing conflict matching on the detected open source license and the planning condition, and determining a first conflict between the detected open source license and the planning condition;
and generating a first risk assessment report according to the first conflict.
Optionally, the detecting the open source license related to the file to be tested includes:
the file to be tested comprises a plurality of detection texts; for each detection text, a vocabulary of the detection text is determined by using a k-shift algorithm;
counting the word frequency, in the detection text, of each word in the vocabulary, and determining a first feature matrix of the detection text;
for each open source license stored in a database, determining the word frequency, in the open source license, of each word in the vocabulary, so as to determine a second feature matrix of the open source license;
calculating text similarity between the detection text and the open source license according to the first feature matrix and the second feature matrix;
and taking the open source license with the highest text similarity as the open source license related to the detection text.
Optionally, after detecting the open-source license related to the file to be tested, the method further includes:
performing conflict matching on the detected open source licenses, and determining a second conflict among the detected open source licenses;
and generating a second risk assessment report according to the second conflict.
Optionally, after determining the first conflict between the detected open-source license and the planning condition, the method further includes:
determining a risk level corresponding to the first conflict;
after determining the second conflict among the detected plurality of open source licenses, the method further comprises:
and determining a risk level corresponding to the second conflict.
Optionally, the method further includes:
receiving an identifier and/or a fragment of an open source license;
determining a corresponding open source license from a database according to the identifier and/or the fragment;
and generating a license list according to the corresponding open source license.
An apparatus for evaluating an open source license, comprising:
the receiving unit is used for receiving the file to be tested and the planning condition;
the detection unit is used for detecting the open source license related to the file to be tested;
a matching unit, configured to perform conflict matching on the detected open-source license and the planning condition, and determine a first conflict between the detected open-source license and the planning condition;
and the reporting unit is used for generating a first risk assessment report according to the first conflict.
Optionally, the detection unit is specifically configured to:
the file to be tested comprises a plurality of detection texts; for each detection text, a vocabulary of the detection text is determined by using a k-shift algorithm;
counting the word frequency, in the detection text, of each word in the vocabulary, and determining a first feature matrix of the detection text;
for each open source license stored in a database, determining the word frequency, in the open source license, of each word in the vocabulary, so as to determine a second feature matrix of the open source license;
calculating text similarity between the detection text and the open source license according to the first feature matrix and the second feature matrix;
and taking the open source license with the highest text similarity as the open source license related to the detection text.
Optionally, the matching unit is further configured to perform conflict matching on the detected multiple open-source licenses, and determine a second conflict between the detected multiple open-source licenses;
the reporting unit is further configured to generate a second risk assessment report according to the second conflict.
Optionally, the matching unit is further configured to:
determining a risk level corresponding to the first conflict;
and determining a risk level corresponding to the second conflict.
Optionally, the receiving unit is further configured to receive an identifier and/or a fragment of an open source license;
the matching unit is further configured to determine a corresponding open source license from a database according to the identifier and/or the fragment;
the reporting unit is further configured to generate a license list according to the corresponding open source license.
In the embodiment of the invention, the server receives the file to be tested uploaded by the user and detects the open source license related to the file to be tested. The server also receives the planning condition input by the user, where the planning condition is a condition relating to the future planning of the software project. Conflict matching is performed on the detected open source license and the planning condition, that is, it is determined which content of the related open source license conflicts with the planning condition of the software; a first risk assessment report is then generated according to the first conflict and fed back to the user. The embodiment of the invention thus automatically identifies the open source licenses contained in the software, determines the conflicts between the open source licenses and the planning condition, and generates a risk assessment report based on those conflicts, thereby giving users of open source software support and reference for tracking and developing the software and for making reasonable decisions.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without inventive effort.
FIG. 1 is a diagram illustrating a system architecture suitable for use with an embodiment of the present invention;
FIG. 2 is a flowchart illustrating an evaluation of an open source license according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating evaluation of an open source license in a specific embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus for evaluating an open source license according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, a system architecture to which the embodiment of the present invention is applicable includes a web service module 101, a processing engine module 102, a database module 103, and an update maintenance module 104. The web service module 101, the processing engine module 102, the database module 103, and the update maintenance module 104 may be integrated in one server, or may be modules in different servers, where the servers may be network devices such as computers. Preferably, the web service module 101, the processing engine module 102, the database module 103 and the update maintenance module 104 may use cloud computing technology for information processing.
The web service module 101 provides the user with entry points for functions such as querying open source license information, detecting open source licenses, and assessing risk; that is, the user inputs a file to be tested, planning conditions, and the like to the server through the web service module 101. In addition, the web service module 101 presents the query and analysis results to the user in the form of a list, a graph, text, etc.
The processing engine module 102 retrieves information meeting the conditions from the database module 103 according to the user's input and feeds back the result to the web service module 101 so as to be presented to the user in different forms of a search list, detailed information, and the like. The module supports keyword fuzzy query, namely, relevant data is searched for through character fragments.
The processing engine module 102 also detects the open source licenses related to the file to be tested by analyzing the received software source code, generates a usage report of the open source licenses covering copyrighted files, non-copyrighted files, and the like, and feeds the usage report back to the user in a format such as a PDF file or a chart. After the open source licenses are detected, conflicts among the open source licenses in the file to be tested, or between an open source license and the planning condition, are detected based on the software project planning conditions set or input by the user and the conflict rules preset by experts. The legal risks are then analyzed further based on these conflicts, and a risk assessment report is generated and fed back to the user in a PDF file or chart format.
The database module 103 can be divided into an open source license information base and a conflict rule base. The open source license information base stores information such as the agreement terms, applicable scenarios, conditions of use, and limitations of the various open source licenses on the market. The conflict rule base stores conflict rule expressions among different open source licenses, from which known conflicts between two open source licenses can be determined. It also stores expressions for the business scenarios to which each open source license is not applicable, from which potential conflicts between project planning options and an open source license can be determined.
Fig. 2 exemplarily shows a flowchart of an evaluation method for an open source license according to an embodiment of the present invention, and as shown in fig. 2, the evaluation method for an open source license according to an embodiment of the present invention includes the following steps:
Step 204: generating a first risk assessment report according to the first conflict.
In the embodiment of the invention, the server receives the file to be tested uploaded by the user and detects the open source license related to the file to be tested. The server also receives the planning condition input by the user, where the planning condition is a condition relating to the future planning of the software project. Conflict matching is performed on the detected open source license and the planning condition, that is, it is determined which content of the related open source license conflicts with the planning condition of the software; a first risk assessment report is then generated according to the first conflict and fed back to the user. The embodiment of the invention thus automatically identifies the open source licenses contained in the software, determines the conflicts between the open source licenses and the planning condition, and generates a risk assessment report based on those conflicts, thereby giving users of open source software support and reference for tracking and developing the software and for making reasonable decisions.
In the prior art, open source license detection is generally implemented by keyword matching, which suffers from low identification precision: licenses may be missed or misidentified. In the embodiment of the present invention, detecting the open source license related to the file to be tested includes:
the file to be tested comprises a plurality of detection texts; for each detection text, a vocabulary of the detection text is determined by using a k-shift algorithm;
counting the word frequency, in the detection text, of each word in the vocabulary, and determining a first feature matrix of the detection text;
for each open source license stored in a database, determining the word frequency, in the open source license, of each word in the vocabulary, so as to determine a second feature matrix of the open source license;
calculating text similarity between the detection text and the open source license according to the first feature matrix and the second feature matrix;
and taking the open source license with the highest text similarity as the open source license related to the detection text.
To address the low identification precision of open source license detection tools, the solution provided herein relies mainly on text similarity calculation to help the user identify potential open source license information, reducing the probability of missed and erroneous detections. Besides the text similarity method, a regular-expression-based method is also available, but it requires a large number of rules to be set manually and often fails to classify the open source licenses it identifies; the text similarity method overcomes these defects.
Specifically, in the embodiment of the invention, the k value of the k-shift algorithm is defined according to the text characteristics of the different open source licenses, and the similarity between the detection text and each open source license in the database is calculated by using the Jaccard Similarity algorithm, which achieves better time efficiency, precision, and recall.
Because the file to be tested comprises a plurality of detection texts, and a detection text may be an open source license related to the file to be tested, source code, or other related data, the embodiment of the present invention calculates the text similarity between each detection text and the open source licenses in the database so as to detect the open source licenses in the texts.
For a detection text, the text similarity calculation method of the embodiment of the invention is as follows:
1. The vocabulary of the detection text is determined by the k-shift algorithm. Here k is a user-defined variable denoting the number of consecutive characters extracted from the detection text. The detection text is traversed and each run of k characters is stored in order; for example, if the text content is abcdefg and k is 2, the vocabulary ab, bc, cd, de, ef, fg is obtained.
2. The word frequency, in the detection text, of each word in the vocabulary is counted, and a first feature matrix of the detection text is constructed; likewise, the word frequency of each word in the vocabulary is counted in each open source license in the database, and a second feature matrix is constructed for each open source license.
3. The similarity between the detection text and each open source license in the database is calculated by using the Jaccard Similarity algorithm according to the first feature matrix and the second feature matrix. The Jaccard Similarity algorithm divides the size of the intersection of two sets by the size of their union to obtain the similarity of the two sets. In the embodiment of the invention, the two sets are the detection text and the open source license text, and the words appearing in each text are the elements of its set; the similarity between the detection text and the open source license is therefore calculated from the first feature matrix of the detection text and the second feature matrix of the open source license.
4. And selecting the open source license with the highest text similarity as a matching result by using the calculated text similarity between the detection text and each open source license.
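The four steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names are hypothetical, the k-character windowing follows the abcdefg example in step 1, and the word-frequency counts are reduced to set membership before the Jaccard computation, which is an assumption about how the feature matrices feed the similarity calculation.

```python
from collections import Counter
from typing import Dict, Tuple


def k_shift_counts(text: str, k: int) -> Counter:
    """Slide a window of k characters over text and count each k-character word."""
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


def jaccard_similarity(first: Counter, second: Counter) -> float:
    """Size of the intersection of the two word sets divided by the size of their union."""
    a, b = set(first), set(second)  # assumption: frequencies reduced to membership
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matching_license(detection_text: str,
                          licenses: Dict[str, str],
                          k: int = 2) -> Tuple[str, float]:
    """Return (name, similarity) of the stored license most similar to detection_text.

    `licenses` maps a license name to its full text and must not be empty.
    """
    detected = k_shift_counts(detection_text, k)
    scores = {name: jaccard_similarity(detected, k_shift_counts(body, k))
              for name, body in licenses.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

In practice k would be tuned per license corpus, as the description notes; a larger k makes the match stricter at the cost of tolerance to small wording changes.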
After the open source license related to the file to be tested is detected, whether a conflict exists between the open source license and the planning condition, and what that conflict is, are analyzed based on the usage information of the open source license and the software project planning condition input by the user. Specifically, the open source license usage information, for example open source licenses such as GPL (GNU General Public License), BSD (Berkeley Software Distribution), and Apache (Apache License), and the software project planning conditions (e.g., whether there will be a closed-source requirement in the future, whether other licenses are to be introduced, etc.) are used as input conditions and matched one by one against the rule expressions in the conflict rule base. An example of such a rule expression is:
if((LGPL||Mozilla||GPL)&&(closedSource==true)){Conflict=true;RiskLevel=high;}
The rule above indicates that if a license of the LGPL, Mozilla, or GPL type is present and the planning condition for the software project is closed-source development, then a license conflict exists and the risk level is high.
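As a hedged illustration, the matching of detected licenses and planning conditions against such rules might look like the Python sketch below. The `PlanningRule` structure, the `closed_source` key, and the function name are hypothetical; only the single example rule comes from the description.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class PlanningRule:
    licenses: FrozenSet[str]  # rule fires if ANY of these licenses was detected
    condition: str            # planning condition that must be true
    risk_level: str


# The one rule given in the description; a real rule base would hold many more.
RULES = [PlanningRule(frozenset({"LGPL", "Mozilla", "GPL"}), "closed_source", "high")]


def match_planning_conflicts(detected: Set[str],
                             planning: Dict[str, bool],
                             rules: List[PlanningRule]) -> List[dict]:
    """Return one conflict record per rule that matches the detected licenses and plan."""
    conflicts = []
    for rule in rules:
        hit = sorted(rule.licenses & detected)
        if hit and planning.get(rule.condition, False):
            conflicts.append({"licenses": hit,
                              "condition": rule.condition,
                              "risk_level": rule.risk_level})
    return conflicts
```

For example, `match_planning_conflicts({"GPL", "BSD"}, {"closed_source": True}, RULES)` reports a single high-risk conflict involving GPL, and the same call with `closed_source` set to `False` reports none.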
In addition to the conflict between the open source license and the planning condition, the conflict between the open source licenses is analyzed in the embodiment of the present invention. After the open source license related to the file to be tested is detected, the method further comprises the following steps:
performing conflict matching on the detected open source licenses, and determining a second conflict among the detected open source licenses;
and generating a second risk assessment report according to the second conflict.
Open source licenses can be divided into five categories: 1. the licensee can use the software anywhere for any purpose; 2. the licensee can only freely copy the open source software; 3. the licensee can only freely copy or further develop the software; 4. the licensee can freely access the software and use its source code, but cannot combine it with other components; 5. the licensee can freely combine the open source software with other software. Using licenses that conflict with one another has a significant impact on software development, particularly the development of commercial software. Therefore, in the embodiment of the invention, conflicting open source licenses are detected, and the conflicts between two or more open source licenses related to the same file to be tested are determined. Specifically, the detected open source licenses are matched against the rule expressions in the conflict rule base to determine whether conflicts exist among them. For example, a rule expression may be:
if(GPL&&BSD){Conflict=true;RiskLevel=medium;}
The rule above indicates that if both the GPL and BSD licenses are present, then a license conflict exists and the risk level is medium.
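A minimal Python sketch of this license-to-license matching follows. The table layout and function name are hypothetical, and the `low` level in the ordering is an assumption added only so that results can be sorted; the description itself names only the medium and high levels.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical rule table: ALL licenses in a key must be detected for the rule to fire.
LICENSE_RULES: Dict[FrozenSet[str], str] = {
    frozenset({"GPL", "BSD"}): "medium",  # the example rule from the description
}

# Assumed ordering of risk levels, used to list the most severe conflicts first.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}


def match_license_conflicts(detected: Set[str],
                            rules: Dict[FrozenSet[str], str] = LICENSE_RULES
                            ) -> List[Tuple[Tuple[str, ...], str]]:
    """Return (licenses, risk_level) for every rule whose licenses were all detected."""
    conflicts = [(tuple(sorted(group)), risk)
                 for group, risk in rules.items()
                 if group <= detected]
    conflicts.sort(key=lambda item: RISK_ORDER[item[1]], reverse=True)
    return conflicts
```

Keying each rule by a frozen set keeps the check order-independent, so a rule for GPL and BSD matches regardless of which license was detected first.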
After determining the first conflict between the detected open source license and the planning condition, the method further includes:
determining a risk level corresponding to the first conflict;
after determining the second conflict among the detected plurality of open source licenses, the method further comprises:
and determining a risk level corresponding to the second conflict.
In the embodiment of the invention, each conflict is assigned a risk level, which gives the user of the open source software a reference for making reasonable decisions about the file to be tested and the related open source licenses.
In addition to detecting open source licenses, embodiments of the invention may retrieve a list of matching licenses based on a name or fragment of an open source license entered by the user. The embodiment of the invention also comprises the following steps:
receiving an identifier and/or a fragment of an open source license;
determining a corresponding open source license from a database according to the identifier and/or the fragment;
and generating a license list according to the corresponding open source license.
In this way, the user can quickly look up information on each open source license. Clicking a single entry in the list opens the details page of the corresponding open source license, which contains an introduction to the license content, typical application cases, conditions and limitations of use, etc.
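The lookup described above could be sketched as a simple case-insensitive substring search. This is an assumption about how the fuzzy query works; the function name and the in-memory dictionary standing in for the database are hypothetical.

```python
from typing import Dict, List


def find_licenses(database: Dict[str, str],
                  identifier: str = "",
                  fragment: str = "") -> List[str]:
    """Return names of licenses whose name contains `identifier`
    and/or whose text contains `fragment` (case-insensitive)."""
    ident = identifier.strip().lower()
    frag = fragment.strip().lower()
    if not ident and not frag:
        return []  # nothing to search for
    matches = []
    for name, text in database.items():
        if (ident and ident in name.lower()) or (frag and frag in text.lower()):
            matches.append(name)
    return sorted(matches)
```

The returned names form the license list; each name would then link to the details page for that license.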
In addition, because the number of rules in the conflict rule base keeps growing, and the same rule may have multiple variants, traversing the entire conflict rule base for every analysis would make detection and analysis inefficient. The embodiment of the invention optimizes the analysis by building a classification index: the conflict rule base is indexed by the types associated with each open source license, and the rule set associated with a specific open source license is located quickly through the index list, which improves analysis efficiency.
The classification index can use a graph-based index structure in which the graph is stored as an adjacency list, each list header is an open source license or another entity, and each edge is a rule between different open source licenses. This scheme locates and queries rules quickly under specific conditions, allows rules to be extended dynamically, and is easy to update and maintain.
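One way to picture such an adjacency-list index is the Python sketch below. The class and method names are hypothetical, and storing each rule on both endpoints is a design assumption made so that a lookup can start from either license.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Tuple


class ConflictRuleIndex:
    """Adjacency list: each header is a license (or other entity), each edge a rule."""

    def __init__(self) -> None:
        self._adjacency: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_rule(self, first: str, second: str, risk_level: str) -> None:
        # Store the edge under both endpoints so either one can be the lookup key.
        self._adjacency[first].append((second, risk_level))
        self._adjacency[second].append((first, risk_level))

    def rules_for(self, entity: str) -> List[Tuple[str, str]]:
        """Rules touching one entity, found without scanning the whole rule base."""
        return list(self._adjacency.get(entity, []))

    def conflicts_among(self, detected: Iterable[str]) -> List[Tuple[str, str, str]]:
        """All indexed rules whose two endpoints are both in `detected`."""
        present = set(detected)
        found = set()
        for entity in present:
            for other, risk in self._adjacency.get(entity, []):
                if other in present:
                    a, b = sorted((entity, other))
                    found.add((a, b, risk))
        return sorted(found)
```

Adding a rule is a pair of list appends, so the index can grow with the rule base, and a query only visits the edges of the licenses that were actually detected.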
In order to more clearly understand the present invention, the following detailed description of the above process is provided by using specific embodiments, and the specific steps are shown in fig. 3, and include:
Step 305: calculating the text similarity between the detection text and the open source license by using the Jaccard Similarity algorithm according to the first feature matrix and the second feature matrix.
Step 306: determining the open source license with the highest text similarity as the open source license corresponding to the detection text.
Step 307: performing conflict matching on each detected open source license and the planning condition, determining a first conflict between the open source license and the planning condition, and determining a risk level corresponding to the first conflict.
Step 308: performing conflict matching on the detected open source licenses, determining a second conflict among the open source licenses, and determining a risk level corresponding to the second conflict.
Step 309: generating a risk assessment report and feeding it back to the user in PDF format.
Fig. 4 schematically shows a structural diagram of an apparatus for evaluating an open source license according to an embodiment of the present invention.
As shown in fig. 4, an apparatus for evaluating an open source license provided in an embodiment of the present invention includes:
a receiving unit 401, configured to receive a file to be tested and a planning condition;
a detecting unit 402, configured to detect an open source license related to the file to be tested;
a matching unit 403, configured to perform conflict matching on the detected open-source license and the planning condition, and determine a first conflict between the detected open-source license and the planning condition;
a reporting unit 404, configured to generate a first risk assessment report according to the first conflict.
Optionally, the detecting unit 402 is specifically configured to:
the file to be tested comprises a plurality of detection texts; for each detection text, a vocabulary of the detection text is determined by using a k-shift algorithm;
counting the word frequency, in the detection text, of each word in the vocabulary, and determining a first feature matrix of the detection text;
for each open source license stored in a database, determining the word frequency, in the open source license, of each word in the vocabulary, so as to determine a second feature matrix of the open source license;
calculating text similarity between the detection text and the open source license according to the first feature matrix and the second feature matrix;
and taking the open source license with the highest text similarity as the open source license related to the detection text.
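The detection steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not define the k-shift algorithm or the similarity measure, so the sketch assumes overlapping word k-grams for the vocabulary and cosine similarity for the comparison. The function names and the in-memory `license_db` dictionary are hypothetical, and each "feature matrix" is reduced to a single row of word frequencies.

```python
from __future__ import annotations

from collections import Counter
from math import sqrt


def shingles(text: str, k: int = 2) -> list[str]:
    """Split text into overlapping word k-grams (assumed tokenization)."""
    words = text.lower().split()
    if len(words) < k:
        return [" ".join(words)] if words else []
    return [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]


def build_vocabulary(detection_text: str, k: int = 2) -> list[str]:
    """The vocabulary is derived from the detection text only."""
    return sorted(set(shingles(detection_text, k)))


def frequency_vector(vocabulary: list[str], text: str, k: int = 2) -> list[int]:
    """Count how often each vocabulary entry occurs in the given text."""
    counts = Counter(shingles(text, k))
    return [counts[term] for term in vocabulary]


def cosine(a: list[int], b: list[int]) -> float:
    """Cosine similarity (assumed measure); 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect_license(detection_text: str, license_db: dict[str, str], k: int = 2):
    """Return (license name, similarity) for the most similar stored license."""
    vocabulary = build_vocabulary(detection_text, k)
    first = frequency_vector(vocabulary, detection_text, k)   # detection text
    best_name, best_score = None, 0.0
    for name, license_text in license_db.items():
        second = frequency_vector(vocabulary, license_text, k)  # stored license
        score = cosine(first, second)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```

Because the vocabulary is built from the detection text alone, a stored license that shares none of its k-grams yields an all-zero vector and a similarity of 0.0. A production system would likely also apply a minimum similarity threshold before accepting the best match.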
Optionally, the matching unit 403 is further configured to perform conflict matching on the detected multiple open-source licenses, and determine a second conflict between the detected multiple open-source licenses;
the reporting unit 404 is further configured to generate a second risk assessment report according to the second conflict.
Optionally, the matching unit 403 is further configured to:
determining a risk level corresponding to the first conflict;
and determining a risk level corresponding to the second conflict.
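The two kinds of conflict matching and their risk levels can be illustrated with a small rule lookup. The sketch below assumes the conflict rule base is a pair of dictionaries: one mapping a (license, planning condition) pair to a risk level, and one mapping an unordered pair of licenses to a risk level. The license names, conditions, risk labels, and function names are all hypothetical placeholders and do not describe any real license.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Conflict:
    subject: str   # the license that triggers the conflict
    other: str     # a planning condition or a second license
    risk: str      # risk level attached to the matching rule


# Hypothetical rule base: scenarios to which a license is not applicable.
SCENE_RULES = {
    ("License-A", "closed-source distribution"): "high",
    ("License-B", "SaaS deployment"): "low",
}

# Hypothetical rule base: pairs of licenses that conflict with each other.
LICENSE_RULES = {
    frozenset({"License-A", "License-B"}): "medium",
}


def first_conflicts(licenses, planning_conditions):
    """Conflicts between each detected license and each planning condition."""
    return [
        Conflict(lic, cond, SCENE_RULES[(lic, cond)])
        for lic in licenses
        for cond in planning_conditions
        if (lic, cond) in SCENE_RULES
    ]


def second_conflicts(licenses):
    """Conflicts among the detected licenses, checked pairwise."""
    found = []
    for a, b in combinations(sorted(set(licenses)), 2):
        risk = LICENSE_RULES.get(frozenset((a, b)))
        if risk is not None:
            found.append(Conflict(a, b, risk))
    return found
```

The results of `first_conflicts` and `second_conflicts` correspond to the inputs of the first and second risk assessment reports, respectively.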
Optionally, the receiving unit 401 is further configured to receive an identifier and/or a fragment of an open source license;
the matching unit 403 is further configured to determine a corresponding open source license from a database according to the identifier and/or the fragment;
the reporting unit 404 is further configured to generate a license list according to the corresponding open source license.
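A lookup by identifier and/or fragment might look like the following sketch. It assumes the database is a dictionary from a license identifier to the license text, that identifiers are compared case-insensitively, and that a fragment matches by simple substring search; when both are supplied, both must match. The function name `find_licenses` and these matching rules are illustrative assumptions, not details given in the text.

```python
def find_licenses(db, identifier=None, fragment=None):
    """Return the sorted identifiers of licenses matching the given criteria."""
    if identifier is None and fragment is None:
        return []  # nothing to search for
    matches = []
    for license_id, text in db.items():
        id_ok = identifier is None or license_id.lower() == identifier.lower()
        fragment_ok = fragment is None or fragment.lower() in text.lower()
        if id_ok and fragment_ok:
            matches.append(license_id)
    return sorted(matches)
```

The returned list plays the role of the license list produced by the reporting unit.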
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (12)
1. A method for evaluating an open source license, comprising:
receiving a file to be tested and a planning condition input by a user, the planning condition being a planning condition of a software project;
detecting an open source license related to the file to be tested according to text similarity between open source licenses in an open source license information base of a database module and a detection text in the file to be tested, wherein the database module is divided into the open source license information base and a conflict rule base; the open source license information base stores the terms, applicable scenarios, use conditions and limitations of various open source licenses on the market; and the conflict rule base stores both conflict rule expressions among different open source licenses and service scenario expressions describing the scenarios to which each open source license is not applicable;
taking the detected open source license and the planning condition as input conditions, performing conflict matching against the service scenario expressions in the conflict rule base, and determining a first conflict between the detected open source license and the planning condition;
and generating a first risk assessment report according to the first conflict.
2. The method of claim 1, wherein the detecting the open source license related to the file to be tested comprises:
for each of a plurality of detection texts included in the file to be tested, determining a vocabulary of the detection text by using a k-shift algorithm;
counting the word frequency, in the detection text, of each word in the vocabulary, and determining a first feature matrix of the detection text;
for each open source license stored in a database, determining the word frequency, in the open source license, of each word in the vocabulary, so as to determine a second feature matrix of the open source license;
calculating the text similarity between the detection text and the open source license according to the first feature matrix and the second feature matrix;
and taking the open source license with the highest text similarity as the open source license related to the detection text.
3. The method according to claim 1 or 2, wherein after detecting the open source license related to the file to be tested, the method further comprises:
performing conflict matching on a plurality of detected open source licenses, and determining a second conflict among the detected open source licenses;
and generating a second risk assessment report according to the second conflict.
4. The method of claim 3, wherein after determining the first conflict between the detected open source license and the planning condition, further comprising:
determining a risk level corresponding to the first conflict;
after determining the second conflict among the detected open source licenses, the method further comprises:
and determining a risk level corresponding to the second conflict.
5. The method of claim 1, further comprising:
receiving an identifier and/or a fragment of an open source license;
determining a corresponding open source license from a database according to the identifier and/or the fragment;
and generating a license list according to the corresponding open source license.
6. An apparatus for evaluating an open source license, comprising:
a receiving unit, configured to receive a file to be tested and a planning condition input by a user, the planning condition being a planning condition of a software project;
a detection unit, configured to detect an open source license related to the file to be tested according to text similarity between open source licenses in an open source license information base of a database module and a detection text in the file to be tested, wherein the database module is divided into the open source license information base and a conflict rule base; the open source license information base stores the terms, applicable scenarios, use conditions and limitations of various open source licenses on the market; and the conflict rule base stores both conflict rule expressions among different open source licenses and service scenario expressions describing the scenarios to which each open source license is not applicable;
a matching unit, configured to take the detected open source license and the planning condition as input conditions, perform conflict matching against the service scenario expressions in the conflict rule base, and determine a first conflict between the detected open source license and the planning condition;
and a reporting unit, configured to generate a first risk assessment report according to the first conflict.
7. The apparatus of claim 6, wherein the detection unit is specifically configured to:
for each of a plurality of detection texts included in the file to be tested, determine a vocabulary of the detection text by using a k-shift algorithm;
count the word frequency, in the detection text, of each word in the vocabulary, and determine a first feature matrix of the detection text;
for each open source license stored in a database, determine the word frequency, in the open source license, of each word in the vocabulary, so as to determine a second feature matrix of the open source license;
calculate the text similarity between the detection text and the open source license according to the first feature matrix and the second feature matrix;
and take the open source license with the highest text similarity as the open source license related to the detection text.
8. The apparatus of claim 6 or 7, wherein:
the matching unit is further configured to perform conflict matching on a plurality of detected open source licenses and determine a second conflict among the detected open source licenses;
the reporting unit is further configured to generate a second risk assessment report according to the second conflict.
9. The apparatus of claim 8, wherein the matching unit is further configured to:
determining a risk level corresponding to the first conflict;
and determining a risk level corresponding to the second conflict.
10. The apparatus of claim 6, wherein:
the receiving unit is further configured to receive an identifier and/or a fragment of an open source license;
the matching unit is further configured to determine a corresponding open source license from a database according to the identifier and/or the fragment;
the reporting unit is further configured to generate a license list according to the corresponding open source license.
11. A computing device, comprising:
a memory for storing program instructions;
a processor for calling the program instructions stored in said memory and executing the method of any one of claims 1 to 5 in accordance with the obtained program instructions.
12. A computer-readable non-transitory storage medium including computer-readable instructions which, when read and executed by a computer, cause the computer to perform the method of any one of claims 1 to 5.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710081702.0A CN106934254B (en) | 2017-02-15 | 2017-02-15 | Analysis method and device for open source license |
EP17896537.2A EP3584728B1 (en) | 2017-02-15 | 2017-11-15 | Method and device for analyzing open-source license |
PCT/CN2017/111095 WO2018149187A1 (en) | 2017-02-15 | 2017-11-15 | Method and device for analyzing open-source license |
US16/485,358 US10942733B2 (en) | 2017-02-15 | 2017-11-15 | Open-source-license analyzing method and apparatus |
TW107103031A TWI662431B (en) | 2017-02-15 | 2018-01-29 | Analysis method and device for open source license |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710081702.0A CN106934254B (en) | 2017-02-15 | 2017-02-15 | Analysis method and device for open source license |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106934254A (en) | 2017-07-07 |
CN106934254B (en) | 2020-05-26 |
Family ID=59424093
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710081702.0A (Active) CN106934254B (en) | 2017-02-15 | 2017-02-15 | Analysis method and device for open source license |
Country Status (5)
Country | Link |
---|---|
US (1) | US10942733B2 (en) |
EP (1) | EP3584728B1 (en) |
CN (1) | CN106934254B (en) |
TW (1) | TWI662431B (en) |
WO (1) | WO2018149187A1 (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106934254B (en) * | 2017-02-15 | 2020-05-26 | 中国银联股份有限公司 | Analysis method and device for open source license |
CN108984391B (en) * | 2018-06-06 | 2022-07-12 | 阿里巴巴(中国)有限公司 | Application program analysis method and device and electronic equipment |
CN109063421B (en) * | 2018-06-28 | 2022-03-04 | 东南大学 | Open source license compliance analysis and conflict detection method |
CN110826834B (en) * | 2018-08-14 | 2023-04-18 | 中国石油天然气股份有限公司 | Comparison method and device between different responsibility separation rule sets |
CN111291331B (en) * | 2019-06-27 | 2022-02-22 | 北京关键科技股份有限公司 | Mixed source file license conflict detection method |
CN111400672A (en) * | 2020-03-18 | 2020-07-10 | 中国信息安全测评中心 | Open source software monitoring method and device |
CN112084309B (en) * | 2020-09-17 | 2024-06-04 | 北京中科微澜科技有限公司 | License selection method and system based on open source software map |
CN113282965A (en) * | 2021-05-20 | 2021-08-20 | 苏州棱镜七彩信息科技有限公司 | Open source license and copyright information tampering detection method and system |
CN113268713A (en) * | 2021-06-03 | 2021-08-17 | 西南大学 | Open source software license selection method based on software dependence |
JP7055232B1 (en) * | 2021-08-24 | 2022-04-15 | ビジョナル・インキュベーション株式会社 | Processing equipment and processing method |
CN115080924B (en) * | 2022-07-25 | 2022-11-15 | 南开大学 | Software license clause extraction method based on natural language understanding |
CN116302042B (en) * | 2023-05-25 | 2023-09-15 | 南方电网数字电网研究院有限公司 | Protocol element content recommendation method and device and computer equipment |
CN118051889A (en) * | 2024-04-16 | 2024-05-17 | 北京安普诺信息技术有限公司 | LLM-based SCA license risk analysis method, device and equipment |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101223549A (en) * | 2005-07-14 | 2008-07-16 | 微软公司 | Digital application operating according to aggregation of plurality of licenses |
Family Cites Families (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040068734A1 (en) * | 2002-10-07 | 2004-04-08 | Microsoft Corporation | Software license isolation layer |
US10437964B2 (en) * | 2003-10-24 | 2019-10-08 | Microsoft Technology Licensing, Llc | Programming interface for licensing |
US9489687B2 (en) * | 2003-12-04 | 2016-11-08 | Black Duck Software, Inc. | Methods and systems for managing software development |
US8700533B2 (en) * | 2003-12-04 | 2014-04-15 | Black Duck Software, Inc. | Authenticating licenses for legally-protectable content based on license profiles and content identifiers |
US7747533B2 (en) | 2005-07-14 | 2010-06-29 | Microsoft Corporation | Digital application operating according to aggregation of plurality of licenses |
US8359655B1 (en) * | 2008-10-03 | 2013-01-22 | Pham Andrew T | Software code analysis and classification system and method |
US9020857B2 (en) * | 2009-02-11 | 2015-04-28 | Johnathan C. Mun | Integrated risk management process |
CN101651564B (en) | 2009-09-08 | 2011-07-06 | 杭州华三通信技术有限公司 | License detection method, distributed network management system and server |
US8875301B2 (en) * | 2011-10-12 | 2014-10-28 | Hewlett-Packard Development Company, L. P. | Software license incompatibility determination |
US8589306B1 (en) * | 2011-11-21 | 2013-11-19 | Forst Brown Todd LLC | Open source license management |
US9424401B2 (en) * | 2012-03-15 | 2016-08-23 | Microsoft Technology Licensing, Llc | Automated license management |
KR20140050323A (en) * | 2012-10-19 | 2014-04-29 | 삼성전자주식회사 | Method and apparatus for license verification of binary file |
FR3009634B1 (en) * | 2013-08-09 | 2015-08-21 | Viaccess Sa | METHOD FOR PROVIDING A LICENSE IN A SYSTEM FOR PROVIDING MULTIMEDIA CONTENT |
CN103440441A (en) | 2013-08-28 | 2013-12-11 | 北京华胜天成科技股份有限公司 | Software protection method and system |
CN106934254B (en) | 2017-02-15 | 2020-05-26 | 中国银联股份有限公司 | Analysis method and device for open source license |
Also Published As
Publication number | Publication date |
---|---|
TW201832118A (en) | 2018-09-01 |
EP3584728A1 (en) | 2019-12-25 |
TWI662431B (en) | 2019-06-11 |
EP3584728A4 (en) | 2020-05-20 |
US20200026512A1 (en) | 2020-01-23 |
CN106934254A (en) | 2017-07-07 |
EP3584728B1 (en) | 2022-05-04 |
US10942733B2 (en) | 2021-03-09 |
WO2018149187A1 (en) | 2018-08-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1239904 |
 | GR01 | Patent grant | |