CN116841610A - Software component analysis system based on data mining - Google Patents


Info

Publication number
CN116841610A
Authority
CN
China
Prior art keywords
software
license
component
open source
code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310621243.6A
Other languages
Chinese (zh)
Inventor
温胤鑫
黄永军
谢耘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dongfang Tongwangxin Technology Co ltd
Original Assignee
Beijing Dongfang Tongwangxin Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dongfang Tongwangxin Technology Co ltd
Priority to CN202310621243.6A
Publication of CN116841610A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/75 Structural analysis for program understanding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9038 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Stored Programmes (AREA)

Abstract

The application provides a software component analysis system based on data mining, comprising: a static component analysis module for performing static analysis and detection on software components and obtaining component distribution information of the software by extracting the information and features contained in the software; a vulnerability scanning module for scanning and matching the component distribution information of the software against a preset open source vulnerability database; an open source code license analysis module for detecting and analyzing open source code licenses in the software to obtain license information of the software; and a rapid deployment module for connecting seamlessly with external systems through a plurality of API interfaces, based on virtualized deployment, distributed deployment and image deployment. The application offers efficient and accurate identification: scanning is fast, and scanning and matching are supported at the code component, code file and code fragment levels, which ensures identification accuracy. Its coverage is broad: software component composition analysis, open source code license analysis and security vulnerability analysis can be carried out at the same time.

Description

Software component analysis system based on data mining
Technical Field
The application relates to the technical field of software analysis, in particular to a software component analysis system based on data mining.
Background
A software component analysis system is a product that can identify the third-party components used in a project. Current software component analysis systems identify third-party components mainly by means such as parsing component dependency management files, comparing binary string features, and comparing file metadata features.
When third-party components are identified by comparing file metadata features, current systems often consider file metadata features of only a single dimension, for example only project-level file metadata or only the metadata of a single file, which results in an insufficient detection rate for third-party components.
Disclosure of Invention
In view of the above, the present application aims to provide a software component analysis system based on data mining, which can solve the existing problems in a targeted manner.
Based on the above object, the present application provides a software component analysis system based on data mining, including:
the static component analysis module is used for carrying out static analysis and detection on the software components and obtaining the component distribution information of the software by extracting the information and the characteristics contained in the software;
the vulnerability scanning module is used for carrying out vulnerability scanning and matching on the component distribution information of the software based on a preset open source vulnerability database;
the open source code license analysis module is used for detecting and analyzing an open source code license in software to acquire license information of the software;
and the rapid deployment module is used for connecting seamlessly with external systems through a plurality of API interfaces based on virtualized deployment, distributed deployment and image deployment.
Further, the static component analysis module includes:
performing a binary static analysis of the binary software component to determine one or more security features of the binary software component;
generating a security manifest for the binary software component, including the determined one or more security features of the binary software component;
and determining the security manifest as the component distribution information of the software.
Further, the one or more security features are one or more compiler defensive features indicating one or more defensive techniques employed by the compiler used to build the binary software component, the defensive techniques being Address Space Layout Randomization (ASLR) or stack cookies; alternatively, the one or more security features are one or more Common Vulnerabilities and Exposures (CVE) entries representing known security vulnerabilities present in the binary software component; alternatively, the one or more security features are one or more insecure Application Programming Interfaces (APIs) used by the binary software component.
Further, the vulnerability scanning module includes:
identifying the open source components used in a software project;
for each identified open source component, generating a graph database query indicating an identity of the open source component and indicating a version of the open source component;
submitting the graph database query to a graph database of open source component vulnerabilities, and performing vulnerability scanning and matching on the component distribution information of the software, wherein the graph database is the preset open source vulnerability database;
and generating a vulnerability report for the software project by using the results of the graph database query.
Further, generating the graph database query includes generating the graph database query according to a schema of the graph database, the schema indicating a graph structure including: a first vertex representing a vulnerability source, a second vertex representing a vulnerability, a third vertex representing a software component version or version range, the edges between the vertices representing the type of relationship between the vertices.
Further, the open source code license analysis module includes:
acquiring an open source code and first license information associated with the open source code, the first license information including a classification for each license, the classification of the license based on an attribute of the license, wherein the classification defines at least one term associated with each class of license;
receiving input software code to be analyzed for at least one term or condition;
parsing the input software code to determine if the input software code matches the open source code;
determining a second license associated with the input software code that matches the open source code, and determining a classification of the second license;
and generating an output report containing the classification of the input software code.
Further, the classification of the second license includes:
a first class, which does not require that a derivative work use the same open source license as the original code from which it was derived;
a second class, under which a derivative work containing the original code must use the same license, but a file not containing the original code may be licensed in any way;
a third class, under which any file that is combined with the original code must be licensed under the same license as the original code.
Further, the rapid deployment module comprises:
a virtualized deployment unit that deploys the security manifest, the vulnerability report, and the output report containing the input software code classification on a server configured to identify potential errors in a target application that is under development and includes source code;
a distributed deployment unit that searches a solution template repository for a solution template matching the security manifest, the vulnerability report, and the output report containing the input software code classification and, when a matching solution template is found: creates a modified source file according to the matched solution template, wherein the modified source file comprises a software defect mitigation; creates an updated software distribution according to the modified source file; and distributes and deploys the updated software to a plurality of electronic computing devices;
an image deployment unit that creates a YAML file for the server from the security manifest, the vulnerability report, and the output report containing the input software code classification; builds a base image of the server; integrates the software component detection client into the base image of the server; deploys the server to a preset containerized application for management; and triggers the server to build a software component detection task.
Overall, the advantages of the application and the experience brought to the user are:
1. Efficient and accurate identification: scanning is fast. Scanning and matching are supported at the code component, code file and code fragment levels, backed by a comprehensive database of known open source vulnerabilities, which ensures accurate identification;
2. Broad coverage: software component composition analysis, open source code license analysis, security vulnerability analysis and the like can be carried out at the same time;
3. Rapid deployment capability: rich API interfaces allow seamless connection with other systems. The system is simple and easy to use, and provides multiple practical deployment modes for rapid deployment;
4. Professional service support: security vulnerabilities, licenses and software defects in open source components are tracked continuously to ensure application security, and comprehensive technical support services are provided.
Drawings
In the drawings, the same reference numerals refer to the same or similar parts or elements throughout the several views unless otherwise specified. The figures are not necessarily drawn to scale. It is appreciated that these drawings depict only some embodiments according to the disclosure and are not therefore to be considered limiting of its scope.
Fig. 1 shows a structural diagram of a data mining-based software component analysis system according to an embodiment of the present application.
FIG. 2 illustrates a schematic diagram of a specific implementation of a static component analysis module according to an embodiment of the application.
Fig. 3 is a schematic diagram of a specific implementation method of the vulnerability scanning module according to an embodiment of the present application.
FIG. 4 illustrates a schematic diagram of a specific implementation of an open source code license analysis module according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Fig. 6 is a schematic diagram of a storage medium according to an embodiment of the present application.
Detailed Description
The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be noted that, for convenience of description, only the portions related to the present application are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
The application uses static component analysis to perform static analysis and detection of software composition. By analyzing the information and features contained in the target software, it identifies, manages and tracks the software, efficiently detects the distribution of components within it, identifies security risks and pinpoints alerts accurately. This helps developers reduce vulnerabilities in the application, avoid prohibited licenses and potential security hazards, and protect the information security of the software. Security, development and testing departments can use the application to inspect source code and open source components for security issues.
As shown in fig. 1, an embodiment of the application provides a software component analysis system based on data mining, the system comprising:
the static component analysis module is used for carrying out static analysis and detection on the software components and obtaining the component distribution information of the software by extracting the information and the characteristics contained in the software;
the vulnerability scanning module is used for carrying out vulnerability scanning and matching on the component distribution information of the software based on a preset open source vulnerability database;
the open source code license analysis module is used for detecting and analyzing an open source code license in software to acquire license information of the software;
and the rapid deployment module is used for connecting seamlessly with external systems through a plurality of API interfaces based on virtualized deployment, distributed deployment and image deployment.
Specific implementation and technical details of each module are described in detail below:
The static component analysis module, as shown in fig. 2, includes:
S1, performing binary static analysis of the binary software component to determine one or more security features of the binary software component. The security features may be one or more compiler defense features indicating one or more defense techniques employed by the compiler used to build the binary software component, the defense techniques being Address Space Layout Randomization (ASLR) or stack cookies; the security features may also be one or more Common Vulnerabilities and Exposures (CVE) entries representing known security vulnerabilities present in the binary software component, or one or more insecure Application Programming Interfaces (APIs) used by the binary software component;
S2, generating a security manifest for the binary software component, wherein the security manifest comprises the determined one or more security features of the binary software component;
S3, determining the security manifest as the component distribution information of the software.
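Steps S1 to S3 can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: it assumes a naive byte-substring scan instead of real ELF/PE parsing, and the names `analyze_binary`, `component_distribution_info`, and the marker lists are hypothetical. The only external fact relied on is that GCC and Clang stack protectors reference the `__stack_chk_fail` symbol.

```python
import json

# Hypothetical marker lists; a real analyzer would parse headers and symbol
# tables rather than search raw bytes (substring search can over-match).
INSECURE_APIS = (b"strcpy", b"sprintf")
STACK_COOKIE_SYMBOL = b"__stack_chk_fail"  # referenced by GCC/Clang stack protectors


def analyze_binary(name: str, data: bytes) -> dict:
    """S1 + S2: derive security features and wrap them in a security manifest."""
    found_apis = sorted(api.decode() for api in INSECURE_APIS if api in data)
    return {
        "component": name,
        "compiler_defenses": {"stack_cookie": STACK_COOKIE_SYMBOL in data},
        "insecure_apis": found_apis,
    }


def component_distribution_info(manifests: list) -> str:
    """S3: the collected manifests serve as the component distribution information."""
    return json.dumps(manifests, indent=2, sort_keys=True)


# Made-up bytes standing in for a compiled component.
sample = b"\x7fELF...strcpy\x00__stack_chk_fail\x00"
manifest = analyze_binary("libdemo.so", sample)
print(component_distribution_info([manifest]))
```

Emitting the manifest as JSON is an assumption made here so that the later modules can consume it; the source does not specify a format.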
The vulnerability scanning module, as shown in fig. 3, includes:
S31, identifying the open source components used in a software project;
S32, for each identified open source component, generating a graph database query, wherein the graph database query indicates the identity of the open source component and the version of the open source component;
S33, submitting the graph database query to a graph database of open source component vulnerabilities, and performing vulnerability scanning and matching on the component distribution information of the software, wherein the graph database is the preset open source vulnerability database;
S34, generating a vulnerability report for the software project by using the results of the graph database query.
Generating the vulnerability report further comprises determining the impact of each identified vulnerability and indicating the impact in the vulnerability report, wherein determining the impact of each identified vulnerability comprises determining the frequency of use of the open source component corresponding to the identified vulnerability and the number of distinct code units of the software project that use that open source component.
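The two impact signals named above (frequency of use and number of distinct code units) could be computed along these lines. The usage records, file paths, and the function name `impact_by_component` are hypothetical; the source does not say how the two numbers are combined into a final impact value, so the sketch simply reports both.

```python
from collections import defaultdict

# Hypothetical usage records: (code unit, open source component, call-site count).
USAGES = [
    ("billing/api.py", "libfoo", 4),
    ("billing/db.py", "libfoo", 1),
    ("web/views.py", "libbar", 2),
]


def impact_by_component(usages):
    """Return {component: (frequency of use, number of distinct code units)}."""
    freq = defaultdict(int)
    units = defaultdict(set)
    for unit, component, count in usages:
        freq[component] += count
        units[component].add(unit)
    return {c: (freq[c], len(units[c])) for c in freq}


impact = impact_by_component(USAGES)
for component, (frequency, unit_count) in sorted(impact.items()):
    print(f"{component}: used {frequency} times across {unit_count} code units")
```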
Wherein generating the graph database query includes generating the graph database query according to a schema of the graph database, the schema indicating a graph structure comprising: a first vertex representing a vulnerability source, a second vertex representing a vulnerability, a third vertex representing a software component version or version range, the edges between the vertices representing the type of relationship between the vertices.
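A query generated according to such a schema might look like the sketch below, which builds a parameterized Cypher-style query string for one component. The vertex labels (`VulnerabilitySource`, `Vulnerability`, `ComponentVersion`), the relationship types (`REPORTS`, `AFFECTS`), and the property names are assumptions chosen to mirror the three vertex kinds and typed edges; the source does not name a query language or graph database product, and version-range matching is not handled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    version: str


def build_query(component: Component):
    """Build a parameterized Cypher-style query for one open source component.

    All labels, relationship types and property names are hypothetical.
    """
    query = (
        "MATCH (s:VulnerabilitySource)-[:REPORTS]->(v:Vulnerability)"
        "-[:AFFECTS]->(c:ComponentVersion) "
        "WHERE c.name = $name AND c.version = $version "
        "RETURN s.name AS source, v.id AS vulnerability"
    )
    # Values travel as parameters instead of being spliced into the query text.
    return query, {"name": component.name, "version": component.version}


query, params = build_query(Component("libfoo", "1.2.3"))
print(query)
print(params)
```

Passing the component identity and version as parameters keeps the query text constant across components, which also avoids quoting problems with unusual component names.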
The open source code license analysis module, as shown in fig. 4, includes:
S41, acquiring open source code and first license information associated with the open source code, wherein the first license information comprises a classification of each license, the classification of a license is based on the attributes of the license, and the classification defines at least one term associated with each class of license;
S42, receiving input software code to be analyzed for at least one term or condition;
S43, analyzing the input software code to determine whether the input software code matches the open source code;
S44, determining a second license associated with the input software code that matches the open source code, and determining a classification of the second license, the classification of the second license comprising:
a first class, which does not require that a derivative work use the same open source license as the original code from which it was derived;
a second class, under which a derivative work containing the original code must use the same license, but a file not containing the original code may be licensed in any way;
a third class, under which any file combined with the original code must be licensed under the same license as the original code;
S45, generating an output report containing the classification of the input software code.
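Steps S41 to S45 can be illustrated with the following sketch. It assumes exact matching on a whitespace-normalized SHA-256 fingerprint, which is a simplification of whatever matching the system actually uses, and the assignment of MIT, MPL-2.0 and GPL-3.0-only to the three classes is an illustrative reading of those licenses, not something stated in the source. All function and variable names are hypothetical.

```python
import hashlib
from enum import Enum


class LicenseClass(Enum):
    FIRST = 1   # derivative works may use a different license
    SECOND = 2  # files containing original code must keep the license
    THIRD = 3   # files combined with original code must keep the license


# S41: first license information (illustrative mapping, an assumption).
LICENSE_CLASSES = {
    "MIT": LicenseClass.FIRST,
    "MPL-2.0": LicenseClass.SECOND,
    "GPL-3.0-only": LicenseClass.THIRD,
}


def fingerprint(code: str) -> str:
    # Normalize whitespace so trivial reformatting does not defeat the match.
    normalized = " ".join(code.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical index of known open source code: fingerprint -> license id.
KNOWN_CODE = {fingerprint("int add(int a, int b) { return a + b; }"): "MPL-2.0"}


def classify(input_code: str):
    """S43 + S44: match the input against known code, then classify its license."""
    license_id = KNOWN_CODE.get(fingerprint(input_code))
    if license_id is None:
        return None
    return license_id, LICENSE_CLASSES[license_id]


# S42 + S45: analyze reformatted input and print the report entry.
report = classify("int add(int a, int b)\n{\n    return a + b;\n}")
print(report)
```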
The rapid deployment module includes:
a virtualized deployment unit that deploys the security manifest, the vulnerability report, and the output report containing the input software code classification on a server configured to identify potential errors in a target application that is under development and includes source code;
a distributed deployment unit that searches a solution template repository for a solution template matching the security manifest, the vulnerability report, and the output report containing the input software code classification and, when a matching solution template is found: creates a modified source file according to the matched solution template, wherein the modified source file comprises a software defect mitigation; creates an updated software distribution according to the modified source file; and distributes and deploys the updated software to a plurality of electronic computing devices;
an image deployment unit that creates a YAML file for the server from the security manifest, the vulnerability report, and the output report containing the input software code classification; builds a base image of the server; integrates the software component detection client into the base image of the server; deploys the server to a preset containerized application for management; and triggers the server to build a software component detection task.
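The template lookup performed by the distributed deployment unit might be sketched as below. The repository contents, the placeholder finding identifier `DEMO-0001`, and the plain text-replacement mitigation are all assumptions for illustration; the source does not describe the template format or how a match is decided.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SolutionTemplate:
    finding_id: str   # identifier the template is matched on (hypothetical)
    pattern: str      # source text to be replaced
    replacement: str  # mitigation text


# Hypothetical solution template repository.
REPOSITORY = [
    SolutionTemplate(
        "DEMO-0001",
        "strcpy(dst, src)",
        'snprintf(dst, sizeof(dst), "%s", src)',
    ),
]


def find_template(vulnerability_report):
    """Return the first template matching any identifier in the report."""
    for template in REPOSITORY:
        if template.finding_id in vulnerability_report:
            return template
    return None


def apply_template(source: str, template: SolutionTemplate) -> str:
    """Create the modified source file that carries the mitigation."""
    return source.replace(template.pattern, template.replacement)


original = "void copy(char *src) { char dst[16]; strcpy(dst, src); }"
template = find_template(["DEMO-0001"])
patched = apply_template(original, template) if template else original
print(patched)
```

Building the updated distribution and pushing it to devices is left out, since those steps depend on packaging and transport choices the source does not specify.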
The software component analysis system based on data mining has the following advantages:
1. Open source component identification capability: more than 2 million components and more than 30 million versions can be identified, and the artifact types that can be analyzed include binaries, bytecode, source code and Docker images;
2. Comprehensive security defect library: CVE, CNNVD, CNVD and multiple community vulnerability libraries are built into the system, so that joint search across multiple vulnerability libraries is supported and the defect recognition rate is improved.
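Joint search across several vulnerability libraries can be pictured as querying each library and merging the hits. The sketch below uses small in-memory dictionaries as stand-ins for the libraries; the component names and the `VULN-*` identifiers are placeholders, not real advisory entries, and `joint_search` is a hypothetical name.

```python
# Hypothetical in-memory stand-ins for separate vulnerability libraries.
LIBRARIES = {
    "CVE": {("libfoo", "1.2.3"): ["VULN-A"]},
    "CNNVD": {("libfoo", "1.2.3"): ["VULN-A", "VULN-B"]},
    "CNVD": {("libbar", "0.9.0"): ["VULN-C"]},
}


def joint_search(name: str, version: str) -> dict:
    """Query every library and merge hits as {vulnerability id: [sources]}."""
    merged = {}
    for source, index in LIBRARIES.items():
        for vuln_id in index.get((name, version), []):
            merged.setdefault(vuln_id, []).append(source)
    return merged


hits = joint_search("libfoo", "1.2.3")
for vuln_id, sources in sorted(hits.items()):
    print(vuln_id, "reported by", ", ".join(sources))
```

Keeping the list of reporting sources per finding shows where a second library contributes an entry the first one lacks, which is the stated reason joint search raises the recognition rate.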
Referring to fig. 5, a schematic diagram of an electronic device according to some embodiments of the present application is shown. As shown in fig. 5, the electronic device 20 includes: a processor 200, a memory 201, a bus 202 and a communication interface 203, the processor 200, the communication interface 203 and the memory 201 being connected by the bus 202; the memory 201 stores a computer program that can be executed on the processor 200, and the processor 200 executes the data mining-based software component analysis system provided in any one of the foregoing embodiments of the present application when executing the computer program.
The memory 201 may include a high-speed random access memory (RAM) and may further include a non-volatile memory, such as at least one disk memory. The communication connection between this system's network element and at least one other network element is implemented via at least one communication interface 203 (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, etc.
Bus 202 may be an ISA bus, a PCI bus, an EISA bus, or the like. The buses may be classified as address buses, data buses, control buses, etc. The memory 201 is configured to store a program, and the processor 200 executes the program after receiving an execution instruction, and the data mining-based software component analysis system disclosed in any of the foregoing embodiments of the present application may be applied to the processor 200 or implemented by the processor 200.
The processor 200 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 200 or by instructions in the form of software. The processor 200 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 201; the processor 200 reads the information in the memory 201 and, in combination with its hardware, performs the steps of the above method.
The electronic device provided by the embodiment of the application is based on the same inventive concept as the data mining-based software component analysis system provided by the embodiment of the application, and therefore has the same beneficial effects as the system it adopts, runs or implements.
The present application further provides a computer readable storage medium corresponding to the data mining-based software component analysis system provided in the foregoing embodiment, referring to fig. 6, the computer readable storage medium is shown as an optical disc 30, on which a computer program (i.e. a program product) is stored, where the computer program, when executed by a processor, performs the data mining-based software component analysis system provided in any of the foregoing embodiments.
It should be noted that examples of the computer readable storage medium may also include, but are not limited to, a phase change memory (PRAM), a Static Random Access Memory (SRAM), a Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a flash memory, or other optical or magnetic storage medium, which will not be described in detail herein.
The computer readable storage medium provided by the above embodiment of the present application has the same advantages as the method adopted, operated or implemented by the application program stored in the computer readable storage medium, because of the same inventive concept as the data mining-based software component analysis system provided by the embodiment of the present application.
It should be noted that:
the algorithms and displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a construction of such a system is apparent from the description above. In addition, the present application is not directed to any particular programming language. It will be appreciated that the teachings of the present application described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of enablement and best mode of the present application.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the application may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the application and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
Various component embodiments of the application may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functions of some or all of the components of the software component analysis system according to embodiments of the application may in practice be implemented using a microprocessor or a Digital Signal Processor (DSP). The present application can also be implemented as an apparatus or system program (e.g., a computer program and a computer program product) for performing part or all of the methods described herein. Such a program embodying the present application may be stored on a computer readable medium, or may take the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the application, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; these words may be interpreted as names.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto. Any person skilled in the art will readily recognize that various changes and substitutions are possible within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A data mining-based software component analysis system, comprising:
a static component analysis module configured to perform static analysis and detection on software components and to obtain component distribution information of the software by extracting the information and features contained in the software;
a vulnerability scanning module configured to perform vulnerability scanning and matching on the component distribution information of the software based on a preset open source vulnerability database;
an open source code license analysis module configured to detect and analyze open source code licenses in the software to obtain license information of the software; and
a rapid deployment module configured to connect seamlessly with external systems through a plurality of APIs, based on virtualized deployment, distributed deployment and mirror image deployment.
2. The system of claim 1, wherein
the static component analysis module is configured to perform:
performing a binary static analysis of a binary software component to determine one or more security features of the binary software component;
generating a security manifest for the binary software component, the security manifest including the determined one or more security features of the binary software component; and
determining the security manifest as the component distribution information of the software.
3. The system of claim 2, wherein
the one or more security features are one or more compiler defensive features indicating one or more defensive techniques employed by the compiler used to build the binary software component, the defensive techniques being Address Space Layout Randomization (ASLR) or stack cookies; alternatively, the one or more security features are one or more Common Vulnerabilities and Exposures (CVE) entries representing known security vulnerabilities present in the binary software component; alternatively, the one or more security features are one or more insecure Application Programming Interfaces (APIs) used by the binary software component.
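As an illustration of claims 2 and 3, the security manifest can be sketched as a small record that collects the three kinds of security features. This is a minimal sketch under stated assumptions, not the applicant's implementation: the names `SecurityManifest`, `build_manifest`, `manifest_to_json` and the `INSECURE_APIS` deny-list are hypothetical, and the inputs (imported symbols, compiler defense flags, CVE identifiers) are assumed to have been extracted already by a separate binary analysis step.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical deny-list of C library calls treated as insecure APIs.
INSECURE_APIS = {"gets", "strcpy", "sprintf"}


@dataclass
class SecurityManifest:
    """Security features of one binary software component."""
    component: str
    compiler_defenses: dict = field(default_factory=dict)  # e.g. {"ASLR": True}
    cves: list = field(default_factory=list)               # known CVE identifiers
    insecure_apis: list = field(default_factory=list)      # insecure APIs in use


def build_manifest(component, imported_symbols, compiler_defenses, known_cves):
    """Assemble a manifest from features that were already extracted."""
    insecure = sorted(INSECURE_APIS.intersection(imported_symbols))
    return SecurityManifest(
        component=component,
        compiler_defenses=dict(compiler_defenses),
        cves=list(known_cves),
        insecure_apis=insecure,
    )


def manifest_to_json(manifest):
    """Serialize the manifest so it can serve as component distribution information."""
    return json.dumps(asdict(manifest), indent=2, sort_keys=True)
```

A caller might pass `build_manifest("libdemo.so", ["strcpy", "malloc"], {"ASLR": True, "stack_cookie": False}, [])` and store the JSON output alongside the scanned binary; the component name here is a placeholder.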
4. The system of claim 3, wherein
the vulnerability scanning module is configured to perform:
identifying open source components used in a software item;
for each identified open source component, generating a graph database query indicating an identity of the open source component and a version of the open source component;
submitting the graph database query to a graph database of open source component vulnerabilities, and performing vulnerability scanning and matching on the component distribution information of the software, wherein the graph database is the preset open source vulnerability database; and
generating a vulnerability report for the software item using the results of the graph database query.
5. The system of claim 4, wherein
generating the graph database query includes generating the graph database query according to a schema of the graph database, the schema indicating a graph structure including: a first vertex representing a vulnerability source, a second vertex representing a vulnerability, and a third vertex representing a software component version or version range, wherein the edges between the vertices represent the type of relationship between the vertices.
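The three-vertex schema of claim 5 and the per-component query of claim 4 can be illustrated with a small in-memory graph. This is only a sketch under assumptions: a real system would use an actual graph database and its query language, whereas here the query is a plain dictionary, the edge labels `REPORTS` and `AFFECTS` are invented for illustration, and versions are assumed to be dot-separated integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertex:
    kind: str  # "source", "vulnerability", or "version_range"
    key: str


def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


class VulnGraph:
    def __init__(self):
        self.edges = []   # (from_vertex, relation, to_vertex)
        self.ranges = {}  # version_range key -> (component, low, high)

    def add_vulnerability(self, source, vuln_id, component, low, high):
        src = Vertex("source", source)
        vuln = Vertex("vulnerability", vuln_id)
        rng = Vertex("version_range", f"{component}:{low}-{high}")
        self.ranges[rng.key] = (component, parse_version(low), parse_version(high))
        self.edges.append((src, "REPORTS", vuln))   # source -> vulnerability
        self.edges.append((vuln, "AFFECTS", rng))   # vulnerability -> version range

    def run(self, query):
        """Return ids of vulnerabilities whose range covers the queried version."""
        wanted = parse_version(query["version"])
        hits = []
        for frm, relation, to in self.edges:
            if relation != "AFFECTS":
                continue
            component, low, high = self.ranges[to.key]
            if component == query["component"] and low <= wanted <= high:
                hits.append(frm.key)
        return sorted(hits)


def make_query(component, version):
    """One query per open source component: its identity and its version."""
    return {"component": component, "version": version}
```

The vulnerability report for a software item would then be the collected results of `graph.run(make_query(name, version))` over every identified component.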
6. The system of claim 5, wherein
the open source code license analysis module is configured to perform:
acquiring open source code and first license information associated with the open source code, the first license information including a classification for each license, the classification of a license being based on an attribute of the license, wherein the classification defines at least one term associated with each class of license;
receiving input software code to be analyzed for at least one term or condition;
parsing the input software code to determine whether the input software code matches the open source code;
determining a second license associated with the input software code that matches the open source code, and determining a classification of the second license; and
generating an output report containing the classification of the input software code.
7. The system of claim 6, wherein
the classification of the second license includes:
a first class, which does not require a derivative work to use the same open source license as the original code from which it was derived;
a second class, under which a derivative work containing the original code must use the same license, but a file not containing the original code may be licensed in any way; and
a third class, under which any file that is combined with the original code must be licensed under the same license as the original code.
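The matching and classification steps of claims 6 and 7 can be sketched as follows. The sketch makes several assumptions that are not in the claims: matching is done by an exact SHA-256 fingerprint of whitespace-normalized code (a real analyzer would likely use fuzzier matching), and the assignment of the example SPDX identifiers to the three classes in `LICENSE_CLASS` is an illustrative reading of permissive, file-level copyleft and strong copyleft licenses rather than a legal determination.

```python
import hashlib

# Illustrative mapping of license identifiers to the three classes of claim 7.
LICENSE_CLASS = {
    "MIT": 1,           # first class: derivative works may use another license
    "Apache-2.0": 1,
    "MPL-2.0": 2,       # second class: files containing original code keep the license
    "GPL-3.0-only": 3,  # third class: combined files take the same license
}


def fingerprint(code):
    """Hash the code after dropping blank lines and surrounding whitespace."""
    lines = [line.strip() for line in code.splitlines() if line.strip()]
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def build_index(known_sources):
    """known_sources: iterable of (open_source_code, license_id) pairs."""
    return {fingerprint(code): license_id for code, license_id in known_sources}


def classify(input_code, index):
    """Return a report entry for one piece of input software code."""
    license_id = index.get(fingerprint(input_code))
    if license_id is None:
        return {"matched": False, "license": None, "class": None}
    return {
        "matched": True,
        "license": license_id,
        "class": LICENSE_CLASS.get(license_id),
    }
```

The output report of claim 6 would be the list of `classify` results for every input file, so that a reviewer can see at a glance which files fall into the second or third class.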
8. The system of claim 7, wherein
the rapid deployment module comprises:
a virtualized deployment unit that deploys the security manifest, the vulnerability report, and the output report containing the input software code classification on a server configured to identify potential errors in a target application under development, the target application including source code;
a distributed deployment unit that searches a solution template repository for a solution template matching the security manifest, the vulnerability report, and the output report containing the input software code classification and, when a matching solution template is found: creates a modified source file according to the matched solution template, the modified source file including a software defect mitigation; creates an updated software distribution from the modified source file; and distributes and deploys the updated software to a plurality of electronic computing devices; and
a mirror image deployment unit that: creates a YAML file for the server from the security manifest, the vulnerability report, and the output report containing the input software code classification; builds a base image of the server; integrates the software component detection client into the base image of the server; deploys the server to a preset containerized application for management; and triggers the server to build a software component detection task.
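The first step of the mirror image deployment unit, creating a YAML file for the server from the three reports, can be sketched with the standard library alone. Everything specific here is an assumption: the keys `server`, `name` and `reports`, the value `sca-server`, and the report file names are hypothetical, and the tiny emitter handles only nested mappings, lists of scalars and scalars (strings are written as double-quoted scalars via `json.dumps`).

```python
import json


def server_config(manifest_path, vuln_report_path, license_report_path):
    """Describe the server and the three reports it is deployed with."""
    return {
        "server": {
            "name": "sca-server",  # hypothetical server name
            "reports": [manifest_path, vuln_report_path, license_report_path],
        }
    }


def to_yaml(data, indent=0):
    """Render nested dicts and lists of scalars as a list of YAML lines."""
    pad = "  " * indent
    lines = []
    for key, value in data.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 1))
        elif isinstance(value, list):
            lines.append(f"{pad}{key}:")
            lines.extend(f"{pad}  - {json.dumps(item)}" for item in value)
        else:
            lines.append(f"{pad}{key}: {json.dumps(value)}")
    return lines


def render(config):
    """Join the YAML lines into the text that would be written to the file."""
    return "\n".join(to_yaml(config)) + "\n"
```

In practice a dedicated YAML library would be the more robust choice; the hand-written emitter is used here only to keep the sketch self-contained.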
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor runs the computer program to implement the system of any one of claims 1-8.
10. A computer readable storage medium having stored thereon a computer program, wherein the program is executed by a processor to implement the system of any of claims 1-8.
CN202310621243.6A 2023-05-29 2023-05-29 Software component analysis system based on data mining Pending CN116841610A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310621243.6A CN116841610A (en) 2023-05-29 2023-05-29 Software component analysis system based on data mining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310621243.6A CN116841610A (en) 2023-05-29 2023-05-29 Software component analysis system based on data mining

Publications (1)

Publication Number Publication Date
CN116841610A (en) 2023-10-03

Family

ID=88160679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310621243.6A Pending CN116841610A (en) 2023-05-29 2023-05-29 Software component analysis system based on data mining

Country Status (1)

Country Link
CN (1) CN116841610A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117806624A (en) * 2023-12-11 2024-04-02 北京北大软件工程股份有限公司 Extraction method of reusable component

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination