CN117406967B - Component identification method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN117406967B (application number CN202311725580.6A)
- Authority
- CN
- China
- Prior art keywords
- component
- slice
- code
- code slice
- software product
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/20—Software design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/42—Syntactic analysis
- G06F8/427—Parsing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Stored Programmes (AREA)
Abstract
An embodiment of the invention provides a component identification method and device, an electronic device, and a storage medium. The method comprises: performing slice acquisition on a software product to obtain a first code slice of the software product; when the first code slice is not a preset blacklist slice, parsing the first code slice to obtain the component features corresponding to the first code slice; and determining a component identification result for the software product according to a preset component feature library and the component features corresponding to the first code slice, where the component identification result characterizes the third-party components included in the software product. In this process, the component identification result can be generated, and the third-party components included in the software product determined, without parsing the source code or binary package of the software product. This broadens the scenarios in which component identification can be used and improves the accuracy of the component identification result.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a component identification method, a device, an electronic apparatus, and a storage medium.
Background
With the wide application of information technology, demand for software development is growing explosively. To improve development efficiency and reduce development cost, developers now make wide use of third-party open source components. However, third-party components may introduce security risks, so the third-party components contained in a software product need to be identified during its development.
Component identification methods in the prior art generally extract the source code or binary package of a software product and determine the third-party components it contains by parsing that source code or binary package. However, some software products encrypt their source code or binary package, so the source code or binary package cannot be extracted accurately, which reduces the accuracy of the component identification result.
Disclosure of Invention
Based on the foregoing, it is necessary to provide a component identification method, device, electronic device, and storage medium to solve the technical problem that existing component identification results have low accuracy.
A component identification method, comprising:
performing slice acquisition on a software product to obtain a first code slice of the software product;
when the first code slice is not a preset blacklist slice, parsing the first code slice to obtain the component features corresponding to the first code slice;
determining a component identification result corresponding to the software product according to a preset component feature library and the component features corresponding to the first code slice; the component identification result is used to characterize a third party component included in the software product.
Optionally, before the parsing of the first code slice to obtain the component features corresponding to the first code slice when the first code slice is not a preset blacklist slice, the method further includes:
acquiring a component path corresponding to the first code slice;
querying the component path corresponding to the first code slice in a preset filtering blacklist;
if the component path corresponding to the first code slice is not found, determining that the first code slice is not a preset blacklist slice;
if the component path corresponding to the first code slice is found, determining that the first code slice is a preset blacklist slice;
wherein the filtering blacklist stores the component paths corresponding to blacklist slices.
Optionally, after the performing of slice acquisition on the software product to obtain the first code slice of the software product, the method further includes:
performing slice acquisition on the software product again when the first code slice is a preset blacklist slice.
Optionally, the component feature library comprises a feature base table and a feature increment table, wherein the feature base table stores a mapping relationship between component features and component numbers;
the determining of the component identification result corresponding to the software product according to the preset component feature library and the component features corresponding to the first code slice comprises:
querying the component features corresponding to the first code slice in the feature base table;
when a component number mapped to the component features corresponding to the first code slice is found, determining the component identification result corresponding to the software product according to the feature increment table;
when no component number mapped to the component features corresponding to the first code slice is found, performing slice acquisition on the software product again.
Optionally, the feature increment table stores a mapping relationship between component numbers and component versions;
the determining of the component identification result corresponding to the software product according to the feature increment table comprises:
querying the component number in the feature increment table;
if a unique component version mapped to the component number is found, determining that the component associated with the component number is a third-party component included in the software product;
if multiple component versions mapped to the component number are found, determining the component identification result corresponding to the software product based on a second code slice; the second code slice is obtained by performing expansion slice processing on the first code slice, and the data volume of the second code slice is larger than that of the first code slice.
Optionally, the determining, based on the second code slice, a component identification result corresponding to the software product includes:
performing expansion slicing processing on the first code slice to obtain the second code slice;
acquiring slice information corresponding to the second code slice; the slice information comprises folders in the data packet to which the second code slice belongs, files in the data packet to which the second code slice belongs, and hash values corresponding to the data packet to which the second code slice belongs;
determining the component identification result corresponding to the software product based on the slice information corresponding to the second code slice and the feature increment table.
Optionally, the feature increment table further stores increment types, wherein the increment types comprise a folder type, a file type and a changed file type;
the determining a component identification result corresponding to the software product based on the slice information corresponding to the second code slice and the feature increment table comprises the following steps:
when the slice information corresponding to the second code slice and the feature increment table meet a preset condition, determining that the component associated with the second code slice is a third-party component included in the software product;
wherein the preset condition includes any one of the following:
a folder in the data packet to which the second code slice belongs matches the folder type;
a file in the data packet to which the second code slice belongs matches the file type;
the hash value corresponding to the data packet to which the second code slice belongs matches the changed file type.
A component identification device, comprising:
a first acquisition module, configured to perform slice acquisition on a software product to obtain a first code slice of the software product;
a parsing module, configured to parse the first code slice to obtain the component features corresponding to the first code slice when the first code slice is not a preset blacklist slice;
a first determining module, configured to determine a component identification result corresponding to the software product according to a preset component feature library and the component features corresponding to the first code slice; the component identification result is used to characterize a third-party component included in the software product.
An electronic device comprising a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, the processor implementing the component identification method described above when executing the computer readable instructions.
One or more readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform a component identification method as described above.
The embodiments of the invention provide a component identification method and device, an electronic device, and a storage medium. In these embodiments, a code slice is obtained by performing slice acquisition on a software product, and the component identification result for the software product is then determined according to a preset component feature library and the component features corresponding to the code slice. In this process, the component identification result can be generated, and the third-party components included in the software product determined, without parsing the source code or binary package of the software product. This broadens the scenarios in which component identification can be used: third-party components can be identified even when the source code or binary package of the software product cannot be provided, and the accuracy of the component identification result is improved.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a schematic view of an application environment of a component recognition method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a component recognition method according to an embodiment of the invention;
FIG. 3 is a flow chart of an application of a component identification method in an embodiment of the invention;
FIG. 4 is a schematic diagram of a component recognition apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an electronic device according to an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The component identification method provided in this embodiment may be applied in an application environment as shown in fig. 1, where a client communicates with a server. Clients include, but are not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server may be implemented by a stand-alone server or a server cluster formed by a plurality of servers.
As shown in FIG. 2, in an embodiment, a component identification method is provided. The method is described using its application to the server in FIG. 1 as an example, where the server may be a component identification device. The method includes the following steps:
S210, performing slice acquisition on the software product to obtain a first code slice of the software product.
The software product may be an application program; it is the software product on which component identification is to be performed.
Optionally, JVM bytecode instrumentation may be used to attach transparently to the process of the running application and collect a code slice from it, thereby obtaining the first code slice of the software product. It should be understood that JVM bytecode instrumentation performs slice acquisition as traffic scenarios are triggered, so this mode is also called a minimum running-state slice snapshot; that is, the first code slice described above is the minimum code slice.
S220, when the first code slice is not a preset blacklist slice, parsing the first code slice to obtain the component features corresponding to the first code slice.
Blacklist slices are preset; a blacklist slice is a user-defined code slice that does not need component identification.
In this step, if the first code slice is not a preset blacklist slice, the component associated with the first code slice needs to be identified, so the first code slice is parsed to obtain its component features. For specific embodiments of determining whether the first code slice is a blacklist slice, refer to the embodiments below.
It should be understood that the minimum running-state slice snapshot, that is, the first code slice, contains bytecode information. Parsing the bytecode information yields a vendor identifier and a component name, where the component name is the name of the component associated with the first code slice and the vendor identifier denotes the vendor of that component. The vendor identifier and the component name are taken as the component features corresponding to the first code slice.
Specifically, as can be seen from Table one, the first 8 bytes of the class file in the bytecode information are skipped and constant_pool_count, the number of constant pool entries, is read. An array cpInfoOffsets of that size is created; each element of the array is the offset of the corresponding constant within the class file, and a constant is read based on its offset. Once the constant pool has been fully traversed and its offsets stored in cpInfoOffsets, the current position is at access_flags. Advancing two bytes from the current position gives the current class index this_class, from which the vendor identifier and the component name are obtained.
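The parsing steps described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `read_this_class` is hypothetical, the constant sizes follow the standard JVM class file layout, and the sketch stops at the internal class name because the text does not specify how the vendor identifier and component name are derived from it.

```python
import struct

# Payload size in bytes (excluding the 1-byte tag) of each fixed-size constant.
_FIXED_SIZES = {3: 4, 4: 4, 5: 8, 6: 8, 7: 2, 8: 2, 9: 4, 10: 4,
                11: 4, 12: 4, 15: 3, 16: 2, 17: 4, 18: 4, 19: 2, 20: 2}


def read_this_class(data: bytes) -> str:
    """Return the internal name (e.g. 'org/example/Foo') of the class in `data`."""
    if data[:4] != b"\xca\xfe\xba\xbe":
        raise ValueError("not a JVM class file")
    # Skip magic (4 bytes) and minor/major version (2 + 2 bytes).
    (cp_count,) = struct.unpack_from(">H", data, 8)
    # cp_offsets[i] is the offset of constant i's payload (just after its tag).
    cp_offsets = [0] * cp_count
    pos = 10
    index = 1  # constant pool indices start at 1
    while index < cp_count:
        tag = data[pos]
        cp_offsets[index] = pos + 1
        if tag == 1:  # CONSTANT_Utf8: u2 length followed by the bytes
            (length,) = struct.unpack_from(">H", data, pos + 1)
            pos += 3 + length
        elif tag in _FIXED_SIZES:
            pos += 1 + _FIXED_SIZES[tag]
        else:
            raise ValueError(f"unknown constant pool tag {tag}")
        # Long and Double constants occupy two constant pool slots.
        index += 2 if tag in (5, 6) else 1
    # pos is now at access_flags (u2); this_class (u2) follows it.
    (this_class,) = struct.unpack_from(">H", data, pos + 2)
    # CONSTANT_Class payload is the index of the Utf8 constant holding the name.
    (name_index,) = struct.unpack_from(">H", data, cp_offsets[this_class])
    (length,) = struct.unpack_from(">H", data, cp_offsets[name_index])
    start = cp_offsets[name_index] + 2
    # Class names are modified UTF-8; plain UTF-8 decoding covers typical names.
    return data[start:start + length].decode("utf-8")
```

Keeping an offset per constant, as the cpInfoOffsets array does, lets the name be resolved after a single pass over the pool without re-scanning it.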
Table one:
S230, determining a component identification result corresponding to the software product according to a preset component feature library and the component features corresponding to the first code slice.
The component feature library is preset. In this step, after the component features corresponding to the first code slice are obtained, the component identification result corresponding to the software product is determined according to the preset component feature library and those component features. The component identification result is used to characterize the third-party components included in the software product.
In this embodiment, a code slice is obtained by performing slice acquisition on a software product, and the component identification result for the software product is then determined according to a preset component feature library and the component features corresponding to the code slice. In this process, the component identification result can be generated, and the third-party components included in the software product determined, without parsing the source code or binary package of the software product. This broadens the scenarios in which component identification can be used: third-party components can be identified even when the source code or binary package of the software product cannot be provided, and the accuracy of the component identification result is improved.
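As a rough sketch of how steps S210 to S230 fit together, the loop below walks over acquired slices, skips blacklist slices, parses the rest, and stops at the first match in the feature library. All names here (`identify_from_slices` and its three callable parameters) are hypothetical; the real acquisition, parsing, and lookup are passed in as functions so that the sketch makes no claim about how they are implemented.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Slice = TypeVar("Slice")
Features = Tuple[str, str]  # assumed shape: (vendor identifier, component name)


def identify_from_slices(
    slices: Iterable[Slice],
    is_blacklist_slice: Callable[[Slice], bool],
    parse_features: Callable[[Slice], Features],
    match_library: Callable[[Features], Optional[str]],
) -> Optional[str]:
    """Return the first identification result, or None if no slice matches."""
    for code_slice in slices:                    # S210: each acquired slice
        if is_blacklist_slice(code_slice):
            continue                             # blacklist slice: acquire again
        features = parse_features(code_slice)    # S220: parse component features
        result = match_library(features)         # S230: query the feature library
        if result is not None:
            return result
        # No match in the library: fall through and acquire another slice.
    return None
```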
The following describes in detail how to determine whether the first code slice is a blacklist slice:
Optionally, before the parsing of the first code slice to obtain the component features corresponding to the first code slice when the first code slice is not a preset blacklist slice, the method further includes:
acquiring a component path corresponding to the first code slice;
querying the component path corresponding to the first code slice in a preset filtering blacklist;
if the component path corresponding to the first code slice is not found, determining that the first code slice is not a preset blacklist slice;
if the component path corresponding to the first code slice is found, determining that the first code slice is a preset blacklist slice;
wherein the filtering blacklist stores the component paths corresponding to blacklist slices.
In this embodiment, a filtering blacklist is preset. The filtering blacklist stores the component paths corresponding to blacklist slices, that is, the component paths of slices that do not need component identification.
As described above, the first code slice contains bytecode information. The component path is read from the bytecode information and then queried in the preset filtering blacklist.
If the component path corresponding to the first code slice is not found in the filtering blacklist, the first code slice is determined not to be a preset blacklist slice; if the component path is found in the filtering blacklist, the first code slice is determined to be a preset blacklist slice.
The component paths include, but are not limited to, a base package path and an application body path.
In this embodiment, because the filtering blacklist stores the component paths corresponding to blacklist slices, querying the component path of the first code slice in the filtering blacklist accurately determines whether the first code slice is a blacklist slice that does not need component identification.
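The blacklist check can be illustrated with a short sketch. The text only says that the component path is queried in the filtering blacklist; treating a stored path as matching everything beneath it is an assumption made here, as are the function name and the example paths.

```python
from typing import Iterable


def is_blacklist_slice(component_path: str, filter_blacklist: Iterable[str]) -> bool:
    """Return True if the slice's component path is covered by the blacklist.

    Assumption: a blacklist entry matches the path itself and any path
    nested under it, compared on '/'-separated segments.
    """
    for entry in filter_blacklist:
        prefix = entry.rstrip("/")
        if component_path == prefix or component_path.startswith(prefix + "/"):
            return True
    return False


# Hypothetical blacklist: a base package path and an application body path.
EXAMPLE_BLACKLIST = ["java", "com/myapp"]
```

Comparing on whole path segments avoids treating `javax/...` as covered by a `java` entry.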
Optionally, after the performing of slice acquisition on the software product to obtain the first code slice of the software product, the method further includes:
performing slice acquisition on the software product again when the first code slice is a preset blacklist slice.
In this embodiment, if the first code slice is a preset blacklist slice, which indicates that component identification is not required for the first code slice, slice acquisition is performed on the software product again to obtain another first code slice.
Optionally, the component feature library comprises a feature base table and a feature increment table, wherein the feature base table stores a mapping relationship between component features and component numbers;
the determining of the component identification result corresponding to the software product according to the preset component feature library and the component features corresponding to the first code slice comprises:
querying the component features corresponding to the first code slice in the feature base table;
when a component number mapped to the component features corresponding to the first code slice is found, determining the component identification result corresponding to the software product according to the feature increment table;
when no component number mapped to the component features corresponding to the first code slice is found, performing slice acquisition on the software product again.
It should be noted that the component feature library includes a feature base table and a feature increment table. The feature base table identifies components using GAV coordinates, which consist of three elements: a vendor identifier (groupId), a component name (artifactId), and a component version (version). Together these form the basic features of a component.
Refer to Table two, which is a representation of the feature base table:
Table two:
As shown in Table two, the feature base table includes a component number, a vendor identifier, a component name, and a version number. As described above, the component features include the vendor identifier and the component name; that is, the feature base table stores the mapping relationship between component features and component numbers.
In this embodiment, the component features corresponding to the first code slice are queried in the feature base table; in other words, the vendor identifier and component name corresponding to the first code slice are queried in the feature base table.
If a component number mapped to the component features corresponding to the first code slice is found in the feature base table, the component associated with the first code slice is a third-party component included in the software product, and the component identification result corresponding to the software product is further determined according to the feature increment table. For how the component identification result is determined according to the feature increment table, refer to the subsequent embodiments.
If no component number mapped to the component features corresponding to the first code slice is found in the feature base table, the component associated with the first code slice is not a third-party component included in the software product, and slice acquisition is performed on the software product again to obtain another first code slice.
In this embodiment, the component feature library comprises a feature base table and a feature increment table; that is, the library stores only the incremental feature information specific to each component version and does not store the redundant feature information shared by multiple versions of the same component, so the library occupies less storage space. When the library is later used for component identification, only the version-specific increment information of each component needs to be compared, which reduces the number of feature matches and the amount of computation and thus improves identification efficiency.
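A minimal sketch of the feature base table lookup is shown below, assuming the table can be modeled as a dictionary keyed by the pair (vendor identifier, component name). The table contents, the numbering, and the function name are illustrative only and are not taken from the patent's Table two.

```python
from typing import Dict, Optional, Tuple

# Hypothetical feature base table: (groupId, artifactId) -> component number.
FEATURE_BASE_TABLE: Dict[Tuple[str, str], int] = {
    ("org.apache.commons", "commons-lang3"): 1,
    ("com.google.guava", "guava"): 2,
}


def lookup_component_number(
    vendor_id: str,
    component_name: str,
    base_table: Dict[Tuple[str, str], int] = FEATURE_BASE_TABLE,
) -> Optional[int]:
    """Return the component number for the features, or None if absent.

    A None result corresponds to the branch in which the slice is not a
    known third-party component and slice acquisition is performed again.
    """
    return base_table.get((vendor_id, component_name))
```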
Optionally, the feature increment table stores a mapping relationship between component numbers and component versions;
the determining of the component identification result corresponding to the software product according to the feature increment table comprises:
querying the component number in the feature increment table;
if a unique component version mapped to the component number is found, determining that the component associated with the component number is a third-party component included in the software product;
if multiple component versions mapped to the component number are found, determining the component identification result corresponding to the software product based on a second code slice; the second code slice is obtained by performing expansion slice processing on the first code slice, and the data volume of the second code slice is larger than that of the first code slice.
It should be noted that the feature increment table stores a mapping relationship between component numbers and component versions.
In this embodiment, after the component number is obtained from the feature base table, the component number is queried in the feature increment table.
If the query in the feature increment table returns a unique component version mapped to the component number, the component associated with the component number is determined to be a third-party component included in the software product. If the query returns multiple component versions mapped to the component number, the third-party component included in the software product needs to be determined further.
Specifically, expansion slice processing is performed on the first code slice to obtain a second code slice, and the component identification result corresponding to the software product is determined based on the second code slice. The data volume of the second code slice is larger than that of the first code slice.
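The branch on the number of versions can be sketched as below. The row shape `(component number, version)`, the status strings, and the `not_found` case are assumptions added for illustration; the text only describes the unique-version and multiple-version outcomes.

```python
from typing import Iterable, List, Tuple


def resolve_version(
    component_number: int,
    increment_rows: Iterable[Tuple[int, str]],
) -> Tuple[str, List[str]]:
    """Classify a component number by how many versions map to it.

    Returns a (status, versions) pair where status is one of:
      "identified"         - exactly one version, identification is complete
      "needs_second_slice" - several versions, an expanded slice is required
      "not_found"          - no rows for this number (case not covered by the text)
    """
    versions = sorted({version for number, version in increment_rows
                       if number == component_number})
    if len(versions) == 1:
        return "identified", versions
    if len(versions) > 1:
        return "needs_second_slice", versions
    return "not_found", versions
```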
Optionally, the determining, based on the second code slice, a component identification result corresponding to the software product includes:
performing expansion slicing processing on the first code slice to obtain the second code slice;
acquiring slice information corresponding to the second code slice; the slice information comprises folders in the data packet to which the second code slice belongs, files in the data packet to which the second code slice belongs, and hash values corresponding to the data packet to which the second code slice belongs;
determining the component identification result corresponding to the software product based on the slice information corresponding to the second code slice and the feature increment table.
In this embodiment, expansion slice processing is performed on the first code slice by taking an expanded slice snapshot: slice acquisition is performed again on the basis of the first code slice to obtain an expanded code slice snapshot, that is, the second code slice. Based on the location of the jar data packet to which the class file in the first code slice belongs, the folders and files contained in the jar data packet and the hash value of the jar data packet are obtained, giving the slice information corresponding to the second code slice.
The slice information corresponding to the second code slice comprises: the folders in the data packet to which the second code slice belongs, namely the folders contained in the jar data packet; the files in the data packet to which the second code slice belongs, namely the files contained in the jar data packet; and the hash value corresponding to the data packet to which the second code slice belongs, namely the hash value of the jar data packet.
Further, the component identification result corresponding to the software product is determined based on the slice information corresponding to the second code slice and the feature increment table.
In this embodiment, both the first code slice and the second code slice are used in the component identification process; that is, the minimum code slice snapshot of the running code is supplemented by the expanded code slice snapshot, which widens the range of component identification. Acquiring the slice information of the second code slice allows component identification to be based on more slice features, which improves the accuracy of the component identification result.
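Collecting the slice information from a jar data packet can be sketched as follows. The `SliceInfo` class and `collect_slice_info` function are hypothetical names, and MD5 is assumed as the hash because the feature increment table is described as holding a file md5 value; the text does not state which hash is applied to the jar itself.

```python
import hashlib
import io
import zipfile
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class SliceInfo:
    folders: FrozenSet[str]   # folders contained in the jar data packet
    files: FrozenSet[str]     # files contained in the jar data packet
    package_md5: str          # hash of the whole jar (MD5 is an assumption)


def collect_slice_info(jar_bytes: bytes) -> SliceInfo:
    """List the folders and files inside a jar and hash the jar itself."""
    folders, files = set(), set()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in jar.namelist():
            is_dir = name.endswith("/")
            parts = name.rstrip("/").split("/")
            dir_parts = parts if is_dir else parts[:-1]
            if not is_dir:
                files.add(name)
            # Record every ancestor folder, since jars may omit directory entries.
            for depth in range(1, len(dir_parts) + 1):
                folders.add("/".join(dir_parts[:depth]))
    return SliceInfo(frozenset(folders), frozenset(files),
                     hashlib.md5(jar_bytes).hexdigest())
```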
Optionally, the feature increment table further stores increment types, wherein the increment types comprise a folder type, a file type and a changed file type;
the determining a component identification result corresponding to the software product based on the slice information corresponding to the second code slice and the feature increment table comprises the following steps:
under the condition that slice information corresponding to the second code slice and the characteristic increment table meet preset conditions, determining that a component associated with the second code slice is a third party component included in the software product;
wherein the preset condition includes any one of the following:
the folder in the data packet to which the second code slice belongs is matched with the folder type;
the file in the data packet to which the second code slice belongs is matched with the file type;
and the hash value corresponding to the data packet to which the second code slice belongs is matched with the changed file type.
The feature increment table includes a component number, an increment type, increment content, and a file md5 value. There are three increment types, namely new folder, new file, and changed file; that is, the increment types stored in the feature increment table include the folder type, the file type, and the changed file type.
Refer to table three, which shows the form of the feature increment table:
table three:
In this embodiment, the slice information corresponding to the second code slice is queried in the feature increment table, and if the slice information corresponding to the second code slice and the feature increment table meet a preset condition, it is determined that the component associated with the second code slice is a third party component included in the software product.
Optionally, the preset condition includes any one of the following:
1. The folder in the data packet to which the second code slice belongs matches the folder type.
Specifically, if a folder recorded in the feature increment table can be obtained by analyzing the second code slice, it is determined that the folder in the data packet to which the second code slice belongs matches the folder type.
2. The file in the data packet to which the second code slice belongs matches the file type.
Specifically, if a file recorded in the feature increment table can be obtained by analyzing the second code slice, it is determined that the file in the data packet to which the second code slice belongs matches the file type.
3. The hash value corresponding to the data packet to which the second code slice belongs matches the changed file type.
Specifically, the hash value corresponding to the data packet to which the second code slice belongs is read, and if it is the same as the hash value in the feature increment table, it is determined that the hash value corresponding to the data packet to which the second code slice belongs matches the changed file type.
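The three preset conditions can be illustrated with a short sketch. It is a hedged example under stated assumptions: `IncrementRow` and `match_increment` are hypothetical names, the string labels used for the increment types are invented for illustration, and the `version` field is an assumption based on the statement that the feature increment table maps component numbers to component versions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class IncrementRow:
    """Hypothetical row of the feature increment table."""
    component_number: str
    version: str          # assumed column: component version for this row
    increment_type: str   # "folder", "file" or "changed_file" (assumed labels)
    content: str          # increment content: a folder path or a file path
    file_md5: str = ""    # file md5 value, used for "changed_file" rows


def match_increment(
    folders: Set[str],
    files: Set[str],
    package_hash: str,
    rows: Iterable[IncrementRow],
) -> Optional[IncrementRow]:
    """Return the first row for which any one of the preset conditions holds."""
    for row in rows:
        # Condition 1: a folder of the data packet matches the folder type.
        if row.increment_type == "folder" and row.content in folders:
            return row
        # Condition 2: a file of the data packet matches the file type.
        if row.increment_type == "file" and row.content in files:
            return row
        # Condition 3: the hash value of the data packet matches the changed file type.
        if row.increment_type == "changed_file" and row.file_md5 == package_hash:
            return row
    return None
```

Returning the first matching row is a simplification chosen for this sketch; the source only states that meeting any one condition is sufficient, and does not say how several matching rows would be ranked.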
To facilitate understanding of the overall technical solution, refer to fig. 3. As shown in fig. 3, slice acquisition is first performed on the software product in step S1 to obtain a minimum slice snapshot, i.e., the first code slice. The component identification path in the first code slice, i.e., feature A, is extracted and compared with the paths in the filtering blacklist. If the filtering blacklist does not include the component identification path, feature A is matched against the component feature library. If feature A matches none of the features in the component feature library, that is, if it is determined through the feature basic table and the feature increment table that the component associated with the first code slice is not a third party component, the process returns to step S1 to acquire a slice again. If it is determined through the component feature library that the component associated with the first code slice is a unique third party component included in the software product, the component identification result corresponding to the software product is generated. If the component associated with the first code slice cannot be determined to be a unique third party component through the component feature library, the slice snapshot range is expanded to obtain the second code slice, and the component feature library is applied to determine whether the component associated with the second code slice is a third party component, so as to generate the component identification result corresponding to the software product.
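The flow of fig. 3 can be summarized in a compact sketch. All names here are hypothetical, and the data shapes are assumptions made for illustration: each first code slice is a dictionary with a `path` and a `feature`, the feature basic table is a mapping from feature to component number, and the version information of the feature increment table is reduced to a mapping from component number to candidate versions. The expansion step is passed in as a callback, and moving on to the next slice when that callback returns nothing is an assumption the source does not spell out.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set, Tuple


def identify_component(
    slices: Iterable[dict],
    blacklist: Set[str],
    basic_table: Dict[str, str],
    versions: Dict[str, List[str]],
    resolve_with_expanded_slice: Callable[[dict, str], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Return (component number, version) for the first identifiable slice."""
    for first_slice in slices:  # step S1: collect a minimum slice snapshot
        if first_slice["path"] in blacklist:
            continue  # blacklisted path: collect a slice again
        number = basic_table.get(first_slice["feature"])
        if number is None:
            continue  # feature not in the feature basic table: back to S1
        candidates = versions.get(number, [])
        if len(candidates) == 1:
            return number, candidates[0]  # unique version: result is final
        # Not unique: expand the slice snapshot and use the second code slice.
        version = resolve_with_expanded_slice(first_slice, number)
        if version is not None:
            return number, version
    return None
```

In practice the callback would collect the slice information of the jar data packet and compare it with the increment rows of the feature increment table; here it is left abstract so that the control flow stays visible.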
In one embodiment, a component recognition device is provided, which corresponds one-to-one to the component recognition method in the above embodiment. As shown in fig. 4, the component recognition apparatus 400 includes a first acquisition module 410, a parsing module 420, and a first determining module 430.
The functional modules are described in detail as follows:
a first acquisition module 410, configured to acquire a slice of a software product, and acquire a first code slice of the software product;
the parsing module 420 is configured to parse the first code slice to obtain component features corresponding to the first code slice if the first code slice is not a preset blacklist slice;
a first determining module 430, configured to determine a component recognition result corresponding to the software product according to a preset component feature library and a component feature corresponding to the first code slice; the component identification result is used to characterize a third party component included in the software product.
Optionally, the component recognition apparatus 400 further includes:
the acquisition module is used for acquiring a component identification path corresponding to the first code slice;
the query module is used for querying the component path corresponding to the first code slice in a preset filtering blacklist;
A third determining module, configured to determine that the first code slice is not a preset blacklist slice if the component path corresponding to the first code slice is not obtained by querying;
a fourth determining module, configured to determine, if the component path corresponding to the first code slice is obtained by querying, that the first code slice is a preset blacklist slice;
wherein the filtering blacklist stores the component paths corresponding to the blacklist slices.
Optionally, the component recognition apparatus 400 further includes:
and the second acquisition module is used for carrying out slice acquisition on the software product again under the condition that the first code slice is a preset blacklist slice.
Optionally, the component feature library comprises a feature basic table and a feature increment table, wherein the feature basic table stores a mapping relation between component features and component numbers;
the first determining module 430 is specifically configured to:
querying component characteristics corresponding to the first code slice in the characteristic basic table;
under the condition that the component numbers with mapping relation to the component features corresponding to the first code slices are obtained through inquiry, determining component identification results corresponding to the software products according to the feature increment table;
And under the condition that the component numbers with the mapping relation to the component features corresponding to the first code slice are not queried, carrying out slice collection on the software product again.
Optionally, the feature increment table stores a mapping relationship between the component number and the component version;
the first determining module 430 is further specifically configured to:
querying the component numbers in the feature increment table;
if the unique component version with the mapping relation with the component number is obtained by inquiry, determining that the component associated with the component number is a third party component included in the software product;
if a plurality of component versions with mapping relation with the component numbers are obtained through inquiry, determining a component identification result corresponding to the software product based on a second code slice; the second code slice is obtained based on the expansion slice processing of the first code slice, and the data volume corresponding to the second code slice is larger than the data volume corresponding to the first code slice.
Optionally, the first determining module 430 is further specifically configured to:
performing expansion slicing processing on the first code slice to obtain the second code slice;
Acquiring slice information corresponding to the second code slice; the slice information comprises folders in the data packet to which the second code slice belongs, files in the data packet to which the second code slice belongs, and hash values corresponding to the data packet to which the second code slice belongs;
and determining a component identification result corresponding to the software product based on the slice information corresponding to the second code slice and the characteristic increment table.
Optionally, the feature increment table further stores increment types, wherein the increment types comprise a folder type, a file type and a changed file type;
the first determining module 430 is further specifically configured to:
under the condition that slice information corresponding to the second code slice and the characteristic increment table meet preset conditions, determining that a component associated with the second code slice is a third party component included in the software product;
wherein the preset condition includes any one of the following:
the folder in the data packet to which the second code slice belongs is matched with the folder type;
the file in the data packet to which the second code slice belongs is matched with the file type;
and the hash value corresponding to the data packet to which the second code slice belongs is matched with the changed file type.
For specific limitations of the component recognition apparatus, reference may be made to the limitations of the component recognition method above, which are not repeated here. Each module in the component recognition apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the electronic device in hardware form, or may be stored in a memory of the electronic device in software form, so that the processor can call them and execute the operations corresponding to each module.
In one embodiment, an electronic device is provided, which may be a server, and the internal structure of which may be as shown in fig. 5. The electronic device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic device includes a readable storage medium, an internal memory. The readable storage medium stores an operating system, computer readable instructions, and a database. The internal memory provides an environment for the execution of an operating system and computer-readable instructions in a readable storage medium. The database of the electronic device is used for storing data related to the component identification method. The network interface of the electronic device is used for communicating with an external terminal through a network connection. The computer readable instructions when executed by a processor implement a component identification method. The readable storage medium provided by the present embodiment includes a nonvolatile readable storage medium and a volatile readable storage medium.
In one embodiment, an electronic device is provided, which may be a terminal device, and an internal structure diagram thereof may be as shown in fig. 5. The electronic device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic device includes a readable storage medium. The readable storage medium stores computer readable instructions. The network interface of the electronic device is used for communicating with an external terminal through a network connection. The computer readable instructions when executed by a processor implement a component identification method. The readable storage medium provided by the present embodiment includes a nonvolatile readable storage medium and a volatile readable storage medium.
In one embodiment, an electronic device is provided that includes a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, the processor implementing the steps of the component identification method described above when executing the computer readable instructions.
In one embodiment, a readable storage medium is provided, the readable storage medium storing computer readable instructions which, when executed by a processor, implement the steps of the component identification method described above. Those skilled in the art will appreciate that all or part of the methods of the above embodiments may be implemented by computer readable instructions instructing the relevant hardware; the instructions may be stored on a non-volatile or volatile readable storage medium and, when executed, may include the flows of the above method embodiments. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.
Claims (6)
1. A component identification method, comprising:
the method comprises the steps of performing slice acquisition on a software product to obtain a first code slice of the software product;
analyzing the first code slice under the condition that the first code slice is not a preset blacklist slice, and obtaining component characteristics corresponding to the first code slice;
Determining a component identification result corresponding to the software product according to a preset component feature library and the component features corresponding to the first code slice; the component identification result is used for characterizing a third party component included in the software product;
the determining the component recognition result corresponding to the software product according to the preset component feature library and the component feature corresponding to the first code slice comprises the following steps:
the component feature library comprises a feature basic table and a feature increment table, wherein the feature basic table stores the mapping relation between component features and component numbers;
the feature increment table stores a mapping relation between numbers and component versions and increment types, wherein the increment types comprise folder types, file types and changed file types;
querying component characteristics corresponding to the first code slice in the characteristic basic table;
under the condition that the component numbers with mapping relation to the component features corresponding to the first code slices are obtained through inquiry, determining component identification results corresponding to the software products according to the feature increment table;
and determining a component identification result corresponding to the software product according to the characteristic increment table, wherein the component identification result comprises:
Querying the component numbers in the feature increment table;
if the unique component version with the mapping relation with the component number is obtained by inquiry, determining that the component associated with the component number is a third party component included in the software product;
if a plurality of component versions with mapping relation with the component numbers are obtained through inquiry, determining a component identification result corresponding to the software product based on a second code slice; the second code slice is obtained based on the expansion slice processing of the first code slice, and the data volume corresponding to the second code slice is larger than the data volume corresponding to the first code slice;
the determining, based on the second code slice, a component identification result corresponding to the software product includes:
performing expansion slicing processing on the first code slice to obtain the second code slice;
acquiring slice information corresponding to the second code slice; the slice information comprises folders in the data packet to which the second code slice belongs, files in the data packet to which the second code slice belongs, and hash values corresponding to the data packet to which the second code slice belongs;
determining a component identification result corresponding to the software product based on the slice information corresponding to the second code slice and the feature increment table;
The determining a component identification result corresponding to the software product based on the slice information corresponding to the second code slice and the feature increment table comprises the following steps:
under the condition that slice information corresponding to the second code slice and the characteristic increment table meet preset conditions, determining that a component associated with the second code slice is a third party component included in the software product;
wherein the preset condition includes any one of the following:
the folder in the data packet to which the second code slice belongs is matched with the folder type;
the file in the data packet to which the second code slice belongs is matched with the file type;
the hash value corresponding to the data packet to which the second code slice belongs is matched with the changed file type;
and under the condition that the component numbers with the mapping relation to the component features corresponding to the first code slice are not queried, carrying out slice collection on the software product again.
2. The method of claim 1, wherein, in the case that the first code slice is not a preset blacklist slice, analyzing the first code slice, and before obtaining the component feature corresponding to the first code slice, the method further comprises:
Acquiring a component identification path corresponding to the first code slice;
inquiring the component path corresponding to the first code slice in a preset filtering blacklist;
if the component path corresponding to the first code slice is not inquired, determining that the first code slice is not a preset blacklist slice;
if the component path corresponding to the first code slice is obtained through inquiry, determining that the first code slice is a preset blacklist slice;
and the filtered blacklist stores component paths corresponding to the blacklist slices.
3. The method of claim 1, wherein the slicing acquisition of the software product, after obtaining the first code slice of the software product, further comprises:
and re-acquiring the slice of the software product under the condition that the first code slice is a preset blacklist slice.
4. A component identification device, comprising:
the first acquisition module is used for carrying out slice acquisition on the software product and acquiring a first code slice of the software product;
the analyzing module is used for analyzing the first code slice to obtain the component characteristics corresponding to the first code slice under the condition that the first code slice is not a preset blacklist slice;
The first determining module is used for determining a component identification result corresponding to the software product according to a preset component feature library and the component features corresponding to the first code slice; the component identification result is used for characterizing a third party component included in the software product;
the determining the component recognition result corresponding to the software product according to the preset component feature library and the component feature corresponding to the first code slice comprises the following steps:
the component feature library comprises a feature basic table and a feature increment table, wherein the feature basic table stores the mapping relation between component features and component numbers;
the feature increment table stores a mapping relation between numbers and component versions and increment types, wherein the increment types comprise folder types, file types and changed file types;
querying component characteristics corresponding to the first code slice in the characteristic basic table;
under the condition that the component numbers with mapping relation to the component features corresponding to the first code slices are obtained through inquiry, determining component identification results corresponding to the software products according to the feature increment table;
and determining a component identification result corresponding to the software product according to the characteristic increment table, wherein the component identification result comprises:
querying the component numbers in the feature increment table;
if the unique component version with the mapping relation with the component number is obtained by inquiry, determining that the component associated with the component number is a third party component included in the software product;
if a plurality of component versions with mapping relation with the component numbers are obtained through inquiry, determining a component identification result corresponding to the software product based on a second code slice; the second code slice is obtained based on the expansion slice processing of the first code slice, and the data volume corresponding to the second code slice is larger than the data volume corresponding to the first code slice;
the determining, based on the second code slice, a component identification result corresponding to the software product includes:
performing expansion slicing processing on the first code slice to obtain the second code slice;
acquiring slice information corresponding to the second code slice; the slice information comprises folders in the data packet to which the second code slice belongs, files in the data packet to which the second code slice belongs, and hash values corresponding to the data packet to which the second code slice belongs;
determining a component identification result corresponding to the software product based on the slice information corresponding to the second code slice and the feature increment table;
the determining a component identification result corresponding to the software product based on the slice information corresponding to the second code slice and the feature increment table comprises the following steps:
under the condition that slice information corresponding to the second code slice and the characteristic increment table meet preset conditions, determining that a component associated with the second code slice is a third party component included in the software product;
wherein the preset condition includes any one of the following:
the folder in the data packet to which the second code slice belongs is matched with the folder type;
the file in the data packet to which the second code slice belongs is matched with the file type;
the hash value corresponding to the data packet to which the second code slice belongs is matched with the changed file type;
and under the condition that the component numbers with the mapping relation to the component features corresponding to the first code slice are not queried, carrying out slice collection on the software product again.
5. An electronic device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the component identification method of any of claims 1 to 3 when the computer program is executed.
6. A computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements the steps of the component identification method according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311725580.6A CN117406967B (en) | 2023-12-15 | 2023-12-15 | Component identification method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117406967A (en) | 2024-01-16 |
CN117406967B (en) | 2024-03-22 |
Family
ID=89500347
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311725580.6A Active CN117406967B (en) | 2023-12-15 | 2023-12-15 | Component identification method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117406967B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108763928A (en) * | 2018-05-03 | 2018-11-06 | 北京邮电大学 | A kind of open source software leak analysis method, apparatus and storage medium |
CN109918285A (en) * | 2018-12-28 | 2019-06-21 | 北京奇安信科技有限公司 | A kind of safety recognizing method and device of open source software |
CN111625839A (en) * | 2020-05-29 | 2020-09-04 | 深圳前海微众银行股份有限公司 | Third-party component vulnerability detection method, device, equipment and computer storage medium |
CN112631586A (en) * | 2020-12-24 | 2021-04-09 | 软通动力信息技术(集团)股份有限公司 | Application development method and device, electronic equipment and storage medium |
CN112800430A (en) * | 2021-02-01 | 2021-05-14 | 苏州棱镜七彩信息科技有限公司 | Safety and compliance management method suitable for open source assembly |
CN113177001A (en) * | 2021-05-24 | 2021-07-27 | 深圳前海微众银行股份有限公司 | Vulnerability detection method and device for open source component |
CN115906104A (en) * | 2023-02-23 | 2023-04-04 | 国网山东省电力公司泰安供电公司 | Safety detection method and device for secondary packaged open-source assembly |
CN116932406A (en) * | 2023-07-27 | 2023-10-24 | 中移动信息技术有限公司 | Component detection method, device, terminal equipment and storage medium |
CN117032782A (en) * | 2023-08-16 | 2023-11-10 | 北京安全共识科技有限公司 | SCA-based software security version identification method, system and processing equipment |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10162612B2 (en) * | 2016-01-04 | 2018-12-25 | Syntel, Inc. | Method and apparatus for inventory analysis |
Non-Patent Citations (1)
Title |
---|
Application of Maven in Enterprise Java Software Products; Li Junjie; Computer Knowledge and Technology; 2011-03-05 (No. 07); pp. 108-111, 134 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |