CN113626100A - Shared library processing method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN113626100A
Authority
CN
China
Prior art keywords
target
file
target file
shared library
files
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110872545.1A
Other languages
Chinese (zh)
Inventor
吴俊洲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202110872545.1A priority Critical patent/CN113626100A/en
Publication of CN113626100A publication Critical patent/CN113626100A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44557 Code layout in executable memory
    • G06F9/44563 Sharing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44521 Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The disclosure relates to a shared library processing method and device, electronic equipment and a storage medium. The method comprises the following steps: determining a first target file set, wherein the first target file set is the set of target files commonly contained in a plurality of first shared library files; determining a second target file set corresponding to each first shared library file, wherein the second target file set is obtained by removing the commonly contained target files from the first shared library file; performing symbol analysis based on each second target file set to obtain a target symbol set, wherein the target symbol set comprises symbols which are not defined by any target file contained in the second target file set; and when a symbol used and defined by a target file contained in the first target file set exists in the target symbol set, determining that the export identifier of the symbol is exportable. In this way, the files of multiple shared libraries are split automatically, redundancy among the shared libraries is avoided, and symbols are exported automatically during the split.

Description

Shared library processing method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of data processing, and in particular, to a shared library processing method and apparatus, an electronic device, and a storage medium.
Background
A computer program is a binary file that runs on top of an operating system. A simple computer program needs only one binary file with an entry function. When the program runs, the operating system loads the binary file and calls its entry function. As computer programs grow more complex, a single binary file becomes impractical: any minor modification requires the entire binary file to be rebuilt, and its code cannot be shared with other computer programs (i.e., one computer program cannot call a function of another computer program). For this reason, a computer program in the related art is generally composed of one binary file with an entry function and a plurality of shared libraries. When a function of the computer program is modified, only the shared library to which that function belongs needs to be rebuilt, rather than all binary files of the computer program. A shared library is also a binary file, also called a shared library file, and is dynamically loaded into memory and executed when the computer program runs.
A computer program (which may be a binary file with an entry function or a shared library file) obtains the address of a method or global variable in a shared library through a symbol and then calls the method or accesses the global variable, because the symbol is mapped to the address information of that method or global variable in the shared library. A shared library contains many functions and global variables, but not all of them need to be used by external computer programs, and the more symbols a shared library exposes, the larger its file. Therefore, when a shared library is built, it is necessary to specify which symbols can be used by external computer programs. This is called symbol export, and only the symbols specified as exportable can be used by external computer programs.
When a shared library is built, the relevant symbols in the stripped common code need to be exported so that other shared libraries can call them. In the related art, a shared library construction tool offers several ways to export symbols:
Mode 1: source-file export. Methods and global variables defined in a program source file (source file for short) are exported by default, and whether a symbol is exportable can also be specified with the associated macro. A symbol designated as exportable can still be changed to non-exportable at compile time or link time, but a symbol designated as non-exportable cannot be changed to exportable at compile time or link time.
Mode 2: compile-time export. The compilation stage compiles a source file to generate a target file (object file). During compilation, a compile option specifies whether the exportable symbols in a source file are exported; the option works at file level and applies to all exportable symbols in the source file. Symbols marked as exportable in the source file can all be changed to non-exportable by the compile option, but symbols marked as non-exportable in the source file cannot be changed to exportable.
Mode 3: link-time export. The link stage links a plurality of target files to generate a shared library file. During linking, a link option specifies whether the exportable symbols are finally exported. A symbol that is exportable in a target file generated at compile time can be changed to non-exportable by a link option, but a symbol that is already non-exportable in the target file cannot be changed to exportable.
As can be seen from the above, exporting symbols with the shared library construction tool requires a large number of manual operations, which easily introduces errors and is hard to maintain. In addition, the construction tool does not allow a symbol that has been specified as non-exportable in the source file, at compile time, or at link time to be exported, so some necessary symbols may fail to be exported. At the same time, more symbols than necessary may be exported from the shared library, which increases its size.
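The one-way restriction shared by the three modes can be illustrated with a minimal sketch. It assumes that each stage contributes a single yes/no decision per symbol; the function name `final_exportable` and its parameters are hypothetical and do not correspond to any option of a real construction tool.

```python
def final_exportable(source: bool, compile_time: bool, link_time: bool) -> bool:
    """Model of the restriction: a later stage can only withdraw an export.

    Each argument says whether that stage leaves the symbol exportable.
    Once any stage says "no", no later stage can turn it back to "yes",
    so the final result is the logical AND of the three decisions.
    """
    return source and compile_time and link_time


# A symbol hidden in the source file stays hidden whatever later stages say.
hidden_in_source = final_exportable(False, True, True)
# A symbol exported in the source file can still be hidden at link time.
hidden_at_link = final_exportable(True, True, False)
```

Under this model, a symbol needed by another shared library but hidden at any earlier stage cannot be exported, which is the limitation the disclosure addresses.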
Disclosure of Invention
The present disclosure provides a shared library processing method, apparatus, electronic device and storage medium, to at least solve the problems in the related art that necessary symbols cannot be exported and that the symbols exported by a shared library are redundant. The technical scheme of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a shared library processing method, including:
determining a first target file set, wherein the first target file set is the set of target files commonly contained in a plurality of first shared library files, each first shared library file contains a plurality of target files, the target files are obtained by compiling source files, each target file contains the symbols it uses and export identifiers of the symbols, and an export identifier identifies whether the symbol is exportable;
determining a second target file set corresponding to each first shared library file, wherein the second target file set is obtained by removing the commonly contained target files from the first shared library file;
performing symbol analysis based on each second target file set to obtain a target symbol set, wherein the target symbol set comprises symbols which are not defined by any target file contained in the second target file set;
when a symbol used and defined by a target file contained in the first target file set exists in the target symbol set, determining that the export identifier of the symbol is exportable.
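The four steps of the first aspect can be sketched end to end with plain set operations. This is a minimal illustration, not the claimed implementation: it assumes each shared library is already described by the names of its target files, and each target file by the symbols it defines and the symbols it uses without defining. The function name `plan_exports` and the data layout are hypothetical.

```python
from typing import Dict, Set, Tuple


def plan_exports(
    libraries: Dict[str, Set[str]],
    defined: Dict[str, Set[str]],
    undefined: Dict[str, Set[str]],
) -> Tuple[Set[str], Dict[str, Set[str]], Dict[str, Set[str]]]:
    # Step 1: target files contained in every library (first target file set).
    common = set.intersection(*libraries.values())
    # Step 2: what remains of each library (second target file sets).
    remainders = {lib: files - common for lib, files in libraries.items()}
    # Step 3: symbols that a remainder uses but does not define, merged
    # over all remainders (target symbol set).
    target_symbols: Set[str] = set()
    for files in remainders.values():
        defs = set().union(*(defined[f] for f in files))
        undefs = set().union(*(undefined[f] for f in files))
        target_symbols |= undefs - defs
    # Step 4: symbols defined in the common files that a remainder needs;
    # these are the ones to mark as exportable.
    exports = {f: defined[f] & target_symbols for f in common}
    return common, remainders, exports
```

With two libraries that both contain `util.o`, where `util.o` defines `log` and `helper` and only `log` is used by the remaining files, the sketch marks `log` for export and leaves `helper` alone.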
In a possible implementation manner, the step of performing symbol analysis based on each second target file set to obtain a target symbol set includes:
performing symbol analysis on the second target file set to obtain a first candidate symbol set corresponding to the second target file set, wherein the first candidate symbol set comprises symbols which are not defined by any target file contained in the second target file set;
and merging the first candidate symbol sets corresponding to the respective second target file sets to obtain the target symbol set.
In a possible implementation manner, the step of performing symbol analysis on the second target file set to obtain a first candidate symbol set corresponding to the second target file set includes:
for the target files contained in the second target file set, obtaining the symbols defined by each target file to form a second candidate symbol set, and obtaining the symbols that each target file uses but does not define to form a third candidate symbol set;
and taking the difference of the third candidate symbol set and the second candidate symbol set to obtain the first candidate symbol set.
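A minimal sketch of this difference-set step follows. It assumes, as an illustration only, that the symbols of each target file are available as text in the default output format of the `nm` tool, where an undefined symbol is listed with type `U` and a defined global symbol with another upper-case type letter. The disclosure does not name a tool; the function names and sample data are hypothetical, and local and weak symbols are ignored for brevity.

```python
from typing import Dict, Set, Tuple


def parse_nm(text: str) -> Tuple[Set[str], Set[str]]:
    """Split nm-style lines into (defined, undefined) global symbols."""
    defined, undefined = set(), set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank lines and "file.o:" headers
        kind, name = parts[-2], parts[-1]
        if kind == "U":
            undefined.add(name)
        elif kind.isupper():
            defined.add(name)  # lower-case kinds (local symbols) are skipped
    return defined, undefined


def first_candidate_set(nm_outputs: Dict[str, str]) -> Set[str]:
    """Symbols used by a second target file set but defined by none of its files."""
    second, third = set(), set()  # second/third candidate symbol sets
    for text in nm_outputs.values():
        defined, undefined = parse_nm(text)
        second |= defined
        third |= undefined
    return third - second


SAMPLE = {
    "x1.o": "0000000000000000 T run_x\n                 U log\n                 U helper_x\n",
    "x2.o": "0000000000000000 T helper_x\n                 U printf\n",
}
external = first_candidate_set(SAMPLE)
```

In the sample, `helper_x` is undefined in `x1.o` but defined in `x2.o`, so the difference removes it and only `log` and `printf` remain as external symbols.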
In a possible embodiment, the step of determining the first set of target files includes:
parsing, from each first shared library file, the target files contained in that first shared library file to obtain a third target file set corresponding to the first shared library file;
and taking the intersection of all the third target file sets to obtain the first target file set.
In a possible implementation manner, the first shared library file is generated by linking a plurality of input files, the plurality of input files include target files and/or static library files, each static library file contains a plurality of target files, and the step of parsing, from the first shared library file, the target files contained in the first shared library file to obtain the third target file set corresponding to the first shared library file includes:
traversing the plurality of input files; when an input file is a target file, adding the input file to the third target file set; and when an input file is a static library file, parsing out the target files contained in the static library file and adding them to the third target file set.
In one possible embodiment, the method further comprises:
based on the export result, linking the target files contained in the first target file set to generate a second shared library file, and linking the target files contained in each second target file set to generate a third shared library file, wherein the third shared library file depends on the second shared library file.
According to a second aspect of the embodiments of the present disclosure, there is provided a shared library processing apparatus including:
a first determining unit, configured to determine a first target file set, where the first target file set is the set of target files commonly contained in a plurality of first shared library files, each first shared library file contains a plurality of target files, the target files are obtained by compiling source files, each target file contains the symbols it uses and export identifiers of the symbols, and an export identifier identifies whether the symbol is exportable;
a second determining unit, configured to perform determination of a second target file set corresponding to each of the first shared library files, where the second target file set is obtained by removing the target files contained in common from the first shared library files;
a symbol analysis unit configured to perform symbol analysis based on each of the second target file sets to obtain a target symbol set, where the target symbol set includes symbols that are not defined by any of the target files included in the second target file set;
a third determination unit configured to perform, when a symbol used and defined by the target file included in the first target file set exists in the target symbol set, determining that an export identifier of the symbol is exportable.
In a possible implementation, the symbol analysis unit is specifically configured to perform:
performing symbol analysis on the second target file set to obtain a first candidate symbol set corresponding to the second target file set, wherein the first candidate symbol set comprises symbols which are not defined by any target file contained in the second target file set;
and merging the first candidate symbol sets corresponding to the respective second target file sets to obtain the target symbol set.
In a possible implementation, the symbol analysis unit is specifically configured to perform:
for the target files contained in the second target file set, obtaining the symbols defined by each target file to form a second candidate symbol set, and obtaining the symbols that each target file uses but does not define to form a third candidate symbol set;
and taking the difference of the third candidate symbol set and the second candidate symbol set to obtain the first candidate symbol set.
In a possible implementation, the first determining unit is specifically configured to perform:
parsing, from each first shared library file, the target files contained in that first shared library file to obtain a third target file set corresponding to the first shared library file;
and taking the intersection of all the third target file sets to obtain the first target file set.
In a possible implementation manner, the first shared library file is generated by linking a plurality of input files, the plurality of input files include target files and/or static library files, each static library file contains a plurality of target files, and the first determining unit is specifically configured to perform:
traversing the plurality of input files; when an input file is a target file, adding the input file to the third target file set; and when an input file is a static library file, parsing out the target files contained in the static library file and adding them to the third target file set.
In a possible embodiment, the apparatus further comprises:
a link generating unit, configured to, based on the export result, link the target files contained in the first target file set to generate a second shared library file, and link the target files contained in each second target file set to generate a third shared library file, wherein the third shared library file depends on the second shared library file.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute instructions to implement the shared library processing method as in any of the first aspects.
According to a fourth aspect of embodiments of the present disclosure, there is provided a storage medium having instructions that, when executed by a processor of an electronic device, enable the electronic device to perform the shared library processing method as in any one of the first aspects.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising computer instructions which, when executed by an electronic device, implement the shared library processing method as in any one of the first aspects.
The technical scheme provided by the embodiments of the present disclosure brings at least the following beneficial effects. First, a first target file set is determined. The first target file set is the set of target files commonly contained in a plurality of first shared library files and can serve as common code, so the common code is extracted automatically. Second, a second target file set corresponding to each first shared library file is determined. The second target file set is obtained by removing the commonly contained target files, that is, it is the target file set with the redundancy removed, so the common code is stripped automatically. Third, symbol analysis is performed based on each second target file set to obtain a target symbol set. The target symbol set contains symbols that are not defined by any target file contained in the second target file set; these are external symbols for the second target file set and may come from the stripped common code, namely the first target file set. For a target file in the first target file set, if a symbol used and defined in that target file exists in the target symbol set, the symbol may be an external symbol used by a second target file set and therefore needs to be exported. In this way, the symbols that the common code needs to export are obtained, and the symbols of the common code are exported automatically. The scheme therefore splits the files of a plurality of shared libraries automatically, avoids redundancy among the shared libraries, improves processing efficiency, accuracy and maintainability, and exports symbols automatically during the split.
Compared with the schemes in the related art, this scheme provides logic for automatically exporting the symbols that need to be exported. It avoids the problem in the related art that the shared library construction tool prevents symbols specified as non-exportable in the source file, at compile time, or at link time from being exported, so that some required symbols cannot be exported. In addition, automatic symbol export identifies the necessary symbols more accurately, avoids the redundancy caused by exporting too many symbols, helps the dependencies between the linked libraries to be established correctly and completely afterwards, and ensures that the shared libraries run properly.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a flow diagram illustrating a shared library processing method in accordance with an exemplary embodiment.
FIG. 2 is a block diagram illustrating a shared library according to an exemplary embodiment.
FIG. 3 is a schematic diagram illustrating a symbol analysis in accordance with an exemplary embodiment.
FIG. 4 is a block diagram illustrating a shared library processing apparatus according to an example embodiment.
FIG. 5 is a block diagram illustrating an apparatus in accordance with an example embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flowchart illustrating a shared library processing method according to an exemplary embodiment. As shown in Fig. 1, the method is used in an electronic device such as a computer and includes the following steps.
In step S11, a first target file set is determined, where the first target file set is the set of target files commonly contained in a plurality of first shared library files, each first shared library file contains a plurality of target files, the target files are obtained by compiling source files, each target file contains the symbols it uses and export identifiers of the symbols, and an export identifier identifies whether the symbol is exportable.
In practical applications, the plurality of first shared library files are redundant shared library files to be split. Each first shared library file contains a plurality of target files obtained by compiling source files. For example, the first shared library file F1 contains the target files f1 and f2, and the first shared library file F2 contains the target files f1 and f3, so both F1 and F2 contain the target file f1. This step determines all target files that cause redundancy, namely the first target file set; these target files can serve as common code, so the common code is extracted. In the example, the target file f1, which is commonly contained in the two first shared library files, is an element of the first target file set.
In step S12, a second target file set corresponding to each first shared library file is determined, where the second target file set is obtained by removing the commonly contained target files from the first shared library file.
The second target file set in this step is obtained by removing the target files commonly contained in the first shared library files, that is, it no longer contains the target files that cause redundancy, so the common code of the first shared library files is stripped. Continuing the example of F1 and F2, the target file f1 belongs to the first target file set; after removal, neither the second target file set corresponding to F1 nor the one corresponding to F2 contains the target file f1.
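The two-library example of steps S11 and S12 can be worked through directly with sets. In this small sketch the shared library files are labelled `F1` and `F2`, and the target files are written in lower case (`f1`, `f2`, `f3`) purely to keep the two kinds of name apart; the variable names are illustrative.

```python
# Target files contained in each first shared library file.
libraries = {
    "F1": {"f1", "f2"},
    "F2": {"f1", "f3"},
}

# Step S11: target files common to all libraries (first target file set).
first_set = set.intersection(*libraries.values())

# Step S12: remove the common target files from each library
# (second target file sets).
second_sets = {name: files - first_set for name, files in libraries.items()}
```

Here `first_set` holds only `f1`, and the second target file sets for `F1` and `F2` hold `f2` and `f3` respectively.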
In step S13, a symbol analysis is performed based on each second target file set to obtain a target symbol set, where the target symbol set includes symbols that are not defined by any target file included in the second target file set.
In practical applications, when a shared library is split, the relevant symbols in the stripped common code need to be exported so that other shared libraries can call them; the symbols therefore need to be analyzed. For a second target file set, a symbol that is used in the set and defined by one of its target files may be called an internal symbol of the set. A symbol that is used in the set but not defined by any of its target files comes from outside the set and is called an external symbol of the set. For the second target file sets, these external symbols may come from the stripped common code, i.e., the first target file set.
In step S14, when a symbol used and defined by a target file contained in the first target file set exists in the target symbol set, it is determined that the export identifier of the symbol is exportable.
For a target file in the first target file set, if a symbol used and defined in that target file exists in the target symbol set, the symbol may be an external symbol used by a second target file set and therefore needs to be exported. In this way, the symbols that the common code needs to export are obtained.
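Step S14 can be sketched as an update of per-symbol export identifiers. The sketch assumes each target file of the first target file set carries a mapping from its defined symbols to a boolean export identifier; the function name `mark_exportable` is hypothetical. The text does not say what happens to symbols outside the target symbol set, so leaving their identifiers unchanged is an assumption of this sketch.

```python
from typing import Dict, Set


def mark_exportable(export_flags: Dict[str, bool], target_symbols: Set[str]) -> Dict[str, bool]:
    """Return new export identifiers for one target file of the common code.

    A symbol found in the target symbol set becomes exportable, even if it
    was previously marked non-exportable. Other symbols keep their current
    identifier (an assumption, not stated in the text).
    """
    return {
        symbol: (flag or symbol in target_symbols)
        for symbol, flag in export_flags.items()
    }


flags = {"log": False, "helper": False, "init": True}
updated = mark_exportable(flags, {"log", "printf"})
```

In this example `log` becomes exportable because a second target file set needs it, `helper` stays non-exportable, and `printf` is ignored because the common code does not define it.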
In this embodiment, first, a first target file set is determined. The first target file set is the set of target files commonly contained in a plurality of first shared library files and can serve as common code, so the common code is extracted automatically. Second, a second target file set corresponding to each first shared library file is determined. The second target file set is obtained by removing the commonly contained target files, that is, it is the target file set with the redundancy removed, so the common code is stripped automatically. Then, symbol analysis is performed based on each second target file set to obtain a target symbol set. The target symbol set contains symbols that are not defined by any target file contained in the second target file set; these are external symbols for the second target file set and may come from the stripped common code, namely the first target file set. For a target file in the first target file set, if a symbol used and defined in that target file exists in the target symbol set, the symbol may be an external symbol used by a second target file set and therefore needs to be exported. In this way, the symbols that the common code needs to export are obtained, and the symbols of the common code are exported automatically. The scheme therefore splits the files of a plurality of shared libraries automatically, avoids redundancy among the shared libraries, improves processing efficiency, accuracy and maintainability, and exports symbols automatically during the split.
Compared with the schemes in the related art, this scheme provides logic for automatically exporting the symbols that need to be exported. It avoids the problem in the related art that the shared library construction tool prevents symbols specified as non-exportable in the source file, at compile time, or at link time from being exported, so that some required symbols cannot be exported. In addition, automatic symbol export identifies the necessary symbols more accurately, avoids the redundancy caused by exporting too many symbols, helps the dependencies between the linked libraries to be established correctly and completely afterwards, and ensures that the shared libraries run properly.
In an exemplary embodiment, the shared library processing method may further include: based on the export result, linking the target files contained in the first target file set to generate a second shared library file, and linking the target files contained in each second target file set to generate a third shared library file, where the third shared library file depends on the second shared library file.
Each resulting third shared library file is a new shared library file with the redundancy removed, the second shared library file is the external common shared library file for the third shared library files, and each third shared library file can call the target files in the common shared library file.
In this embodiment, based on the export result, the target files contained in the first target file set are linked into a second shared library file that serves as the common shared library file, and the target files contained in each second target file set are linked into a third shared library file with the redundancy removed. This yields smaller shared library files, reduces redundancy, and reduces the memory consumed while the computer program runs.
In practical applications, linking can be performed by a linker, and at link time the third shared library file can be specified to depend on the second shared library file. The linker analyzes the symbols used in the target files of the target file set. If a used symbol cannot be found in any target file of the set, the linker searches for it in the external shared library files specified at link time; if the symbol cannot be found there either, the link may fail.
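The lookup order described for the linker can be simulated in a few lines. This is a simplified model and not the behaviour of any particular linker: it assumes the symbols defined by the library's own target files and the symbols exported by each dependency are given as sets. The names `resolve_symbols` and `LinkError` are hypothetical.

```python
from typing import Dict, Set


class LinkError(Exception):
    """Raised when a used symbol cannot be resolved (hypothetical name)."""


def resolve_symbols(
    used: Set[str],
    own_defined: Set[str],
    dependencies: Dict[str, Set[str]],
) -> Dict[str, str]:
    """Map each used symbol to where it was found.

    Symbols are looked up first in the library's own target files, then in
    the exported symbols of each dependency, in the order given.
    """
    resolved = {}
    for symbol in sorted(used):
        if symbol in own_defined:
            resolved[symbol] = "<self>"
            continue
        for lib_name, exported in dependencies.items():
            if symbol in exported:
                resolved[symbol] = lib_name
                break
        else:
            # No dependency exports the symbol: the simulated link fails.
            raise LinkError(f"undefined symbol: {symbol}")
    return resolved
```

The simulation shows why the export step matters: if the common shared library file defines `log` but does not export it, `log` is missing from that dependency's exported set and the lookup for a third shared library file that uses `log` ends in `LinkError`.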
In an exemplary embodiment, a specific implementation manner of the step of determining the first target file set may include: analyzing each target file contained in the first shared library file from the first shared library file to obtain a third target file set corresponding to the first shared library file; and taking intersection of all the third target file sets to obtain a first target file set.
In practical application, for a plurality of first shared library files, the third target file set corresponding to each first shared library file can be obtained independently and the intersection then taken. The intersection is the part shared by the first shared library files; if it is not empty, the shared library files contain redundant target files. In this way all redundant target files, that is, the common code of the plurality of first shared library files, can be found quickly.
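The intersection step can be sketched as follows. This is a minimal illustration that assumes each target file is identified by a string key (here its file name; a content hash would serve equally well); the function and library names are hypothetical.

```python
from typing import Dict, Set


def common_target_files(per_library: Dict[str, Set[str]]) -> Set[str]:
    """Return the target files contained in every library's target file set."""
    sets = list(per_library.values())
    if not sets:
        return set()
    # set.intersection accepts any number of sets, so more than two
    # first shared library files are handled the same way.
    return set.intersection(*sets)


third_sets = {
    "libx.so": {"x1.o", "log.o", "str.o"},
    "liby.so": {"y1.o", "log.o", "str.o"},
}
first_set = common_target_files(third_sets)
print(sorted(first_set))
```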
Of course, the first target file set may also be obtained in other manners. For example, the following processing may be performed for each first shared library file: for each target file in the first shared library file, judging whether the target file exists in the other first shared library files, and if so, adding the target file to the first target file set.
In an exemplary embodiment, the first shared library file is generated by linking a plurality of input files. The plurality of input files may include target files and/or static library files. A static library file may contain a plurality of target files.
In practical applications, another intermediate product appears when constructing a shared library: the static library, whose suffix may be ".a". A static library is only a container that packages a plurality of target files into one file, called a static library file. During linking, static library files and target files can be linked together to generate the final shared library file, which makes linking more convenient. Therefore, an input file for generating a shared library may be a target file or a static library file.
For example, as shown in fig. 2, the input files for generating the shared library file liba.so include not only the object files a.o and b.o but also the static library files c.a and d.a, where c.a packages the object files c1.o and c2.o, and d.a packages the object files d1.o and d2.o.
Correspondingly, the step of analyzing each target file included in the first shared library file from the first shared library file to obtain a third target file set corresponding to the first shared library file may specifically include: and traversing each input file in the plurality of input files, adding the input file to a third target file set when the input file is a target file, and analyzing the target file contained in the static library file and adding the target file to the third target file set when the input file is a static library file.
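The traversal can be sketched with the fig. 2 example. In this illustration the contents of each static library are supplied as an in-memory mapping, which is an assumption of the sketch; in a real build the member list might instead be read from a tool such as `ar t`. The function name is hypothetical.

```python
from typing import Dict, Iterable, List, Set


def collect_target_files(inputs: Iterable[str],
                         archive_members: Dict[str, List[str]]) -> Set[str]:
    """Flatten linker inputs into the set of target files they contribute."""
    result: Set[str] = set()
    for name in inputs:
        if name.endswith(".o"):
            result.add(name)                      # a target file goes in directly
        elif name.endswith(".a"):
            result.update(archive_members[name])  # a static library is unpacked
        else:
            raise ValueError(f"unsupported input file: {name}")
    return result


# Inputs of liba.so as described for fig. 2.
members = {"c.a": ["c1.o", "c2.o"], "d.a": ["d1.o", "d2.o"]}
third_set = collect_target_files(["a.o", "b.o", "c.a", "d.a"], members)
print(sorted(third_set))
```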
In an exemplary embodiment, the step of performing symbol analysis based on each second target file set to obtain a target symbol set may specifically include: performing symbol analysis on the second target file set to obtain a first candidate symbol set corresponding to the second target file set, wherein the first candidate symbol set comprises symbols which are not defined by any target file contained in the second target file set; and merging the first candidate symbol sets corresponding to the second target file sets to obtain the target symbol set. In practical application, each second target file set may be analyzed separately to obtain its external symbols, and the results merged; in this way all external symbols that may come from the common code, that is, the target symbol set, can be obtained quickly.
In an exemplary embodiment, the step of performing symbol analysis on the second target file set to obtain a first candidate symbol set corresponding to the second target file set may specifically include: for each target file contained in the second target file set, obtaining the symbols defined by the target file to obtain a second candidate symbol set, and obtaining the symbols used but not defined by the target file to obtain a third candidate symbol set; and taking the difference set of the third candidate symbol set and the second candidate symbol set to obtain the first candidate symbol set.
The difference set between the third candidate symbol set and the second candidate symbol set is a difference set obtained by subtracting the second candidate symbol set from the third candidate symbol set.
In practical applications, consider the second target file set as a whole. If a symbol that one target file uses but does not define is defined by another target file in the set, the symbol is still an internal symbol of the set. If a symbol that a target file uses but does not define cannot be found anywhere in the set, that is, the whole set leaves it undefined, the symbol is an external symbol of the set. The first candidate symbol set, obtained by subtracting the second candidate symbol set from the third candidate symbol set, therefore contains exactly the external symbols of the set, and it can be computed quickly and accurately.
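A minimal sketch of this difference-set computation follows. The `ObjectFile` record and the symbol names are hypothetical and only stand in for the symbol information that a real target file carries.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class ObjectFile:
    name: str
    defined: FrozenSet[str] = frozenset()    # symbols the file defines
    undefined: FrozenSet[str] = frozenset()  # symbols it uses but does not define


def external_symbols(objects: Iterable[ObjectFile]) -> Set[str]:
    """First candidate set = third candidate set minus second candidate set."""
    second: Set[str] = set()  # everything defined somewhere in the set
    third: Set[str] = set()   # everything used but undefined in some file
    for obj in objects:
        second |= obj.defined
        third |= obj.undefined
    return third - second


files = [
    ObjectFile("main.o", frozenset({"run"}), frozenset({"helper", "log_msg"})),
    ObjectFile("util.o", frozenset({"helper"}), frozenset({"log_msg", "malloc"})),
]
print(sorted(external_symbols(files)))
```

Here `helper` is used by main.o but defined by util.o, so it stays internal to the set, while `log_msg` and `malloc` are external.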
Of course, the first candidate symbol set may also be obtained in other manners. For example, all symbols of each target file contained in the second target file set are collected to obtain a fourth candidate symbol set, the symbols defined in each target file are collected to obtain a fifth candidate symbol set, and the fifth candidate symbol set is subtracted from the fourth candidate symbol set to obtain the first candidate symbol set. Subtracting the defined symbols from all symbols leaves only the undefined symbols, i.e. the external symbols.
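The two ways of obtaining the first candidate symbol set give the same result, because the set of all symbols is the union of the defined and the undefined symbols. The following sketch checks this on a small hypothetical example in which each target file is represented as a pair (defined symbols, used-but-undefined symbols).

```python
from typing import Iterable, Set, Tuple

SymbolTable = Tuple[Set[str], Set[str]]  # (defined, used but undefined)


def external_from_undefined(files: Iterable[SymbolTable]) -> Set[str]:
    """Third candidate set minus second candidate set."""
    defined: Set[str] = set()
    undefined: Set[str] = set()
    for defs, undefs in files:
        defined |= defs
        undefined |= undefs
    return undefined - defined


def external_from_all(files: Iterable[SymbolTable]) -> Set[str]:
    """Fourth candidate set (all symbols) minus fifth candidate set (defined)."""
    all_symbols: Set[str] = set()
    defined: Set[str] = set()
    for defs, undefs in files:
        all_symbols |= defs | undefs
        defined |= defs
    return all_symbols - defined


sample = [({"run"}, {"helper", "log_msg"}), ({"helper"}, {"log_msg", "malloc"})]
print(external_from_all(sample) == external_from_undefined(sample))
```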
In an exemplary embodiment, the step of performing symbol analysis based on each second target file set to obtain a target symbol set may specifically include: for each target file included in all the second target file sets, obtaining symbols of each target file to obtain a sixth candidate symbol set, obtaining each defined symbol included in each target file to obtain a seventh candidate symbol set, and obtaining a difference set (i.e., a difference set obtained by subtracting the seventh candidate symbol set from the sixth candidate symbol set) between the sixth candidate symbol set and the seventh candidate symbol set to obtain a target symbol set. Thus, all symbols defined in all the second set of target files are subtracted, and all the rest are undefined symbols, i.e. external symbols.
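This variant, which pools all second target file sets before subtracting, can be sketched as below. The data layout and all names are assumptions of the sketch.

```python
from typing import Iterable, Set, Tuple

SymbolTable = Tuple[Set[str], Set[str]]  # (defined, used but undefined)


def target_symbols_pooled(second_sets: Iterable[Iterable[SymbolTable]]) -> Set[str]:
    """Sixth candidate set (all symbols) minus seventh candidate set (defined)."""
    all_symbols: Set[str] = set()
    defined: Set[str] = set()
    for file_set in second_sets:
        for defs, undefs in file_set:
            all_symbols |= defs | undefs
            defined |= defs
    return all_symbols - defined


set_a = [({"a_main"}, {"common_fn", "b_util"})]
set_b = [({"b_util"}, {"common_fn", "malloc"})]
print(sorted(target_symbols_pooled([set_a, set_b])))
```

In this example `b_util` is used by one second target file set and defined in the other, so the pooled variant removes it, whereas merging per-set results would keep it. As long as a symbol defined in the first target file set is not also defined in a second target file set, the symbols finally marked in the first target file set are the same under both variants.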
A shared library processing method provided in the embodiment of the present disclosure is described in more detail below by taking a split scenario of two first shared library files as an example.
The technical scheme of the embodiment comprises two parts: common code lookup and symbol export for the common code.
First part, common code lookup.
In practical application, the basic construction process of the adopted shared library file is as follows:
First, a source file (e.g., a C or C++ source file) is compiled to generate a target file with the suffix ".o". The target file is in the Executable and Linkable Format (ELF), which is widely used on Portable Operating System Interface (POSIX) systems; besides target files, executable binary files and shared library files can also use this file format.
Then, the plurality of target files are linked to generate a shared library file with the suffix ".so". During linking, the linker analyzes the symbols used in the target files; if a used symbol cannot be found in any of the target files, the linker searches for it in the external shared libraries specified at link time, and if it cannot be found there either, linking fails.
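The symbols that the linker resolves can be inspected in a target file with a tool such as `nm`. The sketch below parses text in the default `nm` layout (optional value, one-letter type, name) into defined and undefined symbol sets. It is a simplification made for illustration: only type `U` is treated as undefined, other uppercase types are treated as globally defined, and lowercase (local or weak-undefined) entries are ignored. The sample text and the function name are hypothetical.

```python
from typing import Set, Tuple


def parse_nm_output(text: str) -> Tuple[Set[str], Set[str]]:
    """Split nm-style lines into (defined global symbols, undefined symbols)."""
    defined: Set[str] = set()
    undefined: Set[str] = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or unrecognised line
        sym_type, name = parts[-2], parts[-1]
        if sym_type == "U":
            undefined.add(name)
        elif sym_type.isupper():
            defined.add(name)
    return defined, undefined


SAMPLE = """\
0000000000000000 T encode
                 U memcpy
0000000000000010 t helper
0000000000000000 D table
"""
defined_syms, undefined_syms = parse_nm_output(SAMPLE)
print(sorted(defined_syms), sorted(undefined_syms))
```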
In this embodiment, two first shared library files, libx.so and liby.so, are obtained through the basic construction process described above. If the two first shared library files contain redundancy, they are split, and the common code is looked up as follows:
First, traverse the input file set (namely, the plurality of input files) of the first shared library file libx.so. If an input file is a target file, put it directly into target file set A (namely the third target file set); if an input file is a static library file, parse the static library file to obtain the target files it contains and put them into target file set A.
Second, traverse the input file set of the first shared library file liby.so. If an input file is a target file, put it directly into target file set B (namely the third target file set); if an input file is a static library file, parse the static library file to obtain the target files it contains and put them into target file set B.
Third, take the intersection of target file set A and target file set B to obtain target file set C (namely the first target file set); the target files in target file set C are used to link the public shared library libz.so.
Fourth, take the difference set of target file set A and target file set C to form a new target file set A' (namely the second target file set); the target files in target file set A' are used to link the new shared library libx.so (namely the third shared library file) from which the common code has been stripped, and the new libx.so depends on libz.so.
Fifth, take the difference set of target file set B and target file set C to form a new target file set B' (namely the second target file set); the target files in target file set B' are used to link the new shared library liby.so (namely the third shared library file) from which the common code has been stripped, and the new liby.so depends on libz.so.
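The five steps above can be sketched end to end as follows. The input file names and the in-memory mapping of static library contents are hypothetical; only the set names A, B, C, A' and B' follow the description.

```python
from typing import Dict, Iterable, List, Set, Tuple


def expand_inputs(inputs: Iterable[str],
                  archive_members: Dict[str, List[str]]) -> Set[str]:
    """Steps 1 and 2: flatten target files and static libraries into one set."""
    objects: Set[str] = set()
    for name in inputs:
        if name.endswith(".a"):
            objects.update(archive_members[name])
        else:
            objects.add(name)
    return objects


def split_common(inputs_x: Iterable[str],
                 inputs_y: Iterable[str],
                 archive_members: Dict[str, List[str]]
                 ) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (C, A', B') for two first shared library files."""
    set_a = expand_inputs(inputs_x, archive_members)
    set_b = expand_inputs(inputs_y, archive_members)
    set_c = set_a & set_b                          # step 3: common target files
    return set_c, set_a - set_c, set_b - set_c     # steps 4 and 5


archives = {"util.a": ["log.o", "str.o"]}
c, a_prime, b_prime = split_common(["x1.o", "x2.o", "util.a"],
                                   ["y1.o", "x2.o", "util.a"],
                                   archives)
print(sorted(c), sorted(a_prime), sorted(b_prime))
```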
Second part, symbol export for the common code.
The target file stores information on the symbols it defines internally and on the external symbols it uses but does not define. In addition, the target file stores, for each symbol, an identifier of whether the symbol can be exported, i.e. the export identifier (the derived identifier described above). Taking the two first shared library files libx.so and liby.so above as an example again, the symbol export flow for the common code is as follows:
First, extract the external symbols of the target file set A'. Traverse each target file in the target file set A'. As shown in FIG. 3, extract the symbols defined by the target file and put them into the symbol set A1 (i.e. the second candidate symbol set), where s1, s2, s3, s4 are the elements of A1; extract the symbols used but not defined by the target file and put them into the symbol set A2 (i.e. the third candidate symbol set), where s3, s4, s5, s6 are the elements of A2. If a symbol used but not defined by one target file is defined by another target file in the target file set A', the symbol is still an internal symbol of the target file set A' as a whole (i.e., the symbol exists in both A1 and A2); if a symbol used but not defined by a target file is not found in the symbol set A1, the symbol is an external symbol of the target file set A' as a whole. The difference set A3, obtained by subtracting the symbol set A1 from the symbol set A2, is the external symbol set of the target file set A' (i.e. the first candidate symbol set); s5 and s6 are the elements of A3.
Second, extract the external symbols of the target file set B'. Traverse each target file in the target file set B', extract the symbols defined by the target file and put them into the symbol set B1 (i.e. the second candidate symbol set), and extract the symbols used but not defined by the target file and put them into the symbol set B2 (i.e. the third candidate symbol set). If a symbol used but not defined by one target file is defined by another target file in the target file set B', the symbol is still an internal symbol of the target file set B' as a whole (i.e., the symbol exists in both B1 and B2); if a symbol used but not defined by a target file is not found in the symbol set B1, the symbol is an external symbol of the target file set B' as a whole. The difference set B3, obtained by subtracting the symbol set B1 from the symbol set B2, is the external symbol set of the target file set B' (i.e., the first candidate symbol set). The symbol set B3 is generated in the same way as the symbol set A3 above.
Third, take the union of the symbol set A3 and the symbol set B3 to obtain the symbol set C3 (namely the target symbol set). The set of symbols that the public shared library exports is a subset of the symbol set C3, because some symbols in C3 are provided by the operating system and are not in the common code.
Fourth, traverse each target file in the target file set C and extract the symbols defined by the target file. If a symbol is in the symbol set C3, its export identifier is determined to be exportable and can be modified accordingly; if the symbol is not in the symbol set C3, the export identifier is not modified.
Fifth, link the target files in the target file set C to generate the public shared library libz.so (namely the second shared library file).
Sixth, link the files in the target file set A' to generate libx.so (namely the third shared library file), and specify through a link option that libx.so depends on libz.so.
Seventh, link the files in the target file set B' to generate liby.so (namely the third shared library file), and specify through a link option that liby.so depends on libz.so.
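The first four steps of this flow can be sketched as follows, using the FIG. 3 values for A1 and A2. The contents of B1, B2 and of the common set C are hypothetical. The description modifies the export identifier inside the target files; the sketch instead writes the chosen symbols into a GNU ld version script, which is a substituted technique shown only as one possible way to control what a shared library exports.

```python
from typing import Iterable, Set, Tuple

SymbolSets = Tuple[Set[str], Set[str]]  # (defined, used but undefined) for one file set


def exported_symbols(defined_in_common: Set[str],
                     second_sets: Iterable[SymbolSets]) -> Set[str]:
    """Symbols defined in set C that some second target file set needs."""
    c3: Set[str] = set()
    for defined, undefined in second_sets:
        c3 |= undefined - defined          # A3, B3, ... merged into C3
    return defined_in_common & c3          # step 4: only symbols present in C3


def version_script(symbols: Set[str]) -> str:
    """Render an anonymous GNU ld version script exporting only `symbols`."""
    lines = ["{"]
    if symbols:
        lines.append("  global:")
        lines.extend(f"    {name};" for name in sorted(symbols))
    lines += ["  local:", "    *;", "};"]
    return "\n".join(lines)


a_sets = ({"s1", "s2", "s3", "s4"}, {"s3", "s4", "s5", "s6"})  # A1, A2 from FIG. 3
b_sets = ({"t1"}, {"s6", "s7", "printf"})                      # hypothetical B1, B2
common_defined = {"s5", "s6", "s7", "s8"}                      # hypothetical symbols of C

exports = exported_symbols(common_defined, [a_sets, b_sets])
print(version_script(exports))
```

In this example `s8` is defined in the common code but used by neither stripped library, so it is not exported, and `printf` is in C3 but comes from the operating system rather than the common code. A script like this could be passed to the link of libz.so with the `--version-script` linker option.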
According to the technical scheme provided by the embodiment of the disclosure, the common code of a plurality of shared libraries can be extracted to form a public shared library, which reduces the size of each shared library and the memory consumed at run time. An application installation package that uses shared libraries obtained with the disclosed scheme is smaller, which improves the success rate of users downloading the installation package.
FIG. 4 is a block diagram illustrating a shared library processing apparatus according to an example embodiment. Referring to fig. 4, the apparatus 400 includes a first determination unit 401, a second determination unit 402, a symbol analysis unit 403, and a third determination unit 404.
A first determining unit 401 configured to perform determining a first target file set, where the first target file set is a set of target files commonly included in a plurality of first shared library files, the first shared library files include a plurality of target files, the target files are obtained by compiling a source file, the target files include used symbols and derived identifiers of the symbols, and the derived identifiers are used for identifying whether the symbols can be derived;
a second determining unit 402, configured to perform determining a second target file set corresponding to each first shared library, where the second target file set is obtained by removing target files included in common from the first shared library files;
a symbol analysis unit 403, configured to perform symbol analysis based on each second target file set to obtain a target symbol set, where the target symbol set includes symbols that are not defined by any target file included in the second target file set;
a third determining unit 404, configured to determine that the derived identifier of a symbol used by a target file contained in the first target file set is derivable when the symbol exists in the target symbol set.
In a possible implementation, the symbol analysis unit is specifically configured to perform:
performing symbol analysis on the second target file set to obtain a first candidate symbol set corresponding to the second target file set, wherein the first candidate symbol set comprises symbols which are not defined by any target file contained in the second target file set;
and merging the first candidate symbol sets corresponding to the second target file sets to obtain target symbol sets.
In a possible implementation, the symbol analysis unit is specifically configured to perform:
for each target file contained in the second target file set, obtaining symbols defined by the target file contained in each target file to obtain a second candidate symbol set, and obtaining symbols undefined by the target file contained in each target file to obtain a third candidate symbol set;
and taking a difference set of the third candidate symbol set and the second candidate symbol set to obtain a first candidate symbol set.
In a possible implementation, the first determining unit is specifically configured to perform:
analyzing each target file contained in the first shared library file from the first shared library file to obtain a third target file set corresponding to the first shared library file;
and taking intersection of all the third target file sets to obtain a first target file set.
In a possible implementation manner, the first shared library file is generated by linking a plurality of input files, the plurality of input files include a target file and/or a static library file, the static library file includes a plurality of target files, and the first determining unit is specifically configured to perform:
and traversing each input file in the plurality of input files, adding the input file to a third target file set when the input file is a target file, and analyzing the target file contained in the static library file and adding the target file to the third target file set when the input file is a static library file.
In a possible implementation, the apparatus may further include:
and the link generating unit is configured to generate a second shared library file by linking each target file contained in the first target file set and generate a third shared library file by linking each target file contained in the second target file set based on the export result, wherein the third shared library file depends on the second shared library file.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
FIG. 5 is a block diagram illustrating an apparatus 500 for shared library processing in accordance with an example embodiment. For example, the apparatus 500 may be provided as an electronic device. Referring to fig. 5, the apparatus 500 includes a processing component 522 that further includes one or more processors and memory resources, represented by memory 532, for storing instructions, such as applications, that are executable by the processing component 522. The application programs stored in memory 532 may include one or more modules that each correspond to a set of instructions. Further, the processing component 522 is configured to execute instructions to perform the shared library processing method described above.
The apparatus 500 may also include a power component 526 configured to perform power management of the apparatus 500, a wired or wireless network interface 550 configured to connect the apparatus 500 to a network, and an input/output (I/O) interface 558. The apparatus 500 may operate based on an operating system stored in the memory 532, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a storage medium comprising instructions, such as the memory 532 comprising instructions, executable by the processor of the apparatus 500 to perform the shared library processing method described above is also provided. Alternatively, the storage medium may be a non-transitory computer readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided that includes computer instructions executable by a processor of the apparatus 500 to perform the shared library processing method described above. Alternatively, the computer instructions may be stored in a storage medium of the apparatus 500, which may be a non-transitory computer readable storage medium, such as a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method for processing a shared library, comprising:
determining a first target file set, wherein the first target file set is a set of target files commonly contained in a plurality of first shared library files, the first shared library files contain a plurality of target files, the target files are obtained by compiling source files, the target files contain used symbols and derived identifiers of the symbols, and the derived identifiers are used for identifying whether the symbols can be derived or not;
determining a second target file set corresponding to each first shared library file, wherein the second target file set is obtained by removing the target files contained in common from the first shared library files;
performing symbol analysis on the basis of each second target file set to obtain a target symbol set, wherein the target symbol set comprises symbols which are not defined by any target file contained in the second target file set;
when the symbol used and defined by the target file contained in the first target file set exists in the target symbol set, determining that the derived identifier of the symbol is derivable.
2. The method of claim 1, wherein the step of performing a symbol analysis based on each of the second target file sets to obtain a target symbol set comprises:
performing symbol analysis on the second target file set to obtain a first candidate symbol set corresponding to the second target file set, wherein the first candidate symbol set comprises symbols which are not defined by any target file contained in the second target file set;
and merging the first candidate symbol sets corresponding to the second target file sets to obtain the target symbol sets.
3. The method according to claim 2, wherein the step of performing symbol analysis on the second target file set to obtain a first candidate symbol set corresponding to the second target file set comprises:
for each target file contained in the second target file set, obtaining symbols defined by the target file contained in each target file to obtain a second candidate symbol set, and obtaining symbols undefined by the target file contained in each target file to obtain a third candidate symbol set;
and taking a difference set of the third candidate symbol set and the second candidate symbol set to obtain the first candidate symbol set.
4. The shared library processing method of claim 1, wherein the determining a first set of target files step comprises:
analyzing each target file contained in the first shared library file from the first shared library file to obtain a third target file set corresponding to the first shared library file;
and taking intersection of all the third target file sets to obtain the first target file set.
5. The method according to claim 4, wherein the first shared library file is generated by linking a plurality of input files, the plurality of input files include the target file and/or a static library file, the static library file includes a plurality of target files, and the step of parsing each target file included in the first shared library file from the first shared library file to obtain a third target file set corresponding to the first shared library file includes:
traversing each input file in the plurality of input files, adding the input file to the third target file set when the input file is the target file, and analyzing the target file contained in the static library file and adding the target file to the third target file set when the input file is the static library file.
6. The shared library processing method of any of claims 1 to 5, further comprising:
based on the export result, linking the target files contained in the first target file set to generate a second shared library file, and linking the target files contained in the second target file set to generate a third shared library file, wherein the third shared library file depends on the second shared library file.
7. A shared library processing apparatus, comprising:
a first determining unit, configured to perform determining a first target file set, where the first target file set is a set of target files commonly contained in a plurality of first shared library files, the first shared library files contain a plurality of target files, the target files are compiled from a source file, the target files contain symbols used and derived identifiers of the symbols, and the derived identifiers are used for identifying whether the symbols can be derived or not;
a second determining unit, configured to perform determination of a second target file set corresponding to each of the first shared library files, where the second target file set is obtained by removing the target files contained in common from the first shared library files;
a symbol analysis unit configured to perform symbol analysis based on each of the second target file sets to obtain a target symbol set, where the target symbol set includes symbols that are not defined by any of the target files included in the second target file set;
a third determination unit configured to perform, when a symbol used and defined by the target file included in the first target file set exists in the target symbol set, determining that an export identifier of the symbol is exportable.
8. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the shared library processing method of any of claims 1 to 6.
9. A storage medium in which instructions, when executed by a processor of an electronic device, enable the electronic device to perform the shared library processing method of any one of claims 1 to 6.
10. A computer program product comprising computer instructions, characterized in that said computer instructions, when executed by an electronic device, implement the shared library processing method of any of claims 1 to 6.
CN202110872545.1A 2021-07-30 2021-07-30 Shared library processing method and device, electronic equipment and storage medium Pending CN113626100A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110872545.1A CN113626100A (en) 2021-07-30 2021-07-30 Shared library processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113626100A true CN113626100A (en) 2021-11-09

Family

ID=78381834

Country Status (1)

Country Link
CN (1) CN113626100A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004151822A (en) * 2002-10-29 2004-05-27 Hitachi Ltd Method and program for binding common library
CN107797820A (en) * 2017-11-13 2018-03-13 北京百度网讯科技有限公司 Method and apparatus for generating patch
US20190146777A1 (en) * 2017-11-13 2019-05-16 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for generating patch
CN112947987A (en) * 2021-01-29 2021-06-11 视若飞信息科技(上海)有限公司 Processing method and device for shared library with multiple versions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination