CN109918074B - Compiling link optimization method

Info

Publication number
CN109918074B
Authority
CN
China
Prior art keywords
sym
key
symbol
function
marked
Legal status
Active
Application number
CN201711294532.0A
Other languages
Chinese (zh)
Other versions
CN109918074A (en)
Inventor
孟杰
薛皓琳
马瑶瑶
卢彦
杨建生
张蓓
方平
冯艳红
穆鹤林
程毅轩
杨晓璇
吴昆鹏
李洪彬
申利飞
Current Assignee
China Standard Software Co Ltd
Original Assignee
China Standard Software Co Ltd
Priority date
2017-12-08
Filing date
2017-12-08
Publication date
2022-09-27
Application filed by China Standard Software Co Ltd
Priority to CN201711294532.0A
Publication of CN109918074A
Application granted
Publication of CN109918074B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention relates to a compiling and linking optimization method that optimizes the linking function of the linker LD of the GNU open-source compiling and linking tool BINUTILS by building on its address space allocation, symbol resolution, and relocation functions. The optimized functions comprise the symbol table establishment function, the search function, and the query function used in the linking process. The method overcomes the low speed of compiling and linking and reduces memory occupancy during linking, thereby improving compiling and linking speed, saving time, and improving production efficiency.

Description

Compiling link optimization method
Technical Field
The invention relates to the technical field of computer software program operation, in particular to a compiling link optimization method.
Background
The GNU toolchain plays a significant role in Linux systems, and compiling and linking account for a large share of its work. In recent years Linux has developed rapidly: more and more individuals and enterprises use it at scale, the variety of application programs keeps growing, and programs have become more diverse and complex. As a result, code size and the number of modules have grown quickly, which places a heavy burden on compiling and linking. During build integration, more modules mean more binary object files and therefore far more symbols to be linked. At that scale, linking can occupy a large amount of system resources and seriously slow the system down, so a query processing approach borrowed from big data is introduced to solve the problems caused by linking.
The original linker uses a hash algorithm. A hash algorithm maps data of arbitrary length (characters, numeric values, and so on) through a given function to a shorter, fixed-length value, called the hash value, which serves as an index. A hash table maps a set of keys into an allocated memory region by means of a given hash function H(key) and a collision handling method; H(key) is the storage position of the key within that region. The region is called the hash table, and the resulting storage position is called the hash address. Compared with other linear data structures such as lists and queues, a hash table offers faster lookup.
In the whole linking process, establishing and searching the symbol table is the most time-consuming part. The cost is imperceptible when only a few object files are linked, but when hundreds of object files are linked and each contains hundreds of symbols (or more) to be resolved, the scale exposes the hardware limitations of the Loongson platform. If the linker's existing hash algorithm continues to be used at this scale or higher, its main weakness, low space efficiency, becomes apparent: hash collisions arise at larger scales, and more memory must be allocated to resolve them. The hash algorithm currently used by the linker therefore occupies a large amount of memory; during linking this greatly reduces system speed and can easily cause the system to stall, so the impact of linking on system resources cannot be neglected.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a compiling and linking optimization method, which optimizes the linking function of the linker by building on the address space allocation, symbol resolution, and relocation functions of the linker LD of the GNU open-source compiling and linking tool BINUTILS. The optimized functions comprise the symbol table establishment function, the search function, and the query function in the linking process.
The optimization of the symbol table establishment function consists in creating a GL_KV symbol table.
In the optimization of the symbol table establishment function, the GL_KV symbol table is established through the following steps:
step S1: collecting the GNU library symbols and the basic graphics library symbols, respectively;
step S2: inputting the symbol names of the symbols collected in step S1 into the bloom filter;
step S3: taking the output of the bloom filter as the input of an indexing algorithm, and determining the position of each symbol in the symbol table;
step S4: writing the symbol information into the symbol table.
The method further includes:
step S5: if a repetition is found after the indexing algorithm is applied, a separate table is used to store the data; whether a repetition exists is determined by GL_KV_sym->rep->sym_r;
in step S3, the indexing algorithm is a hash index algorithm;
in steps S3 to S4, the bloom filter obtains a plurality of values by using a plurality of hash functions, marks the GL_KV bit array by using these values as indexes, obtains the position of each symbol in the GL_KV table by using these values as the input of the hash index algorithm, and writes the symbol information into the GL_KV table.
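Steps S2 to S5 can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the seeded FNV-1a hash, the sizes `NUM_HASHES`, `BLOOM_BITS`, `TABLE_SLOTS` and `DUP_SLOTS`, the way the hash outputs are combined in `index_of`, and all identifiers are hypothetical. The sketch also assumes that a "repetition" means the computed position is already occupied.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_HASHES  3      /* number of bloom hash functions (assumed) */
#define BLOOM_BITS  1024   /* size of the bit array (assumed) */
#define TABLE_SLOTS 256    /* primary table positions (assumed) */
#define DUP_SLOTS   64     /* separate table for repetitions (assumed) */

typedef struct {
    const char *sym_key;   /* symbol name */
    uint64_t    sym_value; /* symbol address */
    int         used;
} Entry;

typedef struct {
    uint8_t bits[BLOOM_BITS / 8];
    Entry   table[TABLE_SLOTS];
    Entry   dup[DUP_SLOTS];
    int     dup_count;
} SymTable;

/* Seeded FNV-1a; each seed acts as one of the bloom hash functions. */
static uint32_t hash_i(const char *s, uint32_t seed)
{
    uint32_t h = 2166136261u ^ (seed * 0x9E3779B9u);
    while (*s) {
        h ^= (uint8_t)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Step S3: combine the bloom hash outputs into one table position. */
static uint32_t index_of(const uint32_t h[NUM_HASHES])
{
    uint32_t idx = 0;
    for (int i = 0; i < NUM_HASHES; i++)
        idx = idx * 31u + h[i];
    return idx % TABLE_SLOTS;
}

/* Steps S2 to S5. Returns 0 on success, -1 if the repetition table is full. */
int sym_insert(SymTable *t, const char *key, uint64_t value)
{
    uint32_t h[NUM_HASHES];
    for (int i = 0; i < NUM_HASHES; i++) {
        h[i] = hash_i(key, (uint32_t)i);
        uint32_t bit = h[i] % BLOOM_BITS;
        t->bits[bit / 8] |= (uint8_t)(1u << (bit % 8)); /* mark the bit array */
    }
    Entry *e = &t->table[index_of(h)];
    if (e->used) {                        /* step S5: position already taken */
        if (t->dup_count == DUP_SLOTS)
            return -1;
        e = &t->dup[t->dup_count++];
    }
    e->sym_key = key;                     /* step S4: write symbol information */
    e->sym_value = value;
    e->used = 1;
    return 0;
}

/* Bloom membership test: every one of the NUM_HASHES bits must be set. */
int bloom_maybe_contains(const SymTable *t, const char *key)
{
    for (int i = 0; i < NUM_HASHES; i++) {
        uint32_t bit = hash_i(key, (uint32_t)i) % BLOOM_BITS;
        if (!(t->bits[bit / 8] & (1u << (bit % 8))))
            return 0;
    }
    return 1;
}
```

A zero-initialized `SymTable` is an empty table. Because the bloom filter and the index share the same hash outputs, each symbol name is hashed only once per insertion.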
The optimization of the search function is completed by the following steps:
step SA: establishing a lookup table for the GL_KV symbols;
step SB: inputting a sym_key, and determining through the bloom filter whether the corresponding sym_key exists in the lookup table;
step SC: if not, reporting that the search has failed; if so, executing step SD;
step SD: judging whether the sym_key that was found is a false positive; if so, reporting that the search has failed, and if not, reporting that the search has succeeded.
In step SB, the bloom filter is composed of a plurality of hash functions and a bit array; in step SD, whether the sym_key that was found is a false positive is determined from the value of GL_KV_sym->sym_key.
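Steps SB to SD can be sketched in C as follows. This is a hedged illustration: the hash function, the sizes `K`, `M_BITS` and `SLOTS`, and every identifier are assumptions, and the false-positive check is done here with a plain `strcmp` on the stored key rather than with any helper from the patent. For brevity the sketch overwrites a position on collision and assumes keys shorter than 32 characters.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define K      2     /* bloom hash functions (assumed) */
#define M_BITS 512   /* bit-array size (assumed) */
#define SLOTS  128   /* table positions (assumed) */

typedef enum { LOOKUP_OK, LOOKUP_ABSENT, LOOKUP_FALSE_POSITIVE } LookupResult;

typedef struct {
    char     sym_key[32];  /* stored key, compared in step SD */
    uint64_t sym_value;
} GlkvSym;

typedef struct {
    uint8_t bits[M_BITS / 8];
    GlkvSym slot[SLOTS];
} Glkv;

/* Seeded FNV-1a; each seed acts as one bloom hash function. */
static uint32_t hash_i(const char *s, uint32_t seed)
{
    uint32_t h = 2166136261u ^ (seed * 0x9E3779B9u);
    while (*s) {
        h ^= (uint8_t)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Indexing algorithm: the hash outputs are the factors of the position. */
static uint32_t slot_index(const char *key)
{
    uint32_t idx = 0;
    for (uint32_t i = 0; i < K; i++)
        idx = idx * 31u + hash_i(key, i);
    return idx % SLOTS;
}

/* Step SA (simplified): mark the bit array and store the entry. */
void glkv_put(Glkv *t, const char *key, uint64_t value)
{
    for (uint32_t i = 0; i < K; i++) {
        uint32_t b = hash_i(key, i) % M_BITS;
        t->bits[b / 8] |= (uint8_t)(1u << (b % 8));
    }
    GlkvSym *s = &t->slot[slot_index(key)];
    strncpy(s->sym_key, key, sizeof s->sym_key - 1);
    s->sym_key[sizeof s->sym_key - 1] = '\0';
    s->sym_value = value;
}

LookupResult glkv_lookup(const Glkv *t, const char *key, uint64_t *value)
{
    /* Steps SB and SC: a clear bit proves the key was never inserted. */
    for (uint32_t i = 0; i < K; i++) {
        uint32_t b = hash_i(key, i) % M_BITS;
        if (!(t->bits[b / 8] & (1u << (b % 8))))
            return LOOKUP_ABSENT;
    }
    /* Step SD: the filter said "maybe"; confirm against the stored key. */
    const GlkvSym *s = &t->slot[slot_index(key)];
    if (strcmp(s->sym_key, key) != 0)
        return LOOKUP_FALSE_POSITIVE;
    *value = s->sym_value;
    return LOOKUP_OK;
}
```

Both `LOOKUP_ABSENT` and `LOOKUP_FALSE_POSITIVE` correspond to a failed search in the text; they are kept distinct here only to make the two failure paths visible.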
The optimization of the query function in the linking process is completed through the following steps:
step Sa: establishing a lookup table for the OL_KV symbols and a lookup table for the GL_KV symbols, respectively;
step Sb: inputting a sym_key;
step Sc: determining whether a sym_key meeting the requirements exists in the lookup table for the OL_KV symbols and, if so, marking it;
step Sd: determining whether a sym_key meeting the requirements exists in the lookup table for the GL_KV symbols and, if so, marking it;
step Se: if, after step Sc and step Sd, no marked sym_key exists in either lookup table, an error is reported;
if a marked sym_key exists in only one of the two lookup tables, the sym_value corresponding to that sym_key is used;
if a marked sym_key exists in both lookup tables, the symbol strengths are compared: if the marked sym_keys in both lookup tables are strong symbols, an error is reported; if only one of them is a strong symbol, the sym_value corresponding to the strong symbol is selected.
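The decision in step Se can be sketched as a small C function. The `Hit` record, the status names, and the function name are hypothetical. The text does not say what step Se does when both marked symbols are weak; the sketch reports that case as unresolved, which is an assumption.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    RESOLVE_OK,
    RESOLVE_UNDEFINED,  /* marked in neither lookup table: error */
    RESOLVE_MULTIPLE,   /* strong in both lookup tables: error */
    RESOLVE_ALL_WEAK    /* weak in both tables (assumed handling) */
} ResolveStatus;

/* Result of step Sc or step Sd for one table (hypothetical layout). */
typedef struct {
    int      marked;    /* non-zero if the step marked the sym_key */
    int      strong;    /* non-zero for a strong symbol */
    uint64_t sym_value;
} Hit;

/* Step Se: combine the OL_KV and GL_KV results into one sym_value. */
ResolveStatus resolve(Hit ol, Hit gl, uint64_t *out)
{
    if (!ol.marked && !gl.marked)
        return RESOLVE_UNDEFINED;
    if (!ol.marked || !gl.marked) {       /* exactly one table has the key */
        *out = ol.marked ? ol.sym_value : gl.sym_value;
        return RESOLVE_OK;
    }
    if (ol.strong && gl.strong)           /* both strong: conflict */
        return RESOLVE_MULTIPLE;
    if (!ol.strong && !gl.strong)
        return RESOLVE_ALL_WEAK;
    *out = ol.strong ? ol.sym_value : gl.sym_value;  /* pick the strong one */
    return RESOLVE_OK;
}
```

Keeping the decision separate from the table probes means the same function serves whichever table supplied the hit.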
Step Sc comprises:
step Sc1: determining through the bloom filter whether the corresponding sym_key exists in the lookup table for the OL_KV symbols; if not, executing step Sd; if so, executing step Sc2;
step Sc2: judging whether the corresponding sym_key is a false-positive sym_key; if not, marking the sym_key to obtain a marked sym_key and executing step Sd; if so, executing step Sc3;
step Sc3: querying whether the sym_key exists in the repeated data; if so, marking the sym_key to obtain a marked sym_key and executing step Sd; if not, directly executing step Sd.
Step Sd comprises:
step Sd1: determining through the bloom filter whether the corresponding sym_key exists in the lookup table for the GL_KV symbols; if not, executing step Se; if so, executing step Sd2;
step Sd2: judging whether the corresponding sym_key is a false-positive sym_key; if not, marking the sym_key to obtain a marked sym_key and executing step Se; if so, executing step Sd3;
step Sd3: querying whether the sym_key exists in the repeated data; if so, marking the sym_key to obtain a marked sym_key and executing step Se; if not, directly executing step Se.
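Steps Sc1 to Sc3 and Sd1 to Sd3 apply the same three-stage probe to each table, which can be sketched in C as one function. All names and sizes here are hypothetical, and the "repeated data" is modeled as a small array scanned linearly, which is an assumption about its layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define K         2    /* bloom hash functions (assumed) */
#define M_BITS    256  /* bit-array size (assumed) */
#define MAX_SLOTS 64
#define MAX_DUP   16

typedef struct { const char *sym_key; uint64_t sym_value; } Sym;

typedef struct {
    uint8_t  bits[M_BITS / 8];
    Sym      slot[MAX_SLOTS];
    uint32_t nslots;         /* positions in use, 1..MAX_SLOTS */
    Sym      dup[MAX_DUP];   /* the "repeated data" */
    uint32_t ndup;
} KvTable;

static uint32_t hash_i(const char *s, uint32_t seed)
{
    uint32_t h = 2166136261u ^ (seed * 0x9E3779B9u);
    while (*s) {
        h ^= (uint8_t)*s++;
        h *= 16777619u;
    }
    return h;
}

static uint32_t slot_of(const KvTable *t, const char *key)
{
    uint32_t idx = 0;
    for (uint32_t i = 0; i < K; i++)
        idx = idx * 31u + hash_i(key, i);
    return idx % t->nslots;
}

/* Build side: occupied positions spill into the repeated data. */
int kv_add(KvTable *t, const char *key, uint64_t value)
{
    for (uint32_t i = 0; i < K; i++) {
        uint32_t b = hash_i(key, i) % M_BITS;
        t->bits[b / 8] |= (uint8_t)(1u << (b % 8));
    }
    Sym *s = &t->slot[slot_of(t, key)];
    if (s->sym_key != NULL) {
        if (t->ndup == MAX_DUP)
            return -1;
        s = &t->dup[t->ndup++];
    }
    s->sym_key = key;
    s->sym_value = value;
    return 0;
}

/* Returns 1 and stores the value if the key gets marked, otherwise 0. */
int kv_probe(const KvTable *t, const char *key, uint64_t *value)
{
    /* Sc1 / Sd1: bloom filter test. */
    for (uint32_t i = 0; i < K; i++) {
        uint32_t b = hash_i(key, i) % M_BITS;
        if (!(t->bits[b / 8] & (1u << (b % 8))))
            return 0;
    }
    /* Sc2 / Sd2: a matching stored key means this is not a false positive. */
    const Sym *s = &t->slot[slot_of(t, key)];
    if (s->sym_key != NULL && strcmp(s->sym_key, key) == 0) {
        *value = s->sym_value;
        return 1;
    }
    /* Sc3 / Sd3: fall back to the repeated data. */
    for (uint32_t i = 0; i < t->ndup; i++) {
        if (strcmp(t->dup[i].sym_key, key) == 0) {
            *value = t->dup[i].sym_value;
            return 1;
        }
    }
    return 0;
}
```

The table must be zero-initialized and `nslots` set to at least 1 before use. The same `kv_probe` would be called once for the OL_KV table and once for the GL_KV table.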
In step Sa, the lookup tables for the OL_KV symbols and the GL_KV symbols are computed by a bloom filter, composed of a plurality of hash functions and a bit array, together with a hash index algorithm.
The compiling and linking optimization method provided by the invention overcomes the defect of low compiling and linking speed, and can reduce the memory occupancy rate during linking, thereby achieving the purposes of improving the compiling and linking speed, saving the time cost and improving the production efficiency.
Drawings
FIG. 1: implementation flow of the symbol table establishment function optimization of the invention;
FIG. 2: implementation flow of the search function optimization of the invention;
FIG. 3: implementation flow of the query function optimization in the linking process of the invention.
Detailed Description
To further explain the technical scheme and advantages of the invention, they are described in detail below with reference to the accompanying drawings.
The compiling and linking optimization method mainly optimizes the linking function of the linker by building on the address space allocation, symbol resolution, and relocation functions of the linker LD of the GNU open-source compiling and linking tool BINUTILS. The algorithms for symbol table establishment, search, and relocation are all improved: the original ordinary hash algorithm is abandoned in favor of a bloom filter, and a table of the symbols that the GNU component libraries and the basic graphics library can provide is maintained. This comprehensively speeds up the compiling and linking stage and compensates for hardware limitations in software.
Specifically, FIG. 1 to FIG. 3 show, respectively, the implementation flows of the symbol table establishment function optimization, the search function optimization, and the query function optimization in the linking process of the method.
First, refer to FIG. 1, the implementation flow of the symbol table establishment function optimization. When compiling and linking optimization is performed, a GNU component first maintains a basic symbol table of the symbols that the GNU toolchain and the basic graphics library can provide. The symbol table mainly includes the symbol name, the symbol value (address), the type, the file in which the symbol is located, the number of symbols with the same name, and so on; the symbol is used as the key and the address as the value. Each table entry is an extension of the system structure struct elfxx_Sym (xx may be 64 or 32). The table is currently used only for compiling and linking, so it does not reside in memory and imposes no ongoing load on the system. The table is named GL_KV (G: GNU, L: linker, K: key, V: value), its entries are named GL_KV_sym, and the data is stored in a data file. In an embodiment of the invention, GL_KV has the following general structure:
[Figures BDA0001500034910000061 and BDA0001500034910000071: GL_KV structure listing, images not reproduced]
The GL_KV symbol table data file is created as follows:
(1) collecting the GNU library symbols and the basic graphics library symbols, respectively;
(2) using the symbol names in the collected symbol set as the input of the bloom filter;
(3) taking the output of the bloom filter as the input of the indexing algorithm, and determining the position of each symbol in the data file;
(4) writing the symbol information into the symbol table;
(5) if a repetition is found after the indexing algorithm is applied, a separate table is established to store the data; whether a repetition exists is determined by GL_KV_sym->rep->sym_r.
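Since the structure listing is not available here, the following C sketch shows one way an entry with the fields named in the text could be laid out. It is purely hypothetical: only the names `GL_KV_sym`, `sym_key`, `sym_value`, `rep` and `sym_r` come from the text; every other field name, every type, and the helper function are assumptions and should not be read as the patent's actual definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record describing a repetition (see item (5) above). */
typedef struct GL_KV_rep {
    int      sym_r;      /* non-zero when a repetition was recorded */
    uint32_t rep_index;  /* assumed: position in the separate table */
} GL_KV_rep;

/* Hypothetical GL_KV entry built from the fields listed in the text. */
typedef struct GL_KV_sym {
    const char *sym_key;    /* symbol name, used as the key */
    uint64_t    sym_value;  /* symbol address, used as the value */
    uint8_t     sym_type;   /* symbol type (assumed field name) */
    const char *sym_file;   /* file containing the symbol (assumed name) */
    uint32_t    same_name;  /* number of same-name symbols (assumed name) */
    GL_KV_rep  *rep;        /* NULL when no repetition exists (assumed) */
} GL_KV_sym;

/* Applies the GL_KV_sym->rep->sym_r test named in the text,
 * guarding against a missing repetition record. */
int glkv_has_repetition(const GL_KV_sym *s)
{
    return s->rep != NULL && s->rep->sym_r != 0;
}
```

A real entry would extend the platform's ELF symbol structure, as the text states; the standalone layout above avoids that dependency only to keep the sketch self-contained.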
Next, refer to FIG. 2, the implementation flow of the search function optimization. In the search optimization, a lookup table is formed from n hash functions and a bit array. The lookup table (that is, the bit array) is dedicated to the GL_KV table and is used to perform lookup operations on it. The bloom filter quickly determines whether a sym_value corresponding to a sym_key exists, and hash[0](...), ..., hash[n-1](...) are used as factors of the indexing algorithm to quickly determine the value of sym_value. A bloom filter can return a "false positive"; although the probability is very small, such an error would be fatal to the linking process. The value of GL_KV_sym->sym_key is therefore used to decide whether a result is a false-positive sym_value (as shown in the following code), which eliminates the effect of bloom filter false positives and ensures correctness when the bloom filter is used.
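/* False-positive check: the requested key against the key stored in the entry. */
if (Del_FP(Get_S_K(key), Get_S_K(gl_sym_key)))
    goto F_Positon;   /* presumably the false-positive handling path */
else {
    ...
}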
Finally, refer to FIG. 3, the implementation flow of the query function optimization in the linking process. During linking, the hundreds of object files to be linked are queried in the same manner. First, an OL_KV table (O: object file, L: linker) is established (as shown in the following code).
[Figure BDA0001500034910000081: OL_KV table listing, image not reproduced]
A lookup table is formed from n hash functions and a bit array. It is dedicated to the OL_KV table and is used to perform lookup operations on it; hash[0](...), ..., hash[n-1](...) are used as factors of the indexing algorithm to determine the value of sym_value, completing symbol relocation and address correction. To avoid the effect of bloom filter false positives, OL_KV->sym_key is used to ensure correctness. When the table is built, symbols with the same name are examined: if they include both strong and weak symbols, the strong symbol is used; if they are all weak symbols, they are uniformly marked as 0 and the search ultimately reports failure; if they are all strong symbols, an error is reported immediately and the search likewise reports failure.
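The same-name rule applied while the table is built can be sketched in C as follows. The `Def` record, the status names, and the function name are hypothetical. The sketch is meant for two or more same-name definitions and treats any case with more than one strong definition as an error; the text only states the all-strong case, so the broader rule is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One definition of a symbol name (hypothetical layout). */
typedef struct {
    int      strong;     /* non-zero for a strong symbol */
    uint64_t sym_value;
} Def;

typedef enum {
    MERGE_USE_STRONG,      /* strong and weak mixed: the strong one is used */
    MERGE_ALL_WEAK,        /* all weak: marked as 0, search will fail */
    MERGE_MULTIPLE_STRONG  /* more than one strong: error */
} MergeStatus;

/* Decide which value a group of same-name definitions contributes. */
MergeStatus merge_homonyms(const Def *defs, size_t n, uint64_t *out)
{
    size_t strong_count = 0;
    for (size_t i = 0; i < n; i++) {
        if (defs[i].strong) {
            if (strong_count == 0)
                *out = defs[i].sym_value;  /* remember the first strong one */
            strong_count++;
        }
    }
    if (strong_count > 1)
        return MERGE_MULTIPLE_STRONG;
    if (strong_count == 0) {
        *out = 0;                          /* uniformly marked as 0 */
        return MERGE_ALL_WEAK;
    }
    return MERGE_USE_STRONG;
}
```

Resolving same-name symbols once at build time keeps the later per-key query down to a single stored value per table.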
In particular, many symbols with the same name are involved during linking. When a key is queried during linking, the OL_KV table is searched first to check whether the key exists, and if so the symbol is marked. The GL_KV table is then queried for the key; if the key exists there as well, it is compared with the symbol from the OL_KV table: if both are strong symbols, an error is reported; if only one of them is a strong symbol, the strong symbol is selected. If the key exists in neither the GL_KV table nor the OL_KV table, an error is reported; if it exists in only one of them, the value corresponding to the key in that table is used.
Note: the OL_KV table is established in the same way as the GL_KV table.
[Figures BDA0001500034910000091 and BDA0001500034910000101: listing, images not reproduced]
The compiling and linking optimization method provided by the invention overcomes the defect of low compiling and linking speed, and can reduce the memory occupancy rate during linking, thereby achieving the purposes of improving the compiling and linking speed, saving the time cost and improving the production efficiency.
The compiling link optimizing method provided by the invention is suitable for optimizing compiling links under various software platforms, and is particularly suitable for compiling links under a Loongson platform.
In the present invention, the term "linker" refers to the linker LD of the GNU open-source compiling and linking tool BINUTILS, which links a plurality of object binary files into one executable binary file.
In the present invention, the "GNU toolchain" is the set of tools that underpins nearly every large open-source project (including the Linux kernel itself). It consists of the tools and software necessary for compiling and debugging a wide variety of software, from the smallest utilities to software as complex as the Linux kernel.
In the present invention, the hash algorithm, also called a hashing algorithm, is a one-way cryptosystem: an irreversible mapping from plaintext to ciphertext that has an encryption process only and no decryption process.
In the present invention, a "bloom filter" consists of a long binary vector and a series of random mapping functions. It can be used to test whether an element is in a set. Its advantage is that its space efficiency and query time are far better than those of general algorithms; its disadvantages are a certain misidentification rate and the difficulty of deletion. A misidentification is called a false positive.
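For reference, the misidentification rate of a bloom filter has a standard approximation. The symbols are introduced here for illustration and do not appear in the patent text: m is the number of bits in the bit array, k the number of hash functions, and n the number of inserted symbols. Assuming independent, uniformly distributed hash functions, the false-positive probability p is approximately:

```latex
p \approx \left(1 - e^{-kn/m}\right)^{k},
\qquad
k_{\mathrm{opt}} = \frac{m}{n}\ln 2
\;\Longrightarrow\;
p_{\min} \approx \left(\tfrac{1}{2}\right)^{k_{\mathrm{opt}}} \approx 0.6185^{\,m/n}
```

For example, with m/n = 10 bits per symbol and k = 7, p is just under 1%. The rate is never exactly zero, which is why the stored sym_key is still compared after the filter reports a match.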
Although the present invention has been described with reference to the preferred embodiments, it should be understood that the scope of the present invention is not limited thereto, and those skilled in the art will appreciate that various changes and modifications can be made without departing from the spirit and scope of the present invention.

Claims (4)

1. A compiling link optimization method, characterized by: optimizing the linking function of the linker by using the address space allocation function, the symbol resolution function and the relocation function of the linker LD of the GNU open-source compiling and linking tool BINUTILS, wherein the optimized functions comprise a symbol table establishment function, a search function and a query function in the linking process;
a GNU component maintains a basic symbol table of the symbols that the GNU toolchain and the basic graphics library can provide, the symbol table including symbol names, symbol values or addresses, types, the files in which the symbols are located, and the number of symbols with the same name, with symbols as keys and addresses as values; the symbol table is named GL_KV, wherein: G: GNU, L: linker, K: key, V: value, and its entries are GL_KV_sym; the optimization of the symbol table establishment function consists in establishing the GL_KV symbol table, and establishing the GL_KV symbol table includes the following steps:
step S1: collecting the GNU library symbols and the basic graphics library symbols, respectively;
step S2: inputting the symbol names of the symbols collected in step S1 into the bloom filter;
step S3: taking the output of the bloom filter as the input of an indexing algorithm, and determining the position of each symbol in the symbol table;
step S4: writing the symbol information into the symbol table;
the optimization of the search function is completed by the following steps:
step SA: establishing a lookup table for the GL_KV symbols;
step SB: inputting a sym_key, and determining through the bloom filter whether the corresponding sym_key exists in the lookup table;
step SC: if not, reporting that the search has failed; if so, executing step SD;
step SD: judging whether the sym_key that was found is a false positive; if so, reporting that the search has failed, and if not, reporting that the search has succeeded;
the optimization of the query function in the linking process is completed through the following steps:
step Sa: establishing a lookup table for the OL_KV symbols and a lookup table for the GL_KV symbols, respectively, wherein O: object file;
step Sb: inputting a sym_key;
step Sc: determining whether a sym_key meeting the requirements exists in the lookup table for the OL_KV symbols and, if so, marking it;
step Sd: determining whether a sym_key meeting the requirements exists in the lookup table for the GL_KV symbols and, if so, marking it;
step Se: if, after step Sc and step Sd, no marked sym_key exists in either lookup table, an error is reported;
if a marked sym_key exists in only one of the two lookup tables, the sym_value corresponding to that sym_key is used;
if a marked sym_key exists in both lookup tables, the symbol strengths are compared: if the marked sym_keys in both lookup tables are strong symbols, an error is reported; if only one of them is a strong symbol, the sym_value corresponding to the strong symbol is selected.
2. The compiling link optimization method of claim 1, wherein step Sc comprises:
step Sc1: determining through the bloom filter whether the corresponding sym_key exists in the lookup table for the OL_KV symbols; if not, executing step Sd; if so, executing step Sc2;
step Sc2: judging whether the corresponding sym_key is a false-positive sym_key; if not, marking the sym_key to obtain a marked sym_key and executing step Sd; if so, executing step Sc3;
step Sc3: querying whether the sym_key exists in the repeated data; if so, marking the sym_key to obtain a marked sym_key and executing step Sd; if not, directly executing step Sd.
3. The compiling link optimization method of claim 1, wherein step Sd comprises:
step Sd1: determining through the bloom filter whether the corresponding sym_key exists in the lookup table for the GL_KV symbols; if not, executing step Se; if so, executing step Sd2;
step Sd2: judging whether the corresponding sym_key is a false-positive sym_key; if not, marking the sym_key to obtain a marked sym_key and executing step Se; if so, executing step Sd3;
step Sd3: querying whether the sym_key exists in the repeated data; if so, marking the sym_key to obtain a marked sym_key and executing step Se; if not, directly executing step Se.
4. The compiling link optimization method of claim 1, wherein: in step Sa, the lookup tables for the OL_KV symbols and the GL_KV symbols are computed by a bloom filter, composed of a plurality of hash functions and a bit array, together with a hash index algorithm.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711294532.0A CN109918074B (en) 2017-12-08 2017-12-08 Compiling link optimization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711294532.0A CN109918074B (en) 2017-12-08 2017-12-08 Compiling link optimization method

Publications (2)

Publication Number Publication Date
CN109918074A 2019-06-21
CN109918074B 2022-09-27

Family

ID=66956601

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711294532.0A Active CN109918074B (en) 2017-12-08 2017-12-08 Compiling link optimization method

Country Status (1)

Country Link
CN (1) CN109918074B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10275088A (en) * 1997-03-31 1998-10-13 Hitachi Ltd Link optimizing method
CN102385524A (en) * 2011-12-23 2012-03-21 浙江大学 Method for replacing compiling chain order based on mixed-compiling order set
CN103034486A (en) * 2012-11-28 2013-04-10 清华大学 Automatic optimization method based on full-system expansion call graph for mobile terminal operation system
CN104951290A (en) * 2014-03-31 2015-09-30 国际商业机器公司 Method and equipment for optimizing software
CN105320654A (en) * 2014-05-28 2016-02-10 中国科学院深圳先进技术研究院 Dynamic bloom filter and element operating method based on same

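Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Design and Analysis of the Loongson Post-Link Optimizer" (龙芯链接后优化器设计与分析); 陈瑜 (Chen Yu) et al.; Journal of Computer Research and Development (《计算机研究与发展》); 2006-09-11; Vol. 43, No. 8; pp. 1450-1456 *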

Also Published As

Publication number Publication date
CN109918074A (en) 2019-06-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant