CN116842512A - Method and device for unshelling malicious files, electronic equipment and storage medium - Google Patents


Info

Publication number
CN116842512A
Authority
CN
China
Prior art keywords
file
field
target
shelled
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310545705.0A
Other languages
Chinese (zh)
Inventor
何清林
何跃鹰
罗冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Computer Network and Information Security Management Center
Priority to CN202310545705.0A
Publication of CN116842512A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/565 Static detection by checking file integrity
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16Y INFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
    • G16Y40/00 IoT characterised by the purpose of the information processing
    • G16Y40/50 Safety; Security of things, users, data or systems

Abstract

The embodiments of the application provide a method and a device for unshelling malicious files, electronic equipment and a storage medium. The method comprises the following steps: for a file to be processed in any target format, if the file to be processed is a target shelled file, determining a target field from the target shelled file, wherein the target field is a field required for unshelling; if an information missing field exists among the target fields, filling the information missing field in the target shelled file to obtain a reconstructed shelled file; and unshelling the reconstructed shelled file based on a preset unshelling mode to obtain an original file. In the embodiments, the subsequent steps are executed only when the obtained file to be processed is a target shelled file, so that the unshelling is more targeted; and the information missing field in the target shelled file is filled, so that variant shelled files can be unshelled and their original files restored, which improves the unshelling success rate for Internet of Things malicious files.

Description

Method and device for unshelling malicious files, electronic equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of security of the Internet of things, in particular to a method and a device for unshelling malicious files, electronic equipment and a storage medium.
Background
The Internet of Things (IoT) is a network that connects devices to the internet and enables their intelligent identification and management. With the wide application of IoT devices, IoT malware propagates through the network to infect and control IoT devices, thereby threatening the devices themselves and the networks and systems connected to them. Malware shelling (malware packing) is a technique that compresses and encrypts malware, making the shelled malicious files more difficult for security software to detect and analyze.
In the related art, the original version of the shelling tool is used to directly unshell shelled malicious files in the Internet of Things, and static analysis is performed on them; that is, the shelled malicious file is analyzed and cracked without running the executable file.
However, the shelling mode of some shelled malicious files is modified in order to bypass detection and analysis by security software, and the related art may be unable to unshell such modified shelled files.
Disclosure of Invention
The embodiment of the application provides a method and a device for unshelling malicious files, electronic equipment and a storage medium, which are used for realizing unshelling treatment on variant shelled files.
In a first aspect, an embodiment of the present application provides a method for unshelling a malicious file, where the method includes:
for a file to be processed in any target format, if the file to be processed is a target shelled file, determining a target field from the target shelled file; wherein the target field is a field required for unshelling;
if the target field has an information missing field, filling the information missing field in the target shelled file to obtain a reconstructed shelled file;
and unshelling the reconstructed shelled file based on a preset unshelling mode to obtain an original file.
After the file to be processed is obtained, the scheme first determines whether it is a target shelled file, that is, a file shelled in a specific shelling mode, and executes the subsequent steps only if it is, so that the unshelling is more targeted. Some fields of the target shelled file (the target fields) are needed during unshelling; if the contents of these fields have been changed (for example, are missing or wrong), the target shelled file is a variant shelled file and cannot be unshelled directly. Therefore, when an information missing field exists among the target fields, the information missing field is filled in the target shelled file to obtain a reconstructed shelled file in which all target fields are complete. The reconstructed shelled file can then be unshelled based on the preset unshelling mode to restore the original file of the variant shelled file. In this way variant shelled files can be unshelled, and the unshelling success rate for malicious files in the Internet of Things field is improved.
In some alternative embodiments, the target format is the Executable and Linkable Format (ELF), and the target shelled file is a file shelled by the executable file compressor UPX (the Ultimate Packer for eXecutables).
In the above scheme, because ELF files are portable and can run on different operating systems and processor architectures, most malware in the Internet of Things exists in the form of ELF files; taking ELF files as the files to be processed therefore improves processing efficiency. IoT malware generally uses lightweight shelling, most of which is implemented with UPX and its variants. By identifying whether the ELF file is a UPX-shelled file and performing the subsequent target field identification, filling and unshelling in combination with the characteristics of UPX shelling, the scheme suits the unshelling requirements of Internet of Things scenarios.
In some alternative embodiments, the target field includes a segment table offset field, a packet header section, and an original file size field.
In some alternative embodiments, the determination of whether there is a missing information field in the target field is made by:
if the information corresponding to the segment table offset field does not exist in the target shelled file or the information corresponding to the segment table offset field is not a first preset value, determining that the segment table offset field is an information missing field; and
if the information corresponding to the packet header section does not exist in the target shelled file or the ending format of the information corresponding to the packet header section is wrong, determining the packet header section as an information missing field; and
if the information corresponding to the original file size field does not exist in the target shelled file or the information corresponding to the original file size field is wrong, determining that the original file size field is an information missing field.
In the above scheme, the characteristics of each target field are used to determine precisely whether that target field is an information missing field that needs to be reconstructed (filled with information).
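The three checks above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the field names, the `find_missing_fields` helper and every concrete validation rule are hypothetical, because the source does not state the first preset value, the expected ending format of the packet header section, or what makes an original file size wrong.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical names for the three target fields; not taken from the source.
TARGET_FIELDS = ("segment_table_offset", "packet_header", "original_file_size")

# Hypothetical trailer standing in for the unspecified "ending format".
EXPECTED_TRAILER = b"\xaa\x55"


def find_missing_fields(
    fields: Dict[str, Optional[bytes]],
    validators: Dict[str, Callable[[bytes], bool]],
) -> List[str]:
    """Return the target fields whose information is absent or fails its check."""
    missing = []
    for name in TARGET_FIELDS:
        value = fields.get(name)
        # A field counts as "information missing" if it has no content
        # or if its content does not pass the field-specific rule.
        if value is None or not validators[name](value):
            missing.append(name)
    return missing


# Example rules; each one is an assumption made for this sketch.
example_validators: Dict[str, Callable[[bytes], bool]] = {
    # Assumed first preset value: every byte is 0x00.
    "segment_table_offset": lambda v: set(v) == {0},
    # Assumed ending format: the section ends with EXPECTED_TRAILER.
    "packet_header": lambda v: v.endswith(EXPECTED_TRAILER),
    # Assumed sanity rule: a little-endian size greater than zero.
    "original_file_size": lambda v: int.from_bytes(v, "little") > 0,
}
```

Keeping the rules in a dictionary of callables lets each target field carry its own criterion, which mirrors the per-field conditions listed above.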
In some optional embodiments, the filling the information missing field in the target shelled file includes:
if the segment table offset field is an information missing field, acquiring the program bit number from the target shelled file, and determining the filling quantity corresponding to the program bit number;
and filling the segment table offset field with the filling quantity of first preset values.
In the above scheme, the segment table offset field is related to the file type, and for the target shelled file it corresponds to a specific value; that is, the value (corresponding information) of a normal segment table offset field should consist of first preset values, and the number of first preset values corresponds to the program bit number. By acquiring the program bit number from the target shelled file and determining the corresponding filling quantity (that is, how many first preset values need to be filled), the segment table offset field can be filled with that number of first preset values and thus reconstructed accurately.
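A minimal sketch of this filling step is given below. It rests on assumptions the source does not confirm: that the "segment table offset field" corresponds to the ELF header's `e_shoff` field, that the "program bit number" is the ELF class byte (32-bit or 64-bit), and that the first preset value is the byte 0x00. The function name `fill_segment_table_offset` is hypothetical.

```python
ELF_MAGIC = b"\x7fELF"

# Standard ELF header layout: EI_CLASS is byte 4 of e_ident;
# e_shoff sits at offset 32 (4 bytes) in ELF32 and offset 40 (8 bytes) in ELF64.
E_SHOFF_LAYOUT = {
    1: (32, 4),  # ELFCLASS32: (start offset, filling quantity)
    2: (40, 8),  # ELFCLASS64
}


def fill_segment_table_offset(data: bytes, preset: int = 0x00) -> bytes:
    """Overwrite the assumed segment table offset field with preset bytes."""
    if data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    layout = E_SHOFF_LAYOUT.get(data[4])
    if layout is None:
        raise ValueError("unknown ELF class")
    start, count = layout
    if len(data) < start + count:
        raise ValueError("file too short to hold the field")
    buf = bytearray(data)
    # The filling quantity (4 or 8) follows from the program bit number.
    buf[start:start + count] = bytes([preset]) * count
    return bytes(buf)
```

The function returns a new byte string instead of editing a file in place, so a caller can compare the reconstructed content with the original before writing anything to disk.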
In some optional embodiments, the filling the information missing field in the target shelled file includes:
if the packet header section is an information missing field, acquiring a valid bit pattern from the target shelled file, and determining a preset packet header template corresponding to the valid bit pattern; the preset packet header template comprises a third preset value corresponding to each first field in the packet header section and a blank value corresponding to each second field in the packet header section;
determining a target bit corresponding to each second field based on an offset bit and the initial bit corresponding to each second field; wherein the offset bit is determined based on a product of a second preset value and a target node number, the target node number being obtained from the target shelled file;
acquiring, based on the target bit corresponding to each second field, a filling value corresponding to each second field from the target shelled file;
and replacing the blank value corresponding to each second field in the preset packet header template with a corresponding filling value to obtain packet header information, and filling the packet header information in the packet header section.
In the above scheme, the packet header section comprises a plurality of fields: first fields, which have corresponding fixed values, and second fields, which do not. A corresponding preset packet header template is set for each valid bit pattern; the template holds the third preset value for each first field and a blank value for each second field. Because compression and shelling shift positions, the content corresponding to a field of the file content part is displaced from its initial bit by a certain number of bits (the offset bit). The target bit corresponding to each second field is determined based on the offset bit and the initial bit corresponding to that second field, and the corresponding content is located accurately in the target shelled file based on the target bit. The first fields in the preset packet header template need no filling; once the filling value corresponding to each second field has been determined and filled in, the complete packet header information corresponding to the packet header section is obtained.
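The template mechanism can be illustrated with a short sketch. It is not the patent's implementation: `TemplateField` and `build_packet_header` are hypothetical names, positions are treated as byte positions (an assumption, since the translated text speaks of "bits"), and the actual template contents for each valid bit pattern are not given in the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TemplateField:
    name: str
    size: int                          # field width in bytes (assumed unit)
    fixed: Optional[bytes] = None      # "first field": its third preset value
    initial_pos: Optional[int] = None  # "second field": position before the shift


def build_packet_header(data: bytes, template: List[TemplateField], offset: int) -> bytes:
    """Assemble packet header information from a template and the shelled file."""
    out = bytearray()
    for item in template:
        if item.fixed is not None:
            # First field: the fixed value is already in the template.
            if len(item.fixed) != item.size:
                raise ValueError(f"fixed value of {item.name} has the wrong size")
            out += item.fixed
            continue
        if item.initial_pos is None:
            raise ValueError(f"{item.name} has neither a fixed value nor a position")
        # Second field: target position = initial position + offset.
        start = item.initial_pos + offset
        chunk = data[start:start + item.size]
        if len(chunk) != item.size:
            raise ValueError(f"{item.name} lies outside the file")
        out += chunk
    return bytes(out)
```

Here `offset` is passed in by the caller; under the scheme above it would be the product of the second preset value and the target node number.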
In some optional embodiments, the filling the information missing field in the target shelled file includes:
if the original file size field is an information missing field, determining a target bit corresponding to the original file size field based on an offset bit and an initial bit corresponding to the original file size field; wherein the offset bit is determined based on a product of a second preset value and a target node number, the target node number being obtained from the target shelled file;
and acquiring original file size information from the target shelled file based on the target bit corresponding to the original file size field, and filling the original file size information in the original file size field.
In the above scheme, because compression and shelling shift positions, the content corresponding to a field of the file content part is displaced from its initial bit by a certain number of bits (the offset bit). The target bit corresponding to the original file size field is determined based on the offset bit and the initial bit corresponding to the original file size field, and the corresponding content is located accurately in the target shelled file based on the target bit, so that the original file size field is reconstructed accurately.
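The position arithmetic described here can be sketched as below. The helper names, the 4-byte field width, the destination position `dest_pos` and the use of byte positions are all assumptions for illustration; the source gives only the rule that the offset is the product of the second preset value and the target node number. One plausible reading, not confirmed by the source, is that these correspond to the size and count of ELF program header entries.

```python
def target_position(initial_pos: int, second_preset: int, node_count: int) -> int:
    """Initial position shifted by the offset (second preset value x node number)."""
    return initial_pos + second_preset * node_count


def restore_original_size(
    data: bytes,
    initial_pos: int,
    second_preset: int,
    node_count: int,
    dest_pos: int,
    width: int = 4,
) -> bytes:
    """Copy the original file size found at the target position into dest_pos."""
    src = target_position(initial_pos, second_preset, node_count)
    value = data[src:src + width]
    if len(value) != width:
        raise ValueError("target position lies outside the file")
    if dest_pos < 0 or dest_pos + width > len(data):
        raise ValueError("destination field lies outside the file")
    buf = bytearray(data)
    buf[dest_pos:dest_pos + width] = value
    return bytes(buf)
```

For example, with an initial position of 6, a second preset value of 32 and two nodes, the size information would be read from position 70.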
In some alternative embodiments, the determination of whether the file to be processed is a target shelled file is made by:
and if the value of any preset field in the file header of the file to be processed is a corresponding third preset value and/or the file to be processed contains a preset section, determining that the file to be processed is the target shelled file.
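A simple form of this identification might look like the sketch below. It assumes, without confirmation from the source, that the preset content to look for is the `UPX!` marker that standard UPX writes into packed files; variants may alter this marker, so a practical detector would likely need additional signals. The function name is hypothetical.

```python
ELF_MAGIC = b"\x7fELF"
UPX_MARKER = b"UPX!"  # assumed stand-in for the preset field value / preset section


def is_target_shelled_file(data: bytes) -> bool:
    """Heuristic check: an ELF file that contains the assumed UPX marker."""
    return data.startswith(ELF_MAGIC) and UPX_MARKER in data
```

Files that fail this check would simply be skipped, which is what makes the later steps more targeted.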
In some alternative embodiments, the method further comprises:
and if the target field does not have the information missing field, unshelling the target shelled file based on the preset unshelling mode to obtain an original file.
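Taken together, the first aspect describes a small decision flow, sketched below under stated assumptions. The function `unshell_file` and its four callable parameters are hypothetical; they stand in for the identification, checking, filling and unshelling steps, whose concrete implementations the source leaves to the embodiments.

```python
from typing import Callable, List, Optional


def unshell_file(
    data: bytes,
    is_target: Callable[[bytes], bool],
    find_missing: Callable[[bytes], List[str]],
    fill: Callable[[bytes, List[str]], bytes],
    unpack: Callable[[bytes], bytes],
) -> Optional[bytes]:
    """Return the original file, or None if the input is not a target shelled file."""
    if not is_target(data):
        # Not shelled in the specific mode: the later steps are skipped.
        return None
    missing = find_missing(data)
    if missing:
        # Variant shelled file: rebuild the missing fields first.
        data = fill(data, missing)
    # Either the untouched target shelled file or the reconstructed one.
    return unpack(data)
```

Passing the steps in as callables keeps the flow independent of any particular shelling format.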
In a second aspect, an embodiment of the present application provides a device for unshelling malicious files, where the device includes:
the field determining module is used for determining a target field from a target shelled file aiming at a file to be processed in any target format if the file to be processed is the target shelled file; wherein the target field is a field required for unshelling;
the reconstruction module is used for filling the information missing field in the target shelled file if the information missing field exists in the target field, so as to obtain a reconstructed shelled file;
and the unshelling module is used for unshelling the reconstructed shelled file based on a preset unshelling mode to obtain an original file.
In some alternative embodiments, the target format is ELF and the target shelled file is a file shelled by UPX.
In some alternative embodiments, the target field includes a segment table offset field, a header segment, and an original file size field.
In some optional embodiments, the reconstruction module is further configured to determine whether there is a missing information field in the target field by:
if the information corresponding to the segment table offset field does not exist in the target shelled file or the information corresponding to the segment table offset field is not a first preset value, determining that the segment table offset field is an information missing field; and
if the information corresponding to the packet header section does not exist in the target shelled file or the ending format of the information corresponding to the packet header section is wrong, determining the packet header section as an information missing field; and
if the information corresponding to the original file size field does not exist in the target shelled file or the information corresponding to the original file size field is wrong, determining that the original file size field is an information missing field.
In some optional embodiments, the reconstruction module is specifically configured to:
if the segment table offset field is an information missing field, acquiring the program bit number from the target shelled file, and determining the filling quantity corresponding to the program bit number;
and filling the segment table offset field with the filling quantity of first preset values.
In some optional embodiments, the reconstruction module is specifically configured to:
if the packet header section is an information missing field, acquiring a valid bit pattern from the target shelled file, and determining a preset packet header template corresponding to the valid bit pattern; the preset packet header template comprises a third preset value corresponding to each first field in the packet header section and a blank value corresponding to each second field in the packet header section;
determining a target bit corresponding to each second field based on an offset bit and the initial bit corresponding to each second field; wherein the offset bit is determined based on a product of a second preset value and a target node number, the target node number being obtained from the target shelled file;
based on the target bit corresponding to each second field, acquiring a filling value corresponding to each second field from the target shelled file;
and replacing the blank value corresponding to each second field in the preset packet header template with a corresponding filling value to obtain packet header information, and filling the packet header information in the packet header section.
In some optional embodiments, the reconstruction module is specifically configured to:
if the original file size field is an information missing field, determining a target bit corresponding to the original file size field based on an offset bit and an initial bit corresponding to the original file size field; wherein the offset bit is determined based on a product of a second preset value and a target node number, the target node number being obtained from the target shelled file;
and acquiring original file size information from the target shelled file based on a target bit corresponding to the original file size field, and filling the original file size information in the original file size field.
In some optional embodiments, the field determining module is further configured to determine whether the file to be processed is a target shelled file by:
and if the value of any preset field in the file header of the file to be processed is a corresponding third preset value and/or the file to be processed contains a preset section, determining that the file to be processed is the target shelled file.
In some alternative embodiments, the unshelling module is further configured to:
and if the target field does not have the information missing field, unshelling the target shelled file based on the preset unshelling mode to obtain an original file.
In a third aspect, an embodiment of the present application provides an electronic device, including at least one processor and at least one memory, where the memory stores a computer program, and when the program is executed by the processor, causes the processor to execute the method for unshelling a malicious file according to any one of the first aspects.
In a fourth aspect, an embodiment of the present application provides a computer readable storage medium storing a computer program executable by a processor, which when run on the processor causes the processor to perform the method for unshelling malicious files according to any one of the first aspects above.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow chart of a first method for unshelling malicious files according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a target shelled file according to an embodiment of the present application;
Fig. 3 is a flow chart of a second method for unshelling malicious files according to an embodiment of the present application;
Fig. 4 is a flow chart of a third method for unshelling malicious files according to an embodiment of the present application;
Fig. 5 is a flow chart of a fourth method for unshelling malicious files according to an embodiment of the present application;
Fig. 6 is a flow chart of a fifth method for unshelling malicious files according to an embodiment of the present application;
Fig. 7 is a flow chart of a sixth method for unshelling malicious files according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a device for unshelling malicious files according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.
The term "malicious file", also known as internet of things malicious program/code/software, refers to a binary executable running on Linux (an operating system) that, for the purpose of illegal activity, destroys the normal operation of a computer network.
In the description of the present application, it should be noted that, unless explicitly stated and limited otherwise, the term "connected" should be interpreted broadly, and for example, it may be directly connected, or it may be indirectly connected through an intermediate medium, or it may be communication between two devices. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances.
The internet of things connects the equipment with the internet, and is a network for realizing intelligent identification and management. With the wide application of the internet of things equipment, the internet of things malicious software is spread through a network to infect and control the internet of things equipment, so that the equipment and the network and the system connected with the equipment are threatened.
In order to cope with the threat of the internet of things malicious software, a related organization accumulates a large amount of security event data, wherein the security event data also comprises a large amount of internet of things malicious software samples, and based on the data, tracking and tracing can be performed on the event. Meanwhile, analysis of the malicious software samples is also a necessary basis for threat event analysis and restoration and emergency disposal work.
Malware shelling (malware packing) is a technique that compresses and encrypts malware, making the shelled malicious files more difficult for security software to detect and analyze.
During malware shelling, program developers may use different compression and encryption algorithms to protect their programs. These algorithms compress and obfuscate the code of the malware, making it difficult for security software to detect. Some shelling techniques also decompress and execute the malicious code dynamically, which makes detection even harder.
The purpose of malware shelling is to bypass detection and analysis by security software. Security software typically scans the binary code of a program to detect whether it contains malicious code. If the malware has been shelled, however, its code becomes more complex and harder to identify, so the security software has difficulty detecting the malicious code in it. Common shelling methods include compression shelling, virtual machine shelling, encryption shelling and the like.
Malware unshelling (malware unpacking) refers to breaking the shell of the malware, thereby restoring the original code of the malware. In the analysis and research of malware, shelled malicious files need to be processed by unshelling.
In the related art, the original version of the shelling tool is used to directly unshell shelled malicious files in the Internet of Things, and static analysis is performed on them; that is, the shelled malicious file is analyzed and cracked without running the executable file. Common static unshelling methods include techniques such as manual reverse analysis, disassembly, code debugging and memory mapping.
In practice, in order to bypass detection and analysis by security software, the shelling mode may be varied so that the contents of some fields in the resulting variant shelled file are changed, and the original shelling tool cannot be used to unshell the variant shelled file directly.
In still other embodiments, tools such as disassemblers, static analysis tools and debuggers may be used to analyze and reverse the shelled file to obtain the original file. However, this approach places high technical demands on the personnel involved, has low processing efficiency, and is easily hindered by the protection mechanisms of the shell, such as anti-disassembly.
In view of this, an embodiment of the present application provides a method, an apparatus, an electronic device, and a storage medium for unshelling a malicious file, where the method includes: for a file to be processed in any target format, if the file to be processed is a target shelled file, determining a target field from the target shelled file, wherein the target field is a field required for unshelling; if an information missing field exists among the target fields, filling the information missing field in the target shelled file to obtain a reconstructed shelled file; and unshelling the reconstructed shelled file based on a preset unshelling mode to obtain an original file.
After the file to be processed is obtained, the scheme first determines whether it is a target shelled file, that is, a file shelled in a specific shelling mode, and executes the subsequent steps only if it is, so that the unshelling is more targeted. Some fields of the target shelled file (the target fields) are needed during unshelling; if the contents of these fields have been changed (for example, are missing or wrong), the target shelled file is a variant shelled file and cannot be unshelled directly. Therefore, when an information missing field exists among the target fields, the information missing field is filled in the target shelled file to obtain a reconstructed shelled file in which all target fields are complete. The reconstructed shelled file can then be unshelled based on the preset unshelling mode to restore the original file of the variant shelled file. In this way variant shelled files can be unshelled, and the unshelling success rate for malicious files in the Internet of Things field is improved.
In addition, the embodiment unshells directly with a preset unshelling mode (such as the original version of the shelling tool), requires no manual intervention, and has high processing efficiency.
The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems with reference to the drawings and specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
Fig. 1 is a flow chart of a first method for unshelling malicious files according to an embodiment of the present application, as shown in fig. 1, including the following steps:
Step S101: for a file to be processed in any target format, if the file to be processed is a target shelled file, determining a target field from the target shelled file.
Wherein the target field is a field required for unshelling.
In this embodiment, after the file to be processed is obtained, it is determined whether the file is a target shelled file, that is, a file shelled in a specific shelling mode, and the subsequent steps are executed only for target shelled files, which improves processing efficiency and reduces unshelling failures.
In practice, the shelling mode may be varied in order to bypass detection and analysis by security software. Some fields of the target shelled file (the target fields) are needed during unshelling, and if the contents of these fields have been changed (for example, are missing or wrong), unshelling is affected; based on this, the present embodiment needs to identify the target fields in the target shelled file.
Step S102: if an information missing field exists among the target fields, filling the information missing field in the target shelled file to obtain a reconstructed shelled file.
As described above, some fields (the target fields) are needed during unshelling, and if the contents of these fields have been changed (for example, are missing or wrong), unshelling is affected;
based on this, after the target fields in the target shelled file are identified, it is further confirmed whether an information missing field exists among them, that is, whether the contents of the target fields have been changed (for example, are missing or wrong);
if an information missing field exists among the target fields, the target shelled file cannot be unshelled directly, and the embodiment fills in the information of the information missing field so that its content is complete.
Step S103: Unshell the reconstructed shelled file based on a preset unshelling mode to obtain an original file.
In this embodiment, filling the information missing fields whose content was changed makes all target fields in the resulting reconstructed shelled file complete, so the reconstructed shelled file can be unshelled based on a preset unshelling mode (for example, with the original shelling tool) and the original file can be restored.
Information and features can then be extracted from the original file (the original software) to identify whether it is malware, and the malware can be used as a sample for analysis. The original file is the file before shelling.
Illustratively, the following command is entered on the command line or in a script: upx -d [file_name], where [file_name] is the name of the reconstructed shelled file and the -d parameter indicates that the file is to be decompressed.
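The command above can also be driven from a script. The following is a minimal sketch, assuming the upx executable is installed and on the PATH; the function names build_upx_command and unshell_with_upx are illustrative and do not come from the application.

```python
import shutil
import subprocess
from typing import List


def build_upx_command(file_name: str) -> List[str]:
    """Return the argument list equivalent to `upx -d <file_name>`."""
    return ["upx", "-d", file_name]


def unshell_with_upx(file_name: str) -> bool:
    """Decompress the reconstructed shelled file in place.

    Returns True if upx exits with status 0, False if upx is not
    installed or reports a failure.
    """
    if shutil.which("upx") is None:  # upx is not available on this machine
        return False
    result = subprocess.run(build_upx_command(file_name), capture_output=True)
    return result.returncode == 0
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the sample's file name contains spaces or shell metacharacters.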
In this scheme, after the file to be processed is obtained, it is determined whether the file is a target shelled file shelled in a specific shelling mode, and the subsequent steps are executed only if it is, which improves processing efficiency. Some fields in the target shelled file (the target fields) are needed during unshelling; if their contents have been changed (for example, missing or incorrect), the target shelled file is a variant shelled file and cannot be unshelled directly. Therefore, when an information missing field exists among the target fields, the information missing field is filled in the target shelled file to obtain a reconstructed shelled file in which all target fields are complete. The reconstructed shelled file can then be unshelled based on a preset unshelling mode to restore the original file. In this way, variant shelled files can be unshelled, and the unshelling success rate for malicious files in the Internet of Things field is improved.
In some alternative embodiments, the target format is ELF and the target shelled file is a file shelled by UPX.
ELF (Executable and Linkable Format) is a common binary executable file format, mainly used for executable files and shared libraries on Unix/Linux. Because ELF files have a degree of portability and can run on different operating systems and processor architectures, most malware in the Internet of Things exists in the form of ELF files.
Based on this, this embodiment takes ELF files as files to be processed; that is, when an ELF file is received, it is used as the file to be processed and its shelling mode is then determined.
Because Internet of Things devices generally have limited resources, malware is usually protected with a lightweight shelling mode, and most shelling is implemented with UPX and its variants.
Based on this, this embodiment identifies whether the ELF file is a file shelled by UPX (standard UPX shelling or variant shelling) and performs the subsequent target field identification, filling, and unshelling in combination with the characteristics of UPX shelling.
In this scheme, because ELF files have a degree of portability and can run on different operating systems and processor architectures, most malware in the Internet of Things exists in the form of ELF files, so taking ELF files as the files to be processed improves processing efficiency. Internet of Things malware generally uses a lightweight shelling mode, mostly implemented with UPX and its variants. Identifying whether the ELF file is a UPX shelled file and performing the subsequent target field identification, filling, and unshelling in combination with the characteristics of UPX shelling makes the method suitable for unshelling requirements in Internet of Things scenarios.
Fig. 2 is a schematic structural diagram of a target shelled file according to this embodiment. The file includes an ELF header, i_info (original file basic attribute segment), p_info (original file package attribute segment), b_info (compressed block basic attribute segment), b_data (compressed code block), a packet header segment, and the like.
Illustratively, the ELF header includes a file architecture field (the first 5 bytes), a valid bit pattern field (the 6th byte), a segment table offset field (e_shoff), a file type field (e_type), an architecture field (e_machine), and the like:
the file architecture field identifies the program bit number of the file, such as a 32-bit program or a 64-bit program;
the valid bit pattern field identifies the valid bit pattern of the file, that is, whether the file is most significant bit (Most Significant Bit, MSB) or least significant bit (Least Significant Bit, LSB);
e_shoff represents the offset (in bytes) of the segment table in the file;
e_type identifies the type of the file;
e_machine indicates the architecture required to run the program;
e_shoff, e_type, and e_machine are related to the file type and can be used to identify specific files such as the target shelled file.
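The header fields listed above can be read directly from the raw file bytes. The following is a minimal sketch, assuming the standard ELF layout (class in the 5th byte, data encoding in the 6th byte, e_type and e_machine at 0x10 and 0x12, e_shoff at 0x20 for 32-bit files and 0x28 for 64-bit files); the function name parse_elf_header and the returned dictionary keys are illustrative.

```python
import struct

ELFCLASS32, ELFCLASS64 = 1, 2    # values of the 5th byte (file architecture field)
ELFDATA2LSB, ELFDATA2MSB = 1, 2  # values of the 6th byte (valid bit pattern field)


def parse_elf_header(data: bytes) -> dict:
    """Read the ELF header fields named in the text from raw file bytes."""
    if len(data) < 0x34 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    if ei_class not in (ELFCLASS32, ELFCLASS64):
        raise ValueError("unknown file architecture field")
    if ei_data not in (ELFDATA2LSB, ELFDATA2MSB):
        raise ValueError("unknown valid bit pattern field")
    endian = "<" if ei_data == ELFDATA2LSB else ">"
    e_type, e_machine = struct.unpack_from(endian + "HH", data, 0x10)
    if ei_class == ELFCLASS32:
        (e_shoff,) = struct.unpack_from(endian + "I", data, 0x20)
    else:
        (e_shoff,) = struct.unpack_from(endian + "Q", data, 0x28)
    return {
        "bits": 32 if ei_class == ELFCLASS32 else 64,
        "mode": "LSB" if ei_data == ELFDATA2LSB else "MSB",
        "e_type": e_type,
        "e_machine": e_machine,
        "e_shoff": e_shoff,
    }
```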
Illustratively, i_info corresponds to the basic attribute information of the original file, including a checksum field, a magic word field, a loader size field, version (program version field), and format (executable file type field).
Illustratively, p_info corresponds to the package attribute information of the original file, including progid (program header identification field), filesize (file size field), and blocksize (original block size field).
Illustratively, the packet header segment contains some first fields with fixed values and some second fields without fixed values.
The structure of the target shelled file and the fields included in each segment are exemplary; in implementation there may be more or fewer segments, and each segment may include other fields, which is not limited in this embodiment.
In some alternative embodiments, it may be determined whether the file to be processed is a target shelled file by, but not limited to:
If the value of any preset field in the file header of the file to be processed is the corresponding third preset value, and/or the file to be processed contains a preset section, the file to be processed is determined to be the target shelled file.
In practice, one or more preset fields may be set, and if there are multiple preset fields, the respective third preset values are different.
Taking the above fig. 2 as an example, a file to be processed is determined to be a target shelled file (UPX shelled) if one or more of the following are satisfied:
1. the value of e_type in the ELF header is ET_EXEC;
2. the value of e_machine in the ELF header is either EM_386 (32-bit program) or EM_X86_64 (64-bit program);
3. the value of e_shoff in ELF header is 0;
4. there is a section (preset section) named "UPX" in the ELF file.
The above determination conditions are merely exemplary, and in implementation, other fields that can characterize the shelled feature may be referred to, and will not be described herein.
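The four example conditions can be sketched as a simple check over the raw bytes. This is an illustration under stated assumptions, not the application's implementation: the constants ET_EXEC, EM_386, and EM_X86_64 use their standard ELF values, the preset section is approximated by searching for the byte string "UPX" anywhere in the file, and the min_hits parameter (how many conditions must hold) is a hypothetical knob that defaults to the "one or more" rule stated above.

```python
import struct

ET_EXEC = 2                # standard ELF value of e_type for executables
EM_386, EM_X86_64 = 3, 62  # standard ELF values of e_machine


def is_target_shelled(data: bytes, min_hits: int = 1) -> bool:
    """Return True if at least `min_hits` of the four example conditions hold."""
    if len(data) < 0x34 or data[:4] != b"\x7fELF":
        return False
    is_64 = data[4] == 2
    endian = "<" if data[5] == 1 else ">"
    e_type, e_machine = struct.unpack_from(endian + "HH", data, 0x10)
    if is_64:
        (e_shoff,) = struct.unpack_from(endian + "Q", data, 0x28)
    else:
        (e_shoff,) = struct.unpack_from(endian + "I", data, 0x20)
    checks = [
        e_type == ET_EXEC,                 # condition 1
        e_machine in (EM_386, EM_X86_64),  # condition 2
        e_shoff == 0,                      # condition 3
        b"UPX" in data,                    # condition 4 (simplified search)
    ]
    return sum(checks) >= min_hits
```

Because conditions 1 and 2 also hold for many unshelled executables, a stricter min_hits value may be preferable in practice; the text leaves this choice open.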
In some alternative embodiments, the target field includes a segment table offset field, a header segment, and an original file size field.
Taking Fig. 2 as an example, the target fields include e_shoff in the ELF header (the segment table offset field), packheader (the packet header segment), and filesize and blocksize in p_info (both of which belong to the original file size field).
Fig. 3 is a flow chart of a second method for unshelling malicious files according to an embodiment of the present application. As shown in Fig. 3, the method includes the following steps:
Step S301: For a file to be processed in any target format, if the file to be processed is a target shelled file, determine a target field from the target shelled file.
The target field comprises a segment table offset field, a packet header segment and an original file size field.
The specific implementation of step S301 may refer to other embodiments, and will not be described herein.
Step S302: If the information corresponding to the segment table offset field does not exist in the target shelled file, or that information is not a first preset value, determine that the segment table offset field is an information missing field. If the information corresponding to the packet header segment does not exist in the target shelled file, or the ending format of that information is wrong, determine that the packet header segment is an information missing field. If the information corresponding to the original file size field does not exist in the target shelled file, or that information is wrong, determine that the original file size field is an information missing field.
In this embodiment, the target fields include a segment table offset field, a packet header segment, and an original file size field, and it is necessary to determine whether each target field is an information missing field.
Since the target fields have different characteristics, whether each target field is an information missing field that requires reconstruction (information filling) must be determined in combination with its characteristics.
For example, the segment table offset field is related to the file type, and for the target shelled file it corresponds to a specific value; that is, the value (corresponding information) of a normal segment table offset field should be a first preset value (for example, 0). If the information corresponding to the segment table offset field does not exist in the target shelled file (missing), or the corresponding information is not the first preset value (error), the segment table offset field is determined to be an information missing field;
the information corresponding to a normal packet header segment has a specific ending format; if the information corresponding to the packet header segment does not exist in the target shelled file, or its ending format is wrong, the packet header segment is determined to be an information missing field;
the information corresponding to a normal original file size field should match the corresponding content in the target shelled file (for how that content is obtained, see the process of filling the original file size field, which is not repeated here); if the information corresponding to the original file size field does not exist in the target shelled file, or that information is wrong, the original file size field is determined to be an information missing field.
Step S303: If an information missing field exists among the target fields, fill the information missing field in the target shelled file to obtain a reconstructed shelled file.
Step S304: Unshell the reconstructed shelled file based on a preset unshelling mode to obtain an original file.
The specific implementation of steps S303 to S304 may refer to other embodiments, and will not be described herein.
In the above scheme, the characteristics of each target field are used to determine precisely whether that target field is an information missing field that needs to be reconstructed (information filling).
For the case in which the segment table offset field is an information missing field, an embodiment of the present application provides a third method for unshelling malicious files. As shown in Fig. 4, the method includes the following steps:
Step S401: For a file to be processed in any target format, if the file to be processed is a target shelled file, determine a target field from the target shelled file.
The target field comprises a segment table offset field, a packet header segment and an original file size field.
The specific implementation of this step S401 may refer to other embodiments, and will not be described herein.
Step S402: If the segment table offset field is an information missing field, obtain the program bit number from the target shelled file and determine the filling quantity corresponding to the program bit number.
As described above, the segment table offset field is related to the file type, and for the target shelled file it corresponds to a specific value; that is, the value (corresponding information) of a normal segment table offset field should be a first preset value (for example, 0), and the number of first preset values corresponds to the program bit number;
based on this, if the segment table offset field is an information missing field, the program bit number needs to be obtained from the target shelled file, and the filling quantity corresponding to the program bit number (that is, how many first preset values need to be filled) needs to be determined.
Step S403: Fill the filling quantity of first preset values into the segment table offset field to obtain a reconstructed shelled file in which the segment table offset field is reconstructed.
After the filling quantity corresponding to the program bit number is determined, that quantity of first preset values can be filled into the segment table offset field, thereby reconstructing the segment table offset field.
The following is a specific example:
the segment table offset field is denoted e_shoff (0x28 to 0x30); if no information corresponding to e_shoff exists, or the information corresponding to e_shoff is not the first preset value (for example, 0), the segment table offset field is determined to be an information missing field;
whether the target shelled file is a 32-bit program or a 64-bit program is determined based on the file architecture field (the first 5 bytes) in the ELF header;
in the case of a 32-bit program, four 0s are filled at positions 0x28 to 0x30; in the case of a 64-bit program, eight 0s are filled at positions 0x28 to 0x30.
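The filling step in this example can be sketched as follows. Note one assumption that differs from the positions quoted above: the sketch uses the standard ELF layout, in which e_shoff occupies 4 bytes starting at 0x20 in a 32-bit file and 8 bytes starting at 0x28 in a 64-bit file. The function name rebuild_e_shoff is illustrative.

```python
def rebuild_e_shoff(data: bytes) -> bytes:
    """Return a copy of the file with the segment table offset field zero-filled."""
    buf = bytearray(data)
    if len(buf) < 0x34 or buf[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if buf[4] == 1:      # 32-bit program: fill four 0s
        start, count = 0x20, 4   # assumed standard ELF32 position of e_shoff
    elif buf[4] == 2:    # 64-bit program: fill eight 0s
        start, count = 0x28, 8   # matches the 0x28 to 0x30 range in the text
    else:
        raise ValueError("unknown file architecture field")
    buf[start:start + count] = bytes(count)  # first preset value (0) repeated
    return bytes(buf)
```

Returning a new bytes object instead of editing the file in place keeps the original sample intact, which is usually desirable when handling malware samples.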
Step S404: Unshell the reconstructed shelled file based on a preset unshelling mode to obtain an original file.
The specific implementation of step S404 may refer to other embodiments, and will not be described herein.
In the above scheme, the segment table offset field is related to the file type, and for the target shelled file it corresponds to a specific value; that is, the value (corresponding information) of a normal segment table offset field should be a first preset value, and the number of first preset values corresponds to the program bit number. By obtaining the program bit number from the target shelled file and determining the corresponding filling quantity (that is, how many first preset values need to be filled), that quantity of first preset values can be filled into the segment table offset field, so the segment table offset field is reconstructed accurately.
For the case in which the packet header segment is an information missing field, an embodiment of the present application provides a fourth method for unshelling malicious files. As shown in Fig. 5, the method includes the following steps:
Step S501: For a file to be processed in any target format, if the file to be processed is a target shelled file, determine a target field from the target shelled file.
The target field comprises a segment table offset field, a packet header segment and an original file size field.
The specific implementation of this step S501 may refer to other embodiments, and will not be described herein.
Step S502: If the packet header segment is an information missing field, obtain the valid bit pattern from the target shelled file and determine the preset packet header template corresponding to the valid bit pattern.
The preset packet header template includes a third preset value corresponding to each first field in the packet header segment and a blank value corresponding to each second field in the packet header segment.
In implementation, the packet header segment includes a plurality of fields: first fields, which have corresponding fixed values, and second fields, which have no fixed values;
based on this, in this embodiment a corresponding preset packet header template is set for each valid bit pattern, and each template contains the third preset values corresponding to the first fields and blank values corresponding to the second fields. The third preset values of different preset packet header templates are not identical. The first fields need no filling; only the values of the second fields need to be filled.
Step S503: Determine the target bit corresponding to each second field based on the offset bit and the initial bit corresponding to each second field.
The offset bit is determined based on the product of a second preset value and a target node number, where the target node number is obtained from the target shelled file.
Because a position offset is generated during compression and shelling, the target shelled file corresponds to an offset bit: the content corresponding to a field of the file content part is not at its initial bit but is shifted by a certain number of bits (the offset bit) from the initial bit;
based on this, this embodiment determines the target bit corresponding to each second field based on the offset bit and the initial bit corresponding to each second field, and the corresponding content can be found in the target shelled file based on the target bit.
Illustratively, the target node number (the number of program headers) is obtained from the ELF header and multiplied by a second preset value (the number of bytes of a single program header) to obtain the offset bit.
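The offset computation can be sketched as below. The text does not state the second preset value; this sketch assumes the standard program header sizes (32 bytes for a 32-bit ELF file, 56 bytes for a 64-bit one) and the standard positions of the program header count e_phnum (0x2C and 0x38). The names PHDR_SIZE, E_PHNUM_POS, and compute_offset_bit are illustrative.

```python
import struct

# Assumed second preset value: bytes per program header, keyed by ELF class.
PHDR_SIZE = {1: 32, 2: 56}
# Standard position of e_phnum (number of program headers), keyed by ELF class.
E_PHNUM_POS = {1: 0x2C, 2: 0x38}


def compute_offset_bit(data: bytes) -> int:
    """Offset bit = number of program headers * size of one program header."""
    if len(data) < 0x34 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = data[4]
    if ei_class not in PHDR_SIZE:
        raise ValueError("unknown file architecture field")
    if len(data) < E_PHNUM_POS[ei_class] + 2:
        raise ValueError("truncated ELF header")
    endian = "<" if data[5] == 1 else ">"
    (e_phnum,) = struct.unpack_from(endian + "H", data, E_PHNUM_POS[ei_class])
    return e_phnum * PHDR_SIZE[ei_class]
```

For example, a 64-bit file with three program headers gives an offset bit of 3 × 56 = 168 under these assumptions.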
Step S504: Based on the target bit corresponding to each second field, obtain the filling value corresponding to each second field from the target shelled file.
In implementation, the target shelled file contains the content of each second field; that is, the preset packet header template has no value for a second field, but the content of that field is located elsewhere in the target shelled file. The content of each second field can therefore be obtained from the target shelled file, based on the target bit corresponding to that field, as the filling value of that field.
Step S505: Replace the blank value corresponding to each second field in the preset packet header template with the corresponding filling value to obtain packet header information, and fill the packet header information into the packet header segment to obtain a reconstructed shelled file in which the packet header segment is reconstructed.
As described above, this embodiment provides a preset packet header template that includes a third preset value corresponding to each first field and a blank value corresponding to each second field. The first fields need no filling, and once the filling value of each second field is determined, the second fields are filled, so that the complete packet header information corresponding to the packet header segment is obtained.
The following is a specific example:
the packet header segment is denoted packheader; if no information corresponding to packheader exists in the target shelled file, or the ending format of the information corresponding to packheader is wrong, the packet header segment is determined to be an information missing field;
whether the valid bit pattern of the target shelled file is MSB or LSB is determined based on the valid bit pattern field (the 6th byte) in the ELF header;
MSB corresponds to a first preset packet header template, such as:
X11 + content[offset bit + 0xA : offset bit + 0xC] + content[offset bit + 0x20 : offset bit + 0x21] + X12 + content[offset bit + 0x10 : offset bit + 0x14] + X13;
LSB corresponds to a second preset packet header template, such as:
X21 + content[offset bit + 0xA : offset bit + 0xC] + content[offset bit + 0x20 : offset bit + 0x21] + X22 + content[offset bit + 0x10 : offset bit + 0x14] + X23;
X11, X12, X13, X21, X22, and X23 are third preset values corresponding to the first fields and are not illustrated here;
content[offset bit + 0xA : offset bit + 0xC] is the value corresponding to the first of the second fields, which characterizes the program version; 0xA:0xC is its initial bit, and [offset bit + 0xA : offset bit + 0xC] is its target bit;
content[offset bit + 0x20 : offset bit + 0x21] is the value corresponding to the second of the second fields, which characterizes the executable file type; 0x20:0x21 is its initial bit, and [offset bit + 0x20 : offset bit + 0x21] is its target bit;
content[offset bit + 0x10 : offset bit + 0x14] is the value corresponding to the third of the second fields, which characterizes the original file size; 0x10:0x14 is its initial bit, and [offset bit + 0x10 : offset bit + 0x14] is its target bit;
after the preset packet header template matching the valid bit pattern is selected, the filling value corresponding to each second field is obtained from the target shelled file based on the target bit corresponding to that field, and the blank value corresponding to each second field in the template is replaced with the corresponding filling value to obtain the packet header information.
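The template filling in this example can be sketched as a concatenation of fixed parts and slices of the file content. The fixed values X11 to X23 are not given in the text, so the byte strings in TEMPLATES below are purely hypothetical placeholders; the function name build_pack_header is also illustrative.

```python
from typing import Dict, Tuple

# Hypothetical placeholders for the third preset values X11..X13 and X21..X23.
TEMPLATES: Dict[str, Tuple[bytes, bytes, bytes]] = {
    "MSB": (b"<X11>", b"<X12>", b"<X13>"),
    "LSB": (b"<X21>", b"<X22>", b"<X23>"),
}


def build_pack_header(content: bytes, offset_bit: int) -> bytes:
    """Fill the three second fields of the template from the file content."""
    if len(content) < offset_bit + 0x21:
        raise ValueError("file too short for the target bits")
    # The 6th byte of the ELF header selects the template (1 = LSB, else MSB).
    mode = "LSB" if content[5] == 1 else "MSB"
    x1, x2, x3 = TEMPLATES[mode]
    version = content[offset_bit + 0xA:offset_bit + 0xC]      # program version
    file_type = content[offset_bit + 0x20:offset_bit + 0x21]  # executable file type
    orig_size = content[offset_bit + 0x10:offset_bit + 0x14]  # original file size
    return x1 + version + file_type + x2 + orig_size + x3
```

The field order in the return statement mirrors the two templates above: version, file type, then original file size, each separated by the fixed parts.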
Step S506: Unshell the reconstructed shelled file based on a preset unshelling mode to obtain an original file.
The specific implementation of step S506 may refer to other embodiments, and will not be described herein.
In this scheme, the packet header segment includes a plurality of fields: first fields with corresponding fixed values and second fields without fixed values. A corresponding preset packet header template is set for each valid bit pattern, and each template contains the third preset values corresponding to the first fields and blank values corresponding to the second fields. Because a position offset is generated during compression and shelling, the content corresponding to a field of the file content part is shifted by a certain number of bits (the offset bit) from its initial bit. The target bit corresponding to each second field is determined based on the offset bit and the initial bit corresponding to that field, and the corresponding content can be found accurately in the target shelled file based on the target bit. The first fields in the preset packet header template need no filling, and the second fields are filled once their filling values are determined, which yields the complete packet header information corresponding to the packet header segment.
For the case in which the original file size field is an information missing field, an embodiment of the present application provides a fifth method for unshelling malicious files. As shown in Fig. 6, the method includes the following steps:
Step S601: For a file to be processed in any target format, if the file to be processed is a target shelled file, determine a target field from the target shelled file.
The target field comprises a segment table offset field, a packet header segment and an original file size field.
The specific implementation of this step S601 may refer to other embodiments, and will not be described herein.
Step S602: If the original file size field is an information missing field, determine the target bit corresponding to the original file size field based on the offset bit and the initial bit corresponding to the original file size field.
The offset bit is determined based on the product of a second preset value and a target node number, where the target node number is obtained from the target shelled file.
As described above, because a position offset is generated during compression and shelling, the target shelled file corresponds to an offset bit: the content corresponding to a field of the file content part is not at its initial bit but is shifted by a certain number of bits (the offset bit) from the initial bit;
based on this, this embodiment determines the target bit corresponding to the original file size field based on the offset bit and the initial bit corresponding to the original file size field, and the corresponding content can be found in the target shelled file based on the target bit.
The specific determination of the offset may refer to the above embodiments, and will not be described herein.
Step S603: Obtain original file size information from the target shelled file based on the target bit corresponding to the original file size field, and fill the original file size information into the original file size field to obtain a reconstructed shelled file in which the original file size field is reconstructed.
In implementation, the target shelled file contains the content of the original file size field; that is, the original file size field itself has no corresponding value, but the related content exists elsewhere in the target shelled file. The content of the original file size field can therefore be obtained from the target shelled file, based on the target bit corresponding to the original file size field, as the original file size information.
In some embodiments, the packet header segment includes a second field that characterizes the original file size, such as the third of the second fields in the above example. If the packet header segment is complete, the value of that second field can therefore also be obtained from the packet header segment and used as the original file size information.
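A minimal sketch of this filling step follows. It reuses the initial bit 0x10:0x14 given for the original file size in the packet header example; the positions at which filesize and blocksize are stored are not given in the text, so filesize_pos and blocksize_pos are hypothetical parameters supplied by the caller, and writing the same value to both is an assumption based on both being described as belonging to the original file size field.

```python
def rebuild_original_size(data: bytes, offset_bit: int,
                          filesize_pos: int, blocksize_pos: int) -> bytes:
    """Copy the 4-byte original file size found at the target bit into the
    filesize and blocksize fields (positions are caller-supplied assumptions)."""
    buf = bytearray(data)
    size_info = bytes(buf[offset_bit + 0x10:offset_bit + 0x14])  # target bit
    if len(size_info) != 4:
        raise ValueError("file too short for the target bit")
    if max(filesize_pos, blocksize_pos) + 4 > len(buf):
        raise ValueError("size field position outside the file")
    buf[filesize_pos:filesize_pos + 4] = size_info
    buf[blocksize_pos:blocksize_pos + 4] = size_info
    return bytes(buf)
```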
Step S604: Unshell the reconstructed shelled file based on a preset unshelling mode to obtain an original file.
The specific implementation of step S604 may refer to other embodiments, and will not be described herein.
In the above scheme, because a position offset is generated during compression and shelling, the content corresponding to a field of the file content part is shifted by a certain number of bits (the offset bit) from its initial bit. The target bit corresponding to the original file size field is determined based on the offset bit and the initial bit corresponding to the original file size field, and the corresponding content can be found accurately in the target shelled file based on the target bit, so the original file size field is reconstructed accurately.
It can be understood that in practical application, there may be one or more information missing fields in the target shelled file, and when there are multiple information missing fields, the above-mentioned fig. 4 to fig. 6 may be combined, and the information filling is performed on each information missing field, so as to obtain the reconstructed shelled file.
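Combining the individual repairs can be sketched as a loop over (check, fill) pairs, one pair per target field. The type aliases Check and Fill and the function reconstruct are illustrative names; the concrete checks and fills for each field would be supplied by the caller.

```python
from typing import Callable, List, Tuple

Check = Callable[[bytes], bool]   # returns True if the field is information missing
Fill = Callable[[bytes], bytes]   # returns the file with that field filled


def reconstruct(data: bytes, repairs: List[Tuple[Check, Fill]]) -> Tuple[bytes, int]:
    """Apply each fill whose check reports an information missing field.

    Returns the reconstructed file and the number of fields that were filled;
    a count of 0 means the file can be unshelled directly.
    """
    filled = 0
    for is_missing, fill in repairs:
        if is_missing(data):
            data = fill(data)
            filled += 1
    return data, filled
```

Each check runs against the output of the previous fill, so a later repair sees any fields that earlier repairs have already reconstructed.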
Fig. 7 is a flow chart of a sixth method for unshelling malicious files according to an embodiment of the present application. As shown in Fig. 7, the method includes the following steps:
Step S701: For a file to be processed in any target format, if the file to be processed is a target shelled file, determine a target field from the target shelled file.
The specific implementation of this step S701 may refer to the above embodiment, and will not be described herein.
Step S702: If no information missing field exists among the target fields, unshell the target shelled file based on the preset unshelling mode to obtain an original file.
In implementation, if no information missing field exists among the target fields, the target shelled file is conventionally shelled and needs no reconstruction, so it can be unshelled directly in the preset unshelling mode (for example, with the original shelling tool) to obtain the original file.
As shown in Fig. 8, an embodiment of the present application provides a malicious file unshelling device 800, which includes:
a field determining module 801, configured to determine, for a file to be processed in any target format, a target field from a target shelled file if the file to be processed is the target shelled file; wherein the target field is a field required for unshelling;
a reconstruction module 802, configured to, if an information missing field exists among the target fields, fill the information missing field in the target shelled file to obtain a reconstructed shelled file;
and a shelling module 803, configured to unshell the reconstructed shelled file based on a preset unshelling mode to obtain an original file.
In some alternative embodiments, the target format is ELF and the target shelled file is a file shelled by UPX.
In some alternative embodiments, the target field includes a segment table offset field, a header segment, and an original file size field.
In some optional embodiments, the reconstruction module 802 is further configured to determine whether there is a missing information field in the target field by:
if the information corresponding to the segment table offset field does not exist in the target shelled file or the information corresponding to the segment table offset field is not a first preset value, determining that the segment table offset field is an information missing field; and
if the information corresponding to the packet header section does not exist in the target shelled file or the ending format of the information corresponding to the packet header section is wrong, determining the packet header section as an information missing field; and
if the information corresponding to the original file size field does not exist in the target shelled file or the information corresponding to the original file size field is wrong, determining that the original file size field is an information missing field.
In some alternative embodiments, the reconstruction module 802 is specifically configured to:
if the segment table offset field is an information missing field, obtain the program bit number from the target shelled file and determine the filling quantity corresponding to the program bit number;
and fill the filling quantity of first preset values into the segment table offset field.
In some alternative embodiments, the reconstruction module 802 is specifically configured to:
if the packet header section is an information missing field, acquiring a valid bit pattern from the target shelled file, and determining a preset packet header template corresponding to the valid bit pattern; the preset packet header template comprises a third preset value corresponding to each first field in the packet header section and a blank value corresponding to each second field in the packet header section;
determining a target bit corresponding to each second field based on the offset bit and the initial bit corresponding to each second field; wherein the offset is determined based on a product of a second preset value and a target node number, the target node number being obtained from the target shelled file;
based on the target bit corresponding to each second field, acquiring a filling value corresponding to each second field from the target shelled file;
and replacing the blank value corresponding to each second field in the preset packet header template with a corresponding filling value to obtain packet header information, and filling the packet header information in the packet header section.
In some alternative embodiments, the reconstruction module 802 is specifically configured to:
if the original file size field is an information missing field, determining a target bit corresponding to the original file size field based on an offset bit and an initial bit corresponding to the original file size field; wherein the offset bit is determined based on a product of a second preset value and a target node number, the target node number being obtained from the target shelled file;
and acquiring original file size information from the target shelled file based on a target bit corresponding to the original file size field, and filling the original file size information in the original file size field.
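A minimal sketch of restoring the original file size is given below. It assumes byte positions rather than bit positions, a 4-byte little-endian size value, and a caller-supplied position for the field to be filled. None of these details is stated in the text, and the name restore_original_size is hypothetical.

```python
import struct


def restore_original_size(
    data: bytes,
    initial: int,
    preset: int,
    node_count: int,
    size_field_pos: int,
) -> bytes:
    """Copy the original file size from its backup location into the size field.

    The backup location is initial + preset * node_count (assumed byte offset).
    """
    src = initial + preset * node_count
    (size,) = struct.unpack_from("<I", data, src)  # read 4-byte little-endian value
    patched = bytearray(data)
    struct.pack_into("<I", patched, size_field_pos, size)  # write it into the field
    return bytes(patched)
```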
In some optional embodiments, the field determining module 801 is further configured to determine whether the file to be processed is a target shelled file by:
and if the value of any preset field in the file header of the file to be processed is a corresponding third preset value and/or the file to be processed contains a preset section, determining that the file to be processed is the target shelled file.
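The determination above can be sketched as follows, under stated assumptions: the target format is ELF, the preset section is approximated by searching for the marker bytes "UPX!" that UPX normally embeds in packed files, and the preset header fields are supplied by the caller as a map from byte offset to expected value. The function name is_target_shelled and its parameters are hypothetical.

```python
from typing import Mapping, Optional, Sequence

ELF_MAGIC = b"\x7fELF"


def is_target_shelled(
    data: bytes,
    header_fields: Optional[Mapping[int, bytes]] = None,
    markers: Sequence[bytes] = (b"UPX!",),
) -> bool:
    """Return True if `data` is an ELF file that looks shelled.

    A file qualifies if any preset header field holds its expected value
    and/or any preset marker occurs anywhere in the file.
    """
    if data[:4] != ELF_MAGIC:
        return False  # not in the target format
    header_fields = header_fields or {}
    field_hit = any(
        data[pos:pos + len(expected)] == expected
        for pos, expected in header_fields.items()
    )
    marker_hit = any(marker in data for marker in markers)
    return field_hit or marker_hit
```

A marker search alone is a weak signal, since samples may strip or alter the marker; this is one reason a header-field check is also allowed.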
In some alternative embodiments, the shelling module 803 is further configured to:
and if the target field does not have the information missing field, unshelling the target shelled file based on the preset unshelling mode to obtain an original file.
Because the device corresponds to the method of the embodiments of the present application, and solves the problem on a principle similar to that of the method, the implementation of the device may refer to the implementation of the method; repeated details are omitted.
Based on the same technical concept, the embodiment of the present application further provides an electronic device 900, as shown in fig. 9, including at least one processor 901 and a memory 902 connected to the at least one processor. The specific connection medium between the processor 901 and the memory 902 is not limited in the embodiment of the present application; in fig. 9, the processor 901 and the memory 902 are connected by a bus 903 as an example. The bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 9, but this does not mean that there is only one bus or only one type of bus.
The processor 901 is the control center of the electronic device. It may connect the various parts of the electronic device using various interfaces and lines, and implements data processing by running or executing instructions stored in the memory 902 and calling data stored in the memory 902. Optionally, the processor 901 may include one or more processing units, and may integrate an application processor and a modem processor, wherein the application processor primarily handles the operating system, the user interface, application programs, and the like, and the modem processor primarily handles issuing instructions. It will be appreciated that the modem processor may also not be integrated into the processor 901. In some embodiments, the processor 901 and the memory 902 may be implemented on the same chip; in other embodiments, they may be implemented on separate chips.
The processor 901 may be a general purpose processor such as a Central Processing Unit (CPU), a digital signal processor, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, and may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present application. The general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the method for unshelling malicious files may be executed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
The memory 902, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory 902 may include at least one type of storage medium, for example flash memory, hard disk, multimedia card, card-type memory, random access memory (Random Access Memory, RAM), static random access memory (Static Random Access Memory, SRAM), programmable read-only memory (Programmable Read Only Memory, PROM), read-only memory (Read-Only Memory, ROM), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), magnetic memory, magnetic disk, optical disk, and the like. The memory 902 may be, but is not limited to, any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 902 of embodiments of the present application may also be circuitry or any other device capable of performing a storage function, for storing program instructions and/or data.
In an embodiment of the present application, the memory 902 stores a computer program that, when executed by the processor 901, causes the processor 901 to perform:
for a file to be processed in any target format, if the file to be processed is a target shelled file, determining a target field from the target shelled file; wherein the target field is a field required for unshelling;
if the target field has an information missing field, filling the information missing field in the target shelled file to obtain a reconstructed shelled file;
and based on a preset unshelling mode, unshelling the reconstructed shelled file to obtain an original file.
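The three steps above can be sketched as a small pipeline. This is an illustrative outline only: every callable is supplied by the caller, the names unshell, checks, fillers and unpack are hypothetical, and returning a non-shelled file unchanged is an assumption that the text does not address.

```python
from typing import Callable, Mapping


def unshell(
    data: bytes,
    is_shelled: Callable[[bytes], bool],
    checks: Mapping[str, Callable[[bytes], bool]],
    fillers: Mapping[str, Callable[[bytes], bytes]],
    unpack: Callable[[bytes], bytes],
) -> bytes:
    """Detect, reconstruct if needed, then unshell.

    `checks[name](data)` returns True when the named target field is intact;
    `fillers[name](data)` returns a copy of the file with that field refilled;
    `unpack` stands for the preset unshelling mode (for example, a wrapper
    around an existing unpacker).
    """
    if not is_shelled(data):
        return data  # assumption: non-shelled files are passed through
    missing = [name for name, intact in checks.items() if not intact(data)]
    for name in missing:
        data = fillers[name](data)  # builds the reconstructed shelled file
    return unpack(data)
```

When no field is missing, the loop body never runs and the file goes straight to the unpacker, which matches the case in which the target shelled file is unshelled directly.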
In some alternative embodiments, the target format is ELF and the target shelled file is a file shelled by UPX.
In some alternative embodiments, the target field includes a segment table offset field, a header segment, and an original file size field.
In some alternative embodiments, processor 901 further performs:
if the information corresponding to the segment table offset field does not exist in the target shelled file or the information corresponding to the segment table offset field is not a first preset value, determining that the segment table offset field is an information missing field; and
if the information corresponding to the packet header section does not exist in the target shelled file or the ending format of the information corresponding to the packet header section is wrong, determining the packet header section as an information missing field; and
if the information corresponding to the original file size field does not exist in the target shelled file or the information corresponding to the original file size field is wrong, determining that the original file size field is an information missing field.
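The three determinations above share one pattern: a field is treated as missing if its information is absent or fails a field-specific validity test. A minimal sketch of that pattern follows. The field names, the example validators, and the terminator bytes are all hypothetical, since the text does not state the concrete preset values or what a correct ending format looks like.

```python
from typing import Callable, List, Mapping, Optional

HYPOTHETICAL_HEADER_END = b"\x00\x00"  # assumed terminator, not from the text


def find_missing_fields(
    fields: Mapping[str, Optional[bytes]],
    validators: Mapping[str, Callable[[bytes], bool]],
) -> List[str]:
    """Return the names of fields that are absent (None) or fail validation."""
    missing = []
    for name, is_valid in validators.items():
        value = fields.get(name)
        if value is None or not is_valid(value):
            missing.append(name)
    return missing


EXAMPLE_VALIDATORS = {
    # valid only if every byte equals the (assumed) first preset value 0x00
    "segment_table_offset": lambda v: len(v) > 0 and v == b"\x00" * len(v),
    # valid only if the header ends with the assumed terminator
    "packet_header": lambda v: v.endswith(HYPOTHETICAL_HEADER_END),
    # valid only if the little-endian size is positive
    "original_file_size": lambda v: int.from_bytes(v, "little") > 0,
}
```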
In some alternative embodiments, processor 901 performs:
if the segment table offset field is an information missing field, obtaining program bits from the target shelled file, and determining the filling quantity corresponding to the program bits;
and filling the segment table offset field with the first preset value, repeated the filling quantity of times.
In some alternative embodiments, processor 901 performs:
if the packet header section is an information missing field, acquiring a valid bit pattern from the target shelled file, and determining a preset packet header template corresponding to the valid bit pattern; the preset packet header template comprises a third preset value corresponding to each first field in the packet header section and a blank value corresponding to each second field in the packet header section;
determining a target bit corresponding to each second field based on the offset bit and the initial bit corresponding to each second field; wherein the offset bit is determined based on a product of a second preset value and a target node number, the target node number being obtained from the target shelled file;
based on the target bit corresponding to each second field, acquiring a filling value corresponding to each second field from the target shelled file;
and replacing the blank value corresponding to each second field in the preset packet header template with a corresponding filling value to obtain packet header information, and filling the packet header information in the packet header section.
In some alternative embodiments, processor 901 performs:
if the original file size field is an information missing field, determining a target bit corresponding to the original file size field based on an offset bit and an initial bit corresponding to the original file size field; wherein the offset bit is determined based on a product of a second preset value and a target node number, the target node number being obtained from the target shelled file;
and acquiring original file size information from the target shelled file based on a target bit corresponding to the original file size field, and filling the original file size information in the original file size field.
In some alternative embodiments, processor 901 further performs:
and if the value of any preset field in the file header of the file to be processed is a corresponding third preset value and/or the file to be processed contains a preset section, determining that the file to be processed is the target shelled file.
In some alternative embodiments, processor 901 further performs:
and if the target field does not have the information missing field, unshelling the target shelled file based on the preset unshelling mode to obtain an original file.
Because the electronic device corresponds to the method of the embodiments of the present application, and solves the problem on a principle similar to that of the method, the implementation of the electronic device may refer to the implementation of the method; repeated details are omitted.
Based on the same technical idea, an embodiment of the present application further provides a computer-readable storage medium storing a computer program executable by a processor, which when run on the processor, causes the processor to perform the steps of the above-described method for unshelling malicious files.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (12)

1. A method for unshelling a malicious file, the method comprising:
for a file to be processed in any target format, if the file to be processed is a target shelled file, determining a target field from the target shelled file; wherein the target field is a field required for unshelling;
if the target field has an information missing field, filling the information missing field in the target shelled file to obtain a reconstructed shelled file;
and based on a preset unshelling mode, unshelling the reconstructed shelled file to obtain an original file.
2. The method of claim 1, wherein the target format is an executable and linkable format, ELF, and the target shelled file is a file shelled by an executable program file compressor, UPX.
3. The method of claim 1, wherein the target field comprises a segment table offset field, a header segment, and an original file size field.
4. The method of claim 3, wherein determining whether there is a missing information field in the target field is performed by:
if the information corresponding to the segment table offset field does not exist in the target shelled file or the information corresponding to the segment table offset field is not a first preset value, determining that the segment table offset field is an information missing field; and
if the information corresponding to the packet header section does not exist in the target shelled file or the ending format of the information corresponding to the packet header section is wrong, determining the packet header section as an information missing field; and
if the information corresponding to the original file size field does not exist in the target shelled file or the information corresponding to the original file size field is wrong, determining that the original file size field is an information missing field.
5. The method of claim 3, wherein padding the missing information field in the target shelled file comprises:
if the segment table offset field is an information missing field, obtaining program bits from the target shelled file, and determining the filling quantity corresponding to the program bits;
and filling the segment table offset field with the first preset value, repeated the filling quantity of times.
6. The method of claim 3, wherein padding the missing information field in the target shelled file comprises:
if the packet header section is an information missing field, acquiring a valid bit pattern from the target shelled file, and determining a preset packet header template corresponding to the valid bit pattern; the preset packet header template comprises a third preset value corresponding to each first field in the packet header section and a blank value corresponding to each second field in the packet header section;
determining a target bit corresponding to each second field based on the offset bit and the initial bit corresponding to each second field; wherein the offset bit is determined based on a product of a second preset value and a target node number, the target node number being obtained from the target shelled file;
based on the target bit corresponding to each second field, acquiring a filling value corresponding to each second field from the target shelled file;
and replacing the blank value corresponding to each second field in the preset packet header template with a corresponding filling value to obtain packet header information, and filling the packet header information in the packet header section.
7. The method of claim 3, wherein padding the missing information field in the target shelled file comprises:
if the original file size field is an information missing field, determining a target bit corresponding to the original file size field based on an offset bit and an initial bit corresponding to the original file size field; wherein the offset bit is determined based on a product of a second preset value and a target node number, the target node number being obtained from the target shelled file;
and acquiring original file size information from the target shelled file based on a target bit corresponding to the original file size field, and filling the original file size information in the original file size field.
8. The method of claim 1, wherein determining whether the file to be processed is a target shelled file is performed by:
and if the value of any preset field in the file header of the file to be processed is a corresponding third preset value and/or the file to be processed contains a preset section, determining that the file to be processed is the target shelled file.
9. The method of any one of claims 1-8, further comprising:
and if the target field does not have the information missing field, unshelling the target shelled file based on the preset unshelling mode to obtain an original file.
10. A device for unshelling malicious files, characterized in that it comprises:
a field determining module, configured to, for a file to be processed in any target format, determine a target field from the target shelled file if the file to be processed is a target shelled file; wherein the target field is a field required for unshelling;
a reconstruction module, configured to fill the information missing field in the target shelled file if the target field has an information missing field, so as to obtain a reconstructed shelled file;
and a shelling module, configured to unshell the reconstructed shelled file based on a preset unshelling mode to obtain an original file.
11. An electronic device comprising at least one processor and at least one memory, wherein the memory stores a computer program that, when executed by the processor, causes the processor to perform the method of any of claims 1-9.
12. A computer readable storage medium, characterized in that it stores a computer program executable by a computer, which when run on the computer causes the computer to perform the method according to any one of claims 1 to 9.
CN202310545705.0A 2023-05-15 2023-05-15 Method and device for unshelling malicious files, electronic equipment and storage medium Pending CN116842512A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310545705.0A CN116842512A (en) 2023-05-15 2023-05-15 Method and device for unshelling malicious files, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310545705.0A CN116842512A (en) 2023-05-15 2023-05-15 Method and device for unshelling malicious files, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116842512A true CN116842512A (en) 2023-10-03

Family

ID=88160591

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310545705.0A Pending CN116842512A (en) 2023-05-15 2023-05-15 Method and device for unshelling malicious files, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116842512A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination