CN110708376B - Processing and forwarding system and method for massive compressed files - Google Patents

Processing and forwarding system and method for massive compressed files

Info

Publication number
CN110708376B
CN110708376B CN201910944196.2A CN201910944196A CN110708376B CN 110708376 B CN110708376 B CN 110708376B CN 201910944196 A CN201910944196 A CN 201910944196A CN 110708376 B CN110708376 B CN 110708376B
Authority
CN
China
Prior art keywords
server
data
file
processing
compressed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910944196.2A
Other languages
Chinese (zh)
Other versions
CN110708376A (en)
Inventor
陈宗朗
黄代良
骆武辉
赵丽婷
李健军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Jingyuan Safety Technology Co ltd
Original Assignee
Guangzhou Jingyuan Safety Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Jingyuan Safety Technology Co ltd
Priority to CN201910944196.2A
Publication of CN110708376A
Application granted
Publication of CN110708376B

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04: Protocols for data compression, e.g. ROHC
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/17: Details of further file system functions
    • G06F16/174: Redundancy elimination performed by the file system
    • G06F16/1744: Redundancy elimination performed by the file system using compression, e.g. sparse files
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/50: Network services
    • H04L67/56: Provisioning of proxy services
    • H04L67/565: Conversion or adaptation of application format or content
    • H04L67/5651: Reducing the amount or size of exchanged application data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stored Programmes (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a system for processing and forwarding massive numbers of compressed files. The system comprises acquisition equipment, a CentOS server and a Java processing program, wherein: the acquisition equipment collects device data; the CentOS server has its limit on the number of open files raised; the path at which the CentOS server receives device data points to a tmpfs file system, which receives the compressed data and is mounted for that purpose; and the Java processing program directly reads, processes and forwards the compressed data. The invention deploys a CentOS system on an x86 server and receives and stores the compressed files by raising the system's maximum number of open files and using a tmpfs file system. The Java processing program reads each file directly into an I/O stream, obtains the file content through that stream, processes it and forwards it. This solves the high CPU utilization, poor disk I/O performance and similar hardware problems caused by concurrent processing of massive numbers of small compressed files.

Description

Processing and forwarding system and method for massive compressed files
Technical Field
The invention relates to the field of file compression processing, in particular to a processing and forwarding system and method for massive compressed files.
Background
In existing data processing and forwarding systems, the conventional approach is to use the system's default configuration and mount the data-receiving partition directly on a hard disk partition; each compressed file is then decompressed, the decompressed file is read and processed, and a backup is made. Such a system supports only low concurrency, and the frequent disk I/O overhead easily drives the overall system load too high, so the hardware's performance cannot be fully utilized.
Therefore, existing data processing and forwarding systems need further improvement to raise system concurrency and make fuller use of hardware performance.
Disclosure of Invention
To solve the above technical problems, the present invention provides a system and a method for processing and forwarding massive numbers of compressed files that raise system concurrency and make fuller use of hardware performance.
To achieve this purpose, the invention adopts the following technical solution: a processing and forwarding system for massive compressed files, comprising acquisition equipment, a CentOS server and a Java processing program, wherein:
the acquisition equipment collects device data and sends the collected device data to the CentOS server;
the CentOS server receives the data collected by the acquisition equipment after its limit on the number of open files has been modified;
a tmpfs file system is installed on the CentOS server, the path at which the CentOS server receives device data points to the tmpfs file system, and the tmpfs file system, which receives the compressed data, is mounted using the command mount -t tmpfs -o size=46080m tmpfs $FTP_PATH;
the Java processing program directly reads, processes and forwards the compressed data, and also backs it up at regular intervals.
Preferably, the server runs a CentOS 1810 x86_64 system of the Basic Server type.
Preferably, the server adopts a server of x86 architecture.
A processing and forwarding method for massive compressed files, applied to the processing and forwarding system for massive compressed files described above, is characterized by comprising the following steps:
S1: installing a CentOS 1810 x86_64 system of the Basic Server type on the server;
S2: after the system is installed, modifying the system's kernel limits so that it supports high TCP concurrency, and then modifying the maximum file number limit to raise the number of concurrent FTP connections that can be handled;
S3: after the system kernel limits are modified, installing an FTP server for receiving the device data from the acquisition equipment;
S4: after the FTP server is installed, mounting a tmpfs file system with the command mount -t tmpfs -o size=46080m tmpfs $FTP_PATH, and pointing the path at which FTP receives compressed data to the tmpfs file system;
S5: installing a Java 8 environment and starting the Java processing program, which reads the data in the tmpfs file system; the program reads the compressed data directly using Java's built-in API (application programming interface), as a byte array and without decompression;
S6: processing the read content and forwarding it according to the applicable business process;
S7: after forwarding is complete, cleaning up the processed compressed files using the Java processing program.
Preferably, the server adopts a server of x86 architecture.
Preferably, modifying the kernel limits of the system means modifying the /etc/security/limits.conf file: the default soft and hard limits are 1024, and the entries soft nofile 65536 and hard nofile 65536 are added.
Preferably, modifying the maximum file number limit raises both the soft limit and the hard limit of the maximum number of processes available to a single user to 65536, lifting the concurrency limit imposed at the system security level.
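The limit entries described above can be sketched as follows. This is a minimal illustration rather than part of the patent: it renders the soft and hard nofile (open-file) entries in the domain, type, item, value layout used by /etc/security/limits.conf. The "*" domain (all users) and the class and method names are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

final class LimitsConfSketch {
    /**
     * Renders the soft and hard open-file ("nofile") entries for one domain.
     * The domain (for example "*" for all users) is an assumption; the source
     * text only gives the item name and the value 65536.
     */
    static List<String> nofileEntries(String domain, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        return Arrays.asList(
                domain + " soft nofile " + limit,
                domain + " hard nofile " + limit);
    }

    public static void main(String[] args) {
        // Prints the two lines that would be appended to /etc/security/limits.conf.
        nofileEntries("*", 65536).forEach(System.out::println);
    }
}
```

New limits in limits.conf normally apply only to sessions started after the change, so the FTP server and the Java program would need to be restarted from a fresh login.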
Preferably, the FTP server can be installed either with yum or by compiling from source.
Preferably, the Java processing program is installed in the temporary directory.
Preferably, the compressed data is backed up to the disk periodically.
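A periodic backup of this kind could look like the sketch below. It is an assumption-laden illustration, not the patent's implementation: the class name, the method names and the fixed-rate schedule are all hypothetical. Each run copies the compressed files from the tmpfs receiving directory to a directory on disk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class PeriodicBackupSketch {
    /** Copies every regular file in tmpfsDir to backupDir and returns the number copied. */
    static int backupOnce(Path tmpfsDir, Path backupDir) {
        int copied = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(tmpfsDir)) {
            Files.createDirectories(backupDir);
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    // The file stays compressed; only its bytes are copied to disk.
                    Files.copy(file, backupDir.resolve(file.getFileName()),
                            StandardCopyOption.REPLACE_EXISTING);
                    copied++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return copied;
    }

    /** Runs backupOnce at a fixed interval; the caller shuts the scheduler down. */
    static ScheduledExecutorService schedule(Path tmpfsDir, Path backupDir, long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                backupOnce(tmpfsDir, backupDir);
            } catch (RuntimeException e) {
                // Catch here so that one failed run does not cancel later runs.
                System.err.println("backup run failed: " + e);
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Because the backup is the only step that touches the disk, the interval directly controls how much disk I/O the system incurs.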
The beneficial technical effects of the invention are as follows. The invention deploys a CentOS system on an x86 server and receives and stores the compressed files by raising the system's maximum number of open files and using a tmpfs file system. The Java processing program then reads each compressed file directly into an I/O stream without decompressing it, obtains the file content through that stream, processes it accordingly and forwards it. Because receiving, processing and forwarding of the compressed files all take place in the tmpfs file system, processing performance is greatly improved. This solves the high CPU utilization, poor disk I/O and similar hardware performance problems caused by concurrent processing of massive numbers of small compressed files.
Drawings
FIG. 1 is a schematic diagram of the overall framework of the system of the present invention.
FIG. 2 is a flow chart of the steps of the method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments, but the scope of the present invention is not limited to the following embodiments.
As shown in FIG. 1, to achieve the above object, the invention adopts the following technical solution: a processing and forwarding system for massive compressed files, comprising acquisition equipment, a CentOS server and a Java processing program. CentOS (Community Enterprise Operating System) is an enterprise-grade Linux distribution built from the freely available source code of Red Hat Enterprise Linux. In the system:
the acquisition equipment collects device data and sends the collected device data to the CentOS server;
the CentOS server receives the data collected by the acquisition equipment after its limit on the number of open files has been modified;
a tmpfs file system (a temporary, memory-based file system) is installed on the CentOS server, the path at which the CentOS server receives device data points to the tmpfs file system, and the tmpfs file system, which receives the compressed data, is mounted using the command mount -t tmpfs -o size=46080m tmpfs $FTP_PATH. Here $FTP_PATH is a variable referring to the specific receiving path; the command mounts a tmpfs file system on the $FTP_PATH directory with a size of 46080 MB, which can be adjusted to the actual situation. Mounting is comparable to drive letters under Windows: Linux has no drive letters, so a partition is attached to a directory so that the files on it can be accessed, and the partition's contents then appear under that directory's path. For example, if partition sda2 is mounted at /usr/src, the aaa directory inside sda2 is addressed as /usr/src/aaa; if it is mounted at /var/www instead, the same directory becomes /var/www/aaa. This resembles changing a drive letter under Windows: if D:\aaa has its drive letter changed to E, the directory becomes E:\aaa, but it is the same directory. After mounting, all modifications under the mount point, including copying, deleting and moving, are stored in the mounted partition rather than in the original directory;
the Java processing program directly reads, processes and forwards the compressed data, backs it up at regular intervals, and during forwarding sends the data to the corresponding downstream system according to its content.
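The mount step can be illustrated with a small helper. This is a sketch under stated assumptions: it builds the argument list for mount with the -o size=...m option form that mount accepts for tmpfs, and it checks whether a directory is already backed by tmpfs through the file store type reported by the JVM. The class name, the method names and the /data/ftp path are hypothetical, and actually running the mount command requires root privileges.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

final class TmpfsMountSketch {
    /** Builds the arguments for: mount -t tmpfs -o size=<sizeMb>m tmpfs <ftpPath>. */
    static List<String> mountCommand(String ftpPath, int sizeMb) {
        if (sizeMb <= 0) {
            throw new IllegalArgumentException("sizeMb must be positive");
        }
        return Arrays.asList(
                "mount", "-t", "tmpfs", "-o", "size=" + sizeMb + "m", "tmpfs", ftpPath);
    }

    /** Returns true when the file store holding dir reports the type "tmpfs". */
    static boolean isTmpfs(Path dir) {
        try {
            return "tmpfs".equals(Files.getFileStore(dir).type());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical receiving path; the source refers to it only as $FTP_PATH.
        String ftpPath = args.length > 0 ? args[0] : "/data/ftp";
        // Only prints the command; pass the list to ProcessBuilder to execute it as root.
        System.out.println(String.join(" ", mountCommand(ftpPath, 46080)));
    }
}
```

A tmpfs mount lives in memory and swap, so the 46080 MB size should stay below what the server can spare, and its contents are lost on reboot or unmount.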
Preferably, the server runs a CentOS 1810 x86_64 system of the Basic Server type.
Preferably, the server is of the x86 architecture (x86 is a family of instruction set architectures first introduced by Intel; the name derives from Intel's processor model numbers and denotes a general-purpose computer instruction set).
As shown in fig. 2, a method for processing and forwarding a large amount of compressed files is applied to the system for processing and forwarding a large amount of compressed files, and the method includes the following steps:
S1: installing a CentOS 1810 x86_64 system of the Basic Server type on the server;
S2: after the system is installed, modifying the system's kernel limits so that it supports high TCP concurrency (TCP, the Transmission Control Protocol, is a connection-oriented, reliable, byte-stream-based transport layer protocol defined by IETF RFC 793), and then modifying the maximum file number limit to raise the number of concurrent FTP (File Transfer Protocol) connections that can be handled;
S3: after the system kernel limits are modified, installing an FTP server for receiving the device data from the acquisition equipment;
S4: after the FTP server is installed, mounting a tmpfs file system with the command mount -t tmpfs -o size=46080m tmpfs $FTP_PATH, and pointing the path at which FTP receives compressed data to the tmpfs file system. This addresses the disk I/O bottleneck caused by the high-frequency overhead of FTP writing received files to disk under high concurrency; otherwise, receiving massive numbers of files leaves the system stuck in I/O wait, which slows it down further.
S5: installing a Java 8 environment and starting the Java processing program. The program is installed in a temporary directory and reads from the tmpfs file system, so processing adds no disk I/O overhead. The conventional way to process a compressed file is to decompress it and then read its content again; the Java processing program of the invention instead reads the compressed data directly using Java's built-in application programming interface (API), as a byte array and without decompression, which shortens the processing flow and further improves performance.
S6: processing the read content and forwarding it according to the applicable business process;
S7: after forwarding is complete, cleaning up the processed compressed files using the Java processing program. When a backup is needed, the compressed file is backed up to a local disk for long-term storage. Throughout receiving and forwarding, a compressed file is written to disk, and disk I/O overhead incurred, only when a backup is needed, which effectively improves processing performance compared with the conventional method.
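The read, forward and clean-up steps can be sketched in Java as follows. This is a minimal illustration under assumptions, not the patented program: the Forwarder interface is a hypothetical stand-in for whatever downstream system receives the data, and the class and method names are invented for the example. Files.readAllBytes returns the file's bytes exactly as stored, so the compressed content is never decompressed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class CompressedFileForwarderSketch {
    /** Hypothetical forwarding target; the source does not specify the downstream interface. */
    interface Forwarder {
        void forward(String fileName, byte[] compressedBytes);
    }

    /**
     * Reads every regular file in tmpfsDir as raw bytes (no decompression),
     * hands it to the forwarder, then deletes it. Returns the number processed.
     */
    static int processDirectory(Path tmpfsDir, Forwarder forwarder) {
        int processed = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(tmpfsDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                byte[] compressed = Files.readAllBytes(file);                 // read as a byte array
                forwarder.forward(file.getFileName().toString(), compressed); // process and forward
                Files.delete(file);                                           // clean up after forwarding
                processed++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return processed;
    }
}
```

A production version would also need a way to skip files that the FTP server is still writing, for example by having uploads renamed on completion; the sketch leaves that out.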
The server is of the x86 architecture. Modifying the kernel limits of the system means modifying the /etc/security/limits.conf file: the default soft and hard limits are 1024, and the entries soft nofile 65536 and hard nofile 65536 are added.
Preferably, modifying the maximum file number limit raises both the soft limit and the hard limit of the maximum number of processes available to a single user to 65536, lifting the concurrency limit imposed at the system security level.
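Whether the raised limit has actually reached the Java processing program can be checked from inside the JVM. The sketch below is an illustration under assumptions: it relies on the com.sun.management.UnixOperatingSystemMXBean extension shipped with OpenJDK and Oracle JDK builds on Unix-like systems, and the class name, method names and the 65536 target are chosen for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.OptionalLong;

final class FdLimitCheckSketch {
    /** Returns the maximum number of file descriptors the JVM process may open, if known. */
    static OptionalLong maxFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            long max = ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getMaxFileDescriptorCount();
            return OptionalLong.of(max);
        }
        return OptionalLong.empty(); // not a Unix-like JVM, or the extension is unavailable
    }

    /** True when the observed limit is at least the configured target. */
    static boolean meetsTarget(long actual, long target) {
        return actual >= target;
    }

    public static void main(String[] args) {
        OptionalLong max = maxFileDescriptors();
        if (max.isPresent()) {
            System.out.println("max file descriptors: " + max.getAsLong()
                    + ", meets 65536: " + meetsTarget(max.getAsLong(), 65536));
        } else {
            System.out.println("file descriptor limit not available on this JVM");
        }
    }
}
```

If the reported value is still 1024 after editing limits.conf, the process was most likely started from a session that predates the change.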
Preferably, the FTP server, which receives the data from the acquisition equipment, is installed either with yum or by compiling from source.
The method and system of the invention deploy a CentOS system on an x86 server and receive and store the compressed files by raising the system's maximum number of open files and using a tmpfs file system. The Java processing program reads each compressed file directly into an I/O stream without decompressing it, obtains the file content through that stream, processes it accordingly and forwards it. Because receiving, processing and forwarding of the compressed files all take place in the tmpfs file system, processing performance is greatly improved. This resolves the hardware bottlenecks, such as CPU utilization and disk I/O, caused by concurrent processing of massive numbers of small compressed files.
Compared with other inventions, the distinguishing feature of the invention is that it modifies the system kernel limits and uses a tmpfs file system, which raises the number of concurrent FTP connections the system can handle and pushes the hardware to its performance limit, a characteristic that other data processing and forwarding systems lack.
Variations and modifications to the above-described embodiments may occur to those skilled in the art, which fall within the scope and spirit of the above description. Therefore, the present invention is not limited to the specific embodiments disclosed and described above, and some modifications and variations of the present invention should fall within the scope of the claims of the present invention. Furthermore, although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Claims (10)

1. A processing and forwarding system for massive compressed files, characterized by comprising acquisition equipment, a CentOS server and a Java processing program, wherein:
the acquisition equipment collects device data and sends the collected device data to the CentOS server;
the CentOS server receives the data collected by the acquisition equipment after its limit on the number of open files has been modified;
a tmpfs file system is installed on the CentOS server, the path at which the CentOS server receives device data points to the tmpfs file system, and the tmpfs file system, which receives the compressed data, is mounted using the command mount -t tmpfs -o size=46080m tmpfs $FTP_PATH;
the Java processing program directly reads, processes and forwards the compressed data, and also backs it up at regular intervals.
2. The system for processing and forwarding massive compressed files according to claim 1, wherein the server runs a CentOS 1810 x86_64 system of the Basic Server type.
3. The system for processing and forwarding of mass compressed files according to claim 2, wherein said server is a server of x86 architecture.
4. A method for processing and forwarding massive compressed files, applied to the system for processing and forwarding massive compressed files according to any one of claims 1 to 3, the method comprising the following steps:
S1: installing a CentOS 1810 x86_64 system of the Basic Server type on the server;
S2: after the system is installed, modifying the system's kernel limits so that it supports high TCP concurrency, and then modifying the maximum file number limit to raise the number of concurrent FTP connections that can be handled;
S3: after the system kernel limits are modified, installing an FTP server for receiving the device data from the acquisition equipment;
S4: after the FTP server is installed, mounting a tmpfs file system with the command mount -t tmpfs -o size=46080m tmpfs $FTP_PATH, and pointing the path at which FTP receives device data to the tmpfs file system, which receives the compressed data;
S5: installing a Java 8 environment and starting the Java processing program, wherein the Java processing program reads the data in the tmpfs file system, and reads the compressed data directly using Java's built-in API, as a byte array and without decompression;
S6: processing the read content and forwarding it according to the applicable business process;
S7: after forwarding is complete, cleaning up the processed compressed files using the Java processing program.
5. The method for processing and forwarding the massive compressed files according to claim 4, wherein the server adopts a server of x86 architecture.
6. The method for processing and forwarding massive compressed files according to claim 4, wherein modifying the kernel limits of the system means modifying the /etc/security/limits.conf file: the default soft and hard limits are 1024, and the entries soft nofile 65536 and hard nofile 65536 are added.
7. The method for processing and forwarding massive compressed files according to claim 4, wherein modifying the maximum file number limit raises both the soft limit and the hard limit of the maximum number of processes available to a single user to 65536, lifting the concurrency limit imposed at the system security level.
8. The method for processing and forwarding massive compressed files according to claim 4, wherein the FTP server can be installed either with yum or by compiling from source.
9. The method for processing and forwarding massive compressed files according to claim 4, wherein the Java processing program is installed in the temporary directory.
10. The method as claimed in claim 4, wherein the compressed data is backed up to the disk at regular time.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910944196.2A CN110708376B (en) 2019-09-30 2019-09-30 Processing and forwarding system and method for massive compressed files

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910944196.2A CN110708376B (en) 2019-09-30 2019-09-30 Processing and forwarding system and method for massive compressed files

Publications (2)

Publication Number Publication Date
CN110708376A (en) 2020-01-17
CN110708376B (en) 2020-10-30

Family

ID=69197826

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910944196.2A Active CN110708376B (en) 2019-09-30 2019-09-30 Processing and forwarding system and method for massive compressed files

Country Status (1)

Country Link
CN (1) CN110708376B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101950561A (en) * 2003-06-13 2011-01-19 尼尔森(美国)有限公司 Watermark embedding method and device
US8639787B2 (en) * 2009-06-01 2014-01-28 Oracle International Corporation System and method for creating or reconfiguring a virtual server image for cloud deployment
CN105607964A (en) * 2015-10-30 2016-05-25 浪潮(北京)电子信息产业有限公司 FTP server backup method and apparatus

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101394397A (en) * 2008-01-03 2009-03-25 中国移动通信集团湖北有限公司 Compression storage method capable of remote invoking used for mobile network, system thereof
US10061653B1 (en) * 2013-09-23 2018-08-28 EMC IP Holding Company LLC Method to expose files on top of a virtual volume
CN103744748A (en) * 2014-01-10 2014-04-23 浪潮电子信息产业股份有限公司 Method for simply, rapidly and automatically backuping FTP (File Transport Protocol) server
CN104917826A (en) * 2015-05-26 2015-09-16 浪潮电子信息产业股份有限公司 Method for realizing FTP access based on automatic code conversion in heterogeneous environment
CN105159799A (en) * 2015-09-06 2015-12-16 浪潮(北京)电子信息产业有限公司 Method and device for backing up server
CN109783117B (en) * 2019-01-18 2023-01-10 中国人民解放军国防科技大学 Mirror image file making and starting method of diskless system

Also Published As

Publication number Publication date
CN110708376A (en) 2020-01-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant