CN116126799A - File splitting method, device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN116126799A
Authority
CN
China
Prior art keywords
splitting
split
file
information
source file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211533750.6A
Other languages
Chinese (zh)
Inventor
蔡小伟
汤鑫
钱益民
贾亮
张瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Post Information Technology Beijing Co ltd
Original Assignee
China Post Information Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Post Information Technology Beijing Co ltd
Priority to CN202211533750.6A
Publication of CN116126799A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/164 File meta data generation
    • G06F16/166 File name conversion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5011 Pool
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5018 Thread allocation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a file splitting method, a file splitting device, electronic equipment and a storage medium. The method comprises the following steps: upon receiving a splitting request, determining, based on the splitting request, splitting task information corresponding to each source file to be split; the splitting task information comprises file information of a source file to be split and splitting rule information corresponding to the file information; determining, based on the file information, a splitting thread for splitting each source file to be split; and for any splitting thread, acquiring a current source file to be split and a splitting rule corresponding to the current source file to be split based on the file information and the splitting rule information, and splitting the source file to be split based on the splitting rule to obtain a plurality of split subfiles. According to the technical scheme disclosed by the invention, a plurality of files are processed simultaneously by multiple threads, so that the processing efficiency of file splitting is improved.

Description

File splitting method, device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of file processing technologies, and in particular, to a method and apparatus for splitting a file, an electronic device, and a storage medium.
Background
In the related art, as service systems continue to improve, the data they generate is output in the form of PDF files, and automatic batch splitting of PDF documents has become a new requirement for electronic document management.
However, when an existing document splitting tool automatically splits a single PDF document into a plurality of documents, it cannot rename the resulting documents and its processing efficiency is low, which reduces document processing efficiency overall.
Disclosure of Invention
The invention provides a file splitting method, a device, electronic equipment and a storage medium, which address two problems in the prior art: split files cannot be renamed when a single file is automatically split into a plurality of files, and file splitting efficiency is low. A plurality of files are processed simultaneously by multiple threads, which improves the processing efficiency of file splitting, and the split sub-files are renamed directly based on preset naming rules, which solves the problem that split sub-files cannot be renamed.
In a first aspect, an embodiment of the present invention provides a file splitting method, where the method includes:
upon receiving a splitting request, determining, based on the splitting request, splitting task information corresponding to each source file to be split; the splitting task information comprises file information of a source file to be split and splitting rule information corresponding to the file information;
determining, based on the file information, a splitting thread for splitting each source file to be split;
and for any splitting thread, acquiring a current source file to be split and a splitting rule corresponding to the current source file to be split based on the file information and the splitting rule information, and splitting the source file to be split based on the splitting rule to obtain a plurality of renamed splitting subfiles.
Optionally, the determining, based on the splitting request, splitting task information corresponding to each source file to be split includes:
determining, based on the splitting request, the servers that generate the respective source files to be split, and grouping the servers to obtain a plurality of server groups;
and acquiring, based on each server group, the splitting task information corresponding to each source file to be split.
Optionally, the determining, based on the file information, a splitting thread for splitting each source file to be split includes:
for any splitting task information, acquiring a preset thread pool under the condition that the occupation amount of the current splitting task information meets a preset threshold value;
and determining, based on the file information in the current splitting task information and the thread pool, a splitting thread for splitting the source file to be split corresponding to the current splitting task information.
Optionally, the file information includes a server address of a server generating the current source file to be split;
before acquiring the current source file to be split and the splitting rule corresponding to the current source file to be split based on the file information and the splitting rule information, the method further comprises:
acquiring a current source file to be split based on the server address, and storing the current source file to be split into a preset external storage medium;
and generating a storage message of the current source file to be split based on the medium information of the external storage medium, and broadcasting the storage message to the splitting thread.
Optionally, the obtaining the current source file to be split and the splitting rule corresponding to the current source file to be split based on the file information and the splitting rule information includes:
under the condition that the storage message is received, analyzing the storage message to obtain a storage address of the current source file to be split, and reading the current source file to be split based on the storage address;
and acquiring a splitting rule corresponding to the current source file to be split based on the file information of the current source file to be split and the splitting rule information.
Optionally, the splitting rule information includes a naming rule;
the splitting processing is performed on the source file to be split based on the splitting rule to obtain a plurality of renamed split subfiles, including:
calling a preset splitting interface, and splitting the source file to be split based on the splitting interface and the splitting rule to obtain a plurality of splitting subfiles corresponding to the source file to be split;
and renaming each split sub-file based on the naming rule to obtain a renamed split sub-file.
Optionally, after obtaining the plurality of split subfiles, the method further includes:
acquiring a splitting state table of the source file to be split;
updating the split state table based on the split information and the storage information of each split sub-file, and storing the updated split state table.
In a second aspect, an embodiment of the present invention further provides a file splitting apparatus, where the apparatus includes:
the splitting task information acquisition module is used for determining splitting task information corresponding to each source file to be split respectively based on the splitting request under the condition that the splitting request is received; the splitting task information comprises file information of a source file to be split and splitting rule information corresponding to the file information;
the splitting thread determining module is used for determining, based on the file information, a splitting thread for splitting each source file to be split;
the splitting processing module is used for acquiring a current source file to be split and a splitting rule corresponding to the current source file to be split based on the file information and the splitting rule information for any splitting thread, and splitting the source file to be split based on the splitting rule to obtain a plurality of renamed splitting subfiles.
In a third aspect, an embodiment of the present invention further provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the file splitting method according to any one of the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer readable storage medium, where computer instructions are stored, where the computer instructions are configured to cause a processor to execute the file splitting method according to any one of the embodiments of the present invention.
According to the technical scheme provided by the embodiment of the invention, upon receiving a splitting request, splitting task information corresponding to each source file to be split is determined based on the splitting request; the splitting task information comprises file information of a source file to be split and splitting rule information corresponding to the file information; a splitting thread for splitting each source file to be split is determined based on the file information; and for any splitting thread, a current source file to be split and a splitting rule corresponding to the current source file to be split are acquired based on the file information and the splitting rule information, and the source file to be split is split based on the splitting rule to obtain a plurality of split subfiles. According to the technical scheme, a plurality of splitting threads in the splitting device connected to the data system split a plurality of source files simultaneously to obtain the split subfiles corresponding to each source file, so that the splitting efficiency is improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a file splitting method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a file splitting method according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a file splitting apparatus according to a third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device implementing a file splitting method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
It will be appreciated that the data (including but not limited to the data itself, the acquisition or use of the data) involved in the present technical solution should comply with the corresponding legal regulations and the requirements of the relevant regulations.
Example 1
Fig. 1 is a flowchart of a file splitting method according to an embodiment of the present invention, where the embodiment is applicable to a case of processing a source file output by a service system.
In practical applications, some source files generated by a data system are in PDF format and contain multiple data results. Because of the structure of the data system, the system does not support splitting the output source files by data result or renaming the resulting files, so the source files as output are poorly suited to subsequent data processing. For this reason, in the prior art, after the data system outputs a source file, the source file is split by a splitting method to obtain a plurality of split subfiles. However, in the prior art each source file can only be split on its own: multiple source files generated by a data system cannot be split at the same time, the splitting efficiency is low, and the split sub-files obtained cannot be automatically renamed. In order to solve these technical problems, the technical scheme of the embodiment of the invention provides a file splitting method. Specifically, a plurality of splitting threads in a splitting device connected to the data system split a plurality of source files simultaneously to obtain the split subfiles corresponding to each source file, which improves splitting efficiency, and the split sub-files are renamed directly based on preset naming rules, which solves the problem that split sub-files cannot be renamed.
Specifically, the method may be performed by a file splitting device, which may be implemented in hardware and/or software, and the file splitting device may be configured in an intelligent terminal and a cloud server. As shown in fig. 1, the method includes:
S110, upon receiving a splitting request, determining, based on the splitting request, splitting task information corresponding to each source file to be split.
In the embodiment of the invention, a data system can be understood as a cluster of servers that perform data processing. The data system includes a plurality of sub-servers, each of which generates a different source file. When generating a source file, each sub-server also generates the splitting rule corresponding to that source file and the naming rule for the split sub-files obtained by splitting according to that splitting rule. A splitting rule may be understood as the page number ranges, within the source file, of the different data results that the source file contains. When the data system detects the source files generated by the sub-servers and the corresponding splitting rules, it generates a splitting request for the source files based on the file information of the source files. The splitting request is then sent to a file splitting system pre-associated with the data system, so that each source file output by the data system is split by the file splitting system. Further, the split sub-files obtained after splitting are automatically renamed based on the naming rule to obtain renamed split sub-files.
In practical application, a splitting request is generated based on the file information of each source file and is sent in the form of an HTTP request. Optionally, it is determined whether the HTTP request was sent successfully. If the transmission fails, an error log is recorded and the request is ended; if it succeeds, the source files are split by the file splitting system pre-associated with the data system.
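The send-and-log behaviour described above can be sketched as follows. This is only an illustrative Python sketch, not the implementation disclosed here: the `post` callable stands in for whatever HTTP client is used, the payload key `files` is hypothetical, and treating status 200 as success is an assumption.

```python
import logging
from typing import Callable, List

logger = logging.getLogger("split_request")


def send_split_request(post: Callable[[str, dict], int], url: str, file_infos: List[dict]) -> bool:
    """Send the splitting request; log an error and return False if sending fails."""
    try:
        # `post` is a hypothetical HTTP helper that returns the response status code.
        status = post(url, {"files": file_infos})
    except OSError as exc:  # ConnectionError and TimeoutError are OSError subclasses
        logger.error("splitting request to %s failed: %s", url, exc)
        return False
    if status != 200:  # assumption: HTTP 200 signals successful delivery
        logger.error("splitting request to %s returned status %s", url, status)
        return False
    return True
```

Passing the HTTP call in as a parameter keeps the sketch independent of any particular client library.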
Specifically, when receiving a splitting request sent by a data system, the file splitting system sends an information acquisition request to the data system based on the splitting request, and obtains splitting task information corresponding to each source file based on feedback. The splitting task information comprises file information of a source file to be split and splitting rule information of a splitting rule corresponding to the source file.
Optionally, the method for determining splitting task information corresponding to each source file to be split based on the splitting request in this embodiment may include: determining servers for generating source files to be split respectively based on the splitting request, and carrying out grouping processing on the servers to obtain a plurality of server groups; and respectively acquiring splitting task information corresponding to each source file to be split based on each server group.
Specifically, since the splitting request includes file information of each source file, the file information includes a server identifier of a server of the source file, and the server of the source file is determined based on the server identifier, that is, a sub-server generating the source file in the data system is determined. And grouping the servers based on the server identifiers to obtain server groups. Optionally, for any server group, a group of information acquisition requests is generated based on each server identifier of the current server group, and the information acquisition requests are sent to the data system to obtain split task information corresponding to source files generated by each server of the current group.
In practical application, when the file splitting system receives a splitting request, the splitting request is parsed to obtain the source file name and the server identifier of the server of the source file, and the servers of the source files are grouped based on the server identifiers to obtain the server groups. Based on the grouping information, the splitting rule information of the splitting rules corresponding to the source files generated by the servers of the current server group is then acquired. Optionally, the splitting rule information corresponding to each server group is obtained based on the information of each group. The splitting task information of each source file is determined based on the file information of the source file generated by each server and the corresponding splitting rule information. The file information may include information such as the file name of the source file and the file storage path; the splitting rule information includes information such as the file names corresponding to the splitting rules, the splitting page number parameters, and the storage information of the splitting rules.
It should be noted that having the file splitting system generate the information acquisition request per server group to acquire the splitting task information has the following effect: when the splitting request covers a plurality of source files, a plurality of pieces of splitting task information can be obtained with one SFTP request, which reduces the number of request interactions and thereby improves file splitting efficiency.
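The grouping step can be sketched as follows. This is a minimal Python illustration under stated assumptions: it uses the simplest possible grouping (one group per server identifier), the field names `file_name` and `server_id` are hypothetical, and `fetch_task_info` is a stand-in for the information acquisition request sent to the data system.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def group_by_server(file_infos: Iterable[dict]) -> Dict[str, List[str]]:
    """Group source file names by the identifier of the server that generated them."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for info in file_infos:
        groups[info["server_id"]].append(info["file_name"])
    return dict(groups)


def fetch_all_task_info(
    file_infos: Iterable[dict],
    fetch_task_info: Callable[[str, List[str]], List[dict]],
) -> List[dict]:
    """Issue one information acquisition request per group instead of one per file."""
    tasks: List[dict] = []
    for server_id, names in group_by_server(file_infos).items():
        # Hypothetical call: returns the splitting task information for all files in the group.
        tasks.extend(fetch_task_info(server_id, names))
    return tasks
```

With three files spread over two servers, this sketch issues two requests rather than three, which is the reduction in request interactions that the paragraph above describes.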
S120, determining threads for splitting files of the source files to be split based on the file information.
Under the condition of acquiring splitting task information corresponding to each source file respectively, a splitting thread for splitting files of each source file to be split needs to be determined so as to realize splitting processing of each source file at the same time, and further improve splitting efficiency.
Optionally, the method for determining splitting threads in this embodiment may include: for any split task information, acquiring a preset thread pool under the condition that the occupation amount of the current split task information accords with a preset threshold value; and determining a thread for splitting the file of the source file to be split corresponding to the current splitting task information based on the file information in the current splitting task information and the thread pool.
The occupation amount of the splitting task information can be understood as the amount of memory that the splitting task information of the source file occupies in the current storage medium. Optionally, it is determined whether the occupation amount is greater than a preset threshold. If so, a splitting task exists in the splitting task information, the file information of the source file to be split can be obtained, and a splitting thread for processing the source file is determined based on the file information so as to split the source file; otherwise, the acquisition was unsuccessful and the splitting processing of the source file cannot continue.
In practical application, it is determined whether the information length of the split task information is greater than 0. If the information length is greater than 0, the file splitting task can be continuously executed based on the splitting task information, namely the splitting task is distributed by using a thread pool strategy; otherwise, the splitting task is ended.
In this embodiment, a thread pool is preset in the file splitting system, where the thread pool includes a plurality of threads for executing the file splitting task of each source file. Specifically, distributing splitting tasks based on a thread pool policy can be understood as applying a matching policy that selects, from the thread pool, the splitting thread corresponding to each source file. Optionally, the matching policy may match on the file data size of each source file and the processing degree of the thread, on the processing time limit of each source file and the processing progress of the thread, or on other criteria, which is not limited in this embodiment.
On the basis of the above embodiment, when the splitting thread corresponding to a source file is determined, the technical solution of this embodiment modifies the task state corresponding to the splitting task information of that source file. This prevents other threads from matching the same source file again, which avoids repeated allocation of threads and thereby avoids the waste of splitting resources and the loss of splitting efficiency caused by splitting the same source file repeatedly.
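The task distribution and the task state change can be sketched as follows. This is a minimal Python sketch and not the disclosed implementation: the state names `PENDING` and `ASSIGNED`, the `state` field and the `split` callable are all hypothetical, and the length check mirrors the "information length greater than 0" test described above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

PENDING, ASSIGNED = "PENDING", "ASSIGNED"  # hypothetical task states
_state_lock = threading.Lock()


def claim(task: dict) -> bool:
    """Atomically mark a pending task as assigned; return False if already claimed."""
    with _state_lock:
        if task.get("state", PENDING) != PENDING:
            return False
        task["state"] = ASSIGNED
        return True


def dispatch(tasks: List[dict], split: Callable[[dict], str], workers: int = 4) -> List[str]:
    """Submit each unclaimed task to the thread pool and collect the results."""
    if len(tasks) == 0:  # no splitting task information, so the splitting task ends
        return []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Only tasks that this call manages to claim are submitted, so a
        # source file is never handed to two threads.
        futures = [pool.submit(split, t) for t in tasks if claim(t)]
        return [f.result() for f in futures]
```

Changing the state under a lock before submission is one simple way to get the "no repeated allocation" property; a real system might instead update a state column in a database.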
S130, for any splitting thread, acquiring a current source file to be split and a splitting rule corresponding to the current source file to be split based on file information and splitting rule information, and splitting the source file to be split based on the splitting rule to obtain a plurality of renamed splitting subfiles.
In the embodiment of the invention, in order to enable the file splitting system to run a plurality of threads and split a plurality of source files simultaneously, once the splitting thread corresponding to each source file has been matched, each source file stored in the data system is acquired by its splitting thread and stored in an external storage medium. When splitting is performed, the downloaded source file is read and processed directly, and it is deleted immediately after being split, which avoids occupying the disk space of the file splitting system and further improves the splitting efficiency of the file splitting system.
Optionally, the method for obtaining the source file based on the splitting thread and storing it externally in this embodiment may include: acquiring the file information, which includes a system address of the generating system that generates the current source file to be split; determining the generating system that generates the current source file to be split based on the system address, and reading the current source file to be split in the generating system based on the file information; storing the current source file to be split into a preset external storage medium, generating a storage message of the current source file to be split based on the medium information of the external storage medium, and broadcasting the storage message to the splitting thread.
In practical application, the splitting thread obtains the server address of the server that generates the current source file to be split, and logs in to the server by using the SFTP-related API in the open source program JSch. It is then determined whether the SFTP login is successful. Optionally, if the SFTP login fails, the login is retried. If the SFTP login is successful, it is further determined whether the server has finished generating the source file, that is, whether the complete source file can be acquired. If the source file does not exist, the splitting ends; if the source file exists, it is acquired in the form of a file stream. The source file is stored in a preset external storage medium and named according to a preset rule. When the source file is successfully stored, the information on the file storage path is written into RocketMQ and the storage message is broadcast to the splitting thread. Specifically, the storage message of the source file is broadcast to the split message topic of the splitting thread.
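The login, existence check and download sequence can be sketched as follows. This Python sketch does not use JSch or any real SFTP library: `login`, `exists` and `download` are hypothetical callables standing in for the SFTP operations, and the bounded retry count is an assumption, since the text only states that a failed login is retried.

```python
import time
from typing import Callable, Optional


def fetch_source_file(
    login: Callable[[], object],
    exists: Callable[[object], bool],
    download: Callable[[object], bytes],
    max_attempts: int = 3,       # assumption: the text gives no retry limit
    delay_seconds: float = 0.0,  # assumption: optional pause between attempts
) -> Optional[bytes]:
    """Log in with retries, then return the file bytes, or None if the file is absent."""
    session = None
    for attempt in range(1, max_attempts + 1):
        try:
            session = login()
            break
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            time.sleep(delay_seconds)
    if not exists(session):
        return None  # the source file has not been generated, so the splitting ends
    return download(session)
```

The returned bytes would then be written to the external storage medium and the storage path published as a message, which is outside the scope of this sketch.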
Optionally, for any splitting thread, the source file and the splitting rule corresponding to the source file need to be acquired before the source file is processed. Optionally, the acquiring method in this embodiment may include: under the condition that a storage message is received, analyzing the storage message to obtain a storage address of a current source file to be split, and reading the current source file to be split based on the storage address; and acquiring a splitting rule corresponding to the current source file to be split based on the file information of the current source file to be split and the splitting rule information.
In practical applications, the splitting thread also includes an MQ message listener. After the splitting thread broadcasts the RocketMQ message, the MQ message listener monitors the split message topic in the splitting thread and determines, from the monitored message, whether the MQ tag corresponding to the message meets the preset requirement. If the requirement is not met, the message listener ends the processing of the current message and continues listening. If the requirement is met, the message body is parsed, the storage path in the external storage medium is obtained from the parsing result, the source file stored in the external storage medium is read according to the storage path, and the corresponding splitting rule is then determined according to the file name of the source file.
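The listener logic can be sketched without a message broker as a plain function over the tag and body of one message. This is an illustrative Python sketch under assumptions: the tag value `SPLIT`, the JSON body with a `storage_path` key, and the lookup table keyed by file name are all hypothetical, and no RocketMQ client API is used.

```python
import json
from typing import Optional

EXPECTED_TAG = "SPLIT"  # hypothetical tag value that the listener accepts


def handle_message(tag: str, body: bytes, rules_by_file_name: dict) -> Optional[dict]:
    """Return the storage path and splitting rule for a message, or None if the tag is ignored."""
    if tag != EXPECTED_TAG:
        return None  # tag does not meet the requirement; keep listening
    payload = json.loads(body.decode("utf-8"))
    path = payload["storage_path"]
    # The splitting rule is looked up by the file name at the end of the storage path.
    file_name = path.rsplit("/", 1)[-1]
    return {"storage_path": path, "rule": rules_by_file_name[file_name]}
```

In a real deployment this function would be called from the broker client's listener callback, and the file would then be read from the returned storage path.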
On this basis, the source file is split. Optionally, the splitting processing in this embodiment may include: calling a preset splitting interface, and splitting the source file to be split based on the splitting interface and the splitting rule to obtain a plurality of split subfiles corresponding to the source file to be split.
In practical application, an open source API is obtained, and a source file is split according to a splitting rule based on the API, so that a plurality of split subfiles are obtained.
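The source text does not say what form a splitting rule takes. As a non-limiting illustration, the following Python sketch assumes the simplest possible rule, a fixed number of lines per subfile; the function name `split_lines` and the parameter `lines_per_file` are hypothetical.

```python
from typing import Iterator, List


def split_lines(lines: List[str], lines_per_file: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `lines_per_file` lines each."""
    if lines_per_file <= 0:
        raise ValueError("lines_per_file must be positive")
    for start in range(0, len(lines), lines_per_file):
        yield lines[start:start + lines_per_file]
```

Each yielded chunk corresponds to the content of one split subfile; the last chunk may be shorter than the others.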
Optionally, in the process of splitting the source file, the technical solution of this embodiment further includes: renaming each split subfile based on the naming rule to obtain renamed split subfiles.
In this embodiment, the splitting task information further includes naming rule information for the split subfiles obtained after the source file to be split is split based on the splitting rule. Specifically, when the splitting rule is obtained, the naming rule corresponding to the split subfiles is obtained based on the naming rule information. Further, when a split subfile is obtained, the naming rule corresponding to the splitting rule of the current split subfile is determined, and the current split subfile is renamed based on that naming rule to obtain the renamed split subfile. Optionally, when the splitting of each source file is completed, the renamed split subfiles are obtained synchronously.
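As a non-limiting illustration, a naming rule can be modelled as a format template applied to each split subfile. The template `{stem}_part{index:03d}{ext}` and the function name below are assumptions for the example; the source text does not define the naming rule's format.

```python
import os
from typing import List


def renamed_subfiles(
    source_name: str,
    count: int,
    pattern: str = "{stem}_part{index:03d}{ext}",  # hypothetical naming rule
) -> List[str]:
    """Build the renamed subfile names for `count` subfiles of one source file."""
    stem, ext = os.path.splitext(source_name)
    return [pattern.format(stem=stem, index=i, ext=ext) for i in range(1, count + 1)]
```

Deriving the names from the source file name keeps each subfile traceable to its source, which is what later storage "in correspondence with the source file" relies on.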
According to the technical solution provided by this embodiment of the invention, when a splitting request is received, the splitting task information corresponding to each source file to be split is determined based on the splitting request; the splitting task information comprises file information of a source file to be split and splitting rule information corresponding to the file information; a splitting thread for splitting each source file to be split is determined based on the file information; for any splitting thread, the current source file to be split and the splitting rule corresponding to it are acquired based on the file information and the splitting rule information, and the source file to be split is split based on the splitting rule to obtain a plurality of split subfiles. In this technical solution, a plurality of splitting threads in the splitting device connected through the data system split a plurality of source files simultaneously, obtaining the split subfiles corresponding to each source file, so that the splitting efficiency is improved.
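As a non-limiting illustration, the overall pattern of one splitting thread per source file can be sketched with Python's standard thread pool. The task layout (`lines`, `lines_per_file`) and the function names are hypothetical, and the in-memory line lists stand in for real source files.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def split_one(task: dict) -> List[List[str]]:
    """Split one source file (given as a list of lines) by its own rule."""
    lines, size = task["lines"], task["lines_per_file"]
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def split_all(tasks: Dict[str, dict], max_workers: int = 4) -> Dict[str, List[List[str]]]:
    """Run one splitting job per source file on a shared thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(split_one, task) for name, task in tasks.items()}
        # result() blocks until the job finishes and re-raises its exception, if any.
        return {name: future.result() for name, future in futures.items()}
```

Because each job touches only its own source file, the jobs share no mutable state and need no locking in this sketch.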
Example II
Fig. 2 is a flowchart of a file splitting method according to a second embodiment of the present invention, where, based on the foregoing embodiment, after obtaining a plurality of split subfiles, the method further includes:
obtaining a splitting state table of a source file to be split;
updating the split state table based on each split subfile, and storing the updated split state table and each split subfile. As shown in fig. 2, the method includes:
S210, when a splitting request is received, determining, based on the splitting request, splitting task information corresponding to each source file to be split.
S220, determining splitting threads for splitting files of the source files to be split based on the file information.
S230, for any splitting thread, acquiring a current source file to be split and a splitting rule corresponding to the current source file to be split based on file information and splitting rule information, and splitting the source file to be split based on the splitting rule to obtain a plurality of splitting subfiles.
S240, acquiring a splitting state table of a source file to be split; updating the split state table based on the split information and the storage information of each split sub-file, and storing the updated split state table.
In the embodiment of the invention, the splitting state table is used for indicating the number of files of the current source file and the file page numbers corresponding to the number of files. In other words, the split state table may characterize the source file as an entire file that is not split, or as multiple split subfiles.
Specifically, when the source file is split and renamed to obtain a plurality of renamed split subfiles of the source file, each split subfile is stored in correspondence with the source file. Optionally, the split sub-file and the source file are correspondingly stored in a preset external storage medium, and the split sub-file in the split system is deleted, so that the memory occupation of the file split system is reduced.
Optionally, under the condition that the file is successfully stored, updating the split state of the split state table. Specifically, the splitting state of the splitting state table is updated based on the splitting rule and naming rule of each splitting sub-file and the storage information of each splitting sub-file obtained after splitting, so that other systems can verify file data based on the splitting rule and naming rule and verify the file data based on the storage information.
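As a non-limiting illustration, a split state table can be modelled as a mapping from source file name to a small record that is updated only after the subfiles have been stored. The field names, the status values `UNSPLIT` and `SPLIT`, and the `file_count` property are assumptions for the example; the source text does not define the table's columns.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SplitState:
    """One row of a hypothetical split state table."""

    source: str
    status: str = "UNSPLIT"
    subfiles: List[str] = field(default_factory=list)

    @property
    def file_count(self) -> int:
        # An unsplit source counts as a single whole file.
        return len(self.subfiles) if self.subfiles else 1


def record_split(
    table: Dict[str, SplitState], source: str, stored_paths: List[str]
) -> SplitState:
    """Update the row for `source` once its subfiles are stored successfully."""
    state = table.setdefault(source, SplitState(source))
    if stored_paths:
        state.status = "SPLIT"
        state.subfiles = list(stored_paths)
    return state
```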
According to the technical scheme, the plurality of splitting threads in the splitting device connected through the data system simultaneously split the plurality of source files to obtain the plurality of splitting subfiles corresponding to the splitting files respectively, and the plurality of splitting subfiles and the source files are correspondingly stored in the external storage medium, so that the occupied space of the file splitting system is reduced, and the splitting efficiency is further improved.
Example III
Fig. 3 is a schematic structural diagram of a file splitting device according to a third embodiment of the present invention. As shown in fig. 3, the apparatus includes: a split task information acquisition module 310, a split thread determination module 320, and a split processing module 330; wherein,
the splitting task information obtaining module 310 is configured to determine, based on a splitting request when the splitting request is received, the splitting task information corresponding to each source file to be split; the splitting task information comprises file information of a source file to be split and splitting rule information corresponding to the file information;
a splitting thread determining module 320, configured to determine splitting threads for splitting files of the source files to be split based on the file information;
the splitting processing module 330 is configured to obtain, for any splitting thread, a current source file to be split and a splitting rule corresponding to the current source file to be split based on the file information and the splitting rule information, and split the source file to be split based on the splitting rule, so as to obtain a plurality of renamed split subfiles.
Based on the above embodiments, optionally, the splitting task information obtaining module 310 includes:
a server group obtaining unit, configured to determine, based on the splitting request, the servers that generate the source files to be split, and to group the servers to obtain a plurality of server groups;
the splitting task information obtaining unit is used for respectively obtaining the splitting task information corresponding to each source file to be split based on each server.
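The source text does not state the criterion by which servers are grouped. As a non-limiting illustration, the following Python sketch assumes each server record carries a hypothetical `system` field naming the business system it belongs to, and groups server addresses by that field.

```python
from collections import defaultdict
from typing import Dict, List


def group_servers(servers: List[dict]) -> Dict[str, List[str]]:
    """Group server addresses by the (assumed) `system` field of each record."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for server in servers:
        groups[server["system"]].append(server["address"])
    return dict(groups)
```

Splitting task information can then be collected group by group, so that servers sharing configuration are handled together.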
Based on the above embodiments, optionally, the split thread determining module 320 includes:
the thread pool acquisition unit is used for acquiring a preset thread pool for any split task information under the condition that the occupation amount of the current split task information accords with a preset threshold value;
the splitting thread determining unit is used for determining a splitting thread for splitting the file of the source file to be split corresponding to the current splitting task information based on the file information in the current splitting task information and the thread pool.
On the basis of the above embodiments, optionally, the file information includes a server address of a server that generates the current source file to be split;
The apparatus further comprises:
a module configured to acquire the current source file to be split from the generating server based on the server address, before the current source file to be split is acquired based on the file information and the splitting rule information, and to store the current source file to be split into a preset external storage medium;
and the storage message broadcasting module is used for generating the storage message of the current source file to be split based on the medium information of the external storage medium and broadcasting the storage message to the splitting thread.
Based on the above embodiments, optionally, the splitting processing module 330 includes:
the splitting source file obtaining unit is used for analyzing the storage message to obtain the storage address of the current source file to be split under the condition that the storage message is received, and reading the current source file to be split based on the storage address;
the splitting rule obtaining unit is used for obtaining the splitting rule corresponding to the current source file to be split based on the file information of the current source file to be split and the splitting rule information.
On the basis of the above embodiments, optionally, the splitting rule information includes a naming rule;
The split processing module 330 includes:
the splitting sub-file obtaining unit is used for calling a preset splitting interface, splitting the source file to be split based on the splitting interface and the splitting rule, and obtaining a plurality of splitting sub-files corresponding to the source file to be split;
and the file renaming unit is used for renaming each split sub-file based on the naming rule to obtain a renamed split sub-file.
On the basis of the above embodiments, optionally, the apparatus further includes:
the splitting state table acquisition module is used for acquiring the splitting state table of the source file to be split;
and the file storage module is used for updating the split state table based on the split information and the storage information of each split sub-file and storing the updated split state table.
The file splitting device provided by the embodiment of the invention can execute the file splitting method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the executing method.
Example IV
Fig. 4 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 4, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the respective methods and processes described above, such as the file splitting method.
In some embodiments, the file splitting method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the file splitting method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the file splitting method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of splitting a file, comprising:
when a splitting request is received, determining, based on the splitting request, splitting task information corresponding to each source file to be split; the splitting task information comprises file information of a source file to be split and splitting rule information corresponding to the file information;
determining splitting threads for splitting files of the source files to be split respectively based on the file information;
and for any splitting thread, acquiring a current source file to be split and a splitting rule corresponding to the current source file to be split based on the file information and the splitting rule information, and splitting the source file to be split based on the splitting rule to obtain a plurality of renamed split subfiles.
2. The method according to claim 1, wherein the determining, based on the splitting request, splitting task information corresponding to each source file to be split, includes:
determining servers for generating the source files to be split based on the splitting request respectively, and carrying out grouping processing on the servers to obtain a plurality of server groups;
and respectively acquiring the splitting task information corresponding to each source file to be split based on each server.
3. The method according to claim 1, wherein the determining, based on each of the file information, a splitting thread for performing file splitting on each of the source files to be split, respectively, includes:
for any split task information, acquiring a preset thread pool under the condition that the occupation amount of the current split task information accords with a preset threshold value;
and determining a splitting thread for splitting the file of the source file to be split corresponding to the current splitting task information based on the file information in the current splitting task information and the thread pool.
4. The method of claim 1, wherein the file information includes a server address of a server that generated the current source file to be split;
before acquiring the current source file to be split and the splitting rule corresponding to the current source file to be split based on the file information and the splitting rule information, the method further comprises:
acquiring a current source file to be split based on the server address, and storing the current source file to be split into a preset external storage medium;
and generating a storage message of the current source file to be split based on the medium information of the external storage medium, and broadcasting the storage message to the splitting thread.
5. The method of claim 4, wherein the obtaining the current source file to be split and the splitting rule corresponding to the current source file to be split based on the file information and the splitting rule information includes:
under the condition that the storage message is received, analyzing the storage message to obtain a storage address of the current source file to be split, and reading the current source file to be split based on the storage address;
and acquiring a splitting rule corresponding to the current source file to be split based on the file information of the current source file to be split and the splitting rule information.
6. The method of claim 1, wherein the split rule information comprises a naming rule;
the splitting processing is performed on the source file to be split based on the splitting rule to obtain a plurality of renamed split subfiles, including:
calling a preset splitting interface, and splitting the source file to be split based on the splitting interface and the splitting rule to obtain a plurality of splitting subfiles corresponding to the source file to be split;
and renaming each split sub-file based on the naming rule to obtain a renamed split sub-file.
7. The method of claim 1, wherein after obtaining the plurality of split subfiles, the method further comprises:
acquiring a splitting state table of the source file to be split;
updating the split state table based on the split information and the storage information of each split sub-file, and storing the updated split state table.
8. A file splitting apparatus, comprising:
the splitting task information acquisition module is used for determining splitting task information corresponding to each source file to be split respectively based on the splitting request under the condition that the splitting request is received; the splitting task information comprises file information of a source file to be split and splitting rule information corresponding to the file information;
the splitting thread determining module is used for respectively determining splitting threads for splitting files of the source files to be split based on the file information;
the splitting processing module is used for acquiring a current source file to be split and a splitting rule corresponding to the current source file to be split based on the file information and the splitting rule information for any splitting thread, and splitting the source file to be split based on the splitting rule to obtain a plurality of renamed splitting subfiles.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the file splitting method of any of claims 1-7.
10. A computer readable storage medium storing computer instructions for causing a processor to perform the method of file splitting of any of claims 1-7.
CN202211533750.6A 2022-12-01 2022-12-01 File splitting method, device, electronic equipment and storage medium Pending CN116126799A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211533750.6A CN116126799A (en) 2022-12-01 2022-12-01 File splitting method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211533750.6A CN116126799A (en) 2022-12-01 2022-12-01 File splitting method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116126799A true CN116126799A (en) 2023-05-16

Family

ID=86298210

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211533750.6A Pending CN116126799A (en) 2022-12-01 2022-12-01 File splitting method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116126799A (en)

Similar Documents

Publication Publication Date Title
CN107766509B (en) Method and device for static backup of webpage
CN111478781A (en) Message broadcasting method and device
CN114090113B (en) Method, device, equipment and storage medium for dynamically loading data source processing plug-in
CN112667368A (en) Task data processing method and device
CN111767126A (en) System and method for distributed batch processing
CN116796085A (en) File processing method and device, electronic equipment and storage medium
CN112948138A (en) Method and device for processing message
CN116126799A (en) File splitting method, device, electronic equipment and storage medium
CN114064693A (en) Method, device, electronic equipment and computer readable medium for processing account data
CN114666319A (en) Data downloading method and device, electronic equipment and readable storage medium
CN114827159A (en) Network request path optimization method, device, equipment and storage medium
CN110896391B (en) Message processing method and device
CN112988806A (en) Data processing method and device
CN113556370A (en) Service calling method and device
CN112783914A (en) Statement optimization method and device
CN110858240A (en) Front-end module loading method and device
CN112667627B (en) Data processing method and device
CN114924806B (en) Dynamic synchronization method, device, equipment and medium for configuration information
CN114090524A (en) Excel file distributed exporting method and device
CN115827174B (en) Task processing method and device based on multiple instances
CN113778657B (en) Data processing method and device
CN113760965B (en) Data query method and device
CN116801001A (en) Video stream processing method and device, electronic equipment and storage medium
CN114398438A (en) Method, device, electronic equipment and computer readable medium for processing request
CN114647622A (en) Replication operation processing method, device and equipment based on distributed object storage

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination