Background
Network Attached Storage (NAS) is a dedicated data storage server. It takes a data-centric approach and fully separates the storage device from the application server so that data can be managed centrally, which frees up bandwidth and improves performance.
NAS has several main disadvantages:
(1) It is limited by hardware reliability and is prone to failure. Once a bad track appears on a disk, irreversible data loss can easily occur, and manual data recovery is difficult, time-consuming and labor-intensive;
(2) It is expensive, its security mechanisms must be implemented independently, and its development and maintenance costs are high;
(3) Its read-write performance drops sharply when the volume of concurrent access is large.
With the development of internet wealth management services such as Yu'e Bao ("balance treasure"), service volume and user numbers keep growing. The NAS, used as a storage server on a Unix system, has to handle ever more data reads and writes, so that contention during the execution of task files is severe, efficiency is low, unpredictable exceptions occur frequently, and the stability of file output is affected. The time needed to process Yu'e Bao files becomes longer and longer, which seriously affects time-sensitive services such as the distribution of Yu'e Bao earnings.
No effective solution has yet been proposed for the above shortcomings of NAS in data storage performance.
Disclosure of Invention
The application aims to provide a file storage method and device, which can smoothly transfer data on an NAS to an OSS, thereby improving the file execution speed, reducing the cost and effectively ensuring the file read-write performance under the concurrent condition.
The application provides a file storage method and a file storage device, which are realized as follows:
A file storage method, the method comprising: copying each task file into two task files, and writing the two task files into the NAS and the OSS respectively in parallel; comparing whether each task file stored in the NAS is completely consistent with the corresponding task file stored in the OSS; and if the comparison results are completely consistent throughout a specified duration, writing subsequent task files into the OSS.
A file storage apparatus, the apparatus comprising: the parallel writing unit is used for copying each task file into two task files and respectively writing the two task files into the NAS and the OSS in parallel; the comparison unit is used for comparing whether each task file stored in the NAS is completely consistent with each task file stored in the OSS; and the storage unit is used for writing the subsequent task files into the OSS under the condition that the comparison results are completely consistent within the specified duration.
According to the file storage method and device, the traditional NAS is replaced and the OSS is selected as the storage system for task files, which greatly improves file read-write performance and also improves the reliability of the storage system. OSS (Object Storage Service) is a massive, secure and highly reliable cloud storage service; its elastically scalable storage capacity and processing power allow customers to focus on their core services. It can also be used together with other cloud products. OSS is widely applied in service scenarios such as mass data storage and backup, data processing, accelerated content distribution, and service data mining and analysis. By storing task files in the OSS, the method and device can greatly improve the execution rate and maintain performance under concurrency. To ensure a safe and smooth transition from the NAS to the OSS, the method uses task double-writing and file comparison to ensure the reliability and correctness of the output files in both storage environments.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The file storage method and apparatus are described in detail below with reference to the accompanying drawings. Fig. 1 is a flowchart of an embodiment of the file storage method proposed in the present application. Although the present application provides method steps or apparatus structures as illustrated in the following embodiments or figures, the method or apparatus may include more or fewer steps or module structures based on conventional or non-inventive effort. For steps or structures that have no logically necessary causal relationship, the execution order of the steps or the module structure of the apparatus is not limited to that provided in the embodiments of the present application. When the described method or module structure is implemented in an actual device or end product, it may be executed sequentially or in parallel (e.g., in a parallel-processor or multi-threaded environment) according to the embodiments or the figures.
The scheme is described below using file storage as an example. It can be used for large-file storage migration in internet wealth management services, and it is also applicable to file storage migration in other application scenarios. Specifically, as shown in Fig. 1, an embodiment of the file storage method provided by the present application may include:
Step S102: copying each task file into two task files, and writing the two task files into the NAS and the OSS respectively in parallel.
Specifically, this can be achieved as follows: two sets of task instances are defined from one task file, each with data content identical to that of the task file; two task files (for example, task1 and task2) are generated from the two sets of task instances; one of the task files is written into the NAS while the other is simultaneously written into the OSS. The two parallel write operations run independently and do not interfere with each other.
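The double write described in step S102 can be sketched as follows. This is a minimal illustration only, not the implementation of the present application: the class name DualWriteSketch, the TaskInstance type and the method names are hypothetical, and two temporary local directories stand in for the NAS mount and for the OSS so that the example can run anywhere.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class DualWriteSketch {

    /** One copy of a task file bound to a storage target (hypothetical type). */
    static final class TaskInstance {
        final String fileName;
        final byte[] content;
        final Path targetDir;

        TaskInstance(String fileName, byte[] content, Path targetDir) {
            this.fileName = fileName;
            this.content = content.clone(); // each instance owns an identical copy
            this.targetDir = targetDir;
        }
    }

    /** Writes one task instance into its target directory. */
    static Path write(TaskInstance task) {
        try {
            Files.createDirectories(task.targetDir);
            return Files.write(task.targetDir.resolve(task.fileName), task.content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Defines two instances from one task file and writes them in parallel. */
    static void dualWrite(String fileName, byte[] content, Path nasDir, Path ossDir) {
        TaskInstance task1 = new TaskInstance(fileName, content, nasDir);
        TaskInstance task2 = new TaskInstance(fileName, content, ossDir);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<Path> nasWrite =
                    CompletableFuture.supplyAsync(() -> write(task1), pool);
            CompletableFuture<Path> ossWrite =
                    CompletableFuture.supplyAsync(() -> write(task2), pool);
            CompletableFuture.allOf(nasWrite, ossWrite).join(); // wait for both writes
        } finally {
            pool.shutdown();
        }
    }

    /** Runs the double write on two temporary directories; true if both copies equal the input. */
    static boolean demo() {
        try {
            Path nasDir = Files.createTempDirectory("nas-stand-in");
            Path ossDir = Files.createTempDirectory("oss-stand-in");
            byte[] content = "fund,amount\nA,100\n".getBytes(StandardCharsets.UTF_8);
            dualWrite("task.csv", content, nasDir, ossDir);
            byte[] nasCopy = Files.readAllBytes(nasDir.resolve("task.csv"));
            byte[] ossCopy = Files.readAllBytes(ossDir.resolve("task.csv"));
            return Arrays.equals(nasCopy, content) && Arrays.equals(ossCopy, content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("copies identical: " + demo());
    }
}
```

Each write runs on its own thread, so a slow or failing write on one side does not block the other side from starting; the caller only waits for both to finish.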
Step S104: comparing whether each task file stored in the NAS is completely consistent with the corresponding task file stored in the OSS.
Step S106: if the comparison results are completely consistent throughout the specified duration, writing subsequent task files into the OSS.
Before the comparison operation is executed, multiple task files have been written to both the NAS and the OSS. The period during which the comparison operation runs may be referred to as the transition period, and the duration of the transition period (i.e., the specified duration) may be set as required. Each task file stored in the NAS is compared in turn with the corresponding task file stored in the OSS. If all comparison results are completely consistent throughout the transition period, task file storage can be switched from the NAS to the OSS: all subsequent task files are written to the OSS, and writing to the NAS stops.
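The comparison and switch-over logic of steps S104 and S106 might be sketched as below. Several points are assumptions made for illustration and are not stated in the present application: files are compared through SHA-256 digests of their contents, the transition period is seven days, and a single mismatch keeps the system in double-write mode. The class name TransitionSketch and its members are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;

class TransitionSketch {

    /** Where subsequent task files should be written. */
    enum Target { NAS_AND_OSS, OSS_ONLY }

    private final Instant start;      // start of the transition period
    private final Duration window;    // the specified duration
    private boolean mismatchSeen = false;

    TransitionSketch(Instant start, Duration window) {
        this.start = start;
        this.window = window;
    }

    /** SHA-256 digest of a file's content (assumed comparison method). */
    static byte[] sha256(byte[] content) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(content);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every JVM
        }
    }

    /** True if the NAS copy and the OSS copy have identical content digests. */
    static boolean consistent(byte[] nasCopy, byte[] ossCopy) {
        return Arrays.equals(sha256(nasCopy), sha256(ossCopy));
    }

    /** Records the comparison result for one pair of task files. */
    void record(byte[] nasCopy, byte[] ossCopy) {
        if (!consistent(nasCopy, ossCopy)) {
            mismatchSeen = true;
        }
    }

    /** Switches to OSS only after the window has elapsed with no mismatch. */
    Target targetAt(Instant now) {
        boolean windowElapsed = !now.isBefore(start.plus(window));
        return (windowElapsed && !mismatchSeen) ? Target.OSS_ONLY : Target.NAS_AND_OSS;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2020-01-01T00:00:00Z");
        TransitionSketch period = new TransitionSketch(start, Duration.ofDays(7));
        byte[] file = "report".getBytes(StandardCharsets.UTF_8);
        period.record(file, file.clone());
        System.out.println(period.targetAt(start.plus(Duration.ofDays(1)))); // NAS_AND_OSS
        System.out.println(period.targetAt(start.plus(Duration.ofDays(7)))); // OSS_ONLY
    }
}
```

Passing the current time as a parameter rather than reading the system clock keeps the switch-over decision easy to test.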
In this embodiment, the traditional NAS is replaced and the OSS is selected as the storage system for task files, which greatly improves file read-write performance and improves the reliability of the storage system.
In one embodiment, which task file is written to the NAS and which is written to the OSS can be determined by setting task parameters. That is, task parameters (Task Param) are set for each of the two sets of task instances, and the task parameters indicate whether the task file is written to the NAS or to the OSS.
Table 1 shows the structure of the Task Instance table, which records task instances.
TABLE 1
Table 2 shows the structure of the Task Param table, which records the parameters of task instances; one Task Instance record corresponds to multiple Task Param records.
TABLE 2
| Name | Type | Default | Storage | Comments |
| ID | VARCHAR2(32) | | | Primary key |
| GMT_CREATE | DATE | sysdate | | Creation time |
| GMT_MODIFIED | DATE | sysdate | | Modification time |
| TASK_INSTANCE_ID | VARCHAR2(32) | | | Task instance ID |
| PARAM_KEY | VARCHAR2(128) | | | Parameter key |
| PARAM_VALUE | VARCHAR2(256) | | | Parameter value |
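To illustrate how the key-value records of Table 2 could drive the choice of storage target, the following sketch looks up a task parameter for a given task instance. The parameter key "storageType", the values "NAS" and "OSS", and the fallback to the NAS when no parameter is set are all assumptions made for this example; the present application does not specify them. Only the three columns needed for the lookup are modelled.

```java
import java.util.Arrays;
import java.util.List;

class TaskParamSketch {

    enum StorageTarget { NAS, OSS }

    /** Hypothetical PARAM_KEY naming the storage target of a task instance. */
    static final String STORAGE_KEY = "storageType";

    /** In-memory view of one Task Param row (TASK_INSTANCE_ID, PARAM_KEY, PARAM_VALUE). */
    static final class TaskParam {
        final String taskInstanceId;
        final String paramKey;
        final String paramValue;

        TaskParam(String taskInstanceId, String paramKey, String paramValue) {
            this.taskInstanceId = taskInstanceId;
            this.paramKey = paramKey;
            this.paramValue = paramValue;
        }
    }

    /** Finds the storage target configured for a task instance. */
    static StorageTarget targetFor(String taskInstanceId, List<TaskParam> params) {
        for (TaskParam p : params) {
            if (p.taskInstanceId.equals(taskInstanceId) && STORAGE_KEY.equals(p.paramKey)) {
                return StorageTarget.valueOf(p.paramValue);
            }
        }
        return StorageTarget.NAS; // assumed default: keep the original NAS behavior
    }

    public static void main(String[] args) {
        List<TaskParam> params = Arrays.asList(
                new TaskParam("instance-1", STORAGE_KEY, "NAS"),
                new TaskParam("instance-2", STORAGE_KEY, "OSS"),
                new TaskParam("instance-2", "bizDate", "20200101"));
        System.out.println(targetFor("instance-1", params)); // NAS
        System.out.println(targetFor("instance-2", params)); // OSS
    }
}
```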
Fig. 2 is a schematic diagram of a parallel write operation to the NAS and the OSS proposed in the present application. As shown in Fig. 2, so as not to affect the correct operation of the original service, the IO layer of the original task is encapsulated, so that each of the two task files generated later can determine, according to its own configuration (i.e., its task parameters), whether the storage layer writes it to the OSS or to the NAS. If the storage layer decision determines that the task file is to be stored in the OSS, the storage controller calls the OSS Client to perform read-write operations on the OSS or on local storage; otherwise, the storage controller calls Java IO to operate on the NAS, so as to remain compatible with the original logic.
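One way to picture the encapsulated IO layer is a common storage interface with one implementation per backend and a controller that selects between them. The sketch below is illustrative only: the interface and class names are hypothetical, the NAS is represented by a directory accessed through standard Java file IO, and an in-memory map stands in for the OSS Client so that no real object storage SDK call is assumed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class StorageRoutingSketch {

    /** Encapsulated IO layer: task code sees only this interface. */
    interface FileStorage {
        void write(String name, byte[] data);
        byte[] read(String name);
    }

    /** NAS path: plain Java file IO on a mounted directory. */
    static final class NasStorage implements FileStorage {
        private final Path mountPoint;

        NasStorage(Path mountPoint) { this.mountPoint = mountPoint; }

        public void write(String name, byte[] data) {
            try {
                Files.write(mountPoint.resolve(name), data);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        public byte[] read(String name) {
            try {
                return Files.readAllBytes(mountPoint.resolve(name));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Stand-in for the OSS path; a real system would call its OSS client here. */
    static final class InMemoryOssStorage implements FileStorage {
        private final Map<String, byte[]> objects = new ConcurrentHashMap<>();

        public void write(String name, byte[] data) { objects.put(name, data.clone()); }

        public byte[] read(String name) { return objects.get(name).clone(); }
    }

    /** Storage controller: picks the backend from the task's configuration. */
    static final class StorageController {
        private final FileStorage nas;
        private final FileStorage oss;

        StorageController(FileStorage nas, FileStorage oss) {
            this.nas = nas;
            this.oss = oss;
        }

        FileStorage select(boolean writeToOss) {
            return writeToOss ? oss : nas;
        }
    }

    public static void main(String[] args) {
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        StorageController controller =
                new StorageController(new NasStorage(tmp), new InMemoryOssStorage());
        byte[] data = "payload".getBytes(StandardCharsets.UTF_8);
        controller.select(false).write("routing-sketch-task1.txt", data); // goes to the NAS stand-in
        controller.select(true).write("routing-sketch-task2.txt", data);  // goes to the OSS stand-in
        System.out.println(new String(
                controller.select(true).read("routing-sketch-task2.txt"), StandardCharsets.UTF_8));
    }
}
```

Because the task code depends only on the interface, the original NAS logic stays untouched while the OSS backend is added alongside it.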
Fig. 3 is a more detailed schematic diagram of a parallel write operation to the NAS and the OSS proposed in the present application. As shown in Fig. 3, if the task parameters of the first task file indicate that it is to be written to the NAS, the first task file Task1 is written to the NAS; if the task parameters of the second task file indicate that it is to be written to the OSS, the second task file Task2 is first written to local storage and then uploaded to the OSS. The parallel write operations to the NAS and the OSS run independently and do not interfere with each other. In addition, because the OSS is used as the task file store and the task file is written locally on the server and then uploaded, the execution efficiency of the file is greatly improved.
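The "write locally, then upload" path for the OSS copy could look roughly like the following. The Uploader interface is a hypothetical hook standing in for whatever OSS client call a real deployment would use, and the object key is assumed to be a flat file name without directory separators.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class LocalThenUploadSketch {

    /** Hypothetical upload hook; a real system would invoke its OSS client here. */
    interface Uploader {
        void upload(String objectKey, Path localFile);
    }

    /**
     * Writes the task file to a local directory first, then hands the finished
     * file to the uploader. Returns the path of the local copy.
     */
    static Path writeLocallyThenUpload(String objectKey, byte[] content,
                                       Path localDir, Uploader uploader) {
        try {
            Files.createDirectories(localDir);
            Path localFile = Files.write(localDir.resolve(objectKey), content);
            uploader.upload(objectKey, localFile); // upload only after the local write completes
            return localFile;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path localDir = Paths.get(System.getProperty("java.io.tmpdir"), "upload-sketch");
        byte[] content = "task2 content".getBytes(StandardCharsets.UTF_8);
        writeLocallyThenUpload("task2.txt", content, localDir,
                (key, file) -> System.out.println("would upload " + file + " as " + key));
    }
}
```

Generating the file on local disk keeps the many small writes of task execution off the network; the remote store then sees a single upload of the completed file.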
In this embodiment, task files fall mainly into two types: input-class files and output-class files. The write operation differs by type. Fig. 4 is a schematic diagram of an input-class file write operation proposed in the present application. As shown in Fig. 4, for an input-class file, a task file from a client institution (mainly banks, securities brokers, fund companies, etc.) is first received via SFTP (Secure File Transfer Protocol); the task file is then copied into two task files; one of them is written to the NAS by the file exchange system, and the other is written to the OSS. The file batch processing system then processes the files in the OSS and the NAS simultaneously, and writing to the NAS stops once the OSS is stable.
Fig. 5 is a schematic diagram of an output-class file write operation proposed in the present application. As shown in Fig. 5, for an output-class file, a task file is first read from a DB (database); the task file is then copied into two task files; one of them is written to the NAS by the file batch processing system, and the other is written to the OSS. Writing to the NAS stops once the OSS is stable.
Based on the same inventive concept as the above-described file storage method, the present application provides a file storage apparatus, as described in the following embodiments. Because the principle of solving the problems of the file storage device is similar to the file storage method, the implementation of the file storage device can refer to the implementation of the file storage method, and repeated parts are not described again.
FIG. 6 is a schematic structural diagram of an embodiment of a file storage apparatus according to the present application, and as shown in FIG. 6, the apparatus may include:
A copying unit 10, configured to copy each task file into two task files. The copying unit 10 is the part of the file storage apparatus that copies task files; it may be software, hardware, or a combination of both, for example an interface, a processing chip, or another component that performs a file generation function.
A parallel writing unit 20, connected to the copying unit 10 and configured to write the two task files to the NAS and the OSS respectively in parallel. The parallel writing unit 20 is the part of the file storage apparatus that writes task files; it may be software, hardware, or a combination of both, for example an interface, a processing chip, or another component that performs a file writing function.
A comparison unit 30, connected to the parallel writing unit 20 and configured to compare whether each task file stored in the NAS is completely consistent with the corresponding task file stored in the OSS. The comparison unit 30 is the part of the file storage apparatus that compares the task files in the NAS and the OSS; it may be software, hardware, or a combination of both, for example an interface, a processing chip, or another component that performs a data comparison function.
A storage unit 40, connected to the comparison unit 30 and configured to write subsequent task files to the OSS when the comparison results are completely consistent throughout the specified duration. The storage unit 40 is the part of the file storage apparatus that stores task files; it may be software, hardware, or a combination of both, for example an interface, a processing chip, or another component that performs a file storage function.
Fig. 7 is a schematic structural diagram of the copying unit in an embodiment of the file storage apparatus described in this application. As shown in Fig. 7, the copying unit 10 may include: an instance generating module 12, configured to define two sets of task instances according to a task file; and a file generating module 14, configured to generate two task files according to the two sets of task instances. This achieves complete replication of the task file and ensures that the data of the two replicated task files is exactly the same as that of the original task file.
Fig. 8 is a schematic structural diagram of the parallel writing unit in an embodiment of the file storage apparatus described in the present application. As shown in Fig. 8, the parallel writing unit 20 may include: a first writing module 22, configured to, when the task file is an input-class file, receive the task file from a client institution via the Secure File Transfer Protocol (SFTP), copy the task file into two task files, and write the two task files in parallel to the NAS and the OSS respectively by means of the file exchange system; and a second writing module 24, configured to, when the task file is an output-class file, read the task file from the database, copy the task file into two task files, and write the two task files in parallel to the NAS and the OSS by means of the file batch processing system. In this embodiment, client institutions mainly include banks, securities brokers, fund companies, etc.
Fig. 9 is another schematic structural diagram of the parallel writing unit in an embodiment of the file storage apparatus described in this application. As shown in Fig. 9, the parallel writing unit 20 may include: a parameter setting module 21, configured to set the task parameters of the two task files respectively, where the task parameters indicate whether a task file is written to the NAS or to the OSS; a NAS writing module 23, configured to invoke Java IO to write one task file to the NAS; and an OSS writing module 25, configured to invoke the OSS Client to first write the other task file to local storage and then upload it to the OSS. The parallel write operations to the NAS and the OSS run independently and do not interfere with each other. In addition, because the OSS is used as the task file store and the task file is written locally on the server and then uploaded, the execution efficiency of the file is greatly improved.
This embodiment is mainly applied to internet wealth management services. The traditional NAS is replaced and the OSS is selected as the storage system for task files, which greatly improves file read-write performance and improves the reliability of the storage system.
From the above description, the present application is mainly applicable to large-file heterogeneous storage services, such as financial management, continental gold, etc., and has a certain degree of generality. The application is designed based on the Java language and can also be implemented in other high-level languages. Because the NAS carries a large number of important services (including Yu'e Bao ("balance treasure") earnings files, a bedding file and the like), the application adopts double writing to the NAS and the OSS storage system during the transition from the NAS to the OSS, so as not to affect the normal operation of existing services. This smooth migration is transparent to the services and is easy to roll back. To ensure the correctness and integrity of file output after task files are migrated to the OSS, the files in the two heterogeneous storage systems need to be compared. OSS is superior to NAS in reliability and security, can greatly improve the execution rate, maintains performance under concurrency, and costs less.
Although the present application provides method steps as described in an embodiment or flowchart, additional or fewer steps may be included based on conventional or non-inventive efforts. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. When an actual apparatus or client product executes, it may execute sequentially or in parallel (e.g., in the context of parallel processors or multi-threaded processing) according to the embodiments or methods shown in the figures.
The apparatuses or modules illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. For convenience of description, the above apparatus is described as being divided into modules by function, and the modules are described separately. When the present application is implemented, the functions of the modules may be implemented in one or more pieces of software and/or hardware. Of course, a module that implements a certain function may also be implemented by a combination of multiple sub-modules or sub-units.
The methods, apparatuses or modules described herein may be implemented by a controller executing computer-readable program code, and the controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, or the form of logic gates, switches, Application Specific Integrated Circuits (ASICs), programmable logic controllers or embedded microcontrollers. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer-readable program code, the same functionality can be achieved by logically programming the method steps so that the controller takes the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing the various functions may also be regarded as structures within the hardware component. The means for performing the functions may even be regarded both as software modules for performing the method and as structures within the hardware component.
Some of the modules in the apparatus described herein may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, classes, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus the necessary hardware. Based on such understanding, the part of the technical solution of the present application that in essence contributes over the prior art may be embodied in the form of a software product, or may be embodied in the implementation process of data migration. The computer software product may be stored in a storage medium such as ROM/RAM, a magnetic disk or an optical disk, and includes instructions for causing a computer device (which may be a personal computer, a mobile terminal, a server, a network device, etc.) to perform the methods described in the embodiments, or in portions of the embodiments, of the present application.
The embodiments in the present specification are described in a progressive manner, and the same or similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. All or portions of the present application are operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, mobile communication terminals, multiprocessor systems, microprocessor-based systems, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
While the present application has been described with examples, those of ordinary skill in the art will appreciate that there are numerous variations and permutations of the present application without departing from the spirit of the application, and it is intended that the appended claims encompass such variations and permutations without departing from the spirit of the application.