WO2023146237A1

WO2023146237A1 - Intelligent data management and storage device, and intelligent data management and storage method using same

Info

Publication number: WO2023146237A1
Application number: PCT/KR2023/001051
Authority: WO
Inventors: 박성원; 정철영; 정진홍
Original assignee: (주)한국소프트웨어아이엔씨; 주식회사 비에이치에이
Priority date: 2022-01-26
Filing date: 2023-01-20
Publication date: 2023-08-03

Abstract

The present specification relates to an intelligent data management and storage method. The method comprises the steps of: collecting file information for a file to be managed; analyzing and classifying the file to be managed, by category, on the basis of the collected file information; optimizing the classified file according to a first preset criterion; and selecting a data storage location according to a second preset criterion and storing the optimized file. Accordingly, data to be managed can be optimized and stored at minimal cost.

Description

Intelligent data management and storage device and intelligent data management and storage method using the same

The present specification relates to an intelligent data management and storage device and an intelligent data management and storage method using the same.

Data storage technology is a technology that stores data in a data storage (storage medium) to secure data stability and smoothly support business.

Existing data storage technology uses a method of reading original data and storing a copy of the read data as it is in a data storage, or compressing and storing the read data in a data storage to reduce data storage space.

Existing data storage technologies store data without considering its type or characteristics, so the cost of managing and storing the original data is high, and the cost of managing and storing the data held in the data storage cannot be optimized. In addition, existing data storage technology uses a network to store original data in a remote data store or cloud data store, and because the size of the original data affects performance, the original data needs to be optimized.

In particular, as Fourth Industrial Revolution technologies such as artificial intelligence, big data, and cloud computing develop, data management and storage technology that can accommodate changes in vast amounts of data and various types of data has become necessary.

The problems to be solved by the present invention are not limited to the problems mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the description below.

This specification presents an intelligent data management and storage method. The method may include collecting file information on a management target file; analyzing and classifying the management target file for each category based on the collected file information; optimizing the classified files according to a first preset criterion; and selecting and storing the optimized file in a data storage according to a second preset criterion.

The intelligent data management and storage method and other embodiments may include the following features.

According to an embodiment, classifying the management target files by category based on the collected file information may include classifying the management target files by usage time period based on usage time information of the collected files, and optimizing the classified files according to a first preset criterion may include compressing files of a specific time period among the classified files.

According to an embodiment, classifying the management target files by category based on the collected file information may include classifying the management target files by file type based on file type information of the collected files, and optimizing the classified files according to a first preset criterion may include compressing the classified files using a different compression method for each file type.

According to an embodiment, the step of selecting and storing the optimized file in a data store according to a second preset criterion may include analyzing a use cost of each of a plurality of data stores; and storing the optimized file in a data storage having a minimum usage cost among the plurality of data storages.

According to an embodiment, the usage cost may be determined by at least one of a type of storage medium of the data store, an installation location of the data store, an owner of the data store, and a purpose of the data store.

According to an embodiment, the usage cost may be determined by a 1 TB storage cost index (T), and the 1 TB storage cost index (T) may be calculated as the result of dividing the price (S) of a data storage provider's 1 TB storage space provision service by the storage cost per 1 TB (C).

According to an embodiment, the categories include file size, file characteristics, and file use frequency, and classifying the management target files by category based on the collected file information may include classifying the management target files by file size range, file characteristic, or file use frequency range based on the collected file information.

According to an embodiment, the method may further include providing data information on the management target file by virtualizing the management target file.

Meanwhile, this specification presents a computer program. The computer program may be a computer program stored in a computer readable recording medium in order to execute each step included in the intelligent data management and storage method according to any one of the above steps by being combined with computer hardware.

On the other hand, this specification presents an intelligent data management and storage device. The device includes a memory for storing one or more instructions and a processor for executing the instructions, wherein, when the instructions are executed, the processor collects file information about a file to be managed, classifies the file to be managed by category based on the collected file information, optimizes the classified file according to a first preset criterion, and selects a data storage according to a second preset criterion and stores the optimized file in it.

According to an embodiment disclosed in this specification, by optimizing data in consideration of characteristics of original data to be managed, there is an effect of minimizing management cost or storage cost.

On the other hand, the effects obtainable in the present invention are not limited to the effects mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

The following drawings attached to this specification illustrate preferred embodiments of the present invention and, together with the specific details for carrying out the invention, serve to further the understanding of the technical idea of the present invention. The present invention should therefore not be construed as limited to the matters shown in these drawings.

FIG. 1 shows an example of an intelligent data management and storage system including an intelligent data management and storage device according to an embodiment.

FIG. 2 is a block diagram illustrating the configuration of an intelligent data management and storage device according to an embodiment.

FIG. 3 is a diagram illustrating an intelligent data management and storage method using an intelligent data management and storage device according to an embodiment.

FIG. 4 illustrates an example of visualizing data analysis results based on usage time among the time indexes.

FIG. 5 shows an example of visualizing data analysis results according to file types.

FIG. 6 is a diagram explaining procedures of data analysis and data optimization according to time and file type.

FIG. 7 is a diagram illustrating a system configuration for implementing a backup using an intelligent data management and storage method according to an embodiment.

FIG. 8 is a table illustrating in detail the functions of data analysis, data management, and data storage performed by the intelligent data management and storage method according to the above-described embodiment.

FIG. 9 is a block diagram of an AI device according to an embodiment of the present invention.

The technology disclosed herein can be applied to data management and storage technology. However, the technology disclosed in this specification is not limited thereto, and may be applied to all devices and methods to which the technical spirit of the technology may be applied.

It should be noted that technical terms used in this specification are only used to describe specific embodiments and are not intended to limit the spirit of the technology disclosed in this specification. In addition, unless specifically defined otherwise in this specification, technical terms used here should be interpreted in the sense commonly understood by those of ordinary skill in the field to which the disclosed technology belongs, and should not be interpreted in an overly comprehensive or excessively narrow sense. When a technical term used in this specification is an incorrect term that does not accurately express the spirit of the disclosed technology, it should be replaced with a technical term that can be correctly understood by those of ordinary skill in the field. General terms used in this specification should be interpreted as defined in dictionaries or according to context, and should not be interpreted in an excessively narrow sense.

Terms including ordinal numbers such as first and second used herein may be used to describe various components, but the components should not be limited by the terms. These terms are only used for the purpose of distinguishing one component from another. For example, a first element may be termed a second element, and similarly, a second element may be termed a first element, without departing from the scope of the present invention.

Hereinafter, the embodiments disclosed in this specification will be described in detail with reference to the accompanying drawings. The same or similar components are assigned the same reference numerals regardless of the figure in which they appear, and redundant description thereof will be omitted.

In addition, in describing the technology disclosed in this specification, if it is determined that a detailed description of a related known technology may obscure the gist of the technology disclosed in this specification, the detailed description will be omitted. In addition, it should be noted that the accompanying drawings are only intended to facilitate understanding of the spirit of the technology disclosed in this specification, and should not be construed as limiting the spirit of the technology by the accompanying drawings.

Hereinafter, the method proposed in this specification will be described in detail.

Intelligent data management and storage technology refers to a technology that sets analysis criteria for data to be managed with a specific metric, analyzes the data according to the criteria, and then optimizes the data. In general data management and storage technology, the original data is kept in the storage device as it is, but the data optimization in this specification also optimizes the storage space of the original data by performing optimization such as compression, deduplication, and virtualization of the original data itself. When saving data to local storage, remote storage, or cloud storage, data storage space can be minimized and network usage can be minimized.

To store data, it is necessary to define the type of data, the file type, the database type (hereinafter referred to as DB), and the data storage.

The type of data can be classified into file-type data or DB-type data.

The file type may be classified into structured and unstructured files including text files, document files, image files, audio files, video files, and other multimedia data.

In addition, DBs include all existing commercial DBs (Oracle, MSSQL, Informix, DB2, SYBASE, MySQL, and all other commercially used DBs), open source-based DBs, and domestically developed and commercialized DBs.

Types of data storage can be divided into local storage, remote storage, cloud storage, and other storage.

Hereinafter, an intelligent data management and storage device for intelligent data management and storage according to an embodiment of the present invention will be described.

Referring to FIG. 1, the intelligent data management and storage system 10 may include an intelligent data management and storage device 100, a plurality of user terminals 200 that are management target computing devices, and a plurality of storage device servers 300.

The intelligent data management and storage device 100 is connected to a plurality of user terminals 200, collects and analyzes information on data stored in the plurality of user terminals 200, and then, according to the analysis result, provides services such as compressing, moving, storing, and virtualizing the data of the user terminals 200.

The plurality of user terminals 200 are management target computing devices that receive data management services provided by the intelligent data management and storage device 100, and may mean PCs connected to the intelligent data management and storage device 100 through a network. A user terminal 200 may also mean a local data server device to which a plurality of PCs are connected to store and share data. The plurality of user terminals 200 receive and install the client program provided by the intelligent data management and storage device 100, and through it provide information about the stored data and management rights over it (compression, movement, storage, copy, virtualization, etc.) to the intelligent data management and storage device 100.

The plurality of storage device servers 300 may refer to remote storage and cloud storage connected to the intelligent data management and storage device 100 through a network, excluding the local storage provided in the plurality of user terminals 200. The intelligent data management and storage device 100 reads data from the plurality of user terminals 200 and stores it in the plurality of storage device servers 300, or provides user data stored in the plurality of storage device servers 300 to the plurality of user terminals 200.

Referring to FIG. 2, an intelligent data management and storage device 100 for implementing the method or function proposed in this specification may include a control unit 110, a storage unit 130, a bus (not shown) for data transmission, a communication unit 120 for communicating with external devices, an output unit 140, and an input unit 150. The illustrated components are not essential, so an intelligent data management and storage device 100 having more or fewer components may be implemented. In addition, these components may be implemented as hardware or software, or through a combination of hardware and software.

The control unit 110 may refer to any type of device capable of processing data, such as a processor. The control unit 110 may be configured to execute processing logic to perform the various operations and steps described herein. Here, the 'processor' may refer, for example, to a data processing device embedded in hardware that has a physically structured circuit to perform functions expressed by code or instructions included in a program. Examples of such a data processing device built into hardware include a microprocessor, a central processing unit (CPU), a processor core, a multiprocessor, an application-specific integrated circuit (ASIC), and a field programmable gate array (FPGA), but the scope of the present invention is not limited thereto.

The communication unit 120 may be composed of wired and/or wireless communication modules. For example, the communication unit 120 may include a wireless communication module such as wireless fidelity (Wi-Fi), Bluetooth, Zigbee, near field communication (NFC), or wireless broadband Internet (Wibro), and a wired communication module such as a wired LAN, for example Ethernet. The communication unit 120 may transmit and receive data by performing wired/wireless communication with other devices through a network.

The storage unit 130 may include magnetic storage media or flash storage media, but the scope of the present invention is not limited thereto.

The output unit 140 generates output related to sight, hearing, or touch, and may include, for example, the display unit 145. The display unit 145 may implement a touch screen by forming a layered structure with a touch sensor or by being formed integrally with it. Such a touch screen may function as a user input unit providing an input interface between the device and the user, and may also provide an output interface between the device and the user.

The input unit 150 may be connected to the control unit 110 through an input/output interface connected to a bus. The input unit 150 may include a keyboard, a pointing device, a microphone, a joystick, a touch pad, a scanner, and the like. The input/output interface may include any one of a wide variety of interfaces such as a serial port interface, a PS/2 interface, a parallel port interface, a USB interface, and an IEEE 1394 interface, or may logically represent a combination of other interfaces.

On the other hand, the intelligent data management and storage device 100 according to the embodiment may be implemented as a separate device (the device of FIG. 2) connected to and operating with the user terminal 200 as shown in FIG. 1, or its functions may be included in the user terminal 200 and implemented as computing elements of the user terminal 200.

The control unit 110 performs the functions of the intelligent data management and storage device 100 to be described later, that is, processes and procedures of an intelligent data management and storage method.

Referring to FIG. 3, in the intelligent data management and storage method performed by the intelligent data management and storage device 100 according to the embodiment, data category-based data analysis is performed (S110) before copy data generated by reading original data 210 from the user terminal 200 is stored in the data storage provided in the storage device server 300. Then, according to the analysis result, the original data is optimized by methods such as compression and deduplication (S120), and the optimized data 220 is stored in local storage (S130) or in a data storage provided in the storage device server 300 (S140). The method also supports virtualization of the original data (S150), so that even if the original data 210 stored in local storage is deleted after being optimized and moved to the data store, information about the data can easily be checked through the virtualized original data 211.

When data category-based analysis data has accumulated, the intelligent data management and storage device 100 trains an artificial intelligence-based data analysis module on this data so that it can automatically analyze data and then optimize it. The artificial intelligence-based data analysis module uses the original data and the classification data assigned by the user as training data to learn how to classify data into categories and to learn optimization patterns for each data category. When new data arrives, the module automatically analyzes it and classifies it into a category, so that optimization can be performed in an appropriate way.

The intelligent data management and storage method according to the embodiment classifies the original data to be managed according to its characteristics, analyzes data call patterns, and analyzes how to minimize data storage costs. Based on these analyses, the original data can be managed, or moved and stored, by automatically or manually compressing it, removing redundant data, or virtualizing it.

Intelligent data management and storage methods according to embodiments may be classified into data analysis, data management, and data storage based on data categories.

[1] Data analysis function

Hereinafter, a data analysis function by an intelligent data management and storage method according to an embodiment will be described.

The data analysis function means analyzing data or files based on categories, classifying the data according to types and characteristics, reporting the classification results to the user, and optimizing the data based on the results.

At this time, data optimization is performed based on categories, and the original data is optimized and stored in the order of optimization for infrequently used data, optimization according to data type, reading/writing other data, or optimization by other categories. The virtualization of the original data provides the user with ledger information on the original data. The categories corresponding to data optimization criteria can be classified as (1) Time, (2) File Type, (3) File Size, (4) File Property, and (5) File Frequency (frequency of use).

(1) Category classification according to time

Category classification analysis based on time divides data into infrequently used data and frequently used data based on the frequency of data use, optimizes and saves the infrequently used data, and virtualizes the original data. It provides the information of the original data to the user.

All data (files or DBs) in the user terminal 200 may have three time indexes: the time at which the data was created (creation time), the time at which the data was modified (modification time), and the time at which the data was called or read (access time).
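
As an illustration only, the three time indexes could be read from ordinary file metadata as in the following Python sketch. The function name `time_indexes` and the dictionary keys are hypothetical, and the use of `os.stat` is an assumption: the specification does not state how the client program collects these values. Note that `st_ctime` is the creation time on Windows but the last metadata change time on most Unix systems.

```python
import os
import tempfile
from datetime import datetime
from typing import Dict


def time_indexes(path: str) -> Dict[str, datetime]:
    """Collect the three time indexes of a file from its metadata."""
    st = os.stat(path)
    return {
        # st_ctime is the creation time on Windows but the last metadata
        # change time on most Unix systems, so it is only an approximation.
        "creation": datetime.fromtimestamp(st.st_ctime),
        "modification": datetime.fromtimestamp(st.st_mtime),
        "access": datetime.fromtimestamp(st.st_atime),
    }


# Usage: inspect a temporary file created just for this example.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example")
indexes = time_indexes(tmp.name)
os.remove(tmp.name)
```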

The intelligent data management and storage method according to the embodiment may analyze whether data is in use based on the three time indexes of the data: creation time, modification time, and call time. Based on usage time, data is divided into four levels, for example: files used within 6 months of the current time (level 4), files used between 6 and 12 months ago (level 3), files that have not been used for more than 12 months (level 2), and unused free space (level 1). The intelligent data management and storage method analyzes the usage time of the data, assigns a level for optimization, and then either optimizes the data in that level and stores it in the system, or backs it up to a storage with low storage cost.

Roughly speaking, statistics show that only about 20% of the total data in a system belongs to level 4, while about 80% belongs to level 3 or below. For example, in the case of e-mail, once an attachment has been downloaded, the attachment is rarely read again within the e-mail. In addition, when data is optimized by compression, the compression rate varies with the type of data, but statistically the compression rate of data other than video or image data is 80% or higher on average, so at least 60% of the storage space needed for the original data can be saved (80% of the data reduced by 80% gives about 64%). The free space corresponding to level 1, that is, unused storage space, contains no data but is not initialized due to the nature of the system, so it may take a long time to optimize. Optimization time can therefore be reduced by initializing this space with '0' or '1'.
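
The four-level scheme can be sketched in Python as follows, reading the example thresholds as 6 and 12 months. The function name `usage_level`, the 30-day month, and the convention that `None` stands for free space are assumptions made for illustration; the specification does not define how the last-use time is derived from the three time indexes.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumption: a "month" is approximated as 30 days; the source does not define it.
DAYS_PER_MONTH = 30


def usage_level(last_used: Optional[datetime], now: datetime) -> int:
    """Return the usage-time level (4 = most recently used, 1 = free space)."""
    if last_used is None:
        # Assumption: None marks space that holds no file (unused free space).
        return 1
    age = now - last_used
    if age <= timedelta(days=6 * DAYS_PER_MONTH):
        return 4  # used within 6 months
    if age <= timedelta(days=12 * DAYS_PER_MONTH):
        return 3  # used between 6 and 12 months ago
    return 2      # not used for more than 12 months


now = datetime(2023, 1, 20)
levels = {
    "report.docx": usage_level(datetime(2022, 12, 1), now),
    "old_scan.tif": usage_level(datetime(2021, 1, 1), now),
}
```

Files at levels 1 to 3 would then be the candidates for optimization, while level 4 files stay as they are.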

Meanwhile, the method for managing and storing intelligent data according to the embodiment may classify the entire data into four levels based on the usage time, visualize it in a graph or table, and provide analysis results of the data.

Referring to FIG. 4, of the total storage capacity of 200 GB provided in the user terminal 200, the intelligent data management and storage method according to the embodiment keeps the 24 GB of recently used data corresponding to level 4 as it is, selects the remaining 176 GB corresponding to levels 1 to 3 as the optimization target 410, and reduces the selected data by 80%, thereby saving about 140 GB of storage space on the user terminal 200. At this time, the method performs data compression, redundant data removal, and virtualization on the 176 GB of data to be optimized, and stores the result in local storage in the user terminal 200 or in another storage, thereby optimizing the local storage.

(2) Category classification according to file type
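
The arithmetic behind this example can be checked with a few lines of Python. The function name `estimated_savings_gb` is hypothetical, and the 80% figure is treated here as the fraction of the optimization target that is saved, which is the reading consistent with the roughly 140 GB stated for FIG. 4.

```python
from typing import Tuple


def estimated_savings_gb(total_gb: float, recent_gb: float,
                         reduction_rate: float) -> Tuple[float, float]:
    """Return (optimization target size, estimated space saved) in GB."""
    target_gb = total_gb - recent_gb      # levels 1 to 3: everything but level 4
    saved_gb = target_gb * reduction_rate
    return target_gb, saved_gb


# Figures taken from the FIG. 4 example: 200 GB total, 24 GB at level 4.
target_gb, saved_gb = estimated_savings_gb(200, 24, 0.80)
print(f"target: {target_gb} GB, saved: about {saved_gb:.0f} GB")
```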

The local storage of the user terminal 200 includes an OS file area used by an operating system (OS) and a user data area. User data is classified and stored in the form of files or DBs. The intelligent data management and storage method according to the embodiment classifies files into text files, document files, image files, video files, audio files, and other types, calculates the number of files and the data size for each type, produces statistics, and visualizes the analysis result in the form of a graph or table. Through the visualization result, the user can easily see which type of file is most used in a specific system (for example, a system used in a hospital holds a lot of video or image data, and the analysis result reflects this), and can obtain information about the size of the OS area, the contents and size of the file area, and the contents and size of the DB area. The user can optimize according to the type of data by selectively applying data optimization for each file type through the intelligent data management and storage method provided by the intelligent data management and storage device 100. When compression is used for optimization, compression for general file types and compression techniques for video data differ significantly, so optimal results can be obtained by selecting the compression technique best suited to each type of data.

Referring to FIG. 5, image data and video data 510 occupy a large portion of the total data storage space of 400 GB of a user terminal 200. When a compression method is applied for optimization, data compression can therefore be optimized by selecting, instead of uniform data compression, the compression technique that provides the highest compression ratio for image data and video data. Data other than image data and video data uses a compression method optimized for each type of data. Meanwhile, for data optimized by compression, data virtualization makes it possible to provide the user with the data information in its original state.

(3) Category classification according to file size
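
One way to picture per-type selection of a compression technique is to try several codecs on a sample of each file type and keep the one that yields the smallest output. The Python sketch below does this with the general-purpose codecs of the standard library (`zlib`, `bz2`, `lzma`). These codecs, the names `best_codec` and `codec_per_type`, and the sample data are stand-ins chosen for illustration; the specification does not name the compression techniques it uses, and dedicated image or video codecs are outside this sketch.

```python
import bz2
import lzma
import zlib
from typing import Callable, Dict

# Candidate codecs (assumption: general-purpose stdlib codecs as stand-ins).
CODECS: Dict[str, Callable[[bytes], bytes]] = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def best_codec(sample: bytes) -> str:
    """Return the name of the codec that compresses the sample the most."""
    return min(CODECS, key=lambda name: len(CODECS[name](sample)))


def codec_per_type(samples: Dict[str, bytes]) -> Dict[str, str]:
    """Choose one codec per file type from a representative sample of each."""
    return {file_type: best_codec(data) for file_type, data in samples.items()}


# Hypothetical samples standing in for two file types.
samples = {
    "text": b"intelligent data management " * 500,
    "binary": bytes(range(256)) * 50,
}
chosen = codec_per_type(samples)
```

Sampling keeps the selection cheap: the codec is chosen once per type rather than once per file.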

Category classification by file size can divide files into large files, medium files, and small files. For example, all file sizes are collected and their average is computed. Files whose size is close to the average are classified as medium files, files smaller than that as small files, and files larger than that as large files. Data to be optimized is classified according to file size, and the classification result can be provided to the user.

(4) Category classification according to file properties
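
A minimal Python sketch of this average-based classification follows. How close to the average a file must be to count as medium is not stated in the specification, so the `tolerance` band of plus or minus 50% is purely an assumption, as are the function name `classify_by_size` and the sample sizes.

```python
from statistics import mean
from typing import Dict


def classify_by_size(sizes: Dict[str, int], tolerance: float = 0.5) -> Dict[str, str]:
    """Label each file small, medium, or large relative to the average size.

    tolerance is an assumed band: sizes within +/- tolerance of the average
    count as "close to the average" and are labeled medium.
    """
    avg = mean(sizes.values())
    low, high = avg * (1 - tolerance), avg * (1 + tolerance)
    labels = {}
    for name, size in sizes.items():
        if size < low:
            labels[name] = "small"
        elif size > high:
            labels[name] = "large"
        else:
            labels[name] = "medium"
    return labels


# Hypothetical file sizes in MB; the average is 100.
labels = classify_by_size({"a.txt": 10, "b.pdf": 100, "c.mp4": 190})
```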

Category classification by file characteristics can be largely divided into four categories: system files, personal files, business files, and public files. Classification by file characteristics means analyzing the number and size of the files in each of these four categories.

System files refer to files or program code stored in the area used by the operating system (the OS area) within the local storage space of the user terminal 200. The intelligent data management and storage method according to the embodiment calculates the size of the storage space used by the OS area and its ratio to the total storage space.

Personal files refer to files that involve personal privacy. In consideration of individual privacy, these data or files are encrypted when backed up or replicated, and are managed so that no one, whatever their authority, can read, write, or delete them without being authenticated as the individual concerned. The intelligent data management and storage method according to the embodiment calculates the size of the storage space used by personal files and its ratio to the total storage space.

Business files refer to files used while conducting company business. Files used for business follow a shared-folder concept and are shared under a license concept, and backup or replication must be performed to protect the data, in direct connection with its storage. The intelligent data management and storage method according to the embodiment calculates the size of the storage space used by business files and its ratio to the total storage space.

Public files refer to files created through search or download on the Internet. Because many such files or data are duplicated among multiple people within a group, duplicate files are removed so that a single consistent copy of the file or data is kept, and the remaining references are connected to it using links. The intelligent data management and storage method according to the embodiment calculates the size of the storage space used by public files and its ratio to the total storage space.

(5) Category classification according to file usage frequency
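
The idea of keeping one copy of a duplicated public file and linking the rest to it can be sketched in Python as below. Detecting duplicates by a SHA-256 content hash is an assumption, since the specification does not say how duplicates are identified, and the function name `deduplicate` and the in-memory mapping of paths to contents are hypothetical simplifications of a real file system.

```python
import hashlib
from typing import Dict, Tuple


def deduplicate(files: Dict[str, bytes]) -> Tuple[Dict[str, bytes], Dict[str, str]]:
    """Keep one copy per distinct content and link the duplicates to it.

    Returns (kept, links): kept maps a path to its content, and links maps
    each removed duplicate path to the path of the copy that was kept.
    """
    first_seen: Dict[str, str] = {}   # content hash -> path of the kept copy
    kept: Dict[str, bytes] = {}
    links: Dict[str, str] = {}
    for path, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            links[path] = first_seen[digest]   # duplicate: store a link only
        else:
            first_seen[digest] = path
            kept[path] = content
    return kept, links


# Hypothetical downloads shared by two users.
kept, links = deduplicate({
    "user1/manual.pdf": b"same bytes",
    "user2/manual.pdf": b"same bytes",
    "user2/notes.txt": b"other bytes",
})
```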

Meanwhile, category classification according to file use frequency analyzes how often files are used by users or the system, builds a history of the most frequently used files, and analyzes and manages information on recently used files in order to provide it to users. For example, among files used during a preset period, files may be classified according to a preset file use frequency, and files of the corresponding classification may be optimized according to a preset criterion.

Hereinafter, as an example of data optimization according to category classification, data category classification according to time and file type and data optimization according to the classification result will be described with reference to FIG. 6 attached.

Referring to FIG. 6 , the file classification module 610 of the intelligent data management and storage device 100 provides storage directory information 601 for a management target file from the user terminal 200, file access time information 602, and file access time information 602. After receiving type information 603, based on the received information, files are collected based on time (S601), and files are classified based on time (S603). Next, the file classification module 610 collects files based on the file type based on the received information (S605) and classifies the files based on the file type (S607). Classification results classified by time criteria and types are stored in the intelligent data management and storage device 100 (S609), and each classification result (621, 622) is visualized in the file classification module (610) (S611). After being output to the user, data optimization is performed manually or automatically by the user's selection (S613). Visualization is output in the form of tables or graphs so that users can easily understand the analysis results at a glance.

(6) Optimization of data based on analysis of number of files

Meanwhile, the intelligent data management and storage method according to the embodiment may analyze the number of files constituting management target data and optimize the data according to the analysis result.

Analysis by the number of files classifies the data of the user terminal 200 according to the total number of files that the user has on the user terminal 200. For example, the data is classified as small grade (Small) if the number of files is 100,000 or less, normal grade (Normal) if it is between 100,000 and 1,000,000, and large grade (Large) if it is more than 1,000,000. In data management, processing speed and transmission speed decrease as the number of files increases. Therefore, when the number of files is large, data management is optimized by processing or transmitting the files in units of volumes or by using an archive.
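A minimal sketch of the grading rule, together with one possible way of bundling many files into a single archive before transmission, is shown below. The thresholds come from the example above; the choice of the tar format and the function names are assumptions, and the archive is built in memory only to keep the example self-contained.

```python
import io
import tarfile

def grade_by_file_count(file_count):
    """Grade a data set by its total number of files."""
    if file_count <= 100_000:
        return "Small"
    if file_count <= 1_000_000:
        return "Normal"
    return "Large"

def bundle_as_archive(files):
    """Pack a mapping of name -> bytes into one in-memory tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()

archive_bytes = bundle_as_archive({"a.txt": b"hello", "b.txt": b"world!"})
```

Transferring one archive instead of many individual files avoids per-file overhead, which is the motivation given above for treating the Large grade differently.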

(7) Data optimization based on storage cost analysis

Meanwhile, the intelligent data management and storage method according to the embodiment may analyze data storage costs and optimize the data according to the analysis result.

The cost of storing data depends on the type of storage medium (disk, Virtual Tape Library (VTL), Solid-State Drive (SSD), memory, tape, USB, etc.), and also on whether the storage medium is used for personal or business purposes. In addition, storage costs vary depending on whether data is stored locally, remotely, or in the cloud. The data storage location may vary depending on whether it is a personal storage location, a corporate storage location, or a data center, and may also vary depending on the region (city) where the data center used is located.

The intelligent data management and storage method according to the embodiment expresses the storage cost as a formula whose variables are the factors that may change the storage cost, converts the storage cost into absolute and relative values, finds the storage that minimizes the storage cost, and can move data to that storage.

The storage cost (C) of 1 TB can be calculated by Equation 1 below.

<Equation 1> 1 TB storage cost (C) = (cost of total TB used by the system) / (total number of TB)

Therefore, when the price of a 1 TB storage space provision service of a cloud service provider (CSP) is S, the 1 TB storage cost index (T) can be calculated by the following <Equation 2>.

<Equation 2> 1 TB storage cost index (T) = S / C

Since the storage cost becomes lower as the 1 TB storage cost index T decreases, the intelligent data management and storage method according to the embodiment minimizes the storage cost by using the storage cost index when selecting a storage in which to store data. Even for data that has already been stored, the method may calculate the storage cost index and then move the data, according to a policy, to the data store where the storage cost index is lowest.
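The two equations and the selection rule can be sketched as follows. The specification does not say how S and C are obtained for each candidate store, so this sketch simply assumes that every candidate carries its own values; the `DataStore` class, its field names, and the sample figures are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DataStore:
    name: str
    total_cost: float             # cost of all TB used by the system
    total_tb: float               # total number of TB
    provider_price_per_tb: float  # S: price of the provider's 1 TB service

def cost_per_tb(store):
    """Equation 1: C = (cost of total TB used) / (total number of TB)."""
    return store.total_cost / store.total_tb

def cost_index(store):
    """Equation 2: T = S / C."""
    return store.provider_price_per_tb / cost_per_tb(store)

def select_store(stores):
    """Pick the store with the lowest 1 TB storage cost index T."""
    return min(stores, key=cost_index)

candidates = [
    DataStore("store_a", total_cost=1000.0, total_tb=10.0, provider_price_per_tb=50.0),
    DataStore("store_b", total_cost=2000.0, total_tb=10.0, provider_price_per_tb=60.0),
    DataStore("store_c", total_cost=500.0, total_tb=10.0, provider_price_per_tb=40.0),
]
chosen = select_store(candidates)
```

The same `select_store` call could be rerun periodically over already stored data to decide whether a policy-driven move to another store is worthwhile.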

(8) Data optimization based on storage performance

Meanwhile, the intelligent data management and storage method according to the embodiment collects and analyzes data on the speed of storing and reading data, and then visualizes the analysis result to provide users with storage performance figures for all storage media. It supports data movement or data optimization so that the most frequently used data can be stored in the storage with the best performance.

[2] Data management function

Hereinafter, a data management function by an intelligent data management and storage method according to an embodiment will be described.

The data management function refers to a data storage management function that manages the data stores and the storage media in which data is stored. The data storage management function covers designated multiple data stores, local storage, remote storage, and cloud storage, and has management capabilities for disk (DISK), virtual tape library (VTL), physical tape library (PTL), Storage Area Network (SAN), and Network Attached Storage (NAS). It also has a monitoring function that provides detailed information about running and terminated processes, and a reporting function that provides detailed information about processes that have been performed.

The intelligent data management and storage method according to an embodiment may transfer or back up the management target data of the user terminal 200 to a storage selected, based on the storage cost analysis result for each storage medium described above, from among the data stores connected to a network, and may store the optimized data there.

[3] Data storage function

Hereinafter, a data storage function by an intelligent data management and storage method according to an embodiment will be described.

As data storage functions, the intelligent data management and storage method according to the embodiment provides a backup function that reads the original data once and stores it on a point-in-time basis, a replication function that reflects and stores changes in real time whenever the original data is changed, and a hybrid function that combines the backup and replication functions.

Referring to FIG. 7, the intelligent data management and storage method implements data backup using a backup UI module 710, a backup master module 720, a backup client module 730, and a backup server module 740.

The backup UI module 710 is a software module installed in the intelligent data management and storage device 100 or in the user terminal 200 that performs the functions of the intelligent data management and storage device 100, and provides a user interface (UI) for managing all backup and recovery operations. Through this UI, users can set backup policies, schedule backups, register clients to be backed up, register backup servers, manage data recovery, and check the results of backup and recovery. The user interface screen provided by the backup UI module 710 divides and displays the data storage areas into local storage holding the original data and remote storage or cloud storage. With a mouse click on the screen, users can easily save data from local storage to local storage, from local storage to remote storage, from local storage to cloud storage, from remote storage to local storage, from cloud storage to local storage, and from cloud storage to remote storage.

The backup master module 720 is a software module installed in the intelligent data management and storage device 100 or in the user terminal 200 that performs the function of the intelligent data management and storage device 100. It manages and controls backup as a whole, including the backup client module 730 and the backup server module 740, and stores all backup-related data in a database (MySQL/PostgreSQL). The intelligent data management and storage device 100 may be a Linux server device or the like; in this case, a user controls the backup master module 720 through the backup UI module 710 installed in a terminal such as a Windows PC.

The backup client module 730, such as a Windows system-based backup client module 731, a Linux server-based backup client module 732, or a Unix server-based backup client module 733, is a software module installed in a server device that stores the data to be backed up (the user terminals 200 of FIG. 1). It reads the data to be backed up and transfers it to the data storage of the server device on which the backup server module 740 is installed.

The backup server module 740, such as a Windows system-based backup server module 741, a Linux server-based backup server module 742, or a Unix server-based backup server module 743, is a software module installed in a server device having backup storage (the storage servers 300 of FIG. 1). It communicates with the backup master module 720 and the backup client module 730, sets the storage in which data is to be stored, reads data from the backup client module 730, and serves as a repository that stores the read data.

Hereinafter, an artificial intelligence processing module applicable to the apparatus and method proposed in this specification, particularly the aforementioned artificial intelligence-based data analysis module, and its application will be described with reference to FIG. 9 . That is, it is clarified that the content to be described later, that is, the content related to FIG. 9 can be applied to implement the method proposed in this specification.

Artificial intelligence (hereinafter referred to as AI) can use numerous analyses to determine how a complex target task should be performed. In other words, AI can increase efficiency and reduce processing delays. Time-consuming tasks such as analyzing large amounts of data can be performed quickly by using AI.

Hereinafter, machine learning, a type of AI, will be looked at in more detail.

Machine learning refers to a series of operations that train a machine so that it can perform tasks that humans can or cannot do. Machine learning requires data and a learning model. In machine learning, data learning methods can be largely classified into three types: supervised learning, unsupervised learning, and reinforcement learning.

Neural network training aims to minimize errors in the output. It repeatedly inputs training data to the neural network, calculates the error between the output of the neural network for the training data and the target, and backpropagates the error from the output layer of the neural network toward the input layer in a direction that reduces the error, thereby updating the weight of each node in the neural network.

Supervised learning uses training data in which correct answers are labeled in the learning data, and unsupervised learning may not have correct answers labeled in the learning data. That is, for example, learning data in the case of supervised learning related to data classification may be data in which each learning data is labeled with a category. Labeled training data is input to the neural network, and an error may be calculated by comparing the output (category) of the neural network and the label of the training data. The calculated error is back-propagated in a reverse direction (ie, from the output layer to the input layer) in the neural network, and connection weights of each node of each layer of the neural network may be updated according to the back-propagation. The amount of change in the connection weight of each updated node may be determined according to a learning rate. The neural network's computation of input data and backpropagation of errors can constitute a learning cycle (epoch). The learning rate may be applied differently according to the number of iterations of the learning cycle of the neural network. For example, a high learning rate is used in the early stages of neural network learning to increase efficiency by allowing the neural network to quickly achieve a certain level of performance, and a low learning rate can be used in the late stage to increase accuracy.
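The supervised learning cycle with a learning rate that changes over the epochs can be sketched with a deliberately small model. This is an assumption-laden illustration: it uses a single sigmoid neuron with one input, so the backward pass reduces to one gradient step rather than propagation through several layers, and the labeled samples, learning rates, and function names are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, epochs=200, lr_early=1.0, lr_late=0.1):
    """samples: list of (feature, label) pairs with labels 0 or 1.

    Uses a high learning rate for the first half of the epochs and a low
    one afterwards, and returns the weights and the loss per epoch.
    """
    w, b = 0.0, 0.0
    losses = []
    n = len(samples)
    for epoch in range(epochs):
        lr = lr_early if epoch < epochs // 2 else lr_late
        grad_w = grad_b = loss = 0.0
        for x, y in samples:
            p = sigmoid(w * x + b)                   # forward pass
            loss -= y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
            grad_w += (p - y) * x                    # error propagated to the weight
            grad_b += (p - y)                        # error propagated to the bias
        w -= lr * grad_w / n                         # weight update scaled by lr
        b -= lr * grad_b / n
        losses.append(loss / n)
    return w, b, losses

# Hypothetical labeled data: a normalized feature and a category label.
labeled = [(-2.0, 0), (-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1), (2.0, 1)]
w, b, losses = train(labeled)
predictions = [1 if sigmoid(w * x + b) > 0.5 else 0 for x, _ in labeled]
```

A real classifier for file categories would use many input features and several layers, typically through a deep learning framework, but the cycle of forward computation, error calculation, and learning-rate-scaled weight update is the same.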

The learning method may vary depending on the characteristics of the data. For example, in a case where the purpose of the receiver is to accurately predict data transmitted by the transmitter in a communication system, it is preferable to perform learning using supervised learning rather than unsupervised learning or reinforcement learning.

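The learning model corresponds to the human brain. Although the most basic linear model may be considered as the learning model, the machine learning paradigm that uses a highly complex neural network structure, such as an artificial neural network, as the learning model is called deep learning.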

The neural network cores used as learning methods are largely divided into deep neural networks (DNN), convolutional deep neural networks (CNN), and recurrent neural networks (RNN).

An artificial neural network is an example of connecting several perceptrons.

The AI device 900 may include an electronic device including an AI module capable of performing AI processing or a server including the AI module. In addition, the AI device 900 may be included in at least a part of the intelligent data management and storage device 100 according to the embodiment to perform at least a part of AI processing together.

The AI device 900 may include an AI processor 901, a memory 905 and/or a communication unit 907.

The AI device 900 is a computing device capable of learning a neural network, and may be implemented in various electronic devices such as a server, a desktop PC, a notebook PC, and a tablet PC.

The AI processor 901 may train a neural network using a program stored in the memory 905. In particular, the AI processor 901 may train a neural network for recognizing image-related data. Here, the neural network for recognizing image-related data may be designed to simulate the structure of the human brain on a computer, and may include a plurality of network nodes having weights that simulate the neurons of the human neural network. The plurality of network nodes may each transmit and receive data according to their connection relationships, so as to simulate the synaptic activity of neurons that transmit and receive signals through synapses. Here, the neural network may include a deep learning model developed from a neural network model. In the deep learning model, a plurality of network nodes may exchange data according to a convolution connection relationship while being located in different layers. Examples of neural network models include various deep learning techniques such as deep neural networks (DNN), convolutional deep neural networks (CNN), recurrent neural networks (RNN), restricted Boltzmann machines (RBM), deep belief networks (DBN), and deep Q-networks, and these can be applied to fields such as computer vision, voice recognition, natural language processing, and voice/signal processing.

Meanwhile, the processor performing the functions described above may be a general-purpose processor (eg, CPU), or may be an AI-dedicated processor (eg, GPU) for artificial intelligence learning.

The memory 905 may store various programs and data necessary for the operation of the AI device 900. The memory 905 may be implemented as a non-volatile memory, a volatile memory, a flash memory, a hard disk drive (HDD), or a solid state drive (SSD). The memory 905 is accessed by the AI processor 901, and reading, writing, modifying, deleting, and updating of data by the AI processor 901 can be performed. In addition, the memory 905 may store a neural network model (eg, a deep learning model 906) generated through a learning algorithm for data classification/recognition according to an embodiment of the present invention.

Meanwhile, the AI processor 901 may include a data learning unit 902 that learns a neural network for data classification/recognition. The data learning unit 902 may learn criteria for which training data to use to determine data classification/recognition and how to classify and recognize data using the training data. The data learning unit 902 may acquire learning data to be used for learning and learn the deep learning model by applying the obtained learning data to the deep learning model.

The data learning unit 902 may be manufactured in the form of at least one hardware chip and mounted in the AI device 900. For example, the data learning unit 902 may be manufactured in the form of a dedicated hardware chip for artificial intelligence (AI), or may be manufactured as a part of a general-purpose processor (CPU) or a graphics-dedicated processor (GPU) and mounted in the AI device 900. Also, the data learning unit 902 may be implemented as a software module. When implemented as a software module (or a program module including instructions), the software module may be stored in a non-transitory computer-readable recording medium. In this case, at least one software module may be provided by an Operating System (OS) or an application.

The data learning unit 902 may include a learning data acquisition unit 903 and a model learning unit 904 .

The training data acquisition unit 903 may acquire training data necessary for a neural network model for classifying and recognizing data. For example, the learning data acquisition unit 903 may acquire data and/or sample data to be input to a neural network model as learning data.

The model learning unit 904 may train the neural network model, using the obtained training data, to have a criterion for determining how to classify predetermined data. At this time, the model learning unit 904 may train the neural network model through supervised learning using at least some of the training data as a criterion. Alternatively, the model learning unit 904 may train the neural network model through unsupervised learning, in which a criterion for determination is discovered by self-learning using training data without guidance. In addition, the model learning unit 904 may train the neural network model through reinforcement learning using feedback about whether the result of the situation judgment according to learning is correct. In addition, the model learning unit 904 may train the neural network model using a learning algorithm including error back-propagation or gradient descent.

When the neural network model is learned, the model learning unit 904 may store the learned neural network model in memory. The model learning unit 904 may store the learned neural network model in a memory of a server connected to the AI device 900 through a wired or wireless network.

The data learning unit 902 may further include a training data pre-processing unit (not shown) and a training data selection unit (not shown) to improve the analysis result of the recognition model or to save the resources or time required to generate the recognition model.

The learning data pre-processing unit may pre-process the acquired data so that the acquired data can be used for learning for situation determination. For example, the learning data pre-processing unit may process the acquired data into a preset format so that the model learning unit 904 can use the acquired learning data for learning for image recognition.

In addition, the learning data selector may select data necessary for learning from among the learning data acquired by the learning data acquisition unit 903 or the learning data preprocessed by the preprocessor. The selected learning data may be provided to the model learning unit 904 . For example, the learning data selection unit may select only data about an object included in a specific region as training data by detecting a specific region among acquired images.

In addition, the data learning unit 902 may further include a model evaluation unit (not shown) to improve the analysis result of the neural network model.

The model evaluation unit inputs evaluation data to the neural network model, and when the analysis result output for the evaluation data does not satisfy a predetermined criterion, it may cause the model learning unit 904 to perform relearning. In this case, the evaluation data may be predefined data for evaluating the recognition model. For example, the model evaluation unit may evaluate that the predetermined criterion is not satisfied when, among the analysis results of the learned recognition model for the evaluation data, the number or ratio of evaluation data for which the analysis result is inaccurate exceeds a preset threshold.
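The evaluation rule can be sketched as a simple threshold check on the ratio of inaccurate results. The threshold value, the function name `needs_retraining`, and the sample predictions are assumptions for illustration only.

```python
def needs_retraining(predictions, labels, max_error_ratio=0.2):
    """Return (retrain, error_ratio) for a set of evaluation data.

    retrain is True when the ratio of inaccurate results exceeds
    the preset threshold `max_error_ratio`.
    """
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    wrong = sum(1 for predicted, expected in zip(predictions, labels) if predicted != expected)
    error_ratio = wrong / len(labels)
    return error_ratio > max_error_ratio, error_ratio

expected = ["doc", "img", "doc", "log", "img", "doc", "log", "img", "doc", "log"]
good_run = ["doc", "img", "doc", "log", "img", "doc", "log", "img", "doc", "doc"]
bad_run = ["img", "img", "doc", "doc", "img", "doc", "log", "img", "doc", "doc"]

retrain_good, ratio_good = needs_retraining(good_run, expected)
retrain_bad, ratio_bad = needs_retraining(bad_run, expected)
```

When the check returns True, the model would be handed back to the training step with additional or refreshed training data.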

The communication unit 907 may transmit AI processing results by the AI processor 901 to an external electronic device.

Here, the external electronic device is a device capable of wired/wireless communication with the AI device, and may include a broadcasting device and a viewer device to be described later. In this case, the AI device may be implemented in the manager device.

Meanwhile, although the AI device 900 shown in FIG. 9 has been described as functionally divided into an AI processor 901, a memory 905, a communication unit 907, and the like, note that the above-described components may be integrated into one module and referred to as an AI module.

In the foregoing description, the steps or operations may be further divided into additional steps, processes, or operations, or combined into fewer steps, processes, or operations, depending on the implementation of the invention. In addition, some steps, processes, or operations may be omitted if necessary, or the order of steps, processes, or operations may be switched. In addition, each step or operation included in the intelligent data management and storage method using the above-described intelligent data management and storage device may be implemented as a computer program and stored in a computer-readable recording medium, and each step, process, or operation may be executed by a computer device.

The embodiments described above are those in which elements and features of the present invention are combined in a predetermined form. Each component or feature should be considered optional unless explicitly stated otherwise. Each component or feature may be implemented in a form not combined with other components or features. In addition, it is also possible to configure an embodiment of the present invention by combining some components and/or features. The order of operations described in the embodiments of the present invention may be changed. Some components or features of one embodiment may be included in another embodiment, or may be replaced with corresponding components or features of another embodiment. It is obvious that claims that do not have an explicit citation relationship in the claims can be combined to form an embodiment or can be included as new claims by amendment after filing.

An embodiment according to the present invention may be implemented by various means, for example, hardware, firmware, software, or a combination thereof. In the case of implementation by hardware, an embodiment of the present invention may be implemented by one or more ASICs (Application Specific Integrated Circuits), DSPs (Digital Signal Processors), DSPDs (Digital Signal Processing Devices), PLDs (Programmable Logic Devices), FPGAs (Field Programmable Gate Arrays), processors, controllers, microcontrollers, microprocessors, and the like.

In the case of implementation by firmware or software, an embodiment of the present invention may be implemented in the form of a module, procedure, or function that performs the functions or operations described above. The software code can be stored in memory and run by a processor. The memory may be located inside or outside the processor and exchange data with the processor by various means known in the art.

As used herein, the term "unit" (eg, a controller) may refer to a unit including one of, or a combination of two or more of, hardware, software, and firmware. "Unit" may be used interchangeably with terms such as, for example, unit, logic, logical block, component, or circuit. A "unit" may be a minimum unit of an integrally constituted part or a part thereof. A "unit" may be a minimum unit that performs one or more functions, or a part thereof. A "unit" may be implemented mechanically or electronically. For example, a "unit" may include at least one of an application-specific integrated circuit (ASIC) chip, field-programmable gate arrays (FPGAs), or a programmable-logic device, known or to be developed in the future, that performs certain operations.

At least some of the devices (eg, modules or functions thereof) or methods (eg, operations) according to various embodiments may be implemented as instructions stored in computer-readable storage media in the form of, for example, program modules. When the instructions are executed by one or more processors, the one or more processors may perform the functions corresponding to the instructions. The computer-readable storage medium may be, for example, a memory.

As used herein, the term "a" or "an" is defined as one or more than one. Also, the use of introductory phrases such as "at least one" and "one or more" in a claim should not be construed to mean that the introduction of another claim element by the indefinite article "a" or "an" limits any particular claim containing the so-introduced claim element to an invention containing only one such element, even when the same claim includes the introductory phrases "at least one" or "one or more" and indefinite articles such as "a" or "an".

Further, those skilled in the art will recognize that the boundaries between the functionality of the foregoing operations are exemplary only. A plurality of actions may be combined into a single action, a single action may be distributed into additional actions, and the actions may be executed at least partially overlapping in time. Also, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be changed in various other embodiments. However, other modifications, variations and alternatives are also possible. Accordingly, the detailed description and drawings are to be regarded in an illustrative rather than a limiting sense.

The phrase "may be X" indicates that condition X can be met. This phrase also indicates that condition X may not be satisfied. For example, a reference to a system that contains a specific component should also cover a scenario in which the system does not contain that specific component. Likewise, a reference to a method that includes a specific operation should also cover a scenario in which the method does not include that specific operation. As another example, a reference to a system configured to perform a specific operation should also cover a scenario in which the system is not configured to perform that specific operation.

The terms "comprising", "having", "consisting of", and "consisting essentially of" are used interchangeably. For example, any method may include at least the operations included in the drawings and/or specification, or may include only the operations included in the drawings and/or specification. In addition, the word "comprising" does not exclude the presence of elements or operations other than those recited in a claim.

It is apparent to those skilled in the art that the present invention can be embodied in other specific forms without departing from the essential characteristics of the present invention. Accordingly, the foregoing detailed description should not be construed as limiting in all respects and should be considered illustrative. The scope of the present invention should be determined by reasonable interpretation of the appended claims, and all changes within the equivalent scope of the present invention are included in the scope of the present invention.

Claims

1. An intelligent data management and storage method, comprising:

collecting file information about a management target file;

analyzing and classifying the management target file by category based on the collected file information;

optimizing the classified file according to a first preset criterion; and

selecting a data store according to a second preset criterion and storing the optimized file in the selected data store.

2. The method of claim 1, wherein

the classifying of the management target file by category based on the collected file information comprises classifying the management target file by usage time period based on usage time information of the collected files, and

the optimizing of the classified file according to the first preset criterion comprises compressing files of a specific time period among the classified files.

3. The method of claim 1, wherein

the classifying of the management target file by category based on the collected file information comprises classifying the management target file by file type based on file type information of the collected files, and

the optimizing of the classified file according to the first preset criterion comprises compressing the classified files with a different compression method for each file type.

4. The method of claim 1, wherein the selecting of the data store according to the second preset criterion and storing of the optimized file comprises:

analyzing a usage cost of each of a plurality of data stores; and

storing the optimized file in the data store having the lowest usage cost among the plurality of data stores.

5. The method of claim 4, wherein the usage cost is determined by at least one of the type of storage medium of the data store, the installation location of the data store, the owner of the data store, and the purpose of the data store.

6. The method of claim 5, wherein

the usage cost is determined by a 1 TB storage cost index (T),

the 1 TB storage cost index (T) is calculated by dividing the price (S) of a data storage provider's 1 TB storage space provision service by the storage cost (C) per 1 TB, and

the storage cost (C) per 1 TB is calculated by dividing the cost of the total used TB by the total number of TB.

7. The method of claim 1, wherein

the category is divided into classifications including file size, file characteristics, and file usage frequency, and

the classifying of the management target file by category based on the collected file information comprises classifying the management target file by file size range, by file characteristic, or by file usage frequency range based on the collected file information.

8. The method of claim 1, further comprising virtualizing the management target file to provide data information on the management target file.

9. A computer program stored in a computer-readable recording medium and combined with computer hardware to execute each step included in the intelligent data management and storage method according to any one of claims 1 to 8.

10. An intelligent data management and storage device, comprising:

a memory that stores one or more instructions; and

a processor that executes the instructions,

wherein, when the instructions are executed, the processor:

collects file information about a management target file;

classifies the management target file by category based on the collected file information;

optimizes the classified file according to a first preset criterion; and

selects a data store according to a second preset criterion and stores the optimized file in the selected data store.