WO2022134345A1 - File access method, apparatus, device, and readable storage medium - Google Patents

File access method, apparatus, device, and readable storage medium

Info

Publication number
WO2022134345A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
information
association
access
association relationship
Prior art date
Application number
PCT/CN2021/082867
Other languages
French (fr)
Chinese (zh)
Inventor
兰东平
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2022134345A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Definitions

  • the present application relates to the field of distributed storage, and in particular, to a file access method, apparatus, electronic device, and readable storage medium.
  • a file access method provided by this application includes:
  • the present application also provides a file access device, the device comprising:
  • Information acquisition module used to acquire file information
  • the association calculation module is used to extract the file access information from the file information, perform association calculation using the file access information to obtain a file association value set, and classify the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set;
  • the file access module is configured to respond to the file access request, determine the file to be accessed according to the file access request, and use the file association relationship set to access the to-be-accessed file.
  • the present application also provides an electronic device, the electronic device comprising:
  • the processor executes the computer program stored in the memory to realize the following steps:
  • the present application also provides a computer-readable storage medium, where at least one computer program is stored in the computer-readable storage medium, and the at least one computer program is executed by a processor in an electronic device to implement the following steps:
  • FIG. 1 is a schematic flowchart of a file access method provided by an embodiment of the present application.
  • FIG. 2 is a schematic block diagram of a file access device provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of the internal structure of an electronic device implementing a file access method provided by an embodiment of the present application
  • the embodiment of the present application provides a file access method.
  • the execution subject of the file access method includes, but is not limited to, at least one of electronic devices that can be configured to execute the method provided by the embodiments of the present application, such as a server and a terminal.
  • the file access method can be executed by software or hardware installed in a terminal device or a server device, and the software can be a blockchain platform.
  • the server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like.
  • the file access method includes:
  • the file information is information about a user's uploaded files, where the uploaded files are the user files that a user has uploaded and stored, and the user files comprise one or more files.
  • to reduce storage cost, uploaded user files need to be stored hierarchically. Depending on its data storage capacity, each company places uploaded user files in different storage pools according to upload time. For example, taking the current time as the reference, user files uploaded within the last 30 days are stored in the high-speed storage pool, which offers fast access at a high storage cost, while user files uploaded more than 30 days ago are transferred to the low-speed storage pool.
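The time-based tiering rule described above can be sketched in a few lines. This is a minimal illustration, not the applicant's implementation: the function name `select_pool`, the pool labels, and the treatment of a file that is exactly 30 days old as still "within 30 days" are assumptions.

```python
from datetime import date

# Example window from the text: files newer than this stay in the fast pool.
HIGH_SPEED_WINDOW_DAYS = 30


def select_pool(upload_date: date, today: date,
                window_days: int = HIGH_SPEED_WINDOW_DAYS) -> str:
    """Pick a storage pool from the upload age alone (hypothetical helper)."""
    age_days = (today - upload_date).days
    # Assumption: an age of exactly `window_days` still counts as "within".
    return "high_speed" if age_days <= window_days else "low_speed"
```

A rule like this ignores how files are actually used, which is the gap the association-based acceleration in the rest of the description is meant to close.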
  • the information of the user file includes: file information, file access information, file storage address, and file upload time.
  • the file information is the file names of all files in the uploaded user file
  • the file access information is the daily access times of all files in the uploaded user file
  • the file storage address is the storage path of each file in the uploaded user files.
  • the file upload time is the upload time of each file in the uploaded user files. If file A was uploaded on December 4, 2020, the upload time is December 4, 2020.
  • for example, for uploaded files A and B, the file information is file A and file B
  • the file access information is the number of times that file A and file B are accessed every day
  • the file storage address is the storage path of file A and file B
  • the upload time is December 4, 2020, the date on which files A and B were uploaded.
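The four pieces of information listed above map naturally onto a small record type. The sketch below is only illustrative: the class name `FileRecord`, its field names, the storage paths, and the access counts are all made up for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class FileRecord:
    """One entry of the file information (field names are hypothetical)."""
    name: str                       # file information: the file name
    daily_access_counts: List[int]  # file access information: one count per day
    storage_path: str               # file storage address
    upload_date: date               # file upload time

    def days_uploaded(self, today: date) -> int:
        """Total upload days n: current time minus upload time, in days."""
        return (today - self.upload_date).days


# Files A and B from the example, with invented paths and access counts.
records = [
    FileRecord("A", [3, 5, 2], "/pool/low/A", date(2020, 12, 4)),
    FileRecord("B", [4, 6, 1], "/pool/low/B", date(2020, 12, 4)),
]
```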
  • the file access information in the file information is extracted, and association calculation is performed using the file access information to obtain a file association value set for user file access, where the file access information is the daily access count of each file in the uploaded user files. For example, if user A uploads file A and file B, the file access information of user A is the number of times file A and file B are accessed each day.
  • using the file access information to perform file association calculation to obtain a file association value set includes: using the file access information to calculate the sample standard deviation of each file and the sample covariance of every two files in the file access information; calculating the association value of any two files in the file access information from the sample standard deviation and the sample covariance; and aggregating all association values to obtain the file association value set.
  • sample standard deviation can be calculated by the following formula:
  • S_x is the sample standard deviation of file x
  • x_i is the number of times file x is accessed on day i in the file access information
  • n is the total number of days since file x was uploaded
  • i is the date (day index)
  • x is a file of the user in the file access information.
  • S_y is the sample standard deviation of file y
  • y_i is the number of times file y is accessed on day i in the file access information
  • n is the total number of days since file y was uploaded
  • i is the date (day index)
  • y is a file of the user in the file access information.
  • file x and file y are both user files with the same upload time, so their total number of upload days is the same and is denoted by n in both cases, where the total number of upload days is the difference between the current time and the upload time.
  • sample covariance can be calculated by the following formula:
  • S_xy is the sample covariance of file x and file y.
  • R_xy is the association value of file x and file y.
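For reference, the textbook definitions that match the symbols above are given below. The sample standard deviation and sample covariance are standard; taking R_xy to be the Pearson correlation coefficient is an assumption, based on the statement that the association value is calculated from the sample standard deviation and the sample covariance.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i

S_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}, \qquad
S_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}

S_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)

R_{xy} = \frac{S_{xy}}{S_x \, S_y}
```

Under this reading, R_xy lies between -1 and 1, which is consistent with the example values 0.9 and 0.5 and the threshold 0.8 used in the description.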
  • it is determined whether each association value in the file association value set is greater than a preset threshold. When the association value is greater than the preset threshold, the association relationship of the corresponding files in the file access information is classified as a strong association; when the association value is less than or equal to the preset threshold, it is classified as a weak association. A strong association between two files means that once one of them is accessed, the other will be accessed with high probability; a weak association means that once one of them is accessed, the other will be accessed with low probability. Further, all classified association relationships are aggregated to obtain the file association relationship set.
  • for example, if the file association value set contains an association value of 0.9 for file A and file B and an association value of 0.5 for file A and file C, and the preset threshold is 0.8, then the association between file A and file B is classified as a strong association and the association between file A and file C is classified as a weak association.
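The association calculation and the threshold classification can be sketched together as follows. This is a minimal sketch under stated assumptions: the association value is taken to be the Pearson correlation of the daily access counts, a file whose counts never vary is given an association value of 0.0, and all function names and the sample data are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def sample_std(xs: List[float]) -> float:
    """Sample standard deviation with the n - 1 denominator."""
    mean = sum(xs) / len(xs)
    return sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))


def sample_cov(xs: List[float], ys: List[float]) -> float:
    """Sample covariance of two equally long daily-count series."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def association_value(xs: List[float], ys: List[float]) -> float:
    """Assumed form of R_xy: covariance divided by both standard deviations."""
    sx, sy = sample_std(xs), sample_std(ys)
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a constant series is treated as unrelated
    return sample_cov(xs, ys) / (sx * sy)


def classify(access: Dict[str, List[float]],
             threshold: float = 0.8) -> Dict[Tuple[str, str], str]:
    """Label every pair of files as 'strong' or 'weak'."""
    relations = {}
    for a, b in combinations(sorted(access), 2):
        r = association_value(access[a], access[b])
        relations[(a, b)] = "strong" if r > threshold else "weak"
    return relations


# Invented daily access counts for three files over five days.
relations = classify({
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 11],
    "C": [5, 1, 4, 2, 3],
})
```

With this data, A and B rise together and end up strongly associated, while C does not track either of them. The comparison is strict, so a value equal to the threshold is classified as weak, as the description specifies.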
  • the file access request is a request to access a certain file of the user corresponding to the file information. For example, if the file information is the file information of user A, then the file access request is a request to access one of user A's files. Further, the file to be accessed is determined according to the file access request. For example, if the file access request is a request to access file A of user A, then file A is determined to be the file to be accessed.
  • the storage information corresponding to the file to be accessed in the file information is extracted to obtain access information, and the file to be accessed is read and accessed according to the access information, and the file information is updated.
  • as can be seen from S3 in this embodiment of the present application, after the file to be accessed is accessed, files that have a strong association with it have a high probability of being accessed as well. If these files always remain in the low-speed storage area, access to them is relatively slow.
  • therefore, the embodiment of the present application performs access acceleration according to the file association relationship set, so that when the file to be accessed is accessed, access to its strongly associated files is also accelerated.
  • performing access acceleration according to the file association relationship set includes: selecting from the file association relationship set the association relationships that involve the file to be accessed, to obtain an initial association relationship set; selecting the strong associations from the initial association relationship set, to obtain a target association relationship set; selecting all files that are paired with the file to be accessed in the target association relationship set, to obtain a target file set; extracting from the file information the storage information of each file in the target file set, to obtain a target storage information set; and performing access acceleration according to the target storage information set.
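The selection steps above amount to two filters followed by a lookup. The sketch below assumes the association relationship set is a mapping from file pairs to the labels "strong" or "weak", and that storage information is a mapping from file name to path; both representations and all names are hypothetical.

```python
from typing import Dict, List, Tuple

Relations = Dict[Tuple[str, str], str]


def strongly_associated_files(relations: Relations, target: str) -> List[str]:
    """Return the target file set for the file to be accessed."""
    # Initial association relationship set: pairs that involve the target.
    initial = {pair: kind for pair, kind in relations.items() if target in pair}
    # Target association relationship set: only the strong pairs.
    strong_pairs = [pair for pair, kind in initial.items() if kind == "strong"]
    # Target file set: the other member of each strong pair.
    return sorted(b if a == target else a for a, b in strong_pairs)


def target_storage_info(storage: Dict[str, str],
                        files: List[str]) -> Dict[str, str]:
    """Target storage information set: storage entry of each target file."""
    return {name: storage[name] for name in files}


example_relations = {("A", "B"): "strong", ("A", "C"): "weak", ("B", "C"): "strong"}
example_storage = {"A": "/pool/high/A", "B": "/pool/low/B", "C": "/pool/low/C"}
targets = strongly_associated_files(example_relations, "A")
```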
  • performing access acceleration according to the target storage information set includes: transferring the files in the target storage information set that are stored in a preset low-speed storage pool to a preset high-speed storage pool, and updating the corresponding storage information in the file information. Further, a storage date is constructed: optionally, in the embodiment of the present application, a preset number of storage days is added to the response date of the file access request to obtain the storage date. When a file transferred to the preset high-speed storage pool is accessed before the storage date is reached, the storage date is updated, that is, the storage date is increased by a preset number of days to give a new storage date.
  • optionally, the preset number of days is 3. For example, if the storage date is December 4, 2020 and the preset number of days is 3, the updated storage date is December 7, 2020.
  • when the storage date is reached, the files transferred to the preset high-speed storage pool are transferred back to the low-speed storage pool, and the corresponding storage information in the file information is updated, so as to accelerate file access.
  • for example, if the strongly associated files of file A, the file to be accessed, are file B and file C, then file B and file C are transferred to the high-speed storage pool when file A is accessed, so that subsequent access to file B and file C is very fast. If file B and file C are not accessed for a period of time, they are transferred back to the low-speed storage pool to reduce file storage costs.
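The promotion, extension, and expiry of a strongly associated file can be sketched as a small state machine. This is an illustration only: the type `StoredFile`, the function names, and the boundary handling (an access on the storage date itself does not extend it, and expiry happens on that date) are assumptions not fixed by the description.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

RETENTION_DAYS = 3  # the optional preset number of days from the text


@dataclass
class StoredFile:
    name: str
    pool: str = "low_speed"
    storage_date: Optional[date] = None  # date on which the file is demoted


def promote(f: StoredFile, request_date: date, days: int = RETENTION_DAYS) -> None:
    """Move a strongly associated file to the fast pool and set its storage date."""
    f.pool = "high_speed"
    f.storage_date = request_date + timedelta(days=days)


def on_access(f: StoredFile, access_date: date, days: int = RETENTION_DAYS) -> None:
    """Extend the storage date when the file is accessed before it is reached."""
    if f.pool == "high_speed" and f.storage_date and access_date < f.storage_date:
        f.storage_date += timedelta(days=days)


def expire(f: StoredFile, today: date) -> None:
    """Move the file back to the slow pool once the storage date is reached."""
    if f.pool == "high_speed" and f.storage_date and today >= f.storage_date:
        f.pool = "low_speed"
        f.storage_date = None
```

For instance, promoting file B on December 1, 2020 gives a storage date of December 4, 2020, and an access on December 3 moves it to December 7, 2020, matching the example in the description.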
  • the file to be accessed may be stored in a blockchain node.
  • most files can be stored in low-speed storage through access acceleration, while the actual retrieval performance is not affected, and the storage cost is greatly reduced.
  • FIG. 2 is a functional block diagram of the file access device of the present application.
  • the file access apparatus 100 described in this application may be installed in an electronic device.
  • the file access device may include an information acquisition module 101, an association calculation module 102, and a file access module 103.
  • the modules described in the present invention may also be called units, which refer to a type that can be accessed by an electronic device processor.
  • each module/unit is as follows:
  • the information acquisition module 101 is used for acquiring file information.
  • the file information is information about a user's uploaded files, where the uploaded files are the user files that a user has uploaded and stored, and the user files comprise one or more files.
  • to reduce storage cost, uploaded user files need to be stored hierarchically. Depending on its data storage capacity, each company places uploaded user files in different storage pools according to upload time. For example, taking the current time as the reference, user files uploaded within the last 30 days are stored in the high-speed storage pool, which offers fast access at a high storage cost, while user files uploaded more than 30 days ago are transferred to the low-speed storage pool.
  • the information of the user file includes: file information, file access information, file storage address, and file upload time.
  • the file information is the file names of all files in the uploaded user file
  • the file access information is the daily access times of all files in the uploaded user file
  • the file storage address is the storage path of each file in the uploaded user files.
  • the file upload time is the upload time of each file in the uploaded user files. If file A was uploaded on December 4, 2020, the upload time is December 4, 2020.
  • for example, for uploaded files A and B, the file information is file A and file B
  • the file access information is the number of times that file A and file B are accessed every day
  • the file storage address is the storage path of file A and file B
  • the upload time is December 4, 2020, the date on which files A and B were uploaded.
  • the association calculation module 102 is configured to extract the file access information from the file information, perform association calculation using the file access information to obtain a file association value set, and classify the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set.
  • the association calculation module 102 uses the file access information to perform access association calculation and obtain the file association value set for user file access, where the file access information is the daily access count of each file in the uploaded user files. For example, if user A uploads file A and file B, the file access information of user A is the number of times file A and file B are accessed each day.
  • the association calculation module 102 performs file association calculation to obtain the file association value set as follows: using the file access information to calculate the sample standard deviation of each file and the sample covariance of every two files in the file access information; calculating the association value of any two files in the file access information from the sample standard deviation and the sample covariance; and aggregating all association values to obtain the file association value set.
  • sample standard deviation can be calculated by the following formula:
  • S_x is the sample standard deviation of file x
  • x_i is the number of times file x is accessed on day i in the file access information
  • n is the total number of days since file x was uploaded
  • i is the date (day index)
  • x is a file of the user in the file access information.
  • S_y is the sample standard deviation of file y
  • y_i is the number of times file y is accessed on day i in the file access information
  • n is the total number of days since file y was uploaded
  • i is the date (day index)
  • y is a file of the user in the file access information.
  • file x and file y are both user files with the same upload time, so their total number of upload days is the same and is denoted by n in both cases, where the total number of upload days is the difference between the current time and the upload time.
  • sample covariance can be calculated by the following formula:
  • S_xy is the sample covariance of file x and file y.
  • R_xy is the association value of file x and file y.
  • the association calculation module 102 determines whether each association value in the file association value set is greater than a preset threshold. When the association value is greater than the preset threshold, the association relationship of the corresponding files in the file access information is classified as a strong association; when the association value is less than or equal to the preset threshold, it is classified as a weak association. A strong association between two files means that once one of them is accessed, the other will be accessed with high probability; a weak association means that once one of them is accessed, the other will be accessed with low probability.
  • for example, if the file association value set contains an association value of 0.9 for file A and file B and an association value of 0.5 for file A and file C, and the preset threshold is 0.8, then the association between file A and file B is classified as a strong association and the association between file A and file C is classified as a weak association.
  • the file access module 103 is configured to respond to the file access request, determine the file to be accessed according to the file access request, and use the file association relationship set to access the to-be-accessed file.
  • the file access request is a request to access a certain file of the user corresponding to the file information. For example, if the file information is the file information of user A, then the file access request is a request to access one of user A's files. Further, the file access module 103 determines the file to be accessed according to the file access request. For example, if the file access request is a request to access file A of user A, then file A is determined to be the file to be accessed.
  • the file access module 103 extracts the storage information corresponding to the file to be accessed from the file information to obtain access information, reads the file to be accessed according to the access information, and updates the file access information in the file information. Further, in this embodiment of the present application, after the file to be accessed is accessed, files that have a strong association with it have a high probability of being accessed as well. If these files always remain in the low-speed storage area, access to them is slow.
  • therefore, the embodiment of the present application performs access acceleration according to the file association relationship set, so that when the file to be accessed is accessed, access to its strongly associated files is also accelerated.
  • the file access module 103 accelerates access as follows: selecting from the file association relationship set the association relationships that involve the file to be accessed, to obtain an initial association relationship set; selecting the strong associations from the initial association relationship set, to obtain a target association relationship set; selecting all files that are paired with the file to be accessed in the target association relationship set, to obtain a target file set; extracting from the file information the storage information of each file in the target file set, to obtain a target storage information set; and performing access acceleration according to the target storage information set.
  • the file access module 103 accelerates access according to the target storage information set as follows: transferring the files in the target storage information set that are stored in a preset low-speed storage pool to a preset high-speed storage pool, updating the corresponding storage information in the file information, and, further, constructing a storage date.
  • optionally, the embodiment of the present application adds a preset number of storage days to the response date of the file access request to obtain the storage date. When a file transferred to the preset high-speed storage pool is accessed before the storage date is reached, the storage date is updated, that is, the storage date is increased by a preset number of days to give a new storage date. Optionally, the preset number of days is 3; for example, if the storage date is December 4, 2020 and the preset number of days is 3, the updated storage date is December 7, 2020. When the storage date is reached, the files transferred to the preset high-speed storage pool are transferred back to the low-speed storage pool, and the corresponding storage information in the file information is updated, so as to accelerate file access.
  • for example, if the strongly associated files of file A, the file to be accessed, are file B and file C, then file B and file C are transferred to the high-speed storage pool when file A is accessed, so that subsequent access to file B and file C is very fast. If file B and file C are not accessed for a period of time, they are transferred back to the low-speed storage pool to reduce file storage costs.
  • the file to be accessed may be stored in a blockchain node.
  • most files can be stored in low-speed storage through access acceleration, while the actual retrieval performance is not affected, and the storage cost is greatly reduced.
  • FIG. 3 is a schematic structural diagram of an electronic device implementing the file access method of the present application.
  • the electronic device 1 may include a processor 10, a memory 11 and a bus, and may also include a computer program stored in the memory 11 and executable on the processor 10, such as a file access program 12.
  • the memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, mobile hard disk, multimedia card, card-type memory (for example: SD or DX memory, etc.), magnetic memory, magnetic disk, CD etc.
  • the memory 11 may be an internal storage unit of the electronic device 1 in some embodiments, such as a mobile hard disk of the electronic device 1.
  • the memory 11 may also be an external storage device of the electronic device 1, such as a pluggable mobile hard disk, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash memory card (Flash Card) equipped on the electronic device 1.
  • the memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device.
  • the memory 11 can not only be used to store application software installed in the electronic device 1 and various types of data, such as code of a file access program, etc., but also can be used to temporarily store data that has been output or will be output.
  • the processor 10 may be composed of integrated circuits, for example, a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, and combinations of various control chips.
  • the processor 10 is the control core (Control Unit) of the electronic device. It uses various interfaces and lines to connect the components of the entire electronic device, and executes the functions of the electronic device 1 and processes data by running or executing the programs or modules stored in the memory 11 (such as the file access program) and calling the data stored in the memory 11.
  • the bus may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the bus is configured to implement connection communication between the memory 11 and at least one processor 10 and the like.
  • FIG. 3 only shows an electronic device with certain components. Those skilled in the art can understand that the structure shown in FIG. 3 does not constitute a limitation on the electronic device 1, which may include fewer or more components than those shown, a combination of certain components, or a different arrangement of components.
  • the electronic device 1 may also include a power supply (such as a battery) for powering the various components. Preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device.
  • the power source may also include one or more DC or AC power sources, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and any other components.
  • the electronic device 1 may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.
  • the electronic device 1 may also include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), which is usually used to establish a communication connection between the electronic device 1 and other electronic devices.
  • the electronic device 1 may further include a user interface, and the user interface may be a display (Display), an input unit (eg, a keyboard (Keyboard)), optionally, the user interface may also be a standard wired interface or a wireless interface.
  • the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode, organic light-emitting diode) touch device, and the like.
  • the display may also be appropriately called a display screen or a display unit, which is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.
  • the file access program 12 stored in the memory 11 in the electronic device 1 is a combination of multiple computer programs, and when running in the processor 10, it can realize:
  • the integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium.
  • the computer-readable medium may be non-volatile or volatile.
  • the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, computer memory, and read-only memory (ROM, Read-Only Memory).
  • Embodiments of the present application may further provide a computer-readable storage medium, where the readable storage medium stores a computer program, and when the computer program is executed by a processor of an electronic device, the computer program may implement:
  • the computer usable storage medium may mainly include a stored program area and a stored data area, wherein the stored program area may store an operating system, an application program required for at least one function, and the like; using the created data, etc.
  • modules described as separate components may or may not be physically separated, and components shown as modules may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional module in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of hardware plus software function modules.
  • the blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • Blockchain, essentially a decentralized database, is a series of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of that information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Provided are a file access method, and a corresponding access apparatus, electronic device, and readable storage medium, said access method comprising: extracting file access information from file information, and using said file access information to perform association calculation to obtain a set of file association values (S2); delineating a correlation of corresponding files in said file access information according to said set of file association values to obtain a set of file correlations (S3); in response to a file access request, determining a file to be accessed according to said file access request, and using the set of file correlations to access said file to be accessed (S4). In the described manner, the efficiency of file access can be improved.

Description

File access method, apparatus, device, and readable storage medium
This application claims priority to Chinese patent application No. CN202011562951.X, entitled "File Access Method, Apparatus, Device and Readable Storage Medium" and filed with the China Patent Office on December 25, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of distributed storage, and in particular to a file access method, apparatus, electronic device, and readable storage medium.
Background
Today, with the spread of the Internet and electronic information and the boom in the big data and AI industries, the volume of unstructured data held by enterprises and organizations has soared. The inventor realized that, to reduce cost, massive unstructured data is often stored in tiers: new data is stored in a high-speed storage pool, which is fast to access but expensive, while older historical data is stored in a low-speed storage pool, which is slow to access but cheap. Access to new data is therefore fast, but access to historical data is very slow, and overall file access efficiency is low.
Summary of the Invention
A file access method provided by the present application includes:
obtaining file information;
extracting file access information from the file information, and performing association calculation using the file access information to obtain a file association value set;
dividing the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set;
in response to a file access request, determining a file to be accessed according to the file access request, and accessing the file to be accessed using the file association relationship set.
The present application also provides a file access apparatus, the apparatus including:
an information acquisition module, configured to obtain file information;
an association calculation module, configured to extract file access information from the file information, perform association calculation using the file access information to obtain a file association value set, and divide the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set;
a file access module, configured to respond to a file access request, determine a file to be accessed according to the file access request, and access the file to be accessed using the file association relationship set.
The present application also provides an electronic device, the electronic device including:
a memory storing at least one computer program; and
a processor that executes the computer program stored in the memory to implement the following steps:
obtaining file information;
extracting file access information from the file information, and performing association calculation using the file access information to obtain a file association value set;
dividing the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set;
in response to a file access request, determining a file to be accessed according to the file access request, and accessing the file to be accessed using the file association relationship set.
The present application also provides a computer-readable storage medium storing at least one computer program, the at least one computer program being executed by a processor of an electronic device to implement the following steps:
obtaining file information;
extracting file access information from the file information, and performing association calculation using the file access information to obtain a file association value set;
dividing the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set;
in response to a file access request, determining a file to be accessed according to the file access request, and accessing the file to be accessed using the file association relationship set.
Description of the Drawings
FIG. 1 is a schematic flowchart of a file access method provided by an embodiment of the present application;
FIG. 2 is a schematic module diagram of a file access apparatus provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of the internal structure of an electronic device implementing the file access method provided by an embodiment of the present application.
The realization of the purpose, the functional characteristics, and the advantages of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are only intended to explain the present application, not to limit it.
An embodiment of the present application provides a file access method. The execution subject of the file access method includes, but is not limited to, at least one of the electronic devices, such as a server or a terminal, that can be configured to execute the method provided by the embodiments of the present application. In other words, the file access method may be executed by software or hardware installed on a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to, a single server, a server cluster, a cloud server, or a cloud server cluster.
Referring to the schematic flowchart of the file access method shown in FIG. 1, in an embodiment of the present application the file access method includes:
S1. Obtain file information;
In this embodiment, the file information is the information of a user's uploaded files, where the uploaded files are the user files that the user has uploaded for storage, and the user files comprise one or more files. To reduce storage cost, the uploaded user files are stored in tiers: each company, according to its own data storage capacity, places the uploaded user files in different storage pools according to their upload time. For example, taking the current time as the reference, user files uploaded within the last 30 days are stored in a high-speed storage pool, which is fast to access but expensive, while user files uploaded more than 30 days ago are moved to a low-speed storage pool, which is slow to access but cheap. Further, the information of the user files includes: file information, file access information, file storage address, and file upload time. Here the file information is the file names of all the uploaded user files, the file access information is the number of times each uploaded user file is accessed per day, the file storage address is the storage path of each uploaded user file, and the file upload time is the upload time of each uploaded user file; for example, if file A was uploaded on December 4, 2020, its upload time is December 4, 2020. Specifically, in the uploaded-file information of user A, the file information is file A and file B, the file access information is the number of times file A and file B are accessed each day, the file storage address is the storage paths of file A and file B, and the file upload time is the time at which file A and file B were uploaded, for example December 4, 2020.
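The per-file record and the 30-day tiering rule described above can be sketched in Python as follows. The class and field names (`FileRecord`, `daily_access_counts`, and so on), the pool labels, and the choice to treat a file that is exactly 30 days old as still inside the window are illustrative assumptions, not part of the application.

```python
from dataclasses import dataclass, field
from datetime import date

HOT_WINDOW_DAYS = 30  # window taken from the example in the text


@dataclass
class FileRecord:
    """One uploaded user file; all field names here are hypothetical."""
    name: str                       # file name ("file information")
    storage_path: str               # file storage address
    upload_date: date               # file upload time
    daily_access_counts: list = field(default_factory=list)  # file access information


def initial_pool(record: FileRecord, today: date) -> str:
    """Return 'high_speed' for files uploaded within the window, else 'low_speed'."""
    age_days = (today - record.upload_date).days
    return "high_speed" if age_days <= HOT_WINDOW_DAYS else "low_speed"


rec = FileRecord("A", "/pool/a", date(2020, 12, 4), [3, 0, 5])
print(initial_pool(rec, date(2020, 12, 20)))  # file is 16 days old
print(initial_pool(rec, date(2021, 2, 1)))    # file is 59 days old
```

Keeping the daily access counts on the record means the later association calculation can read them directly from the same structure.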
S2. Extract the file access information from the file information, and perform association calculation using the file access information to obtain a file association value set;
In this embodiment, to analyze the pairwise association between all of the uploaded user files, the file access information is extracted from the file information and used for access association calculation, yielding the file association value set for the user's file accesses. The file access information is the number of times each uploaded user file is accessed per day; for example, if user A has uploaded file A and file B, user A's file access information is the number of times file A and file B are each accessed per day.
In detail, performing the file association calculation using the file access information to obtain the file association value set includes: using the file access information to calculate the sample standard deviation of each file and the sample covariance of every two files in the file access information; calculating the association value of any two files in the file access information from the sample standard deviations and the sample covariance; and collecting all the association values to obtain the file association value set.
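As a rough illustration of this step, the sketch below computes an association value for every pair of files from their daily access counts. It assumes the association value is the sample covariance divided by the product of the two sample standard deviations (the Pearson correlation coefficient), with the usual n - 1 denominator. The function names, the dictionary layout, and the decision to skip pairs in which a file has zero variance are assumptions made for the example. Each file needs at least two days of counts.

```python
import math
from itertools import combinations


def sample_std(xs):
    """Sample standard deviation with an n - 1 denominator (assumed convention)."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))


def sample_cov(xs, ys):
    """Sample covariance of two equally long series, n - 1 denominator."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def association_values(access_counts):
    """Map each unordered file pair to its association value.

    access_counts: dict of file name -> list of daily access counts,
    all lists covering the same n days.
    """
    values = {}
    for a, b in combinations(sorted(access_counts), 2):
        sa, sb = sample_std(access_counts[a]), sample_std(access_counts[b])
        if sa == 0 or sb == 0:
            continue  # ratio undefined for a constant series; skipped by assumption
        values[(a, b)] = sample_cov(access_counts[a], access_counts[b]) / (sa * sb)
    return values


counts = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]}
values = association_values(counts)
for pair, r in values.items():
    print(pair, round(r, 3))
```

Files A and B rise together in this sample data, so their value is close to 1, while C falls as A rises, giving a value close to -1.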
The sample standard deviation may be calculated by the following formulas:
Figure PCTCN2021082867-appb-000001
where S_x is the sample standard deviation of file x, x_i is the number of accesses to user file x on day i in the file access information, the symbol shown in Figure PCTCN2021082867-appb-000002 is the average number of daily accesses to file x, n is the total number of days since file x was uploaded, i is the date, and x is a file of the user in the file access information.
Figure PCTCN2021082867-appb-000003
where S_y is the sample standard deviation of file y, y_i is the number of accesses to user file y on day i in the file access information, the symbol shown in Figure PCTCN2021082867-appb-000004 is the average number of daily accesses to file y, n is the total number of days since file y was uploaded, i is the date, and y is a file of the user in the file access information.
Since file x and file y are both files among the user files and have the same upload time, their total number of days since upload is the same and is denoted by n, where that total is the difference between the current time and the upload time.
Further, the sample covariance may be calculated by the following formula:
Figure PCTCN2021082867-appb-000005
where S_xy is the sample covariance of file x and file y.
Further, in this embodiment the association value of any two files may be calculated by the following formula:
Figure PCTCN2021082867-appb-000006
where R_xy is the association value of file x and file y.
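For reference, under the standard textbook definitions of sample standard deviation and sample covariance, and assuming the association value is their ratio (the Pearson correlation coefficient), the four quantities defined above would be written as below. The n - 1 denominators and the notation $\bar{x}$, $\bar{y}$ for the daily averages are assumptions based on the symbol definitions in the text, so the formulas in the original figures may differ in detail.

```latex
\begin{align}
S_x &= \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^{2}}, &
S_y &= \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2}}, \\
S_{xy} &= \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)\left(y_i-\bar{y}\right), &
R_{xy} &= \frac{S_{xy}}{S_x\,S_y}.
\end{align}
```

In this form the factor $1/(n-1)$ cancels in $R_{xy}$, so the association value would be the same if $n$ were used as the denominator throughout.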
S3. Divide the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set;
In this embodiment, each association value in the file association value set is compared with a preset threshold. When the association value is greater than the preset threshold, the association relationship of the corresponding files in the user's access information is classified as a strong association; when the association value is less than or equal to the preset threshold, it is classified as a weak association. A strong association between two files means that once one of them is accessed, the other will be accessed with high probability; a weak association means that once one of them is accessed, the other will be accessed with low probability. All the classified association relationships are then collected to obtain the file association relationship set. For example, if the file association value set contains an association value of 0.9 for file A and file B and 0.5 for file A and file C, and the preset threshold is 0.8, then the relationship between file A and file B is classified as a strong association and the relationship between file A and file C as a weak association.
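The thresholding rule can be sketched in a few lines of Python, using the 0.9, 0.5, and 0.8 values from the example above. The function name and the dictionary representation of the association value set are hypothetical.

```python
def classify_associations(values, threshold=0.8):
    """Label each file pair 'strong' if its value exceeds the threshold, else 'weak'."""
    return {pair: ("strong" if v > threshold else "weak") for pair, v in values.items()}


relations = classify_associations({("A", "B"): 0.9, ("A", "C"): 0.5})
print(relations)  # {('A', 'B'): 'strong', ('A', 'C'): 'weak'}
```

A value exactly equal to the threshold falls on the weak side, matching the "less than or equal to" wording of the text.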
S4. In response to a file access request, determine the file to be accessed according to the file access request, and access the file to be accessed using the file association relationship set.
In this embodiment, the file access request is a verified request to access one of the files of the user to whom the file information corresponds; for example, if the file information is that of user A, the file access request is a request to access one of user A's files. The file to be accessed is then determined from the file access request; for example, if the request is to access user A's file A, then file A is determined to be the file to be accessed.
Further, in this embodiment, the storage information corresponding to the file to be accessed is extracted from the file information to obtain access information, the file to be accessed is read according to the access information, and the file access information in the file information is updated. As S3 shows, once the file to be accessed has been accessed, the files strongly associated with it are very likely to be accessed as well. If those files stay in the low-speed storage area, access to them will be slow. To speed up access to them, this embodiment performs access acceleration according to the file association relationship set, so that when the file to be accessed is accessed, access to its strongly associated files is accelerated.
In detail, performing access acceleration according to the file association relationship set includes: selecting from the file association relationship set the association relationships that involve the file to be accessed, to obtain an initial association relationship set; selecting the strong associations from the initial association relationship set, to obtain a target association relationship set; selecting all the files paired with the file to be accessed in the target association relationship set, to obtain a target file set; extracting from the file information the storage information of each file in the target file set, to obtain a target storage information set; and performing access acceleration according to the target storage information set.
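The selection steps above can be sketched as follows. This is a minimal sketch that assumes the association relationship set is a dictionary keyed by file pairs and that the file information maps each file name to its storage record. Both layouts and all function names are hypothetical.

```python
def strongly_associated_files(relations, target):
    """Return the files that have a 'strong' relation with `target`.

    relations: dict mapping a file pair (a, b) to 'strong' or 'weak'.
    """
    # Initial association set: relations that involve the file to be accessed.
    initial = {pair: kind for pair, kind in relations.items() if target in pair}
    # Target association set: only the strong associations.
    strong_pairs = [pair for pair, kind in initial.items() if kind == "strong"]
    # Target file set: the other file in each strong pair.
    return sorted(a if b == target else b for a, b in strong_pairs)


def target_storage_info(file_info, files):
    """Target storage information set: storage record of each target file."""
    return {name: file_info[name] for name in files}


relations = {("A", "B"): "strong", ("A", "C"): "strong",
             ("B", "C"): "weak", ("A", "D"): "weak"}
file_info = {
    "B": {"pool": "low_speed", "path": "/slow/B"},
    "C": {"pool": "high_speed", "path": "/fast/C"},
}
targets = strongly_associated_files(relations, "A")
print(targets)  # ['B', 'C']
print(target_storage_info(file_info, targets))
```

Only the entries whose pool is the low-speed pool (file B in this sample) would then need to be moved.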
Further, in this embodiment, performing access acceleration according to the target storage information set includes: moving the files in the target storage information set that are stored in the preset low-speed storage pool to the preset high-speed storage pool, and updating the corresponding storage information in the file information. A storage date is then constructed; optionally, the storage date is obtained by adding a preset number of storage days to the date on which the file access request was answered. When a file moved to the high-speed storage pool is accessed before the storage date, the storage date is updated, that is, the preset number of days is added to it to give a new storage date. Optionally, the preset number of days is 3; for example, if the storage date is December 4, 2020 and the preset number of days is 3, the updated storage date is December 7, 2020. When the storage date is reached, the file is moved from the high-speed storage pool back to the low-speed storage pool and the corresponding storage information in the file information is updated, thereby accelerating file access. For example, suppose the files strongly associated with file A, the file to be accessed, are file B and file C. When file A is accessed, file B and file C are very likely to be accessed too, so they are moved to the high-speed storage pool, where access to them is fast. If file B and file C are not accessed for a period of time, they are moved back to the low-speed storage pool to reduce storage cost.
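The storage-date bookkeeping can be illustrated with a small Python sketch. The `PromotedFile` class and its method names are hypothetical. The sketch also assumes that an access on the storage date itself still extends it, and that demotion is checked by a periodic call, since the text does not specify either detail.

```python
from datetime import date, timedelta

RETENTION_DAYS = 3  # preset number of days from the example in the text


class PromotedFile:
    """A file moved to the high-speed pool, with its storage date (hypothetical helper)."""

    def __init__(self, name, promoted_on, retention_days=RETENTION_DAYS):
        self.name = name
        self.pool = "high_speed"
        self.retention = timedelta(days=retention_days)
        # Storage date = response date of the access request + preset storage days.
        self.storage_date = promoted_on + self.retention

    def on_access(self, today):
        """An access no later than the storage date pushes it out by the preset days."""
        if self.pool == "high_speed" and today <= self.storage_date:
            self.storage_date += self.retention

    def demote_if_due(self, today):
        """Move the file back to the low-speed pool once the storage date is reached."""
        if self.pool == "high_speed" and today >= self.storage_date:
            self.pool = "low_speed"


f = PromotedFile("B", promoted_on=date(2020, 12, 1))
print(f.storage_date)            # 2020-12-04
f.on_access(date(2020, 12, 3))
print(f.storage_date)            # 2020-12-07
f.demote_if_due(date(2020, 12, 7))
print(f.pool)                    # low_speed
```

The sample dates reproduce the example in the text, where a storage date of December 4, 2020 becomes December 7, 2020 after one access.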
In another embodiment of the present application, to ensure data privacy, the file to be accessed may be stored in a blockchain node.
Through access acceleration, this embodiment allows most files to be kept in low-speed storage without affecting actual retrieval performance, which greatly reduces storage cost.
FIG. 2 is a functional module diagram of the file access apparatus of the present application.
The file access apparatus 100 described in this application may be installed in an electronic device. According to the functions implemented, the file access apparatus may include an information acquisition module 101, an association calculation module 102, and a file access module 103. A module in this application, which may also be called a unit, refers to a series of computer program segments that are stored in the memory of an electronic device, can be executed by the processor of the electronic device, and perform a fixed function.
In this embodiment, the functions of the modules/units are as follows:
The information acquisition module 101 is configured to obtain file information.
In this embodiment, the file information is the information of a user's uploaded files, where the uploaded files are the user files that the user has uploaded for storage, and the user files comprise one or more files. To reduce storage cost, the uploaded user files are stored in tiers: each company, according to its own data storage capacity, places the uploaded user files in different storage pools according to their upload time. For example, taking the current time as the reference, user files uploaded within the last 30 days are stored in a high-speed storage pool, which is fast to access but expensive, while user files uploaded more than 30 days ago are moved to a low-speed storage pool, which is slow to access but cheap. Further, the information of the user files includes: file information, file access information, file storage address, and file upload time. Here the file information is the file names of all the uploaded user files, the file access information is the number of times each uploaded user file is accessed per day, the file storage address is the storage path of each uploaded user file, and the file upload time is the upload time of each uploaded user file; for example, if file A was uploaded on December 4, 2020, its upload time is December 4, 2020. Specifically, in the uploaded-file information of user A, the file information is file A and file B, the file access information is the number of times file A and file B are accessed each day, the file storage address is the storage paths of file A and file B, and the file upload time is the time at which file A and file B were uploaded, for example December 4, 2020.
The association calculation module 102 is configured to extract the file access information from the file information, perform association calculation using the file access information to obtain a file association value set, and divide the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set.
In this embodiment, to analyze the pairwise association between all of the uploaded user files, the file access information is extracted from the file information, and the association calculation module 102 uses it for access association calculation, yielding the file association value set for the user's file accesses. The file access information is the number of times each uploaded user file is accessed per day; for example, if user A has uploaded file A and file B, user A's file access information is the number of times file A and file B are each accessed per day.
In detail, the association calculation module 102 performs the file association calculation and obtains the file association value set as follows: using the file access information to calculate the sample standard deviation of each file and the sample covariance of every two files in the file access information; calculating the association value of any two files in the file access information from the sample standard deviations and the sample covariance; and collecting all the association values to obtain the file association value set.
The sample standard deviation may be calculated by the following formulas:
Figure PCTCN2021082867-appb-000007
where S_x is the sample standard deviation of file x, x_i is the number of accesses to user file x on day i in the file access information, the symbol shown in Figure PCTCN2021082867-appb-000008 is the average number of daily accesses to file x, n is the total number of days since file x was uploaded, i is the date, and x is a file of the user in the file access information.
Figure PCTCN2021082867-appb-000009
where S_y is the sample standard deviation of file y, y_i is the number of accesses to user file y on day i in the file access information, the symbol shown in Figure PCTCN2021082867-appb-000010 is the average number of daily accesses to file y, n is the total number of days since file y was uploaded, i is the date, and y is a file of the user in the file access information.
Since file x and file y are both files among the user files and have the same upload time, their total number of days since upload is the same and is denoted by n, where that total is the difference between the current time and the upload time.
Further, the sample covariance may be calculated by the following formula:
Figure PCTCN2021082867-appb-000011
where S_xy is the sample covariance of file x and file y.
Further, in this embodiment the association value of any two files may be calculated by the following formula:
Figure PCTCN2021082867-appb-000012
where R_xy is the association value of file x and file y.
In this embodiment, the association calculation module 102 compares each association value in the file association value set with a preset threshold. When the association value is greater than the preset threshold, the association relationship of the corresponding files in the user's access information is classified as a strong association; when the association value is less than or equal to the preset threshold, it is classified as a weak association. A strong association between two files means that once one of them is accessed, the other will be accessed with high probability; a weak association means that once one of them is accessed, the other will be accessed with low probability. All the classified association relationships are then collected to obtain the file association relationship set. For example, if the file association value set contains an association value of 0.9 for file A and file B and 0.5 for file A and file C, and the preset threshold is 0.8, then the relationship between file A and file B is classified as a strong association and the relationship between file A and file C as a weak association.
The file access module 103 is configured to respond to a file access request, determine the file to be accessed according to the file access request, and access the file to be accessed by using the file association relationship set.
In this embodiment of the present application, the file access request is a verified request to access a file of the user to whom the file information belongs. For example, if the file information is the file information of user A, the file access request is a request to access one of user A's files. Further, the file access module 103 determines the file to be accessed according to the file access request. For example, if the file access request is a request to access file Jia (甲) of user A, file Jia is determined to be the file to be accessed.
Further, in this embodiment of the present application, the file access module 103 extracts the storage information corresponding to the file to be accessed from the file information to obtain access information, performs read access to the file to be accessed according to the access information, and updates the file access information in the file information. After the file to be accessed has been accessed, files that have a strong association relationship with it are very likely to be accessed as well. If those files remain in the low-speed storage area, accessing them is slow. To improve the access speed of those files, this embodiment of the present application performs access acceleration according to the file association relationship set, so that when the file to be accessed is accessed, access to its strongly associated files is accelerated.
In detail, in this embodiment of the present application, the file access module 103 performs access acceleration by the following means: filtering the file association relationship set for the association relationships that involve the file to be accessed, to obtain an initial association relationship set; filtering the initial association relationship set for the strong association relationships, to obtain a target association relationship set; selecting all the files paired with the file to be accessed in the target association relationship set, to obtain a target file set; extracting from the file information the storage information corresponding to each file in the target file set, to obtain a target storage information set; and performing access acceleration according to the target storage information set.
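The filtering steps above can be sketched as follows. This is an illustration under assumptions rather than the applicant's implementation: association relationships are modelled as a dictionary from file pairs to "strong" or "weak", and the storage information is a hypothetical dictionary with "pool" and "path" fields.

```python
def select_target_storage_info(file_to_access, relationships, storage_info):
    """Return the storage information of every file strongly associated with the accessed file."""
    # Step 1: initial association relationship set (pairs involving the accessed file).
    initial = {
        pair: kind for pair, kind in relationships.items() if file_to_access in pair
    }
    # Step 2: target association relationship set (strong associations only).
    target_pairs = [pair for pair, kind in initial.items() if kind == "strong"]
    # Step 3: target file set (the other file of each remaining pair).
    target_files = {f for pair in target_pairs for f in pair if f != file_to_access}
    # Step 4: target storage information set.
    return {f: storage_info[f] for f in sorted(target_files)}


# Hypothetical data: A is strongly associated with B and D, weakly with C.
relationships = {
    ("A", "B"): "strong",
    ("A", "C"): "weak",
    ("D", "A"): "strong",
    ("B", "C"): "strong",
}
storage_info = {
    name: {"pool": "low-speed", "path": "/pool/low/" + name}
    for name in ("A", "B", "C", "D")
}
targets = select_target_storage_info("A", relationships, storage_info)
```

The pair ("B", "C") is dropped in the first step because it does not involve the accessed file, even though it is a strong association.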
Further, in this embodiment of the present application, the file access module 103 performs access acceleration according to the target storage information set by the following means. The files in the target storage information set that are stored in a preset low-speed storage pool are transferred to a preset high-speed storage pool, and the corresponding storage information in the file information is updated. A storage date is then constructed; optionally, a preset number of storage days is added to the response date of the file access request to obtain the storage date. When a file transferred to the preset high-speed storage pool is accessed before the storage date, the storage date is updated, that is, a preset number of days is added to the storage date to give the new storage date. Optionally, the preset number of days is 3. For example, if the storage date is December 4, 2020 and the preset number of days is 3, the updated storage date is December 7, 2020. When the storage date is reached, the file transferred to the preset high-speed storage pool is transferred back to the low-speed storage pool, and the corresponding storage information in the file information is updated, thereby accelerating file access. For example, suppose the strongly associated files of file A, the file to be accessed, are file B and file C. When file A is accessed, file B and file C are very likely to be accessed too, so file B and file C are transferred to the high-speed storage pool, where they can be accessed quickly. If file B and file C are not accessed for a period of time, they are transferred back to the low-speed storage pool to reduce file storage costs.
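The promotion, renewal and demotion cycle described above can be sketched with a small in-memory model. The class name, the pool labels and the choice to treat "before the storage date" as a strict comparison are assumptions made for illustration; the 3-day period and the December 2020 dates follow the example in the text.

```python
from datetime import date, timedelta

PRESET_DAYS = 3  # preset number of storage days from the example


class TieredStore:
    """Toy model of the low-speed and high-speed storage pools."""

    def __init__(self, pools):
        self.pools = dict(pools)   # file name -> "low-speed" or "high-speed"
        self.storage_date = {}     # file name -> date on which it is demoted

    def promote(self, files, response_date):
        """Move low-speed files to the high-speed pool and construct their storage date."""
        for name in files:
            if self.pools[name] == "low-speed":
                self.pools[name] = "high-speed"
                self.storage_date[name] = response_date + timedelta(days=PRESET_DAYS)

    def touch(self, name, today):
        """Extend the storage date if a promoted file is accessed before it is reached."""
        if name in self.storage_date and today < self.storage_date[name]:
            self.storage_date[name] += timedelta(days=PRESET_DAYS)

    def demote_expired(self, today):
        """Move files whose storage date has been reached back to the low-speed pool."""
        for name, expires in list(self.storage_date.items()):
            if today >= expires:
                self.pools[name] = "low-speed"
                del self.storage_date[name]


store = TieredStore({"B": "low-speed", "C": "low-speed"})
store.promote(["B", "C"], response_date=date(2020, 12, 1))  # storage date: 2020-12-04
store.touch("B", today=date(2020, 12, 3))                   # B renewed to 2020-12-07
store.demote_expired(today=date(2020, 12, 4))               # C expires, B stays
```

In this run file B, accessed on December 3, has its storage date moved from December 4 to December 7 as in the example, while file C is returned to the low-speed pool on December 4.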
In another embodiment of the present application, to ensure the privacy of the data, the file to be accessed may be stored in a blockchain node.
Through access acceleration, the embodiments of the present application allow most files to be placed on low-speed storage without affecting actual retrieval performance, which greatly reduces storage costs.
FIG. 3 is a schematic structural diagram of an electronic device that implements the file access method of the present application.
The electronic device 1 may include a processor 10, a memory 11 and a bus, and may further include a computer program stored in the memory 11 and executable on the processor 10, such as a file access program 12.
The memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, a mobile hard disk, a multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, such as a mobile hard disk of the electronic device 1. In other embodiments, the memory 11 may be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card provided on the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only to store application software installed on the electronic device 1 and various types of data, such as the code of the file access program, but also to temporarily store data that has been output or is to be output.
In some embodiments, the processor 10 may be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 10 is the control unit (Control Unit) of the electronic device. It connects the components of the entire electronic device through various interfaces and lines, and performs the functions of the electronic device 1 and processes data by running or executing the programs or modules stored in the memory 11 (for example, the file access program) and calling the data stored in the memory 11.
The bus may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. The bus is configured to provide connection and communication between the memory 11 and the at least one processor 10, among other components.
FIG. 3 shows only an electronic device with certain components. Those skilled in the art will understand that the structure shown in FIG. 3 does not limit the electronic device 1, which may include fewer or more components than shown, a combination of certain components, or a different arrangement of components.
For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) that powers the components. Preferably, the power supply is logically connected to the at least one processor 10 through a power management apparatus, so that charging management, discharging management, power consumption management and similar functions are implemented through the power management apparatus. The power supply may further include one or more DC or AC power sources, a recharging apparatus, a power failure detection circuit, a power converter or inverter, a power status indicator, and any other such components. The electronic device 1 may further include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not described in detail here.
Further, the electronic device 1 may include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), and is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further include a user interface. The user interface may be a display (Display) or an input unit such as a keyboard (Keyboard); optionally, the user interface may also be a standard wired or wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used to display the information processed in the electronic device 1 and to display a visual user interface.
It should be understood that the embodiments are for illustration only, and the scope of the patent application is not limited by this structure.
The file access program 12 stored in the memory 11 of the electronic device 1 is a combination of multiple computer programs which, when run on the processor 10, can implement:
acquiring file information;
extracting the file access information from the file information, and performing an association calculation using the file access information to obtain a file association value set;
classifying the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set;
in response to a file access request, determining the file to be accessed according to the file access request, and accessing the file to be accessed by using the file association relationship set.
Specifically, for the way the processor 10 implements the above computer program, reference may be made to the description of the relevant steps in the embodiment corresponding to FIG. 1, which is not repeated here.
Further, if the integrated modules/units of the electronic device 1 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. The computer-readable medium may be non-volatile or volatile. The computer-readable medium may include any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a mobile hard disk, a magnetic disk, an optical disc, computer memory, or read-only memory (ROM, Read-Only Memory).
An embodiment of the present application may further provide a computer-readable storage medium storing a computer program which, when executed by a processor of an electronic device, can implement:
acquiring file information;
extracting the file access information from the file information, and performing an association calculation using the file access information to obtain a file association value set;
classifying the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set;
in response to a file access request, determining the file to be accessed according to the file access request, and accessing the file to be accessed by using the file association relationship set.
Further, the computer-usable storage medium may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store data created according to the use of a blockchain node, and the like.
In the several embodiments provided in this application, it should be understood that the disclosed devices, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into modules is only a division by logical function, and other divisions are possible in an actual implementation.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It will be apparent to those skilled in the art that the present application is not limited to the details of the above exemplary embodiments, and that the present application can be implemented in other specific forms without departing from its spirit or essential characteristics.
Therefore, the embodiments are to be regarded in all respects as exemplary and non-restrictive. The scope of the present application is defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalents of the claims are therefore intended to be embraced in the present application. No reference sign in the claims shall be construed as limiting the claim concerned.
The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain (Blockchain) is essentially a decentralized database: a chain of data blocks generated in association with one another using cryptographic methods. Each data block contains the information of a batch of network transactions, which is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and so on.
In addition, it is clear that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or apparatuses recited in the system claims may also be implemented by a single unit or apparatus through software or hardware. Terms such as "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application and not to limit them. Although the present application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present application may be modified or equivalently replaced without departing from the spirit and scope of the technical solutions of the present application.

Claims (20)

  1. A file access method, wherein the method comprises:
    acquiring file information;
    extracting the file access information from the file information, and performing an association calculation using the file access information to obtain a file association value set;
    classifying the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set;
    in response to a file access request, determining the file to be accessed according to the file access request, and accessing the file to be accessed by using the file association relationship set.
  2. The file access method according to claim 1, wherein the extracting the file access information from the file information, and performing an association calculation using the file access information to obtain a file association value set, comprises:
    calculating, using the file access information, the sample standard deviation of each file and the sample covariance of every two files in the file access information;
    calculating the association value of any two files in the file access information according to the sample standard deviation and the sample covariance;
    aggregating all the association values to obtain the file association value set.
  3. The file access method according to claim 2, wherein the classifying the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set comprises:
    determining whether each association value in the file association value set is greater than a preset threshold;
    when the association value is greater than the preset threshold, classifying the association relationship of the corresponding files in the user's access information as a strong association relationship;
    when the association value is less than or equal to the preset threshold, classifying the association relationship of the corresponding files in the user's access information as a weak association relationship;
    aggregating all the classified association relationships to obtain the file association relationship set.
  4. The file access method according to claim 1, wherein the accessing the file to be accessed by using the file association relationship set comprises:
    extracting the storage information corresponding to the file to be accessed from the file information to obtain access information;
    performing read access to the file to be accessed according to the access information, and updating the file access information in the file information;
    performing access acceleration based on the file to be accessed by using the file association relationship set.
  5. The file access method according to claim 3, wherein the performing access acceleration based on the file to be accessed by using the file association relationship set comprises:
    filtering the file association relationship set for the association relationships corresponding to the file to be accessed, to obtain an initial association relationship set;
    filtering the initial association relationship set for the strong association relationships, to obtain a target association relationship set;
    selecting all the files corresponding to the file to be accessed in the target association relationship set, to obtain a target file set;
    extracting from the file information the storage information corresponding to each file in the target file set, to obtain a target storage information set;
    performing access acceleration according to the target storage information set.
  6. The file access method according to claim 5, wherein the performing access acceleration according to the target storage information set comprises:
    transferring the files in the target storage information set that are stored in a preset low-speed storage pool to a preset high-speed storage pool, and updating the corresponding storage information in the file information;
    setting a storage date, and updating the storage date when a file transferred to the preset high-speed storage pool is accessed before the storage date;
    when the storage date is reached, transferring the file transferred to the preset high-speed storage pool back to the low-speed storage pool, and updating the corresponding storage information in the file information.
  7. The file access method according to claim 6, wherein the updating the storage date comprises:
    adding a preset number of days to the storage date to give the new storage date.
  8. A file access apparatus, comprising:
    an information acquisition module, configured to acquire file information;
    an association calculation module, configured to extract the file access information from the file information, perform an association calculation using the file access information to obtain a file association value set, and classify the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set;
    a file access module, configured to respond to a file access request, determine the file to be accessed according to the file access request, and access the file to be accessed by using the file association relationship set.
  9. An electronic device, wherein the electronic device comprises:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores computer program instructions executable by the at least one processor, and the computer program instructions are executed by the at least one processor to enable the at least one processor to perform the following steps:
    acquiring file information;
    extracting the file access information from the file information, and performing an association calculation using the file access information to obtain a file association value set;
    classifying the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set;
    in response to a file access request, determining the file to be accessed according to the file access request, and accessing the file to be accessed by using the file association relationship set.
  10. The electronic device according to claim 9, wherein the extracting the file access information from the file information, and performing an association calculation using the file access information to obtain a file association value set, comprises:
    calculating, using the file access information, the sample standard deviation of each file and the sample covariance of every two files in the file access information;
    calculating the association value of any two files in the file access information according to the sample standard deviation and the sample covariance;
    aggregating all the association values to obtain the file association value set.
  11. The electronic device according to claim 10, wherein the classifying the association relationships of the corresponding files in the file access information according to the file association value set to obtain a file association relationship set comprises:
    determining whether each association value in the file association value set is greater than a preset threshold;
    when the association value is greater than the preset threshold, classifying the association relationship of the corresponding files in the user's access information as a strong association relationship;
    when the association value is less than or equal to the preset threshold, classifying the association relationship of the corresponding files in the user's access information as a weak association relationship;
    aggregating all the classified association relationships to obtain the file association relationship set.
  12. The electronic device according to claim 9, wherein the accessing the file to be accessed by using the file association relationship set comprises:
    extracting the storage information corresponding to the file to be accessed from the file information to obtain access information;
    performing read access to the file to be accessed according to the access information, and updating the file access information in the file information;
    performing access acceleration based on the file to be accessed by using the file association relationship set.
  13. The electronic device according to claim 11, wherein performing access acceleration, based on the to-be-accessed file, by using the file association relationship set comprises:
    screening the file association relationship set for the association relationships corresponding to the to-be-accessed file to obtain an initial association relationship set;
    screening the initial association relationship set for strong association relationships to obtain a target association relationship set;
    selecting all files that correspond to the to-be-accessed file in the target association relationship set to obtain a target file set;
    extracting, from the file information, the storage information corresponding to each file in the target file set to obtain a target storage information set;
    performing access acceleration according to the target storage information set.
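By way of non-limiting illustration, the screening and selection steps recited above might be sketched as follows. The function name `select_prefetch_targets`, the representation of an association relationship as a file pair mapped to the label "strong" or "weak", and the shape of `storage_info` are all assumptions made for this example and are not taken from the application.

```python
def select_prefetch_targets(requested, relations, storage_info):
    """Return the storage information of files strongly associated with `requested`.

    relations:    dict mapping a (file_a, file_b) pair to "strong" or "weak"
                  (hypothetical layout of the file association relationship set).
    storage_info: dict mapping a file name to its storage information
                  (hypothetical layout of the per-file storage information).
    """
    targets = set()
    for (file_a, file_b), kind in relations.items():
        # Initial association relationship set: pairs that involve the requested file.
        if requested not in (file_a, file_b):
            continue
        # Target association relationship set: keep only strong relationships.
        if kind != "strong":
            continue
        # Target file set: the other file of each remaining pair.
        targets.add(file_b if file_a == requested else file_a)
    # Target storage information set: storage information of every target file.
    return {name: storage_info[name] for name in sorted(targets)}
```

In this sketch the returned mapping plays the role of the target storage information set, which a later step could use to decide which files to move to faster storage.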
  14. The electronic device according to claim 13, wherein performing access acceleration according to the target storage information set comprises:
    transferring the files in the target storage information set that are stored in a preset low-speed storage pool to a preset high-speed storage pool, and updating the corresponding storage information in the file information;
    setting a storage date, and updating the storage date when a file transferred to the preset high-speed storage pool is accessed before the storage date is reached;
    when the storage date is reached, transferring the file that was transferred to the preset high-speed storage pool back to the low-speed storage pool, and updating the corresponding storage information in the file information.
  15. The electronic device according to claim 14, wherein updating the storage date comprises:
    increasing the storage date by a preset number of days to obtain a new storage date.
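As a non-limiting sketch of the pool transfer and storage date handling recited in claims 14 and 15, the following example keeps one in-memory record per file. The names `FileRecord`, `promote`, `on_access` and `expire`, the pool labels "slow" and "fast", and the default values `keep_days=7` and `extend_days=3` are hypothetical and chosen only for illustration; an actual embodiment would move data between storage pools rather than change a label.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class FileRecord:
    """Hypothetical per-file storage information."""
    pool: str = "slow"                    # "slow" or "fast" storage pool
    storage_date: Optional[date] = None   # date on which the file returns to the slow pool


def promote(rec: FileRecord, today: date, keep_days: int = 7) -> None:
    """Move a file from the slow pool to the fast pool and set its storage date."""
    if rec.pool == "slow":
        rec.pool = "fast"
        rec.storage_date = today + timedelta(days=keep_days)


def on_access(rec: FileRecord, today: date, extend_days: int = 3) -> None:
    """Extend the storage date by a preset number of days if accessed before it is reached."""
    if rec.pool == "fast" and rec.storage_date is not None and today < rec.storage_date:
        rec.storage_date += timedelta(days=extend_days)


def expire(rec: FileRecord, today: date) -> None:
    """Return the file to the slow pool once the storage date is reached."""
    if rec.pool == "fast" and rec.storage_date is not None and today >= rec.storage_date:
        rec.pool = "slow"
        rec.storage_date = None
```

Under these assumptions, a file that keeps being accessed has its storage date pushed forward each time and therefore stays in the high-speed pool, while an idle file falls back to the low-speed pool when its storage date arrives.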
  16. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the following steps:
    acquiring file information;
    extracting file access information from the file information, and performing association calculation by using the file access information to obtain a file association value set;
    classifying, according to the file association value set, the association relationships of the corresponding files in the file access information to obtain a file association relationship set;
    in response to a file access request, determining a to-be-accessed file according to the file access request, and accessing the to-be-accessed file by using the file association relationship set.
  17. The computer-readable storage medium according to claim 16, wherein extracting the file access information from the file information, and performing association calculation by using the file access information to obtain a file association value set, comprises:
    calculating, by using the file access information, the sample standard deviation of each file and the sample covariance of every two files in the file access information;
    calculating the association value of any two files in the file access information according to the sample standard deviation and the sample covariance;
    aggregating all of the association values to obtain the file association value set.
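For illustration only, one way to compute an association value from a sample standard deviation and a sample covariance is the Pearson correlation coefficient, i.e. the covariance of two files divided by the product of their standard deviations. The claim does not state this formula, so its use here is an assumption, as are the function names and the representation of the file access information as equal-length lists of per-period access counts.

```python
from itertools import combinations
from math import sqrt


def sample_std(xs):
    """Sample standard deviation (n - 1 in the denominator)."""
    n = len(xs)
    mean = sum(xs) / n
    return sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))


def sample_cov(xs, ys):
    """Sample covariance of two equal-length series (n - 1 in the denominator)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def association_values(access_counts):
    """Map each unordered file pair to cov(x, y) / (std(x) * std(y)).

    access_counts: dict mapping a file name to its access counts per period
                   (hypothetical layout of the file access information).
    """
    stds = {name: sample_std(counts) for name, counts in access_counts.items()}
    values = {}
    for a, b in combinations(sorted(access_counts), 2):
        denom = stds[a] * stds[b]
        # A file with constant counts has zero deviation; use 0.0 in that case (assumption).
        values[(a, b)] = sample_cov(access_counts[a], access_counts[b]) / denom if denom else 0.0
    return values
```

Each series needs at least two observations for the sample statistics to be defined. The resulting dictionary plays the role of the file association value set.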
  18. The computer-readable storage medium according to claim 17, wherein classifying, according to the file association value set, the association relationships of the corresponding files in the file access information to obtain a file association relationship set comprises:
    determining whether each association value in the file association value set is greater than a preset threshold;
    when the association value is greater than the preset threshold, classifying the association relationship of the corresponding files in the user's access information as a strong association relationship;
    when the association value is less than or equal to the preset threshold, classifying the association relationship of the corresponding files in the user's access information as a weak association relationship;
    aggregating all of the classified association relationships to obtain the file association relationship set.
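The threshold comparison recited above might be sketched as follows. The function name `classify_associations`, the labels "strong" and "weak", and the default threshold of 0.8 are hypothetical; the claim requires only that some preset threshold be used.

```python
def classify_associations(values, threshold=0.8):
    """Label each file pair "strong" if its value exceeds the threshold, else "weak".

    values:    dict mapping a (file_a, file_b) pair to its association value.
    threshold: preset threshold; 0.8 is an illustrative value only.
    """
    return {
        pair: ("strong" if value > threshold else "weak")
        for pair, value in values.items()
    }
```

Note that a value exactly equal to the threshold is classified as weak, matching the "less than or equal to" wording of the claim.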
  19. The computer-readable storage medium according to claim 16, wherein accessing the to-be-accessed file by using the file association relationship set comprises:
    extracting, from the file information, the storage information corresponding to the to-be-accessed file to obtain access information;
    performing read access on the to-be-accessed file according to the access information, and updating the file access information in the file information;
    performing access acceleration, based on the to-be-accessed file, by using the file association relationship set.
  20. The computer-readable storage medium according to claim 18, wherein performing access acceleration, based on the to-be-accessed file, by using the file association relationship set comprises:
    screening the file association relationship set for the association relationships corresponding to the to-be-accessed file to obtain an initial association relationship set;
    screening the initial association relationship set for strong association relationships to obtain a target association relationship set;
    selecting all files that correspond to the to-be-accessed file in the target association relationship set to obtain a target file set;
    extracting, from the file information, the storage information corresponding to each file in the target file set to obtain a target storage information set;
    performing access acceleration according to the target storage information set.
PCT/CN2021/082867 2020-12-25 2021-03-25 File access method, apparatus, device, and readable storage medium WO2022134345A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011562951.X 2020-12-25
CN202011562951.XA CN112667570A (en) 2020-12-25 2020-12-25 File access method, device, equipment and readable storage medium

Publications (1)

Publication Number Publication Date
WO2022134345A1 (en) 2022-06-30

Family

ID=75409228

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/082867 WO2022134345A1 (en) 2020-12-25 2021-03-25 File access method, apparatus, device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN112667570A (en)
WO (1) WO2022134345A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103377060A (en) * 2012-04-25 2013-10-30 腾讯科技(深圳)有限公司 Computer program acceleration method and system
US20170011054A1 (en) * 2015-07-11 2017-01-12 International Business Machines Corporation Intelligent caching in distributed clustered file systems
CN110688360A (en) * 2019-09-17 2020-01-14 济南浪潮数据技术有限公司 Distributed file system storage management method, device, equipment and storage medium
CN111552664A (en) * 2020-03-24 2020-08-18 福建天泉教育科技有限公司 Method and storage medium for intelligently scheduling cold and hot of storage system
CN111813740A (en) * 2019-04-11 2020-10-23 中国移动通信集团四川有限公司 File layered storage method and server

Also Published As

Publication number Publication date
CN112667570A (en) 2021-04-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21908357

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21908357

Country of ref document: EP

Kind code of ref document: A1