WO2020024445A1 - Data storage method and apparatus, computer device, and computer storage medium

Info

Publication number: WO2020024445A1
Application number: PCT/CN2018/111119
Authority: WO
Inventors: 易仁杰
Original assignee: 平安科技（深圳）有限公司
Priority date: 2018-08-01
Filing date: 2018-10-21
Publication date: 2020-02-06
Also published as: CN109063121A; CN109063121B

Abstract

A data storage method and apparatus, a computer device, and a computer storage medium, relating to the technical field of the Internet. According to the physical location of a user, the target storage node closest to that location can be determined to store the data to be stored, so that the data of users at different locations is stored in a distributed manner rather than all in a single storage cluster. Data can therefore be queried quickly, which shortens query time and achieves good intelligence. The method comprises: receiving a data storage instruction of a user (101); positioning a terminal of the user on the basis of a terminal identifier to obtain a first physical location of the user (102); determining at least one storage node according to the first physical location, and determining at least one first distance between the at least one storage node and the first physical location (103); extracting a target distance from among the at least one first distance, and determining a target storage node indicated by the target distance (104); and obtaining, from the data storage instruction, data to be stored, and storing said data to the target storage node (105).

Description

Data storage method, device, computer equipment and computer storage medium

This application claims priority to Chinese patent application No. 2018108656426, filed with the Chinese Patent Office on August 1, 2018 and entitled "Data Storage Method, Device, Computer Equipment, and Computer Storage Medium", the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the field of Internet technologies, and in particular, to a data storage method, device, computer device, and computer storage medium.

Background technique

With the rapid development of applications such as mobile devices, social networks, and the Internet of Things, the data generated by human society has grown explosively. The traditional data storage method is usually disk storage: users store the data that needs to be kept on a disk so that it can be viewed anytime, anywhere. However, as the amount of data to be stored increases, traditional disks find it increasingly difficult to meet the storage needs of users with massive data in terms of capacity, performance, and bandwidth. Data storage systems supported by cloud platforms therefore came into being. A data center is deployed in the data storage system; users upload the data to be stored to the data center, and the data center stores it.

In related technologies, a data center is provided with a storage cluster for storing data. When a user uploads data to be stored to the data storage system, the data center receives the data and adds it to the storage cluster for storage.

During the implementation of this application, the applicant found that the related technology has at least the following problem: when the data center performs data storage, all the data to be stored is placed in the same storage cluster, so the storage cluster holds massive data. When a user's data query request is received, the query must be performed over this massive data, and the data the user needs cannot be obtained quickly, resulting in time-consuming queries and poor intelligence.

Summary of the invention

In view of this, this application provides a data storage method, device, computer equipment, and computer storage medium. The main purpose is to solve the problem that, when a data query request is received from a user, the query must be performed over a massive amount of data, so the data the user needs cannot be obtained quickly, resulting in long query times and poor intelligence.

According to a first aspect of the present application, a data storage method is provided. The method includes: receiving a data storage instruction of a user, where the data storage instruction includes at least data to be stored and a terminal identifier of a terminal of the user; locating the terminal of the user based on the terminal identifier to obtain a first physical location of the user; determining at least one storage node according to the first physical location, and determining at least one first distance between the at least one storage node and the first physical location, where the at least one storage node is deployed in a storage resource pool indicated by the first physical location; extracting a target distance from the at least one first distance, and determining a target storage node indicated by the target distance, where the target distance is a first distance among the at least one first distance that meets a first distance criterion; and obtaining the data to be stored from the data storage instruction, and storing the data to be stored in the target storage node.

According to a second aspect of the present application, a data storage device is provided. The device includes: a first receiving module, configured to receive a data storage instruction of a user, where the data storage instruction includes at least data to be stored and a terminal identifier of a terminal of the user; a positioning module, configured to locate the terminal of the user based on the terminal identifier to obtain a first physical location of the user; a first distance determination module, configured to determine at least one storage node according to the first physical location and to determine at least one first distance between the at least one storage node and the first physical location, where the at least one storage node is deployed in a storage resource pool indicated by the first physical location; a first node determining module, configured to extract a target distance from the at least one first distance and to determine a target storage node indicated by the target distance, where the target distance is a first distance among the at least one first distance that meets a first distance criterion; and a first storage module, configured to obtain the data to be stored from the data storage instruction and to store the data to be stored in the target storage node.

According to a third aspect of the present application, a computer device is provided, including a memory and a processor, where computer-readable instructions are stored in the memory, and the processor implements the steps of the method of the first aspect when executing the computer-readable instructions.

According to a fourth aspect of the present application, a computer non-volatile readable storage medium is provided, having computer-readable instructions stored thereon, where the computer-readable instructions implement the steps of the method of the first aspect when executed by a processor.

With the above technical solution, compared with the current approach in which all received data to be stored is placed in the same storage cluster, the data storage method and device provided in this application can, according to the user's physical location, determine the target storage node closest to that location to store the user's data to be stored. The data stored by users in different locations is thus distributed, and the data of all users is not stored in a single storage cluster. When a user's data query request is received, the data the user needs can be obtained quickly, the time taken to query the data is shortened, and the intelligence is better.

The above description is only an overview of the technical solution of this application. So that the technical means of this application can be understood more clearly and implemented in accordance with the content of the description, and to make the above and other purposes, features, and advantages of this application more obvious and understandable, specific implementations of this application are set out below.

BRIEF DESCRIPTION OF THE DRAWINGS

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the detailed description of the preferred embodiments below. The drawings are only for the purpose of illustrating preferred embodiments and are not to be considered as limiting the present application. Moreover, the same reference numerals are used throughout the drawings to refer to the same parts. In the drawings:

FIG. 1 shows a schematic flowchart of a data storage method provided by an embodiment of the present application;
FIG. 2A shows a schematic structural diagram of a data storage system provided by an embodiment of the present application;
FIG. 2B shows a schematic flowchart of a data storage method provided by an embodiment of the present application;
FIG. 2C shows a schematic flowchart of a data storage method provided by an embodiment of the present application;
FIG. 2D shows a schematic flowchart of a data storage method provided by an embodiment of the present application;
FIG. 3A shows a schematic structural diagram of a data storage device provided by an embodiment of the present application;
FIG. 3B shows a schematic structural diagram of a data storage device provided by an embodiment of the present application;
FIG. 3C shows a schematic structural diagram of a data storage device provided by an embodiment of the present application;
FIG. 3D shows a schematic structural diagram of a data storage device provided by an embodiment of the present application;
FIG. 3E shows a schematic structural diagram of a data storage device provided by an embodiment of the present application;
FIG. 3F shows a schematic structural diagram of a data storage device provided by an embodiment of the present application;
FIG. 3G shows a schematic structural diagram of a data storage device provided by an embodiment of the present application.

Detailed description

Hereinafter, exemplary embodiments of the present disclosure will be described in more detail with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth herein. On the contrary, these embodiments are provided to enable a thorough understanding of the present disclosure, and to fully convey the scope of the present disclosure to those skilled in the art.

An embodiment of the present application provides a data storage method. As shown in FIG. 1, the method includes:

101. Receive a user's data storage instruction. The data storage instruction includes at least data to be stored and a terminal identifier of the user's terminal. For the specific process, refer to step 203 in the following embodiment.

102. Locate the user's terminal based on the terminal identifier to obtain the first physical location of the user. For the specific process, refer to step 203 in the following embodiment.

103. Determine at least one storage node according to the first physical location, and determine at least one first distance between the at least one storage node and the first physical location. The at least one storage node is deployed in the storage resource pool indicated by the first physical location. For the specific process, refer to step 204 in the following embodiment.

104. Extract the target distance from the at least one first distance, and determine the target storage node indicated by the target distance. The target distance is the first distance among the at least one first distance that meets the first distance criterion. For the specific process, refer to step 205 in the following embodiment.

105. Obtain the data to be stored from the data storage instruction, and store the data to be stored in the target storage node. For the specific process, refer to step 206 in the following embodiment.

The data storage method provided in the embodiment of the present application can, according to the user's physical location, determine the target storage node closest to that location to store the user's data to be stored. The data stored by users in different locations is thus distributed, and the data of all users is not stored in a single storage cluster, so that when a user's data query request is received, the data the user needs can be obtained quickly, the time consumed by the query is shortened, and the intelligence is better.

Before the embodiments of the present application are explained in detail, the structure of the data storage system involved is briefly introduced. Referring to FIG. 2A, an Endpoint and a storage resource pool are deployed in each of a plurality of regions in the data storage system. The Endpoint represents the directory system through which the user's terminal connects to the storage resource pool; the storage resource pool is used to store the data uploaded by users. Each storage resource pool includes at least one storage node, and the storage in each region's storage resource pool is highly available: even if a storage node in the pool fails, the data stored on that node is not lost. In addition, the storage resource pool in each region is connected to a resource synchronization channel, which is used to synchronize data from the storage resource pool in the current region to the storage resource pool in another region.

The embodiment of the present application provides a data storage method that can quickly obtain the data a user needs to query, shorten the time required for the query, and offer better intelligence.
As shown in FIG. 2B, the method includes:

201. Determine multiple storage resource pools. For each storage resource pool in the multiple storage resource pools, establish a data connection between the storage resource pool and the other storage resource pools based on a preset data channel.

In the embodiment of the present application, each storage resource pool in the multiple storage resource pools includes at least one storage node. The preset data channel is at least a public network channel or an internal network channel; the public network channel may specifically be a VPN (Virtual Private Network) channel.

The applicant recognizes that, to make data storage more convenient for users and to select the most suitable storage node for each user, the storage node that stores the user's data to be stored can be determined according to the user's geographical location, thereby shortening the time the user spends storing and retrieving data. In addition, users may query data from different regions, so to facilitate such queries the data a user needs must be synchronized from one region to another. Therefore, data connections between the storage resource pool in one region and the storage resource pools in other regions can be established based on the preset data channel.

When establishing a data connection between the storage resource pool in one region and the storage resource pool in another region, the preset data channel may be used to connect the storage resource pool in each region to the resource synchronization channel, so that data can be synchronized from the storage resource pool in one region to the storage resource pool in another region through the resource synchronization channel. Within a region, the storage nodes may be connected to one another through the preset data channel of that region, which supports synchronization, backup, and other operations on the data stored in the storage nodes.

202. Rate the data connections between the storage resource pool and the other storage resource pools, and generate a rating result.

In the embodiment of the present application, a storage resource pool may connect to another storage resource pool through a public network channel or an internal network channel, and the security coefficient of the internal network channel is higher than that of the public network channel. Therefore, after the data connection between the storage resource pool and the other storage resource pools is successfully established, the connection can be rated to generate a rating result, so that when data is later synchronized, the most secure storage node can be chosen based on the rating result and the user can be reminded of the security of the data synchronization.

If the data connection between the storage resource pool and the other storage resource pools is only a public network channel, the generated rating result is level three, which can be expressed as level C. It indicates that only a public network channel exists between the storage resource pool in this region and the storage resource pools in other regions, and the synchronization risk is high. If the data connection is only an internal network channel, the generated rating result is level two, which can be expressed as level B. It indicates that only an internal network channel exists between the storage resource pool in this region and the storage resource pools in other regions, and the synchronization risk is low. If the data connection includes both a public network channel and an internal network channel, the generated rating result is level one, which can be expressed as level A. It indicates that both channels exist between the storage resource pool in this region and the storage resource pools in other regions; the synchronization risk is the lowest, and the internal network channel is selected preferentially when synchronizing data.
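The rating rule of step 202 can be sketched as a small lookup. This is only an illustration under stated assumptions: the names `Channel`, `rate_connection`, and `pick_sync_channel` are hypothetical and do not come from the application, which describes the levels but not any code.

```python
from enum import Enum


class Channel(Enum):
    PUBLIC = "public"      # public network channel, for example a VPN channel
    INTERNAL = "internal"  # internal network channel


def rate_connection(channels):
    """Return 'A', 'B', or 'C' for the channels linking two storage resource pools."""
    channels = set(channels)
    if not channels:
        raise ValueError("no data connection established between the pools")
    if channels == {Channel.PUBLIC, Channel.INTERNAL}:
        return "A"  # level one: both channels, lowest synchronization risk
    if channels == {Channel.INTERNAL}:
        return "B"  # level two: internal network channel only
    return "C"      # level three: public network channel only


def pick_sync_channel(channels):
    """Prefer the internal network channel whenever it is available."""
    channels = set(channels)
    if not channels:
        raise ValueError("no data connection established between the pools")
    return Channel.INTERNAL if Channel.INTERNAL in channels else Channel.PUBLIC
```

For a level A connection, `pick_sync_channel` returns the internal network channel, matching the stated preference for the intranet channel during synchronization.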

203. When a data storage instruction is received, extract the terminal identifier from the data storage instruction, and locate the user's terminal based on the terminal identifier to obtain the first physical location of the user.

In the embodiment of the present application, the data storage instruction includes at least data to be stored and a terminal identifier of the user's terminal. The server of the data storage system can provide users with data storage services in the form of a client. A data storage entry is provided in the client, and users can download the client onto the terminals they hold. Further, to ensure that users can accurately retrieve their stored data later, the client can provide a registration service, assign a registration identifier to each registered user, and use this registration identifier as the terminal identifier. In this way, when it is detected that the user triggers the data storage entry, the data storage system determines that a data storage instruction has been received and obtains the terminal identifier of the terminal held by the user, which may specifically be the mobile phone number of the terminal and the user's registration identifier.

After the terminal identifier is obtained, the terminal can be located according to it to obtain the first physical location of the terminal. The terminal may be located by using GPRS (General Packet Radio Service). In practice, the GPRS function in the terminal's operating system can be called to locate the terminal and obtain its first physical location. The embodiment of the present application does not specifically limit the terminal positioning method.

204. Determine at least one storage node according to the first physical location, and determine at least one first distance between the at least one storage node and the first physical location.

In the embodiment of the present application, the at least one storage node is deployed in the storage resource pool indicated by the first physical location. After the first physical location of the terminal is determined, the area where the terminal is located can be determined from it. For example, if the first physical location is a certain district in Shanghai or a certain district in Shenzhen, the area to which it belongs is Shanghai or Shenzhen, respectively. Because the storage resource pool in each area includes one or more storage nodes, the at least one storage node in the area must be determined so that the best storage node can later be selected for the user.

The storage nodes are at different locations, and the closer a storage node is to the user, the faster data can be stored and queried, which reduces the time spent on data transmission. Therefore, at least one first distance between the at least one storage node and the first physical location of the terminal may be determined, and the storage node closest to the terminal is then chosen according to the at least one first distance to store the user's data to be stored. To obtain the first distance, the node position of each storage node is located first; the distance between that node position and the first physical location is then calculated as the first distance. Because the storage resource pool of an area includes at least one storage node, at least one first distance is obtained.
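The distance calculation in step 204 can be illustrated as follows. The application does not say how the distance is computed; this sketch assumes positions are (latitude, longitude) pairs in degrees and uses the great-circle (haversine) distance. The function names and the dictionary layout are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used by the haversine formula


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # min() guards against rounding pushing the square root slightly above 1
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def first_distances(first_location, node_positions):
    """Map each storage node id to its first distance from the user's location."""
    return {node_id: haversine_km(first_location, position)
            for node_id, position in node_positions.items()}
```

Any other metric (for example, road distance or network hop count) could be substituted for `haversine_km` without changing the rest of the flow.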

205. Sort the at least one first distance from large to small to generate a first ranking result, extract the last-ranked first distance from the first ranking result as the target distance, and determine the storage node indicated by the target distance as the target storage node.

In the embodiment of the present application, once the at least one first distance is obtained, the first distances are sorted from large to small to generate a first ranking result, and the last-ranked first distance, that is, the smallest first distance, is extracted from the first ranking result as the target distance. The storage node indicated by the target distance is determined as the target storage node. Alternatively, the at least one first distance can be sorted from small to large, with the first-ranked first distance used as the target distance to determine the target storage node.
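A minimal sketch of the selection in step 205, assuming the first distances are held in a dictionary keyed by a hypothetical node identifier. It sorts from large to small and takes the entry with the smallest first distance as the target.

```python
def select_target_node(first_distances):
    """Return (node_id, distance) for the smallest first distance.

    first_distances: dict mapping node id -> first distance.
    """
    if not first_distances:
        raise ValueError("at least one first distance is required")
    # Sort from large to small; the smallest distance ends up last.
    ranked = sorted(first_distances.items(), key=lambda item: item[1], reverse=True)
    return ranked[-1]
```

Sorting from small to large and taking the first entry, or simply calling `min` with the same key, selects the same distance; the sort is kept here only to mirror the wording of the step.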

206. Obtain the data to be stored from the data storage instruction, and store the data to be stored in the target storage node.

In the embodiment of the present application, after the target storage node is determined, the data to be stored is obtained from the data storage instruction and transmitted to the target storage node, and the target storage node stores it.

207. Determine a plurality of second distances between the target storage node and the other storage nodes in the at least one storage node, sort the plurality of second distances from large to small to generate a second ranking result, extract a preset number of last-ranked second distances from the second ranking result, and use the storage nodes indicated by those second distances as backup storage nodes.

In the embodiment of the present application, the other storage nodes are the storage nodes other than the target storage node among the at least one storage node. Data connections exist among the storage nodes in each storage resource pool. To ensure the security of the data stored in a storage node, multiple backup storage nodes can be determined for the target storage node, so that the data stored in the target storage node is synchronized to multiple backup storage nodes in the storage resource pool and backed up there. After the target storage node is determined, the second distances between the target storage node and each of the other storage nodes in the current storage resource pool are determined and sorted from large to small to generate the second ranking result. The preset number of last-ranked second distances, that is, the preset number of smallest second distances, are extracted; the storage nodes they indicate are used as backup storage nodes; and the data to be stored is then synchronized to the preset number of backup storage nodes for backup. The preset number can be two, so that the two storage nodes closest to the target storage node are selected as backup storage nodes. For example, if the target storage node is P1, then after the data to be stored is stored in P1, the storage nodes P2 and P3 closest to P1 are selected as backup storage nodes, and the data to be stored is subsequently transmitted to P2 and P3 for backup over the data channels between P1 and P2 and between P1 and P3.

In practice, because the data connections between the storage resource pool and the other storage resource pools were rated in step 202, the backup storage nodes can also be selected based on the rating results between the target storage node and the other storage nodes. For example, suppose P1 is the target storage node, the connection between P1 and P2 is rated level A with an average network delay of 100 ms, and the connection between P1 and P3 is rated level C with an average delay of 300 ms. Then F(A) + G(100) is used to calculate the network weight of P2, and F(C) + G(300) is used to calculate the network weight of P3. When determining which storage nodes to use as backup storage nodes, for each of the other storage nodes, a first weight for the second distance and a second weight for the network weight are obtained; the first product of the second distance and the first weight and the second product of the network weight and the second weight are calculated; and the sum of the first product and the second product is taken as the comprehensive distance of that storage node. After the comprehensive distances of all the other storage nodes are calculated, the preset number of storage nodes with the smallest comprehensive distances are used as backup storage nodes.

It should be noted that, because the data channels between storage resource pools and the data connections between storage nodes may change, new backup storage nodes can be determined periodically for each target storage node in the above manner, and the backup location of the data adjusted accordingly. For example, a new backup storage node can be determined for each target storage node every 30 days. The embodiment of the present application does not specifically limit the timing for updating the backup storage nodes of a target storage node.
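The comprehensive-distance selection of step 207 can be sketched as below. The application names the functions F and G but does not define them, nor the two weights. Everything concrete here is an assumption for illustration: `RATING_SCORE` stands in for F, dividing the delay by 100 stands in for G, and both weights default to 1.0.

```python
# Hypothetical stand-in for F: a lower score means a safer connection.
RATING_SCORE = {"A": 0.0, "B": 1.0, "C": 2.0}


def network_weight(rating, avg_delay_ms):
    """F(rating) + G(delay); G is assumed here to be delay in units of 100 ms."""
    return RATING_SCORE[rating] + avg_delay_ms / 100.0


def comprehensive_distance(second_distance, rating, avg_delay_ms, w1=1.0, w2=1.0):
    """First product (distance * w1) plus second product (network weight * w2)."""
    return second_distance * w1 + network_weight(rating, avg_delay_ms) * w2


def select_backup_nodes(candidates, preset_number=2, w1=1.0, w2=1.0):
    """Return the preset number of node ids with the smallest comprehensive distance.

    candidates: dict mapping node id -> (second_distance, rating, avg_delay_ms).
    """
    ranked = sorted(
        candidates,
        key=lambda node: comprehensive_distance(*candidates[node], w1=w1, w2=w2),
    )
    return ranked[:preset_number]
```

With the example figures from the text, `network_weight("A", 100)` gives 1.0 for P2 and `network_weight("C", 300)` gives 5.0 for P3 under these assumed definitions, so P2 is favored when the second distances are similar. In a real deployment the weights would need to reconcile the different units of distance and network weight.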

208. Transfer the data to be stored to the preset number of backup storage nodes, and perform data backup.

In the embodiment of the present application, after the preset number of backup storage nodes are determined for the target storage node, the data to be stored can be transmitted to those backup storage nodes and backed up there, thereby ensuring the security of the data stored in the target storage node.

In practice, the user may from time to time need to add to the data stored in the data storage system. For the process of adding data, see FIG. 2C. The process includes steps 209 to 211.

209. Receive a user's data addition instruction and obtain the user's second physical location. If the second physical location is the same as the first physical location, execute the following step 210; if the second physical location is different from the first physical location, execute the following step 211.

In the embodiment of the present application, the data addition instruction carries the data to be added. Because the user may later want to extend previously stored data with new data, the client provided by the data storage system can offer a data addition entry. When it is detected that the user triggers the data addition entry, the data addition page is displayed, the data to be added that the user uploads on that page is obtained, and it is determined that the user's data addition instruction has been received.

The user may have moved to another area because of a business trip, travel, or the like, so that the second physical location at which the user requests data addition differs from the first physical location at which the data was previously stored. In that case, the storage node closest to the user's second physical location may not be the target storage node used for the earlier storage; it may instead be any one of the backup storage nodes determined for the target storage node. Therefore, when a data addition instruction is received, the user's second physical location is obtained and compared with the first physical location to determine whether the two are the same. If the second physical location is the same as the first physical location, the data to be added can be added directly to the target storage node, that is, the following step 210 is performed. If the second physical location is different from the first physical location, the data needs to be added via a backup storage node, that is, the following step 211 is performed.

210. If the second physical location is the same as the first physical location, write the data to be added to the target storage node.

In the embodiment of the present application, if the second physical location from which the user requests data addition is the same as the first physical location used for the earlier data storage, the data to be added may be written directly to the target storage node. It should be noted that, after the data to be added is written to the target storage node, it must also be synchronized to the backup storage nodes, so that the data stored in the backup storage nodes remains consistent with the data in the target storage node.

211. If the second physical location is different from the first physical location, store the data to be added in a backup storage node and, based on the backup storage node, synchronize the data to be added to the target storage node. In the embodiment of the present application, if the second physical location is different from the first physical location and the storage node whose distance from the second physical location satisfies the distance standard is a backup storage node of the target storage node, the data to be added may be stored in that backup storage node and then synchronized from it to the target storage node. For example, suppose the target storage node is P1 and the backup storage nodes are P2 and P3. If the user's second physical location is closest to P2, then when the user's data addition instruction is received, the data to be added carried in the instruction is stored in P2 and synchronized from P2 to P1 and P3. It should be noted that after the data to be added is written to the target storage node, it must also be synchronized to the other backup storage nodes, so that the data stored in the backup storage nodes remains consistent with the data in the target storage node.

In addition, if the second physical location is different from the first physical location and is not in the same storage resource pool as the first physical location, that is, the node closest to the second physical location is not a backup storage node determined according to the target storage node, the process of determining the target storage node shown in steps 204 to 205 above is repeated to determine a temporary storage node based on the second physical location, and the data to be added is transmitted to the temporary storage node. Based on the temporary storage node, the data to be added is then synchronized to the target storage node and the preset number of backup storage nodes. The process of determining the temporary storage node is not repeated here. For example, suppose the target storage node is P1, the backup storage nodes are P2 and P3, and P1, P2, and P3 are all located in area A. If the user's second physical location is in area B and the storage node closest to the second physical location is P5, then P5 is used as the temporary storage node: the data to be added is transmitted to P5, transmitted from P5 to P1, and synchronized to P2 and P3.

In practical applications, the user needs to query the data stored in the data storage system from time to time in daily work. For the process of querying data, see FIG. 2D; the process includes steps 212 to 217.
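The three data-addition cases in steps 210 and 211 can be illustrated with a minimal Python sketch. The `Node` class and the `add_data` function are hypothetical names introduced only for illustration, and the sketch assumes that a temporary storage node merely relays the data without keeping a copy, which the description does not state explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a storage node."""
    name: str
    data: list = field(default_factory=list)


def add_data(item, nearest, target, backups):
    """Write `item` via the node nearest to the user and keep replicas consistent.

    Returns the role played by the nearest node: "target", "backup" or "temporary".
    """
    replicas = [target, *backups]
    if nearest is target:
        role = "target"        # step 210: same location, write to the target node
    elif any(nearest is b for b in backups):
        role = "backup"        # step 211: write to the nearby backup node first
    else:
        role = "temporary"     # step 211: nearest node is outside the replica set
    if role != "temporary":
        nearest.data.append(item)
    # Synchronize to every other replica so that the target node and all
    # backup nodes end up holding the same data.
    for node in replicas:
        if node is not nearest:
            node.data.append(item)
    return role
```

With P1 as the target node and P2 and P3 as backups, a write that arrives nearest to P2 returns "backup", and a write that arrives nearest to an unrelated node P5 returns "temporary"; in both cases P1, P2 and P3 hold the same data afterwards.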

212. Receive a user's data query request and obtain the user's query location. If the query location is consistent with the first physical location, perform step 213 below; if the query location is not consistent with the first physical location, perform step 214 below. In the embodiment of the present application, the data query request carries a data identifier of the data to be queried. Because the user may later want to query previously stored data, the client provided by the data storage system can provide a data query entry. When it is detected that the user triggers the data query entry, a data query page is displayed, the data identifier that the user enters on the data query page is obtained, and it is determined that a data query request of the user has been received. Because of business trips, travel, and the like, the user may not perform the data query at the location where the data was stored, so the storage node closest to the user's query location is not necessarily the target storage node used for data storage. Therefore, when a user's data query request is received, the query location is obtained and compared with the first physical location where the data was previously stored. If the query location is consistent with the first physical location, the data to be queried can be obtained directly from the target storage node, that is, step 213 below is performed; if the query location is not consistent with the first physical location, the storage node closest to the query location must be determined and the data query carried out based on that storage node, that is, step 214 below is performed.
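The branch made in step 212 can be sketched as follows. This is only an illustrative sketch: the function `query`, the dictionary layout of a node, and the `find_nearest` callback are assumptions introduced here, not names from the application.

```python
def query(data_id, query_location, first_location, target, find_nearest):
    """Return (value, path), where path lists the nodes the request passed through.

    `target` is a dict with "name" and "data" keys (a hypothetical layout), and
    `find_nearest` is a hypothetical callback mapping a location to a node.
    """
    if query_location == first_location:
        # Locations are consistent: read directly from the target storage node.
        return target["data"][data_id], [target["name"]]
    # Locations differ: the node nearest to the query location forwards the
    # data identifier to the target node and relays the result to the user.
    relay = find_nearest(query_location)
    value = target["data"][data_id]
    return value, [relay["name"], target["name"], relay["name"]]
```

The returned path makes the extra hop visible: a query from the original location touches only the target node, while a query from elsewhere passes through the nearest node, the target node, and the nearest node again.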

213. If the query location is consistent with the first physical location, obtain the data to be queried indicated by the data identifier from the target storage node, and return the data to be queried to the user. In the embodiment of the present application, if the query location is consistent with the first physical location, the data to be queried indicated by the data identifier is acquired from the target storage node and may be returned directly to the user. For example, suppose the user previously stored data in Shanghai and the target storage node is P1 in Shanghai. When a user's data query request is received, if it is detected that the user's current location is Shanghai, the data to be queried is obtained from P1 and returned to the user.

214. If the query location is inconsistent with the first physical location, repeat the above process of determining the target storage node to determine a query storage node based on the query location, transmit the data identifier to the target storage node based on the query storage node, receive, based on the query storage node, the data to be queried returned by the target storage node, and return the data to be queried to the user. In the embodiment of the present application, if the query location is inconsistent with the first physical location, the storage node closest to the current query location is not the target storage node. The storage node closest to the query location therefore needs to be determined as the query storage node, and the query storage node transmits the data identifier to the target storage node, so that the target storage node can obtain the data to be queried according to the data identifier and return it to the query storage node, which in turn returns it to the user. For example, suppose the user previously stored data in Shanghai and the target storage node is P1 in Shanghai. When the user's data query request is received, if it is detected that the user's current location is Shenzhen, the storage node P5 in Shenzhen that is closest to the user's current location is determined as the query storage node, the data identifier of the data to be queried is sent to P1 based on P5, and the data to be queried returned by P1 is received based on P5.

It should be noted that, because a user may perform data queries at the same query location multiple times, the data storage system can collect statistics on the user's data queries at the query location to generate a query rule for the user, and predict the user's next query behavior according to the query rule. The query can then be prepared before the user performs it, which effectively shortens the time the user spends on data query. Specifically, generating the user's query rule and predicting the user's query behavior can be achieved through the following steps 1 and 2.

Step 1: Collect statistics on the data queries performed by the user at the query location to generate a query rule of the user at the query location. The query rule includes at least the historical number of queries and the historical query time. If it is detected that the user does not perform the data query at the location where the data was previously stored, the query location of the current data query can be retained, and the user's subsequent data queries at that query location are counted continuously to generate the user's query rule at that query location. For example, if the user previously stored data in Shanghai, the target storage node is P1 in Shanghai, and it is detected that the user requested data queries 10 times in Shenzhen during the working day, the query rule generated for Shenzhen may be: 10 times, during the working day.

Step 2: Based on the query rule, predict the next query time of the user at the query location, and synchronize all the data in the target storage node to the query storage node at the next query time. In the embodiment of the present application, after the query rule is generated, the next query time of the user at the query location can be predicted based on the query rule, and all data in the target storage node is synchronized to the query storage node at the next query time, so that the next time the user performs a data query at this location, the query storage node can provide the data to be queried directly without requesting it from the target storage node, which reduces the time taken to query the data. For example, continuing the example in step 1, in which the query rule generated for Shenzhen is 10 times and during the working day, it can be determined that the storage node closest to the query location in Shenzhen is P5, and before the working day arrives, all the data in the target storage node P1 is synchronized to the query storage node P5.
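Steps 1 and 2 above can be sketched in Python as follows. The description does not say how the next query time is predicted, so the heuristic used here (the next occurrence of the hour at which the user has queried most often) is purely an assumption, and `build_query_rule`, `predict_next_query`, and `prefetch` are hypothetical names.

```python
from collections import Counter
from datetime import datetime, timedelta


def build_query_rule(history):
    """Step 1: summarize past queries at one location (count plus query times)."""
    return {"count": len(history), "times": sorted(history)}


def predict_next_query(rule, now):
    """Step 2 (assumed heuristic): next occurrence of the most frequent query hour."""
    if not rule["times"]:
        return None
    hour, _ = Counter(t.hour for t in rule["times"]).most_common(1)[0]
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # that hour has already passed today
    return candidate


def prefetch(target_data, query_node_data):
    """Copy all data from the target node so later reads are served locally."""
    query_node_data.update(target_data)
```

A scheduler, not shown here, would call `prefetch` at or shortly before the predicted time so that the query storage node already holds the data when the user's request arrives.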

215. Count the number of data queries performed by the user at the query location to generate the number of queries. If the number of queries is greater than the number threshold, perform step 216 below; if the number of queries is less than the number threshold, perform step 217 below. In the embodiment of the present application, it is considered that, because of relocation or a work transfer, the user may work and live for a long time in an area different from the area to which the first physical location used for data storage belongs. In this case, the user will request data queries in that other area. Therefore, the number of data queries performed by the user at the query location can be counted to generate the number of queries, and it is determined whether the number of queries is greater than the number threshold. If the number of queries is greater than the threshold, the data storage system can consider that the user will be at the query location for a long time, re-determine a new target storage node for the user at the query location, and migrate all the data stored by the user in the target storage node to the new target storage node, thereby keeping data queries efficient; that is, step 216 below is performed. If the number of queries is less than the threshold, the user's current data query operations do not meet the criteria for data migration, and the data storage system continues to provide the user with data storage, data addition, and data query services as before; that is, step 217 below is performed.

216. If the number of queries is greater than the number threshold, use the query storage node as the new target storage node, synchronize all data in the target storage node to the new target storage node, determine a preset number of new backup storage nodes based on the new target storage node, and back up all the data to the preset number of new backup storage nodes. In the embodiment of the present application, if the number of queries is greater than the threshold, all the data stored by the user in the target storage node can be migrated to the query location. The query storage node at the query location is used as the new target storage node, all data previously stored by the user in the target storage node is synchronized to the new target storage node, and the content shown in step 207 above is repeated to determine a preset number of new backup storage nodes based on the new target storage node and back up the data stored in the new target storage node to the new backup storage nodes. When a data query request of the user is subsequently received at the query location, the data to be queried can be obtained directly from the new target storage node. For example, suppose that the user previously stored data in Shanghai, the target storage node is P1 in Shanghai, and the number threshold is 100. If it is detected that the user has requested data queries in Shenzhen more than 100 times and the location of the data query requests is closest to P5 in Shenzhen, P5 can be used as the new target storage node, all data stored in P1 is copied to P5, and new backup storage nodes near P5 are determined to back up the data. It should be noted that after all data is migrated to the new target storage node, all data related to the user stored in the original target storage node can be deleted, and if a data addition request of the user is subsequently received, the data can be added directly based on the new target storage node.
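The threshold check of step 215 and the migration of step 216 can be sketched as follows. The function `migrate_if_frequent`, the dictionary layout of a node, and the `choose_backups` callback are hypothetical. The description only covers counts greater than or less than the threshold, so treating a count equal to the threshold as "do not migrate" is an assumption of this sketch.

```python
def migrate_if_frequent(query_count, threshold, target, query_node, choose_backups):
    """Return (active_target, new_backups) after applying the threshold check.

    Nodes are dicts with "name" and "data" keys (a hypothetical layout), and
    `choose_backups` is a hypothetical callback returning the new backup nodes.
    """
    if query_count <= threshold:
        # Not enough queries at this location: keep the current target node.
        return target, []
    # Promote the query storage node and copy all of the user's data to it.
    query_node["data"].update(target["data"])
    backups = choose_backups(query_node)
    for node in backups:
        node["data"].update(query_node["data"])
    # The description notes that the old copy may be deleted after migration.
    target["data"].clear()
    return query_node, backups
```

With a threshold of 100, the 101st query from Shenzhen would promote P5 to the new target node, copy the data from P1 to P5 and its new backups, and clear the user's data from P1.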

217. If the number of queries is less than the number threshold, continue to receive the user's data query requests and perform data query and data acquisition. In the embodiment of the present application, if the number of queries is less than the threshold, the conditions for migrating all of the user's data in the target storage node to the query location are not currently met. In this case, the data storage system continues to receive data query requests from the user and performs the data query and data acquisition operations based on the query storage node.

The data storage method provided in the embodiment of the present application can determine, according to the physical location from which the user requests data storage, the target storage node closest to that physical location to store the user's data to be stored, so that the data stored by users in different locations is stored in a distributed manner and the data of all users is not stored in a single storage cluster. As a result, when a user's data query request is received, the data that the user needs to query can be obtained quickly, the time consumed for querying the data is shortened, and the intelligence is better.

Further, as a specific implementation of the methods of FIGS. 2B to 2D, an embodiment of the present application provides a data storage device. As shown in FIG. 3A, the device includes a first receiving module 301, a positioning module 302, a first distance determining module 303, a first node determination module 304, and a first storage module 305. The first receiving module 301 is configured to receive a user's data storage instruction, where the data storage instruction includes at least data to be stored and a terminal identifier of the user's terminal. The positioning module 302 is configured to locate the user's terminal based on the terminal identifier and obtain a first physical location of the user. The first distance determining module 303 is configured to determine at least one storage node according to the first physical location, and determine at least one first distance between the at least one storage node and the first physical location, where the at least one storage node is deployed in a storage resource pool indicated by the first physical location. The first node determination module 304 is configured to extract a target distance from the at least one first distance, and determine a target storage node indicated by the target distance, where the target distance is a first distance among the at least one first distance that meets a first distance criterion. The first storage module 305 is configured to obtain the data to be stored in the data storage instruction, and store the data to be stored in the target storage node.

The data storage device provided in the embodiment of the present application can determine, according to the physical location from which the user requests data storage, the target storage node closest to that physical location to store the user's data to be stored, so that the data stored by users in different locations is stored in a distributed manner and the data of all users is not stored in a single storage cluster. As a result, when a user's data query request is received, the data that the user needs to query can be obtained quickly, the time consumed for querying the data is shortened, and the intelligence is better.

As shown in FIG. 3B, the first node determination module 304 includes a sorting sub-module 3041 and a determination sub-module 3042. The sorting sub-module 3041 is configured to sort the at least one first distance from large to small to generate a first sorting result. The determination sub-module 3042 is configured to extract a first-ranked first distance from the first sorting result as the target distance, and determine the storage node indicated by the target distance as the target storage node.

As shown in FIG. 3C, the device further includes a second distance determination module 306, a second node determination module 307, and a first backup module 308. The second distance determination module 306 is configured to determine a plurality of second distances between the target storage node and other storage nodes in the at least one storage node, where the other storage nodes are storage nodes other than the target node among the at least one storage node. The second node determination module 307 is configured to determine, according to the plurality of second distances, a preset number of backup storage nodes whose second distance meets the second distance standard. The first backup module 308 is configured to transmit the data to be stored to the preset number of backup storage nodes for data backup.

As shown in FIG. 3D, the device further includes a second receiving module 309, a second storage module 310, and a first synchronization module 311.
The second receiving module 309 is configured to receive a user's data addition instruction and acquire a second physical location of the user, where the data addition instruction carries data to be added. The second storage module 310 is configured to store the data to be added in a backup storage node if the second physical location is different from the first physical location and the storage node whose distance from the second physical location satisfies the distance standard is any one of the preset number of backup storage nodes. The first synchronization module 311 is configured to synchronize the data to be added to the target storage node based on the backup storage node.

As shown in FIG. 3E, the device further includes a location acquisition module 312, a first sending module 313, and a second sending module 314. The location acquisition module 312 is configured to receive a user's data query request and obtain the user's query location, where the data query request carries the data identifier of the data to be queried. The first sending module 313 is configured to send a first data acquisition request to the target storage node if the query location is consistent with the first physical location, where the first data acquisition request is used to instruct the target storage node to acquire the data to be queried indicated by the data identifier and return the data to be queried to the user. The second sending module 314 is configured to, if the query location is inconsistent with the first physical location, determine a query storage node according to the query location and send a second data acquisition request to the query storage node, where the second data acquisition request is used to instruct the query storage node to obtain the data to be queried through the target storage node and return the data to be queried to the user.

As shown in FIG. 3F, the device further includes a first statistics module 315 and a prediction module 316.
The first statistics module 315 is configured to collect statistics on data queries performed by the user at the query location to generate a query rule of the user at the query location, where the query rule includes at least the historical number of queries and the historical query time. The prediction module 316 is configured to predict, based on the query rule, the next query time of the user at the query location, and synchronize all data in the target storage node to the query storage node at the next query time.

As shown in FIG. 3G, the device further includes a second statistics module 317, a third node determination module 318, a second synchronization module 319, and a second backup module 320. The second statistics module 317 is configured to count the number of times the user performs data query at the query location to generate the number of queries. The third node determination module 318 is configured to use the query storage node as a new target storage node if the number of queries is greater than the number threshold. The second synchronization module 319 is configured to synchronize all data in the target storage node to the new target storage node. The second backup module 320 is configured to determine a preset number of new backup storage nodes based on the new target storage node, and back up all the data to the preset number of new backup storage nodes.

Correspondingly, an embodiment of the present application further provides a storage device that stores computer-readable instructions. When the computer-readable instructions are executed by a processor, the data storage method shown in FIG. 2B to FIG. 2D is implemented.

Based on the foregoing embodiments of the method shown in FIGS. 2B to 2D and the virtual device shown in FIGS. 3A to 3G, an embodiment of the present application further provides a physical device for data storage.
The physical device includes a storage device and a processor; the storage device is configured to store computer-readable instructions, and the processor is configured to execute the computer-readable instructions to implement the data storage method shown in FIG. 2B to FIG. 2D. From the description of the above embodiments, it can be clearly understood that the present application can be implemented by hardware, or by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of this application can be embodied in the form of a software product, which can be stored in a non-volatile readable storage medium (which can be a CD-ROM, a USB flash drive, a mobile hard disk, etc.) and which includes several computer-readable instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in each implementation scenario of this application.
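As an illustration of how such a software product might realize the node selection performed by modules 303 to 308, the following Python sketch picks a target storage node and its backup storage nodes. It rests on several assumptions that are not taken from the application: locations are (latitude, longitude) pairs, distance is the great-circle (haversine) distance, the first distance criterion selects the smallest first distance (in line with the abstract's "closest" node), and the second distance criterion selects the nodes nearest to the target. The function names are hypothetical.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pick_target(user_location, nodes):
    """Return the name of the node with the smallest first distance to the user."""
    return min(nodes, key=lambda name: haversine_km(user_location, nodes[name]))


def pick_backups(target, nodes, count):
    """Return the `count` nodes with the smallest second distances to the target."""
    others = [name for name in nodes if name != target]
    others.sort(key=lambda name: haversine_km(nodes[target], nodes[name]))
    return others[:count]
```

Here `nodes` maps a node name to its coordinates and stands in for the storage resource pool indicated by the first physical location; a real deployment would obtain both the pool and the user's location from the positioning step.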

Claims

A data storage method, comprising:

Receiving a user's data storage instruction, where the data storage instruction includes at least data to be stored and a terminal identifier of the user's terminal;

Locating the terminal of the user based on the terminal identifier, and acquiring a first physical location of the user;

Determining at least one storage node according to the first physical location, and determining at least one first distance between the at least one storage node and the first physical location, the at least one storage node being deployed in the storage resource pool indicated by the first physical location;

Extracting a target distance from the at least one first distance, and determining a target storage node indicated by the target distance, where the target distance is a first distance among the at least one first distance that meets a first distance criterion;

Acquiring the data to be stored in the data storage instruction, and storing the data to be stored to the target storage node.
The method according to claim 1, wherein extracting a target distance from the at least one first distance and determining a target storage node indicated by the target distance comprises:

Ranking the at least one first distance from large to small to generate a first ranking result;

A first distance ranked last in the first ranking result is extracted as the target distance, and a storage node indicated by the target distance is determined as the target storage node.
The method according to claim 1, further comprising:

Determining a plurality of second distances between the target storage node and other storage nodes in the at least one storage node, where the other storage nodes are storage nodes other than the target node among the at least one storage node;

Determining, according to the plurality of second distances, a preset number of backup storage nodes whose second distance meets the second distance criterion;

Transmitting the data to be stored to the preset number of backup storage nodes, and performing data backup.
The method according to claim 3, further comprising:

Receiving a user's data addition instruction to obtain a second physical location of the user, where the data addition instruction carries data to be added;

If the second physical location is different from the first physical location, and the storage node whose distance from the second physical location satisfies the distance criterion is any backup storage node of the preset number of backup storage nodes, storing the data to be added to the backup storage node;

Based on the backup storage node, the data to be added is synchronized to the target storage node.
The method according to claim 1, further comprising:

Receiving a data query request of the user, obtaining a query position of the user, and the data query request carrying a data identifier of data to be queried;

If the query location is consistent with the first physical location, sending a first data acquisition request to the target storage node, where the first data acquisition request is used to instruct the target storage node to acquire the data to be queried indicated by the data identifier, and return the data to be queried to the user;

If the query location is inconsistent with the first physical location, determining a query storage node according to the query location, and sending a second data acquisition request to the query storage node, where the second data acquisition request is used to instruct the query storage node to obtain the data to be queried through the target storage node and return the data to be queried to the user.
The method according to claim 5, further comprising:

Collecting statistics on data queries performed by the user at the query location, and generating a query rule of the user at the query location, where the query rule includes at least a historical number of queries and a historical query time;

Based on the query rule, predicting the next query time of the user at the query location, and synchronizing all data in the target storage node to the query storage node at the next query time.
The method according to claim 5, further comprising:

Counting the number of times that the user performs data query at the query location to generate the number of queries;

If the number of queries is greater than a threshold, the query storage node is used as a new target storage node;

Synchronizing all data in the target storage node to the new target storage node;

Based on the new target storage node, a preset number of new backup storage nodes is determined, and all the data is backed up to the preset number of new backup storage nodes.
A data storage device, comprising:

A first receiving module, configured to receive a user's data storage instruction, where the data storage instruction includes at least data to be stored and a terminal identifier of the user's terminal;

A positioning module, configured to locate the terminal of the user based on the terminal identifier, and obtain a first physical location of the user;

A first distance determining module, configured to determine at least one storage node according to the first physical location, and determine at least one first distance between the at least one storage node and the first physical location, the at least one storage node being deployed in a storage resource pool indicated by the first physical location;

A first node determining module, configured to extract a target distance from the at least one first distance, and determine a target storage node indicated by the target distance, where the target distance is a first distance among the at least one first distance that meets a first distance criterion;

A first storage module is configured to obtain the data to be stored in the data storage instruction, and store the data to be stored in the target storage node.
The apparatus according to claim 8, wherein the first node determining module comprises:

A sorting submodule, configured to sort the at least one first distance from large to small to generate a first sorting result;

A determining submodule, configured to extract a first-ranked first distance in the first ranking result as the target distance, and determine a storage node indicated by the target distance as the target storage node.
The apparatus according to claim 8, further comprising:

A second distance determining module, configured to determine a plurality of second distances between the target storage node and other storage nodes in the at least one storage node, where the other storage nodes are storage nodes other than the target node among the at least one storage node;

A second node determining module, configured to determine, according to the plurality of second distances, a preset number of backup storage nodes whose second distances meet a second distance criterion;

A first backup module is configured to transmit the data to be stored to the preset number of backup storage nodes, and perform data backup.
The apparatus according to claim 10, further comprising:

A second receiving module, configured to receive a user's data addition instruction and obtain a second physical location of the user, where the data addition instruction carries data to be added;

A second storage module, configured to store the data to be added to a backup storage node if the second physical location is different from the first physical location and the storage node whose distance from the second physical location meets the distance criterion is any backup storage node among the preset number of backup storage nodes;

A first synchronization module is configured to synchronize the data to be added to the target storage node based on the backup storage node.
The apparatus according to claim 8, further comprising:

A location acquisition module, configured to receive a data query request of the user, and obtain a query location of the user, where the data query request carries a data identifier of data to be queried;

A first sending module, configured to send a first data acquisition request to the target storage node if the query location is consistent with the first physical location, where the first data acquisition request is used to instruct the target storage node to acquire the data to be queried indicated by the data identifier, and return the data to be queried to the user;

A second sending module, configured to determine a query storage node according to the query location if the query location is inconsistent with the first physical location, and send a second data acquisition request to the query storage node, where the second data acquisition request is used to instruct the query storage node to obtain the data to be queried through the target storage node and return the data to be queried to the user.
The device according to claim 12, further comprising:

A first statistics module, configured to collect statistics on data queries performed by the user at the query location, and generate a query rule of the user at the query location, where the query rule includes at least a historical number of queries and a historical query time;

A prediction module is configured to predict the next query time of the user at the query location based on the query rule, and synchronize all data in the target storage node to the query storage node at the next query time.
The device according to claim 12, further comprising:

A second statistics module, configured to count the number of data queries performed by the user at the query location to generate the query times;

A third node determining module, configured to use the query storage node as a new target storage node if the number of query times is greater than a threshold value;

A second synchronization module, configured to synchronize all data in the target storage node to the new target storage node;

A second backup module is configured to determine a preset number of new backup storage nodes based on the new target storage node, and back up the entire data to the preset number of new backup storage nodes.
A computer device includes a memory and a processor. The memory stores computer-readable instructions, and the method of implementing data storage when the processor executes the computer-readable instructions includes receiving data storage of a user. An instruction, the data storage instruction includes at least data to be stored and a terminal identifier of the user's terminal; positioning the user's terminal based on the terminal identifier to obtain a first physical location of the user; A first physical location, determining at least one storage node, and determining at least a first distance between the at least one storage node and the first physical location, the at least one storage node being deployed at the first physical location indication A storage resource pool of the storage device; extracting a target distance from the at least one first distance, and determining a target storage node indicated by the target distance, the target distance being the first of the at least one first distance that meets a first distance criterion A distance; acquiring the data to be stored in the data storage instruction, and storing the data to be stored Storage to the target storage node.
The computer device according to claim 15, wherein, when the processor executes the computer-readable instructions, extracting a target distance from the at least one first distance and determining a target storage node indicated by the target distance includes: sorting the at least one first distance from large to small to generate a first ranking result; and extracting the first distance ranked last in the first ranking result as the target distance, and determining the storage node indicated by the target distance as the target storage node.
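A minimal sketch of this ranking step, assuming the aim is the nearest node as the abstract states: the first distances are sorted from large to small, so the smallest distance is the entry ranked last. The function name `pick_target` and the dictionary input are hypothetical.

```python
def pick_target(first_distances):
    """first_distances maps a storage node name to its first distance.

    Returns (node name, target distance) for the nearest node.
    """
    if not first_distances:
        raise ValueError("at least one first distance is required")
    # First ranking result: distances ordered from large to small.
    ranking = sorted(first_distances.items(), key=lambda kv: kv[1], reverse=True)
    # The entry ranked last carries the smallest distance.
    return ranking[-1]
```

For example, `pick_target({"a": 30.0, "b": 5.0, "c": 12.5})` returns `("b", 5.0)`.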
The computer device according to claim 15, wherein the method further comprises: determining a plurality of second distances between the target storage node and other storage nodes in the at least one storage node, the other storage nodes being storage nodes other than the target storage node among the at least one storage node; determining, according to the plurality of second distances, a preset number of backup storage nodes whose second distance satisfies a second distance criterion; and transmitting the data to be stored to the preset number of backup storage nodes, and performing data backup.
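The backup selection can be sketched as follows. The claim leaves the second distance criterion open; this sketch assumes it means "the nodes nearest to the target storage node", and it uses planar coordinates for brevity. The names `pick_backups`, `back_up`, `positions`, and `stores` are hypothetical.

```python
import math


def pick_backups(target, positions, preset_number):
    """Return the preset number of nodes nearest to the target node.

    positions maps a node name to (x, y) coordinates (assumed planar layout).
    """
    tx, ty = positions[target]
    # Second distances: from the target node to every other node.
    second = {
        name: math.hypot(x - tx, y - ty)
        for name, (x, y) in positions.items()
        if name != target
    }
    ranked = sorted(second, key=second.get)  # nearest first
    return ranked[:preset_number]


def back_up(data, backups, stores):
    """Transmit the data to be stored to each backup storage node."""
    for name in backups:
        stores.setdefault(name, []).append(data)
```

A different criterion, such as preferring distant nodes for disaster tolerance, would only change the sort order.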
A computer non-volatile readable storage medium having computer-readable instructions stored thereon, characterized in that, when the computer-readable instructions are executed by a processor, a data storage method is implemented that includes: receiving a data storage instruction of a user, the data storage instruction including at least data to be stored and a terminal identifier of the user's terminal; positioning the user's terminal based on the terminal identifier to obtain a first physical location of the user; determining at least one storage node according to the first physical location, and determining at least one first distance between the at least one storage node and the first physical location, the at least one storage node being deployed in a storage resource pool indicated by the first physical location; extracting a target distance from the at least one first distance, and determining a target storage node indicated by the target distance, the target distance being a first distance among the at least one first distance that meets a first distance criterion; and obtaining the data to be stored from the data storage instruction, and storing the data to be stored in the target storage node.
The computer non-volatile readable storage medium according to claim 18, wherein, when the computer-readable instructions are executed by a processor, extracting the target distance from the at least one first distance and determining the target storage node indicated by the target distance includes: sorting the at least one first distance from large to small to generate a first ranking result; and extracting the first distance ranked last in the first ranking result as the target distance, and determining the storage node indicated by the target distance as the target storage node.
The computer non-volatile readable storage medium according to claim 18, wherein the method further comprises: determining a plurality of second distances between the target storage node and other storage nodes in the at least one storage node, the other storage nodes being storage nodes other than the target storage node among the at least one storage node; determining, based on the plurality of second distances, a preset number of backup storage nodes whose second distance meets a second distance criterion; and transmitting the data to be stored to the preset number of backup storage nodes, and performing data backup.