WO2024000987A1 - Data storage method, server and storage medium - Google Patents

Data storage method, server and storage medium

Info

Publication number
WO2024000987A1
Authority
WO
WIPO (PCT)
Prior art keywords
bitmap
participating
users
projects
server
Prior art date
Application number
PCT/CN2022/130421
Other languages
French (fr)
Chinese (zh)
Inventor
赵阳
王大飞
吴泽勇
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司 filed Critical 深圳前海微众银行股份有限公司
Publication of WO2024000987A1 publication Critical patent/WO2024000987A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management

Definitions

  • the present application relates to the field of computers, and in particular, to a data storage method, server and storage medium.
  • the data management backend can use a bitmap algorithm to record marketing activity information.
  • the data management backend can create a bitmap for each right.
  • the data management backend can set the size of each bitmap according to the number of system users. Each system user can correspond to one bit in a bitmap.
  • the data management backend can set the value of the bit corresponding to user A in the bitmap corresponding to right B to 1 to record that user A has acquired right B.
  • the size of each bitmap in this bitmap algorithm usually needs to be set according to the number of system users.
  • when the number of system users is very large but the number of users actually participating in the marketing activity is relatively small, a large amount of data space in each bitmap is easily wasted, and data storage efficiency is low.
  • This application provides a data storage method, server and storage medium to solve the problem that, when the number of users actually participating in a marketing activity is relatively small, a large amount of data space in each bitmap is easily wasted and data storage efficiency is low.
  • this application provides a data storage method, including:
  • each project in the current activity corresponds to a bitmap
  • the target number of shards for each bitmap is determined based on the number of system users, the estimated number of participating users, and the number of allowed participating projects;
  • the bitmap is fragmented according to the target number of fragments, and the fragmented bitmap is used to record the participation information of the system user in the current activity.
  • determining the target number of shards for each bitmap based on the number of system users, the estimated number of participating users, and the number of allowed participating projects specifically includes:
  • calculating the bitmap space occupancy corresponding to each number of fragments within the range specifically includes:
  • the bitmap space occupation corresponding to the number of slices is determined.
  • determining the bitmap space occupancy corresponding to the number of slices based on the number of system users, the estimated number of participating users, the number of allowed participating projects, and the number of slices specifically includes: determining the total length of keywords for each bitmap according to the number of allowed participating projects, the number of slices and the fixed length of keywords, wherein each slice corresponds to one keyword;
  • the bitmap space occupancy is determined based on the total length of the keywords of each bitmap, the space occupancy of all the slices of each bitmap, and the number of allowed participating items.
  • recording the system user's participation information in the current activity specifically includes:
  • the value of the system user in the slice of the bitmap is modified from the first data to the second data; when the number of participating projects is greater than or equal to the number of allowed participating projects, the system user is denied further participation in the current activity.
  • the bitmap is fragmented according to the target number of fragments, and the fragmented bitmap is used to record the participation information of the system user in the current activity.
  • the method also includes:
  • the key-value pair is used to record the participation information of the system user in the current activity.
  • the method also includes:
  • this application provides a data storage device, including:
  • the acquisition module is used to obtain the number of system users, the estimated number of participating users in the current activity, and the number of projects allowed to participate in the current activity; where each project in the current activity corresponds to a bitmap;
  • a processing module configured to determine the target number of shards for each bitmap based on the number of system users, the estimated number of participating users, and the number of allowed participating projects when the ratio between the number of system users and the estimated number of participating users is greater than a preset threshold.
  • the processing module is specifically used for:
  • the processing module is specifically used for:
  • the bitmap space occupation corresponding to the number of slices is determined.
  • the processing module is specifically used for:
  • the bitmap space occupancy is determined based on the total length of the keywords of each bitmap, the space occupancy of all the slices of each bitmap, and the number of allowed participating items.
  • the processing module is specifically used for:
  • the bitmap is fragmented according to the target number of fragments, and the fragmented bitmap is used to record the participation information of the system user in the current activity.
  • the processing module is also used for:
  • the key-value pair is used to record the participation information of the system user in the current activity.
  • processing module is also used to:
  • this application provides a server, including: a memory and a processor;
  • the memory is used to store a computer program; the processor is used to execute the data storage method in the first aspect and any possible design of the first aspect according to the computer program stored in the memory.
  • the present application provides a computer-readable storage medium.
  • a computer program is stored in the computer-readable storage medium.
  • the server executes the data storage method in the first aspect and any possible design of the first aspect.
  • the present application provides a computer program product.
  • the computer program product includes a computer program.
  • the server executes the data storage method in the first aspect and any possible design of the first aspect.
  • the data storage method provided by this application is to obtain the number of system users; determine the estimated number of participating users based on the current activity; obtain the number of projects allowed to participate in the current activity, each project in the current activity corresponds to a bitmap; calculate the number of system users The ratio to the estimated number of participating users; use this ratio to compare with the preset threshold; when the ratio is greater than the preset threshold, use a bitmap to store the participation information of the current activity; according to the number of system users, the estimated number of participating users and the number of projects allowed to participate in, calculate the target number of shards for each bitmap; fragment the bitmap according to the number of target shards, and use the fragmented bitmap to record the system user's participation information in the current activity , to achieve the effect of improving the utilization of storage space, avoiding too concentrated access to storage space, and improving the efficiency of data storage and reading.
  • Figure 1 is a schematic diagram of a bitmap provided by an embodiment of the present application.
  • Figure 2 is a schematic bitmap diagram of a right provided by an embodiment of the present application.
  • Figure 3 is a schematic diagram of a scenario for issuing activity rights according to an embodiment of the present application.
  • Figure 4 is a flow chart of a data storage method provided by an embodiment of the present application.
  • Figure 5 is a flow chart of a data storage method provided by an embodiment of the present application.
  • Figure 6 is a flow chart of a data storage method provided by an embodiment of the present application.
  • Figure 7 is a flow chart of a data storage process provided by an embodiment of the present application.
  • Figure 8 is a schematic structural diagram of a data storage device provided by an embodiment of the present application.
  • Figure 9 is a schematic diagram of the hardware structure of a server provided by an embodiment of the present application.
  • first, second, third, fourth, etc. in the description and claims of this application and the above-mentioned drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate.
  • first information may also be called second information, and similarly, the second information may also be called first information.
  • a data management backend can be set up in the server.
  • the data management backend can record information on the number of users participating in the activity, the rights issuance of the activity, the number of rights obtained by each user, and other dimensions.
  • the record of the number of rights and interests obtained by each user can also be verified to verify the maximum number of rights and interests that each user can obtain in this activity, thereby ensuring that the number of rights and interests obtained by the user in this activity is less than the preset upper limit.
  • the server can use a redis cluster to store the number of rights each user has obtained.
  • the server can query the number of rights the user has obtained stored in redis, and perform verification based on the number of rights. If the number of rights is less than the preset upper limit, the server can send the rights to the user. Otherwise, the user will no longer be able to obtain this benefit.
  • the server can use key-value pairs to store the number of rights and interests that each user has obtained in redis.
  • each user needs a key storage space and a value storage space.
  • the key may include the user's user ID and other information used to uniquely identify the user.
  • the length of Key is usually fixed.
  • a key occupies about 50 bytes of storage space.
  • the number of users participating in an activity is usually proportional to the space required to store user participation information.
  • the space occupied by the keys of all users can reach:
  • a bitmap can better save storage space when facing a large number of users.
  • each user needs to correspond to one bit (1bit) of storage space.
  • the 1-bit storage space can be set with a value of 0 or 1.
  • the 1-bit storage space can be used to indicate whether the user has obtained the right.
  • the server can allocate 100 million bits of storage space to store whether the 100 million users have obtained the rights.
  • each grid can correspond to 1 bit. Among them, the last grid corresponds to the 100 millionth bit.
  • the storage space occupied by this bitmap is:
  • the storage space required for a bitmap is 12 MB.
  • N bitmaps can be set in the server. That is, the space they occupy is 12N MB.
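  • The comparison between the two layouts can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the roughly 50-byte key and the 100 million users mentioned above; the function names are illustrative and do not come from the application, and value storage for the key-value layout is ignored.

```python
KEY_BYTES = 50  # approximate size of one key, as stated in the text


def key_value_key_bytes(participants: int, key_bytes: int = KEY_BYTES) -> int:
    """Bytes used by keys alone when every participating user gets one key."""
    return participants * key_bytes


def bitmap_bytes(system_users: int) -> int:
    """Bytes used by one unsharded bitmap with one bit per system user."""
    return (system_users + 7) // 8  # round up to whole bytes


users = 100_000_000
# One bitmap: 12,500,000 bytes, i.e. roughly 12 MB.
print(bitmap_bytes(users))
# Keys for 100 million users: 5,000,000,000 bytes, before counting any values.
print(key_value_key_bytes(users))
# Keys when only one user participates: 50 bytes.
print(key_value_key_bytes(1))
```

  • With N rights, the bitmap figure is simply multiplied by N, which matches the 12N MB stated above.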
  • the first coupon, the second coupon and the third coupon respectively correspond to 3 bitmaps of three rights.
  • the values of the grids with an offset value of 1 in the three bitmaps are all 1.
  • two of the three grids with an offset value of 4 in the three bitmaps have a value of 1.
  • the key-value data structure can save more space.
  • the size of the bitmap needs to be set according to the user ID. That is, the memory occupied by the bitmap is determined by the maximum offset. Even if the bitmap only stores information about one user, it will still take up a lot of space if the user ID is large. For example, when the 100 millionth user participates in the activity, as shown in Figure 1, the size of the bitmap is 12 MB. In the extreme case where only the 100 millionth user participates in the activity, almost all of the 12 MB of storage space is wasted.
  • the key-value data structure is different.
  • the server can create a key-value pair corresponding to a user when he or she participates in an activity. That is, when only the 100 millionth user participates in the activity, the server only needs to use 50 bytes of storage space. It can be seen that the number of users in the system is very large, but the number of users who actually participated in the marketing activity is relatively small, which easily leads to the waste of a large amount of data space in each bitmap and the problem of low data storage efficiency.
  • this application proposes a data storage method based on bitmap sharding.
  • the server can divide a bitmap into L slices, each covering (maximum offset)/L bits.
  • the participation information of users 1 to (maximum offset)/L is stored in the first shard, that of users (maximum offset)/L + 1 to 2 × (maximum offset)/L is stored in the second shard, and so on. For example, when the number of system users is 100 million and only the 100 millionth user participates in the activity, only the bit corresponding to the maximum offset of the last shard is used, and the memory of other shards will not be occupied.
  • the actual storage space used is only 12/L MB. In this way, even a large offset wastes the space of only one shard.
  • the use of this sharding method greatly improves memory utilization.
  • the use of this sharding method can also achieve load balancing, thereby avoiding excessive concentration of requests to access storage space, resulting in reduced data processing efficiency.
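  • The memory effect of sharding can be illustrated with an in-memory stand-in for the sharded bitmap. This is a sketch only: the class name, the 1-based user numbering and the lazy allocation of shards are assumptions made for illustration, and no redis server is involved.

```python
from typing import Dict, Tuple


class ShardedBitmap:
    """In-memory model of one bitmap split into L fixed-size shards.

    A shard is created only when one of its bits is set, and it grows only
    up to the highest bit set in it.
    """

    def __init__(self, max_offset: int, shard_count: int) -> None:
        # Bits per shard = (maximum offset) / L, rounded up.
        self.shard_bits = -(-max_offset // shard_count)
        self.shards: Dict[int, bytearray] = {}

    def _locate(self, user_no: int) -> Tuple[int, int]:
        index = user_no - 1  # users are numbered from 1 in the text
        return index // self.shard_bits, index % self.shard_bits

    def set(self, user_no: int) -> None:
        shard_no, offset = self._locate(user_no)
        shard = self.shards.setdefault(shard_no, bytearray())
        byte_index = offset // 8
        if len(shard) <= byte_index:
            shard.extend(bytes(byte_index + 1 - len(shard)))  # zero-fill
        shard[byte_index] |= 1 << (7 - offset % 8)

    def get(self, user_no: int) -> bool:
        shard_no, offset = self._locate(user_no)
        shard = self.shards.get(shard_no)
        byte_index = offset // 8
        if shard is None or len(shard) <= byte_index:
            return False
        return bool(shard[byte_index] & (1 << (7 - offset % 8)))

    def used_bytes(self) -> int:
        return sum(len(shard) for shard in self.shards.values())


bitmap = ShardedBitmap(max_offset=100_000_000, shard_count=100)
bitmap.set(100_000_000)     # only the 100 millionth user participates
print(bitmap.used_bytes())  # bytes held by the single shard that was touched
```

  • With L = 100, only the last shard is allocated, so the space used is about one hundredth of the unsharded bitmap, in line with the 12/L MB figure above.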
  • after sharding the bitmap, the server also needs to set a key in redis for each shard.
  • the use of this key allows the server to quickly find the corresponding shards.
  • the value corresponding to the key stores a slice of a bitmap. Therefore, the more shards there are, the more redis keys will be, and the keys will occupy more memory.
  • the existing limit on the number of shards is only related to the number of redis cluster nodes: the number of shards only needs to exceed the number of redis cluster nodes so that requests are not concentrated on a single node. Obviously, a number of shards chosen this way cannot be considered optimal.
  • the server can determine the target number of shards for each bitmap based on the number of system users, the estimated number of participating users, and the number of projects allowed to participate.
  • the target number of shards is the optimal number of shards.
  • the server can fragment the bitmap according to the target number of fragments, thereby recording the user's participation information.
  • each fragment can store the information of 2^20 ≈ 1.05 million users.
  • the server can divide a bitmap into 95 slices.
  • the server can use the redis command SETBIT(key, offset, value) to set the activity participation information when the system user's activity participation information changes.
  • the server can calculate the slice number (sliceNo) where the system user is located.
  • the calculation formula is:
  • sliceNo 0.
  • the server can calculate the offset of the system user on its corresponding shard.
  • the calculation formula is:
  • the server can determine the fragment sequence number and offset based on the above two formulas. According to the slice sequence number and offset, the above redis instruction can become: SETBIT(key+sliceNo,offset,value).
  • the command is: SETBIT(key+95,385280,1). This command sets the bit at offset 385280 in shard number 95 to record the status of the 100 millionth user.
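  • The two formulas are not reproduced in this text. One reading that is consistent with the worked example (shard 95, offset 385280 for user 100000000 with 2^20 users per shard) is integer division for the slice number and the remainder for the offset. The sketch below assumes that reading; the function names and the way the slice number is appended to the key are illustrative.

```python
from typing import Tuple

SLICE_BITS = 2 ** 20  # users per shard, as in the example above


def locate(user_id: int, slice_bits: int = SLICE_BITS) -> Tuple[int, int]:
    """Return (sliceNo, offset) for a numeric user ID."""
    slice_no = user_id // slice_bits  # which shard the user falls in
    offset = user_id % slice_bits     # bit position inside that shard
    return slice_no, offset


def shard_key(base_key: str, slice_no: int) -> str:
    """Append the slice number to the bitmap's base key (format assumed)."""
    return f"{base_key}{slice_no}"


slice_no, offset = locate(100_000_000)
print(f"SETBIT {shard_key('key', slice_no)} {offset} 1")  # SETBIT key95 385280 1
```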
  • the amount of storage space occupied can be:
  • bitmap can effectively save the storage space of the bitmap when the number of event participants is small.
  • Figure 3 shows a schematic diagram of a scenario for issuing activity rights according to an embodiment of the present application.
  • the server can be associated with third-party channels, the application's product trading functions, and the application's customer behavior.
  • when the server detects that a user clicks to view marketing activities on a third-party channel, the server can issue rights to the user.
  • when the server detects that a user completes a product transaction in the application, the server can issue rights to the user.
  • when the server detects that the user performs corresponding behavioral operations in the application, the server can issue rights to the user.
  • when the server determines that it needs to issue rights to the user, it calls the coupon issuance interface in MES-SERVICE to issue the rights.
  • the server can query the redis cluster for the number of rights and interests that the user has obtained.
  • the server can call MES-SERVICE to verify the number of rights. If the user's number of rights is less than the preset upper limit, the server can issue the rights to the user.
  • the server can also write the rights issuance record into the flow table of the mes database (mesdb).
  • the server can update the number of rights and interests that the user has obtained in redis.
  • the server can also use redis synchronization rules to synchronize this information in the redis cluster.
  • the server is used as the execution subject to execute the data storage method in the following embodiment.
  • the execution subject may be a hardware device of the server, or a software application in the server that implements the following embodiments, or a computer-readable storage medium installed with a software application that implements the following embodiments, or the code of a software application that implements the following embodiments.
  • Figure 4 shows a flow chart of a data storage method provided by an embodiment of the present application. Based on the embodiment shown in Figure 3, as shown in Figure 4, with the server as the execution subject, the method of this embodiment may include the following steps:
  • each project in the current activity corresponds to a bitmap.
  • the server can obtain the number of system users.
  • the number of users of the system can be the number of users who have opened cards in the bank.
  • the number of system users can be the number of users who have completed registration in the application.
  • the server can also determine the estimated number of participating users based on current activity.
  • the estimated number of participating users may be a number estimated by the server based on the historical number of participants of the event.
  • the estimated number of participating users may also be the number of users who meet the participation rules of the current activity and are filtered by the server based on the participation conditions of the current activity.
  • the server can also obtain the number of projects allowed to participate in the current activity.
  • the number of allowed participating projects is the number of projects that each participating user is allowed to take part in within the current activity.
  • the current activity may include 20 projects.
  • each user can participate in up to 6 of them, so the number of allowed participating projects is 6.
  • the current activity includes 5 rights issuance channels, and each rights issuance channel distributes rights 6 times.
  • each user can obtain up to 8 rights. In this case, the number of allowed participating projects is 8.
  • the server can determine the ratio between the estimated number of participating users and the number of system users by dividing the estimated number of participating users by the number of system users. When the number of system users remains unchanged, the greater the estimated number of participating users, the greater the ratio. The server can therefore compare this ratio with a preset threshold. When the ratio is greater than the preset threshold, the estimated number of participating users is large, and the server can use a bitmap to store the participation information of the current activity.
  • otherwise, the server can store the participation information of the current activity in the form of key-value pairs. This process may be specifically shown in step S304 in Figure 6.
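  • The choice between the two storage forms can be sketched as a single comparison. The function name and the threshold value used in the example calls are hypothetical; the application only states that a preset threshold exists.

```python
def choose_storage(system_users: int, estimated_participants: int,
                   threshold: float) -> str:
    """Pick a storage form from the participation ratio described above."""
    if system_users <= 0:
        raise ValueError("system_users must be positive")
    # Ratio = estimated participating users / system users.
    ratio = estimated_participants / system_users
    return "bitmap" if ratio > threshold else "key-value"


# Hypothetical threshold of 1%.
print(choose_storage(100_000_000, 50_000_000, threshold=0.01))  # bitmap
print(choose_storage(100_000_000, 1_000, threshold=0.01))       # key-value
```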
  • the server can calculate the target number of shards for each bitmap based on the number of system users, the estimated number of participating users, and the number of allowed participating projects.
  • the server can fragment each bitmap according to the target number of fragments, thereby saving storage space. In this process, the specific process of the server calculating the number of target shards can be shown in Figure 5.
  • the server may fragment each bitmap according to the target number of fragments calculated in step S102.
  • the server can store the fragmented bitmap using key-value pairs.
  • key is used to store the sharding information of the shard.
  • the fragmentation information includes at least a bitmap sequence number and a fragmentation sequence number. For example, when three bitmaps are included and the target number of fragments is 10, the key with bitmap number 1 and fragment number 5 is used to store the bitmap information of the fifth fragment in the first bitmap. Here, value is used to store the bitmap information of the fragment.
  • the maximum storage space occupied by this shard is the ratio of the number of system users to the number of target shards.
  • the bitmap information of each shard includes the participation information of the number of system users/the number of target shards (bits).
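  • The key layout for the fragmented bitmaps can be sketched as follows. The colon-separated format, the 1-based numbering and the function names are assumptions for illustration; the text only requires that the key carry at least the bitmap sequence number and the fragment sequence number.

```python
from typing import List


def bitmap_shard_key(activity_no: str, bitmap_no: int, shard_no: int) -> str:
    """Build the key of one fragment (hypothetical format)."""
    return f"{activity_no}:{bitmap_no}:{shard_no}"


def shard_capacity_bits(system_users: int, target_shards: int) -> int:
    """Maximum number of bits per fragment: system users / target shards."""
    return -(-system_users // target_shards)  # integer ceiling


def all_shard_keys(activity_no: str, bitmap_count: int,
                   target_shards: int) -> List[str]:
    """List every key needed for all fragments of all bitmaps."""
    return [bitmap_shard_key(activity_no, b, s)
            for b in range(1, bitmap_count + 1)
            for s in range(1, target_shards + 1)]


keys = all_shard_keys("act", bitmap_count=3, target_shards=10)
print(len(keys))                                # 30
print(bitmap_shard_key("act", 1, 5))            # act:1:5
print(shard_capacity_bits(100_000_000, 10))     # 10000000
```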
  • the specific steps may include:
  • Step 1 The server determines the number of participating projects of the system user based on the system user's value in each bitmap slice.
  • the server can determine the fragment sequence number and offset corresponding to the system user in each bitmap based on the user ID of the system user.
  • the user ID is a numerical code used to uniquely identify a system user. For example, when including 100 million system users, the user ID of the 100 millionth system user could be 100000000.
  • each shard is used to record the participation information of 1 million system users. Therefore, each slice can contain 1 million bits.
  • with 1 million bits per shard, the shard number of the 100 millionth user is 100, and the offset is 1 million. That is, the 100 millionth user can be stored in the 1 millionth bit of the 100th shard.
  • the server can obtain the value corresponding to the system user in each bitmap.
  • the server can determine the number of projects in which the system user participates based on these values. In the bitmap, a user's value is set to the second data when the user participates in a project, so the server can directly count the occurrences of the second data. Because each bit can only be 0 or 1 and the second data is 1, the server can obtain the number of participating projects of the system user by summing the bits.
  • Step 2 When the number of participating projects is less than the number of allowed participating projects, the server modifies the system user's value in the slice of the bitmap from the first data to the second data. When the number of participating projects is equal to the number of allowed participating projects, the server refuses to let the system user participate in the project.
  • the server can compare the number of projects the system user participates in with the number of allowed participating projects. Because the number of allowed participating projects is the same as the number of bitmaps, the calculated maximum number of participating projects is equal to the number of allowed participating projects.
  • when the number of participating projects is equal to the number of allowed participating projects, the system user has obtained all the rights allowed and has participated in every project that can be participated in within the current activity. The server therefore refuses to let the system user continue participating in the current activity and will no longer issue new rights to this system user.
  • when the number of participating projects is less than the number of allowed participating projects, the server can modify the value in one bitmap from the first data to the second data. For example, when 5 bitmaps are included and the user has obtained 2 rights, the value in 2 of the 5 bitmaps is the second data. The server can then randomly select one of the three other bitmaps whose value is the first data, and modify the value corresponding to the system user from the first data to the second data. Each bit in the bitmap can take two values, 0 and 1. Normally the first data is 0 and the second data is 1. The first data means that the user has not obtained the right, and the second data means that the user has obtained the right.
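  • Steps 1 and 2 can be sketched on plain in-memory bitmaps. This is an illustration under simplifying assumptions: sharding is left out, the first data is 0 and the second data is 1, and the first free bitmap is chosen instead of a random one so that the result is deterministic. The function names are not from the application.

```python
from typing import List, Optional


def get_bit(bitmap: bytearray, offset: int) -> int:
    """Read one bit, most significant bit first within each byte."""
    return (bitmap[offset // 8] >> (7 - offset % 8)) & 1


def set_bit(bitmap: bytearray, offset: int) -> None:
    """Set one bit to 1 (the second data)."""
    bitmap[offset // 8] |= 1 << (7 - offset % 8)


def record_participation(bitmaps: List[bytearray], offset: int,
                         allowed_projects: int) -> Optional[int]:
    """Return the index of the bitmap that was updated, or None if refused."""
    # Step 1: count the projects the user already participates in.
    participated = sum(get_bit(bitmap, offset) for bitmap in bitmaps)
    # Step 2: refuse once the allowed number has been reached.
    if participated >= allowed_projects:
        return None
    for project_no, bitmap in enumerate(bitmaps):
        if get_bit(bitmap, offset) == 0:  # still the first data
            set_bit(bitmap, offset)
            return project_no
    return None


bitmaps = [bytearray(2) for _ in range(5)]   # 5 projects, 16 users each
print(record_participation(bitmaps, 9, 2))   # 0
print(record_participation(bitmaps, 9, 2))   # 1
print(record_participation(bitmaps, 9, 2))   # None
```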
  • the server can obtain the number of system users.
  • the server can also determine the estimated number of participating users based on current activity.
  • the server can also obtain the number of projects allowed to participate in the current activity.
  • Each item in the current activity corresponds to a bitmap.
  • the server can calculate the ratio of the number of system users to the estimated number of participating users.
  • the server can use this ratio to compare to a preset threshold. When the ratio is greater than the preset threshold, the server can use a bitmap to store the participation information of the current activity.
  • the server can calculate the target number of shards for each bitmap based on the number of system users, the estimated number of participating users, and the number of projects allowed to participate.
  • the server can fragment the bitmap according to the target number of fragments, and use the fragmented bitmap to record the participation information of system users in the current activity.
  • the sharded storage of the bitmap is realized, which improves the utilization of storage space, avoids overly concentrated access to the storage space, and improves the efficiency of data storage and reading.
  • Figure 5 shows a flow chart of a data storage method provided by an embodiment of the present application. Based on the embodiments shown in Figures 3 and 4, as shown in Figure 5, with the server as the execution subject, the calculation process of the target number of shards in this embodiment may include the following steps:
  • S201 Determine the range of the number of shards based on the estimated number of participating users.
  • the server can determine the maximum number of shards based on the estimated number of participating users.
  • the maximum number of shards is the estimated number of participating users. For example, when there are 100 system users and the estimated number of participating users is 10, the maximum number of shards is 10. The minimum number of shards is 1. When a bitmap includes 1 fragment, the bitmap is not fragmented. When including 100 system users and the estimated number of participating users is 10, in the worst case, the system users participating in the current activity are evenly distributed among all system users. At this time, when there are 10 shards and there is 1 user in each shard, all 10 shards will be used.
  • each bitmap fragment in addition to the storage space occupied by the bitmap fragment, each bitmap fragment also corresponds to a key, and the space occupied by the key will increase as the number of bitmap fragments increases. Therefore, the lower limit of the shard number range is 1, and the upper limit is the estimated number of participating users.
  • the server can use each number of shards in the range, together with the number of system users, the estimated number of participating users, and the number of allowed participating projects, to calculate the bitmap space occupancy corresponding to each number of shards. For example, when the lower limit of the number of shards is 1 and the upper limit is 1000, the server can calculate the bitmap space usage when the number of shards is 1, 2,..., 1000 respectively.
  • the calculated data can be shown in Table 1.
  • Table 1 enumerates the bitmap space usage (Mem) corresponding to each number of shards when the number of system users remains unchanged.
  • the unit of the bitmap space occupied (Mem) is byte.
  • the server can also pre-calculate a shard number comparison table based on the number of different system users, different estimated number of participating users, and different number of shards.
  • the comparison table can be as shown in Table 2.
  • U represents the estimated number of participating users
  • t represents the number of shards
  • Mem represents the bitmap space occupied in byte.
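  • The enumeration over the range of shard numbers can be sketched as a search for the smallest estimated space. The cost model below is a stand-in chosen for illustration and is not the formula of this application: it assumes participants are spread uniformly over the user IDs, that a shard touched by at least one participant occupies its full length, and that each key takes 50 bytes. All function names are hypothetical.

```python
import math


def estimated_bytes(system_users: int, participants: int, projects: int,
                    shards: int, key_bytes: int = 50) -> float:
    """Estimated space for all bitmaps at a given number of shards."""
    shard_bytes = math.ceil(system_users / shards / 8)
    # Probability that a given shard receives at least one participant.
    p_touched = 1 - (1 - 1 / shards) ** participants
    key_cost = key_bytes * shards              # grows linearly with shards
    value_cost = shards * p_touched * shard_bytes
    return projects * (key_cost + value_cost)


def target_shard_count(system_users: int, participants: int, projects: int,
                       key_bytes: int = 50) -> int:
    """Try every shard count from 1 to the estimated number of participants."""
    return min(
        range(1, participants + 1),
        key=lambda t: estimated_bytes(system_users, participants,
                                      projects, t, key_bytes),
    )


best = target_shard_count(system_users=100_000_000, participants=1_000,
                          projects=6)
print(best, estimated_bytes(100_000_000, 1_000, 6, best))
```

  • Because the number of projects multiplies both terms, it scales the estimate without moving the minimum in this simplified model; a more detailed model could behave differently.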
  • the specific process of calculating the bitmap space usage may include:
  • Step 1 The server determines the total length of keywords for each bitmap based on the number of projects allowed to participate, the number of shards, and the fixed length of keywords. Among them, each fragment corresponds to a keyword.
  • the fragmented bitmap is stored in Redis as two parts, key and value.
  • the storage space occupied by these two parts is denoted Mem(key) and Mem(value) respectively. The bitmap information of each fragment is stored in the value, and each shard corresponds to one key. Therefore, the storage space occupied by keys increases linearly with the number of shards; that is, Mem(key) is proportional to the number of shards.
  • the storage space occupied by key is:
  • Mem(key) = storage space occupied by each key × number of shards
  • the key components in the redis bitmap structure are mainly the system number, activity number, bitmap serial number and fragmentation serial number.
  • the bitmap serial number can be determined based on the number of projects allowed to participate.
  • the fragment sequence number is determined based on the number of target fragments.
  • the system number is determined based on the server's system number.
  • the activity number is determined based on the activity number of the current activity. This activity number uniquely identifies the current activity.
  • the storage space occupied by a single key is specifically:
  • Mem(key) = SDS(9) + system number + activity number + bitmap serial number + fragment serial number
  • the storage space occupied by the keys of all slices of all bitmaps can be expressed as:
  • Mem(Ct_key) represents the total length of the keys of all slices of each bitmap.
  • Mem_key represents the constant storage space required by the key of one shard.
  • the bitmap number is omitted directly in this application.
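The key-space calculation above can be sketched as follows. This is a minimal illustration, not the application's implementation: it assumes that SDS(9) denotes 9 bytes of fixed overhead and that each key component contributes its character length; the function names and the sample identifiers are hypothetical.

```python
SDS_KEY_OVERHEAD = 9  # assumption: SDS(9) read as 9 bytes of fixed overhead


def key_bytes(system_no: str, activity_no: str, bitmap_no: int, shard_no: int) -> int:
    """Mem(key) for one shard: fixed overhead plus the length of each component."""
    parts = [system_no, activity_no, str(bitmap_no), str(shard_no)]
    return SDS_KEY_OVERHEAD + sum(len(p.encode("utf-8")) for p in parts)


def total_key_bytes(mem_key: int, t: int) -> int:
    """Key space for one bitmap: per-key constant times the number of shards t."""
    return mem_key * t


mem_key = key_bytes("6069", "act01", 1, 3)   # sample identifiers only
print(mem_key, total_key_bytes(mem_key, 4))
```

Treating the per-key size as a constant mirrors the text; in practice the fragment serial number gains digits as the shard count grows, so the constant is an approximation.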
  • Step 2 The server determines the space occupied by all slices of each bitmap based on the average fragment length of each fragment, the probability value of each fragment occupying fifty percent of the storage space, the average storage space saved by each fragment, and the number of fragments.
  • the storage space occupied by value can be estimated using probability. For example, the probability that one user occupies half of the storage space of a bitmap of size m is 50%. When the number of users is m/2, the probability of occupying half of the bitmap memory is 100%. It can be seen that the actual number of users is directly proportional to the probability of occupying 50% of the storage space of the shard. That is, Mem(value) is inversely proportional to the number of shards.
  • the value in the redis bitmap is used to store the number of issued rights for each user. Each position in the bitmap represents a user, with 1 representing issued rights and 0 representing unissued rights.
  • the storage space occupied by this value is:
  • the storage space occupied by a single value can be expressed as:
  • bitmap Its unit can be byte.
  • SDS(25) requires fixed storage space for this key.
  • maximum offset of the bitmap is the offset of the largest number of system users that may appear in the shard.
  • the storage space occupied by the value of all slices of all bitmaps can be expressed as:
  • Mem(Ct_value) is the space occupied by all slices of each bitmap.
  • S is the number of system users.
  • U is the estimated number of participating users.
  • t is the number of shards.
  • e is used to represent the maximum length of each fragment. The calculation formula of e can be:
  • the server can determine the average maximum memory space occupied by each shard based on the value of e.
  • the maximum memory space occupied by each fragment can be expressed as Mem(e). For example, when the maximum length of each fragment is 100, Mem(e) is 100 bits. A further term of the formula represents the probability of occupying 50% of the storage space of the shard. For example, the probability that a user appears in each position of a shard is 1/e. Taking the middle position of the shard as the boundary, the probability that a user appears in the first 50% of the storage space of the shard and the probability that the user appears in the last 50% are both 50%. When the number of users participating in the current activity is e/2, the probability of occupying 50% of the storage space of the shard is 100%.
  • the 1st, 3rd, 9th, 12th, and 19th system users have obtained their rights and interests, while the remaining system users have not yet obtained their rights and interests.
  • the four slices of the bitmap shown in Table 4 can be obtained.
  • the gray part in Table 4 shows the space that can be saved by each shard. The 1-bit storage space corresponding to the last dark gray grid can already be saved in the unsharded bitmap shown in Table 3. Therefore, in the fourth shard shown in Table 4, the 1-bit storage space corresponding to the dark gray grid is not an optimization of storage space brought about by sharding. For this reason, in actual calculations, the average storage space saved by each fragment is multiplied by t-1, which gives the storage space optimization that bitmap fragmentation brings to the first t-1 fragments.
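The saving described above can be illustrated with a small sketch. It assumes 20 system users split into four slices of length 5 (Tables 3 and 4 are not reproduced here, so these sizes are assumptions), counts space in bits as the text does, and assumes a bitmap only has to span up to its highest set position. The function names are hypothetical.

```python
def bits_used(offsets):
    """Bits a bitmap must span to hold its highest set offset (0 if empty)."""
    return max(offsets) + 1 if offsets else 0


def sharded_bits(user_positions, slice_len, num_slices):
    """Split 1-based user positions into slices and return the bits each slice spans."""
    slices = [[] for _ in range(num_slices)]
    for pos in user_positions:
        idx = pos - 1                      # convert to a 0-based index
        slices[idx // slice_len].append(idx % slice_len)
    return [bits_used(s) for s in slices]


positions = [1, 3, 9, 12, 19]              # users that have obtained rights
per_slice = sharded_bits(positions, slice_len=5, num_slices=4)
unsharded = bits_used([p - 1 for p in positions])
print(per_slice, sum(per_slice), unsharded)
```

Under these assumptions the four slices span 3, 4, 2 and 4 bits (13 in total) against 19 bits for the single bitmap. The trailing bit saved in the last slice is also saved without sharding, which is why the text counts savings only for the first t-1 fragments.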
  • Step 3 The server determines the bitmap space occupancy based on the total length of the keywords of each bitmap, the space occupied by all slices of each bitmap, and the number of allowed participating projects.
  • the server can determine the total keyword length Mem(Ct_key) of each bitmap and the space Mem(Ct_value) occupied by all slices of each bitmap based on the above steps.
  • the total keyword length Mem(Ct_key), the space occupied by all slices Mem(Ct_value), and the number of allowed participating projects together determine the bitmap space occupancy. Its formula can be:
  • Mem(Ct) = [Mem(Ct_key) + Mem(Ct_value)] × N
  • N is the number of bitmaps, that is, the number of projects allowed to participate.
  • the server can sort according to the bitmap space occupancy after determining the bitmap space occupancy corresponding to each number of fragments according to the above steps.
  • the server can select the number of shards corresponding to the minimum bitmap space occupation as the target number of shards.
  • the server can use this number of fragments to complete fragmentation of the bitmap.
  • the server can complete the recording of the participation information of the current activity in the fragmented bitmap.
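The enumeration and selection steps above can be sketched as follows. The sketch combines Mem(Ct) = [Mem(Ct_key) + Mem(Ct_value)] × N with Mem(Ct_key) taken as a per-key constant times t. Because the expression for Mem(Ct_value) is not reproduced here, it is passed in as a caller-supplied function; the toy cost used in the example is purely hypothetical, as are the function names.

```python
from typing import Callable, Tuple


def bitmap_space(mem_key: int, t: int, value_bytes: int, n: int) -> int:
    """Mem(Ct) = [mem_key * t + Mem(Ct_value)] * N."""
    return (mem_key * t + value_bytes) * n


def pick_target_shards(u: int, n: int, mem_key: int,
                       value_cost: Callable[[int], int]) -> Tuple[int, int]:
    """Try every shard count from 1 to u and keep the one with the least space."""
    best_t, best_mem = 1, bitmap_space(mem_key, 1, value_cost(1), n)
    for t in range(2, u + 1):
        mem = bitmap_space(mem_key, t, value_cost(t), n)
        if mem < best_mem:                 # strict comparison keeps the smallest t on ties
            best_t, best_mem = t, mem
    return best_t, best_mem


# Hypothetical value cost that shrinks as the shard count grows.
target_t, target_mem = pick_target_shards(
    u=100, n=2, mem_key=20, value_cost=lambda t: 10_000 // t)
print(target_t, target_mem)
```

A linear scan is enough because the range is bounded by the estimated number of participating users; sorting all candidates, as the text describes, yields the same minimum.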
  • the server can determine the maximum number of shards based on the estimated number of participating users, thereby determining the range of the number of shards.
  • the server can use the number of each shard in this range, together with the number of system users, the estimated number of participating users, and the number of allowed participating projects, to calculate the bitmap space occupation corresponding to each shard number.
  • the server can sort the candidate shard numbers by their bitmap space occupation.
  • the server can select the number of shards corresponding to the minimum bitmap space occupation as the target number of shards.
  • in this application, the optimal number of fragments is obtained by calculating the bitmap space occupancy corresponding to each possible number of fragments, which improves space utilization when storing the participation information of the current activity. At the same time, sharding also improves the storage efficiency of that participation information.
  • Figure 6 shows a flow chart of a data storage method provided by an embodiment of the present application. Based on the embodiments shown in Figures 3 to 5, as shown in Figure 6, with the server as the execution subject, the method of this embodiment may include the following steps:
  • each project of the current activity corresponds to a bitmap.
  • S302. Determine the target number of shards for each bitmap based on the number of system users, the estimated number of participating users, and the number of projects allowed to participate.
  • steps S301 and S302 are implemented similarly to steps S101 and S102 in the embodiment of FIG. 2, and will not be described again in this embodiment. As shown in Figure 7, this step is equivalent to S401.
  • the server can obtain the estimated number of participating users U in the current activity and the number N of allowed participating projects in the current activity.
  • S303 Determine the bitmap space occupation based on the number of system users, the estimated number of participating users, the number of projects allowed to participate, and the number of target shards.
  • after determining the target number of shards, the server can calculate the minimum bitmap space occupation based on the number of system users, the estimated number of participating users, the number of allowed participating projects, and the target number of shards, in the manner of the embodiment shown in Figure 5. This step is equivalent to S403 in Figure 7.
  • S304 Determine the space occupied by key-value pairs based on the number of system users, the estimated number of participating users, and the number of projects allowed to participate.
  • the server can use the structural information in the key-value structure to calculate the space occupied by the key-value pair.
  • This step is equivalent to S402 in Figure 7 .
  • the execution order of S303 and S304 and the execution order of S402 and S403 are not limited by the flow chart and can be exchanged arbitrarily.
  • the key includes the system number, activity number and user ID.
  • the server can create a key-value pair when a system user participates in the current activity, and use the key-value pair to record the system user's participation information in the current activity. Therefore, in the key, in addition to the system number used to identify the server and the activity number identifying the current activity, only the user ID needs to be stored in the key.
  • the user ID is a code used by the server to uniquely identify a system user. For example, when there are 100 million system users, the user ID of the 100 millionth user could be 100000000.
  • value is used to store the number of benefits the user has obtained or the number of projects he has participated in. For example, when the user has obtained 3 interests, the value of this value is 3. For another example, when the user has participated in 6 projects, the value of this value is 6.
  • the key-value data can be:
  • the calculation process of the space occupancy of the key-value pair also needs to calculate the storage space of the key and the storage space of the value separately.
  • the calculation method of the storage space occupied by the key can be:
  • the server can determine the total amount of storage space required for the key in the current activity based on the estimated number of participating users and the usage of the key's storage space.
  • the storage space occupied by value can be calculated as:
  • log2(n) is used to count the number of bits needed to store the number n. Because the value is cumulative, only one number is stored in value at any time. Therefore, the value requires the largest storage space when the number stored in it is n.
  • the number of projects allowed to participate is basically in single digits, so the log2(n) term for value can be directly simplified to 1 byte. The calculation formula for the storage space occupied by this value can be simplified as:
  • the server can calculate the space occupied by the key-value pair Mem(kv).
  • the calculation formula can be:
  • U is the estimated number of participating users, which is the maximum number of participants in the event.
  • SDS(25) is the fixed amount of memory that Redis occupies when storing data in the key-value structure. SDS(25) corresponds to 25 bytes.
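The reasoning about the size of the value (roughly log2(n) bits, simplified to one byte when n is in single digits) can be checked with a short sketch. It uses Python's exact bit count rather than log2, and the function name is hypothetical.

```python
def value_bytes(n: int) -> int:
    """Whole bytes needed to hold a counter whose largest value is n."""
    bits = max(1, n.bit_length())          # exact bit count; log2(n) approximates this
    return (bits + 7) // 8                 # round up to whole bytes


for limit in (6, 255, 256):
    print(limit, value_bytes(limit))
```

For any limit up to 255 the counter fits in one byte, which is consistent with simplifying the term to 1 byte for single-digit project counts.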
  • the server may compare the bitmap space occupancy calculated in step S303 with the key-value pair space occupancy calculated in step S304. This step is equivalent to S404 in Figure 7.
  • the server determines to use the key-value pair to record the system user's participation information in the current activity. That is, when the space occupied by the key-value pair is less than the space occupied by the bitmap, the server will continue to execute S414 in Figure 7. Otherwise, when the space occupied by the key-value pair is greater than or equal to the space occupied by the bitmap, the server can continue to execute step S306. Alternatively, the server may continue to execute S405 in Figure 7.
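The comparison in this step reduces to a single rule, sketched below. The function name and the returned labels are hypothetical; the tie case follows the text, where an equal footprint leads to the sharded bitmap.

```python
def choose_structure(mem_kv: int, mem_bitmap: int) -> str:
    """Pick the cheaper structure; ties go to the sharded bitmap, as in the text."""
    return "key-value" if mem_kv < mem_bitmap else "sharded-bitmap"


print(choose_structure(900, 1788))
print(choose_structure(1788, 1788))
```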
  • the specific recording process of the participation information may include the following steps:
  • Step 1 The server obtains the number of participating projects of the system user recorded in the key-value pair corresponding to the system user.
  • the server can directly search for the key corresponding to the system user in redis based on the user ID of the system user.
  • the server can determine the value in the key-value pair after determining the key. This value can include the number of projects the system user participates in.
  • the number of participating projects is the number of rights and interests that have been obtained.
  • This step may be shown in step S415 in FIG. 7 .
  • the instruction for the server to obtain the number of participating projects of the system user can be:
  • Step 2 When the number of participating projects is less than the number of allowed participating projects, the server increases the number of participating projects of the system user by the unit value.
  • the server can compare the number of participating projects with the number of allowed participating projects. If the number of projects that the system user has participated in is less than the number of allowed participating projects, the server can allow the system user to participate in the project and increase the number of participating projects of the system user by the unit value.
  • the unit value can be 1. For example, when a system user has obtained 3 rights and interests, and the maximum number of rights and interests that the system allows the user to obtain is 6, the server can determine that the user can continue to obtain rights and interests.
  • the server can issue rights and interests to users and modify the record in redis of the number of rights and interests that the user has obtained.
  • This step is equivalent to S417 in Figure 7.
  • the instruction for the server to write the number of participating projects (number of equity) back to redis can be:
  • Step 3 When the number of participating projects is greater than or equal to the number of allowed participating projects, the server refuses the system user to participate in the project.
  • the server determines that the system user has completed participation in the current activity.
  • the server can deny the user further participation in the current activity. For example, when a system user has acquired 6 rights and interests, and the maximum number of rights that the system allows a user to acquire is 6, the server can determine that the user has acquired all rights and interests. The server can refuse the user to obtain the rights when the user requests the rights again.
  • This step is equivalent to the operation when S416 in FIG. 7 is determined to be negative.
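The three recording steps above can be sketched as follows. To stay self-contained, the sketch uses a small in-memory class in place of a Redis client; it mimics only the GET and INCR commands. The key layout and all names are hypothetical, since the original key-value example is not reproduced here.

```python
class FakeStore:
    """In-memory stand-in for the Redis GET and INCR commands."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def incr(self, key):
        self._data[key] = int(self._data.get(key, 0)) + 1
        return self._data[key]


def try_participate(store, system_no, activity_no, user_id, allowed):
    """Steps 1 to 3: read the count, then increment it or refuse."""
    key = f"{system_no}:kv:{activity_no}:{user_id}"   # hypothetical key layout
    current = int(store.get(key) or 0)                # Step 1: projects joined so far
    if current >= allowed:                            # Step 3: limit reached, refuse
        return False
    store.incr(key)                                   # Step 2: add the unit value 1
    return True


store = FakeStore()
results = [try_participate(store, "6069", "act01", 42, allowed=2) for _ in range(3)]
print(results)
```

Against a real Redis server the read and the increment are separate commands, so concurrent requests for the same user would need a transaction or script to keep the check and the update atomic.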
  • step S306 is implemented similarly to step S103 in the embodiment of FIG. 2, and will not be described again in this embodiment.
  • the specific implementation process of S306 can be shown as steps S405 to S413 in Figure 7 .
  • the server can determine the fragment sequence number and offset of the system user based on the user ID of the system user.
  • the server can determine the value corresponding to the system user in each bitmap based on the fragment sequence number and offset.
  • the server can AND these values. When the operation result is 1, it means that the values in these bitmaps are all 1.
  • when the value of the system user in each bitmap is 1, it means that the system user has obtained all the rights and interests that can be obtained, or has participated in all the activities that can be participated in.
  • the server can set the value in a bitmap with a value of 0 to 1.
  • the storage structure of the fragmented bitmap can be:
  • the calculation formula for the offset of the system user on the shard can be:
  • the acquisition instruction can be:
  • value1 = GETBIT("6069:bitmap:act01:1:(X/e)", (X%e));
  • value2 = GETBIT("6069:bitmap:act01:2:(X/e)", (X%e));
  • value3 = GETBIT("6069:bitmap:act01:3:(X/e)", (X%e));
  • valueN = GETBIT("6069:bitmap:act01:N:(X/e)", (X%e));
  • the server can calculate the number of participating projects of the system user based on the value in each bitmap. Its calculation formula can be:
  • the server can also perform an AND operation on all value values to more quickly determine whether the user's number of participating projects has reached the upper limit. Its calculation formula can be:
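The lookup steps above can be sketched as follows, as an illustration rather than the original formulas. Following the GETBIT instructions, the fragment serial number is taken as X/e (integer division) and the offset as X%e. A small in-memory class stands in for the Redis GETBIT and SETBIT commands, and all function names are hypothetical.

```python
class FakeBitmaps:
    """In-memory stand-in for the Redis GETBIT and SETBIT commands."""

    def __init__(self):
        self._bits = {}                    # key -> set of offsets whose bit is 1

    def getbit(self, key, offset):
        return 1 if offset in self._bits.get(key, set()) else 0

    def setbit(self, key, offset, value):
        bits = self._bits.setdefault(key, set())
        (bits.add if value else bits.discard)(offset)


def shard_key(system_no, activity_no, bitmap_no, user_id, e):
    """Key of the slice that holds user_id, where e is the slice length."""
    return f"{system_no}:bitmap:{activity_no}:{bitmap_no}:{user_id // e}"


def read_values(store, system_no, activity_no, n, user_id, e):
    """Read the user's bit from each of the n bitmaps (one bitmap per project)."""
    return [store.getbit(shard_key(system_no, activity_no, b, user_id, e), user_id % e)
            for b in range(1, n + 1)]


def participated_count(values):
    return sum(values)                     # number of projects already joined


def reached_limit(values):
    return all(v == 1 for v in values)     # logical AND over all bitmaps


store = FakeBitmaps()
for b in (1, 3):
    store.setbit(shard_key("6069", "act01", b, 1234, 100), 1234 % 100, 1)
values = read_values(store, "6069", "act01", 3, 1234, 100)
print(values, participated_count(values), reached_limit(values))
```

Summing the bits gives the number of projects joined, while the AND check answers only whether the limit has been reached and can stop at the first 0.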
  • the server can obtain the number of system users, the estimated number of participating users in the current activity, and the number of projects allowed to participate in the current activity.
  • the server can determine the target number of shards for each bitmap based on the number of system users, the estimated number of participating users, and the number of projects allowed to participate. After determining the target number of shards, the server can calculate the minimum bitmap space occupation based on the number of system users, the estimated number of participating users, the number of projects allowed to participate, and the target number of shards.
  • the server can use the structural information in the key-value structure to calculate the space occupied by the key-value pair.
  • the server can compare the space occupied by the key-value pair with the space occupied by the bitmap. When the space occupied by the key-value pair is less than the space occupied by the bitmap, the server determines to use the key-value pair to record the system user's participation information in the current activity. When the space occupied by the key-value pair is greater than or equal to the space occupied by the bitmap, the server can fragment the bitmap according to the number of target fragments, and use the fragmented bitmap to record the system user's participation information in the current activity.
  • Figure 8 shows a schematic structural diagram of a data storage device provided by an embodiment of the present application. As shown in Figure 8, the data storage device 10 of this embodiment is used to implement operations corresponding to the server in any of the above method embodiments. The data storage device 10 of this embodiment includes:
  • the acquisition module 11 is used to obtain the number of system users, the estimated number of participating users in the current activity, and the number of allowed participating projects in the current activity. Each project of the current activity corresponds to a bitmap.
  • the processing module 12 is configured to determine the target number of shards for each bitmap based on the number of system users, the estimated number of participating users, and the number of allowed participating projects when the ratio between the number of system users and the estimated number of participating users is greater than the preset threshold.
  • the bitmap is fragmented according to the target number of fragments, and the fragmented bitmap is used to record the system user's participation information in the current activity.
  • processing module 12 is specifically used for:
  • processing module 12 is specifically used for:
  • based on the number of system users, the estimated number of participating users, the number of projects allowed to participate, and the number of shards, determine the bitmap space occupation corresponding to the number of shards.
  • processing module 12 is specifically used for:
  • the total length of keywords for each bitmap is determined based on the number of projects allowed to participate, the number of shards, and the fixed length of the keyword. Each fragment corresponds to a keyword.
  • the bitmap space occupancy is determined based on the total length of the keywords of each bitmap, the space occupied by all slices of each bitmap, and the number of allowed participating projects.
  • processing module 12 is specifically used for:
  • the number of participating projects of the system user is determined based on the value of the system user in each bitmap fragment.
  • the value of the system user in the slice of the bitmap is modified from the first data to the second data.
  • the bitmap is fragmented according to the target number of fragments, and the fragmented bitmap is used to record the system user's participation information in the current activity.
  • the processing module 12 is also used to:
  • when the space occupied by the key-value pair is less than the space occupied by the bitmap, the key-value pair is used to record the system user's participation information in the current activity.
  • processing module 12 is also used to:
  • the data storage device 10 provided by the embodiments of the present application can execute the above method embodiments. For its specific implementation principles and technical effects, please refer to the above method embodiments. This embodiment will not be described again here.
  • FIG. 9 shows a schematic diagram of the hardware structure of a server provided by an embodiment of the present application.
  • the server 20 is used to implement operations corresponding to the server in any of the above method embodiments.
  • the server 20 in this embodiment may include: a memory 21 and a processor 22 .
  • the memory 21 is used to store computer programs.
  • the memory 21 may include high-speed random access memory (Random Access Memory, RAM), and may also include non-volatile memory (Non-Volatile Memory, NVM), such as at least one disk memory, and may also be a U disk, a mobile hard disk, read-only memory, magnetic disk or optical disk, etc.
  • the processor 22 is configured to execute the computer program stored in the memory to implement the data storage method in the above embodiment.
  • the processor 22 can be a central processing unit (Central Processing Unit, CPU), or other general-purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), etc.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
  • the steps of the method disclosed in conjunction with the invention can be directly embodied and executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
  • the memory 21 can be independent or integrated with the processor 22 .
  • the server 20 may also include a bus 23.
  • the bus 23 is used to connect the memory 21 and the processor 22 .
  • the bus 23 may be an Industry Standard Architecture (Industry Standard Architecture, ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (Extended Industry Standard Architecture, EISA) bus, etc.
  • the bus can be divided into address bus, data bus, control bus, etc.
  • the bus in the drawings of this application is not limited to only one bus or one type of bus.
  • the server provided in this embodiment can be used to execute the above-mentioned data storage method. Its implementation method and technical effects are similar, and will not be described again in this embodiment.
  • This application also provides a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program.
  • the computer program When the computer program is executed by a processor, it is used to implement the methods provided by the above-mentioned various embodiments.
  • the computer-readable storage medium may be a computer storage medium or a communication medium.
  • Communication media includes any medium that facilitates transfer of a computer program from one place to another.
  • Computer storage media can be any available media that can be accessed by a general purpose or special purpose computer.
  • a computer-readable storage medium is coupled to a processor such that the processor can read information from the computer-readable storage medium and write information to the computer-readable storage medium.
  • the computer-readable storage medium may also be an integral part of the processor.
  • the processor and computer-readable storage medium may be located in Application Specific Integrated Circuits (ASICs). Additionally, the ASIC can be located in the user equipment.
  • the processor and the computer-readable storage medium may also exist as discrete components in the communication device.
  • the computer-readable storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (Static Random-Access Memory, SRAM), electrically erasable programmable read-only memory (Electrically-Erasable Programmable Read-Only Memory, EEPROM), erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), programmable read-only memory (Programmable read-only memory, PROM), read-only memory (Read-Only Memory, ROM), magnetic memory, flash memory, magnetic disk or optical disk.
  • Storage media can be any available media that can be accessed by a general purpose or special purpose computer.
  • the application also provides a computer program product.
  • the computer program product includes a computer program, and the computer program is stored in a computer-readable storage medium.
  • At least one processor of the device can read the computer program from the computer-readable storage medium, and at least one processor executes the computer program so that the device implements the methods provided by the various embodiments described above.
  • An embodiment of the present application also provides a chip.
  • the chip includes a memory and a processor.
  • the memory is used to store a computer program.
  • the processor is used to call and run the computer program from the memory, so that the device equipped with the chip can perform the methods in the various possible implementations described above.
  • the disclosed devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of modules is only a logical function division. In actual implementation, there may be other division methods.
  • multiple modules may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, indirect coupling or communication connection of devices or modules, and may be in electrical, mechanical or other forms.
  • Each module may be physically separated, for example, installed in different locations of one device, or installed on different devices, or distributed to multiple network units, or distributed to multiple processors. Individual modules can also be integrated together, for example, installed in the same device, or integrated in a set of codes. Each module can exist in the form of hardware, or can exist in the form of software, or can also be implemented in the form of software plus hardware. This application can select some or all of the modules according to actual needs to achieve the purpose of the solution of this embodiment.
  • the integrated module can be stored in a computer-readable storage medium.
  • the above-mentioned software function module is stored in a storage medium and includes a number of instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute some steps of the methods of various embodiments of the present application.
  • although each step in the flow charts of the above embodiments is shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated in this document, the execution of these steps is not strictly limited in order, and they can be executed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily executed at the same time, but may be executed at different times, and they are not necessarily executed sequentially; they may be performed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.


Abstract

The present application provides a data storage method, a server and a storage medium. In the method, a server can acquire the number of system users, an estimated number of participating users for the current activity, and the number of items of the current activity in which users are allowed to participate. The server can calculate the ratio of the number of system users to the estimated number of participating users and, when the ratio is greater than a preset threshold value, store participation information regarding the current activity by means of bitmaps. The server can calculate the target number of fragments of each bitmap from the number of system users, the estimated number of participating users, and the number of items in which users are allowed to participate. The server can then fragment each bitmap according to the target number of fragments and record the participation information of the system users in the current activity using the fragmented bitmaps. The method of the present application improves the utilization rate of storage space, thereby improving data storage and reading efficiency.

Description

Data storage method, server and storage medium
This application claims priority to the Chinese patent application filed with the China Patent Office on June 28, 2022, with the application number CN202210740249.0 and the application name "Data storage method, server and storage medium", the entire content of which is incorporated by reference in this application.
Technical field
The present application relates to the field of computers, and in particular, to a data storage method, server and storage medium.
Background
With the development of computer technology, more and more technologies are applied in the financial field, and the traditional financial industry is gradually transforming into financial technology (Fintech). Marketing activities are no exception. However, because the number of system users in the financial industry is huge, higher requirements are placed on technology. In marketing activities, electronic rights and interests replace the original paper ones and can be promoted and used more conveniently. At the same time, the data management backend can also record the number of participants in marketing activities and the distribution of rights and interests.
In the existing technology, the data management backend can use a bitmap algorithm to record marketing activity information. For example, the data management backend can create a bitmap for each right and set the size of each bitmap according to the number of system users. Each system user corresponds to one bit in a bitmap. When user A in the system acquires right B, the data management backend can set the bit corresponding to user A in the bitmap corresponding to right B to 1, thereby recording that user A has acquired right B.
However, the size of each bitmap in this bitmap algorithm usually needs to be set according to the number of system users. When the number of system users is very large but the users actually participating in the marketing activity make up a small proportion of them, a large amount of space in each bitmap is easily wasted, and data storage efficiency is low.
Summary of the Invention
The present application provides a data storage method, a server, and a storage medium to solve the problem that, when the users actually participating in a marketing activity make up only a small proportion of system users, a large amount of space in each bitmap is wasted and data storage efficiency is low.
In a first aspect, the present application provides a data storage method, including:
obtaining the number of system users, the estimated number of participating users of a current activity, and the number of allowed participation items of the current activity, where each item of the current activity corresponds to one bitmap;
when the ratio of the number of system users to the estimated number of participating users is greater than a preset threshold, determining a target number of slices for each bitmap according to the number of system users, the estimated number of participating users, and the number of allowed participation items;
slicing the bitmap according to the target number of slices, and using the sliced bitmap to record participation information of system users in the current activity.
Optionally, determining the target number of slices for each bitmap according to the number of system users, the estimated number of participating users, and the number of allowed participation items specifically includes:
determining a range of the number of slices according to the estimated number of participating users;
calculating, according to the range, the bitmap space usage corresponding to each number of slices within the range;
comparing the bitmap space usage corresponding to each number of slices, and determining the number of slices corresponding to the minimum bitmap space usage as the target number of slices.
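As an illustration only, the selection among candidate slice counts can be sketched as follows. The function names and the toy cost model `toy_space_usage` are assumptions made for this sketch; the cost model is not the space formula of the present application, and it only stands in for any function that maps a slice count to an estimated space usage.

```python
from typing import Callable, Iterable


def choose_target_slice_count(
    candidates: Iterable[int],
    space_usage: Callable[[int], float],
) -> int:
    """Return the candidate slice count with the smallest estimated space usage."""
    return min(candidates, key=space_usage)


def toy_space_usage(slice_count: int) -> float:
    """Hypothetical cost model, for illustration only.

    Key overhead grows with the number of slices, while the bitmap space
    wasted per slice shrinks as slices become smaller.
    """
    key_bytes = 50 * slice_count          # one fixed-length key per slice
    bitmap_bytes = 500_000 / slice_count  # made-up shrinking term
    return key_bytes + bitmap_bytes


target = choose_target_slice_count(range(1, 1001), toy_space_usage)
```

With this toy model the two terms balance at 100 slices, so the search over the range 1 to 1000 settles on 100; a different cost model would give a different target.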
Optionally, calculating the bitmap space usage corresponding to each number of slices within the range specifically includes:
determining the bitmap space usage corresponding to the number of slices according to the number of system users, the estimated number of participating users, the number of allowed participation items, and the number of slices.
Optionally, determining the bitmap space usage corresponding to the number of slices according to the number of system users, the estimated number of participating users, the number of allowed participation items, and the number of slices specifically includes: determining the total key length of each bitmap according to the number of allowed participation items, the number of slices, and a fixed key length, where each slice corresponds to one key;
determining the space usage of all slices of each bitmap according to the average slice length of each slice, the probability that each slice occupies fifty percent of its storage space, the average storage space saved by each slice, and the number of slices;
determining the bitmap space usage according to the total key length of each bitmap, the space usage of all slices of each bitmap, and the number of allowed participation items.
Optionally, recording the participation information of the system user in the current activity specifically includes:
determining the number of items the system user has participated in according to the value of the system user in the slice of each bitmap;
when the number of participated items is less than the number of allowed participation items, changing the value of the system user in the slice of one bitmap from first data to second data; when the number of participated items is greater than or equal to the number of allowed participation items, refusing to let the system user continue participating in the current activity.
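The check described above can be sketched as follows, under stated assumptions: each item's bitmap is modeled as an in-memory `bytearray` (a single slice, for brevity), the first data is taken to be 0 and the second data 1, and the function names are hypothetical.

```python
from typing import List


def participated_count(bitmaps: List[bytearray], offset: int) -> int:
    """Count how many item bitmaps have the user's bit set."""
    byte_index, bit_index = divmod(offset, 8)
    mask = 0x80 >> bit_index  # most significant bit first
    return sum(1 for bitmap in bitmaps if bitmap[byte_index] & mask)


def try_participate(bitmaps: List[bytearray], offset: int, allowed_items: int) -> bool:
    """Record one more participation for the user, or refuse at the limit."""
    if participated_count(bitmaps, offset) >= allowed_items:
        return False  # limit reached: refuse further participation
    byte_index, bit_index = divmod(offset, 8)
    mask = 0x80 >> bit_index
    for bitmap in bitmaps:
        if not bitmap[byte_index] & mask:
            bitmap[byte_index] |= mask  # assumed: first data 0 -> second data 1
            return True
    return False  # every bitmap already has the bit set


# Three items, each with a 16-bit bitmap; the user at offset 4 may join two.
item_bitmaps = [bytearray(2) for _ in range(3)]
results = [try_participate(item_bitmaps, 4, allowed_items=2) for _ in range(3)]
```

In this usage the first two attempts succeed and the third is refused, leaving two of the three bitmaps with the bit at offset 4 set.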
Optionally, before slicing the bitmap according to the target number of slices and using the sliced bitmap to record the participation information of the system user in the current activity, the method further includes:
determining the bitmap space usage according to the number of system users, the estimated number of participating users, the number of allowed participation items, and the target number of slices;
determining the key-value pair space usage according to the number of system users, the estimated number of participating users, and the number of allowed participation items;
when the key-value pair space usage is less than the bitmap space usage, using key-value pairs to record the participation information of the system user in the current activity.
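A minimal sketch of this comparison is given below. It assumes that the two space estimates are already available as byte counts; the helper `estimate_key_value_bytes` is a hypothetical simplification that counts only keys, using the figure of about 50 bytes per key mentioned later in the description, and is not the formula of the present application.

```python
def estimate_key_value_bytes(estimated_users: int, bytes_per_key: int = 50) -> int:
    """Rough key-value estimate: one key per participating user (values ignored)."""
    return estimated_users * bytes_per_key


def choose_storage(bitmap_bytes: float, key_value_bytes: float) -> str:
    """Pick key-value pairs only when they are strictly smaller than the bitmaps."""
    return "key-value" if key_value_bytes < bitmap_bytes else "bitmap"


# 1,000 expected participants against a 12,500,000-byte unsliced bitmap.
choice = choose_storage(
    bitmap_bytes=12_500_000,
    key_value_bytes=estimate_key_value_bytes(1_000),
)
```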
Optionally, the method further includes:
obtaining the number of participated items of the system user recorded in the key-value pair corresponding to the system user;
when the number of participated items is less than the number of allowed participation items, increasing the system user's number of participated items by one unit;
when the number of participated items is greater than or equal to the number of allowed participation items, refusing to let the system user continue participating in the current activity.
In a second aspect, the present application provides a data storage apparatus, including:
an acquisition module, configured to obtain the number of system users, the estimated number of participating users of a current activity, and the number of allowed participation items of the current activity, where each item of the current activity corresponds to one bitmap;
a processing module, configured to: when the ratio of the number of system users to the estimated number of participating users is greater than a preset threshold, determine a target number of slices for each bitmap according to the number of system users, the estimated number of participating users, and the number of allowed participation items; and slice the bitmap according to the target number of slices and use the sliced bitmap to record participation information of system users in the current activity.
Optionally, the processing module is specifically configured to:
determine a range of the number of slices according to the estimated number of participating users;
calculate, according to the range, the bitmap space usage corresponding to each number of slices within the range;
compare the bitmap space usage corresponding to each number of slices, and determine the number of slices corresponding to the minimum bitmap space usage as the target number of slices.
Optionally, the processing module is specifically configured to:
determine the bitmap space usage corresponding to the number of slices according to the number of system users, the estimated number of participating users, the number of allowed participation items, and the number of slices.
Optionally, the processing module is specifically configured to:
determine the total key length of each bitmap according to the number of allowed participation items, the number of slices, and a fixed key length, where each slice corresponds to one key;
determine the space usage of all slices of each bitmap according to the average slice length of each slice, the probability that each slice occupies fifty percent of its storage space, the average storage space saved by each slice, and the number of slices;
determine the bitmap space usage according to the total key length of each bitmap, the space usage of all slices of each bitmap, and the number of allowed participation items.
Optionally, the processing module is specifically configured to:
determine the number of items the system user has participated in according to the value of the system user in the slice of each bitmap;
when the number of participated items is less than the number of allowed participation items, change the value of the system user in the slice of one bitmap from first data to second data;
when the number of participated items is greater than or equal to the number of allowed participation items, refuse to let the system user continue participating in the current activity.
Optionally, before the bitmap is sliced according to the target number of slices and the sliced bitmap is used to record the participation information of the system user in the current activity, the processing module is further configured to:
determine the bitmap space usage according to the number of system users, the estimated number of participating users, the number of allowed participation items, and the target number of slices;
determine the key-value pair space usage according to the number of system users, the estimated number of participating users, and the number of allowed participation items;
when the key-value pair space usage is less than the bitmap space usage, use key-value pairs to record the participation information of the system user in the current activity.
Optionally, the processing module is further configured to:
obtain the number of participated items of the system user recorded in the key-value pair corresponding to the system user;
when the number of participated items is less than the number of allowed participation items, increase the system user's number of participated items by one unit;
when the number of participated items is greater than or equal to the number of allowed participation items, refuse to let the system user continue participating in the current activity.
In a third aspect, the present application provides a server, including a memory and a processor;
the memory is configured to store a computer program, and the processor is configured to execute, according to the computer program stored in the memory, the data storage method in the first aspect or any possible design of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium storing a computer program; when at least one processor of a server executes the computer program, the server performs the data storage method in the first aspect or any possible design of the first aspect.
In a fifth aspect, the present application provides a computer program product including a computer program; when at least one processor of a server executes the computer program, the server performs the data storage method in the first aspect or any possible design of the first aspect.
The data storage method provided by the present application obtains the number of system users; determines the estimated number of participating users according to the current activity; obtains the number of allowed participation items of the current activity, where each item in the current activity corresponds to one bitmap; calculates the ratio of the number of system users to the estimated number of participating users; compares the ratio with a preset threshold; when the ratio is greater than the preset threshold, stores the participation information of the current activity using bitmaps; calculates the target number of slices for each bitmap according to the number of system users, the estimated number of participating users, and the number of allowed participation items; and slices the bitmap according to the target number of slices and uses the sliced bitmap to record the participation information of system users in the current activity. This improves storage space utilization, avoids overly concentrated access to storage, and improves the efficiency of storing and reading data.
Brief Description of the Drawings
To describe the technical solutions in the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Figure 1 is a schematic diagram of a bitmap provided by an embodiment of the present application;
Figure 2 is a schematic diagram of the bitmaps of benefits provided by an embodiment of the present application;
Figure 3 is a schematic diagram of a scenario for issuing activity benefits provided by an embodiment of the present application;
Figure 4 is a flowchart of a data storage method provided by an embodiment of the present application;
Figure 5 is a flowchart of a data storage method provided by an embodiment of the present application;
Figure 6 is a flowchart of a data storage method provided by an embodiment of the present application;
Figure 7 is a flowchart of a data storage process provided by an embodiment of the present application;
Figure 8 is a schematic structural diagram of a data storage apparatus provided by an embodiment of the present application;
Figure 9 is a schematic diagram of the hardware structure of a server provided by an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the technical solutions in the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the scope of protection of the present application.
The terms "first", "second", "third", "fourth", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate. For example, without departing from the scope of this document, first information may also be called second information, and similarly, second information may also be called first information.
Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
Furthermore, as used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context indicates otherwise.
It should be further understood that the terms "comprising" and "including" indicate the presence of the stated features, steps, operations, elements, components, items, categories, and/or groups, but do not exclude the presence, occurrence, or addition of one or more other features, steps, operations, elements, components, items, categories, and/or groups.
The terms "or" and "and/or" as used herein are to be interpreted as inclusive, meaning any one or any combination. Therefore, "A, B or C" or "A, B and/or C" means "any of the following: A; B; C; A and B; A and C; B and C; A, B and C". An exception to this definition occurs only when a combination of elements, functions, steps, or operations is inherently mutually exclusive in some way.
With the development of computer technology, more and more technologies are being applied in the financial field, and the traditional financial industry is gradually transforming into financial technology (Fintech); marketing activities are no exception. However, because systems in the financial industry have a huge number of users, they place higher demands on the technology. In marketing activities, replacing paper benefits with electronic benefits makes them easier to promote and use and also allows the distribution of benefits to be controlled more precisely, so that marketing costs can be controlled precisely. Benefits may include coupons, discount vouchers, membership card discounts, and the like. For example, a data management backend can be set up in the server to record information along dimensions such as the number of users participating in an activity, the distribution of the activity's benefits, and the number of benefits obtained by each user. The record of the number of benefits obtained by each user can also be used to check the maximum number of benefits each user may obtain in the activity, ensuring that the number of benefits a user obtains in the activity stays within the preset upper limit.
In the prior art, to control the number of benefits issued precisely, the server can use a Redis cluster to store the number of benefits each user has already obtained. Each time a benefit is issued, the server queries the number of benefits the user has already obtained, as stored in Redis, and checks it. If the number is less than the preset upper limit, the server issues the benefit to the user; otherwise, the user cannot obtain further benefits. The server can store the number of benefits each user has obtained in Redis as key-value pairs. In this key-value data structure, each user requires storage for one key and one value. The key can include information that uniquely identifies the user, such as the user ID. The key length is usually fixed, and one key typically occupies about 50 bytes of storage. The space needed to store users' participation information is usually proportional to the number of participating users. With a massive user base, say 100 million users, the keys of all users alone in a key-value data structure would occupy:
100,000,000 × 50 (bytes) ÷ 1024 ÷ 1024 ≈ 4768 (MB)
Therefore, compared with the key-value data structure, a bitmap can save considerably more storage space when there is a massive number of users. In a bitmap, each user corresponds to one bit of storage, which can hold the value 0 or 1 and indicates whether the user has obtained the benefit. Assuming 100 million users, the server can allocate 100 million bits of storage to record whether each of the 100 million users has obtained the benefit. As shown in Figure 1, each cell corresponds to 1 bit, and the last cell corresponds to the 100 millionth bit. The storage occupied by this bitmap is:
100,000,000 (bits) ÷ 8 ÷ 1024 ÷ 1024 ≈ 12 (MB)
That is, with 100 million users, one bitmap requires about 12 MB of storage. When the preset upper limit on the benefits each user may obtain in the marketing activity is N, the server can set up N bitmaps, which occupy about 12N MB in total. For example, as shown in Figure 2, the first coupon, the second coupon, and the third coupon correspond to the three bitmaps of three benefits. When the user at offset 1 has obtained three benefits, the cells at offset 1 in all three bitmaps have the value 1. When the user at offset 4 has obtained two benefits, two of the three cells at offset 4 in the three bitmaps have the value 1.
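The two estimates can be reproduced with the short calculation below. It is only a check of the arithmetic under the same assumptions as the text (about 50 bytes per key, one bit per user per bitmap); the function names are chosen for this sketch.

```python
MIB = 1024 * 1024  # bytes per megabyte, as used in the estimates above


def key_value_megabytes(users: int, bytes_per_key: int = 50) -> float:
    """Space taken by the keys alone when every user has a key-value pair."""
    return users * bytes_per_key / MIB


def bitmap_megabytes(users: int, bitmap_count: int = 1) -> float:
    """Space taken by bitmap_count full-size bitmaps, one bit per user each."""
    return users / 8 * bitmap_count / MIB


kv_mb = key_value_megabytes(100_000_000)       # about 4768 MB
one_bitmap_mb = bitmap_megabytes(100_000_000)  # about 11.92 MB, rounded to 12 in the text
three_bitmaps_mb = bitmap_megabytes(100_000_000, bitmap_count=3)
```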
However, when there is a massive number of system users but few users actually participate in the activity, the key-value data structure saves more space. In a bitmap, once a user with a large ID participates, the bitmap must be sized according to that user ID; that is, the memory occupied by a bitmap is determined by its maximum offset. Even if the bitmap stores information for only one user, it still occupies a large amount of space if that user's ID is large. For example, when the 100 millionth user participates, as shown in Figure 1, the bitmap is 12 MB. In the extreme case where only the 100 millionth user participates, most of that 12 MB is wasted. The key-value data structure behaves differently: the server creates a user's key-value pair only when that user participates, so when only the 100 millionth user participates, the server needs only about 50 bytes of storage. Thus, when the number of system users is very large but the proportion actually participating in the marketing activity is small, a large amount of space in each bitmap is wasted and data storage efficiency is low.
To address the above problem, the present application proposes a data storage method based on bitmap slicing. The server can divide one bitmap into L slices, each covering (maximum offset ÷ L) offsets. The participation information of users 1 through (maximum offset ÷ L) is stored in the first slice, that of users (maximum offset ÷ L + 1) through 2 × (maximum offset ÷ L) is stored in the second slice, and so on. For example, when the number of system users is 100 million and only the 100 millionth user participates, only the bit at the maximum offset of the last slice is used, and no memory is occupied for the other slices. The storage actually used is therefore only about 12/L MB. Thus, even if one offset is very large, at most the space of one slice is wasted. This slicing method greatly improves memory utilization. It also helps balance load, avoiding the drop in data processing efficiency caused by requests for storage access being too concentrated.
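The effect of slicing on memory use can be sketched with an in-memory model in which a slice is allocated only when one of its bits is first written. The class `SlicedBitmap` and its methods are hypothetical names for this sketch, and a zero-based offset (user number minus 1) is assumed.

```python
from typing import Dict


class SlicedBitmap:
    """Bitmap split into fixed-size slices that are allocated on first write."""

    def __init__(self, slice_bits: int) -> None:
        self.slice_bits = slice_bits
        self.slices: Dict[int, bytearray] = {}  # slice number -> slice data

    def set_bit(self, offset: int, value: int) -> None:
        slice_no, local = divmod(offset, self.slice_bits)
        data = self.slices.setdefault(slice_no, bytearray())
        byte_index, bit_index = divmod(local, 8)
        if len(data) <= byte_index:
            # Grow this slice just enough to hold the addressed bit.
            data.extend(bytes(byte_index + 1 - len(data)))
        mask = 0x80 >> bit_index
        if value:
            data[byte_index] |= mask
        else:
            data[byte_index] &= ~mask & 0xFF

    def get_bit(self, offset: int) -> int:
        slice_no, local = divmod(offset, self.slice_bits)
        data = self.slices.get(slice_no)
        byte_index, bit_index = divmod(local, 8)
        if data is None or byte_index >= len(data):
            return 0  # unallocated space reads as 0
        return 1 if data[byte_index] & (0x80 >> bit_index) else 0

    def allocated_bytes(self) -> int:
        return sum(len(data) for data in self.slices.values())


# 100 million users in L = 100 slices of 1,000,000 bits each.
bitmap = SlicedBitmap(slice_bits=1_000_000)
bitmap.set_bit(100_000_000 - 1, 1)  # only the 100 millionth user participates
```

In this usage a single slice of 125,000 bytes is allocated, compared with 12,500,000 bytes for one unsliced bitmap covering all 100 million users.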
After slicing the bitmap, the server also needs to set a key in Redis for each slice. This key lets the server locate the corresponding slice quickly, and the value corresponding to the key stores one slice of one bitmap. Consequently, the more slices there are, the more Redis keys there are, and the more memory the keys occupy. In the prior art, there is no dedicated, reliable algorithm for calculating the number of slices. The existing constraint on the number of slices relates only to the number of Redis cluster nodes: the number of slices only needs to exceed the number of cluster nodes so that requests are not concentrated on a single node. Clearly, a slice count chosen this way is not optimal. Therefore, in the present application, the server can determine the target number of slices for each bitmap according to the number of system users, the estimated number of participating users, and the number of allowed participation items. This target number of slices is the optimal number of slices. The server can slice the bitmap according to the target number of slices and thereby record users' participation information.
For example, assume there are 100 million users whose user IDs increase from 1 to 100 million. After the bitmap is sliced, each slice can hold the information of 2^20 ≈ 1.05 million users. The server can then divide one bitmap into about 95 slices (100,000,000 ÷ 2^20 ≈ 95.4, so slice numbers run from 0 to 95).
The server can use the Redis command SETBIT(key, offset, value) to set a system user's activity participation information when that information changes.
The server can calculate the slice number (sliceNo) of the slice in which a system user is located, using the formula:
sliceNo = userId / (2^20)
That is, when the user ID (userId) is less than 2^20, sliceNo = 0; when the user ID is at least 2^20 and less than 2^21, sliceNo = 1; and so on.
The server can calculate the system user's offset within the corresponding slice, using the formula:
offset = userId % (2^20)
The server can determine the slice number and the offset from the two formulas above. With the slice number and offset, the Redis command above becomes: SETBIT(key+sliceNo, offset, value).
To set the state of the 100 millionth user, the command is: SETBIT(key+95, 385280, 1). This command makes the server set the state of the 100 millionth user at offset 385280 of the slice numbered 95. When only the 100 millionth user participates in the current activity and the other slices are unused, the storage occupied is approximately:
385280 ÷ 8 ÷ 1024 ≈ 47.03 (KB)
As can be seen, sharding the bitmap effectively saves bitmap storage space when the number of activity participants is small.
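The shard lookup described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the `base:sliceNo` key layout are hypothetical and do not come from the application; only the two formulas for sliceNo and offset are taken from the text.

```python
SLICE_BITS = 20                 # each shard covers 2^20 user IDs, as in the example
SLICE_SIZE = 1 << SLICE_BITS    # 1,048,576


def locate(user_id: int) -> tuple[int, int]:
    """Return (sliceNo, offset) for a user ID using integer division and modulo."""
    return user_id // SLICE_SIZE, user_id % SLICE_SIZE


def shard_key(base_key: str, slice_no: int) -> str:
    """Hypothetical key layout: the base key followed by the shard number."""
    return f"{base_key}:{slice_no}"


slice_no, offset = locate(100_000_000)
print(shard_key("activity", slice_no), offset)   # activity:95 385280
print(offset // 8 + 1)                           # bytes needed to reach that bit: 48161
```

The resulting pair would be passed to SETBIT as the key suffix and the offset; the byte count printed last is the length a string must have for that bit to exist.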
The technical solution of this application is described in detail below through specific embodiments. The following embodiments may be combined with one another, and identical or similar concepts or processes may not be repeated in some embodiments.
Figure 3 shows a schematic diagram of a scenario for issuing activity benefits according to an embodiment of this application. As shown in Figure 3, the server can be associated with third-party channels, the application's product transaction function, and customer behavior in the application. When the server detects that a user clicks to view a marketing activity on a third-party channel, the server can issue a benefit to that user. Alternatively, when the server detects that a user completes a product transaction in the application, or performs a corresponding behavioral operation in the application, the server can issue a benefit to that user. After determining that a benefit needs to be issued, the server calls the coupon issuance interface in MES-SERVICE to issue it. The server can then query the Redis cluster for the number of benefits the user has already obtained and call MES-SERVICE to verify that number. If the number is below the preset upper limit, the server can issue the benefit to the user. The server can also write the issuance record to the transaction log table of the mes database (mesdb) and update the user's obtained-benefit count in Redis. In addition, the server can use Redis synchronization rules to synchronize this information across the Redis cluster.
In this application, the server is the executing entity that performs the data storage method of the following embodiments. Specifically, the executing entity may be a hardware device of the server, a software application in the server that implements the following embodiments, a computer-readable storage medium on which such a software application is installed, or the code of such a software application.
Figure 4 shows a flowchart of a data storage method according to an embodiment of this application. Building on the embodiment shown in Figure 3, and as shown in Figure 4, with the server as the executing entity, the method of this embodiment may include the following steps:
S101. Obtain the number of system users, the estimated number of participating users for the current activity, and the allowed number of projects for the current activity. Each project of the current activity corresponds to one bitmap.
In this embodiment, the server can obtain the number of system users. For example, in a financial institution such as a bank, the number of system users can be the number of users who have opened a card with that bank. In an application, the number of system users can be the number of users who have completed registration in that application.
The server can also determine the estimated number of participating users for the current activity. Optionally, this number can be estimated by the server from the activity's historical participation. Alternatively, it can be the number of users who satisfy the participation rules of the current activity, obtained by filtering users against the activity's participation conditions.
The server can also obtain the allowed number of projects for the current activity, that is, the number of projects each participating user is allowed to take part in during the current activity. For example, the current activity may include 20 projects, and each user may participate in at most 6 of them during the activity; the allowed number of projects is then 6. As another example, the current activity includes 5 benefit issuance channels, each of which issues benefits in 6 rounds. To control the cost of the activity, each user may obtain at most 8 benefits during the activity. The allowed number of projects is then 8.
S102. When the ratio between the number of system users and the estimated number of participating users is greater than a preset threshold, determine the target number of shards for each bitmap from the number of system users, the estimated number of participating users, and the allowed number of projects.
In this embodiment, if the number of system users is large but the estimated number of participating users is small, storing the participation information of the current activity as key-value pairs saves storage space more effectively. As the estimated number of participating users grows, the storage consumed by key-value pairs grows linearly, whereas the storage required by a bitmap stays relatively fixed. Therefore, when the number of system users is unchanged and the estimated number of participating users is large, storing the participation information with a bitmap saves more storage space. The server accordingly computes the ratio as the quotient of the estimated number of participating users divided by the number of system users. With the number of system users unchanged, the larger the estimated number of participating users, the larger the ratio. The server compares this ratio with the preset threshold. When the ratio is greater than the threshold, the estimated number of participating users is large, and the server can store the participation information of the current activity in bitmaps.
In one example, when the ratio is less than or equal to the preset threshold, the estimated number of participating users is small, and the server can store the participation information of the current activity as key-value pairs. This process can be as shown in step S304 of Figure 6.
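The storage decision in S102 can be sketched as follows. This is a minimal illustration, assuming the ratio is the estimated number of participating users divided by the number of system users, as the paragraph above describes. The function name and the threshold value used in the usage lines are hypothetical; the application does not state a specific threshold.

```python
def choose_storage(system_users: int, estimated_participants: int, threshold: float) -> str:
    """Pick a storage scheme from the participation ratio (estimated / system users)."""
    ratio = estimated_participants / system_users
    # Above the threshold, many users are expected, so the fixed-size bitmap wins;
    # otherwise key-value pairs, whose cost grows with the participants, are cheaper.
    return "bitmap" if ratio > threshold else "key-value"


# Usage with an assumed threshold of 0.1 (illustrative only).
print(choose_storage(100_000_000, 30_000_000, 0.1))
print(choose_storage(100_000_000, 1_000, 0.1))
```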
Once the server has decided to store the participation information in bitmaps, the bitmaps usually contain many unused bits, because the number of system users is typically very large while the estimated number of participating users is smaller. To further improve the utilization of bitmap storage space, the server can calculate the target number of shards for each bitmap from the number of system users, the estimated number of participating users, and the allowed number of projects. The server can then shard each bitmap according to the target number of shards, thereby saving storage space. The specific process by which the server calculates the target number of shards can be as shown in Figure 5.
S103. Shard the bitmap according to the target number of shards, and use the sharded bitmap to record system users' participation information in the current activity.
In this embodiment, the server can shard each bitmap according to the target number of shards calculated in step S102. The server can store the sharded bitmap as key-value pairs. The key holds the shard's identifying information, which includes at least the bitmap number and the shard number. For example, with 3 bitmaps and a target number of shards of 10, the key with bitmap number 1 and shard number 5 identifies the fifth shard of the first bitmap. The value holds the bitmap information of that shard. The maximum storage a shard occupies is the number of system users divided by the target number of shards, in bits. The bitmap information of each shard therefore contains the participation information of (number of system users / target number of shards) users, one bit per user.
In one example, when the server needs to record a system user's participation information in the current activity, the specific steps can include:
Step 1. The server determines the number of projects the system user has participated in from the user's value in the shards of each bitmap.
In this step, the server can use the system user's user ID to determine the shard number and offset that correspond to the user in each bitmap. The user ID is a numeric code that uniquely identifies a system user. For example, with 100 million system users, the user ID of the 100 millionth system user can be 100000000. When the target number of shards is 100, each shard records the participation information of 1 million system users and therefore contains 1 million bits. The 100 millionth user then falls in the 100th shard at an offset of 1 million, that is, in the 1,000,000th bit of the 100th shard. Using the shard number and offset, the server can read the value corresponding to the system user in each bitmap and, from those values, determine how many projects the user has participated in. In a bitmap, a user's value is set to the second data when the user participates, so the server can simply count the occurrences of the second data. Because each bit can only be 0 or 1 and the second data is 1, the server can obtain the user's number of participated projects by summing the bits.
Step 2. When the number of participated projects is less than the allowed number of projects, the server changes the system user's value in a shard of one bitmap from the first data to the second data. When the number of participated projects equals the allowed number of projects, the server refuses to let the system user participate in a project.
In this step, the server can compare the system user's number of participated projects with the allowed number of projects. Because the allowed number of projects equals the number of bitmaps, the calculated number of participated projects is at most the allowed number of projects. When the two are equal, the system user has obtained all the benefits the user is allowed to obtain and has taken part in everything the current activity permits. The server therefore refuses to let the user continue participating in the current activity and issues no further benefits to the user. Otherwise, when the number of participated projects is less than the allowed number of projects, the user has not yet obtained all permitted benefits, or may still participate in other projects of the current activity. In that case the server can change the value in one bitmap from the first data to the second data.
For example, when there are 5 bitmaps and the user has obtained 2 benefits, the user's value is the second data in 2 of the 5 bitmaps. The server can then randomly select one of the other 3 bitmaps, in which the value is the first data, and change the user's value there from the first data to the second data. Each bit in a bitmap is either 0 or 1. Normally the first data is 0 and the second data is 1. The first data indicates that the user has not yet obtained the benefit, and the second data indicates that the user has obtained it.
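Steps 1 and 2 above can be sketched with an in-memory stand-in for the sharded Redis bitmaps. This is an illustration under assumptions: the class and method names are hypothetical, a Python dictionary of byte arrays replaces Redis, and the first data and second data are taken to be 0 and 1 as in the text. Bits are addressed most significant bit first, which is how Redis bitmap offsets are laid out.

```python
import random


class ShardedBitmaps:
    """In-memory stand-in for sharded bitmaps, one bitmap per project (hypothetical)."""

    def __init__(self, num_bitmaps: int, shard_size: int):
        self.num_bitmaps = num_bitmaps
        self.shard_size = shard_size
        self.shards = {}  # (bitmap_no, shard_no) -> bytearray

    def _locate(self, user_id: int):
        return user_id // self.shard_size, user_id % self.shard_size

    def get_bit(self, bitmap_no: int, user_id: int) -> int:
        shard_no, offset = self._locate(user_id)
        data = self.shards.get((bitmap_no, shard_no))
        if data is None or offset // 8 >= len(data):
            return 0  # untouched shards and bits read as the first data (0)
        return (data[offset // 8] >> (7 - offset % 8)) & 1

    def set_bit(self, bitmap_no: int, user_id: int) -> None:
        shard_no, offset = self._locate(user_id)
        data = self.shards.setdefault((bitmap_no, shard_no), bytearray())
        if offset // 8 >= len(data):
            data.extend(bytes(offset // 8 + 1 - len(data)))  # grow with zero bytes
        data[offset // 8] |= 1 << (7 - offset % 8)

    def participation_count(self, user_id: int) -> int:
        # Step 1: sum the user's bit across all bitmaps.
        return sum(self.get_bit(b, user_id) for b in range(self.num_bitmaps))

    def try_participate(self, user_id: int, allowed: int) -> bool:
        # Step 2: refuse at the limit, otherwise flip one unset bitmap at random.
        if self.participation_count(user_id) >= allowed:
            return False
        free = [b for b in range(self.num_bitmaps) if self.get_bit(b, user_id) == 0]
        if not free:
            return False
        self.set_bit(random.choice(free), user_id)
        return True


store = ShardedBitmaps(num_bitmaps=3, shard_size=1_000_000)
print([store.try_participate(42, allowed=3) for _ in range(4)])
```

In a real deployment the read-then-write sequence would need to be atomic (for example, inside a script or transaction), which this sketch does not attempt to model.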
In the data storage method provided by this application, the server obtains the number of system users, determines the estimated number of participating users for the current activity, and obtains the allowed number of projects for the current activity. Each project of the current activity corresponds to one bitmap. The server calculates the ratio between the number of system users and the estimated number of participating users and compares it with a preset threshold. When the ratio is greater than the threshold, the server stores the participation information of the current activity in bitmaps. The server calculates the target number of shards for each bitmap from the number of system users, the estimated number of participating users, and the allowed number of projects. It then shards the bitmap according to the target number of shards and uses the sharded bitmap to record system users' participation information in the current activity. By calculating the number of shards and storing bitmaps in shards, this application improves the utilization of storage space, avoids overly concentrated access to storage, and improves the efficiency of storing and reading data.
Figure 5 shows a flowchart of a data storage method according to an embodiment of this application. Building on the embodiments shown in Figures 3 and 4, and as shown in Figure 5, with the server as the executing entity, the calculation of the target number of shards in this embodiment may include the following steps:
S201. Determine the range of the number of shards from the estimated number of participating users.
In this embodiment, the server can determine the maximum number of shards from the estimated number of participating users: the maximum equals the estimated number of participating users. For example, with 100 system users and an estimated 10 participating users, the maximum number of shards is 10. The minimum number of shards is 1. A bitmap with 1 shard is simply an unsharded bitmap. With 100 system users and an estimated 10 participating users, in the worst case the participating users are evenly distributed among all system users. With 10 shards, each shard then contains 1 participating user, so all 10 shards are used. With 20 shards, because there are only 10 participating users in total, still only 10 shards are used. In addition to the storage occupied by the shard's bits, each shard of a bitmap also corresponds to a key, and the space occupied by keys grows with the number of shards. Therefore, the lower limit of the shard count range is 1 and the upper limit is the estimated number of participating users.
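The worst-case argument above can be checked with a small sketch. It assumes 0-based user IDs, equally sized shards, and participants spread evenly across all system users; the function name is hypothetical.

```python
import math


def used_shards(system_users: int, participants: int, shard_count: int) -> int:
    """Count the shards touched when participants are spread evenly (worst case)."""
    shard_size = math.ceil(system_users / shard_count)
    spacing = system_users / participants
    user_ids = [int(k * spacing) for k in range(participants)]  # 0, 10, 20, ...
    return len({uid // shard_size for uid in user_ids})


for t in (1, 5, 10, 20):
    print(t, used_shards(100, 10, t))
```

With 100 system users and 10 participants, the number of used shards stops growing at 10, while every additional shard beyond that would still cost a key.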
S202. For each shard count in the range, calculate the corresponding bitmap space usage.
In this step, after determining the range of the number of shards, the server can use each shard count in the range, together with the number of system users, the estimated number of participating users, and the allowed number of projects, to calculate the bitmap space usage that corresponds to that shard count. For example, when the lower limit of the number of shards is 1 and the upper limit is 1000, the server can calculate the bitmap space usage for shard counts of 1, 2, ..., 1000. The calculated data can be as shown in Table 1, which enumerates the bitmap space usage (Mem) for each shard count when the number of system users is unchanged. The unit of Mem is bytes.
Table 1
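The enumeration in S202 can be sketched as a loop over candidate shard counts. This is an illustration only: `toy_cost` is a made-up placeholder, not the application's space formula (which appears in figures not reproduced here), and selecting the candidate with the smallest usage is an assumption about how the optimal shard count would be picked.

```python
from typing import Callable, Dict


def enumerate_costs(max_shards: int, cost: Callable[[int], float]) -> Dict[int, float]:
    """Evaluate a space-usage function for every shard count from 1 to max_shards."""
    return {t: cost(t) for t in range(1, max_shards + 1)}


def toy_cost(t: int) -> float:
    """Hypothetical stand-in: per-shard key overhead plus value space shrinking with t."""
    return 50 * t + 1_000_000 / t


table = enumerate_costs(1000, toy_cost)
best = min(table, key=table.get)   # candidate with the smallest estimated usage
print(best, round(table[best], 2))
```

Because the cost function is passed in, the same loop would work unchanged once the real Mem formula is substituted for the placeholder.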
Figure PCTCN2022130421-appb-000001
In one example, to improve calculation efficiency, the server can precompute a shard count lookup table for different numbers of system users, different estimated numbers of participating users, and different shard counts. When the number of system users is unchanged, the lookup table can be as shown in Table 2, where U is the estimated number of participating users, t is the number of shards, and Mem (byte) is the bitmap space usage in bytes.
Table 2
Figure PCTCN2022130421-appb-000002
Figure PCTCN2022130421-appb-000003
In one example, after determining the number of system users, the estimated number of participating users, the allowed number of projects, and the number of shards, the server can calculate the bitmap space usage as follows:
Step 1. The server determines the total key length of each bitmap from the allowed number of projects, the number of shards, and the fixed key length. Each shard corresponds to one key.
In this step, a sharded bitmap stores two parts in Redis, the key and the value. The storage these two parts occupy is denoted Mem(key) and Mem(value). The value holds the bitmap information of each shard, and each shard corresponds to one key. As the number of shards increases, the storage occupied by keys therefore increases linearly. That is, in Mem(key), the occupied storage is proportional to the number of shards.
The storage space occupied by keys is:
Mem(key) = storage space occupied by each key × number of shards
In the Redis bitmap structure, a key consists mainly of the system number, the activity number, the bitmap number, and the shard number. The bitmap number is determined by the allowed number of projects. The shard number is determined by the target number of shards. The system number is the system number of the server. The activity number is the number of the current activity and uniquely identifies it.
The storage space occupied by a single key is:
Mem(key) = SDS(9) + system number + activity number + bitmap number + shard number
The unit can be bytes. Because the system number and the activity number are fixed for the current activity, the three parts SDS(9), system number, and activity number can be treated as a constant in the formula above. SDS(9) is the fixed storage space the key requires. The constant can be written as:
Mem_key = SDS(9) + system number + activity number
The storage space occupied by the keys of all shards of all bitmaps can be expressed as:
Figure PCTCN2022130421-appb-000004
Here, Mem(Ct_key) is the total key length of all shards of each bitmap, Mem_key is the constant storage space required per shard, and t is the number of shards.
Figure PCTCN2022130421-appb-000005
denotes the storage space occupied by all shard numbers in one bitmap. For example, with 10 shards, t = 10 and the shard numbers used are 1, 2, ..., 10. Because data can be stored in binary, the server can use log2(i) + 1 to determine the number of bits required for shard number i. For example, when i = 3, log2(i) + 1 = 2.58; rounding down shows that 2 bits are needed to store shard number 3. When i = 9, log2(i) + 1 = 4.17; rounding down shows that 4 bits are needed to store shard number 9. Calculating this for every shard number and summing the results shows that, when t = 10, the shard numbers occupy 29 bits. This calculation omits the space occupied by the bitmap number. In practice, to control costs, the allowed number of projects is usually small, for example 3, 5, or 6. The storage occupied by the bitmap numbers of all bitmaps is therefore negligible compared with the total storage occupied by the bitmaps. For example, with 100 million system users, one bitmap alone occupies about 12 MB, whereas with 10 bitmaps the bitmap numbers occupy only 29 bits. For ease of calculation, this application therefore omits the bitmap number from the calculation.
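The bit count for shard numbers can be reproduced with a short sketch. It assumes the rounding described in the text means rounding log2(i) + 1 down to an integer, which for positive integers equals Python's `int.bit_length()`. The function names are hypothetical.

```python
def shard_number_bits(i: int) -> int:
    """Bits needed to store shard number i in binary: floor(log2(i)) + 1 for i >= 1."""
    return i.bit_length()


def total_shard_number_bits(t: int) -> int:
    """Sum of the bits needed for shard numbers 1..t in one bitmap."""
    return sum(shard_number_bits(i) for i in range(1, t + 1))


print(shard_number_bits(3), shard_number_bits(9))   # 2 4
print(total_shard_number_bits(10))                  # 29
```

The per-number widths for 1 through 10 are 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, which add up to the 29 bits quoted in the text.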
Step 2. The server determines the space occupied by all shards of each bitmap from the average shard length, the probability that a shard occupies fifty percent of its storage space, the average storage space saved per shard, and the number of shards.
In this step, the storage occupied by values can be estimated using probability. For example, the probability that 1 user occupies half the storage space of a bitmap of size m is 50%, whereas with m/2 users the probability of occupying half of the bitmap's memory is 100%. The actual number of users is thus directly proportional to the probability of occupying 50% of a shard's storage space. That is, in Mem(value), the occupied storage is inversely proportional to the number of shards. The value of a Redis bitmap records whether a benefit has been issued to each user: each position in the bitmap represents one user, 1 meaning the benefit has been issued and 0 meaning it has not. The storage space occupied by values is:
Figure PCTCN2022130421-appb-000006
The storage space occupied by a single value can be expressed as:
Mem(value) = SDS(25) + maximum bitmap offset
The unit can be bytes. SDS(25) is the fixed storage space the value requires. The maximum bitmap offset is the offset of the highest-numbered system user that may appear in the shard. The storage space occupied by the values of all shards of all bitmaps can be expressed as:
Figure PCTCN2022130421-appb-000007
Here, Mem(Ct_value) is the space occupied by all shards of each bitmap, S is the number of system users, U is the estimated number of participating users, t is the number of shards, and e is the maximum length of each shard. The formula for e can be:
Figure PCTCN2022130421-appb-000008
From the value of e, the server can determine the average maximum memory occupied by each shard, denoted Mem(e). For example, when the maximum length of each shard is 100, Mem(e) is 100 bits. Here,
Figure PCTCN2022130421-appb-000009
denotes the probability of occupying 50% of a shard's storage space. For example, the probability that 1 user appears at any given position of a shard is 1/e. Taking the middle of the shard as the boundary, the probability that the user appears in the first 50% of the shard's storage space and the probability that the user appears in the last 50% are both 50%. When the number of users participating in the current activity is e/2, the probability of occupying 50% of the shard's storage space is 100%.
Here,
Figure PCTCN2022130421-appb-000010
is the average reduction in wasted space for each additional shard. In the worst case of storage usage, the U participating system users are evenly distributed among the S system users. The average interval between two participating system users is then
Figure PCTCN2022130421-appb-000011
Therefore, each additional shard will very likely free up
Figure PCTCN2022130421-appb-000012
of storage space on the new shard. Alternatively, the storage space that each additional shard may free up can be regarded as lying between 0 and
Figure PCTCN2022130421-appb-000013
To cover possible extreme cases, this embodiment takes the median
Figure PCTCN2022130421-appb-000014
as the storage space that may be freed. Take the bitmap shown in Table 3 as an example. The bitmap contains 20 bits, corresponding to 20 system users. The 1st, 3rd, 9th, 12th, and 19th system users have obtained the benefit, and the remaining system users have not. Dividing this bitmap into 4 shards yields the four shards shown in Table 4.
Table 3
Figure PCTCN2022130421-appb-000015
Table 4
Figure PCTCN2022130421-appb-000016
The gray cells in Table 4 show the space that each shard can save. The 1 bit of storage corresponding to the last, dark gray cell in the bitmap of Table 3 can already be saved in the unsharded bitmap. Therefore, in the fourth shard shown in Table 4, the 1 bit corresponding to the dark gray cell does not count as a storage optimization brought about by sharding. Accordingly, in the actual calculation, the average storage space saved per shard is multiplied by t-1, which gives the storage optimization that sharding the bitmap brings to the first t-1 shards.
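The Table 3 and Table 4 example can be checked numerically. The following is a minimal sketch, not the patent's own calculation. It assumes 1-based user positions, equal-length shards, and that a bitmap only needs to store bits up to its highest set bit (the trailing zeros are the gray cells). The helper names `used_bits` and `split` are hypothetical, and the per-shard key overhead is ignored.

```python
def used_bits(bits):
    """Bits needed to store `bits` when trailing zeros are not stored."""
    for i in range(len(bits) - 1, -1, -1):
        if bits[i]:
            return i + 1
    return 0


def split(bits, shard_count):
    """Split a bitmap into `shard_count` equal-length shards."""
    size = len(bits) // shard_count
    return [bits[i * size:(i + 1) * size] for i in range(shard_count)]


# 20 system users; users 1, 3, 9, 12 and 19 (1-based) have obtained the benefit.
TOTAL_USERS = 20
bitmap = [1 if pos in {1, 3, 9, 12, 19} else 0 for pos in range(1, TOTAL_USERS + 1)]

shards = split(bitmap, 4)
unsharded_bits = used_bits(bitmap)                 # storage without sharding
sharded_bits = sum(used_bits(s) for s in shards)   # storage with 4 shards
saved_per_shard = [len(s) - used_bits(s) for s in shards]

print(unsharded_bits, sharded_bits, saved_per_shard)
```

Under these assumptions the saving in the last shard is already present in the unsharded bitmap, so the gain attributable to sharding equals the savings of the first t-1 shards only, which is the reason for the t-1 factor described above.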
Step 3. The server determines the bitmap space occupancy based on the total key length of each bitmap, the space occupied by all shards of each bitmap, and the number of projects allowed to participate.
In this step, after determining the total key length Mem(Ct_key) of each bitmap and the space occupied by all shards of each bitmap Mem(Ct_value) according to the above steps, the server can determine the bitmap space occupancy based on Mem(Ct_key), Mem(Ct_value), and the number of projects allowed to participate. The formula can be:
Mem(Ct) = [Mem(Ct_key) + Mem(Ct_value)] × N
where N is the number of bitmaps, that is, the number of projects allowed to participate.
S203. Compare the bitmap space occupancy corresponding to each number of shards, and take the number of shards with the minimum bitmap space occupancy as the target number of shards.
In this step, after determining the bitmap space occupancy corresponding to each number of shards according to the above steps, the server can sort the candidates by bitmap space occupancy. The server can select the number of shards with the minimum bitmap space occupancy as the target number of shards. The server can then shard the bitmap using this number of shards and record the participation information of the current activity in the sharded bitmap.
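The selection in S203 can be sketched as a search over the candidate range. In this sketch, `choose_target_shard_count` and `toy_footprint` are hypothetical names, and `toy_footprint` is an invented stand-in cost (key cost growing with the number of shards, stored bits shrinking with it). It is not the Mem(Ct) formula of this application, whose terms are given in the figures.

```python
def choose_target_shard_count(max_shards, footprint_for):
    """Return (shard_count, footprint) with the smallest footprint in 1..max_shards."""
    candidates = range(1, max_shards + 1)
    best = min(candidates, key=footprint_for)  # first minimum wins on ties
    return best, footprint_for(best)


def toy_footprint(shard_count, key_bits=320, total_bits=8000, n_bitmaps=3):
    """Illustrative cost only: NOT the patent's formula."""
    return (key_bits * shard_count + total_bits / shard_count) * n_bitmaps


target, footprint = choose_target_shard_count(10, toy_footprint)
print(target, footprint)
```

Taking the minimum directly gives the same result as sorting all candidates and picking the first entry, without materializing the sorted list.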
With the data storage method provided by this application, the server can determine the maximum number of shards from the estimated number of participating users, and thereby the range of the number of shards. For each number of shards in this range, the server can calculate the corresponding bitmap space occupancy from that number together with the number of system users, the estimated number of participating users, and the number of projects allowed to participate. The server can sort the candidates by bitmap space occupancy and select the number of shards with the minimum bitmap space occupancy as the target number of shards. By calculating the bitmap space occupancy for every possible number of shards, this application finds the optimal number of shards, which improves space utilization when storing the participation information of the current activity. Sharding also improves the efficiency of storing that participation information.
Figure 6 shows a flow chart of a data storage method provided by an embodiment of the present application. Building on the embodiments shown in Figures 3 to 5, and as shown in Figure 6, with the server as the executing entity, the method of this embodiment may include the following steps:
S301. Obtain the number of system users, the estimated number of participating users in the current activity, and the number of projects allowed to participate in the current activity. Each project of the current activity corresponds to one bitmap.
S302. Determine the target number of shards for each bitmap based on the number of system users, the estimated number of participating users, and the number of projects allowed to participate.
In this embodiment, steps S301 and S302 are implemented similarly to steps S101 and S102 in the embodiment of Figure 2 and are not described again here. As shown in Figure 7, these steps correspond to S401. The server can obtain the estimated number of participating users U and the number of projects allowed to participate N for the current activity.
S303. Determine the bitmap space occupancy based on the number of system users, the estimated number of participating users, the number of projects allowed to participate, and the target number of shards.
In this embodiment, as in the embodiment shown in Figure 5, after determining the target number of shards, the server can calculate the minimum bitmap space occupancy based on the number of system users, the estimated number of participating users, the number of projects allowed to participate, and the target number of shards. This step corresponds to S403 in Figure 7.
S304. Determine the key-value pair space occupancy based on the number of system users, the estimated number of participating users, and the number of projects allowed to participate.
In this embodiment, the server can use the structural information of the key-value structure to calculate the key-value pair space occupancy. This step corresponds to S402 in Figure 7. The execution order of S303 and S304, and likewise of S402 and S403, is not limited by the flow chart and can be swapped freely.
In a key-value pair, the key includes the system number, the activity number, and the user ID. With key-value storage, the server can create a key-value pair when a system user participates in the current activity and use that pair to record the system user's participation information in the current activity. Therefore, apart from the system number that identifies the server and the activity number that identifies the current activity, the key only needs to store the user ID. The user ID is the code used in the server to uniquely identify the system user. For example, with 100 million system users, the user ID of the 100-millionth user can be 100000000. The value stores the number of benefits the user has obtained or the number of projects the user has participated in. For example, when the user has obtained 3 benefits, the value is 3. Likewise, when the user has participated in 6 projects, the value is 6.
For example, the key-value data can be:
key=6069:act01:openIdA; value=1
key=6069:act01:openIdB; value=3
key=6069:act01:openIdC; value=2
Similar to the calculation of the bitmap space occupancy, the calculation of the key-value pair space occupancy also requires computing the storage space of the key and of the value separately. The storage space occupied by the key can be calculated as:
Mem(key) = SDS(9) + system number + activity number + user serial number
The unit is bytes. When the number of system users is 100 million, the key of one user occupies approximately 20 bytes. For each user, the storage used by the key is essentially fixed. Therefore, the server can determine the total storage required by the keys in the current activity from the estimated number of participating users and the storage used by one key.
The storage space occupied by the value can be calculated as:
Mem(value) = SDS(25) + log2(n)
The unit is bytes. Here, n is the number of projects each system user is allowed to participate in, or the number of benefits each system user can obtain, under the activity participation rules. log2(n) gives the number of bits of the number. The number in the value is cumulative, that is, the value stores only one number at any time. Therefore, the value needs the most storage when the number it holds is n, and log2(n) is the number of bits needed to store n. Because the number of projects allowed to participate is generally a single digit, the log2(n) term can be simplified directly to 1 byte. The formula for the storage space occupied by the value then simplifies to:
Mem(value) = SDS(25) + 1 byte
From Mem(key) and Mem(value) above, the server can calculate the key-value pair space occupancy Mem(kv). The formula can be:
Mem(kv) = [Mem(key) + Mem(value)] × U
where U is the estimated number of participating users, that is, the maximum number of participants in the activity. SDS(25) is the fixed constant amount of memory that redis occupies when storing data in the key-value structure; SDS(25) corresponds to 25 bytes.
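As a worked illustration of the formulas above, the following sketch computes Mem(value) and Mem(kv). It takes the key size as a parameter and uses the approximate 20 bytes stated in the text rather than deriving it. The function names and the figure of one million participants are assumptions made only for this example.

```python
SDS_VALUE_OVERHEAD_BYTES = 25  # SDS(25), stated in the text as 25 bytes


def counter_bits(n):
    """Number of bits needed to store the counter's largest value n."""
    return n.bit_length()


def mem_value(counter_bytes=1):
    """Simplified Mem(value) = SDS(25) + 1 byte."""
    return SDS_VALUE_OVERHEAD_BYTES + counter_bytes


def mem_kv(key_bytes, participants, counter_bytes=1):
    """Mem(kv) = [Mem(key) + Mem(value)] x U, in bytes."""
    return (key_bytes + mem_value(counter_bytes)) * participants


# Assumed inputs: ~20-byte key (from the text), U = 1,000,000 (illustrative).
footprint_bytes = mem_kv(key_bytes=20, participants=1_000_000)
print(counter_bits(6), mem_value(), footprint_bytes)
```

With a single-digit limit such as n = 6 the counter needs only a few bits, which is why rounding the term up to 1 byte is a safe simplification.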
S305. When the key-value pair space occupancy is less than the bitmap space occupancy, use key-value pairs to record the system user's participation information in the current activity.
In this embodiment, the server can compare the bitmap space occupancy calculated in step S303 with the key-value pair space occupancy calculated in step S304. This step corresponds to S404 in Figure 7. When the key-value pair space occupancy is less than the bitmap space occupancy, the server determines to use key-value pairs to record the system user's participation information in the current activity; that is, the server proceeds to S414 in Figure 7. Otherwise, when the key-value pair space occupancy is greater than or equal to the bitmap space occupancy, the server can proceed to step S306, that is, to S405 in Figure 7.
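The comparison in S305 reduces to a single strict inequality; the sketch below makes the tie-breaking explicit. The function name and the returned labels are hypothetical.

```python
def choose_storage(kv_bytes, bitmap_bytes):
    """Pick key-value storage only when it is strictly smaller; ties go to the bitmap."""
    return "key-value" if kv_bytes < bitmap_bytes else "bitmap"


print(choose_storage(46_000_000, 60_000_000))
print(choose_storage(46_000_000, 46_000_000))
```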
In one example, when the server determines that key-value pairs are to be used to record the system user's participation information in the current activity, the recording process may include the following steps:
Step 1. The server obtains the system user's number of participating projects recorded in the key-value pair corresponding to the system user.
In this step, the server can search redis for the key corresponding to the system user directly by the system user's user ID. After locating the key, the server can read the value of the key-value pair. The value can include the system user's number of participating projects, which is the number of benefits already obtained. This step can be as shown in step S415 in Figure 7. The instruction with which the server obtains the system user's number of participating projects can be:
Value=GET(6069:act01:openIdA);
Step 2. When the number of participating projects is less than the number of projects allowed to participate, the server increases the system user's number of participating projects by the unit value.
In this step, the server can compare the number of participating projects with the number of projects allowed to participate. If the number of projects the system user has already participated in is less than the allowed number, the server can allow the system user to participate in the project and increase the system user's number of participating projects by the unit value. The unit value can be 1. For example, when a system user has obtained 3 benefits and the system allows a user to obtain at most 6, the server can determine that the user can still obtain benefits. The server can issue the benefit to the user and update the record in redis of the number of benefits the user has obtained. This step corresponds to S417 in Figure 7. The instruction with which the server writes the number of participating projects (number of benefits) back to redis can be:
SET(6069:act01:openIdA,value+1);
Step 3. When the number of participating projects is greater than or equal to the number of projects allowed to participate, the server refuses to let the system user participate in the project.
In this step, if the number of projects the system user has already participated in has reached the allowed number, the server determines that the system user has completed participation in the current activity. The server can refuse to let the user continue participating in the current activity. For example, when a system user has obtained 6 benefits and the system allows a user to obtain at most 6, the server can determine that the user has obtained all benefits. When the user requests a benefit again, the server can refuse the request. This step corresponds to the operation when S416 in Figure 7 is determined to be negative.
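Steps 1 to 3 above can be sketched as follows. To keep the example self-contained, `FakeStore` is a hypothetical in-memory stand-in that mimics only the GET and SET commands; it is not a Redis client, and `record_with_kv` is an illustrative name.

```python
class FakeStore:
    """Tiny in-memory stand-in for GET/SET (illustration only, not a Redis client)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def record_with_kv(store, key, allowed):
    """Return True and bump the counter if the user may still participate."""
    current = int(store.get(key) or 0)   # step 1: read the recorded count
    if current >= allowed:               # step 3: limit reached, refuse
        return False
    store.set(key, current + 1)          # step 2: write back count + 1
    return True


store = FakeStore()
key = "6069:act01:openIdA"
results = [record_with_kv(store, key, allowed=3) for _ in range(4)]
print(results, store.get(key))
```

Note that a separate read followed by a write is not atomic. Against a real Redis server, two concurrent requests could both pass the check; an atomic command such as `INCR`, or a server-side script, is one possible way to close that gap, though the text does not specify one.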
S306. Shard the bitmap according to the target number of shards, and use the sharded bitmap to record the system user's participation information in the current activity.
Step S306 is implemented similarly to step S103 in the embodiment of Figure 2 and is not described again here. The specific implementation of S306 can follow steps S405 to S413 in Figure 7. The server can determine the system user's shard number and offset from the user ID. From the shard number and offset, the server can determine the value corresponding to the system user in each bitmap. The server can AND these values together. A result of 1 means the values in all of these bitmaps are 1. When the system user's value in every bitmap is 1, the system user has obtained all obtainable benefits, or has participated in all projects open to participation. Otherwise, a result of 0 means at least one of these bitmaps holds a 0, so the system user can still obtain benefits or participate in activity projects. When the user obtains a benefit or participates in an activity project, the server can set the value to 1 in one of the bitmaps where it is 0.
The storage structure of the sharded bitmaps can be:
Bitmap 1:
Bitmap shard 1: key=6069:bitmap:act01:1:sliceNo; value=1 0 1 0 0 0 0 0…
Bitmap shard 2: key=6069:bitmap:act01:1:sliceNo; value=1 0 1 0 0 0 0 0…
Bitmap shard M: key=6069:bitmap:act01:1:sliceNo; value=1 0 1 0 0 0 0 0…
Bitmap 2:
Bitmap shard 1: key=6069:bitmap:act01:2:sliceNo; value=1 0 1 0 0 0 0 0…
Bitmap shard 2: key=6069:bitmap:act01:2:sliceNo; value=1 0 1 0 0 0 0 0…
Bitmap shard M: key=6069:bitmap:act01:2:sliceNo; value=1 0 1 0 0 0 0 0…
When the user ID is X, the shard number SLICE NO can be calculated as:
Figure PCTCN2022130421-appb-000017
The offset of the system user within that shard can be calculated as:
offset=X%e
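The mapping from a user ID to a position can be sketched in a few lines. This sketch assumes a non-negative numeric user ID X and a shard length e, and it assumes the shard number is the integer quotient X/e, which is how the GETBIT instructions in this embodiment form the key; the function name `locate` is hypothetical.

```python
def locate(user_id, shard_length):
    """Return (shard_number, offset) for a non-negative numeric user ID."""
    return divmod(user_id, shard_length)


print(locate(12, 5))
print(locate(4, 5))
```

Both results are zero-based: with a shard length of 5, user ID 12 falls in shard 2 at offset 2, while user ID 4 falls in shard 0 at offset 4.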
After determining the shard number and offset, the server can obtain the system user's value in the shards of the N bitmaps. The instructions can be:
value1=GETBIT("6069:bitmap:act01:1:(X/e)",(X%e));
value2=GETBIT("6069:bitmap:act01:2:(X/e)",(X%e));
value3=GETBIT("6069:bitmap:act01:3:(X/e)",(X%e));
valueN=GETBIT("6069:bitmap:act01:N:(X/e)",(X%e));
The server can calculate the system user's number of participating projects from the value in each bitmap. The formula can be:
count=value1+value2+value3…+valueN
The server can also AND all of the values together to determine more quickly whether the user's number of participating projects has reached the upper limit. The formula can be:
result=value1&value2&value3&…&valueN
When result is 1, the system user's number of participating projects has reached the upper limit. Otherwise, it has not yet reached the upper limit. The decision logic can be:
Figure PCTCN2022130421-appb-000018
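The bitmap-based check described above can be sketched as follows. This is an illustration under stated assumptions, not a transcription of the figure: `FakeBitmapStore` is a hypothetical in-memory stand-in for the GETBIT and SETBIT commands, `shard_key` and `record_with_bitmap` are illustrative names, and the shard number is assumed to be the integer quotient X/e.

```python
class FakeBitmapStore:
    """In-memory stand-in for GETBIT/SETBIT (illustration only, not a Redis client)."""

    def __init__(self):
        self._bits = set()

    def getbit(self, key, offset):
        return 1 if (key, offset) in self._bits else 0

    def setbit(self, key, offset, value):
        if value:
            self._bits.add((key, offset))
        else:
            self._bits.discard((key, offset))


def shard_key(bitmap_no, shard_no):
    """Key layout taken from the example keys: system:bitmap:activity:bitmapNo:sliceNo."""
    return f"6069:bitmap:act01:{bitmap_no}:{shard_no}"


def record_with_bitmap(store, user_id, shard_length, n_bitmaps):
    """Return True and set one free bit if the user may still participate."""
    shard_no, offset = divmod(user_id, shard_length)
    values = [store.getbit(shard_key(i, shard_no), offset)
              for i in range(1, n_bitmaps + 1)]
    if all(values):                      # same as value1 & ... & valueN == 1
        return False
    first_free = values.index(0) + 1     # first bitmap whose bit is still 0
    store.setbit(shard_key(first_free, shard_no), offset, 1)
    return True


store = FakeBitmapStore()
results = [record_with_bitmap(store, 12, 5, 3) for _ in range(4)]
count = sum(store.getbit(shard_key(i, 2), 2) for i in (1, 2, 3))
print(results, count)
```

Choosing the first bitmap whose bit is 0 is one possible policy; the text only requires that the bit be set in one of the bitmaps where it is still 0.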
With the data storage method provided by this application, the server can obtain the number of system users, the estimated number of participating users in the current activity, and the number of projects allowed to participate in the current activity. The server can determine the target number of shards for each bitmap based on the number of system users, the estimated number of participating users, and the number of projects allowed to participate. After determining the target number of shards, the server can calculate the minimum bitmap space occupancy based on the number of system users, the estimated number of participating users, the number of projects allowed to participate, and the target number of shards. The server can use the structural information of the key-value structure to calculate the key-value pair space occupancy. The server can then compare the key-value pair space occupancy with the bitmap space occupancy. When the key-value pair space occupancy is less than the bitmap space occupancy, the server determines to use key-value pairs to record the system user's participation information in the current activity. When the key-value pair space occupancy is greater than or equal to the bitmap space occupancy, the server can shard the bitmap according to the target number of shards and use the sharded bitmap to record the system user's participation information in the current activity. By comparing the key-value pair space occupancy with the bitmap space occupancy, this application selects the optimal storage method for the participation information of the current activity for the given number of system users, estimated number of participating users, and number of projects allowed to participate, which improves the storage efficiency and storage space utilization of the participation information.
Figure 8 shows a schematic structural diagram of a data storage device provided by an embodiment of the present application. As shown in Figure 8, the data storage device 10 of this embodiment is used to implement the operations corresponding to the server in any of the above method embodiments. The data storage device 10 of this embodiment includes:
An acquisition module 11, configured to obtain the number of system users, the estimated number of participating users in the current activity, and the number of projects allowed to participate in the current activity. Each project of the current activity corresponds to one bitmap.
A processing module 12, configured to determine the target number of shards for each bitmap based on the number of system users, the estimated number of participating users, and the number of projects allowed to participate when the ratio of the number of system users to the estimated number of participating users is greater than a preset threshold; and to shard the bitmap according to the target number of shards and use the sharded bitmap to record the system user's participation information in the current activity.
Optionally, the processing module 12 is specifically configured to:
Determine the range of the number of shards based on the estimated number of participating users.
For each number of shards within the range, calculate the corresponding bitmap space occupancy.
Compare the bitmap space occupancy corresponding to each number of shards, and take the number of shards with the minimum bitmap space occupancy as the target number of shards.
Optionally, the processing module 12 is specifically configured to:
Determine the bitmap space occupancy corresponding to a number of shards based on the number of system users, the estimated number of participating users, the number of projects allowed to participate, and the number of shards.
Optionally, the processing module 12 is specifically configured to:
Determine the total key length of each bitmap based on the number of projects allowed to participate, the number of shards, and the fixed key length. Each shard corresponds to one key.
Determine the space occupied by all shards of each bitmap based on the average shard length, the probability that each shard occupies fifty percent of its storage space, the average storage space saved per shard, and the number of shards.
Determine the bitmap space occupancy based on the total key length of each bitmap, the space occupied by all shards of each bitmap, and the number of projects allowed to participate.
Optionally, the processing module 12 is specifically configured to:
Determine the system user's number of participating projects based on the system user's value in the shard of each bitmap.
When the number of participating projects is less than the number of projects allowed to participate, modify the system user's value in the shard of one bitmap from the first data to the second data.
When the number of participating projects is greater than or equal to the number of projects allowed to participate, refuse to let the system user continue participating in the current activity.
Optionally, before sharding the bitmap according to the target number of shards and using the sharded bitmap to record the system user's participation information in the current activity, the processing module 12 is further configured to:
Determine the bitmap space occupancy based on the number of system users, the estimated number of participating users, the number of projects allowed to participate, and the target number of shards.
Determine the key-value pair space occupancy based on the number of system users, the estimated number of participating users, and the number of projects allowed to participate.
When the key-value pair space occupancy is less than the bitmap space occupancy, use key-value pairs to record the system user's participation information in the current activity.
Optionally, the processing module 12 is further configured to:
Obtain the system user's number of participating projects recorded in the key-value pair corresponding to the system user.
When the number of participating projects is less than the number of projects allowed to participate, increase the system user's number of participating projects by the unit value.
When the number of participating projects is greater than or equal to the number of projects allowed to participate, refuse to let the system user continue participating in the current activity.
本申请实施例提供的数据存储装置10,可执行上述方法实施例,其具体实现原理和技术效果,可参见上述方法实施例,本实施例此处不再赘述。The data storage device 10 provided by the embodiments of the present application can execute the above method embodiments. For its specific implementation principles and technical effects, please refer to the above method embodiments. This embodiment will not be described again here.
图9示出了本申请实施例提供的一种服务器的硬件结构示意图。如图9所示,该服务器20,用于实现上述任一方法实施例中对应于服务器的操作,本实施例的服务器20可以包括:存储器21和处理器22。Figure 9 shows a schematic diagram of the hardware structure of a server provided by an embodiment of the present application. As shown in FIG. 9 , the server 20 is used to implement operations corresponding to the server in any of the above method embodiments. The server 20 in this embodiment may include: a memory 21 and a processor 22 .
存储器21,用于存储计算机程序。该存储器21可能包含高速随机存取存储器(Random Access Memory,RAM),也可能还包括非易失性存储(Non-Volatile Memory,NVM),例如至少一个磁盘存储器,还可以为U盘、移动硬盘、只读存储器、磁盘或光盘等。 Memory 21 is used to store computer programs. The memory 21 may include high-speed random access memory (Random Access Memory, RAM), and may also include non-volatile memory (Non-Volatile Memory, NVM), such as at least one disk memory, and may also be a U disk or a mobile hard disk. , read-only memory, magnetic disk or optical disk, etc.
处理器22,用于执行存储器存储的计算机程序,以实现上述实施例中的数据存储方法。具体可以参见前述方法实施例中的相关描述。该处理器22可以是中央处理单元(Central Processing Unit,CPU),还可以是其他通用处理器、数量字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合发明所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。The processor 22 is configured to execute the computer program stored in the memory to implement the data storage method in the above embodiment. For details, please refer to the relevant descriptions in the foregoing method embodiments. The processor 22 can be a central processing unit (Central Processing Unit, CPU), or other general-purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), etc. . A general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc. The steps of the method disclosed in conjunction with the invention can be directly embodied and executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
Optionally, the memory 21 may be a standalone device or may be integrated with the processor 22.
When the memory 21 is a device separate from the processor 22, the server 20 may further include a bus 23. The bus 23 connects the memory 21 and the processor 22. The bus 23 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. Buses may be classified as address buses, data buses, control buses, and so on. For ease of illustration, the bus in the drawings of the present application is not limited to a single bus or a single type of bus.
The server provided in this embodiment may be used to execute the data storage method described above. Its implementation and technical effects are similar and are not repeated here.
The present application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the methods provided by the various embodiments described above.
The computer-readable storage medium may be a computer storage medium or a communication medium. Communication media include any medium that facilitates the transfer of a computer program from one place to another. A computer storage medium may be any available medium accessible to a general-purpose or special-purpose computer. For example, the computer-readable storage medium is coupled to a processor so that the processor can read information from, and write information to, the computer-readable storage medium. The computer-readable storage medium may also be an integral part of the processor. The processor and the computer-readable storage medium may reside in an application-specific integrated circuit (ASIC), and the ASIC may reside in user equipment. The processor and the computer-readable storage medium may also exist as discrete components in a communication device.
Specifically, the computer-readable storage medium may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc. The storage medium may be any available medium accessible to a general-purpose or special-purpose computer.
The present application further provides a computer program product. The computer program product includes a computer program stored in a computer-readable storage medium. At least one processor of a device can read the computer program from the computer-readable storage medium, and the at least one processor executes the computer program so that the device carries out the methods provided by the various embodiments described above.
An embodiment of the present application further provides a chip. The chip includes a memory and a processor. The memory is configured to store a computer program, and the processor is configured to call and run the computer program from the memory, so that a device on which the chip is installed performs the methods of the various possible implementations described above.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into modules is only a division by logical function. In an actual implementation there may be other divisions: for example, multiple modules may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or modules, and may be electrical, mechanical, or of another form.
The modules may be physically separate, for example installed at different locations of one device, installed on different devices, distributed across multiple network units, or distributed across multiple processors. The modules may also be integrated, for example installed in the same device or integrated into one code base. Each module may exist as hardware, as software, or as a combination of software and hardware. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
When the integrated modules are implemented as software functional modules, they may be stored in a computer-readable storage medium. Such a software functional module is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some of the steps of the methods of the embodiments of the present application.
It should be understood that although the steps in the flowcharts of the above embodiments are shown in the sequence indicated by the arrows, they are not necessarily executed in that sequence. Unless explicitly stated herein, there is no strict order for executing these steps, and they may be executed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times. They also need not be executed sequentially, and may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, and some or all of their technical features may be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

  1. A data storage method, characterized in that the method comprises:
    obtaining the number of system users, the estimated number of participating users of a current activity, and the number of allowed participating projects of the current activity, wherein each project of the current activity corresponds to one bitmap;
    when the ratio of the number of system users to the estimated number of participating users is greater than a preset threshold, determining a target number of shards for each bitmap according to the number of system users, the estimated number of participating users, and the number of allowed participating projects;
    sharding the bitmap according to the target number of shards, and using the sharded bitmap to record participation information of the system users in the current activity.
  2. The method according to claim 1, characterized in that determining the target number of shards for each bitmap according to the number of system users, the estimated number of participating users, and the number of allowed participating projects specifically comprises:
    determining a range of the number of shards according to the estimated number of participating users;
    calculating, for each number of shards within the range, the corresponding bitmap space occupancy;
    comparing the bitmap space occupancy corresponding to each number of shards, and determining the number of shards corresponding to the minimum bitmap space occupancy as the target number of shards.
  3. The method according to claim 2, characterized in that calculating the bitmap space occupancy corresponding to each number of shards within the range specifically comprises:
    determining the bitmap space occupancy corresponding to the number of shards according to the number of system users, the estimated number of participating users, the number of allowed participating projects, and the number of shards.
  4. The method according to claim 3, characterized in that determining the bitmap space occupancy corresponding to the number of shards according to the number of system users, the estimated number of participating users, the number of allowed participating projects, and the number of shards specifically comprises:
    determining the total key length of each bitmap according to the number of allowed participating projects, the number of shards, and a fixed key length, wherein each shard corresponds to one key;
    determining the space occupancy of all shards of each bitmap according to the average shard length of each shard, the probability that each shard occupies fifty percent of the storage space, the average storage space saved per shard, and the number of shards;
    determining the bitmap space occupancy according to the total key length of each bitmap, the space occupancy of all shards of each bitmap, and the number of allowed participating projects.
  5. The method according to any one of claims 1 to 4, characterized in that recording the participation information of the system user in the current activity specifically comprises:
    determining the number of projects in which the system user has participated according to the value of the system user in the shards of each bitmap;
    when the number of participated projects is less than the number of allowed participating projects, modifying the value of the system user in a shard of one bitmap from first data to second data;
    when the number of participated projects is greater than or equal to the number of allowed participating projects, refusing to let the system user continue to participate in the current activity.
  6. The method according to any one of claims 1 to 4, characterized in that, before sharding the bitmap according to the target number of shards and using the sharded bitmap to record the participation information of the system user in the current activity, the method further comprises:
    determining the bitmap space occupancy according to the number of system users, the estimated number of participating users, the number of allowed participating projects, and the target number of shards;
    determining a key-value pair space occupancy according to the number of system users, the estimated number of participating users, and the number of allowed participating projects;
    when the key-value pair space occupancy is less than the bitmap space occupancy, using key-value pairs to record the participation information of the system user in the current activity.
  7. The method according to claim 6, characterized in that the method further comprises:
    obtaining the number of participated projects of the system user recorded in the key-value pair corresponding to the system user;
    when the number of participated projects is less than the number of allowed participating projects, increasing the number of participated projects of the system user by a unit value;
    when the number of participated projects is greater than or equal to the number of allowed participating projects, refusing to let the system user continue to participate in the current activity.
  8. A server, characterized in that the server comprises a memory and a processor;
    the memory is configured to store a computer program, and the processor is configured to implement, according to the computer program stored in the memory, the data storage method according to any one of claims 1 to 6.
  9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed by a processor, implements the data storage method according to any one of claims 1 to 6.
  10. A computer program product, characterized in that the computer program product comprises a computer program that, when executed by a processor, implements the data storage method according to any one of claims 1 to 6.
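
The selection step of claim 2 and the recording steps of claims 5 and 7 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation disclosed in the application: user IDs are assumed to be dense integers from 0 to the number of system users minus one, shards are assumed to be equal-sized and allocated only on first write, and all names (choose_target_shards, ShardedActivityBitmaps, try_participate_kv) are hypothetical. The space-occupancy calculation of claim 4 is not reproduced here, so the estimate is passed in as a caller-supplied function.

```python
from typing import Callable, Dict, Iterable, Tuple


def choose_target_shards(candidates: Iterable[int],
                         space_for: Callable[[int], float]) -> int:
    """Claim 2: pick the candidate shard count with the smallest estimated space."""
    return min(candidates, key=space_for)


class ShardedActivityBitmaps:
    """One bitmap per project, split into equal shards that are allocated lazily."""

    def __init__(self, system_users: int, projects: int, shards: int) -> None:
        self.system_users = system_users
        self.projects = projects
        # Bits per shard, rounded up so every user ID falls inside some shard.
        self.shard_bits = -(-system_users // shards)
        # (project, shard index) -> shard bytes; untouched shards use no space.
        self.shards: Dict[Tuple[int, int], bytearray] = {}

    def _locate(self, user_id: int) -> Tuple[int, int, int]:
        # Assumption: user IDs are dense integers in [0, system_users).
        if not 0 <= user_id < self.system_users:
            raise ValueError("user_id out of range")
        shard, offset = divmod(user_id, self.shard_bits)
        return shard, offset // 8, 1 << (offset % 8)

    def participated(self, project: int, user_id: int) -> bool:
        shard, byte, mask = self._locate(user_id)
        data = self.shards.get((project, shard))
        return data is not None and bool(data[byte] & mask)

    def try_participate(self, project: int, user_id: int, allowed: int) -> bool:
        """Claim 5: count the user's set bits, then set one more or refuse."""
        joined = sum(self.participated(p, user_id) for p in range(self.projects))
        if joined >= allowed:
            return False
        shard, byte, mask = self._locate(user_id)
        size = (self.shard_bits + 7) // 8
        data = self.shards.setdefault((project, shard), bytearray(size))
        data[byte] |= mask  # first data (0) -> second data (1)
        return True


def try_participate_kv(counts: Dict[int, int], user_id: int, allowed: int) -> bool:
    """Claim 7: the key-value alternative keeps one counter per user."""
    joined = counts.get(user_id, 0)
    if joined >= allowed:
        return False
    counts[user_id] = joined + 1
    return True
```

For example, choose_target_shards(range(1, 11), lambda n: 40 * n + 1000 / n) evaluates a made-up cost curve in which per-shard key overhead grows with the shard count while wasted bitmap space shrinks, and returns the shard count at which that toy cost is lowest. A real deployment would substitute the space estimate described in claim 4.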
PCT/CN2022/130421 2022-06-28 2022-11-07 Data storage method, server and storage medium WO2024000987A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210740249.0A CN114968124A (en) 2022-06-28 2022-06-28 Data storage method, server and storage medium
CN202210740249.0 2022-06-28

Publications (1)

Publication Number Publication Date
WO2024000987A1 true WO2024000987A1 (en) 2024-01-04

Family

ID=82965835

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/130421 WO2024000987A1 (en) 2022-06-28 2022-11-07 Data storage method, server and storage medium

Country Status (2)

Country Link
CN (1) CN114968124A (en)
WO (1) WO2024000987A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114968124A (en) * 2022-06-28 2022-08-30 深圳前海微众银行股份有限公司 Data storage method, server and storage medium
CN115623019B (en) * 2022-12-02 2023-03-21 杭州雅拓信息技术有限公司 Distributed operation flow scheduling execution method and system
CN117435756B (en) * 2023-12-18 2024-03-26 云筑信息科技(成都)有限公司 Data processing method for inquiring user retention based on bitmap

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150213463A1 (en) * 2014-01-27 2015-07-30 Umbel Corporation Systems and Methods of Generating and Using a Bitmap Index
CN110879764A (en) * 2019-11-14 2020-03-13 浪潮(北京)电子信息产业有限公司 Bitmap setting method, device and equipment and readable storage medium
CN111124256A (en) * 2018-10-31 2020-05-08 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for managing storage
CN111274249A (en) * 2020-01-19 2020-06-12 深圳前海微众银行股份有限公司 User image data storage optimization method, device and readable storage medium
CN112463046A (en) * 2020-11-24 2021-03-09 苏州浪潮智能科技有限公司 Method, system, terminal and storage medium for dynamically adjusting bitmap space
CN112532748A (en) * 2020-12-24 2021-03-19 北京百度网讯科技有限公司 Message pushing method, device, equipment, medium and computer program product
CN114968124A (en) * 2022-06-28 2022-08-30 深圳前海微众银行股份有限公司 Data storage method, server and storage medium

Also Published As

Publication number Publication date
CN114968124A (en) 2022-08-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22949070

Country of ref document: EP

Kind code of ref document: A1