CN109800260A - High-concurrency data storage method, apparatus, computer device and storage medium - Google Patents

Info

Publication number
CN109800260A
Authority
CN
China
Prior art keywords
data
database
threshold value
block
buffer storage
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811531025.9A
Other languages
Chinese (zh)
Inventor
丁晶晶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
OneConnect Financial Technology Co Ltd Shanghai
Original Assignee
OneConnect Financial Technology Co Ltd Shanghai
Application filed by OneConnect Financial Technology Co Ltd Shanghai
Priority to CN201811531025.9A
Publication of CN109800260A
Current legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application relates to the technical field of data storage, and in particular to a high-concurrency data storage method, apparatus, computer device and storage medium. The method obtains all data to be stored in a database, establishes a data queue, and groups and packs the data queue into several data groups; aggregates the data groups and, when the aggregated data volume reaches a preset data block generation threshold, packs them into a data block; and obtains the total amount of data to be stored in the database and a database write frequency threshold, establishes several data cache layers, stores the data blocks in the data cache layers, obtains the universal nodes of the data blocks located in two adjacent (upper and lower) data cache layers, and determines the number of data blocks in each data cache layer according to the number of universal nodes. The application improves the efficiency of writing data into the database by means of multi-level caching.

Description

High-concurrency data storage method, apparatus, computer device and storage medium
Technical field
This application relates to the technical field of data storage, and in particular to a high-concurrency data storage method, apparatus, computer device and storage medium.
Background
High concurrency is one of the factors that must be considered in the design of an Internet distributed system architecture. It usually means that the system is designed so that it can process many requests in parallel at the same time. Common metrics related to high concurrency include response time, throughput, queries per second (QPS) and the number of concurrent users. Response time: the time the system takes to respond to a request. For example, if the system needs 200 ms to process an HTTP request, those 200 ms are the system's response time. Throughput: the number of requests processed per unit of time. QPS: the number of requests answered per second. In the Internet field, the distinction between this metric and throughput is not sharp. Number of concurrent users: the number of users who can use the system's functions normally at the same time. For example, in an instant messaging system, the number of simultaneously online users represents, to some extent, the system's number of concurrent users.
High-concurrency events arise easily in scenarios where massive amounts of data must be shared and written to the database immediately. The usual remedy when high concurrency occurs is to buffer the data with open-source software such as Kafka, so as to reduce the number of concurrent operations.
However, with open-source software such as Kafka there is only a single cache level. When the data volume is very large, data is lost or processed slowly, and the timeliness that the big-data era demands of data acquisition cannot be met.
Summary of the invention
In view of this, to address the problem that high-concurrency events during mass data storage cause long write times and slow processing, it is necessary to provide a high-concurrency data storage method, apparatus, computer device and storage medium.
A high-concurrency data storage method includes the following steps:
obtaining all data to be stored in the database, establishing a data queue, and grouping and packing the data queue into several data groups;
aggregating the data groups and, when the aggregated data volume reaches a preset data block generation threshold, packing them into a data block;
obtaining the total amount of data to be stored in the database and a database write frequency threshold, establishing several data cache layers, storing the data blocks in the data cache layers, obtaining the universal nodes of the data blocks located in two adjacent (upper and lower) data cache layers, and determining the number of data blocks in each data cache layer according to the number of universal nodes.
In one embodiment, obtaining all data to be stored in the database, establishing a data queue, and grouping and packing the data queue into several data groups includes:
assigning an ID to each data source whose data is to be written to the database, and extracting data from each ID-marked data source;
collecting the extracted data and arranging the data extracted from the different data sources in alphabetical or numerical order to generate a data queue;
searching the data queue for feature characters, obtaining the positions at which the feature characters occur, and grouping and packing the data queue into several data groups according to those positions.
In one embodiment, aggregating the data groups and, when the aggregated data volume reaches the preset data block generation threshold, packing them into a data block includes:
obtaining the data volume of each data storage region recorded in the database to obtain the generation threshold of the data block;
counting the data volume of the data groups to be written to the database and, if the data volume of every data group is greater than the generation threshold of the data block, revising the generation threshold with a hash algorithm so that the generation threshold is at least greater than the sum of the data volumes of two data groups;
aggregating the data groups to be written to the database and generating a data block when the sum of their data volumes reaches the generation threshold of the data block.
In one embodiment, obtaining the total amount of data to be stored in the database and the database write frequency threshold, establishing several data cache layers, storing the data blocks in the data cache layers, obtaining the universal nodes of the data blocks located in two adjacent (upper and lower) data cache layers, and determining the number of data blocks in each data cache layer according to the number of universal nodes includes:
obtaining the total amount of data to be stored in the database and the database write frequency threshold, dividing the total amount of data by the database write frequency threshold to obtain a data write time, obtaining the ratio of the data write time to a preset write time, determining the number of data cache layers according to the ratio, and establishing that number of data cache layers;
obtaining the data capacity threshold of each data cache layer and determining, according to the data capacity threshold, the frequency at which data is extracted from an upper-level data block to a lower-level data block;
generating preliminary universal nodes according to the data extraction frequency, and training the preliminary universal nodes with a neural network model to obtain the universal nodes;
determining the number of data blocks at each level according to the number of universal nodes between adjacent data cache layers.
In one embodiment, assigning an ID to each data source whose data is to be written to the database includes:
obtaining the IP address of the client on which the data source resides, and assigning a major-class ID to the data source according to the client IP address;
assigning a subclass ID to the data source according to the content it contains or the application that generates its data;
concatenating the major-class ID and the subclass ID to obtain the ID of the data source.
In one embodiment, searching the data queue for feature characters, obtaining the positions at which the feature characters occur, and grouping and packing the data queue into several data groups according to those positions includes:
searching the data queue for feature characters using a query language;
obtaining the position of each feature character in the data queue, setting a grouping threshold, and, when the data volume between the two closest feature characters is greater than the grouping threshold, packing the data between those two feature characters into a data group;
if, after the data queue has been grouped according to the feature characters, the difference in data volume between data groups is greater than a data group volume threshold, regrouping the data queue using other special characters that are not the feature characters, until the difference in data volume between data groups is less than the data group volume threshold.
In one embodiment, obtaining the data capacity threshold of each data cache layer and determining, according to the data capacity threshold, the frequency at which data is extracted from an upper-level data block to a lower-level data block includes:
determining how the data blocks are divided into levels according to the frequency at which data is written to the database;
determining the difference in data volume between data blocks at adjacent levels according to that division;
obtaining the difference in data volume between the data blocks and determining the frequency at which the data blocks at each level pass data down to the next level, where the data transfer frequency between data blocks at adjacent levels is lower than the frequency at which data is written to the database.
A high-concurrency data storage apparatus includes the following modules:
a data group generation module, configured to obtain all data to be stored in the database, establish a data queue, and group and pack the data queue into several data groups;
a data block generation module, configured to aggregate the data groups and, when the aggregated data volume reaches a preset data block generation threshold, pack them into a data block;
a data cache module, configured to obtain the total amount of data to be stored in the database and a database write frequency threshold, establish several data cache layers, store the data blocks in the data cache layers, obtain the universal nodes of the data blocks located in two adjacent (upper and lower) data cache layers, and determine the number of data blocks in each data cache layer according to the number of universal nodes.
A computer device includes a memory and a processor. The memory stores computer-readable instructions which, when executed by the processor, cause the processor to perform the steps of the high-concurrency data storage method described above.
A storage medium stores computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the high-concurrency data storage method described above.
The high-concurrency data storage method, apparatus, computer device and storage medium described above obtain all data to be stored in the database, establish a data queue, and group and pack the data queue into several data groups; aggregate the data groups and, when the aggregated data volume reaches a preset data block generation threshold, pack them into a data block; and obtain the total amount of data to be stored in the database and a database write frequency threshold, establish several data cache layers, store the data blocks in the data cache layers, obtain the universal nodes of the data blocks located in two adjacent (upper and lower) data cache layers, and determine the number of data blocks in each data cache layer according to the number of universal nodes. This technical solution addresses the problem that high-concurrency events during mass data storage cause long write times and slow processing, and improves the efficiency of writing data into the database by means of multi-level caching.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered a limitation of the application.
Fig. 1 is an overall flowchart of a high-concurrency data storage method of the application;
Fig. 2 is a schematic diagram of the data group generation process in a high-concurrency data storage method of the application;
Fig. 3 is a schematic diagram of the data block generation process in a high-concurrency data storage method of the application;
Fig. 4 is a structural diagram of a high-concurrency data storage apparatus of the application.
Detailed description
To make the objectives, technical solutions and advantages of the application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the application and are not intended to limit it.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said" and "the" used here may also include the plural. It should further be understood that the word "comprising" used in the description of the application refers to the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
Fig. 1 is a flowchart of the high-concurrency data storage method in one embodiment of the application. As shown in Fig. 1, a high-concurrency data storage method includes the following steps:
S1: obtain all data to be stored in the database, establish a data queue, and group and pack the data queue into several data groups;
Specifically, when the data queue is established, the data to be stored in the database can be sorted in the order in which it was generated. For example, several data extraction time nodes are set, the data of the different data sources is extracted at the moment of each data extraction time node, and the data queue is packed into several data groups by time node.
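The time-node grouping described for S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout `(timestamp, source_id, payload)`, the function name `group_by_time_node`, and the rule that a record belongs to the first time node at or after its timestamp are all assumptions made for the example.

```python
from bisect import bisect_left

def group_by_time_node(records, time_nodes):
    """Sort records by generation time and pack them into one group per time node.

    records: list of (timestamp, source_id, payload) tuples (hypothetical layout).
    time_nodes: ascending list of extraction times.
    Returns (groups, pending); pending holds records newer than the last node.
    """
    queue = sorted(records, key=lambda r: r[0])  # data queue in generation order
    groups = {node: [] for node in time_nodes}
    pending = []
    for rec in queue:
        i = bisect_left(time_nodes, rec[0])  # first node at or after the timestamp
        if i < len(time_nodes):
            groups[time_nodes[i]].append(rec)
        else:
            pending.append(rec)  # wait for a later extraction time node
    return groups, pending
```

Records that arrive after the last configured time node are kept aside rather than dropped, which is one possible design choice when nodes are generated periodically.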
S2: aggregate the data groups and, when the aggregated data volume reaches the preset data block generation threshold, pack them into a data block;
Specifically, the data block generation threshold can be derived from historical statistics, that is, from the actual conditions observed in previous runs of writing data to the database. When the historical statistics are computed, historical data closer to the present moment is given a greater weight than historical data further from the present moment, and the weighted average is taken as the data block generation threshold.
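A recency-weighted average of this kind might look like the sketch below. The text only requires that newer history weigh more than older history; the linear weights 1, 2, ..., n and the name `block_generation_threshold` are assumptions chosen for illustration.

```python
def block_generation_threshold(history):
    """Weighted average of past per-run data volumes, oldest first.

    Assumption: linearly increasing weights, so the most recent run
    weighs n times as much as the oldest one.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    weights = range(1, len(history) + 1)
    weighted_sum = sum(w * v for w, v in zip(weights, history))
    return weighted_sum / sum(weights)
```

For the history `[100, 200, 300]` the weights are 1, 2 and 3, so the result is 1400 / 6, which lies closer to the most recent value than the plain mean of 200 does.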
S3: obtain the total amount of data to be stored in the database and the database write frequency threshold, establish several data cache layers, store the data blocks in the data cache layers, obtain the universal nodes of the data blocks located in two adjacent (upper and lower) data cache layers, and determine the number of data blocks in each data cache layer according to the number of universal nodes.
Specifically, the data cache layers allow data to be stored into the database level by level, which prevents a mass of data from pouring into the database's write interface all at once under high concurrency and paralyzing the database. The total amount of data is divided by the database write frequency threshold to obtain the write time, and the write time is compared with the preset write time to determine the number of data cache layers. If there are 3 universal nodes between an upper and a lower data cache layer, the upper layer contains 3 data blocks; the number of data blocks in the lower layer is in turn determined by the number of universal nodes between it and the next layer below.
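The layer-count calculation can be sketched as below. The text specifies the two divisions (total amount over write frequency threshold, then write time over preset write time) but not how the resulting ratio maps to a layer count; rounding the ratio up and using at least one layer is an assumption, as is the name `cache_layer_count`.

```python
import math

def cache_layer_count(total_records, write_freq_threshold, preset_write_time):
    """Estimate how many cache layers to create.

    write_time = total_records / write_freq_threshold   (e.g. records / (records per second))
    ratio      = write_time / preset_write_time
    Assumption: layers = ceil(ratio), with a minimum of one layer.
    """
    write_time = total_records / write_freq_threshold
    ratio = write_time / preset_write_time
    return max(1, math.ceil(ratio))
```

With 1,000,000 records, a write frequency threshold of 1,000 records per second and a preset write time of 300 seconds, the write time is 1,000 seconds and the ratio is about 3.33, so this sketch would create 4 layers.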
By caching the data to be stored in the database level by level, this embodiment reduces the processing pressure of storing data into the database under high concurrency.
Fig. 2 is a schematic diagram of the data group generation process in a high-concurrency data storage method of the application. As shown in the figure, obtaining all data to be stored in the database, establishing a data queue, and grouping and packing the data queue into several data groups includes:
S101: assign an ID to each data source whose data is to be written to the database, and extract data from each ID-marked data source;
Specifically, if the data source is external, the IP address of the data source is obtained and resolved via DNS to obtain a DNS resolution code, and the first two characters of the DNS resolution code are extracted as the ID of the data source. If the data source is local, the storage location of the data source on the local machine is obtained, and the abbreviation of the storage location is used as the ID of the data source.
S102: collect the extracted data and arrange the data extracted from the different data sources in alphabetical or numerical order to generate a data queue;
Specifically, the data of each data source is first placed in its own data pool to await extraction, and the sorting rule is then obtained. If the data is sorted alphabetically, the data whose first character is "A" is extracted from each data pool, then the data whose first character is "B", and so on, so that the data in the data pools is lined up alphabetically into the data queue.
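The pool-then-sort procedure of S102 can be sketched as follows. The dictionary-of-lists representation of the data pools and the name `build_data_queue` are assumptions; a stable sort on the first letter reproduces the "all A records from every pool, then all B records" order described above.

```python
def build_data_queue(pools):
    """Merge per-source data pools into one queue ordered by first letter.

    pools: dict mapping a data source ID to its list of string records
           (hypothetical representation of the data pools).
    Returns a list of (record, source_id) pairs.
    """
    tagged = [(rec, src) for src, recs in pools.items() for rec in recs]
    # list.sort is stable: records sharing an initial keep their pool order
    tagged.sort(key=lambda item: item[0][:1].upper())
    return tagged
```

Only the first character is used as the sort key here, mirroring the text; sorting on the full string would be a natural refinement if a total alphabetical order is wanted.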
S103: search the data queue for feature characters, obtain the positions at which the feature characters occur, and group and pack the data queue into several data groups according to those positions.
Specifically, a preset feature character file is obtained, and one or more feature characters are drawn at random from the file using a random function. The data queue is searched for the drawn feature characters, and the position of each feature character in the data queue is recorded. The data queue is then split at those positions into several data segments, and one or more data segments are packed into a data group.
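A minimal sketch of S103, assuming the data queue is available as a single string and that each segment ends at (and includes) a feature character. The function names, the `per_group` packing parameter and the in-memory list standing in for the feature character file are all hypothetical.

```python
import random

def split_by_feature_chars(queue_text, feature_chars, k=1, rng=None):
    """Draw k feature characters at random and split the queue at their positions."""
    rng = rng or random.Random()
    chosen = set(rng.sample(sorted(feature_chars), k))
    positions = [i for i, ch in enumerate(queue_text) if ch in chosen]
    segments, start = [], 0
    for pos in positions:
        segments.append(queue_text[start:pos + 1])  # segment ends at the feature char
        start = pos + 1
    if start < len(queue_text):
        segments.append(queue_text[start:])  # trailing data after the last feature char
    return chosen, positions, segments

def pack_segments(segments, per_group):
    """Pack consecutive data segments into data groups of up to per_group segments."""
    return [segments[i:i + per_group] for i in range(0, len(segments), per_group)]
```

Passing a seeded `random.Random` instance as `rng` makes the draw reproducible, which helps when the same grouping must be recomputed later.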
By configuring the data group generation process, this embodiment improves the accuracy of data group generation and also improves the efficiency of data storage.
Fig. 3 is a schematic diagram of the data block generation process in a high-concurrency data storage method of the application. As shown in the figure, aggregating the data groups and, when the aggregated data volume reaches the preset data block generation threshold, packing them into a data block includes:
S201: obtain the data volume of each data storage region recorded in the database to obtain the generation threshold of the data block;
Specifically, the data volume of each storage region in the database is preset when the database is established. The generation threshold of the data block must not exceed the data volume of a data storage region; otherwise data overflow occurs.
S202: count the data volume of the data groups to be written to the database and, if the data volume of every data group is greater than the generation threshold of the data block, revise the generation threshold with a hash algorithm so that the generation threshold is at least greater than the sum of the data volumes of two data groups;
The hash algorithm in this step can use the remainder method: first estimate the number of data items in the entire hash table, then divide each original value by this estimate to obtain a quotient and a remainder, and use the remainder as the hash value. The revised generation threshold is then obtained by adding the hash value to, or subtracting it from, the generation threshold.
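The remainder-method revision can be sketched roughly as below. The text does not say which values are hashed or how the adjustment guarantees the two-group bound, so several choices here are assumptions: the group sizes are treated as the "original values", their remainder hashes are added to the threshold, and a final clamp enforces that the threshold exceeds the sum of the two smallest groups. The function names are hypothetical.

```python
def remainder_hash(value, table_size):
    """Remainder-method hash: the remainder of value divided by the estimated table size."""
    return value % table_size

def revise_threshold(threshold, group_sizes, table_size):
    """Raise the block generation threshold when every group already exceeds it."""
    if len(group_sizes) < 2 or min(group_sizes) <= threshold:
        return threshold  # at least one group fits, no revision needed
    revised = threshold
    for size in group_sizes:
        revised += remainder_hash(size, table_size)  # assumption: add each hash value
    # Assumption: clamp so the threshold exceeds the sum of the two smallest groups
    two_smallest = sum(sorted(group_sizes)[:2])
    return max(revised, two_smallest + 1)
```

The clamp is there because adding remainders alone cannot guarantee the bound stated in S202; whether the original method reaches it differently is not specified in the text.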
S203: aggregate the data groups to be written to the database and, when the sum of their data volumes reaches the generation threshold of the data block, generate a data block.
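The aggregation in S203 amounts to accumulating data groups until their combined volume reaches the threshold. The sketch below represents each data group only by its data volume; the name `build_blocks` and the handling of a trailing partial block are assumptions.

```python
def build_blocks(group_sizes, threshold):
    """Accumulate data groups into blocks; close a block once its volume reaches the threshold.

    Returns (blocks, leftover), where leftover holds groups still waiting
    for enough data to form a block.
    """
    blocks, current, total = [], [], 0
    for size in group_sizes:
        current.append(size)
        total += size
        if total >= threshold:
            blocks.append(current)
            current, total = [], 0
    return blocks, current
```

With group volumes `[40, 30, 50, 20, 90, 10]` and a threshold of 100, the first three groups form one block (120), the next two form another (110), and the last group stays in the leftover list.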
By optimizing the data block formation process, this embodiment improves the accuracy of the data volume in each data block, so that data storage can be completed more effectively.
In one embodiment, obtaining the total amount of data to be stored in the database and the database write frequency threshold, establishing several data cache layers, storing the data blocks in the data cache layers, obtaining the universal nodes of the data blocks located in two adjacent (upper and lower) data cache layers, and determining the number of data blocks in each data cache layer according to the number of universal nodes includes:
obtaining the total amount of data to be stored in the database and the database write frequency threshold, dividing the total amount of data by the database write frequency threshold to obtain a data write time, obtaining the ratio of the data write time to a preset write time, determining the number of data cache layers according to the ratio, and establishing that number of data cache layers;
obtaining the data capacity threshold of each data cache layer and determining, according to the data capacity threshold, the frequency at which data is extracted from an upper-level data block to a lower-level data block;
generating preliminary universal nodes according to the data extraction frequency, and training the preliminary universal nodes with a neural network model to obtain the universal nodes;
The neural network model used in this embodiment can be a convolutional neural network model. A convolutional neural network is a special kind of deep neural network model whose particularity shows in two respects: first, the connections between its neurons are not fully connected; second, the weights of the connections between certain neurons in the same layer are shared (that is, identical). This structure of partial connectivity and weight sharing makes it more similar to a biological neural network, reduces the complexity of the network model, and reduces the number of weights.
determining the number of data blocks at each level according to the number of universal nodes between adjacent data cache layers.
Specifically, the number of universal nodes between two adjacent data cache layers equals the number of data blocks in the upper of the two layers. From the difference in data volume between the upper and lower data cache layers and the data transfer frequency between them, the number of data blocks in the lower data cache layer that are connected to one universal node can be determined. For example, if the difference in data volume between the upper and lower data cache layers is 1000, the capacity of an upper-level data block is 100 and the capacity of a lower-level data block is 10, then 10 universal nodes need to be set, and the number of data blocks in each data cache layer can be derived from the number of universal nodes.
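The arithmetic in this example can be reproduced with the sketch below. The node count follows the figures in the text (1000 / 100 = 10); the number of lower-level blocks per node, taken here as the upper block capacity divided by the lower block capacity, is an assumption, since the text also makes it depend on the transfer frequency without giving a formula. The function name is hypothetical.

```python
def universal_node_plan(volume_difference, upper_block_capacity, lower_block_capacity):
    """Derive node and block counts for two adjacent cache layers.

    nodes            = ceil(volume_difference / upper_block_capacity)
    upper_blocks     = nodes (one universal node per upper-level block, as in the text)
    lower_per_node   = ceil(upper_block_capacity / lower_block_capacity)  # assumption
    """
    nodes = -(-volume_difference // upper_block_capacity)  # integer ceiling division
    lower_per_node = -(-upper_block_capacity // lower_block_capacity)
    return {
        "universal_nodes": nodes,
        "upper_blocks": nodes,
        "lower_blocks_per_node": lower_per_node,
    }
```

For a volume difference of 1000, an upper block capacity of 100 and a lower block capacity of 10, this yields 10 universal nodes, matching the example above.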
By determining the number of data blocks in the upper and lower data cache layers, this embodiment allows data to be written to the database more quickly.
In one embodiment, assigning an ID to each data source whose data is to be written to the database includes:
obtaining the IP address of the client on which the data source resides, and assigning a major-class ID to the data source according to the client IP address;
assigning a subclass ID to the data source according to the content it contains or the application that generates its data;
concatenating the major-class ID and the subclass ID to obtain the ID of the data source.
Specifically, if the IP address of a client is 256.256.256.1, the major-class ID of that client's data source is set to 256. Among the clients whose data source major class is 256, the data source generated by XXAPP is recorded as A. The ID of this data source is therefore 256.A, and all data generated by this data source carries the ID 256.A.
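The ID construction in this example can be sketched as follows. The lookup table from application name to subclass letter and the function name `data_source_id` are assumptions; the sketch treats the address as a plain dotted string and takes its first field as the major class, without validating octet ranges, so that it works with the literal address used in the text.

```python
def data_source_id(client_ip, app_name, app_codes):
    """Concatenate a major-class ID (first field of the IP) and a subclass ID (app code).

    app_codes: hypothetical mapping from application name to subclass ID.
    """
    major = client_ip.split(".")[0]  # first dotted field of the client address
    minor = app_codes[app_name]      # subclass ID assigned to the generating app
    return f"{major}.{minor}"
```

Calling `data_source_id("256.256.256.1", "XXAPP", {"XXAPP": "A"})` returns `"256.A"`, the ID given in the example above.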
In this embodiment, marking each data source with both a major-class ID and a subclass ID makes identification of the data source more accurate.
In one embodiment, searching the data queue for feature characters, obtaining the positions at which the feature characters occur, and grouping and packing the data queue into several data groups according to those positions includes:
searching the data queue for feature characters using a query language;
The query language can be SQL, and the feature character can be set to ".", ";" or another special symbol;
obtaining the position of each feature character in the data queue, setting a grouping threshold, and, when the data volume between the two closest feature characters is greater than the grouping threshold, packing the data between those two feature characters into a data group;
if, after the data queue has been grouped according to the feature characters, the difference in data volume between data groups is greater than a data group volume threshold, regrouping the data queue using other special characters that are not the feature characters, until the difference in data volume between data groups is less than the data group volume threshold.
Specifically, the other special characters can be characters of a different category from the original feature characters. For example, if the original feature characters are punctuation marks, the other characters can be Roman numerals or Arabic numerals.
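The regrouping rule can be sketched as a loop over candidate character classes. The sketch assumes the data queue is a string, measures a group's data volume by its length, and tries each class in turn until the spread between the largest and smallest group falls below the threshold; if no class satisfies it, the most balanced grouping found is returned, which is an assumption not stated in the text. Function names are hypothetical.

```python
import re

def split_on(text, chars):
    """Split text on any character in chars and drop empty pieces."""
    parts = re.split("[" + re.escape(chars) + "]", text)
    return [p for p in parts if p]

def balanced_groups(text, char_classes, max_spread):
    """Try each character class until group sizes differ by less than max_spread."""
    best = None
    for chars in char_classes:
        groups = split_on(text, chars)
        sizes = [len(g) for g in groups]
        spread = max(sizes) - min(sizes) if sizes else 0
        if best is None or spread < best[1]:
            best = (groups, spread)
        if spread < max_spread:
            return groups
    return best[0] if best else []
```

In the call `balanced_groups("abcd1efg.h2ijkl3mnop", [".;", "0123456789"], 2)`, splitting on punctuation gives groups of length 8 and 11, so the sketch falls back to Arabic numerals, as in the example above.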
By splitting the data queue at feature characters to form data groups, this embodiment improves the speed and efficiency of data storage.
In one embodiment, obtaining the data capacity threshold of each data cache layer and determining, according to the data capacity threshold, the frequency at which data is extracted from an upper-level data block to a lower-level data block includes:
determining how the data blocks are divided into levels according to the frequency at which data is written to the database;
determining the difference in data volume between data blocks at adjacent levels according to that division;
obtaining the difference in data volume between the data blocks and determining the frequency at which the data blocks at each level pass data down to the next level, where the data transfer frequency between data blocks at adjacent levels is lower than the frequency at which data is written to the database.
Specifically, the frequency at which data is written to the database can be set according to the circuit-breaking events recorded in the database's historical data. That is, once the remaining storage space of the database is known, the write frequency at which circuit breaking occurred in the historical data for that amount of storage space is obtained and used as the maximum frequency for writing data to the database; the frequency at which data is written to the database must not exceed this maximum. If the frequency at which data is written to the data blocks is greater than the maximum frequency, the data blocks are divided into levels, while the data capacity of each data cache layer remains fixed. The data blocks can be divided into levels according to the maximum frequency and the data capacity of each data cache layer; the difference between data blocks at each level is then obtained from that division, which in turn gives the data transfer frequency between data blocks at adjacent levels.
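The lookup of the maximum write frequency from circuit-breaking history can be sketched as below. The history format, a list of `(remaining_space, write_frequency)` pairs recorded when the breaker tripped, and the rule of using the record whose remaining space is closest to the current value are assumptions; the text only says the frequency recorded for that storage space is used. Function names are hypothetical.

```python
def max_write_frequency(breaker_history, remaining_space):
    """Return the write frequency at which the breaker tripped for the closest recorded space."""
    if not breaker_history:
        raise ValueError("no circuit-breaking events recorded")
    nearest = min(breaker_history, key=lambda rec: abs(rec[0] - remaining_space))
    return nearest[1]

def needs_leveling(incoming_frequency, max_frequency):
    """Data blocks are divided into levels only when the incoming rate exceeds the maximum."""
    return incoming_frequency > max_frequency

def transfer_frequency_ok(transfer_frequency, db_write_frequency):
    """Transfers between adjacent levels must be slower than writes to the database."""
    return transfer_frequency < db_write_frequency
```

With a history of `[(500, 2000), (100, 800)]` and 120 units of space remaining, the closest record is the one at 100, so the maximum write frequency is 800 and an incoming rate of 1000 would trigger leveling.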
This embodiment achieves accurate control over the grading of data blocks, so that storing data in the database is more time-saving and efficient.
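The relationship between the data volume difference of adjacent levels and the extraction frequency can be sketched in Python as follows. The text gives no formula, so this sketch assumes the frequency is proportional to the volume difference (the `scale` parameter) and clamps it below the database write frequency with a hypothetical `headroom` factor. Only the constraint that the transfer frequency stays below the database write frequency comes from the description.

```python
from typing import List, Sequence


def extraction_frequencies(
    level_volumes: Sequence[float],
    db_write_freq: float,
    scale: float = 1.0,
    headroom: float = 0.9,
) -> List[float]:
    """Return one extraction frequency per pair of adjacent levels.

    level_volumes: data volume held by the data blocks of each level, top first.
    db_write_freq: maximum frequency at which data may be written to the database.
    scale, headroom: illustrative assumptions, not taken from the patent text.
    """
    if db_write_freq <= 0:
        raise ValueError("db_write_freq must be positive")
    if not 0 < headroom < 1:
        raise ValueError("headroom must be between 0 and 1")
    cap = db_write_freq * headroom  # keeps every result strictly below db_write_freq
    freqs: List[float] = []
    for upper, lower in zip(level_volumes, level_volumes[1:]):
        diff = abs(upper - lower)            # data volume difference of adjacent levels
        freqs.append(min(scale * diff, cap))
    return freqs


if __name__ == "__main__":
    print(extraction_frequencies([100, 60, 50], db_write_freq=30.0, scale=0.5))
```

Clamping to a fraction of the write frequency is one simple way to guarantee the strict inequality; any other rule that respects the same bound would fit the description equally well.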
In one embodiment, a high-concurrency data storage apparatus is provided which, as shown in FIG. 4, comprises the following modules:
a data group generation module, configured to obtain all data to be stored in a database, establish a data queue, and group and pack the data queue to generate a plurality of data groups;
a data block generation module, configured to converge the data groups and, when the converged data volume reaches a preset data block generation threshold, pack the converged data to generate a data block;
a data cache module, configured to obtain the total amount of data to be stored in the database and a database write frequency threshold, establish a plurality of data cache layers, store the data blocks in the data cache layers, obtain the universal nodes of the data blocks located in two vertically adjacent levels of data cache layers, and determine the number of data blocks in each level of data cache layer according to the number of universal nodes.
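Two of the module behaviors above lend themselves to a short Python sketch: converging data groups into data blocks, and deriving the number of data cache layers. The function names are hypothetical. Keeping a final partial block and rounding the time ratio up to an integer are assumptions, since the text only says that the number of layers is determined according to the ratio of the data write time to a preset write time.

```python
import math
from typing import List


def pack_blocks(groups: List[str], block_threshold: int) -> List[List[str]]:
    """Converge data groups and emit a data block once the threshold is reached."""
    blocks: List[List[str]] = []
    current: List[str] = []
    size = 0
    for group in groups:
        current.append(group)
        size += len(group)
        if size >= block_threshold:  # converged volume reached the generation threshold
            blocks.append(current)
            current, size = [], 0
    if current:
        blocks.append(current)       # assumption: leftover groups form a final block
    return blocks


def cache_layer_count(total_data: float, write_freq_threshold: float,
                      preset_write_time: float) -> int:
    """Number of cache layers from the write-time ratio (rounded up, at least 1)."""
    write_time = total_data / write_freq_threshold  # quotient described in the method
    ratio = write_time / preset_write_time
    return max(1, math.ceil(ratio))


if __name__ == "__main__":
    print(pack_blocks(["aaaa", "bb", "ccc", "d"], block_threshold=5))
    print(cache_layer_count(10000, 100, 30))
```

The universal-node step is omitted here because the description ties it to a trained neural network model whose details are not given in this section.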
A computer device comprises a memory and a processor, wherein computer-readable instructions are stored in the memory and, when executed by the processor, cause the processor to perform the steps of the above high-concurrency data storage method.
A storage medium stores computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the above high-concurrency data storage method. The storage medium may be a non-volatile storage medium.
Those of ordinary skill in the art will understand that all or part of the steps in the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, or the like.
The technical features of the embodiments described above can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be considered to fall within the scope of this specification.
The embodiments described above express only some exemplary embodiments of the application. Their description is relatively specific and detailed, but it should not be construed as limiting the patent scope of the application. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the application, and these all fall within the protection scope of the application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A high-concurrency data storage method, characterized by comprising:
obtaining all data to be stored in a database, establishing a data queue, and grouping and packing the data queue to generate a plurality of data groups;
converging the data groups and, when the converged data volume reaches a preset data block generation threshold, packing the converged data to generate a data block;
obtaining the total amount of data to be stored in the database and a database write frequency threshold, establishing a plurality of data cache layers, storing the data blocks in the data cache layers, obtaining the universal nodes of the data blocks located in two vertically adjacent levels of data cache layers, and determining the number of data blocks in each level of data cache layer according to the number of universal nodes.
2. The high-concurrency data storage method according to claim 1, characterized in that obtaining all data to be stored in the database, establishing a data queue, and grouping and packing the data queue to generate a plurality of data groups comprises:
assigning an ID to each data source whose data needs to be written to the database, and extracting data from each data source that has been assigned an ID;
collecting the extracted data, and arranging the data extracted from the different data sources in alphabetical or numerical order to generate the data queue;
searching the data queue for characteristic characters to obtain the positions at which the characteristic characters occur, and grouping and packing the data queue according to those positions to generate a plurality of data groups.
3. The high-concurrency data storage method according to claim 1, characterized in that converging the data groups and, when the converged data volume reaches a preset data block generation threshold, packing the converged data to generate a data block comprises:
obtaining the data volume of each data storage area recorded in the database to obtain the generation threshold of the data block;
counting the data volume of the data groups to be written to the database and, if the data volume of any one data group is greater than the generation threshold of the data block, revising the generation threshold of the data block using a hash algorithm so that the generation threshold of the data block is at least greater than the sum of the data volumes of two data groups;
converging the data groups to be written to the database and, when the sum of their data volumes reaches the generation threshold of the data block, generating a data block.
4. The high-concurrency data storage method according to claim 1, characterized in that obtaining the total amount of data to be stored in the database and the database write frequency threshold, establishing a plurality of data cache layers, storing the data blocks in the data cache layers, obtaining the universal nodes of the data blocks located in two vertically adjacent levels of data cache layers, and determining the number of data blocks in each level of data cache layer according to the number of universal nodes comprises:
obtaining the total amount of data to be stored in the database and the database write frequency threshold, dividing the total amount of data to be stored in the database by the database write frequency threshold to obtain a data write time, obtaining the ratio of the data write time to a preset write time, determining the number of data cache layers according to the ratio, and establishing a plurality of data cache layers according to that number;
obtaining the data capacity threshold of each level of data cache layer, and determining, according to the data capacity threshold, the frequency at which data is extracted from an upper-level data block into a next-level data block;
generating pre-universal nodes according to the frequency of data extraction, and training the pre-universal nodes with a neural network model to obtain the universal nodes;
determining the number of data blocks in each level according to the number of universal nodes between two adjacent levels of data cache layers.
5. The high-concurrency data storage method according to claim 2, characterized in that assigning an ID to each data source whose data needs to be written to the database comprises:
obtaining the client IP address at which the data source is located, and assigning a major-class ID to the data source according to the client IP address;
assigning a subgroup ID to the data source according to the content contained in the data source or the use of the data;
concatenating the major-class ID and the subgroup ID to obtain the ID of the data source.
6. The high-concurrency data storage method according to claim 2, characterized in that searching the data queue for characteristic characters to obtain the positions at which the characteristic characters occur, and grouping and packing the data queue according to those positions to generate a plurality of data groups comprises:
applying a query language to search the data queue for characteristic characters;
obtaining the position at which each characteristic character occurs in the data queue, setting a grouping threshold for data grouping and, when the data volume between the two nearest characteristic characters is greater than the grouping threshold, packing the data between these two characteristic characters to form a data group;
if, after the data queue has been grouped according to the characteristic characters, the data volume difference between the data groups is greater than a data group volume threshold, regrouping the data queue using special characters other than the characteristic characters until the data volume difference between the data groups is less than the data group volume threshold.
7. The high-concurrency data storage method according to claim 4, characterized in that obtaining the data capacity threshold of each level of data cache layer and determining, according to the data capacity threshold, the frequency at which data is extracted from an upper-level data block into a next-level data block comprises:
determining the grading of the data blocks according to the frequency at which data is written to the database;
determining, according to the grading, the data volume difference between the data blocks of two adjacent levels;
obtaining the data volume difference between the data blocks and determining the frequency at which the data block of each level extracts data into the next-level data block, wherein the data transmission frequency between the data blocks of two adjacent levels is less than the frequency at which data is written to the database.
8. A high-concurrency data storage apparatus, characterized by comprising:
a data group generation module, configured to obtain all data to be stored in a database, establish a data queue, and group and pack the data queue to generate a plurality of data groups;
a data block generation module, configured to converge the data groups and, when the converged data volume reaches a preset data block generation threshold, pack the converged data to generate a data block;
a data cache module, configured to obtain the total amount of data to be stored in the database and a database write frequency threshold, establish a plurality of data cache layers, store the data blocks in the data cache layers, obtain the universal nodes of the data blocks located in two vertically adjacent levels of data cache layers, and determine the number of data blocks in each level of data cache layer according to the number of universal nodes.
9. A computer device, comprising a memory and a processor, wherein computer-readable instructions are stored in the memory and, when executed by the processor, cause the processor to perform the steps of the high-concurrency data storage method according to any one of claims 1 to 7.
10. A storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the high-concurrency data storage method according to any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811531025.9A CN109800260A (en) 2018-12-14 2018-12-14 High concurrent date storage method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN109800260A true CN109800260A (en) 2019-05-24

Family

ID=66556679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811531025.9A Pending CN109800260A (en) 2018-12-14 2018-12-14 High concurrent date storage method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109800260A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110928905A (en) * 2019-11-07 2020-03-27 泰康保险集团股份有限公司 Data processing method and device
CN110928905B (en) * 2019-11-07 2024-01-26 泰康保险集团股份有限公司 Data processing method and device
CN112118283B (en) * 2020-07-30 2023-04-18 爱普(福建)科技有限公司 Data processing method and system based on multi-level cache
CN112118283A (en) * 2020-07-30 2020-12-22 爱普(福建)科技有限公司 Data processing method and system based on multi-level cache
CN112486948A (en) * 2020-11-25 2021-03-12 福建省数字福建云计算运营有限公司 Real-time data processing method
CN112486948B (en) * 2020-11-25 2022-05-13 福建省数字福建云计算运营有限公司 Real-time data processing method
CN112698789A (en) * 2020-12-29 2021-04-23 广州鼎甲计算机科技有限公司 Data caching method, device, equipment and storage medium
CN113438274A (en) * 2021-05-26 2021-09-24 曙光网络科技有限公司 Data transmission method and device, computer equipment and readable storage medium
CN113254270A (en) * 2021-05-28 2021-08-13 济南浪潮数据技术有限公司 Self-recovery method, system and storage medium for storing cache hotspot data
CN113691611A (en) * 2021-08-23 2021-11-23 湖南大学 Block chain distributed high-concurrency transaction processing method, system, equipment and storage medium
CN113691611B (en) * 2021-08-23 2022-11-22 湖南大学 Block chain distributed high-concurrency transaction processing method, system, equipment and storage medium
CN115348217A (en) * 2022-08-16 2022-11-15 青岛海信智慧生活科技股份有限公司 Method for packaging data and electronic equipment
CN115348217B (en) * 2022-08-16 2024-03-22 青岛海信智慧生活科技股份有限公司 Method for packaging data and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination