Sample playback data access method and device
Technical field
The embodiments of this specification relate to the field of machine learning technology, and in particular to a sample playback data access method and device.
Background
At present, artificial intelligence has become a research hotspot across industries. Machine learning (or deep learning) algorithms are the key technology for realizing artificial intelligence, and some algorithms are already being applied to meet real business needs. At the same time, researchers have found that peripheral issues beyond the algorithms themselves, such as data access and hardware resource usage, raise new requirements in new application scenarios, and some traditional, mature solutions are no longer applicable.
Take the sample playback requirement in reinforcement learning as an example. In reinforcement learning, previously generated behavior samples need to be played back as input for model learning. Sample playback acts as a bridge between behavior rewards and iterative training. To improve the learning effect, sample playback may use a variety of playback strategies, such as sequential playback, random playback, playback by batch, and playback by sampling with specified probabilities. These strategies are all supported by algorithm theory and can each be implemented without difficulty in an experimental environment. In practical applications, however, a single business scenario may require flexible switching among playback strategies, and practical issues such as a distributed business environment and very large data throughput must sometimes also be considered. At present, no scheme satisfies these requirements.
Summary of the invention
In view of the above technical problems, the embodiments of this specification provide a sample playback data access method and device. The technical solution is as follows:
According to a first aspect of the embodiments of this specification, a sample playback data storage method is provided, in which a record information table, a batch information table, and a data content table are configured;
The record information table is used to store the record identifier of the most recently written sample playback data;
The batch information table is used to store the batch identifier of the most recently written sample playback data;
The data content table is used to store sample playback data, where the record identifier and the batch identifier together form the identification field of each piece of sample playback data;
For any piece of data to be stored, the following operations are performed:
Allocating a record identifier to the data to be stored according to the record information table;
Allocating a batch identifier to the data to be stored according to the batch information table;
According to the storage structure of the data content table, splicing the allocated record identifier, the allocated batch identifier, and the content of the data to be stored, and writing the splicing result to the data content table;
Updating the record information table and the batch information table.
According to a second aspect of the embodiments of this specification, a sample playback data reading method is provided, the method comprising:
Determining that the playback requirement is: playback of randomly selected records;
Obtaining, according to the record information table, the total number of records sum of the sample playback data that has been written;
Generating a random number array, the random number array including n random values chosen from the sum record identifiers, where n is the number of sample records required for playback;
Traversing the random number array and performing the following step to obtain n sample playback data records: taking each value in the array as a record identifier, reading the sample playback data having that record identifier from the data content table.
According to a third aspect of the embodiments of this specification, a sample playback data reading method is provided, the method comprising:
Determining that the playback requirement is: playback of randomly selected batches;
Obtaining, according to the batch information table, the total number of batches batch_sum of the sample playback data that has been written;
Generating a random number array, the random number array including n random values chosen from the batch_sum batch identifiers, where n is the number of sample batches required for playback;
Traversing the random number array and performing the following step to obtain n sample playback data batches: taking each value in the array as a batch identifier, reading the sample playback data having that batch identifier from the data content table.
According to a fourth aspect of the embodiments of this specification, a sample playback data storage device is provided, in which a record information table, a batch information table, and a data content table are configured;
The record information table is used to store the record identifier of the most recently written sample playback data;
The batch information table is used to store the batch identifier of the most recently written sample playback data;
The data content table is used to store sample playback data, where the record identifier and the batch identifier together form the identification field of each piece of sample playback data;
The device includes an identifier allocation module, a content writing module, and an information updating module. For any piece of data to be stored:
The identifier allocation module is configured to allocate a record identifier to the data to be stored according to the record information table, and to allocate a batch identifier to the data to be stored according to the batch information table;
The content writing module is configured to splice, according to the storage structure of the data content table, the allocated record identifier, the allocated batch identifier, and the content of the data to be stored, and to write the splicing result to the data content table;
The information updating module is configured to update the record information table and the batch information table.
According to a fifth aspect of the embodiments of this specification, a sample playback data reading device is provided, the device comprising:
A playback requirement determining module, configured to determine that the playback requirement is: playback of randomly selected records;
A record total determining module, configured to obtain, according to the record information table, the total number of records sum of the sample playback data that has been written;
A data reading module, configured to generate a random number array, the random number array including n random values chosen from the sum record identifiers, where n is the number of sample records required for playback; and to traverse the random number array and perform the following step to obtain n sample playback data records: taking each value in the array as a record identifier, reading the sample playback data having that record identifier from the data content table.
According to a sixth aspect of the embodiments of this specification, a sample playback data reading device is provided, the device comprising:
A playback requirement determining module, configured to determine that the playback requirement is: playback of randomly selected batches;
A batch total determining module, configured to obtain, according to the batch information table, the total number of batches batch_sum of the sample playback data that has been written;
A data reading module, configured to generate a random number array, the random number array including n random values chosen from the batch_sum batch identifiers, where n is the number of sample batches required for playback; and to traverse the random number array and perform the following step to obtain n sample playback data batches: taking each value in the array as a batch identifier, reading the sample playback data having that batch identifier from the data content table.
The technical solution provided by the embodiments of this specification separates the record information from the batch information of sample playback data and configures dedicated tables to store each of them. When sample playback is needed, the various common sample playback strategies can be implemented flexibly, which better meets the needs of practical business applications.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory and do not limit the embodiments of this specification.
In addition, no single embodiment of this specification needs to achieve all of the above effects.
Brief description of the drawings
To illustrate the embodiments of this specification or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some of the embodiments described in this specification; a person of ordinary skill in the art can obtain other drawings from them.
Fig. 1a and Fig. 1b are schematic flowcharts of the sample playback data storage method of the embodiments of this specification;
Fig. 2 is a schematic diagram of the overall architecture of the sample playback data access system of the embodiments of this specification;
Fig. 3 is a schematic flowchart of a sample playback data reading method of the embodiments of this specification;
Fig. 4 is a schematic structural diagram of the sample playback data storage device of the embodiments of this specification;
Fig. 5 is a schematic structural diagram of the first sample playback data reading device of the embodiments of this specification;
Fig. 6 is a schematic structural diagram of the second sample playback data reading device of the embodiments of this specification;
Fig. 7 is a schematic structural diagram of the third sample playback data reading device of the embodiments of this specification;
Fig. 8 is a schematic structural diagram of a piece of equipment for configuring the device of the embodiments of this specification.
Detailed description
To help those skilled in the art better understand the technical solutions in the embodiments of this specification, the technical solutions are described in detail below with reference to the drawings. The described embodiments are only some of the embodiments of this specification, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this specification shall fall within the scope of protection.
Reinforcement learning, also known as evaluative learning among other names, is an important machine learning method with many applications in fields such as intelligent robot control and analysis and prediction. During reinforcement learning, the computer tries a series of behaviors without any prompt and obtains corresponding results. The quality of each result is used to evaluate the preceding behavior, and the evaluation is fed back to the acting side to adjust subsequent behaviors. The goal of the algorithm is to adjust these behaviors to obtain the best evaluation; through continual adjustment, the computer learns which behavior to choose in which situation to obtain the best result.
Sample playback in reinforcement learning takes behavior data as the original samples and applies different playback strategies depending on the reinforcement learning algorithm, for example sequential playback, random playback, playback by batch, and playback by sampling with specified probabilities.
Although the prior art can implement the above playback strategies, each strategy is implemented separately. This not only fails to meet the practical need for flexible strategy switching but also brings higher development and maintenance costs. In addition, prior-art schemes all implement sample playback with a single-machine in-memory queue as the carrier; once the queue is full, the earliest added sample records are deleted, similar to a FIFO (first in, first out) queue. An in-memory queue, however, can only be shared among threads within a single machine and is not suitable for a real distributed business environment. Limited by memory capacity and non-persistent storage, it also cannot meet practical requirements such as large data volumes and delayed use of data.
In view of the above problems, the embodiments of this specification provide a sample playback data access method. On the one hand, the method uses a database as the carrier to guarantee the storage capacity and the persistent, reliable storage of sample playback data. On the other hand, it provides a data storage structure and data access methods tailored to the actual requirements of sample playback.
During reinforcement learning, each behavior of the computer generates one piece of behavior data. The content of a piece of data may include the current state (state), the behavior taken (action), the evaluation obtained for this behavior (reward), and so on. The specific content may differ by algorithm; this specification does not limit which content fields a piece of data includes.
To distinguish multiple behaviors, one piece of data as described above is called one record, and different records are distinguished by a "record identifier" field.
In addition, to facilitate playback in batches during reinforcement learning, multiple behaviors also need to be divided into different sets. The division can be by quantity (for example, every 1000 pieces of behavior data form one set) or by the logic of the actual application (for example, the behavior data generated under each strategy forms one batch, or the behavior data generated in each environment forms one batch). This specification does not limit which set division logic is used.
To distinguish multiple batches, one data set as described above is called one batch, and different batches are distinguished by a "batch identifier" field.
As can be seen, for sample playback a piece of sample playback data should include at least the following fields:
record identifier, batch identifier, behavior content
In this specification, the table that records the above data structure is called the "data content table". The "record identifier" and "batch identifier" are identification fields; different records may belong to the same batch or to different batches. The "behavior content" is the content field; there can be multiple content fields, and the specific content fields may differ by algorithm.
In addition to the data content table, the data storage structure provided by this specification also includes a "record information table" and a "batch information table":
Record information table: stores the record identifier of the most recently written sample playback data. Optionally, the record information table can also store the current total number of records of written sample playback data. The total is optional because in some cases it can be derived directly from the most recently written record identifier. For example, if record identifiers start from 0 and increase by 1 each time, and no limit is placed on the total number of records, then total number of records = most recently written record identifier + 1.
Batch information table: stores the batch identifier of the most recently written sample playback data. Optionally, the batch information table can also store the current total number of batches of written sample playback data. The batch total is optional for the same reason that the record total is optional, which is not repeated here.
Based on the storage structure defined above, a sample playback data storage method is further provided. Referring to Fig. 1a and Fig. 1b, for any piece of sample playback data to be stored, the storage method may include the following steps:
S101. Allocate a record identifier to the data to be stored according to the record information table, and allocate a batch identifier to the data to be stored according to the batch information table;
The most basic way to allocate identifiers is to number with a natural count. Taking the record identifier as an example, if the first written record is numbered 0, subsequently written records are numbered 1, 2, 3, and so on. Let cur denote the record identifier of the most recently written sample playback data. Before each write, the record identifier for the data to be written is computed as:
cur = cur + 1
If a maximum number of sample playback data records allowed to be stored, max, has been pre-configured for the data content table, the record identifier can be allocated with that maximum as the counting period, for example with the following formula:
cur = (cur + 1) % max
Batch identifiers are allocated in much the same way as record identifiers. The difference is that one batch may contain multiple different records, so when allocating a batch identifier it may be necessary to determine whether the data currently being stored belongs to the same batch as the previously stored data. If so, the data to be stored is given the same batch identifier as the previously stored data; otherwise, it is given a new batch identifier.
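The two allocation rules above can be sketched in Python as follows. This is a minimal illustration, not the specification's own code: the function names, the `same_batch` flag, and the convention that the identifier is -1 before anything has been written are all assumptions made for the example.

```python
def next_record_id(cur, max_records=None):
    """Return the identifier for the record about to be written.

    cur is the identifier of the most recently written record
    (assumed to be -1 when nothing has been written yet).
    """
    nxt = cur + 1
    # With a configured maximum, the count wraps: cur = (cur + 1) % max
    return nxt if max_records is None else nxt % max_records


def next_batch_id(batch_cur, same_batch, max_batches=None):
    """Return the batch identifier for the record about to be written.

    same_batch is True when the record belongs to the same batch as the
    previously stored record, in which case the identifier is reused.
    """
    if same_batch:
        return batch_cur
    nxt = batch_cur + 1
    return nxt if max_batches is None else nxt % max_batches
```

How the caller decides `same_batch` depends on the batch division logic in use (by quantity, by strategy, by environment, and so on), which the specification leaves open.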
Note that numbering records or batches with a natural count is one specific implementation and should not be construed as limiting the scheme of this specification. For example, other algorithms can be used to generate the identification information for each record or each batch; this does not affect the implementation of the scheme.
S102. According to the storage structure of the data content table, splice the allocated record identifier, the allocated batch identifier, and the content of the data to be stored, and write the splicing result to the data content table;
As described in the preceding embodiment, the data content table contains three parts: record identifier, batch identifier, and behavior content.
For the current data to be stored, the record identifier and batch identifier were determined in S101, and the behavior content is obtained from the external application. Splicing the three parts yields a triple [record identifier, batch identifier, behavior content], which is written to the data content table as a new row. It should be understood that the behavior content here generally corresponds to multiple specific fields, and that in actual storage the data obtained from the external application may undergo some conversion; this does not affect the implementation of the scheme of this specification.
S103. Update the record information table and the batch information table.
By the definitions of the record information table and the batch information table, once the write to the data content table is complete, the record information table and the batch information table must be updated accordingly.
The most basic update is to the "record identifier of the most recently written sample playback data" and the "batch identifier of the most recently written sample playback data". The updated values are the record identifier and batch identifier allocated to the data to be stored in S101.
In fact, given the formulas in S101, the update can be regarded as completed at the moment the new identifier is computed, and practical applications may handle it that way. Strictly speaking, however, the identifiers should be updated only after the data has been confirmed as successfully written to the data content table. The method steps in this specification therefore follow this strict flow, but those skilled in the art will understand that these steps should not be construed as limiting the scheme.
In addition, if the record information table holds the total number of records of playback data, the update should also include updating that total, that is, adding 1 to the original record total.
Similarly, if the batch information table holds the total number of batches of playback data, the update should also include updating that total. Specifically, if the newly written record belongs to a new batch, add 1 to the original batch total; if the batch of the newly written record has not changed, keep the batch total unchanged.
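The ordering of S101 to S103, with the information tables updated only after the row has been written, can be sketched with in-memory dictionaries standing in for the three tables. This is an illustrative sketch under stated assumptions: the function name `store`, the dictionary layout, the `new_batch` flag, and the initial value -1 for both identifiers are hypothetical and not taken from the specification.

```python
def store(record_meta, batch_meta, data_table, content, new_batch):
    """Store one piece of sample playback data and return (batch_id, record_id)."""
    # S101: compute candidate identifiers without touching the metadata yet.
    record_id = record_meta["cur"] + 1
    batch_id = batch_meta["batch_cur"] + (1 if new_batch else 0)

    # S102: splice [record identifier, batch identifier, behavior content]
    # into one row. If this write raised, the metadata below would stay unchanged.
    data_table[(batch_id, record_id)] = dict(content)

    # S103: update the information tables only after the write succeeded.
    record_meta["cur"] = record_id
    record_meta["sum"] += 1
    if new_batch:
        batch_meta["batch_cur"] = batch_id
        batch_meta["batch_sum"] += 1
    return batch_id, record_id
```

In this sketch the first call must pass `new_batch=True` so that batch 0 is opened; a real database-backed version would also need to consider concurrent writers, which is outside what the text above describes.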
Common sample playback requirements can be divided by playback object into two major classes: specified playback and random playback. Specified playback targets specified records and/or specified batches. For a specified playback requirement, a query condition is built directly from the specified record identifiers and/or batch identifiers, and the corresponding data is read from the data content table. For example, to play back records 0 to 99 of batch 1 in sequence, first generate a sequential array list = {0, 1, 2, ..., 99}, then traverse the array, assemble a query request with the condition batch = 1, record = list[i] for each element, and read the corresponding data from the data content table.
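The sequential example above can be sketched as follows, assuming for illustration that the data content table is a dictionary keyed by (batch identifier, record identifier). The function name `read_specified` and the toy table are hypothetical.

```python
def read_specified(data_table, batch_id, record_ids):
    """Read the given records of one batch, in the order requested."""
    rows = []
    for record_id in record_ids:
        # Query condition: batch = batch_id, record = record_id
        row = data_table.get((batch_id, record_id))
        if row is not None:
            rows.append(row)
    return rows


# Toy table: batch 1 holds records 0..149.
table = {(1, r): {"reward": r} for r in range(150)}
# Sequential playback of records 0-99 of batch 1.
first_hundred = read_specified(table, batch_id=1, record_ids=range(100))
```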
For random playback requirements, the basic approach is as follows:
1) Determine the playback requirement, including the range of the random read (such as global playback or playback within a given batch) and the object type (record or batch);
2) Determine the total number of objects of that type within the range according to the record information table or the batch information table;
3) Generate a random number array containing n random values chosen from that total, where n is the number of sample records or batches required for playback;
4) Traverse the random number array, assemble a query request for each value, and read the corresponding data from the data table.
For example, to randomly play back n records from all records globally, first obtain the global record total sum from the record information table, then generate a random number array of n values, list = n_random(sum), then traverse the array, assemble a query request with the condition record = list[i] for each element, and read the corresponding data from the data content table.
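A minimal sketch of the global random playback example follows. The specification names `n_random` but does not define it; the version here samples without replacement using Python's standard `random.sample`, which is an assumption. The dictionary-based tables are likewise illustrative.

```python
import random


def n_random(total, n, rng=random):
    """Choose n distinct values from 0..total-1.

    Sampling without replacement is an assumption; the source does not
    say whether repeated values are allowed.
    """
    return rng.sample(range(total), min(n, total))


def random_playback(record_meta, data_table, n, rng=random):
    """Read n randomly chosen records from a table keyed by record identifier."""
    total = record_meta["sum"]          # step 2: total from the record information table
    ids = n_random(total, n, rng)       # step 3: random number array
    return [data_table[i] for i in ids]  # step 4: one lookup per value
```

Passing a seeded `random.Random` instance as `rng` makes a playback reproducible, which can help when debugging training runs.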
Note that the data access method described above applies to the sample playback data of a single business. If there are multiple businesses, a business identification field (one or more, such as application name or application version) can be added to distinguish them, so that the data of multiple businesses can share the storage. In this case, a piece of sample playback data should also contain at least one business identification field, and that field needs to be configured in the record information table, the batch information table, and the data content table at the same time to associate the three tables. For ease of data maintenance, different businesses would generally each use an independent system of record identifiers and batch identifiers; in theory, different businesses may also share one system, which does not affect the implementation of the scheme of this specification.
In addition, the record information table, batch information table, and data content table in the embodiments of this specification represent only a basic logical division. In practice, provided the above logic is preserved, one or more of the three tables can be merged, or one or more of them can be split further; all of these fall within the scope of protection of the scheme of this specification.
The sample playback data access scheme provided by this specification is described in detail below with a specific example.
In the field of intelligent decision-making, deep reinforcement learning can be used to solve many practical problems, such as how large a bonus must be to attract the attention of different users, how fatigue control should vary from user to user, and how to scale capacity up and down automatically and more intelligently according to system pressure during major promotions. In a real business environment, both the data producers (the business application side) and the data consumers (the model training side) are distributed, and the data consumers need support for a variety of playback strategies. This specification proposes a sample playback data access scheme that uses HBase as the storage medium to suit large-scale, distributed, real production environments.
Fig. 2 shows the overall architecture of the sample playback data access system;
The online business system generates a large volume of business logs. After processing, these logs are written to the sample playback component, which consists of a write end (write), a read end (read), and HBase as the persistence carrier. Specific table structures and Rowkeys are designed on HBase to implement all sample playback functions. The reinforcement learning training system can receive playback data and train in cluster mode, or consume the data for training on a single machine.
In HBase, the Rowkey is the primary key of a row; data is looked up by Rowkey or by a Rowkey range (that is, a scan). There are two notions of "column": Family and Qualifier. A Family can contain multiple Qualifiers, so columns in HBase can be understood simply as two-level columns: the Family is the first-level column, the Qualifier is the second-level column, and the two are in a parent-child relationship.
Based on these basic characteristics of HBase, the structures of the record information table, the batch information table, and the data content table are shown in Table 1:
Table 1
a) Record information table (record meta):
version: the version number;
app: the business name;
sum: the number of currently available samples, in the range [0, max];
cur: the number of the current record (the most recently written record); cur increases by 1 for each record added, and when it reaches max, counting restarts from 0;
b) Batch information table (batch meta):
version: the version number;
app: the business name;
batch_sum: the number of currently available batches, in the range [0, maxbatch];
batch_cur: the number of the current batch (the batch of the most recently written record); batch_cur increases by 1 each time the batch changes, and when it reaches maxbatch, counting restarts from 0;
c) Data content table (record data):
version: the version number;
app: the business name;
batch: the batch number;
cur: the record number;
data: a key-value (kv) list field that can be extended dynamically. For the DQN algorithm, for example, it can take the following form:
[state:xxx][action:xxx][reward:xxxx][next_state:xxx]
The data field can also hold other information, such as the write time, to support selecting sample playback data by time range;
In addition, for "playback of records randomly selected according to specified probabilities", the data field can also store the selection probability specified for each sample record. The stored content is not limited to a probability value in [0, 1]; it can take other forms, as long as the selection priority of different records can be distinguished. The specified probability information can of course also be stored elsewhere; this specification does not limit where the specified probability information is obtained from.
As can be seen, the above storage structure limits the maximum number of records allowed to be stored (max) and the maximum number of batches allowed to be stored (maxbatch). The version and app fields are added to the Rowkey to associate the three tables, enabling multiplexed storage for multiple businesses and multiple versions. The version field can also carry write-time information to support selecting sample playback data by time range.
Based on the above data structure, for any piece of data to be stored, the write logic of the sample playback component is as follows:
S201. Read the record meta table to obtain the current record number cur, then compute the number for the record to be written:
cur = (cur + 1) % max;
S202. Read the batch meta table to obtain the current batch number batch_cur:
If a new batch is being written, batch_cur = (batch_cur + 1) % maxbatch;
If not, batch_cur = batch_cur.
S203. Splice the data and write it to record data:
Assemble Rowkey = version:app:batch_cur:cur. The content is the behavior-content quadruple of the current data to be stored (state, action, reward, next_state), written as a kv list to the row of record data corresponding to this Rowkey.
Assemble Rowkey = version:app:null:cur. The content is the same behavior-content quadruple (state, action, reward, next_state), written as a kv list to the row of record data corresponding to this Rowkey.
In this embodiment, two rows are written for each behavior: one row carries the batch number and serves playback by batch; the other carries no batch number and serves global playback. The reason is that when a Rowkey is assembled in HBase, the fields must be spliced one by one in order, so for global playback (which does not use the batch number as a query condition) batch_cur cannot be skipped to splice cur directly. There are of course many other ways to meet this requirement; this embodiment is illustrative only and should not be construed as limiting the technical solution.
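The two Rowkeys written in S203 can be sketched as plain string assembly. The helper names `make_rowkey` and `rows_for_sample` are hypothetical, and no HBase client call is shown; the sketch only illustrates the key layout described above.

```python
def make_rowkey(version, app, batch_cur, cur):
    """Join the Rowkey fields in fixed order.

    Passing batch_cur=None produces the 'null' placeholder used by the
    row that serves global playback.
    """
    batch_part = "null" if batch_cur is None else str(batch_cur)
    return f"{version}:{app}:{batch_part}:{cur}"


def rows_for_sample(version, app, batch_cur, cur, content):
    """Return the two (rowkey, content) rows written for one behavior."""
    return [
        (make_rowkey(version, app, batch_cur, cur), content),  # playback by batch
        (make_rowkey(version, app, None, cur), content),       # global playback
    ]
```

For example, with version "v1", app "demo", batch 3, and record 42, the two keys are "v1:demo:3:42" and "v1:demo:null:42". The cost of this design is that every sample is stored twice.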
S204. Update the record meta table and the batch meta table:
Update the current total in the record meta table, sum = min(sum + 1, max), and at the same time update the current number in the record meta table to the newly allocated value of cur;
If this record is the last one in its batch, update the current total of the version:app row in the batch meta table: batch_sum = min(batch_sum + 1, maxbatch). At the same time, update the current number in the batch meta table to the value of batch_cur determined in S202.
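One consequence of combining cur = (cur + 1) % max with sum = min(sum + 1, max) is that the record table behaves like a bounded ring: once max records have been written, new records overwrite the oldest numbers while sum stays at max. The sketch below illustrates this for records only; the function name, the dictionary tables, and the initial cur of -1 are assumptions, and the Rowkey is reduced to cur for brevity.

```python
def write_sample(record_meta, data, content, max_records):
    """Write one sample and return the record number it was stored under."""
    cur = (record_meta["cur"] + 1) % max_records      # S201
    data[cur] = content                                # S203, Rowkey reduced to cur
    record_meta["sum"] = min(record_meta["sum"] + 1, max_records)  # S204
    record_meta["cur"] = cur
    return cur


meta = {"cur": -1, "sum": 0}
data = {}
numbers = [write_sample(meta, data, i, max_records=3) for i in range(5)]
# numbers cycles through 0, 1, 2 and then wraps back to 0, 1;
# the fourth and fifth writes replace the contents of slots 0 and 1.
```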
The write logic of the sample playback component is described above. Several read logics are outlined below:
Randomly reading records:
S301. Read the record meta table to obtain the current total number of records sum;
S302. Compute a random number array by choosing n random values from [0, sum-1]:
list = n_random(sum);
S303. Traverse list, taking each value as cur, assemble Rowkey = version:app:null:cur, and read the data from the record data table.
Random reading of batches:
S401. Read the batch meta table to obtain the current total number of batches, batch_sum;
S402. Compute a random number array by choosing n random values from [0, batch_sum-1]:
list=n_random(batch_sum);
S403. Traverse each value in list as batch_cur and assemble Rowkey=version:app:batch_cur. Because a whole
batch is the object being read, the record number cur does not need to be concatenated here, and the scan
method in HBase can be used to read the entire batch.
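Steps S401 to S403 can be sketched in the same style. The function scan_prefix below is a hypothetical stand-in for an HBase prefix scan, implemented over a dict. One detail is an assumption of this sketch rather than something stated in the text: the prefix ends with a trailing ":" so that batch 1 does not also match batch 10.

```python
import random


def scan_prefix(table, prefix):
    """Return all rows whose Rowkey starts with prefix, in key order."""
    return [table[k] for k in sorted(table) if k.startswith(prefix)]


def read_random_batches(batch_meta, record_data, version, app, n,
                        rng=random):
    """S401-S403: read n randomly chosen whole batches."""
    batch_sum = batch_meta.get("batch_sum", 0)      # S401
    if batch_sum <= 0:
        return []
    batches = []
    for _ in range(n):
        batch_cur = rng.randrange(batch_sum)        # S402
        prefix = f"{version}:{app}:{batch_cur}:"    # S403, no cur needed
        batches.append(scan_prefix(record_data, prefix))
    return batches


record_data = {
    "v1:demo:0:0": "r0", "v1:demo:0:1": "r1",
    "v1:demo:1:2": "r2", "v1:demo:10:3": "r3",
}
picked = read_random_batches({"batch_sum": 2}, record_data, "v1", "demo", 3)
```

Each element of picked is the full list of rows of one batch, which mirrors reading a batch as a single object.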
Random reading of records according to a specified probability:
Sampling by probability priority is a very important means of improving reinforcement learning. Sample
priority can be designed for different scenarios: it can be based on how recent a sample is, or on the
business importance of a sample. Assigning priorities can quickly improve training efficiency. This
specification provides a scheme that consumes data in a streaming fashion and chooses samples from the queue
according to a specified probability.
Assume that the probability of each sample record being selected is determined by how recent the batch
containing the record is. In order of write time, the batches are numbered 1, 2, 3, ..., N, where N is the
maximum batch number. The following can then be defined:
Probability that record i is selected = (batch number of record i)/N*α
where α is a probability correction parameter, which can be a value in (0,1), such as 0.5 or 0.8.
It should be noted that this embodiment has already specified the maximum number of batches allowed to be
stored, maxbatch, so batch_cur is a cyclic count. If maxbatch were not specified, batch_cur could be used
directly here to calculate the selection probability.
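As a small worked example of the definition above, the selection probability can be computed as follows. The function name selection_probability and the argument checks are illustrative assumptions.

```python
def selection_probability(batch_no, n_batches, alpha):
    """Return batch_no / N * alpha for a record in batch `batch_no`.

    batch_no runs from 1 (oldest) to n_batches (newest); alpha is the
    probability correction parameter in the open interval (0, 1).
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie in (0, 1)")
    if not 1 <= batch_no <= n_batches:
        raise ValueError("batch_no must lie in [1, n_batches]")
    return batch_no / n_batches * alpha


newest = selection_probability(10, 10, 0.8)  # newest batch, probability alpha
middle = selection_probability(5, 10, 0.8)   # half of the newest batch's value
```

With N=10 and α=0.8, a record in the newest batch is kept with probability 0.8 and a record in batch 5 with probability 0.4, so newer samples are favored in proportion to their batch number.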
Referring to Fig. 3, the logic for randomly reading records according to a specified probability is as follows:
S501. Read the record meta table to obtain the current total number of records, sum;
S502. Compute a random number array by choosing n random values from [0, sum-1]:
list=n_random(sum);
S503. Traverse each value in list as cur, assemble Rowkey=version:app:null:cur, and read the data from
the record data table.
It can be seen that S501~S503 are the same as S301~S303. The implementation of random playback according
to the specified probability is described further below:
S504. For each record read in S503, decide whether to retain it according to its selection probability:
For any record i:
On the one hand, determine the selection probability Pi of record i:
Pi=(batch number of record i)/N*α
On the other hand, generate a random probability value u:
u=random(1), (u∈[0,1]);
Compare Pi and u: if u<Pi, retain record i; otherwise, if u≥Pi, discard record i. It can be understood
that the case u=Pi can be handled flexibly; this embodiment is only illustrative.
After S504 is executed for the first time, some of the n records in list are discarded because of the
probability, so the number of retained records is less than n. S502~S504 can be repeated until the total
number of retained records reaches n. During repeated selection, the same record is usually allowed to be
retained more than once; if this needs to be avoided, a screening condition that prevents repeats can be added.
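The repeat-until-n loop over S502 to S504 can be sketched as follows. For brevity the records are held in a Python list, each carrying a batch_no field in 1..N; the function name sample_by_probability and the max_rounds safety limit are assumptions of this sketch and are not part of the described scheme.

```python
import random


def sample_by_probability(records, n, n_batches, alpha, rng=random,
                          max_rounds=10_000):
    """Keep drawing candidates until n records have been retained.

    Each candidate is retained when u < Pi, with
    Pi = batch_no / n_batches * alpha. The same record may be retained
    more than once.
    """
    kept = []
    if not records or n <= 0:
        return kept
    for _ in range(max_rounds):
        for _ in range(n):                           # S502/S503: n candidates
            rec = records[rng.randrange(len(records))]
            p_i = rec["batch_no"] / n_batches * alpha
            u = rng.random()                         # S504: u in [0, 1)
            if u < p_i:
                kept.append(rec)
                if len(kept) == n:
                    return kept
    return kept  # fewer than n only if max_rounds was exhausted


records = [{"id": i, "batch_no": i % 10 + 1} for i in range(100)]
batch = sample_by_probability(records, 20, n_batches=10, alpha=0.8)
```

To avoid repeats, one option would be to track the ids already retained and skip candidates whose id has been seen, at the cost of more rounds.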
Compared with the prior art, the above scheme has at least the following advantages:
1. Using the high reliability of an HBase cluster, the sample playback data is not lost even if some
machines go down or restart;
2. Using the high throughput of an HBase cluster, the scheme can connect to the collection of massive log
samples flowing back from an online production system;
3. With clustered deployment, the scheme is not limited by the memory of a single machine. The upper limit
of the whole data set in the queue can be set according to the cluster capacity, which supports ultra-large
sample data sets. Because the capacity is large, the read and write speeds of the producer and the consumer
can also be matched, so that an overly fast producer does not fill up the sample queue;
4. The sample playback data structure uses a dynamic schema that can be freely defined and dynamically
extended. If more complex reinforcement learning later needs to record more information in a sample, the
whole sample playback component can support dynamic extension of the data fields;
5. Multiple playback strategies are supported, such as global playback, segment playback, sequential
playback and random playback.
Corresponding to the above method embodiments, the embodiments of this specification also provide a sample
playback data storage device. Referring to Fig. 4, the device may include an identifier allocation module 110,
a content writing module 120 and an information updating module 130. For any piece of data to be stored:
The identifier allocation module 110 is configured to allocate a record identifier to the data to be stored
according to the record information table, and to allocate a batch identifier to the data to be stored
according to the batch information table;
The content writing module 120 is configured to splice the allocated record identifier, the allocated batch
identifier and the content of the data to be stored according to the storage structure of the data content
table, and to write the splicing result into the data content table;
The information updating module 130 is configured to update the record information table and the batch
information table.
According to a specific implementation provided by this specification, the identifier allocation module 110 may
be specifically configured to:
Determine whether the data to be stored belongs to the same batch as the previously stored piece of data;
If so, allocate to the data to be stored the same batch identifier as the previously stored piece of data;
Otherwise, allocate a new batch identifier to the data to be stored.
According to a specific implementation provided by this specification, the content writing module 120 may be
specifically configured to splice two records for one piece of data to be stored and write them into the data
content table. The two records are:
A record carrying the batch identifier, used to support playback by batch;
A record not carrying the batch identifier, used to support global playback.
According to a specific implementation provided by this specification, the record information table may also be
used to store the total number of records of the sample playback data that have been written;
The information updating module 130 may also be configured to update the total number of records.
According to a specific implementation provided by this specification, the batch information table may also be
used to store the total number of batches of the sample playback data that have been written;
The information updating module 130 may also be configured to update the total number of batches.
According to a specific implementation provided by this specification, a maximum number of records of sample
playback data allowed to be stored is preconfigured for the data content table;
The identifier allocation module 110 may be specifically configured to allocate the record identifier to the
data to be stored using the maximum number of records as the counting period.
According to a specific implementation provided by this specification, a maximum number of batches of sample
playback data allowed to be stored is preconfigured for the data content table;
The identifier allocation module 110 may be specifically configured to allocate the batch identifier to the
data to be stored using the maximum number of batches as the counting period.
Referring to Fig. 5, this specification also provides a sample playback data reading device, which may include:
A playback demand determining module 210, configured to determine that the playback demand is: playback of
randomly selected records;
A record total determining module 220, configured to obtain, according to the record information table, the
total number sum of records of the sample playback data that have been written;
A data reading module 230, configured to generate a random number array that contains n random values chosen
from the sum record identifiers, where n is the number of sample records needed for playback, and to traverse
the random number array and perform the following step to obtain n sample playback data records: taking any
value in the array as a record identifier, read from the data content table the sample playback data having
that record identifier.
Referring to Fig. 6, according to a specific implementation provided by this specification, if the playback
demand is specifically playback of records randomly selected according to a specified probability, the sample
playback data reading device may further include:
A data selecting module 240, configured to determine, for each sample playback data record obtained by the
data reading module, the selection probability specified for that sample record, and to generate a random
value; if the random value is less than the selection probability specified for the record, the record is
retained, otherwise the record is discarded;
A loop control module 250, configured to determine whether the total number of retained records has reached n,
and, if it has not, to repeatedly trigger the data reading module 230 and the data selecting module 240 until
the number of retained records reaches n.
Referring to Fig. 7, this specification also provides a sample playback data reading device, which may include:
A playback demand determining module 310, configured to determine that the playback demand is: playback of
randomly selected batches;
A batch total determining module 320, configured to obtain, according to the batch information table, the
total number batch_sum of batches of the sample playback data that have been written;
A data reading module 330, configured to generate a random number array that contains n random values chosen
from the batch_sum batch identifiers, where n is the number of sample batches needed for playback, and to
traverse the random number array and perform the following step to obtain n sample playback data batches:
taking any value in the array as a batch identifier, read from the data content table the sample playback
data having that batch identifier.
The embodiments of this specification also provide a computer device, which includes at least a memory, a
processor, and a computer program stored in the memory and executable on the processor, wherein the processor
implements the aforementioned sample playback data storage or reading method when executing the program.
Fig. 8 shows a more specific schematic diagram of the hardware structure of a computing device provided by the
embodiments of this specification. The device may include a processor 1010, a memory 1020, an input/output
interface 1030, a communication interface 1040 and a bus 1050. The processor 1010, the memory 1020, the
input/output interface 1030 and the communication interface 1040 are communicatively connected to one another
inside the device through the bus 1050.
The processor 1010 may be implemented as a general-purpose CPU (Central Processing Unit), a microprocessor,
an application specific integrated circuit (ASIC), or one or more integrated circuits, and is used to execute
the relevant programs so as to implement the technical solutions provided by the embodiments of this
specification.
The memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), a
static storage device, a dynamic storage device and the like. The memory 1020 can store an operating system
and other application programs. When the technical solutions provided by the embodiments of this
specification are implemented by software or firmware, the relevant program code is stored in the memory 1020
and is called and executed by the processor 1010.
The input/output interface 1030 is used to connect an input/output module to implement information input and
output. The input/output module can be configured in the device as a component (not shown in the figure), or
can be externally connected to the device to provide the corresponding functions. Input devices may include
a keyboard, a mouse, a touch screen, a microphone and various sensors; output devices may include a display,
a loudspeaker, a vibrator and indicator lights.
The communication interface 1040 is used to connect a communication module (not shown in the figure) to
implement communication between this device and other devices. The communication module can communicate in a
wired manner (such as USB or a network cable) or in a wireless manner (such as a mobile network, WIFI or
Bluetooth).
The bus 1050 includes a path that transmits information between the components of the device (such as the
processor 1010, the memory 1020, the input/output interface 1030 and the communication interface 1040).
It should be noted that, although only the processor 1010, the memory 1020, the input/output interface 1030,
the communication interface 1040 and the bus 1050 are shown for the above device, in a specific
implementation the device may also include other components necessary for normal operation. In addition,
those skilled in the art will understand that the above device may also include only the components necessary
to implement the solutions of the embodiments of this specification, and need not include all the components
shown in the figure.
The embodiments of this specification also provide a computer-readable storage medium on which a computer
program is stored; when the program is executed by a processor, the aforementioned sample playback data
storage or reading method is implemented.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can
store information by any method or technology. The information can be computer-readable instructions, data
structures, program modules or other data. Examples of computer storage media include, but are not limited
to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM),
other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable
read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM),
digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or
other magnetic storage devices, or any other non-transmission medium that can be used to store information
accessible by a computing device. As defined herein, computer-readable media do not include transitory
computer-readable media (transitory media), such as modulated data signals and carrier waves.
From the above description of the implementations, those skilled in the art can clearly understand that the
embodiments of this specification can be implemented by means of software plus a necessary general-purpose
hardware platform. Based on this understanding, the technical solutions of the embodiments of this
specification, in essence or in the part that contributes to the prior art, can be embodied in the form of a
software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a
magnetic disk or an optical disc, and includes a number of instructions that cause a computer device (which
may be a personal computer, a server, a network device or the like) to execute the methods described in the
embodiments of this specification or in certain parts of the embodiments.
The systems, devices, modules or units described in the above embodiments may be implemented by a computer
chip or an entity, or by a product having a certain function. A typical implementation device is a computer,
which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smart
phone, a personal digital assistant, a media player, a navigation device, an e-mail sending and receiving
device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
The embodiments in this specification are described in a progressive manner. For identical or similar parts,
the embodiments may be referred to one another, and each embodiment focuses on its differences from the other
embodiments. In particular, because the device embodiments are substantially similar to the method
embodiments, they are described relatively briefly; for the relevant parts, refer to the description of the
method embodiments. The device embodiments described above are merely illustrative. The modules described as
separate components may or may not be physically separate, and when the solutions of the embodiments of this
specification are implemented, the functions of the modules may be realized in one or more pieces of software
and/or hardware. Some or all of the modules may also be selected according to actual needs to achieve the
purpose of the solution of an embodiment. Those of ordinary skill in the art can understand and implement
this without creative effort.
The above are only specific implementations of the embodiments of this specification. It should be pointed
out that those of ordinary skill in the art can make several improvements and modifications without
departing from the principles of the embodiments of this specification, and these improvements and
modifications should also be regarded as falling within the protection scope of the embodiments of this
specification.