CN109634933A - Method, apparatus and system for data processing - Google Patents
- Publication number: CN109634933A (application No. CN201811617963.0A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to its accuracy)
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a method, apparatus and system for data processing, relating to the field of information technology, and is intended to solve the problem that retrieval requests occupy a large amount of system memory. The method comprises: grouping the file set that a retrieval request involves, obtaining multiple file subsets; allocating a cache for the first file subset, so as to read the data in the first file subset; after the data in the first file subset has been read, releasing the cache of the first file subset and allocating a cache for the next file subset, so as to read the data in the next file subset; and after the data of every file subset has been read, merging the data of all file subsets to obtain the user data returned to the client. The invention is mainly applied in data retrieval based on distributed storage systems.
Description
The present application is a divisional of the application filed on October 29, 2014, application No. 2014105946768, entitled "Method, apparatus and system of data processing".
Technical field
The present invention relates to the field of information technology, and more particularly to a method, apparatus and system for data processing.
Background technique
Cassandra is a storage system with no central node; it spreads data evenly across different nodes according to a hash algorithm. Compared with a traditional centralized storage system, a distributed storage system alleviates the performance limits caused by centralized storage, improves the efficiency of data storage, data query and data processing, and is better suited to large-scale data storage.
A MemTable is a space of a certain size allocated in a distributed node's memory and used to store the data written by users. When a user writes data to a node, the written data is appended directly to the MemTable in the node's memory. Once the MemTable is full, the node dumps all of the data in the MemTable to disk, forming a Sorted String Table (SSTable) file; this completes the storage of the written data, i.e. data is stored on the node's disk in SSTable file format. In general, each SSTable file on a node stores a group of ordered key-value (Key) pairs; the Key serves as the key value of the SSTable file and identifies the data in it (at the client level, a Key can simply be understood as the keyword used to store or look up data).
For each Key, a user may write multiple columns (Column) of the same Key to a node at the same moment, or may write multiple Columns of the same Key at different moments. Since the MemTable is dumped to an SSTable file whenever it fills up, Columns of the same Key written into the MemTable at different moments may be persisted into different SSTable files, so the same Key can appear in multiple SSTable files. In addition, a client's irregular modifications to the data of a given Key also cause the same Key to be scattered across different SSTable files.
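The flush behaviour just described — a MemTable filling up, being dumped to an immutable SSTable, and Columns of one Key therefore landing in several files — can be sketched in miniature. This is an illustrative model, not Cassandra's actual API; all class and method names here are invented for the sketch.

```python
class MemTable:
    """In-memory buffer for writes; signals when it needs flushing."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.rows = {}          # key -> {column: value}
        self.size = 0

    def put(self, key, column, value):
        self.rows.setdefault(key, {})[column] = value
        self.size += 1
        return self.size >= self.capacity   # True => MemTable is full

class Node:
    def __init__(self, memtable_capacity=2):
        self.sstables = []      # flushed, immutable "files" on disk
        self.memtable = MemTable(memtable_capacity)

    def write(self, key, column, value):
        if self.memtable.put(key, column, value):
            # dump the full MemTable to "disk" as one SSTable
            self.sstables.append(self.memtable.rows)
            self.memtable = MemTable(self.memtable.capacity)

    def sstables_containing(self, key):
        return [t for t in self.sstables if key in t]

node = Node(memtable_capacity=2)
for i in range(6):                       # six Columns of the same Key,
    node.write("user:42", f"col{i}", i)  # written at different moments
# the single Key is now persisted across several SSTable files
assert len(node.sstables_containing("user:42")) == 3
```

Each flush seals two Columns into a new SSTable, so the one Key ends up spread over three files — exactly the dispersion the retrieval problem below stems from.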
At present, in a distributed storage system such as Cassandra, when a client retrieves the data of some Key, the system first has to find all SSTable files containing that Key, then read all of the Key's relevant data from these files into a cache, merge the data, and return it to the client.
In the course of the above retrieval, the inventors found at least the following problem in the prior art: the system needs to allocate a section of cache space for each SSTable file in order to hold the data relevant to the Key in that file. In practice, when a retrieval request involves few SSTable files the caching overhead of the system is relatively small, but as time passes the data related to the same key disperses into more and more SSTable files, so when serving a retrieval request the system allocates caches for a large number of SSTable files. Since cache allocation is backed by system memory, allocating too many caches occupies a large amount of system memory, and the concurrent throughput of retrieval requests drops sharply.
Summary of the invention
In view of the above, the present invention provides a method, apparatus and system for data processing capable of solving the problem that retrieval requests occupy a large amount of system memory.
To solve this technical problem, a first aspect of the present invention provides a method of data processing, comprising:
grouping the file set that a retrieval request involves, obtaining multiple file subsets;
allocating a cache for the first file subset, so as to read the data in the first file subset;
after the data in the first file subset has been read, releasing the cache of the first file subset and allocating a cache for the next file subset, so as to read the data in the next file subset;
after the data of every file subset has been read, merging the data of all file subsets to obtain the user data returned to the client.
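The four claimed steps can be sketched as follows. This is a hedged illustration of the claim language, not the patented implementation; `read_file` and the other names are placeholders standing in for the real extraction of a key's data from one SSTable file.

```python
def retrieve(file_set, key, group_size, read_file):
    """Group the file set, cache one subset at a time, merge the results."""
    # step 1: group the file set into multiple file subsets
    subsets = [file_set[i:i + group_size]
               for i in range(0, len(file_set), group_size)]
    merged = []
    for subset in subsets:
        cache = []                   # cache allocated for this subset only
        for f in subset:
            cache.append(read_file(f, key))
        merged.extend(cache)         # keep the results of this sub-retrieval
        del cache                    # release the finished subset's cache
    # step 4: merged data of all subsets = user data returned to the client
    return merged

files = [f"sstable-{i}" for i in range(7)]
data = retrieve(files, "k", group_size=3, read_file=lambda f, k: (f, k))
assert len(data) == 7                # every file was read exactly once
```

At no point is more than one subset's cache alive, which is the whole point of the decomposition.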
A second aspect of the present invention provides an apparatus for data processing, comprising:
a grouping unit, configured to group the file set that a retrieval request involves and obtain multiple file subsets;
an allocation unit, configured to allocate a cache for the first file subset divided by the grouping unit;
a reading unit, configured to read the data in the first file subset divided by the grouping unit from the cache allocated by the allocation unit;
the allocation unit being further configured to, after the data in the first file subset has been read, release the cache of the first file subset and allocate a cache for the next file subset divided by the grouping unit;
the reading unit being further configured to read the data in the next file subset divided by the grouping unit from the cache allocated by the allocation unit;
a processing unit, configured to, after the reading unit has read the data of every file subset, merge the data of all file subsets read by the reading unit and obtain the user data returned to the client.
A third aspect of the present invention provides a system for data processing, comprising a client and a storage node, wherein the storage node includes the apparatus of the second aspect above.
Through the above technical solutions, the method, apparatus and system for data processing provided by the present invention group the file set that a retrieval request involves, allocate caches to the grouped file subsets one after another to read their data, and merge the data read from the file subsets to obtain the user data returned to the client. Compared with the prior art, which allocates caches for all files at the same time, the present invention decomposes one complete retrieval into multiple sub-retrievals executed in sequence, thereby spreading the cache allocation out over time. Since the cache allocation in the spatial dimension is decomposed along the time dimension, the system memory occupied by a retrieval request at any given moment is greatly reduced, and this "parallel-to-serial" approach lets the system memory support more retrieval requests at the same time, improving the concurrent throughput of the system.
The above description is only an overview of the technical solutions of the present invention. In order that the technical means of the present invention may be understood more clearly and implemented in accordance with the contents of the specification, and in order that the above and other objects, features and advantages of the present invention may be more readily apparent, specific embodiments of the present invention are set forth below.
Detailed description of the invention
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flow chart of a method of data processing in an embodiment of the present invention;
Fig. 2 shows a flow chart of data processing in an application scenario of the present invention;
Fig. 3 shows a schematic structural diagram of an apparatus for data processing in an embodiment of the present invention;
Fig. 4 shows a schematic structural diagram of another apparatus for data processing in an embodiment of the present invention;
Fig. 5 shows a schematic diagram of a system for data processing in an embodiment of the present invention.
Specific embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be embodied in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present invention will be more thoroughly understood and the scope of the disclosure will be fully conveyed to those skilled in the art.
In order to save the memory resources of a Cassandra storage system so that it can support more concurrent retrieval requests, an embodiment of the present invention provides a method of data processing. As shown in Fig. 1, the method comprises:
101. Group the file set that a retrieval request involves, obtaining multiple file subsets.
Any node in a Cassandra storage system can serve as an access node, i.e. it receives and forwards client retrieval requests and aggregates and feeds back the retrieval results. The SSTable data of a given key value (Key) may be stored on multiple data nodes of the distributed storage system; each node is responsible for retrieving from the file set under its own administration and feeds its results back to the access node, which aggregates them and returns them to the client.
As mentioned above, the key value in a retrieval request often involves numerous files; in other words, multiple files in the distributed storage system contain data for the key value, so one retrieval request often relates to multiple files. This embodiment describes the multiple files that one retrieval request involves as a file set.
Illustratively, when a search condition contains the keyword "patent", the system searches each storage node for all files containing the keyword "patent", including documents, tables, audio and video, and obtains the file set composed of these files.
Unlike the prior art, after obtaining the file set the system groups it, dividing it into multiple file subsets. In this embodiment, the purpose of grouping the file set is to decompose the retrieval into multiple sub-processes that are subsequently executed one after another.
In this embodiment, the basis for grouping the file set may be determined by network administrators and configured in the system, or may be set by the system itself. For the latter implementation, the criteria by which the system sets the grouping basis fall into two aspects: on the one hand, the system may formulate the grouping basis according to acquired system parameters; on the other hand, the system may formulate it according to the content or attribute information of the files. For the first aspect, the system parameters that may be referenced include, but are not limited to, memory size, transmission bandwidth, number of requests, number of files and number of nodes; for the second aspect, the system may formulate the grouping basis according to factors such as file number, file hash value, file name, file version number, file type, file size or the node a file belongs to. This embodiment does not limit the specific choice of grouping basis.
102. Allocate a cache for the first file subset, so as to read the data in the first file subset.
After obtaining multiple file subsets, the system reads the data relevant to the key value from each file subset in turn. Specifically, the system first allocates a cache for the first file subset in order to read the data in it. Unlike the prior art, while the data of the current file subset is being read the system does not allocate caches to the other file subsets; the other file subsets wait their turn to be read in a certain order.
In this embodiment, the ordering rule for file subsets may be, but is not limited to: ordering by the number of files contained, by data volume, by the sum of file weight values, by the sum of file numbers, and so on. The order of the file subsets determines the sequence in which data is read. In an implementation that is easier to realize, the system may also order the file subsets randomly, or in the order in which the grouping produced them. This embodiment places no restriction on the ordering rule or order of the file subsets.
103. After the data in the first file subset has been read, release the cache of the first file subset and allocate a cache for the next file subset, so as to read the data in the next file subset.
After the data in the first file subset has been read, the system allocates a cache for the next file subset. In this embodiment, since the caches of the file subsets are allocated one after another, before allocating a cache for the next file subset the system must release the cache of the previous one. After the data of the next file subset has been read, the system repeats step 103, releasing its cache and going on to allocate caches for the third, fourth and subsequent file subsets and read their data, until the data of all file subsets has been read.
104. After the data of every file subset has been read, merge the data of all file subsets to obtain the user data returned to the client.
The system aggregates and merges the data it has read, obtaining the user data returned to the client. Further, when merging the data the system may also deduplicate it, removing identical data supplied by different storage nodes and thus optimizing the retrieval result.
In the prior art, upon obtaining the file set the system allocates a section of cache for every file, extracts into that cache the data of the involved key value from the file, and merges the extracted data to obtain the user data answering the retrieval request. In general, the number of files a retrieval request involves is large, and allocating caches for all files of one retrieval request at the same time occupies a large amount of system memory. Illustratively, suppose the memory available in the system for serving retrieval requests is 2G, and the key value of some retrieval request exists in 1000 files. If extracting the data of one file requires a 32K cache, allocating caches for 1000 files simultaneously produces a total caching overhead of 32K*1000, i.e. about 32M; one retrieval request occupies 32M of memory. A system memory of 2G can then support at most 64 retrieval requests at the same time; such concurrent throughput clearly cannot meet the demands of a real network deployment.
And the method for data processing provided by the embodiment of the present invention, the All Files that a retrieval request can be related to
It is grouped, and caching distribution is successively carried out to multiple subset of the file of acquisition.Due to the quantity of documents handled on synchronization
The occupation proportion for reducing, therefore caching can also correspond to reduction, and system can vacate more memory sources and retrieve to others
It requests while being handled.Equally with above-mentioned data instance, it is assumed that 1000 files are divided into 100 subset of the file, then
Quantity of documents in each subset of the file is 10.If the data pick-up of a file needs to distribute the caching of 32K, same
When engrave the caching expense that the retrieval request only generates the total 320K of 32K*10, i.e. a retrieval request can only occupy 320K memory.
For the Installed System Memory of 2G size, parallel handling capacity then can extend to 6400 retrieval requests, with existing skill
Art is compared, and the parallel handling capacity of system can be expanded 100 times.
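The arithmetic in the two examples above can be checked directly (using the same loose rounding as the description, which treats 32K*1000 as roughly 32M):

```python
# Figures from the description: 2G of memory, 1000 files per request,
# 32K of cache per file, subsets of 10 files in the grouped method.
per_file_kb = 32
old_kb = per_file_kb * 1000        # prior art: all 1000 files cached at once
new_kb = per_file_kb * 10          # grouped: one 10-file subset at a time

assert old_kb == 32000             # ~32 MB per request (prior art)
assert new_kb == 320               # 320 KB per request (grouped method)
assert old_kb // new_kb == 100     # per-request footprint shrinks 100x
assert 64 * 100 == 6400            # so ~64 -> ~6400 concurrent requests
```

The 100x gain is simply the ratio of files cached simultaneously (1000 vs 10); the absolute ceiling of 6400 follows from scaling the prior-art figure of 64 by that ratio.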
Further, as a refinement of step 101 in Fig. 1, in another embodiment of the present invention a linkage mechanism may be established between retrieval requests and system memory resources. The system determines a grouping coefficient for each retrieval request according to the current memory occupation of the system, and groups the file set according to the determined grouping coefficient and the total number of files in the file set that the retrieval request involves.
In this embodiment, the grouping coefficient is used to determine the number of file subsets after grouping. The grouping coefficient should make the current memory occupation ratio and the number of file subsets positively correlated: the higher the occupation ratio of current system memory (i.e. the tighter the memory), the more file subsets there are after grouping. The purpose of establishing this positive correlation is that, when system memory is tight, increasing the number of file subsets reduces the cache occupied by each subset and thus improves the concurrent throughput of the system; and when system memory is idle, reducing the number of file subsets increases the cache occupied by each subset and thus speeds up the response time of retrieval requests. In this way the linkage between system memory and retrieval requests is achieved.
In practical application, the grouping coefficient may be determined for each newly received retrieval request according to the current system memory situation. Alternatively, the system may formulate a grouping coefficient according to the memory trend over a period of time, and retrieval requests received during that period use that grouping coefficient for file grouping. As a further alternative, the system may allow network administrators to set the grouping coefficient manually according to changes in system memory. This embodiment places no restriction on how the grouping coefficient is obtained.
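One hypothetical way to realize the positive correlation between memory occupation and subset count is a simple linear mapping; the thresholds and the mapping below are invented for illustration — the description itself fixes no formula:

```python
def subsets_for_occupancy(mem_occupancy, min_subsets=2, max_subsets=100):
    """Map the current memory occupation ratio (0.0..1.0) to a subset
    count: the tighter the memory, the more (and smaller) the subsets."""
    assert 0.0 <= mem_occupancy <= 1.0
    span = max_subsets - min_subsets
    return min_subsets + round(span * mem_occupancy)

assert subsets_for_occupancy(0.0) == 2    # idle memory: few, large subsets
assert subsets_for_occupancy(1.0) == 100  # tight memory: many small subsets
```

Any monotone increasing mapping would do; linear is merely the simplest choice satisfying the stated correlation.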
Further, the grouping coefficient in this embodiment may be, but is not limited to, a subset file count, a set accounting (ratio) or a subset count. The grouping of a file set according to each of these grouping coefficients is introduced below:
1) The grouping coefficient is a subset file count
The subset file count refers to the number of files each file subset contains after grouping. For example, the subset file count determined by the system might be 5, 10, 30, 50, 85, 100, 150, etc. The subset file count is negatively correlated with the current memory occupation ratio: the higher the current memory occupation ratio, the smaller the subset file count. In view of practical applications, a suitable interval for the subset file count is [2, 10000]. The subset file count is determined by the system according to the current system memory occupation ratio, or by manual setting.
Illustratively, suppose the file set some retrieval request involves contains 1500 files and the subset file count is 30; the system then computes the number of file subsets as 1500/30, i.e. 50, where each file subset contains 30 files.
It should be noted that this embodiment does not require the subset file count to divide the total number of files exactly; when it does not, the remaining files form a separate file subset of their own. For example, when the total number of files divided by the subset file count gives a quotient m with a remainder n (n being less than the subset file count), the number of file subsets after grouping is m+1.
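The grouping by subset file count, including the m+1 case for a remainder, can be sketched as:

```python
def group_by_subset_size(files, subset_size):
    """Split a file set so each subset holds `subset_size` files; when
    the total is not divisible, the remaining n files form one extra
    subset (m + 1 subsets overall, as in the description)."""
    return [files[i:i + subset_size]
            for i in range(0, len(files), subset_size)]

# exact division: 1500 files / 30 per subset = 50 subsets
assert len(group_by_subset_size(list(range(1500)), 30)) == 50
# remainder: 100 files / 30 -> m=3 full subsets plus one of n=10 files
groups = group_by_subset_size(list(range(100)), 30)
assert len(groups) == 4 and len(groups[-1]) == 10
```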
2) The grouping coefficient is a set accounting
The set accounting refers to the ratio of the number of files in a subset to the total number of files; in effect, this ratio reflects the number of file subsets that are marked off. For example, a set accounting of 20% means dividing the file set into 5 file subsets, each containing 1/5 of the total files. The set accounting is negatively correlated with the current memory occupation ratio: the higher the current memory occupation ratio, the smaller the set accounting. Similarly to 1), the set accounting may likewise be determined by the system according to the current system memory occupation ratio, or by manual setting.
In practical application, the set accounting may be set to 5%, 10%, 12%, 36%, 65%, 80% or 94%; a more suitable value interval for the set accounting is [0.001%, 50%].
Likewise, setting a set accounting does not guarantee that the number of file subsets marked off is an integer; when the number of file subsets comes out as a floating-point value, the number of subsets actually divided is the integer part plus 1.
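A minimal sketch of grouping by set accounting, including the "integer part plus 1" rule for fractional results; the small rounding guard is an implementation detail added here, not something the description prescribes:

```python
import math

def subsets_for_ratio(ratio):
    """A set accounting of 20% yields 5 subsets (1/0.2); when 1/ratio is
    fractional, take the integer part plus one, i.e. round up. The
    round() guard avoids floating-point noise such as 1/0.2 != 5."""
    return math.ceil(round(1.0 / ratio, 6))

assert subsets_for_ratio(0.20) == 5   # 20% -> 5 subsets of 1/5 each
assert subsets_for_ratio(0.12) == 9   # 1/0.12 = 8.33... -> 8 + 1 subsets
```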
3) The grouping coefficient is a subset count
More intuitively, the subset count is used directly to specify the number of file subsets after grouping. The subset count is positively correlated with the current memory occupation ratio: the higher the current memory occupation ratio, the larger the subset count. The system may determine the subset count directly according to the current system memory occupation ratio, or by manual setting.
In practical application, the value range of the subset count is any positive integer from 2 upward; differently from 1) and 2) above, the value of the subset count cannot be floating-point data.
The above is only an exemplary illustration of grouping parameters and grouping modes; in practical application the system may also use other parameters as the basis for grouping the file set, and this embodiment does not introduce one by one the specific parameters the system may use.
Further, in another embodiment of the present invention, to further reduce the system memory occupied by retrieval requests, the system may also merge identical retrieval requests. One feasible implementation is as follows: before executing step 101 of Fig. 1, the system analyzes the key values of multiple retrieval requests and normalizes those with identical key values, merging multiple identical retrieval requests into one retrieval request; it then executes the flow shown in Fig. 1 from step 101 onward to process the merged retrieval request. After completing the retrieval, the system feeds the user data back separately to each client that sent a retrieval request, thereby responding to multiple retrieval requests in one pass.
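The normalization of identical requests might look like the following sketch, in which each key value is retrieved only once and every originating client is remembered for the fan-out of the result (names are illustrative):

```python
def merge_identical(requests):
    """Merge concurrently received requests by key value: one retrieval
    per distinct key, with every awaiting client recorded for feedback."""
    merged = {}                              # key -> [clients awaiting it]
    for client, key in requests:
        merged.setdefault(key, []).append(client)
    return merged

reqs = [("c1", "patent"), ("c2", "patent"), ("c3", "cache")]
merged = merge_identical(reqs)
assert len(merged) == 2                      # only two retrievals executed
assert merged["patent"] == ["c1", "c2"]      # both clients get the result
```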
Further, in another embodiment of the present invention, to reduce the memory occupation caused by calling the same file too many times, the system may also merge associated retrieval requests. So-called associated retrieval requests are multiple retrieval requests whose key values are contained in the same file. For such a file, the system can extract the data for all the different key values in one pass over the file, thereby reducing repeated calls to the same file and in turn saving the caching overhead of calling that file.
Further, in another embodiment of the present invention, when there are multiple retrieval requests the system may preferentially allocate caches to the retrieval request of the highest priority. On receiving retrieval requests from clients, the system first assigns each retrieval request a priority; the basis for assigning priority may be, but is not limited to: the urgency of the retrieval request, the source classification of the retrieval request, the total number of files the retrieval request involves, the number of file subsets the retrieval request involves, and the number of files per file subset in the retrieval request.
The urgency of a retrieval request may be chosen by the client according to its needs; the urgency classes are formulated uniformly by the system and may include "not urgent", "general", "relatively urgent", "very urgent", etc., a retrieval request of higher urgency having higher priority. The source classification of a retrieval request is mainly a classification of the client, including "employee terminal", "responsible person terminal", "manager terminal", "administrator terminal", etc.; the system can identify the client by the source IP address of the retrieval request. In general the administrator's requests rank highest, and the remaining personnel are assigned priority according to their administrative grade. The total number of files a retrieval request involves refers to the number of files in its file set: after receiving a retrieval request, the system first obtains the files from the storage nodes according to the key value, then counts the obtained files and classifies the retrieval request by their number, requests involving more files having higher priority. Similarly, the retrieval system may also count and classify the number of file subsets a retrieval request involves, requests with more file subsets having higher priority. The number of files per file subset refers to the number of files in each file subset of the request; usually, the more files per subset, the higher the request's priority.
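A sketch of the priority assignment using the example urgency and source classes above; the numeric ranks and the heap-based queue are assumptions of this illustration, not part of the description:

```python
import heapq

URGENCY = {"very urgent": 3, "relatively urgent": 2, "general": 1,
           "not urgent": 0}
SOURCE = {"administrator terminal": 3, "manager terminal": 2,
          "responsible person terminal": 1, "employee terminal": 0}

def priority(req):
    # higher urgency, higher-ranked source, and more files => served first
    return (URGENCY[req["urgency"]], SOURCE[req["source"]], req["files"])

queue = []
for i, req in enumerate([
        {"urgency": "general", "source": "employee terminal", "files": 40},
        {"urgency": "very urgent", "source": "employee terminal", "files": 5},
        {"urgency": "general", "source": "manager terminal", "files": 40}]):
    # negate for a max-heap on the min-heap heapq; i breaks ties
    heapq.heappush(queue, (tuple(-x for x in priority(req)), i, req))

first = heapq.heappop(queue)[2]
assert first["urgency"] == "very urgent"   # urgent request served first
```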
In this embodiment, the system's preferential allocation of caches to the retrieval request of the highest priority has two layers of meaning. First, when system memory is insufficient, the retrieval request of the highest priority is processed first and the other retrieval requests queue for processing in priority order. Second, when multiple different retrieval requests are being processed at the same time and system memory becomes idle (for example, some retrieval request finishes and releases its cache), among the remaining retrieval requests being processed the system preferentially allocates caches to the pending file subsets of the higher-priority requests.
In the above embodiments the system batches retrieval requests along the time dimension, processing only one file subset at a time while the remaining file subsets wait. Further, to improve the utilization of system memory, in another embodiment of the present invention the system may, when memory becomes idle, allocate caches to at least two file subsets of the same retrieval request simultaneously for parallel processing. For example, when some retrieval request finishes and releases its cache, the system may select one or more of the retrieval requests being processed (by priority, at random, etc.) and allocate caches to one or more of their pending file subsets; those retrieval requests then have at least two file subsets being processed at the same time. As another example, when some retrieval request finishes and releases its cache, the system may process a new retrieval request and allocate caches to at least two of its file subsets simultaneously.
To further reduce the occupation of system memory, in a last embodiment of the present invention the system may additionally deduplicate repeated retrieval requests. In the earlier embodiment on request deduplication, the system mainly deduplicated multiple identical requests initiated at the same time; in this embodiment the system handles multiple identical requests initiated one after another over a period of time, further saving the system memory spent on data retrieval. In real life, a client may retrieve the same key value repeatedly, or different clients may retrieve the same key value one after another within a period of time. For such cases the system may cache the user data obtained from the first retrieval request it processes; when an identical retrieval request for the same key value is received later, the system can directly read the user data from the cache and return it to the client. It should be noted that the data stored on the storage nodes usually changes dynamically to some degree, so the implementation provided by this embodiment must limit the time window for repeated retrieval requests, for example applying the above treatment only to repeated requests within 4 hours or within one day, in order to reduce the probability of serving changed data. In addition, for the cached user data, the system may re-run the retrieval during idle periods (such as at night) to obtain and cache updated user data with which to answer subsequent identical requests.
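The time-limited reuse of cached user data might be sketched as follows; the 4-hour window and the injected clock are illustrative choices, not fixed by the description:

```python
import time

class ResultCache:
    """Cache user data per key value and serve repeat requests from it,
    but only within a limited period, to bound the risk of stale data."""
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}                  # key -> (user_data, stored_at)

    def put(self, key, user_data):
        self.store[key] = (user_data, self.clock())

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        data, stamp = entry
        if self.clock() - stamp > self.ttl:
            del self.store[key]          # expired: force a fresh retrieval
            return None
        return data

now = [0.0]                                      # controllable fake clock
cache = ResultCache(ttl_seconds=4 * 3600, clock=lambda: now[0])
cache.put("patent", ["doc1", "doc2"])
assert cache.get("patent") == ["doc1", "doc2"]   # repeat request: cache hit
now[0] += 5 * 3600                               # five hours later
assert cache.get("patent") is None               # expired: retrieve again
```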
In practical applications, the above embodiments can be applied to various distributed storage systems, which may be relational databases or non-relational databases. Below, taking the Cassandra system as an example, the present invention describes one application scenario, so as to show how the above method embodiments can be applied in the Cassandra system. The Cassandra system is a typical distributed storage system with a ring structure and no central node; it consists of many storage nodes, each of which provides both the data storage function and the request access function. Each storage node can independently respond to client retrieval requests. The Cassandra system scatters data across the storage nodes by means of a hash algorithm; each storage node is responsible for managing the data of a contiguous range (Range) on the ring-shaped storage structure. Data is stored in the storage nodes in the SSTable file format. When storing data, a client writes several columns (Column) of data associated with a key value (Key) into a Memtable in system memory; after the Memtable is full, the system dumps the data in the Memtable onto the disk of a storage node, forming an SSTable file.
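A minimal sketch of the hash-based placement described above, assuming MD5 as the hash and a 32-bit ring; Cassandra's actual partitioner and token space differ, so the function names and ring size here are illustrative assumptions only.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 32  # illustrative ring size, not Cassandra's real token space

def ring_position(key):
    """Hash a key value onto a position on the ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE

def owner_token(key, node_tokens):
    """Each storage node manages the contiguous range ending at its token;
    the first token at or after the key's position owns the key, wrapping
    around the ring past the last token."""
    pos = ring_position(key)
    tokens = sorted(node_tokens)
    idx = bisect_left(tokens, pos) % len(tokens)  # wrap around the ring
    return tokens[idx]
```

The mapping is deterministic, so every node that receives the same key value computes the same owner.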
As shown in Fig. 2, based on the above system architecture and storage characteristics, the data retrieval process in this scenario includes:
201. A storage node receives a retrieval request sent by a client.
Each storage node in the Cassandra system independently provides the request access function; this solution is illustrated with one storage node out of the many. The retrieval request received by the storage node carries a key value Key.
202. According to the file index in memory, the storage node looks up all SSTable files containing data for the Key, obtaining an SSTable file set.
These SSTable files may be SSTable files stored on the storage node itself, or SSTable files stored on other storage nodes.
203. The storage node groups the SSTable file set, obtaining multiple SSTable file subsets.
For example, the SSTable file set contains 2000 SSTable files, and the storage node divides the SSTable file set into 100 SSTable file subsets, each containing 20 SSTable files.
204. The storage node allocates a cache for each of the 20 SSTable files in one SSTable file subset, opens each SSTable file, and reads its data into the corresponding cache.
For example, the storage node allocates a 32 KB cache for each SSTable file, so the total cache allocated for the first SSTable file subset is 640 KB.
205. The storage node merges the data read from the SSTable file subset.
The storage node deserializes the data extracted from the 20 SSTable files into Column data and merges it, obtaining the intermediate data corresponding to the SSTable file subset. At the same time, the storage node closes the 20 SSTable files and releases their caches.
After step 205, the storage node repeats steps 204 and 205 to extract the intermediate data of the 2nd, 3rd, and so on up to the 100th SSTable file subset.
206. The storage node merges the 100 pieces of intermediate data, obtaining the final user data.
The storage node feeds the obtained user data back to the client, ending the data retrieval process.
In this scenario, the cache occupied by one retrieval request at any moment is 640 KB. With 2 GB of memory, one storage node can support 3200 retrieval requests simultaneously, which greatly improves the parallel throughput of the Cassandra system.
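The numbers used in steps 203 to 206 can be put together in a short sketch of the per-subset loop; `read_sstable` and `merge_columns` stand in for the node's real deserialization and merge logic and are assumptions, not part of the specification.

```python
def group_files(files, subset_size=20):
    """Step 203: divide the SSTable file set into subsets of subset_size."""
    return [files[i:i + subset_size] for i in range(0, len(files), subset_size)]

def retrieve(files, read_sstable, merge_columns, cache_bytes=32 * 1024):
    """Steps 204-206: only one subset's caches (20 x 32 KB = 640 KB with the
    defaults) are held at any moment; each subset is read, merged into
    intermediate data, and its caches released before the next subset."""
    intermediates = []
    for subset in group_files(files):
        caches = [bytearray(cache_bytes) for _ in subset]  # step 204: allocate
        columns = [read_sstable(f, buf) for f, buf in zip(subset, caches)]
        intermediates.append(merge_columns(columns))       # step 205: merge
        del caches                                         # release subset caches
    return merge_columns(intermediates)                    # step 206: final merge
```

With 2000 files this yields 100 subsets, so the peak cache held for one request stays at 640 KB regardless of the total file count.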
Further, as an implementation of the methods in the above embodiments, the present invention also provides a data processing apparatus. The apparatus may be located in a storage node of a distributed storage system, or be independent of the storage nodes while having a data exchange relationship with them, and is used to implement the methods described in the above embodiments. As shown in Fig. 3, the apparatus includes: a grouping unit 31, an allocation unit 32, a reading unit 33, and a processing unit 34, wherein:
the grouping unit 31 is configured to group the file set involved in a retrieval request to obtain multiple file subsets;
the allocation unit 32 is configured to allocate a cache for the first file subset divided by the grouping unit 31;
the reading unit 33 is configured to read, from the cache allocated by the allocation unit 32, the data in the first file subset divided by the grouping unit 31;
the allocation unit 32 is further configured to, after the data in the first file subset has been read, release the cache of the first file subset and allocate a cache for the next file subset divided by the grouping unit 31;
the reading unit 33 is further configured to read, from the cache allocated by the allocation unit 32, the data in the next file subset divided by the grouping unit 31;
the processing unit 34 is configured to, after the reading unit 33 has read the data of every file subset, merge the data of all file subsets read by the reading unit 33 to obtain the user data returned to the client.
Further, as shown in Fig. 4, the grouping unit 31 includes:
a determining module 311, configured to determine a grouping coefficient according to the current memory occupation;
a dividing module 312, configured to group the file set according to the grouping coefficient determined by the determining module 311 and the total number of files in the file set;
wherein the current memory occupation ratio is positively correlated with the number of file subsets after grouping.
Further, the grouping coefficient determined by the determining module 311 is the number of files in each file subset, and the number of files is negatively correlated with the current memory occupation ratio.
Further, the grouping coefficient determined by the determining module 311 is the proportion of a file subset to the whole file set, and the proportion is negatively correlated with the current memory occupation ratio.
Further, the grouping coefficient determined by the determining module 311 is the number of file subsets, and the number of file subsets is positively correlated with the current memory occupation ratio.
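The correlations stated above can be illustrated with simple formulas; the linear mapping and the 64-file ceiling below are assumptions chosen for the sketch, not values from the specification.

```python
def files_per_subset(memory_occupation, max_per_subset=64):
    """Grouping coefficient as files per subset: the higher the current
    memory occupation ratio (0.0-1.0), the fewer files per subset
    (negative correlation)."""
    return max(1, int(max_per_subset * (1.0 - memory_occupation)))

def subset_count(total_files, memory_occupation):
    """Grouping coefficient as the number of subsets: the higher the memory
    occupation ratio, the more (and smaller) the subsets (positive
    correlation)."""
    per_subset = files_per_subset(memory_occupation)
    return -(-total_files // per_subset)  # ceiling division
```

Any monotone mapping with the stated correlations would satisfy the embodiment; the point is only that a busier memory yields smaller subsets and therefore a smaller per-moment cache footprint.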
Further, as shown in Fig. 4, the apparatus further comprises:
a merging unit 35, configured to, when there are multiple retrieval requests, merge the retrieval requests whose retrieval objects are identical into one before the grouping unit 31 groups the file set involved in a retrieval request.
Further, the allocation unit 32 is configured to, when there are multiple retrieval requests, preferentially allocate caches for the retrieval requests of the first priority;
wherein the criteria for identifying a first-priority retrieval request include:
the urgency of the retrieval request, the origin category of the retrieval request, the total number of files involved in the retrieval request, the number of file subsets involved in the retrieval request, and the number of files per file subset in the retrieval request.
Further, the allocation unit 32 is configured to, when memory is idle, allocate caches for at least two file subsets of the same retrieval request simultaneously.
Further, as shown in Fig. 4, the apparatus further comprises:
a writing unit 36, configured to, for a retrieval request whose frequency exceeds a preset frequency threshold or whose count exceeds a preset count threshold, write the user data read by the reading unit 33 into a cache after the reading unit 33 has read that user data;
the processing unit 34 is configured to, when the same retrieval request is received again, obtain the user data from the cache.
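The merging unit's behavior, collapsing concurrent identical retrieval requests into one, can be sketched as request coalescing; the class name and locking scheme are illustrative assumptions, not the patented implementation.

```python
import threading

class RequestCoalescer:
    """Merges concurrent retrieval requests with the same retrieval object
    so that grouping and cache allocation run only once per key value."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> (done event, shared result holder)

    def retrieve(self, key, do_retrieve):
        with self._lock:
            entry = self._in_flight.get(key)
            leader = entry is None
            if leader:
                entry = (threading.Event(), {})
                self._in_flight[key] = entry
        done, holder = entry
        if leader:
            holder["data"] = do_retrieve(key)  # only the leader reads the files
            with self._lock:
                del self._in_flight[key]
            done.set()
        else:
            done.wait()  # followers reuse the leader's result
        return holder["data"]
```

Requests arriving after the leader finishes start a fresh retrieval, which matches the text: only requests outstanding at the same time are merged.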
In practical applications, the apparatus shown in Fig. 3 or Fig. 4 may be located in a storage node of the Cassandra system, or be independent of the storage nodes while having a data exchange relationship with them, and is used to implement the methods described in the above embodiments.
The data processing apparatus provided by the embodiments of the present invention can group the file set involved in a retrieval request, allocate caches to the resulting file subsets in turn to read their data, and merge the data read from each file subset to obtain the user data returned to the client. Compared with the prior art, in which caches are allocated for all files simultaneously, the apparatus provided by the embodiments of the present invention can decompose one complete retrieval process into multiple sub-retrieval processes that are executed in sequence, thereby spreading the cache footprint over time. Because the cache footprint in the spatial dimension is decomposed into the time dimension, the system memory occupied by a retrieval request at any given moment is greatly reduced; this conversion from "parallel" to "serial" allows the system memory to support more retrieval requests at the same time, thus improving the concurrent throughput of the system.
Further, as an implementation of the methods in the above embodiments, the present invention also provides a data processing system for implementing the methods described in the above embodiments. As shown in Fig. 5, the system includes a client 51 and storage nodes 52, wherein each storage node 52 includes the apparatus shown in Fig. 3 or Fig. 4, or has a data exchange relationship with the apparatus shown in Fig. 3 or Fig. 4.
In practical applications, the storage system composed of the storage nodes 52 may be a Cassandra system, in which each storage node 52 can independently respond to the retrieval requests sent by the clients 51 that access it.
The data processing system provided by the embodiments of the present invention can group the file set involved in a retrieval request, allocate caches to the resulting file subsets in turn to read their data, and merge the data read from each file subset to obtain the user data returned to the client. Compared with the prior art, in which caches are allocated for all files simultaneously, the system provided by the embodiments of the present invention can decompose one complete retrieval process into multiple sub-retrieval processes that are executed in sequence, thereby spreading the cache footprint over time. Because the cache footprint in the spatial dimension is decomposed into the time dimension, the system memory occupied by a retrieval request at any given moment is greatly reduced; this conversion from "parallel" to "serial" allows the system memory to support more retrieval requests at the same time, thus improving the concurrent throughput of the system.
The present invention provides the following technical solutions:
A1. A method of data processing, the method comprising:
grouping the file set involved in a retrieval request to obtain multiple file subsets;
allocating a cache for the first file subset to read the data in the first file subset;
after the data in the first file subset has been read, releasing the cache of the first file subset and allocating a cache for the next file subset to read the data in the next file subset;
after the data of every file subset has been read, merging the data of all file subsets to obtain the user data returned to the client.
A2. The method according to A1, wherein grouping the file set involved in a retrieval request comprises:
determining a grouping coefficient according to the current memory occupation;
grouping the file set according to the grouping coefficient and the total number of files in the file set;
wherein the current memory occupation ratio is positively correlated with the number of file subsets after grouping.
A3. The method according to A2, wherein the grouping coefficient is the number of files in each file subset, and the number of files is negatively correlated with the current memory occupation ratio.
A4. The method according to A2, wherein the grouping coefficient is the proportion of a file subset to the whole file set, and the proportion is negatively correlated with the current memory occupation ratio.
A5. The method according to A2, wherein the grouping coefficient is the number of file subsets, and the number of file subsets is positively correlated with the current memory occupation ratio.
A6. The method according to A1, wherein before grouping the file set involved in a retrieval request, the method further comprises:
when there are multiple retrieval requests, merging the retrieval requests whose retrieval objects are identical into one.
A7. The method according to A1, wherein the method further comprises:
when there are multiple retrieval requests, preferentially allocating caches for the retrieval requests of the first priority;
wherein the criteria for identifying a first-priority retrieval request include:
the urgency of the retrieval request, the origin category of the retrieval request, the total number of files involved in the retrieval request, the number of file subsets involved in the retrieval request, and the number of files per file subset in the retrieval request.
A8. The method according to A1, wherein when memory is idle, the method further comprises:
allocating caches for at least two file subsets of the same retrieval request simultaneously.
A9. The method according to A1, wherein for a retrieval request whose frequency exceeds a preset frequency threshold or whose count exceeds a preset count threshold, after its user data has been read, the method further comprises:
writing the user data into a cache;
when the same retrieval request is received again, obtaining the user data from the cache.
B10. An apparatus for data processing, the apparatus comprising:
a grouping unit, configured to group the file set involved in a retrieval request to obtain multiple file subsets;
an allocation unit, configured to allocate a cache for the first file subset divided by the grouping unit;
a reading unit, configured to read, from the cache allocated by the allocation unit, the data in the first file subset divided by the grouping unit;
the allocation unit being further configured to, after the data in the first file subset has been read, release the cache of the first file subset and allocate a cache for the next file subset divided by the grouping unit;
the reading unit being further configured to read, from the cache allocated by the allocation unit, the data in the next file subset divided by the grouping unit;
a processing unit, configured to, after the reading unit has read the data of every file subset, merge the data of all file subsets read by the reading unit to obtain the user data returned to the client.
B11. The apparatus according to B10, wherein the grouping unit comprises:
a determining module, configured to determine a grouping coefficient according to the current memory occupation;
a dividing module, configured to group the file set according to the grouping coefficient determined by the determining module and the total number of files in the file set;
wherein the current memory occupation ratio is positively correlated with the number of file subsets after grouping.
B12. The apparatus according to B11, wherein the grouping coefficient determined by the determining module is the number of files in each file subset, and the number of files is negatively correlated with the current memory occupation ratio.
B13. The apparatus according to B11, wherein the grouping coefficient determined by the determining module is the proportion of a file subset to the whole file set, and the proportion is negatively correlated with the current memory occupation ratio.
B14. The apparatus according to B11, wherein the grouping coefficient determined by the determining module is the number of file subsets, and the number of file subsets is positively correlated with the current memory occupation ratio.
B15. The apparatus according to B10, wherein the apparatus further comprises:
a merging unit, configured to, when there are multiple retrieval requests, merge the retrieval requests whose retrieval objects are identical into one before the grouping unit groups the file set involved in a retrieval request.
B16. The apparatus according to B10, wherein the allocation unit is configured to, when there are multiple retrieval requests, preferentially allocate caches for the retrieval requests of the first priority;
wherein the criteria for identifying a first-priority retrieval request include:
the urgency of the retrieval request, the origin category of the retrieval request, the total number of files involved in the retrieval request, the number of file subsets involved in the retrieval request, and the number of files per file subset in the retrieval request.
B17. The apparatus according to B10, wherein the allocation unit is configured to, when memory is idle, allocate caches for at least two file subsets of the same retrieval request simultaneously.
B18. The apparatus according to B10, wherein the apparatus further comprises:
a writing unit, configured to, for a retrieval request whose frequency exceeds a preset frequency threshold or whose count exceeds a preset count threshold, write the user data read by the reading unit into a cache after the reading unit has read that user data;
the processing unit being configured to, when the same retrieval request is received again, obtain the user data from the cache.
C19. A system for data processing, the system comprising a client and storage nodes, wherein each storage node includes the apparatus according to any one of B10 to B18.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
It can be understood that the related features in the above methods and apparatuses may refer to one another. In addition, "first", "second", and the like in the above embodiments are used to distinguish the embodiments and do not represent the relative merits of each embodiment.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The structure required to construct such a system is apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It should be understood that various programming languages may be used to implement the content of the invention described herein, and the above descriptions of specific languages are given in order to disclose the best mode of carrying out the invention.
Numerous specific details are set forth in the specification provided here. It should be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this specification.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the invention, the features of the invention are sometimes grouped together into a single embodiment, figure, or description thereof. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the present invention.
Those skilled in the art will understand that the modules in the devices of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components in an embodiment may be combined into one module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
In addition, those skilled in the art will appreciate that, although some embodiments described herein include certain features that are included in other embodiments rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the apparatus according to embodiments of the present invention. The present invention may also be implemented as device or apparatus programs (for example, computer programs and computer program products) for performing part or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claims. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any ordering; these words may be interpreted as names.
Claims (10)
1. A method of data processing, characterized in that the method comprises:
grouping the file set involved in a retrieval request to obtain multiple file subsets;
allocating a cache for the first file subset to read the data in the first file subset;
after the data in the first file subset has been read, releasing the cache of the first file subset and allocating a cache for the next file subset to read the data in the next file subset;
after the data of every file subset has been read, merging the data of all file subsets to obtain the user data returned to the client.
2. The method according to claim 1, characterized in that grouping the file set involved in a retrieval request comprises:
determining a grouping coefficient according to the current memory occupation;
grouping the file set according to the grouping coefficient and the total number of files in the file set;
wherein the current memory occupation ratio is positively correlated with the number of file subsets after grouping.
3. The method according to claim 2, characterized in that the grouping coefficient is the number of files in each file subset, and the number of files is negatively correlated with the current memory occupation ratio.
4. The method according to claim 2, characterized in that the grouping coefficient is the proportion of a file subset to the whole file set, and the proportion is negatively correlated with the current memory occupation ratio.
5. The method according to claim 2, characterized in that the grouping coefficient is the number of file subsets, and the number of file subsets is positively correlated with the current memory occupation ratio.
6. The method according to claim 1, characterized in that before grouping the file set involved in a retrieval request, the method further comprises:
when there are multiple retrieval requests, merging the retrieval requests whose retrieval objects are identical into one.
7. The method according to claim 1, characterized in that the method further comprises:
when there are multiple retrieval requests, preferentially allocating caches for the retrieval requests of the first priority;
wherein the criteria for identifying a first-priority retrieval request include:
the urgency of the retrieval request, the origin category of the retrieval request, the total number of files involved in the retrieval request, the number of file subsets involved in the retrieval request, and the number of files per file subset in the retrieval request.
8. The method according to claim 1, characterized in that when memory is idle, the method further comprises:
allocating caches for at least two file subsets of the same retrieval request simultaneously.
9. An apparatus for data processing, characterized in that the apparatus comprises:
a grouping unit, configured to group the file set involved in a retrieval request to obtain multiple file subsets;
an allocation unit, configured to allocate a cache for the first file subset divided by the grouping unit;
a reading unit, configured to read, from the cache allocated by the allocation unit, the data in the first file subset divided by the grouping unit;
the allocation unit being further configured to, after the data in the first file subset has been read, release the cache of the first file subset and allocate a cache for the next file subset divided by the grouping unit;
the reading unit being further configured to read, from the cache allocated by the allocation unit, the data in the next file subset divided by the grouping unit;
a processing unit, configured to, after the reading unit has read the data of every file subset, merge the data of all file subsets read by the reading unit to obtain the user data returned to the client.
10. A system for data processing, characterized in that the system comprises a client and storage nodes, wherein each storage node includes the apparatus according to claim 9.
Publications (1)

Publication Number | Publication Date
---|---
CN109634933A (en) | 2019-04-16
Family
ID=55829369
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410594676.8A Active CN105550180B (en) | 2014-10-29 | 2014-10-29 | The method, apparatus and system of data processing |
CN201811617963.0A Pending CN109634933A (en) | 2014-10-29 | 2014-10-29 | The method, apparatus and system of data processing |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410594676.8A Active CN105550180B (en) | 2014-10-29 | 2014-10-29 | The method, apparatus and system of data processing |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN105550180B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111444023A (en) * | 2020-04-13 | 2020-07-24 | 中国银行股份有限公司 | Data processing method, device, equipment and readable storage medium |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10817515B2 (en) | 2017-07-26 | 2020-10-27 | International Business Machines Corporation | Cognitive data filtering for storage environments |
US10884980B2 (en) | 2017-07-26 | 2021-01-05 | International Business Machines Corporation | Cognitive file and object management for distributed storage environments |
CN109240607B (en) * | 2018-08-21 | 2022-02-18 | 郑州云海信息技术有限公司 | File reading method and device |
CN109783523B (en) * | 2019-01-24 | 2022-02-25 | 广州虎牙信息科技有限公司 | Data processing method, device, equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5553303A (en) * | 1990-08-31 | 1996-09-03 | Fujitsu Limited | Data processing system for dynamically switching access control process and for performing recovery process |
CN102929929A (en) * | 2012-09-24 | 2013-02-13 | 深圳市网信联动技术有限公司 | Method and device for data summarization |
CN103259745A (en) * | 2013-05-31 | 2013-08-21 | 东蓝数码股份有限公司 | Design method for improving memory usage rate of buffer area in network programming |
CN103559244A (en) * | 2013-10-28 | 2014-02-05 | 东软集团股份有限公司 | Method and system for obtaining E-mail body based on mbx format |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103744628B (en) * | 2014-01-27 | 2016-09-28 | 北京奇虎科技有限公司 | SSTable file storage method and device |
CN104360824B (en) * | 2014-11-10 | 2017-12-12 | 北京奇虎科技有限公司 | The method and apparatus that a kind of data merge |
-
2014
- 2014-10-29 CN CN201410594676.8A patent/CN105550180B/en active Active
- 2014-10-29 CN CN201811617963.0A patent/CN109634933A/en active Pending
Non-Patent Citations (1)
Title |
---|
郭鹏 (Guo Peng): "《Cassandra实战》" (Cassandra in Action), 30 June 2011, 机械工业出版社 (China Machine Press) * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111444023A (en) * | 2020-04-13 | 2020-07-24 | 中国银行股份有限公司 | Data processing method, device, equipment and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN105550180B (en) | 2019-02-12 |
CN105550180A (en) | 2016-05-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6263364B1 (en) | Web crawler system using plurality of parallel priority level queues having distinct associated download priority levels for prioritizing document downloading and maintaining document freshness | |
US6351755B1 (en) | System and method for associating an extensible set of data with documents downloaded by a web crawler | |
CN105550180B (en) | The method, apparatus and system of data processing | |
US8543596B1 (en) | Assigning blocks of a file of a distributed file system to processing units of a parallel database management system | |
US9002871B2 (en) | Method and system of mapreduce implementations on indexed datasets in a distributed database environment | |
CN102831120B (en) | A kind of data processing method and system | |
US6772163B1 (en) | Reduced memory row hash match scan join for a partitioned database system | |
CN103593436B (en) | file merging method and device | |
CN110291518A (en) | Merge tree garbage index | |
Cheng et al. | Efficient query processing on graph databases | |
CN106294352B (en) | A kind of document handling method, device and file system | |
CN110162528A (en) | Magnanimity big data search method and system | |
CN110268399A (en) | Merging tree for attended operation is modified | |
CN105302840B (en) | A kind of buffer memory management method and equipment | |
CN107025243A (en) | A kind of querying method of resource data, inquiring client terminal and inquiry system | |
JP2014142940A (en) | Storage side storage request management | |
CN104111936B (en) | Data query method and system | |
CN107247778A (en) | System and method for implementing expansible data storage service | |
US7917495B1 (en) | System and method for processing query requests in a database system | |
CN108197296A (en) | Date storage method based on Elasticsearch indexes | |
CN109815234A (en) | A kind of multiple cuckoo filter under streaming computing model | |
US11308066B1 (en) | Optimized database partitioning | |
CN109033462A (en) | The method and system of low-frequency data item are determined in the storage equipment of big data storage | |
CN113590332B (en) | Memory management method, device and memory distributor | |
CN109117426A (en) | Distributed networks database query method, apparatus, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190416 |