CN106844706A - Method and device for updating web page storage, web page storage system, and web page search system
- Publication number
- CN106844706A (application number CN201710065766.1A)
- Authority
- CN
- China
- Prior art keywords
- hash table
- web
- web data
- hash
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
- G06F16/986—Document structures and storage, e.g. HTML extensions
Abstract
The invention discloses a method and a device for updating web page storage, a web page storage system, and a search system. The method for updating web page storage includes: detecting the collision rate of a first hash table of a web page storage system, wherein the first hash table stores web page data; creating a second hash table when the collision rate is greater than an update threshold; and migrating the web page data in the first hash table to the second hash table over multiple migration passes, wherein each migration pass moves a portion of the web page data in the first hash table to the second hash table. According to one embodiment, the impact of the migration operation on query performance can be reduced.
Description
Technical field
The present invention relates to the technical field of web page storage and web page search, and more particularly to a method for updating web page storage, a device for updating web page storage, a web page storage system, and a web page search system.
Background
In web page search systems for Internet applications, web page data usually needs to be stored, in the form of web page summaries, in the web page storage system of the search system. Because the number of web pages on the Internet is enormous, the web page data is usually stored in the web page storage system as key-value pairs, where the key of each pair is the normalized web page address and the value is the web page content or the web page summary content.
In existing web page search systems, because the number of web pages is enormous and the search system is not sensitive to updates of web page content, the web page data in the web page storage system is usually updated daily or weekly, or only part of the web page data is updated in real time.
Generally, there are two ways to update web page data in real time.
In the first way, real-time updating is implemented with an auxiliary storage device (for example, a Redis system). This increases the complexity of the web page storage system.
In the second way, web page data is stored in an open-chaining (separate-chaining) hash table to enable fast lookup of web page data. When a large number of web pages are stored in the hash table, hash collisions can sharply degrade query performance. At that point, the hash table needs to be expanded.
In the prior art, when a hash table is expanded, a new hash table is first created, and the web page data stored in the original hash table is then copied into the new hash table all at once. During this one-time copy, the query performance of the original hash table drops sharply.
Therefore, a new technical solution is needed that improves on at least one of the above technical problems in the prior art.
Summary of the invention
It is an object of the present invention to provide a new solution for updating web page storage.
According to a first aspect of the invention, there is provided a method for updating web page storage, including: detecting the collision rate of a first hash table of a web page storage system, wherein the first hash table stores web page data; creating a second hash table when the collision rate is greater than an update threshold, wherein the capacity of the second hash table is greater than the capacity of the first hash table; and migrating the web page data in the first hash table to the second hash table over multiple migration passes, wherein each migration pass moves a portion of the web page data in the first hash table to the second hash table.
Optionally or alternatively, the collision rate is the ratio of the number of hash buckets occupied by the web page data currently held in the hash table to the total number of hash buckets in the hash table, and the update threshold is a threshold on that ratio.
Optionally or alternatively, the collision rate is the number of hash buckets occupied by the web page data currently stored in the first hash table, and the update threshold is a threshold on the number of hash buckets.
Optionally or alternatively, migrating the web page data in the first hash table to the second hash table over multiple migration passes further includes: migrating web page data in the first hash table to the second hash table when a query is received.
Optionally or alternatively, migrating web page data in the first hash table to the second hash table when a query is received further includes: setting a migration cursor i in the first hash table, wherein the migration cursor i indicates the web page data elements currently to be migrated; and, when a query is received, migrating the web page data elements indicated by the migration cursor i to the second hash table.
Optionally or alternatively, the web page data elements currently to be migrated include the elements corresponding to one or more hash buckets, or one or more elements within a single hash bucket.
Optionally or alternatively, during the migration, new web page data is written to the second hash table.
Optionally or alternatively, the method further includes: reading web page data from the second hash table; and, when the relevant web page data is not found in the second hash table, reading the web page data from the first hash table.
Optionally or alternatively, migrating the web page data in the first hash table to the second hash table over multiple migration passes further includes: writing web page data from the first hash table to a file cache; and, when the amount of web page data written to the file cache exceeds a cache threshold, writing the web page data in the file cache to the second hash table.
Optionally or alternatively, the method further includes: while web page data in the file cache is being written to the second hash table, reading the web page data file from the file cache if the length of the web page data file in the second hash table is greater than the length of the corresponding web page data file in the file cache.
Optionally or alternatively, the web page data is a web page summary.
Optionally or alternatively, the web page storage system is the storage system of a web page search system.
According to a second aspect of the invention, there is provided a device for updating web page storage, including: means for detecting the collision rate of a first hash table of a web page storage system, wherein the first hash table stores web page data; means for creating a second hash table when the collision rate is greater than an update threshold, wherein the capacity of the second hash table is greater than the capacity of the first hash table; and means for migrating the web page data in the first hash table to the second hash table over multiple migration passes, wherein each migration pass moves a portion of the web page data in the first hash table to the second hash table.
According to a third aspect of the invention, there is provided a web page storage system including the above device for updating web page storage.
According to a fourth aspect of the invention, there is provided a web page storage system including a memory and a processor, wherein the memory contains machine-executable instructions which, when the web page storage system is running, control the processor to perform the processing in any one of the methods described above.
According to a fifth aspect of the invention, there is provided a web page search system including any of the web page storage systems described above, which stores web page data for retrieval.
According to one embodiment of the invention, the impact of web page data migration on the query performance of the hash table can be reduced to some extent.
Further features and advantages of the invention will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 shows a schematic flowchart of a method for updating web page storage according to an embodiment of the invention.
Fig. 2 shows a schematic block diagram of a web page storage system according to an embodiment of the invention.
Fig. 3 shows a schematic block diagram of a web page storage system according to another embodiment of the invention.
Fig. 4 shows a schematic block diagram of a web page search system according to an embodiment of the invention.
Detailed description
Various exemplary embodiments of the invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the invention.
The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the invention, its application, or its uses.
Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be considered part of the specification.
In all examples shown and discussed here, any specific value should be interpreted as merely exemplary and not as a limitation. Other examples of the exemplary embodiments may therefore have different values.
It should be noted that similar reference numerals and letters denote similar items in the following drawings, so once an item is defined in one drawing, it need not be discussed further in subsequent drawings.
Below, embodiments and examples of the invention are described with reference to the drawings.
<Method>
Fig. 1 shows a schematic flowchart of a method for updating web page storage according to an embodiment of the invention.
As shown in Fig. 1, in step S1100, the collision rate of the first hash table of the web page storage system is detected. The first hash table stores web page data.
The web page storage system is the storage system of a web page search system. The web page storage system has a hash table in which a large amount of web page data is stored. For example, the web page data can include web page summaries. In this case, the web page data can be stored as key-value pairs, where the key of each pair is the normalized web page address and the value is the web page content or the web page summary.
Storing the data in a hash table allows the corresponding web page data to be found quickly.
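The key-value layout described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the patent does not say how addresses are normalized, so `normalize_url` (lower-casing the scheme and host, dropping the fragment, and defaulting an empty path to `/`) is an assumption, and a Python `dict` stands in for the hash table.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Hypothetical normalization rule; the patent does not define one."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Lower-case scheme and host, keep the query, drop the fragment.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


# A dict stands in for the hash table: normalized address -> web page summary.
store = {}


def put_page(url, summary):
    store[normalize_url(url)] = summary


def get_page(url):
    return store.get(normalize_url(url))


put_page("HTTP://Example.com#top", "An example summary")
```

Because both writes and reads go through the same normalization, differently spelled forms of one address resolve to a single entry.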
In step S1200, when the collision rate is greater than the update threshold, a second hash table is created. For example, the capacity of the second hash table is greater than that of the first hash table.
The update threshold corresponds to the collision rate of the hash table referred to in step S1100. The update threshold is set differently depending on how the collision rate of the hash table is defined.
In one example, the collision rate of the hash table is the ratio of the number of hash buckets occupied by the web page data currently held in the hash table to the total number of hash buckets in the hash table, and the update threshold is a threshold on that ratio. For example, if the hash table has 100 hash buckets in total and the web page data currently held in it occupies 50 hash buckets, the collision rate of the hash table is the ratio of 50 to 100, that is, 0.5. For example, the update threshold is 0.5. When the collision rate of the hash table reaches 0.5, creation of the second hash table begins.
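The ratio-based definition above can be sketched in a few lines. This is an illustrative sketch under assumptions: the table is modeled as a list of buckets (each bucket a list of entries), and the function names are hypothetical. The worked example triggers the update when the rate reaches the threshold, so the check below uses `>=`; a stricter reading of "greater than the update threshold" would use `>` instead.

```python
def collision_rate(buckets):
    """Fraction of hash buckets that currently hold at least one entry."""
    occupied = sum(1 for bucket in buckets if bucket)
    return occupied / len(buckets)


def needs_second_table(buckets, update_threshold=0.5):
    # ">=" follows the worked example ("reaches 0.5"); use ">" for the
    # stricter "greater than" wording used elsewhere in the text.
    return collision_rate(buckets) >= update_threshold


# 100 buckets, 50 of them occupied, as in the example above.
buckets = [[("url", "summary")] if i < 50 else [] for i in range(100)]
```

With these values `collision_rate(buckets)` is 0.5, so `needs_second_table(buckets)` reports that the second hash table should be created.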
In another example, when a hash table is created, a threshold on the number of hash buckets can be preset. When the number of hash buckets that actually store data reaches this threshold, the update is performed. In this case, the number of occupied hash buckets can be used directly as the collision rate. For example, the collision rate is the number of hash buckets occupied by the web page data currently stored in the first hash table, and the update threshold is a threshold on the number of hash buckets. For example, the preset update threshold is 60. When the web page data currently stored in the hash table occupies 60 hash buckets, creation of the second hash table begins.
For example, the collision rate of the hash table can also be the ratio of the capacity of the hash buckets occupied by the web page data currently held in the hash table to the total capacity of all hash buckets in the hash table. For example, if the occupied hash buckets have a capacity of 7000 and all hash buckets in the hash table have a total capacity of 10000, the collision rate of the hash table is the ratio of 7000 to 10000, that is, 0.7. For example, the update threshold is set to 0.7. When the collision rate is 0.7, creation of the second hash table begins.
The second hash table can hold more web page data than the first hash table. The second hash table can be created by increasing the number of hash buckets, or by increasing the amount of web page data that each hash bucket can hold. For example, when the second hash table is created by increasing the number of hash buckets, let the expansion coefficient of the hash table be q (q > 1) and the number of hash buckets in the first hash table be N. When the collision rate of the first hash table is greater than the update threshold, the second hash table is created with N' = N*q hash buckets.
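The sizing rule N' = N*q can be sketched as below. The function name is hypothetical, and rounding up with `math.ceil` is an assumption: the text does not say how a non-integer product should be handled.

```python
import math


def create_second_table(first_bucket_count, q):
    """Create an empty table with N' = N * q buckets (rounded up by assumption)."""
    if q <= 1:
        raise ValueError("expansion coefficient q must be greater than 1")
    new_count = math.ceil(first_bucket_count * q)
    return [[] for _ in range(new_count)]


# Illustrative values: N = 100 buckets and q = 1.5 give N' = 150 buckets.
second = create_second_table(100, 1.5)
```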
In step S1300, the web page data in the first hash table is migrated to the second hash table over multiple migration passes, where each migration pass moves a portion of the web page data in the first hash table to the second hash table.
The trigger for migrating web page data from the first hash table to the second hash table is the receipt of a web page query request. If the collision rate of the first hash table is detected to be greater than the update threshold, then each time a web page query request is received, part of the web page data in the first hash table is migrated to the second hash table. This keeps the impact of migration on query operations as small as possible. In one example, the migration operation is performed after the query operation has been executed, to further reduce its impact on the query.
In one embodiment, a migration cursor i is set in advance in the first hash table. The migration cursor i indicates the web page data elements currently to be migrated. When a query is received, the web page data elements indicated by the migration cursor i are migrated to the second hash table. The elements currently to be migrated can be the elements corresponding to one or more hash buckets, or one or more elements within a single hash bucket.
Once the web page data elements indicated by the migration cursor i have been written to the second hash table, the cursor can be advanced to indicate the next batch of web page data elements to be migrated in the first hash table. When a query request is received again, the elements now indicated by the migration cursor i are migrated to the second hash table. By repeating these steps, the web page data stored in the first hash table is migrated to the second hash table in batches until all of it has been moved, which completes the migration of the web page data stored in the first hash table.
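The cursor-driven migration described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: the class and method names are hypothetical, each table is a list of buckets, one whole bucket is migrated per query (the text also allows several buckets or part of a bucket), and migrated entries are removed from the first table immediately.

```python
class MigratingTable:
    def __init__(self, old_bucket_count, new_bucket_count):
        self.first = [[] for _ in range(old_bucket_count)]
        self.second = [[] for _ in range(new_bucket_count)]
        self.cursor = 0  # migration cursor i: next bucket of the first table to move

    @staticmethod
    def _slot(table, key):
        return hash(key) % len(table)

    def put_first(self, key, value):
        """Load an entry into the first table (used here only to set up the example)."""
        self.first[self._slot(self.first, key)].append((key, value))

    def _find(self, table, key):
        for k, v in table[self._slot(table, key)]:
            if k == key:
                return True, v
        return False, None

    def _migrate_one_bucket(self):
        if self.migration_done():
            return
        bucket = self.first[self.cursor]
        for key, value in bucket:
            self.second[self._slot(self.second, key)].append((key, value))
        bucket.clear()    # drop the migrated entries from the first table
        self.cursor += 1  # point at the next batch to migrate

    def migration_done(self):
        return self.cursor >= len(self.first)

    def query(self, key):
        # Answer the query first, then do one small migration step.
        found, value = self._find(self.second, key)
        if not found:
            found, value = self._find(self.first, key)
        self._migrate_one_bucket()
        return value
```

Each query pays for at most one bucket of copying, so the cost of the migration is spread across many requests instead of being incurred in a single bulk copy.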
For example, after the part of the web page data indicated by the migration cursor i has been migrated to the second hash table, the web page data corresponding to that migration pass can be deleted from the first hash table. Alternatively, after all web page data stored in the first hash table has been migrated to the second hash table, the first hash table itself, or all the web page data stored in it, is deleted.
In one embodiment, while the web page data stored in the first hash table is being migrated, the web page data requested by a query is read from the second hash table first. If the requested web page data is not found in the second hash table, it is read from the first hash table.
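The read order during migration can be sketched as below, together with the rule from the summary that new web page data is written to the second hash table while migration is in progress. Python dicts stand in for the two hash tables, and the function names are hypothetical.

```python
def read_page(key, first_table, second_table):
    """Prefer the second table; fall back to the first if the key is absent."""
    if key in second_table:
        return second_table[key]
    return first_table.get(key)


def write_page(key, value, second_table):
    """During migration, new web page data goes to the second table only."""
    second_table[key] = value


first = {"a": "old a", "b": "old b"}
second = {"a": "new a"}
write_page("c", "new c", second)
```

Checking the second table first means that an entry rewritten during migration shadows the stale copy that may still sit in the first table.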
In each migration pass, when web page data in the first hash table is migrated to the second hash table, the web page data to be migrated can first be written to a file cache. When the amount of web page data written to the file cache exceeds a cache threshold, the web page data in the file cache is written to the second hash table. Web page data that has been written to the file cache and is awaiting migration can still be queried.
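The buffering step above can be sketched as follows. This is an illustrative sketch with several assumptions: an in-memory dict stands in for the file cache, the "amount of web page data" is counted in entries rather than bytes, a dict stands in for the second hash table, and the class name is hypothetical.

```python
class MigrationCache:
    def __init__(self, second_table, cache_threshold):
        self.second_table = second_table
        self.cache_threshold = cache_threshold
        self.pending = {}  # stand-in for the file cache

    def add(self, key, value):
        """Buffer one migrated entry; flush once the threshold is exceeded."""
        self.pending[key] = value
        if len(self.pending) > self.cache_threshold:
            self.flush()

    def flush(self):
        """Write all buffered entries to the second table in one batch."""
        self.second_table.update(self.pending)
        self.pending.clear()

    def get(self, key):
        """Buffered entries remain queryable before they are flushed."""
        return self.pending.get(key)


second_table = {}
cache = MigrationCache(second_table, cache_threshold=2)
cache.add("a", "summary a")
cache.add("b", "summary b")
```

After these two calls nothing has reached `second_table` yet, but `cache.get("a")` still answers; a third `add` exceeds the threshold and writes all three entries in one batch.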
While web page data in the file cache is being written to the second hash table, when a query request is received, the length of the web page data file in the second hash table can be used to decide whether to read the file from the second hash table or from the file cache. Typically, when a web page data file is being written to the second hash table, the file length recorded in the second hash table is set large enough to ensure that the file fits. When the write completes, the file length is set to the actual length. With this in mind, when a query is executed, if the length of the web page data file in the second hash table is greater than the length of the corresponding web page data file in the file cache, the web page data file in the file cache is read.
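The length comparison above amounts to a small decision rule, sketched below. The function name and the returned labels are hypothetical; the rule itself follows the text: a recorded length in the second hash table that is larger than the cached file's length is taken to mean the write has not finished, so the cached copy is read.

```python
def choose_source(length_in_second_table, length_in_cache):
    """Pick where to read a web page data file from during a batch write."""
    if length_in_second_table > length_in_cache:
        # The oversized placeholder length is still in place: write in progress.
        return "file_cache"
    # The length has been set to the actual length: the write has completed.
    return "second_table"
```

For example, `choose_source(4096, 1200)` selects the file cache, while `choose_source(1200, 1200)` selects the second hash table.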
According to one embodiment, the web page data of the web page storage system is migrated progressively, which avoids copying a large amount of data at once and thus reduces the impact of any single migration operation on query performance.
In addition, in one embodiment, the capacity of the hash table can be expanded dynamically in this way.
In addition, in one embodiment, read and write operations can be separated to some extent, which improves system performance.
<Equipment>
Those skilled in the art will appreciate that, in the field of electronic technology, the above method can be embodied in a product by software, by hardware, or by a combination of software and hardware. Based on the method disclosed above, those skilled in the art can readily produce a device for updating web page storage. The device can include means for implementing each operation of the method for updating web page storage described above. For example, the device can include: means for detecting the collision rate of a first hash table of a web page storage system, wherein the first hash table stores web page data; means for creating a second hash table when the collision rate is greater than an update threshold, wherein the capacity of the second hash table is greater than that of the first hash table; and means for migrating the web page data in the first hash table to the second hash table over multiple migration passes, wherein each migration pass moves a portion of the web page data in the first hash table to the second hash table.
<Web storage system>
The device for updating web page storage described above can be part of a web page storage system. In this case, the web page storage system is used to update web page storage by migrating the web page data stored in the first hash table to the second hash table over multiple passes. Fig. 2 shows a schematic block diagram of a web page storage system according to an embodiment of the invention. Referring to Fig. 2, the web page storage system 2000 includes the device 2010 for updating web page storage described above.
Fig. 3 shows a schematic block diagram of a web page storage system according to another embodiment of the invention. Referring to Fig. 3, the web page storage system 3000 can include a processor 3010, a memory 3020, an interface device 3030, a communication device 3040, a display device 3050, an input device 3060, a loudspeaker 3070, a microphone 3080, and so on.
The processor 3010 can be, for example, a central processing unit (CPU) or a microprocessor (MCU). The memory 3020 includes, for example, ROM (read-only memory), RAM (random access memory), and non-volatile memory such as a hard disk. The interface device 3030 includes, for example, a USB interface and an earphone interface.
The communication device 3040 can, for example, perform wired or wireless communication.
The display device 3050 is, for example, a liquid crystal display or a touch display. The input device 3060 can include, for example, a touch screen or a keyboard. A user can input and output voice information through the loudspeaker 3070 and the microphone 3080.
The web page storage system shown in Fig. 3 is merely illustrative and is in no way intended to limit the invention, its application, or its uses.
In this embodiment, the memory 3020 stores instructions that control the processor 3010 to perform the method for updating web page storage shown in Fig. 1. Those skilled in the art will appreciate that, although multiple devices are shown in Fig. 3, the invention may involve only some of them, for example the processor 3010 and the memory 3020. A skilled person can design the instructions according to the solution disclosed here. How instructions control a processor to operate is well known in the art and is therefore not described in detail.
<Web page search system>
The web page storage system described above is the storage system of a web page search system. Fig. 4 shows a schematic block diagram of a web page search system according to an embodiment of the invention. Referring to Fig. 4, the web page search system 4000 includes the web page storage system 4010 described above, which stores web page data for retrieval.
<Example>
The following is an example according to a specific embodiment.
According to this example, the web page data stored in the web page storage system is updated while affecting query performance as little as possible. For example, the web page storage system is the storage system of a web page search system.
In one example, the storage capacity for web page data is expanded, or the web page storage is updated, progressively. For example, two hash tables are maintained during the update: the first hash table (the current hash table) and the second hash table (the updated hash table).
In the prior art, the data in the first hash table is copied to the second hash table all at once during an update. This has a large impact on read and write operations on the first hash table.
In one example, the progressive approach avoids copying a large amount of data at once and thus reduces the impact on system performance.
In one example, data is written to the second hash table in batches through a file cache. This avoids random writes to disk.
In addition, according to one embodiment, web page data can conveniently be divided into segments by data volume.
In addition, because only one data segment is being written at any time while the other segments only serve reads, most read requests can be separated from write requests, which can improve the performance of reading web page summaries.
First, the collision rate of the first hash table of the web page storage system is detected.
For example, the collision rate of the hash table is the ratio of the number of hash buckets occupied by the web page data currently held in the hash table to the total number of hash buckets in the hash table. The total number of hash buckets N in the hash table is 100. The preset update threshold is 0.6. When the web page data currently held in the hash table is detected to occupy 61 hash buckets, the collision rate of the first hash table is the ratio of 61 to 100, that is, 0.61, which is greater than the preset update threshold.
Then, the second hash table is created.
For example, the second hash table is created by increasing the number of hash buckets. The expansion coefficient of the hash table is q (q > 1). When the collision rate of the first hash table is greater than the update threshold, the second hash table is created with N' = N*q hash buckets. For example, assuming q = 1.5, the number of hash buckets N' of the second hash table is 150.
Then, part of the web page data in the first hash table is written to the file cache.
For example, the web page data in the first hash table is migrated to the second hash table over multiple migration passes. When a query request is received, a portion of the web page data in the first hash table is written to the file cache.
Here, because the migration operation is triggered by query requests, the priority of query requests can to some extent be raised relative to the migration operation, which reduces the impact of migration on query performance.
For example, a migration cursor i can be set in advance in the first hash table. The migration cursor i indicates the web page data elements currently to be migrated. When a query request is received, the web page data elements indicated by the migration cursor i are written to the file cache. The elements currently to be migrated can be the elements in one or more hash buckets.
After the web page data elements indicated by the migration cursor i have been written to the file cache, the cursor is advanced to indicate the next batch of web page data elements to be migrated in the first hash table. When a query request is received again, the elements indicated by the migration cursor i are written to the file cache.
Afterwards, the web page data in the file cache is written to the second hash table.
When the amount of web page data written to the file cache exceeds the cache threshold, the web page data in the file cache is written to the second hash table.
By repeating the above steps, the web page data stored in the first hash table is migrated to the second hash table in batches until all of it has been written to the second hash table, which completes the migration of the web page data stored in the first hash table.
After the migration is complete, the first hash table can be deleted. In addition, the migration cursor i can be reset in the second hash table so that, when needed, the web page data in the second hash table can in turn be migrated to a new hash table.
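The whole example can be tied together in one short sketch. It is a simplified illustration under stated assumptions, not the patent's implementation: integer keys stand in for hashes of normalized web page addresses so that bucket placement is predictable, an in-memory list stands in for the file cache, the cache threshold is counted in entries, one bucket is moved per query, keys are assumed unique, and all names (`PageStore`, `CACHE_THRESHOLD`, and so on) are hypothetical. The values N = 100, update threshold 0.6, and q = 1.5 come from the example above.

```python
import math

UPDATE_THRESHOLD = 0.6  # update threshold from the example above
EXPANSION_Q = 1.5       # expansion coefficient q from the example above
CACHE_THRESHOLD = 8     # hypothetical: buffered entries before a batch write


def slot(table, key):
    # Integer keys stand in for hashes of normalized web page addresses.
    return key % len(table)


def collision_rate(table):
    return sum(1 for bucket in table if bucket) / len(table)


class PageStore:
    def __init__(self, bucket_count):
        self.first = [[] for _ in range(bucket_count)]
        self.second = None  # exists only while a migration is in progress
        self.cursor = 0     # migration cursor i
        self.cache = []     # stand-in for the file cache

    def insert(self, key, value):
        # During migration, new data is written to the second table.
        target = self.first if self.second is None else self.second
        target[slot(target, key)].append((key, value))
        if self.second is None and collision_rate(self.first) > UPDATE_THRESHOLD:
            new_count = math.ceil(len(self.first) * EXPANSION_Q)
            self.second = [[] for _ in range(new_count)]

    def _flush_cache(self):
        for key, value in self.cache:
            self.second[slot(self.second, key)].append((key, value))
        self.cache.clear()

    def _migrate_step(self):
        if self.second is None:
            return
        bucket = self.first[self.cursor]
        self.cache.extend(bucket)  # move one bucket into the cache
        bucket.clear()
        self.cursor += 1
        done = self.cursor == len(self.first)
        if len(self.cache) > CACHE_THRESHOLD or done:
            self._flush_cache()    # batch write to the second table
        if done:
            # Drop the old table and reset the cursor on the new one.
            self.first, self.second, self.cursor = self.second, None, 0

    def query(self, key):
        sources = [self.cache, self.first[slot(self.first, key)]]
        if self.second is not None:
            sources.insert(0, self.second[slot(self.second, key)])
        value = next((v for entries in sources for k, v in entries if k == key), None)
        self._migrate_step()  # migrate only after the lookup has been answered
        return value
```

In this sketch, inserting keys 0 through 60 into a 100-bucket store occupies 61 buckets, which gives a collision rate of 0.61 and creates a 150-bucket second table; one hundred subsequent queries then complete the migration, after which the second table takes over as the current table.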
The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to carry out aspects of the present invention.
The computer-readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used here, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (for example, light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
The computer-readable program instructions described here can be downloaded from a computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device via a network, for example the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
Computer program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In scenarios involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, for example programmable logic circuitry, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized by using state information of the computer-readable program instructions, and the electronic circuitry can execute the computer-readable program instructions to carry out aspects of the present invention.
Aspects of the present invention are described herein with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable data processing apparatus, and/or other devices to operate in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, so that a series of operational steps is performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the accompanying drawings show the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of instructions, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or acts, or by a combination of special-purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, implementation by software, and implementation by a combination of software and hardware are all equivalent.
Various embodiments of the present invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the present invention is defined by the appended claims.
Claims (16)
1. A method for updating web page storage, comprising:
detecting a collision rate of a first hash table of a web page storage system, wherein the first hash table stores web page data;
creating a second hash table in the case where the collision rate is greater than an update threshold, wherein the capacity of the second hash table is greater than that of the first hash table; and
migrating the web page data in the first hash table into the second hash table in multiple migration processes, wherein, in each migration process, a portion of the web page data in the first hash table is migrated into the second hash table.
2. The method according to claim 1, wherein the collision rate is the ratio of the number of hash buckets occupied by the web page data currently actually held in the hash table to the total number of hash buckets in the hash table, and the update threshold is a threshold on that ratio.
3. The method according to claim 1, wherein the collision rate is the number of hash buckets occupied by the web page data currently actually stored in the first hash table, and the update threshold is a threshold on the number of hash buckets.
4. The method according to claim 1, wherein migrating the web page content in the first hash table into the second hash table in multiple migration processes further comprises:
migrating web page data in the first hash table into the second hash table when a query is received.
5. The method according to claim 4, wherein migrating web page data in the first hash table into the second hash table when a query is received further comprises:
setting a migration cursor i in the first hash table, wherein the migration cursor i indicates the web page data element currently to be migrated; and
migrating the web page data element indicated by the migration cursor i to the second hash table when a query is received.
6. The method according to claim 5, wherein the web page data element currently to be migrated comprises the elements corresponding to one or more hash buckets, or one or more elements within one hash bucket.
7. The method according to claim 1, wherein, during the migration, new web page data is written into the second hash table.
8. The method according to claim 1, further comprising:
reading web page data from the second hash table; and
reading the web page data from the first hash table in the case where the relevant web page data is not found in the second hash table.
9. The method according to claim 1, wherein migrating the web page content in the first hash table into the second hash table in multiple migration processes further comprises:
writing web page data from the first hash table into a file cache; and
writing the web page data in the file cache into the second hash table in the case where the amount of web page data written to the file cache is greater than a cache threshold.
10. The method according to claim 9, further comprising:
when the web page data in the file cache is written to the second hash table, reading the web page data file in the file cache in the case where the length of a web page data file in the second hash table is greater than the length of the corresponding web page data file in the file cache.
11. The method according to claim 1, wherein the web page data comprises web page summaries.
12. The method according to claim 1, wherein the web page storage system is the storage system of a web page search system.
13. A device for updating web page storage, comprising:
means for detecting a collision rate of a first hash table of a web page storage system, wherein the first hash table stores web page data;
means for creating a second hash table in the case where the collision rate is greater than an update threshold; and
means for migrating the web page data in the first hash table into the second hash table in multiple migration processes, wherein, in each migration process, a portion of the web page data in the first hash table is migrated into the second hash table.
14. A web page storage system, comprising the device for updating web page storage according to claim 13.
15. A web page storage system, comprising a memory and a processor, wherein the memory includes machine-executable instructions which, when the web page storage system operates, control the processor to perform the processing in the method according to any one of claims 1 to 12.
16. A web page search system, comprising the web page storage system according to claim 14 or 15, configured to store web page data for retrieval.
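The incremental migration set out in claims 1 to 8 can be sketched in Python as follows. This is a minimal, illustrative sketch under stated assumptions, not the claimed implementation: the class name `MigratingStore`, its method names, the list-of-buckets layout, the doubling of capacity, and the default threshold are all hypothetical. The sketch uses the occupied-bucket ratio of claim 2 as the collision rate, moves one hash bucket per query (one of the options in claim 6), writes new data to the second table while migration is in progress (claim 7), and reads the second table before falling back to the first (claim 8). Stored values are assumed never to be `None`.

```python
class MigratingStore:
    """Hypothetical bucketed store that grows by incremental migration."""

    def __init__(self, num_buckets=4, update_threshold=0.75):
        self.first = [[] for _ in range(num_buckets)]  # first hash table
        self.second = None        # second hash table, created on demand
        self.cursor = 0           # migration cursor i: next bucket to move
        self.update_threshold = update_threshold

    @staticmethod
    def _find(table, key):
        # Return the stored value, or None if the key is absent.
        for k, v in table[hash(key) % len(table)]:
            if k == key:
                return v
        return None

    @staticmethod
    def _set(table, key, value):
        # Insert the pair, replacing any existing entry for the same key.
        bucket = table[hash(key) % len(table)]
        for idx, (k, _) in enumerate(bucket):
            if k == key:
                bucket[idx] = (key, value)
                return
        bucket.append((key, value))

    def collision_rate(self):
        # Claim 2: occupied buckets divided by all buckets of the first table.
        return sum(1 for b in self.first if b) / len(self.first)

    def put(self, key, value):
        if self.second is None and self.collision_rate() > self.update_threshold:
            # Assumption: double the capacity; claim 1 only requires "larger".
            self.second = [[] for _ in range(2 * len(self.first))]
            self.cursor = 0
        # Claim 7: while migrating, new data is written to the second table.
        target = self.first if self.second is None else self.second
        self._set(target, key, value)

    def migrate_step(self):
        # One migration process: move the bucket indicated by the cursor.
        if self.second is None:
            return
        for key, value in self.first[self.cursor]:
            # Do not overwrite a newer value already in the second table.
            if self._find(self.second, key) is None:
                self._set(self.second, key, value)
        self.first[self.cursor] = []
        self.cursor += 1
        if self.cursor == len(self.first):
            # Migration finished: the second table replaces the first.
            self.first, self.second, self.cursor = self.second, None, 0

    def get(self, key):
        self.migrate_step()  # claims 4 and 5: migrate when a query arrives
        if self.second is not None:
            value = self._find(self.second, key)
            if value is not None:
                return value
        # Claim 8: fall back to the first table.
        return self._find(self.first, key)
```

With four buckets in the first table, each query moves one bucket, so a full migration is spread over four queries and no single query bears the cost of rehashing the whole table.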
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710065766.1A CN106844706A (en) | 2017-02-06 | 2017-02-06 | Update method, equipment, web storage system and the search system of web storage |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710065766.1A CN106844706A (en) | 2017-02-06 | 2017-02-06 | Update method, equipment, web storage system and the search system of web storage |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106844706A (en) | 2017-06-13 |
Family
ID=59121903
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710065766.1A Pending CN106844706A (en) | 2017-02-06 | 2017-02-06 | Update method, equipment, web storage system and the search system of web storage |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106844706A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109828966A (en) * | 2019-01-17 | 2019-05-31 | 平安科技(深圳)有限公司 | Gradual heavy hash method, device, computer equipment and storage medium |
CN111143744A (en) * | 2019-12-26 | 2020-05-12 | 杭州安恒信息技术股份有限公司 | Method, device and equipment for detecting web assets and readable storage medium |
CN113407462A (en) * | 2021-06-16 | 2021-09-17 | 新华三信息安全技术有限公司 | Data processing method and device, electronic equipment and medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103810244A (en) * | 2013-12-09 | 2014-05-21 | 北京理工大学 | Distributed data storage system expansion method based on data distribution |
CN104504076A (en) * | 2014-12-22 | 2015-04-08 | 西安电子科技大学 | Method for implementing distributed caching with high concurrency and high space utilization rate |
CN104954444A (en) * | 2015-05-27 | 2015-09-30 | 华为技术有限公司 | Cached data migration method and device |
-
2017
- 2017-02-06 CN CN201710065766.1A patent/CN106844706A/en active Pending
Non-Patent Citations (1)
Title |
---|
LUOTUO44: "memcached源码分析——哈希表基本操作以及扩容过程" [memcached source code analysis: basic hash table operations and the expansion process], 《HTTPS://BLOG.CSDN.NET/LUOTUO44/ARTICLE/DETAILS/42773231》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109828966A (en) * | 2019-01-17 | 2019-05-31 | 平安科技(深圳)有限公司 | Gradual heavy hash method, device, computer equipment and storage medium |
CN111143744A (en) * | 2019-12-26 | 2020-05-12 | 杭州安恒信息技术股份有限公司 | Method, device and equipment for detecting web assets and readable storage medium |
CN111143744B (en) * | 2019-12-26 | 2023-10-13 | 杭州安恒信息技术股份有限公司 | Method, device and equipment for detecting web asset and readable storage medium |
CN113407462A (en) * | 2021-06-16 | 2021-09-17 | 新华三信息安全技术有限公司 | Data processing method and device, electronic equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8380680B2 (en) | | Piecemeal list prefetch |
US10133770B2 (en) | | Copying garbage collector for B+ trees under multi-version concurrency control |
CN106886375A (en) | | The method and apparatus of data storage |
CN106844706A (en) | | Update method, equipment, web storage system and the search system of web storage |
KR101773781B1 (en) | | Method and apparatus for user oriented data visualzation based on the web |
CN108228649A (en) | | For the method and apparatus of data access |
CN111198868A (en) | | Intelligent sub-database real-time data migration method and device |
CN102591855A (en) | | Data identification method and data identification system |
US10970262B2 (en) | | Multiple versions of triggers in a database system |
US20230012642A1 (en) | | Method and device for snapshotting metadata, and storage medium |
US10552399B2 (en) | | Predicting index fragmentation caused by database statements |
US20150007118A1 (en) | | Software development using gestures |
CN107025247A (en) | | Method, equipment, browser and the electronic equipment handled web data |
CN112654995A (en) | | Tracking content attribution in online collaborative electronic documents |
US10114579B2 (en) | | Data migration tool with intermediate incremental copies |
US11093389B2 (en) | | Method, apparatus, and computer program product for managing storage system |
AU2018214032A1 (en) | | Systems and methods for maintaining group membership records |
CN111427511B (en) | | Data storage method and device |
US10535011B2 (en) | | Predicting capacity based upon database elements |
CN107122401A (en) | | To the method for data database storing, equipment, middleware equipment and server |
US11385967B2 (en) | | Method for managing backup data by having space recycling operations on executed backup data blocks |
CN107665124A (en) | | Modularization JavaScript file processing method, equipment and server |
CN105159756A (en) | | Information processing method and information processing equipment |
CN107977245A (en) | | A kind of application terminal exchange method and application terminal |
CN109376148B (en) | | Data processing method and device for slow change dimension table and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170613 |