CN106844706A - Method and device for updating web page storage, web page storage system, and web page search system

Publication number
CN106844706A
Authority
CN
China
Prior art keywords
hash table
web
web data
hash
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710065766.1A
Other languages
Chinese (zh)
Inventor
蔡迥航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Shenma Search Technology Co Ltd
Original Assignee
Guangdong Shenma Search Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Shenma Search Technology Co Ltd
Priority to CN201710065766.1A
Publication of CN106844706A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/986: Document structures and storage, e.g. HTML extensions

Abstract

The invention discloses a method and a device for updating web page storage, a web page storage system, and a web page search system. The method for updating web page storage includes: detecting the collision rate of a first hash table of a web page storage system, where the first hash table stores web page data; creating a second hash table when the collision rate exceeds an update threshold; and migrating the web page data in the first hash table to the second hash table over multiple migration passes, where each pass migrates a part of the web page data in the first hash table to the second hash table. According to one embodiment, the impact of the migration operation on query performance can be reduced.

Description

Method and device for updating web page storage, web page storage system, and web page search system
Technical field
The present invention relates to the technical field of web page storage and web page search, and more particularly to a method for updating web page storage, a device for updating web page storage, a web page storage system, and a web page search system.
Background art
In a web page search system for Internet applications, web page data usually has to be stored, in the form of web page summaries, in the web page storage system of the search system. Because the number of web pages on the Internet is enormous, the web page data is usually stored in the web page storage system as key-value pairs, where the key of each pair is a normalized web page address and the value is the web page content or the web page summary content.
In existing web page search systems, because the number of web pages is enormous and the search system is not sensitive to updates of web page content, the web page data in the web page storage system is usually updated daily or weekly, or only part of the web page data is updated in real time.
Generally, there are two ways to update web page data in real time.
In the first way, real-time updating is implemented with an auxiliary storage device (for example, a Redis system). This increases the complexity of the web page storage system.
In the second way, web page data is stored in an open-chaining hash table so that it can be looked up quickly. When a large number of web pages are stored in the hash table, hash collisions sharply degrade query performance. The hash table then has to be expanded.
In the prior art, when a hash table is expanded, a new hash table is created first, and the web page data stored in the original hash table is then copied to the new hash table in a single pass. During this one-off copy, the query performance of the original hash table drops sharply.
It is therefore necessary to provide a new technical solution that improves on at least one of the above technical problems in the prior art.
Summary of the invention
An object of the present invention is to provide a new solution for updating web page storage.
According to a first aspect of the invention, there is provided a method for updating web page storage, including: detecting the collision rate of a first hash table of a web page storage system, where the first hash table stores web page data; creating a second hash table when the collision rate exceeds an update threshold, where the capacity of the second hash table is greater than the capacity of the first hash table; and migrating the web page data in the first hash table to the second hash table over multiple migration passes, where each pass migrates a part of the web page data in the first hash table to the second hash table.
Optionally, the collision rate is the ratio of the number of hash buckets occupied by the web page data currently held in the hash table to the total number of hash buckets in the hash table, and the update threshold is a threshold on that ratio.
Optionally, the collision rate is the number of hash buckets occupied by the web page data currently stored in the first hash table, and the update threshold is a threshold on the number of hash buckets.
Optionally, migrating the web page data in the first hash table to the second hash table over multiple migration passes further includes: migrating web page data in the first hash table to the second hash table when a query is received.
Optionally, migrating web page data in the first hash table to the second hash table when a query is received further includes: setting a migration cursor i in the first hash table, where the migration cursor i indicates the web page data elements currently to be migrated; and, when a query is received, migrating the web page data elements indicated by the migration cursor i to the second hash table.
Optionally, the web page data elements currently to be migrated are the elements of one or more hash buckets, or one or more elements within a hash bucket.
Optionally, during migration, new web page data is written to the second hash table.
Optionally, the method further includes: reading web page data from the second hash table; and, when the relevant web page data is not found in the second hash table, reading the web page data from the first hash table.
Optionally, migrating the web page data in the first hash table to the second hash table over multiple migration passes further includes: writing web page data from the first hash table to a file cache; and, when the amount of web page data written to the file cache exceeds a cache threshold, writing the web page data in the file cache to the second hash table.
Optionally, the method further includes: while the web page data in the file cache is being written to the second hash table, reading a web page data file from the file cache if the length of that file in the second hash table is greater than the length of the corresponding web page data file in the file cache.
Optionally, the web page data is a web page summary.
Optionally, the web page storage system is the storage system of a web page search system.
According to a second aspect of the invention, there is provided a device for updating web page storage, including: means for detecting the collision rate of a first hash table of a web page storage system, where the first hash table stores web page data; means for creating a second hash table when the collision rate exceeds an update threshold, where the capacity of the second hash table is greater than the capacity of the first hash table; and means for migrating the web page data in the first hash table to the second hash table over multiple migration passes, where each pass migrates a part of the web page data in the first hash table to the second hash table.
According to a third aspect of the invention, there is provided a web page storage system that includes the above device for updating web page storage.
According to a fourth aspect of the invention, there is provided a web page storage system including a memory and a processor, where the memory contains machine-executable instructions which, when the web page storage system runs, control the processor to perform the processing in any one of the methods described above.
According to a fifth aspect of the invention, there is provided a web page search system that includes any of the web page storage systems described above, which stores web page data for retrieval.
According to one embodiment of the present invention, the impact of web page data migration on the query performance of the hash table can be reduced to some extent.
Further features and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 shows a schematic flowchart of a method for updating web page storage according to an embodiment of the invention.
Fig. 2 shows a schematic block diagram of a web page storage system according to an embodiment of the invention.
Fig. 3 shows a schematic block diagram of a web page storage system according to another embodiment of the invention.
Fig. 4 shows a schematic block diagram of a web page search system according to an embodiment of the invention.
Detailed description
Various exemplary embodiments of the present invention are now described in detail with reference to the accompanying drawings. Note that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the invention.
The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the invention, its application, or its uses.
Techniques, methods, and devices known to a person of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of the specification.
In all examples shown and discussed here, any specific value should be interpreted as merely exemplary and not as a limitation. Other examples of the exemplary embodiments may therefore have different values.
Note that similar reference numerals and letters denote similar items in the following drawings, so once an item has been defined in one drawing, it need not be discussed further in subsequent drawings.
Embodiments and examples of the invention are described below with reference to the drawings.
<Method>
Fig. 1 shows a schematic flowchart of a method for updating web page storage according to an embodiment of the invention.
As shown in Fig. 1, in step S1100 the collision rate of the first hash table of the web page storage system is detected. The first hash table stores web page data.
The web page storage system is the storage system of a web page search system. It has a hash table in which a large amount of web page data is stored. For example, the web page data can include web page summaries. In that case the web page data can be stored as key-value pairs, where the key of each pair is a normalized web page address and the value is the web page content or web page summary.
Storing the data in a hash table allows the corresponding web page data to be found quickly.
In step S1200, a second hash table is created when the collision rate exceeds the update threshold. For example, the capacity of the second hash table is greater than the capacity of the first hash table.
The update threshold corresponds to the collision rate of the hash table used in step S1100. Depending on how the collision rate of the hash table is defined, the update threshold is set differently.
In one example, the collision rate of the hash table is the ratio of the number of hash buckets occupied by the web page data currently held in the hash table to the total number of hash buckets in the hash table, and the update threshold is a threshold on that ratio. For example, if the hash table has 100 hash buckets in total and the web page data currently held occupies 50 of them, the collision rate is the ratio of 50 to 100, that is, 0.5. Suppose the update threshold is 0.5. When the collision rate of the hash table reaches 0.5, creation of the second hash table begins.
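The ratio-based definition above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the bucket layout (a list of lists of key-value tuples) and the names `collision_rate` and `needs_update` are assumptions. The sketch triggers on reaching the threshold, as in the worked numbers above, whereas the claims word the condition as exceeding it.

```python
def collision_rate(buckets):
    """Ratio of occupied hash buckets to all hash buckets."""
    occupied = sum(1 for bucket in buckets if bucket)
    return occupied / len(buckets)


def needs_update(buckets, update_threshold):
    """True once the collision rate reaches the update threshold.

    Assumption: '>=' follows the worked example (rate 0.5, threshold 0.5);
    use '>' to follow the wording of the claims instead.
    """
    return collision_rate(buckets) >= update_threshold


# 100 buckets, the first 50 of which hold one (key, value) entry each.
buckets = [[("url%d" % i, "summary")] if i < 50 else [] for i in range(100)]
rate = collision_rate(buckets)
```

With 50 of 100 buckets occupied, `rate` is 0.5 and `needs_update(buckets, 0.5)` is true.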
In another example, a threshold on the number of hash buckets can be preset when a hash table is created. When the number of hash buckets that actually store data reaches this threshold, the update is performed. In this case the number of hash buckets can be used directly as the collision rate. For example, the collision rate is the number of hash buckets occupied by the web page data currently stored in the first hash table, and the update threshold is a threshold on the number of hash buckets. For example, the preset update threshold is 60. When the web page data currently stored in the hash table occupies 60 hash buckets, creation of the second hash table begins.
As a further example, the collision rate of the hash table can be the ratio of the capacity of the hash buckets occupied by the web page data currently held in the hash table to the total capacity of all hash buckets in the hash table. For example, if the occupied capacity is 7000 and the total capacity of all hash buckets is 10000, the collision rate is the ratio of 7000 to 10000, that is, 0.7. Suppose the update threshold is set to 0.7. When the collision rate is 0.7, creation of the second hash table begins.
The amount of web page data that the second hash table can hold is greater than the amount that the first hash table can hold. The second hash table can be created by increasing the number of hash buckets, or by increasing the amount of web page data that each hash bucket can hold. For example, when the second hash table is created by increasing the number of hash buckets, let the expansion coefficient of the hash table be q (q > 1) and the number of hash buckets of the first hash table be N. When the collision rate of the first hash table exceeds the update threshold, the second hash table is created with N' = N*q hash buckets.
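The bucket-count expansion N' = N*q can be illustrated with a short sketch. The function name `create_second_table` and the rounding rule (round up when N*q is not an integer) are assumptions; the text does not say how a fractional result is handled.

```python
import math


def create_second_table(first_buckets, q):
    """Create an empty table with N' = N * q buckets, where q > 1."""
    if q <= 1:
        raise ValueError("expansion coefficient q must be greater than 1")
    # Assumption: round up so the new table is never smaller than N * q.
    new_count = math.ceil(len(first_buckets) * q)
    return [[] for _ in range(new_count)]


first = [[] for _ in range(100)]        # N = 100
second = create_second_table(first, 1.5)
```

Here `len(second)` is 150, matching N = 100 and q = 1.5.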
In step S1300, the web page data in the first hash table is migrated to the second hash table over multiple migration passes, where each pass migrates a part of the web page data in the first hash table to the second hash table.
The trigger for migrating web page data from the first hash table to the second hash table is the receipt of a web page query request. If the collision rate of the first hash table is detected to exceed the update threshold, then each time a web page query request is received, part of the web page data in the first hash table is migrated to the second hash table. This keeps the impact of migration on query operations as small as possible. In one example, the migration operation is performed after the query operation has been executed, which further reduces its impact on the query.
In one embodiment, a migration cursor i is set in the first hash table in advance. The migration cursor i indicates the web page data elements currently to be migrated. When a query is received, the web page data elements indicated by the migration cursor i are migrated to the second hash table. The elements currently to be migrated can be the elements of one or more hash buckets, or one or more elements within a single hash bucket.
Once the web page data elements indicated by the migration cursor i have been written to the second hash table, the cursor can be advanced to indicate the next batch of elements to be migrated in the first hash table. When a query request is received again, the elements indicated by the cursor are migrated to the second hash table. Following these steps, the web page data stored in the first hash table is migrated to the second hash table in batches until all of it has been moved, which completes the migration.
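The cursor-driven batches described above might look like the following sketch. It assumes that one migration pass moves whole buckets, that buckets are lists of key-value tuples, and that migrated buckets are emptied immediately; `bucket_index` and `migrate_step` are hypothetical names, and Python's built-in `hash` stands in for whatever hash function the storage system uses.

```python
def bucket_index(key, bucket_count):
    """Map a key to a bucket (built-in hash used as a stand-in)."""
    return hash(key) % bucket_count


def migrate_step(first, second, cursor, batch=1):
    """Move the buckets [cursor, cursor + batch) from first to second.

    Returns the advanced cursor. Intended to be called once per received
    query, after the query itself has been answered.
    """
    end = min(cursor + batch, len(first))
    for i in range(cursor, end):
        for key, value in first[i]:
            # Re-hash, because the second table has a different bucket count.
            second[bucket_index(key, len(second))].append((key, value))
        first[i] = []  # assumption: delete migrated data right away
    return end


first = [[] for _ in range(4)]
for n in range(10):
    key = "url%d" % n
    first[bucket_index(key, 4)].append((key, n))
second = [[] for _ in range(6)]

cursor = migrate_step(first, second, 0)   # first query: bucket 0 moves
steps = 1
while cursor < len(first):                # each later query moves one more
    cursor = migrate_step(first, second, cursor)
    steps += 1
```

Migration is complete when the cursor reaches the number of buckets in the first table; with `batch=1` that takes one query per bucket.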
For example, after the part of the web page data indicated by the migration cursor i has been migrated from the first hash table to the second hash table, the web page data covered by that migration pass can be deleted from the first hash table. Alternatively, after all the web page data stored in the first hash table has been migrated to the second hash table, the first hash table itself, or all the web page data stored in it, is deleted.
In one embodiment, while the web page data stored in the first hash table is being migrated, the web page data for a query request is read from the second hash table first. If the relevant web page data for the query request is not found in the second hash table, it is read from the first hash table.
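The read order during migration can be sketched as below. The helper names `find` and `read` and the list-of-buckets layout are assumptions for illustration only.

```python
def find(buckets, key):
    """Return the value stored for key in a bucket list, or None."""
    for stored_key, value in buckets[hash(key) % len(buckets)]:
        if stored_key == key:
            return value
    return None


def read(key, first, second):
    """Prefer the second (new) table; fall back to the first (old) table."""
    value = find(second, key)
    if value is None:
        value = find(first, key)
    return value


def put(buckets, key, value):
    buckets[hash(key) % len(buckets)].append((key, value))


first = [[] for _ in range(4)]
second = [[] for _ in range(6)]
put(first, "a.example/1", "old summary")        # not yet migrated
put(first, "a.example/2", "stale summary")      # also rewritten during migration
put(second, "a.example/2", "fresh summary")     # new data goes to the second table
```

Because new web page data is written to the second hash table during migration, checking the second table first means a fresher value there takes precedence over an unmigrated copy in the first table.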
In each migration pass, when web page data in the first hash table is migrated to the second hash table, the web page data to be migrated can first be written to a file cache. When the amount of web page data written to the file cache exceeds a cache threshold, the web page data in the file cache is written to the second hash table. The web page data that has been written to the file cache and is awaiting migration can still be queried.
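A rough sketch of the cache-then-flush behaviour follows. It makes several simplifying assumptions that are not in the text: the file cache is modelled as an in-memory dictionary rather than a file on disk, the amount of cached data is measured as a number of entries, and the class name `MigrationCache` is hypothetical.

```python
class MigrationCache:
    """Buffer migrated entries and write them to the second table in bulk."""

    def __init__(self, second, cache_threshold):
        self.second = second
        self.cache_threshold = cache_threshold
        self.pending = {}  # stand-in for the file cache

    def add(self, key, value):
        """Stage one entry taken from the first table."""
        self.pending[key] = value
        # Flush only once the cached amount exceeds the cache threshold.
        if len(self.pending) > self.cache_threshold:
            self.flush()

    def flush(self):
        """Write everything staged so far to the second table."""
        for key, value in self.pending.items():
            self.second[hash(key) % len(self.second)].append((key, value))
        self.pending.clear()

    def get(self, key):
        """Cached entries awaiting migration remain queryable."""
        return self.pending.get(key)


second = [[] for _ in range(8)]
cache = MigrationCache(second, cache_threshold=2)
cache.add("url1", "s1")
cache.add("url2", "s2")
staged = cache.get("url1")                      # still in the cache
in_table_before = sum(len(b) for b in second)   # nothing flushed yet
cache.add("url3", "s3")                         # third entry exceeds the threshold
in_table_after = sum(len(b) for b in second)
```

Batching writes this way is what lets the second table be filled in a few sequential bursts rather than one small write per migrated element.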
While the web page data in the file cache is being written to the second hash table, a received query request can use the length of the web page data file in the second hash table to decide whether to read that file from the second hash table or from the file cache. Typically, when a web page data file is written to the second hash table, the file length in the second hash table is first set large enough to guarantee that the file fits, and is set to the actual length once the write operation finishes. Accordingly, when a query is executed, if the length of the web page data file in the second hash table is greater than the length of the corresponding web page data file in the file cache, which suggests that the write has not yet finished, the web page data file is read from the file cache.
According to one embodiment, the web page data of the web page storage system is migrated progressively, which avoids copying a large amount of data in one pass and so reduces the impact of any single migration operation on query performance.
In addition, in one embodiment, the capacity of the hash table can be expanded dynamically in this way.
In addition, in one embodiment, read and write operations can be separated to some extent, which improves system performance.
<Device>
Those skilled in the art will understand that, in the field of electronics, the above method can be embodied in a product by means of software, hardware, or a combination of software and hardware. Based on the method disclosed above, those skilled in the art can readily produce a device for updating web page storage. The device can include means for carrying out each operation of the method for updating web page storage described above. For example, the device can include: means for detecting the collision rate of a first hash table of a web page storage system, where the first hash table stores web page data; means for creating a second hash table when the collision rate exceeds an update threshold, where the capacity of the second hash table is greater than that of the first hash table; and means for migrating the web page data in the first hash table to the second hash table over multiple migration passes, where each pass migrates a part of the web page data in the first hash table to the second hash table.
<Web page storage system>
The device for updating web page storage described above can be part of a web page storage system. In that case, the web page storage system is used to update web page storage by migrating the web page data stored in the first hash table to the second hash table in multiple passes. Fig. 2 shows a schematic block diagram of a web page storage system according to an embodiment of the invention. Referring to Fig. 2, the web page storage system 2000 includes the device 2010 for updating web page storage described above.
Fig. 3 shows a schematic block diagram of a web page storage system according to another embodiment of the invention. Referring to Fig. 3, the web page storage system 3000 can include a processor 3010, a memory 3020, an interface device 3030, a communication device 3040, a display device 3050, an input device 3060, a loudspeaker 3070, a microphone 3080, and so on.
The processor 3010 can be, for example, a central processing unit (CPU) or a microcontroller unit (MCU). The memory 3020 includes, for example, ROM (read-only memory), RAM (random access memory), and non-volatile memory such as a hard disk. The interface device 3030 includes, for example, a USB interface and a headphone jack.
The communication device 3040 can, for example, perform wired or wireless communication.
The display device 3050 is, for example, a liquid crystal display or a touch display. The input device 3060 can include, for example, a touch screen and a keyboard. A user can input and output voice information through the loudspeaker 3070 and the microphone 3080.
The web page storage system shown in Fig. 3 is merely illustrative and is in no way intended to limit the invention, its application, or its uses.
In this embodiment, the memory 3020 stores instructions that control the processor 3010 to perform the method for updating web page storage shown in Fig. 1. Those skilled in the art will understand that, although several devices are shown in Fig. 3, the invention may involve only some of them, for example the processor 3010 and the memory 3020. A skilled person can design the instructions according to the solution disclosed here. How instructions control a processor to operate is well known in the art and is therefore not described in detail here.
<Web page search system>
The web page storage system described above is the storage system of a web page search system. Fig. 4 shows a schematic block diagram of a web page search system according to an embodiment of the invention. Referring to Fig. 4, the web page search system 4000 includes the web page storage system 4010 described above, which stores web page data for retrieval.
<Example>
The following is an example of a specific embodiment.
In this example, the web page data stored in the web page storage system is updated while affecting query performance as little as possible. For example, the web page storage system is the storage system of a web page search system.
In one example, the storage capacity for web page data is expanded, or the web page storage is updated, progressively. For example, two hash tables are kept during the update: the first hash table (the current hash table) and the second hash table (the updated hash table).
In the prior art, the data in the first hash table is copied to the second hash table in a single pass when updating. This has a large impact on read and write operations on the first hash table.
In one example, the progressive approach avoids copying a large amount of data in one pass and so reduces the impact on system performance.
In one example, data is written to the second hash table in batches through a file cache, which avoids random disk writes.
Optionally, according to one embodiment, web page data can conveniently be divided into segments by data volume.
Optionally, because only one data segment is being written at any time while the other data segments serve reads only, most read requests and write requests can be separated, which can improve the performance of reading web page summaries.
First, the collision rate of the first hash table of the web page storage system is detected.
For example, the collision rate of the hash table is the ratio of the number of hash buckets occupied by the web page data currently held in the hash table to the total number of hash buckets in the hash table. The total number of hash buckets N is 100, and the preset update threshold is 0.6. When the web page data currently held in the hash table of the web page storage system is detected to occupy 61 hash buckets, the collision rate of the first hash table is the ratio of 61 to 100, that is, 0.61, which exceeds the preset update threshold.
Then, the second hash table is created.
For example, the second hash table is created by increasing the number of hash buckets. The expansion coefficient of the hash table is q (q > 1). When the collision rate of the first hash table exceeds the update threshold, the second hash table is created with N' = N*q hash buckets. For example, assuming q = 1.5, the second hash table has N' = 150 hash buckets.
Then, part of the web page data in the first hash table is written to the file cache.
For example, the web page data in the first hash table is migrated to the second hash table over multiple migration passes. When a query request is received, a part of the web page data in the first hash table is written to the file cache.
Because the migration operation is triggered by query requests, the query request can to some extent be given priority over the migration operation, which reduces the impact of migration on query performance.
For example, a migration cursor i can be set in the first hash table in advance. The migration cursor i indicates the web page data elements currently to be migrated. When a query request is received, the web page data elements indicated by the migration cursor i are written to the file cache. The elements currently to be migrated can be the elements in one or more hash buckets.
After the web page data elements indicated by the migration cursor i have been written to the file cache, the cursor is advanced to indicate the next batch of elements to be migrated in the first hash table. When a query request is received again, the elements indicated by the cursor are written to the file cache.
Afterwards, the web page data in the file cache is written to the second hash table.
When the amount of web page data written to the file cache exceeds the cache threshold, the web page data in the file cache is written to the second hash table.
Following these steps, the web page data stored in the first hash table is migrated to the second hash table in batches until all of it has been written to the second hash table, which completes the migration of the web page data stored in the first hash table.
After the migration is complete, the first hash table can be deleted. In addition, the migration cursor i can be reset in the second hash table so that, when needed, the web page data in the second hash table can be migrated to a new hash table.
The present invention can be system, method and/or computer program product.Computer program product can include computer Readable storage medium storing program for executing, containing for making processor realize the computer-readable program instructions of various aspects of the invention.
Computer-readable recording medium can be the tangible of the instruction that holding and storage are used by instruction execution equipment Equipment.Computer-readable recording medium for example can be-- but be not limited to-- storage device electric, magnetic storage apparatus, optical storage Equipment, electromagnetism storage device, semiconductor memory apparatus or above-mentioned any appropriate combination.Computer-readable recording medium More specifically example (non exhaustive list) includes:Portable computer diskette, hard disk, random access memory (RAM), read-only deposit It is reservoir (ROM), erasable programmable read only memory (EPROM or flash memory), static RAM (SRAM), portable Compact disk read-only storage (CD-ROM), digital versatile disc (DVD), memory stick, floppy disk, mechanical coding equipment, for example thereon Be stored with instruction punch card or groove internal projection structure and above-mentioned any appropriate combination.Calculating used herein above Machine readable storage medium storing program for executing is not construed as instantaneous signal in itself, the electromagnetic wave of such as radio wave or other Free propagations, logical Cross electromagnetic wave (for example, the light pulse for passing through fiber optic cables) that waveguide or other transmission mediums propagate or by wire transfer Electric signal.
Computer-readable program instructions as described herein can from computer-readable recording medium download to each calculate/ Processing equipment, or outer computer or outer is downloaded to by network, such as internet, LAN, wide area network and/or wireless network Portion's storage device.Network can include copper transmission cable, Optical Fiber Transmission, be wirelessly transferred, router, fire wall, interchanger, gateway Computer and/or Edge Server.Adapter or network interface in each calculating/processing equipment are received from network to be counted Calculation machine readable program instructions, and the computer-readable program instructions are forwarded, for storing the meter in each calculating/processing equipment In calculation machine readable storage medium storing program for executing.
For perform the present invention operation computer program instructions can be assembly instruction, instruction set architecture (ISA) instruction, Machine instruction, machine-dependent instructions, microcode, firmware instructions, condition setup data or with one or more programming language Source code or object code that any combination is write, programming language of the programming language including object-oriented-such as Smalltalk, C++ etc., and routine procedural programming languages-such as " C " language or similar programming language.Computer Readable program instructions can perform fully on the user computer, partly perform on the user computer, as one solely Vertical software kit is performed, part performs or completely in remote computer on the remote computer on the user computer for part Or performed on server.In the situation for being related to remote computer, remote computer can be by the network-bag of any kind LAN (LAN) or wide area network (WAN)-be connected to subscriber computer are included, or, it may be connected to outer computer (such as profit With ISP come by Internet connection).In certain embodiments, by using computer-readable program instructions Status information carry out personalized customization electronic circuit, such as PLD, field programmable gate array (FPGA) or can Programmed logic array (PLA) (PLA), the electronic circuit can perform computer-readable program instructions, so as to realize each side of the invention Face.
Aspects of the present invention are described herein with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. The computer-readable program instructions may also be stored in a computer-readable storage medium, where they cause a computer, a programmable data processing apparatus, and/or other devices to operate in a particular manner, so that the computer-readable medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of instructions that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation in a combination of software and hardware are all equivalent.
Embodiments of the present invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the present invention is defined by the appended claims.

Claims (16)

1. A method for updating web storage, comprising:
detecting a collision rate of a first hash table of a web storage system, wherein the first hash table stores web data;
creating a second hash table if the collision rate is greater than an update threshold, wherein the capacity of the second hash table is greater than that of the first hash table; and
migrating the web data in the first hash table to the second hash table in multiple migration processes, wherein in each migration process a portion of the web data in the first hash table is migrated to the second hash table.
2. The method according to claim 1, wherein the collision rate is the ratio of the number of hash buckets occupied by the web data currently stored in the hash table to the total number of hash buckets in the hash table, and the update threshold is a threshold on that ratio.
3. The method according to claim 1, wherein the collision rate is the number of hash buckets occupied by the web data currently stored in the first hash table, and the update threshold is a threshold on the number of hash buckets.
4. The method according to claim 1, wherein migrating the web data in the first hash table to the second hash table in multiple migration processes further comprises:
migrating the web data in the first hash table to the second hash table when a query is received.
5. The method according to claim 4, wherein migrating the web data in the first hash table to the second hash table when a query is received further comprises:
setting a migration cursor i in the first hash table, wherein the migration cursor i indicates the web data element currently to be migrated; and
migrating the web data element indicated by the migration cursor i to the second hash table when a query is received.
6. The method according to claim 5, wherein the web data element currently to be migrated comprises the elements corresponding to one or more hash buckets, or one or more elements in one hash bucket.
7. The method according to claim 1, wherein, during migration, new web data is written to the second hash table.
8. The method according to claim 1, further comprising:
reading web data from the second hash table; and
reading the web data from the first hash table if the relevant web data is not found in the second hash table.
9. The method according to claim 1, wherein migrating the web data in the first hash table to the second hash table in multiple migration processes further comprises:
writing web data from the first hash table to a file cache; and
writing the web data in the file cache to the second hash table if the amount of web data written to the file cache is greater than a cache threshold.
10. The method according to claim 9, further comprising:
when the web data in the file cache is written to the second hash table, reading the web data file in the file cache if the length of a web data file in the second hash table is greater than the length of the corresponding web data file in the file cache.
11. The method according to claim 1, wherein the web data comprises a web page summary.
12. The method according to claim 1, wherein the web storage system is a storage system of a web page search system.
13. Equipment for updating web storage, comprising:
means for detecting a collision rate of a first hash table of a web storage system, wherein the first hash table stores web data;
means for creating a second hash table if the collision rate is greater than an update threshold; and
means for migrating the web data in the first hash table to the second hash table in multiple migration processes, wherein in each migration process a portion of the web data in the first hash table is migrated to the second hash table.
14. A web storage system, comprising the equipment for updating web storage according to claim 13.
15. A web storage system, comprising a memory and a processor, wherein the memory contains machine-executable instructions which, when the web storage system runs, control the processor to perform the processing in the method according to any one of claims 1-12.
16. A web page search system, comprising the web storage system according to claim 14 or 15, configured to store web data for retrieval.
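
The method of claims 1, 2, 4, 5, 7 and 8 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name IncrementalHashStore, the parameters num_buckets, update_threshold and growth_factor, the use of a URL as key, and the choice to migrate exactly one hash bucket per query are all assumptions made for illustration. The sketch also assumes stored page data is never None, since None is used to signal "not found".

```python
class IncrementalHashStore:
    """Illustrative two-table store that migrates one bucket per query."""

    def __init__(self, num_buckets=4, update_threshold=0.75, growth_factor=2):
        self.old = [[] for _ in range(num_buckets)]  # first hash table
        self.new = None          # second hash table, created on demand
        self.cursor = 0          # migration cursor i: next bucket of `old` to move
        self.update_threshold = update_threshold     # hypothetical ratio threshold
        self.growth_factor = growth_factor           # hypothetical capacity multiplier

    @staticmethod
    def _find(table, key):
        # Return the stored value, or None when the key is absent.
        for k, v in table[hash(key) % len(table)]:
            if k == key:
                return v
        return None

    @staticmethod
    def _store(table, key, value):
        bucket = table[hash(key) % len(table)]
        for idx, (k, _) in enumerate(bucket):
            if k == key:
                bucket[idx] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def collision_rate(self):
        # Claim 2: occupied buckets divided by all buckets of the first table.
        return sum(1 for b in self.old if b) / len(self.old)

    def _migrate_step(self):
        # Claims 5 and 6: move the bucket under the cursor to the second table.
        if self.new is None:
            return
        if self.cursor < len(self.old):
            for key, value in self.old[self.cursor]:
                # Keep a newer value that was already written to the second table.
                if self._find(self.new, key) is None:
                    self._store(self.new, key, value)
            self.old[self.cursor] = []
            self.cursor += 1
        if self.cursor >= len(self.old):
            # Migration finished: the second table replaces the first.
            self.old, self.new, self.cursor = self.new, None, 0

    def put(self, url, page_data):
        if self.new is not None:
            # Claim 7: during migration, new data goes to the second table.
            self._store(self.new, url, page_data)
            return
        self._store(self.old, url, page_data)
        if self.collision_rate() > self.update_threshold:
            # Claim 1: create a second table with larger capacity.
            self.new = [[] for _ in range(len(self.old) * self.growth_factor)]
            self.cursor = 0

    def get(self, url):
        self._migrate_step()  # claim 4: migrate a portion when a query arrives
        if self.new is not None:
            found = self._find(self.new, url)
            if found is not None:
                return found
        # Claim 8: fall back to the first table if the second has no match.
        return self._find(self.old, url)
```

In this sketch each call to get() pays for at most one bucket move, so the cost of enlarging the table is spread over many queries instead of blocking a single request. With four initial buckets, four queries after the second table is created are enough to finish the migration.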
CN201710065766.1A 2017-02-06 2017-02-06 Update method, equipment, web storage system and the search system of web storage Pending CN106844706A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710065766.1A CN106844706A (en) 2017-02-06 2017-02-06 Update method, equipment, web storage system and the search system of web storage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710065766.1A CN106844706A (en) 2017-02-06 2017-02-06 Update method, equipment, web storage system and the search system of web storage

Publications (1)

Publication Number Publication Date
CN106844706A true CN106844706A (en) 2017-06-13

Family

ID=59121903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710065766.1A Pending CN106844706A (en) 2017-02-06 2017-02-06 Update method, equipment, web storage system and the search system of web storage

Country Status (1)

Country Link
CN (1) CN106844706A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103810244A (en) * 2013-12-09 2014-05-21 北京理工大学 Distributed data storage system expansion method based on data distribution
CN104504076A (en) * 2014-12-22 2015-04-08 西安电子科技大学 Method for implementing distributed caching with high concurrency and high space utilization rate
CN104954444A (en) * 2015-05-27 2015-09-30 华为技术有限公司 Cached data migration method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LUOTUO44: "memcached源码分析——哈希表基本操作以及扩容过程", 《HTTPS://BLOG.CSDN.NET/LUOTUO44/ARTICLE/DETAILS/42773231》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109828966A (en) * 2019-01-17 2019-05-31 平安科技(深圳)有限公司 Gradual heavy hash method, device, computer equipment and storage medium
CN111143744A (en) * 2019-12-26 2020-05-12 杭州安恒信息技术股份有限公司 Method, device and equipment for detecting web assets and readable storage medium
CN111143744B (en) * 2019-12-26 2023-10-13 杭州安恒信息技术股份有限公司 Method, device and equipment for detecting web asset and readable storage medium
CN113407462A (en) * 2021-06-16 2021-09-17 新华三信息安全技术有限公司 Data processing method and device, electronic equipment and medium

Similar Documents

Publication Publication Date Title
US8380680B2 (en) Piecemeal list prefetch
US10133770B2 (en) Copying garbage collector for B+ trees under multi-version concurrency control
CN106886375A (en) The method and apparatus of data storage
CN106844706A (en) Update method, equipment, web storage system and the search system of web storage
KR101773781B1 (en) Method and apparatus for user oriented data visualzation based on the web
CN108228649A (en) For the method and apparatus of data access
CN111198868A (en) Intelligent sub-database real-time data migration method and device
CN102591855A (en) Data identification method and data identification system
US10970262B2 (en) Multiple versions of triggers in a database system
US20230012642A1 (en) Method and device for snapshotting metadata, and storage medium
US10552399B2 (en) Predicting index fragmentation caused by database statements
US20150007118A1 (en) Software development using gestures
CN107025247A (en) Method, equipment, browser and the electronic equipment handled web data
CN112654995A (en) Tracking content attribution in online collaborative electronic documents
US10114579B2 (en) Data migration tool with intermediate incremental copies
US11093389B2 (en) Method, apparatus, and computer program product for managing storage system
AU2018214032A1 (en) Systems and methods for maintaining group membership records
CN111427511B (en) Data storage method and device
US10535011B2 (en) Predicting capacity based upon database elements
CN107122401A (en) To the method for data database storing, equipment, middleware equipment and server
US11385967B2 (en) Method for managing backup data by having space recycling operations on executed backup data blocks
CN107665124A (en) Modularization JavaScript file processing method, equipment and server
CN105159756A (en) Information processing method and information processing equipment
CN107977245A (en) A kind of application terminal exchange method and application terminal
CN109376148B (en) Data processing method and device for slow change dimension table and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170613