CN109918575A - Data elimination method and apparatus applied to a search system - Google Patents

Data elimination method and apparatus applied to a search system

Info

Publication number
CN109918575A
Authority
CN
China
Prior art keywords
business datum
data
accessed
value
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910247135.0A
Other languages
Chinese (zh)
Inventor
刘一平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201910247135.0A priority Critical patent/CN109918575A/en
Publication of CN109918575A publication Critical patent/CN109918575A/en
Pending legal-status Critical Current

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

This specification provides a data elimination method and apparatus applied to a search system. The access history of each piece of business data is recorded first, and an activity value for the business data is calculated from that history. When data is eliminated, two factors are considered: the activity value of the data and the length of time the data has existed. Compared with the traditional strategy of eliminating old business data purely by time, this scheme lets cold data (data nobody cares about) be eliminated as early as possible while hot data stays resident in the system. With limited system space, this improves the search hit rate and avoids the system load and latency caused by frequently reloading data that has already been eliminated.

Description

Data elimination method and apparatus applied to a search system
Technical field
This specification relates to the Internet field, and in particular to a data elimination method and apparatus applied to a search system.
Background
Es (ElasticSearch) is a Lucene-based search server that provides a search engine with distributed, multi-user retrieval capability. An Es cluster can store a certain amount of business data for later access. For example, in the anti-money-laundering field, user business data is stored in an Es cluster in the form of data indexes so that it can be searched.
Because the volume of business data is large and the capacity of an Es cluster is limited, old data stored in the cluster is usually eliminated periodically so that stored data can be rotated. Traditional elimination schemes remove expired data on a schedule along the time dimension. For example, if an Es cluster is allowed to store only one month of business data, then under an oldest-first rotation algorithm, the business data from the 1st of last month must be eliminated on the 1st of this month to keep the cluster capacity healthy and balanced.
The traditional scheme eliminates and deletes data with earlier dates along the date dimension. After elimination, if eliminated business data needs to be accessed, it must be reloaded and synchronized from offline storage. The single dimension used by the traditional scheme is too narrow: it does not take actual business needs into account. When accesses span the time dimension (reaching into the eliminated range) or when multiple client entities are queried in succession, hit failures occur easily. The eliminated data must then be reloaded, which increases system load and latency.
Summary of the invention
In view of the above technical problems, the embodiments of this specification provide a data elimination method and apparatus applied to a search system. The technical solution is as follows:
According to a first aspect of the embodiments of this specification, a data elimination method applied to a search system is provided, the method comprising:
obtaining accessed information of business data, the accessed information including at least accessed-time information of the business data, and calculating and adjusting an activity value of the corresponding business data according to the accessed information;
extracting a time field of the business data, and determining, according to the time field, the existence duration of the business data in the search system;
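calculating a heat score for each piece of business data using a preset data heat algorithm, the heat score being inversely proportional to the existence duration of the business data and directly proportional to the activity value of the business data;
comparing the calculated heat score of the business data with a predefined elimination threshold, and deleting from the search system the business data whose heat score is below the elimination threshold.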
According to a second aspect of the embodiments of this specification, a data elimination apparatus applied to a search system is provided, the apparatus comprising:
an access monitoring module, configured to obtain accessed information of business data, the accessed information including at least accessed-time information of the business data, and to calculate and adjust an activity value of the corresponding business data according to the accessed information;
a duration determining module, configured to extract a time field of the business data and to determine, according to the time field, the existence duration of the business data in the search system;
a heat calculation module, configured to calculate a heat score for each piece of business data using a preset data heat algorithm, the heat score being inversely proportional to the existence duration of the business data and directly proportional to the activity value of the business data;
a data elimination module, configured to compare the calculated heat score of the business data with a predefined elimination threshold, and to delete from the search system the business data whose heat score is below the elimination threshold.
According to a third aspect of the embodiments of this specification, a computer device is provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements a data elimination method applied to a search system, the method comprising:
obtaining accessed information of business data, the accessed information including at least accessed-time information of the business data, and calculating and adjusting an activity value of the corresponding business data according to the accessed information;
extracting a time field of the business data, and determining, according to the time field, the existence duration of the business data in the search system;
calculating a heat score for each piece of business data using a preset data heat algorithm, the heat score being inversely proportional to the existence duration of the business data and directly proportional to the activity value of the business data;
comparing the calculated heat score of the business data with a predefined elimination threshold, and deleting from the search system the business data whose heat score is below the elimination threshold.
In the technical solution provided by the embodiments of this specification, the access history of each piece of business data is recorded, and the activity value of the business data is calculated from that history. When data is eliminated, two factors are considered: the activity value of the data and the length of time the data has existed. Compared with the traditional strategy of eliminating data purely by time, this scheme lets cold data (data nobody cares about) be eliminated as early as possible while hot data stays resident in the system. With limited system space, this improves the search hit rate and avoids the system load and latency caused by frequently reloading eliminated data.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the embodiments of this specification.
In addition, no single embodiment of this specification needs to achieve all of the above effects.
Brief description of the drawings
To describe the technical solutions in the embodiments of this specification or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some of the embodiments recorded in this specification; those of ordinary skill in the art can derive other drawings from them.
Fig. 1 is a flowchart of a data elimination method applied to a search system according to an exemplary embodiment of this specification;
Fig. 2 is a flowchart of a method for calculating the elimination threshold according to an exemplary embodiment of this specification;
Fig. 3 is a flowchart of a batch update mechanism according to an exemplary embodiment of this specification;
Fig. 4 is a schematic diagram of a data elimination apparatus applied to a search system according to an exemplary embodiment of this specification;
Fig. 5 is a schematic structural diagram of a computer device according to an exemplary embodiment of this specification.
Detailed description of the embodiments
Exemplary embodiments are described in detail here, with examples shown in the drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this specification. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this specification as detailed in the appended claims.
The terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit this specification. The singular forms "a", "said", and "the" used in this specification and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this specification, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
In view of the above problems, the embodiments of this specification provide a data elimination method applied to a search system, and a data elimination apparatus applied to a search system for executing the method. The data elimination method of this embodiment is described in detail below. Referring to Fig. 1, the method may include the following steps:
S101: obtain accessed information of business data, the accessed information including at least accessed-time information of the business data, and calculate and adjust the activity value of the corresponding business data according to the accessed information;
This embodiment describes the search system using the search server Es (ElasticSearch) as an example. Es is a search engine, and an Es cluster can store a certain amount of business data for users to search. Hereinafter, this specification uses Es to refer to the search system.
Taking the anti-money-laundering field as an example, historical business data is stored in the Es cluster in the form of data indexes. The business data may include information such as the payer, payee, transaction time, traded goods, and transaction amount of a transaction. A user can enter certain information from the business data into Es to find one or more pieces of business data associated with it. For example, if the user enters the payer account name "Lin Yi" and the transaction date "October 8" into Es as filter conditions, the search returns all transactions made by "Lin Yi" on "October 8", and each of those transactions can be regarded as one piece of business data.
It can be understood that if a piece of business data is accessed by users frequently, it can be regarded as "data of high concern"; if a piece of business data is not accessed by users, it can be regarded as "data of no concern". This embodiment therefore introduces the activity value of business data as a parameter: the accessed information of the business data is obtained first, and the activity value of the corresponding business data is calculated and adjusted according to it, for example by raising the activity value of the business data each time it is accessed.
The accessed information of business data needs to be recorded when a user accesses the business data. Specifically, a policy for recording access information about business data can be added to the search history of the Es engine, so that the accessed information is recorded each time a piece of business data is accessed.
Specifically, access data can be configured for each piece of business data. The access data may include the total number of accesses to the business data within a preset period, the total number of visitors, the time of each access, and similar information, and it can be updated each time the business data is accessed.
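As an illustration only, the per-record access data described above could be held in a small structure like the following Python sketch. The class name, field names, and the use of a user id string are all assumptions made for this example; the specification does not prescribe a storage layout.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class AccessRecord:
    """Access data kept alongside one piece of business data (hypothetical layout)."""
    total_accesses: int = 0
    visitors: set = field(default_factory=set)        # distinct user ids
    access_times: list = field(default_factory=list)  # one timestamp per access

    def record_access(self, user_id: str, when: datetime) -> None:
        # Called each time the business data is accessed.
        self.total_accesses += 1
        self.visitors.add(user_id)
        self.access_times.append(when)

    @property
    def last_accessed(self):
        # Most recent access time, or None if never accessed.
        return max(self.access_times) if self.access_times else None
```

Keeping the raw timestamps, rather than only a counter, leaves room to derive both recency and frequency later when the activity value is calculated.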
S102: extract the time field of the business data, and determine, according to the time field, the existence duration of the business data in the search system;
S103: calculate a heat score for each piece of business data using a preset data heat algorithm, the heat score being inversely proportional to the existence duration of the business data and directly proportional to the activity value of the business data;
The heat score is inversely proportional to the existence duration of the business data and directly proportional to its activity value. The algorithm formula may be:
Score = α * (current time - creation time of the business data) + β * (activity value * c) + heat base, where α < 0, β > 0, c > 0, and the heat base is a preset constant.
The algorithm shows that whether a piece of business data is eliminated depends not only on how long it has existed but, more importantly, on how much attention clients pay to it (that is, the activity value of the business data). Compared with the traditional oldest-first rotation strategy, an elimination strategy based on this algorithm can greatly reduce the probability of misses on eliminated data and the resulting response time.
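The formula can be written out as a short Python function. This is a minimal sketch: the default values of alpha, beta, c, and the heat base, and the choice of days as the unit of age, are assumptions for illustration and are not given in the specification.

```python
from datetime import datetime, timedelta


def heat_score(now, created_at, activity, alpha=-1.0, beta=10.0, c=1.0, base=100.0):
    """Score = alpha * age + beta * (activity * c) + base, with alpha < 0, beta > 0, c > 0."""
    if not (alpha < 0 and beta > 0 and c > 0):
        raise ValueError("expected alpha < 0, beta > 0, c > 0")
    # Age in days; the unit is an assumption, the source only says "current time - creation time".
    age_days = (now - created_at) / timedelta(days=1)
    return alpha * age_days + beta * (activity * c) + base
```

With these illustrative constants, a 30-day-old record with activity value 5 scores 120, while an equally old record with activity value 0 scores 70, so the active record survives a threshold that removes the inactive one.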
When the activity value of business data is calculated, the parameters that feed into it are set according to specific needs. In general, the activity value can be calculated from the most recent access time of the business data and its access frequency within a predetermined period. For example, the closer the most recent access time of a piece of business data is to the current time, the higher its activity value; the more times a piece of business data was accessed in the previous week, the higher its activity value.
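One possible way to combine recency and frequency is sketched below in Python. The specific recency curve, the seven-day window, and the two weights are assumptions chosen for the example; the specification only requires that more recent and more frequent access yield a higher activity value.

```python
from datetime import datetime, timedelta


def activity_value(access_times, now, period=timedelta(days=7),
                   recency_weight=1.0, frequency_weight=1.0):
    """Activity from the latest access time and the access count within `period`.

    Assumes every timestamp in access_times is at or before `now`.
    """
    if not access_times:
        return 0.0
    days_since_last = (now - max(access_times)) / timedelta(days=1)
    recency = 1.0 / (1.0 + days_since_last)   # 1.0 if just accessed, decays toward 0
    recent_count = sum(1 for t in access_times if now - t <= period)
    return recency_weight * recency + frequency_weight * recent_count
```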
In addition to calculating the activity value from the access time and access frequency of the business data, other parameters of the business data can be added as factors that influence the heat score. For example:
a) Record the number of visitors to the business data within a predetermined period and use it as one of the weight parameters for calculating the heat score. For example, given the same 100 accesses, a piece of business data accessed once each by 100 people has a higher activity value than one accessed 100 times by a single person;
b) Predefine hotspot values for business factors such as business scenario, business location, and business time period, and use the hotspot value of the business data as one of the weight parameters for calculating the heat score. For example, if "Double 11" is defined as a hotspot business period, business transactions that occur on "Double 11" are considered more likely to be queried by users. When the activity value of business data is calculated, the activity value of business data whose business time period is "Double 11" can then be raised accordingly.
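The two additional factors in items a) and b) can be sketched as follows. The logarithmic damping of repeat visits, the multiplier of 1.5, and all function and constant names are hypothetical choices for illustration; the specification does not define how these weights are computed.

```python
import math
from collections import Counter
from datetime import date

# Hypothetical hotspot table: (month, day) -> multiplier. November 11 is "Double 11".
HOTSPOT_PERIODS = {(11, 11): 1.5}


def breadth_weighted_accesses(access_log):
    """access_log holds one user id per access; repeat visits by one user count less."""
    counts = Counter(access_log)
    return sum(1.0 + math.log(n) for n in counts.values())


def hotspot_multiplier(business_date):
    return HOTSPOT_PERIODS.get((business_date.month, business_date.day), 1.0)


def adjusted_activity(access_log, business_date):
    return breadth_weighted_accesses(access_log) * hotspot_multiplier(business_date)
```

Under this weighting, 100 accesses by 100 different users contribute more than 100 accesses by a single user, which matches the ordering described in item a).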
S104: compare the calculated heat score of the business data with a predefined elimination threshold, and delete from the search system the business data whose heat score is below the elimination threshold.
The specific timing of data elimination can be set according to the application scenario; in general, a scheduled elimination strategy can be used. Under a scheduled elimination strategy, data elimination is executed at each predetermined time point, and the "cold data" that can be eliminated is deleted from the Es cluster in a batch, for example once a week. In this mode, the heat score of business data does not need to be updated on each access; instead, related data such as the accessed information is recorded with the business data. When a scheduled elimination starts, the heat scores of all business data are calculated together, and business data whose heat score does not qualify is then eliminated.
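A scheduled elimination pass of this kind could look like the Python sketch below. It uses an in-memory dict as a stand-in for the Es cluster, and the record fields, the scoring constants, and the function names are assumptions for illustration; deleting documents from a real cluster would go through the search system's own API.

```python
from datetime import datetime, timedelta


def simple_score(record, now, alpha=-1.0, beta=10.0, base=100.0):
    # Same shape as the heat formula; constants and the day unit are illustrative.
    age_days = (now - record["created_at"]) / timedelta(days=1)
    return alpha * age_days + beta * record["activity"] + base


def run_elimination(store, now, threshold, score_fn=simple_score):
    """Score every record at elimination time and delete those below the threshold.

    `store` maps record id -> record dict and stands in for the search system.
    Returns the ids that were eliminated.
    """
    evicted = [rid for rid, rec in store.items() if score_fn(rec, now) < threshold]
    for rid in evicted:
        del store[rid]
    return evicted
```

Because scores are computed only when the pass runs, each access just appends to the access data, which keeps the query path cheap.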
Besides the scheduled elimination strategy, other elimination strategies can be set according to the actual business scenario. For example, a safety margin is set for the Es cluster; after new business data is added, if the remaining space of the Es cluster is detected to be less than the safety margin, the business data elimination mechanism is triggered, and the business data elimination process described above is executed once to eliminate "cold data".
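The capacity-triggered variant can be sketched as a check that runs after each write. In this Python sketch, the `eliminate` callback, the abstract space units, and the function name are hypothetical; the callback is assumed to run one elimination pass and report how much space it freed.

```python
def add_and_maybe_eliminate(used, size, max_capacity, safety_margin, eliminate):
    """Return the used space after adding `size` units of new business data.

    `eliminate` is a hypothetical callback that runs one elimination pass
    and returns the amount of space it freed.
    """
    used += size
    if max_capacity - used < safety_margin:
        # Remaining space fell below the safety margin: trigger elimination.
        used -= eliminate()
    return used
```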
The elimination threshold of a search system is not a fixed value but one that is updated dynamically. For the update method, refer to Fig. 2, which includes the following steps:
S201: determine the maximum bearer capacity of the search system and the space currently in use, and calculate the remaining available space of the search system from them;
S202: dynamically update the elimination threshold according to the remaining available space, so that the elimination threshold is inversely proportional to the remaining available space.
The elimination threshold is adjusted dynamically according to the capacity of the Es cluster and the space currently in use. When the Es cluster has little space left, business data must have a higher heat score to avoid elimination, so more of the business data that fails the condition heat score > elimination threshold is eliminated, which raises the survival bar for business data. Dynamically adjusting the elimination threshold better balances two needs: keeping the Es cluster free of storage pressure, and storing as much hotspot business data as possible.
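Steps S201 and S202 can be illustrated with the following Python sketch. The scaling constant `k`, the optional `floor`, and the choice to scale by the fraction of capacity remaining are assumptions; the specification only states that the threshold is inversely proportional to the remaining available space.

```python
def elimination_threshold(max_capacity, used_space, k=10.0, floor=0.0):
    """Threshold that rises as remaining space shrinks (inverse relation)."""
    remaining = max_capacity - used_space          # S201: remaining available space
    if remaining <= 0:
        return float("inf")                        # no space left: nothing qualifies to stay
    return floor + k * max_capacity / remaining    # S202: inversely proportional to remaining
```

With these illustrative constants, a cluster that is half full yields a threshold of 20, and one that is 90% full yields 100, so far fewer records survive when space is tight.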
Besides comparing the calculated heat score of business data with a predefined elimination threshold and deleting business data whose heat score is below the threshold, other elimination strategies can be set according to the actual business scenario. For example, business data is sorted by heat score from high to low, and the business data ranked toward the bottom is eliminated.
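The ranking-based alternative can be sketched in a few lines of Python. The `fraction` parameter, which decides how much of the bottom of the ranking is eliminated, is an assumption; the specification does not say how the cutoff is chosen.

```python
def evict_lowest(scores, fraction):
    """Return the ids in the lowest-scoring `fraction` of records.

    `scores` maps record id -> heat score; `fraction` is a hypothetical tuning knob.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # high to low
    n_evict = int(len(ranked) * fraction)
    return ranked[len(ranked) - n_evict:]
```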
In some application scenarios, such as the anti-money-laundering field, the business associations among the accounts of a risk group tend to form a network structure. When a user accesses business data within a group, the user is likely to run a multi-entity transaction query, that is, to query in succession the associated transactions of multiple accounts in the group. In this case, a batch update mechanism can be introduced based on the characteristics of the business system and on experience. The batch update mechanism can greatly reduce the miss rate of anti-money-laundering multi-entity transaction queries. The process of updating the access data is shown in Fig. 3 and can be divided into the following steps:
S301: monitor business data and record the accessed information of the business data in real time;
S302: after the business data of a client is accessed, determine the associated clients of that client, the associated clients including at least other clients who have transacted with that client within a preset time range;
S303: determine the associated transactions of the associated clients, and determine and update the activity value of the corresponding business data according to the degree of association of the associated transactions.
As can be seen, under the batch update mechanism, after the business data of an account is accessed, the access data of that business data is updated and, at the same time, the access data of the transaction data of its associated accounts is updated as well.
Specifically, different labels can be set to tag business data. For example, after business data is accessed, the directly accessed business data is tagged with a "direct access event" and the corresponding access time, and the associated data updated in the batch is tagged with an "associated access event" and the corresponding access time. When the heat score is calculated, different weights can be applied to the heat score of business data according to the type of access event, for example a weight of 1 for "direct access" and 0.7 for "associated access".
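The batch update mechanism can be sketched as follows. For brevity, this Python example tags events per client rather than per individual transaction, and represents transactions as (payer, payee) pairs already restricted to the preset time range; both simplifications, and all names, are assumptions for illustration. The weights 1 and 0.7 are the example values given in the text.

```python
WEIGHTS = {"direct": 1.0, "associated": 0.7}


def associated_clients(client, transactions):
    """Clients that transacted with `client`; `transactions` holds (payer, payee) pairs."""
    others = set()
    for payer, payee in transactions:
        if payer == client:
            others.add(payee)
        elif payee == client:
            others.add(payer)
    others.discard(client)  # ignore self-transfers
    return others


def record_access(client, transactions, events, when):
    """Tag the accessed client's data as direct and its counterparties' as associated."""
    events.setdefault(client, []).append(("direct", when))
    for other in associated_clients(client, transactions):
        events.setdefault(other, []).append(("associated", when))


def weighted_access_count(client, events):
    # Sum of event weights; feeds into the activity value.
    return sum(WEIGHTS[kind] for kind, _ in events.get(client, ()))
```

A single query on one account therefore also warms the data of its counterparties, at a reduced weight, which is what lowers the miss rate for the follow-up queries in a multi-entity investigation.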
The data elimination scheme provided by this specification intervenes at two points, business data access and business data elimination. When a user runs a business query, the most recent access time of the business data is updated; when data is eliminated, the heat score of each piece of business data is calculated from the access times. The more times business data has been accessed and the closer its access time is to the current time, the more the accessing users can be considered to care about it, and the heat score is raised accordingly. The longer business data has existed in Es, the "older" the data is considered to be, and the heat score is lowered accordingly. This keeps hot data resident in the Es cluster and eliminates cold data (data nobody cares about) as early as possible. With the limited space of the Es cluster, it improves users' search hit rate and avoids the system load and latency caused by frequently reloading eliminated data.
Corresponding to the above method embodiment, the embodiments of this specification also provide a data elimination apparatus applied to a search system. Referring to Fig. 4, the apparatus includes an access monitoring module 410, a duration determining module 420, a heat calculation module 430, and a data elimination module 440.
Access monitoring module 410: configured to obtain accessed information of business data, the accessed information including at least accessed-time information of the business data, and to calculate and adjust the activity value of the corresponding business data according to the accessed information;
Duration determining module 420: configured to extract the time field of the business data and to determine, according to the time field, the existence duration of the business data in the search system;
Heat calculation module 430: configured to calculate a heat score for each piece of business data using a preset data heat algorithm, the heat score being inversely proportional to the existence duration of the business data and directly proportional to the activity value of the business data;
Data elimination module 440: configured to compare the calculated heat score of the business data with a predefined elimination threshold, and to delete from the search system the business data whose heat score is below the elimination threshold.
The embodiments of this specification also provide a computer device, including at least a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the aforementioned data elimination method applied to a search system, the method including at least:
obtaining accessed information of business data, the accessed information including at least accessed-time information of the business data, and calculating and adjusting an activity value of the corresponding business data according to the accessed information;
extracting a time field of the business data, and determining, according to the time field, the existence duration of the business data in the search system;
calculating a heat score for each piece of business data using a preset data heat algorithm, the heat score being inversely proportional to the existence duration of the business data and directly proportional to the activity value of the business data;
comparing the calculated heat score of the business data with a predefined elimination threshold, and deleting from the search system the business data whose heat score is below the elimination threshold.
Fig. 5 shows a more specific schematic diagram of the hardware structure of a computing device provided by the embodiments of this specification. The device may include a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. The processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively connected to one another inside the device through the bus 1050.
The processor 1010 may be implemented as a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and executes the relevant programs to implement the technical solution provided by the embodiments of this specification.
The memory 1020 may be implemented as ROM (Read Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 can store an operating system and other application programs. When the technical solution provided by the embodiments of this specification is implemented in software or firmware, the relevant program code is stored in the memory 1020 and is called and executed by the processor 1010.
The input/output interface 1030 connects input/output modules to support information input and output. An input/output module may be configured in the device as a component (not shown in the figure) or attached externally to provide the corresponding function. Input devices may include a keyboard, mouse, touch screen, microphone, and various sensors; output devices may include a display, speaker, vibrator, and indicator lights.
The communication interface 1040 connects a communication module (not shown in the figure) so that the device can communicate with other devices. The communication module may communicate by wired means (such as USB or a network cable) or wirelessly (such as a mobile network, Wi-Fi, or Bluetooth).
The bus 1050 includes a path that transfers information between the components of the device (such as the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040).
It should be noted that although only the processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050 are shown for the above device, in a specific implementation the device may also include other components necessary for normal operation. In addition, those skilled in the art will understand that the device may include only the components necessary to implement the solution of the embodiments of this specification, rather than all the components shown in the figure.
An embodiment of this specification also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the aforementioned data elimination method applied to a search system. The method includes at least:
Obtaining access information of business data, the access information including at least access time information of the business data, and calculating and adjusting the activity value of the corresponding business data according to the access information;
Extracting the time field of the business data, and determining the existence duration of the business data in the search system according to the time field;
Calculating a heat score for each piece of business data using a preset data heat algorithm, the heat score being inversely proportional to the existence duration of the business data and directly proportional to the activity value of the business data;
Comparing the calculated heat score of the business data with a predefined elimination threshold, and deleting from the search system the business data whose heat score is lower than the elimination threshold.
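The steps above can be illustrated with a minimal Python sketch. The specification does not disclose the concrete heat algorithm, so the formula used here (activity value divided by existence duration in days) is only an assumption that satisfies the stated proportionality. The names `BusinessRecord`, `existence_days`, `heat_score`, and `eliminate` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BusinessRecord:
    record_id: str
    created_at: datetime      # the "time field" of the business data
    activity_value: float     # maintained elsewhere from access information


def existence_days(record: BusinessRecord, now: datetime) -> float:
    """Existence duration in days, floored at one day to avoid division by zero."""
    return max((now - record.created_at).total_seconds() / 86400.0, 1.0)


def heat_score(record: BusinessRecord, now: datetime) -> float:
    """Assumed formula: proportional to activity, inversely proportional to age."""
    return record.activity_value / existence_days(record, now)


def eliminate(records, threshold: float, now: datetime):
    """Split records into (kept, removed) by comparing heat score with the threshold."""
    kept, removed = [], []
    for record in records:
        if heat_score(record, now) < threshold:
            removed.append(record)   # would be deleted from the search system
        else:
            kept.append(record)
    return kept, removed


if __name__ == "__main__":
    now = datetime(2019, 3, 29)
    records = [
        BusinessRecord("hot", now - timedelta(days=10), 50.0),
        BusinessRecord("cold", now - timedelta(days=100), 10.0),
    ]
    kept, removed = eliminate(records, threshold=1.0, now=now)
    print([r.record_id for r in kept], [r.record_id for r in removed])
```

Under this assumed formula, an old record can still stay in the system if it is accessed often enough, which is the behavior the method aims for compared with purely time-based elimination.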
Computer-readable media include permanent and non-permanent, removable and non-removable media, which may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
Since the device embodiments substantially correspond to the method embodiments, refer to the description of the method embodiments for the relevant details. The device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this specification. Those of ordinary skill in the art can understand and implement this without creative effort.
From the description of the embodiments above, those skilled in the art can clearly understand that the embodiments of this specification may be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments of this specification, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a storage medium such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments, or in certain parts of the embodiments, of this specification.
The systems, devices, modules, or units described in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, e-mail device, game console, tablet computer, wearable device, or a combination of any of these devices.
The above are only specific implementations of the embodiments of this specification. It should be noted that those of ordinary skill in the art may make several improvements and modifications without departing from the principles of the embodiments of this specification, and these improvements and modifications shall also be regarded as falling within the protection scope of the embodiments of this specification.

Claims (11)

1. A data elimination method applied to a search system, the method comprising:
Obtaining access information of business data, the access information including at least access time information of the business data, and calculating and adjusting the activity value of the corresponding business data according to the access information;
Extracting the time field of the business data, and determining the existence duration of the business data in the search system according to the time field;
Calculating a heat score for each piece of business data using a preset data heat algorithm, the heat score being inversely proportional to the existence duration of the business data and directly proportional to the activity value of the business data;
Comparing the calculated heat score of the business data with a predefined elimination threshold, and deleting from the search system the business data whose heat score is lower than the elimination threshold.
2. The method according to claim 1, wherein calculating and adjusting the activity value of the corresponding business data according to the access information comprises:
Updating the activity value of the business data according to the most recent access time of the business data and its access frequency within a preset period.
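As an illustration only (not part of the claims), one way to combine the most recent access time with the access frequency in a preset period is sketched below in Python. The `ActivityTracker` class, the 7-day default period, and the recency weight `1 / (1 + idle_days)` are assumptions; the source does not specify how the two inputs are combined.

```python
from collections import deque
from datetime import datetime, timedelta


class ActivityTracker:
    """Tracks accesses to one piece of business data (hypothetical helper).

    Accesses are assumed to be recorded in chronological order.
    """

    def __init__(self, period: timedelta = timedelta(days=7)):
        self.period = period          # the "preset period" for frequency counting
        self.accesses = deque()       # access timestamps still inside the period
        self.last_access = None       # most recent access time

    def record_access(self, when: datetime) -> None:
        self.accesses.append(when)
        self.last_access = when

    def activity_value(self, now: datetime) -> float:
        # Drop accesses that have fallen out of the preset period.
        while self.accesses and now - self.accesses[0] > self.period:
            self.accesses.popleft()
        if self.last_access is None:
            return 0.0
        frequency = len(self.accesses)
        idle_days = (now - self.last_access).total_seconds() / 86400.0
        recency = 1.0 / (1.0 + idle_days)   # assumed weight, 1.0 right after an access
        return frequency * recency
```

With this weighting, data that was accessed often but long ago decays toward zero, while a single very recent access still yields a non-zero activity value.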
3. The method according to claim 2, wherein obtaining the access information of the business data comprises:
Monitoring the business data and recording the access information of the business data in real time, and, after the business data of a client is accessed, simultaneously updating the activity value of the business data of the clients associated with that client.
4. The method according to claim 3, wherein simultaneously updating the activity value of the business data of the clients associated with that client comprises:
Determining the associated clients of the client, the associated clients including at least other clients that have transacted with the client within a preset time range;
Determining the associated transactions of the associated clients, and determining and updating the activity value of the corresponding business data according to the degree of association of the associated transactions.
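As an illustration only (not part of the claims), the propagation to associated clients can be sketched in Python as follows. The source does not define how the degree of association is measured; this sketch assumes it is the share of the client's recent transactions that involve a given counterparty. The function names, the 30-day window, and the `boost` parameter are hypothetical.

```python
from datetime import datetime, timedelta


def associated_clients(transactions, client, now, window):
    """Return {counterparty: association degree} for trades with `client` in the window.

    `transactions` is a list of (client_a, client_b, timestamp) tuples.
    """
    counts = {}
    for a, b, ts in transactions:
        if a == b or client not in (a, b) or now - ts > window:
            continue
        other = b if a == client else a
        counts[other] = counts.get(other, 0) + 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def on_access(activity, transactions, client, now,
              window=timedelta(days=30), boost=1.0):
    """Raise the accessed client's activity value and, scaled by degree, its associates'."""
    activity[client] = activity.get(client, 0.0) + boost
    degrees = associated_clients(transactions, client, now, window)
    for other, degree in degrees.items():
        activity[other] = activity.get(other, 0.0) + boost * degree
    return activity
```

Clients whose only transactions with the accessed client fall outside the window receive no boost, which matches the preset time range in the claim.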
5. The method according to claim 1, wherein the elimination threshold is updated by:
Determining the maximum carrying capacity of the search system and the currently used space, and calculating the remaining available space of the search system from the maximum carrying capacity and the currently used space;
Dynamically updating the elimination threshold according to the remaining available space, so that the elimination threshold is inversely proportional to the remaining available space.
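As an illustration only (not part of the claims), a threshold that is inversely proportional to the remaining space can be sketched in Python as `threshold = k / remaining`. The constant `k` and the small `floor` that guards against division by zero when the system is full are assumptions, not values from the source.

```python
def remaining_space(max_capacity: float, used: float) -> float:
    """Remaining available space of the search system, never negative."""
    return max(max_capacity - used, 0.0)


def elimination_threshold(max_capacity: float, used: float,
                          k: float = 100.0, floor: float = 1e-6) -> float:
    """Assumed rule: threshold = k / remaining space (inverse proportionality)."""
    remaining = max(remaining_space(max_capacity, used), floor)
    return k / remaining
```

As free space shrinks the threshold rises, so more low-heat data falls below it and is eliminated; when space is plentiful the threshold drops and more data stays resident.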
6. A data elimination device applied to a search system, the device comprising:
An access monitoring module, configured to obtain access information of business data, the access information including at least access time information of the business data, and to calculate and adjust the activity value of the corresponding business data according to the access information;
A duration determining module, configured to extract the time field of the business data and to determine the existence duration of the business data in the search system according to the time field;
A heat computing module, configured to calculate a heat score for each piece of business data using a preset data heat algorithm, the heat score being inversely proportional to the existence duration of the business data and directly proportional to the activity value of the business data;
A data elimination module, configured to compare the calculated heat score of the business data with a predefined elimination threshold and to delete from the search system the business data whose heat score is lower than the elimination threshold.
7. The device according to claim 6, wherein calculating and adjusting the activity value of the corresponding business data according to the access information comprises:
Updating the activity value of the business data according to the most recent access time of the business data and its access frequency within a preset period.
8. The device according to claim 7, wherein obtaining the access information of the business data comprises:
Monitoring the business data and recording the access information of the business data in real time, and, after the business data of a client is accessed, simultaneously updating the activity value of the business data of the clients associated with that client.
9. The device according to claim 8, wherein simultaneously updating the activity value of the business data of the clients associated with that client comprises:
Determining the associated clients of the client, the associated clients including at least other clients that have transacted with the client within a preset time range;
Determining the associated transactions of the associated clients, and determining and updating the activity value of the corresponding business data according to the degree of association of the associated transactions.
10. The device according to claim 6, wherein the elimination threshold is updated by:
Determining the maximum carrying capacity of the search system and the currently used space, and calculating the remaining available space of the search system from the maximum carrying capacity and the currently used space;
Dynamically updating the elimination threshold according to the remaining available space, so that the elimination threshold is inversely proportional to the remaining available space.
11. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method according to claim 1 when executing the program.
CN201910247135.0A 2019-03-29 2019-03-29 A kind of superseded method and apparatus of the data applied to search system Pending CN109918575A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910247135.0A CN109918575A (en) 2019-03-29 2019-03-29 A kind of superseded method and apparatus of the data applied to search system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910247135.0A CN109918575A (en) 2019-03-29 2019-03-29 A kind of superseded method and apparatus of the data applied to search system

Publications (1)

Publication Number Publication Date
CN109918575A true CN109918575A (en) 2019-06-21

Family

ID=66967606

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910247135.0A Pending CN109918575A (en) 2019-03-29 2019-03-29 A kind of superseded method and apparatus of the data applied to search system

Country Status (1)

Country Link
CN (1) CN109918575A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559504A (en) * 2020-12-09 2021-03-26 北京思特奇信息技术股份有限公司 Data cleaning method and device based on data heat and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105138587A (en) * 2015-07-31 2015-12-09 小米科技有限责任公司 Data access method, apparatus and system
CN105677240A (en) * 2015-12-30 2016-06-15 上海联影医疗科技有限公司 Data deleting method and system
CN108241725A (en) * 2017-05-24 2018-07-03 新华三大数据技术有限公司 A kind of data hot statistics system and method
CN109240611A (en) * 2018-08-28 2019-01-18 郑州云海信息技术有限公司 The cold and hot data hierarchy method of small documents, small documents data access method and its device
CN109299144A (en) * 2018-08-22 2019-02-01 北京奇艺世纪科技有限公司 A kind of data processing method, device, system and application server
CN109471875A (en) * 2018-09-25 2019-03-15 网宿科技股份有限公司 Based on data cached temperature management method, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200925

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200925

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

TA01 Transfer of patent application right