CN105677652A - Data management method and device - Google Patents

Publication number
CN105677652A
Authority
CN
China
Prior art keywords
data sheet
data
numerical value
information
sheet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410659318.0A
Other languages
Chinese (zh)
Other versions
CN105677652B (en)
Inventor
李炉阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Tmall Technology Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201410659318.0A priority Critical patent/CN105677652B/en
Publication of CN105677652A publication Critical patent/CN105677652A/en
Application granted granted Critical
Publication of CN105677652B publication Critical patent/CN105677652B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the application disclose a data management method and device. The method comprises: obtaining table attribute information of data tables, matching the table attribute information against a first information set to obtain a matching result, and determining a first value for each data table whose matching result is successful; for each data table whose matching result is unsuccessful, determining a first period of the data table according to the table attribute information, and calculating the first value of the data table according to the first period and the life cycle in the table attribute information; and determining to-be-processed data tables among the data tables according to their first values, and processing the to-be-processed data tables according to preset rules. The data management method and device enable data management that remains effective over the long term.

Description

Data management method and device
Technical field
The application relates to the field of computer data processing, and in particular to a data management method and device.
Background
Data has penetrated every industry and business function and has become an important factor of production. With the arrival of the big data era, the data volume and business complexity of every enterprise are growing rapidly, and the demand for data storage keeps increasing, which makes data management more difficult.
Existing data management methods usually proceed as follows: when data storage hits a bottleneck, new storage machines may be added; if new storage machines cannot be added, the server may initiate a round of data cleanup. The data cleanup process may comprise: finding the top n tables with the largest data volume, for example tables whose data volume exceeds 10 TB; judging whether each such large table can be deleted and, if so, deleting it; or, reducing the life cycle of the large tables found.
In the course of realizing the application, the inventor found that the prior art has at least the following problems: each time, finding the large tables and confirming whether they can be deleted or have their life cycle reduced costs a great deal of time and manpower; moreover, existing data management methods only relieve the shortage of storage space in the short term, and after some time a storage bottleneck appears again. Existing data management methods therefore cannot manage data effectively over the long term.
Summary of the invention
The object of the embodiments of the application is to provide a data management method and device that achieve data management which remains effective over the long term.
To solve the above technical problem, the data management method and device provided by the embodiments of the application are realized as follows:
A data management method, comprising:
obtaining table attribute information of a data table, matching the table attribute information against a first information set to obtain a matching result, and determining a first value of each data table whose matching result is successful;
for each data table whose matching result is unsuccessful, determining a first period of the data table according to the table attribute information, and calculating the first value of the data table according to the first period of the data table and the life cycle in the table attribute information of the data table;
determining to-be-processed data tables among the data tables according to the first values of the data tables, and processing the to-be-processed data tables according to preset rules.
In a preferred scheme, the data management method further comprises: determining a second value of a first unit according to the first values of the data tables and the table attribute information of the data tables.
A data management device, comprising: a matching module, a first value calculation module and a processing module; wherein,
the matching module is configured to obtain table attribute information of a data table, match the table attribute information against a first information set to obtain a matching result, and determine a first value of each data table whose matching result is successful;
the first value calculation module is configured to, for each data table whose matching result in the matching module is unsuccessful, determine a first period of the data table according to the table attribute information, and calculate the first value of the data table according to the first period of the data table and the life cycle in the table attribute information of the data table;
the processing module is configured to determine to-be-processed data tables among the data tables according to the first values determined by the matching module and the first value calculation module, and to process the to-be-processed data tables according to preset rules.
In a preferred scheme, the data management device further comprises a first unit module; the first unit module is configured to determine a second value of a first unit according to the first values of the data tables determined by the matching module and the first value calculation module and the table attribute information of the data tables.
As can be seen from the technical scheme provided by the embodiments of the application, the disclosed data management method and device analyze the table attribute information of a data table to determine its first value. The first value intuitively reflects the data validity of the data table, and by processing data tables with low data validity in time, data management that remains effective over the long term can be achieved. Further, in a preferred embodiment, a second value may also be determined for a first unit corresponding to an individual, an application or a business unit. The second value intuitively reflects the data validity of the data tables associated with the first unit, which helps in managing those data tables.
Brief description of the drawings
To illustrate the technical schemes of the embodiments of the application or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below are only some of the embodiments recorded in the application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of an embodiment of the data management method of the application;
Fig. 2 is a flowchart of another embodiment of the data management method of the application;
Fig. 3 is a module diagram of an embodiment of the data management device of the application;
Fig. 4 is a module diagram of another embodiment of the data management device of the application.
Detailed description
The embodiments of the application provide a data management method and device.
To help those skilled in the art better understand the technical scheme of the application, the technical scheme in the embodiments of the application is described clearly and completely below with reference to the drawings. Evidently, the described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the application without creative effort shall fall within the scope of protection of the application.
Fig. 1 is a flowchart of an embodiment of the data management method of the application. As shown in Fig. 1, the data management method may comprise:
S101: obtain the table attribute information of a data table, match the table attribute information against a first information set to obtain a matching result, and determine the first value of each data table whose matching result is successful.
The table attribute information of the data table is obtained. It may comprise at least one of the following: table type information, manual review information, access information within a first time interval, output time information, life cycle information, span information, byte count information, and responsible person information.
The table type information may describe the type of the data table. The table type may be, for example, a source table, a partitioned table or a non-partitioned table. A source table may be the final table that needs to be retained in a processing link. In general, one link may comprise one source table.
The manual review information may describe whether the data table has passed manual review. If the data table has passed manual review, it may be a table that needs to be retained.
The access information within the first time interval may describe whether the data table has been accessed within the first time interval. If the data table has been accessed within the first time interval, it may be retained. The first time interval may be set in advance, for example to one month.
The output time information may describe the length of time from the creation of the data table to the present.
The life cycle information may describe the maximum time for which data is stored in the data table. For example, if the life cycle of a data table is 7 days, data "XXX" stored in the table is automatically deleted from the table on the 8th day after it was stored.
The span information may represent how the data table has been accessed after it was output, and may comprise span values and the time information corresponding to each span value.
Limit storage information may record whether limit storage has been applied to the data table. Limit storage refers to reducing the records stored repeatedly in a data table, so that the data stored in the table does not waste resource space.
The byte count information may represent the size of the data volume in the data table.
The responsible person information may represent the person responsible for the data table, and may comprise, for example, the responsible person's name, contact details or department.
The first information set comprises: the table type is source table; the table has passed manual review; the table has been accessed within the first time interval; the table type is non-partitioned table and the table has been accessed within the second time interval; and the output time is less than a first preset duration.
The second time interval may have the same value as the first time interval. The first preset duration may be set in advance. It is generally less than 20 days and may, for example, be set to 10 days.
Matching the table attribute information against the first information set may specifically comprise: taking the intersection of the table attribute information and the first information set. If the intersection is not empty, some or all of the table attribute information of the data table is identical to information in the first information set, and the matching result may be determined to be successful.
If the matching result of the table attribute information of the data table against the first information set is successful, the first value of the data table may be determined by setting it equal to the maximum value of a first preset range.
The first preset range is the value range of the first value. It may be set in advance, for example to 0~100 or 0~1. Assuming the first preset range is 0~100, when the matching result of the table attribute information against the first information set is successful, the first value of the data table may be determined to be 100.
A data table that satisfies part of the information in the first information set can be regarded as a data table whose data needs to be retained. Matching the table attribute information against the first information set thus filters out the data tables with stronger data validity.
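The matching step of S101 can be sketched as follows. This is a minimal illustration in Python, not the patented implementation: the `TableAttrs` record, its field names and the condition labels are hypothetical, and the constants use the example values from the text (a first preset range of 0~100 and a first preset duration of 10 days).

```python
from dataclasses import dataclass
from typing import Optional

FIRST_RANGE_MAX = 100    # maximum of the first preset range (example: 0~100)
FIRST_PRESET_DAYS = 10   # first preset duration (example value from the text)


@dataclass
class TableAttrs:
    """Hypothetical container for the table attribute information."""
    table_type: str                    # "source", "partitioned" or "non_partitioned"
    manually_reviewed: bool
    accessed_in_first_interval: bool
    accessed_in_second_interval: bool
    age_days: int                      # output time: days since the table was created


def matched_conditions(t: TableAttrs) -> set:
    """Return the entries of the first information set that the table satisfies."""
    hits = set()
    if t.table_type == "source":
        hits.add("source_table")
    if t.manually_reviewed:
        hits.add("manually_reviewed")
    if t.accessed_in_first_interval:
        hits.add("accessed_in_first_interval")
    if t.table_type == "non_partitioned" and t.accessed_in_second_interval:
        hits.add("non_partitioned_and_accessed")
    if t.age_days < FIRST_PRESET_DAYS:
        hits.add("recently_output")
    return hits


def first_value_if_matched(t: TableAttrs) -> Optional[int]:
    """A non-empty intersection means a successful match: return the range maximum.

    None signals an unsuccessful match, to be handled by S102.
    """
    return FIRST_RANGE_MAX if matched_conditions(t) else None
```

A single satisfied condition is enough for a successful match here, which follows the non-empty-intersection rule in the text.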
S102: for each data table whose matching result is unsuccessful, determine the first period of the data table according to the table attribute information, and calculate the first value of the data table according to the first period of the data table and the life cycle in the table attribute information of the data table.
For a data table whose matching result is unsuccessful, the first period of the data table may be determined according to the table attribute information. The first period may represent a reasonable storage period for the data in the data table. Specifically, the first period of the data table is determined according to the span information in its table attribute information.
The span information may represent the time interval from the time the data table was generated to the time it was accessed. It may comprise span values and the time information corresponding to each span value. The span information of one data table may comprise one or more span values, each with its corresponding time information. For example, if a data table was generated on October 1, 2014 and accessed on October 10, 2014 and on October 20, 2014, its span value for October 10, 2014 is 10 days and its span value for October 20, 2014 is 20 days.
Determining the first period of the data table according to the span information comprises: obtaining the maximum span value whose time information falls within the second time interval before the current time, determining the interval range to which this maximum span value belongs, and determining the first period of the data table according to the correspondence between interval ranges and first periods.
The value of the second time interval may be set in advance; for example, the second time interval may be 90 days. The larger the maximum span value of the data table within the second time interval, the longer the corresponding first period may be.
The correspondence between interval ranges and first periods may be set in advance. For example, it may be as shown in Table 1.
Table 1
Interval range        First period
0~4 days              7 days
5~12 days             15 days
13~30 days            33 days
31~90 days            93 days
91~180 days           183 days
191~365 days          368 days
366~730 days          1095 days
More than 730 days    + 365 days (first period)
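The lookup described above can be sketched in Python as follows. The function and variable names are hypothetical. Two points are assumptions rather than statements of the source: the sketch keys only on the upper bound of each interval range (so a span of 181~190 days, which Table 1 does not list, falls into the 368-day row), and for spans above 730 days it reads the last row as "maximum span value plus 365 days".

```python
from datetime import date, timedelta

SECOND_INTERVAL_DAYS = 90  # example value of the second time interval

# (upper bound of the interval range in days, first period in days), from Table 1
PERIOD_TABLE = [(4, 7), (12, 15), (30, 33), (90, 93),
                (180, 183), (365, 368), (730, 1095)]


def max_recent_span(spans, today, window_days=SECOND_INTERVAL_DAYS):
    """spans: iterable of (span_days, access_date) pairs.

    Returns the largest span value whose access date lies within the
    last `window_days` days, or None if there is no such record.
    """
    cutoff = today - timedelta(days=window_days)
    recent = [span for span, accessed in spans if cutoff <= accessed <= today]
    return max(recent) if recent else None


def first_period(max_span_days):
    """Map the maximum span value to a first period using Table 1."""
    for upper_bound, period in PERIOD_TABLE:
        if max_span_days <= upper_bound:
            return period
    # Assumption: the source row for spans above 730 days is ambiguous.
    return max_span_days + 365
```

With the example from the text (span values of 10 and 20 days recorded in October 2014 and a current date shortly afterwards), the largest recent span is 20 days, which falls in the 13~30 day row and gives a first period of 33 days.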
The first value of the data table may be calculated according to the first period of the data table and the life cycle of the data table.
The life cycle of the data table may be obtained from its table attribute information. Calculating the first value from the first period and the life cycle may specifically comprise: calculating the difference between the life cycle and the first period, and subtracting this difference from the maximum value of the first preset range to obtain a first candidate result; then judging whether the first candidate result is less than a first preset value. If the first candidate result is less than the first preset value, the first value of the data table is set equal to the first preset value; if the first candidate result is greater than or equal to the first preset value, the first value of the data table is set equal to the first candidate result. The first preset value lies within the first preset range; for example, if the first preset range is 0~100, the first preset value may be 20.
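The calculation above amounts to a subtraction followed by a lower bound. A minimal Python sketch, using the example values from the text (range maximum 100, first preset value 20) as defaults; the function and parameter names are hypothetical:

```python
def first_value(life_cycle_days, first_period_days,
                range_max=100, first_preset=20):
    """First value of a data table whose match in S101 was unsuccessful."""
    # First candidate result: range maximum minus (life cycle - first period).
    candidate = range_max - (life_cycle_days - first_period_days)
    # The text only describes a lower bound (the first preset value); it does not
    # say what happens when the life cycle is shorter than the first period,
    # so this sketch leaves that case unclamped.
    return first_preset if candidate < first_preset else candidate
```

For instance, a table with a 30-day life cycle and a 15-day first period gets 100 - 15 = 85, while a table with a 365-day life cycle and a 33-day first period would fall far below the floor and is assigned 20.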
In this step, the table attribute information of the data table is analyzed to obtain the first value of the data table. The first value reflects the validity of the data in the data table: the larger the first value, the stronger the data validity of the corresponding data table.
S103: determine the to-be-processed data tables among the data tables according to the first values of the data tables, and process the to-be-processed data tables according to preset rules.
The to-be-processed data tables may be determined according to the first values of the data tables by any one of the following methods or a combination of several of them:
sorting the data tables by first value and selecting the m data tables with the smallest first values as the to-be-processed data tables, where m is a positive integer less than or equal to the total number of data tables;
sorting the data tables by first value and selecting the p% of data tables with the smallest first values as the to-be-processed data tables, where the value of p is 0~100;
comparing the first value of each data table with a preset reference value and taking the data tables whose first value is less than the preset reference value as the to-be-processed data tables, where the preset reference value is greater than the first preset value.
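The three selection methods can be sketched in Python as follows. The function names and the dictionary layout (table name mapped to first value) are hypothetical, and rounding the p% count down is an assumption, since the text does not specify how fractional counts are handled.

```python
import math


def lowest_m(first_values, m):
    """Names of the m tables with the smallest first values."""
    return sorted(first_values, key=first_values.get)[:m]


def lowest_percent(first_values, p):
    """Names of the p% of tables with the smallest first values (0 <= p <= 100)."""
    count = math.floor(len(first_values) * p / 100)  # assumption: round down
    return sorted(first_values, key=first_values.get)[:count]


def below_reference(first_values, reference):
    """Names of the tables whose first value is less than the reference value."""
    return [name for name, value in first_values.items() if value < reference]
```

The methods can be combined, for example by intersecting the result of `below_reference` with that of `lowest_m` so that only a bounded number of low-value tables is processed per run.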
Processing the to-be-processed data tables according to preset rules may comprise: deleting the to-be-processed data table; or deleting part of the data in the to-be-processed data table; or changing the life cycle of the to-be-processed data table.
Changing the life cycle of a to-be-processed data table may mean changing its life cycle to its first period.
Because a larger first value corresponds to stronger data validity, processing the data tables with smaller first values effectively reduces the storage resources wasted by data tables with low data validity.
The data management method embodiment disclosed in the application analyzes the table attribute information of a data table to determine its first value, which intuitively reflects the data validity of the data table. By processing data tables with low data validity in time, the method can effectively handle not only large tables but also long-tail tables, which are numerous but small in data volume, and thus achieves data management that remains effective over the long term.
Fig. 2 is a flowchart of another embodiment of the data management method of the application. As shown in Fig. 2, this embodiment differs from the first embodiment of the data management method in that the method may further comprise:
S104: determine the second value of a first unit according to the first values of the data tables and the table attribute information of the data tables.
The first unit may be an individual, an application or a business unit. The second value may describe the data validity of all data tables associated with the first unit. The higher the second value, the higher the data validity of the data tables associated with the first unit.
When the first unit is an individual, the second value may be determined as follows: the first values of the data tables associated with the first unit are weighted, summed and averaged; the average is multiplied by the management rate of the data tables, and the result is the second value.
Wherein,
the weight of a data table may be calculated as: the byte count of the data table plus 1, then the square root of the sum.
The management rate of the data tables (their management coverage) may be calculated as: a first data volume divided by the total data volume. The total data volume may be the data volume of the data tables associated with the first unit.
The first data volume may be the sum of the data volumes of the data tables that satisfy a first rule. The first rule may be that the table attribute information of the data table satisfies at least one of the following: it comprises life cycle information; the table type is source table; the table has passed manual review; limit storage has been applied.
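The calculation for an individual can be sketched in Python as follows. The `OwnedTable` record and the function names are hypothetical. Two readings are assumptions: "weighted, summed and averaged" is taken as a weighted mean normalized by the sum of the weights, and the root in the weight formula is taken as a square root.

```python
import math
from dataclasses import dataclass


@dataclass
class OwnedTable:
    """Hypothetical record for one data table associated with the first unit."""
    first_value: float
    byte_count: int
    has_life_cycle: bool = False
    is_source: bool = False
    manually_reviewed: bool = False
    limit_stored: bool = False


def weight(t):
    # Weight of a table: square root of (byte count + 1).
    return math.sqrt(t.byte_count + 1)


def satisfies_first_rule(t):
    # First rule: at least one of the four management attributes holds.
    return (t.has_life_cycle or t.is_source
            or t.manually_reviewed or t.limit_stored)


def management_rate(tables):
    # First data volume divided by the total data volume.
    total = sum(t.byte_count for t in tables)
    managed = sum(t.byte_count for t in tables if satisfies_first_rule(t))
    return managed / total if total else 0.0


def second_value_individual(tables):
    if not tables:
        return 0.0
    total_weight = sum(weight(t) for t in tables)
    # Assumption: weighted mean normalized by the sum of the weights.
    mean = sum(weight(t) * t.first_value for t in tables) / total_weight
    return mean * management_rate(tables)
```

Taking the square root of the byte count lets large tables count for more than small ones without letting a single very large table dominate the average.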
When the first unit is an application or a business unit, the second value may be determined as follows: the first values of the data tables associated with the first unit are weighted, summed and averaged; the average is multiplied by the management rate of the data tables; the product is multiplied by the responsible person completeness rate, and the result is the second value.
Wherein,
the weight of a data table may be calculated as: the byte count of the data table plus 1, then the square root of the sum.
The management rate of the data tables may be calculated as: a second data volume divided by the total data volume. The total data volume may be the data volume of the data tables associated with the first unit.
The second data volume may be the sum of the data volumes of the data tables that satisfy a second rule. The second rule may be that the table attribute information of the data table satisfies at least one of the following: it comprises life cycle information; the table has passed manual review; limit storage has been applied.
The responsible person completeness rate may be calculated as: a third data volume divided by the total data volume. The total data volume may be the data volume of the data tables associated with the first unit.
The third data volume may be: the sum of the data volumes of those data tables associated with the first unit whose table attribute information contains responsible person information.
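For an application or business unit, the same weighted average is scaled by two ratios. The Python sketch below is illustrative only: the dictionary keys (`first_value`, `bytes`, `managed`, `has_owner`) are hypothetical, `managed` stands for "satisfies the second rule", and the weighted mean is again assumed to be normalized by the sum of the square-root weights.

```python
import math


def second_value_unit(tables):
    """tables: list of dicts with keys first_value, bytes, managed, has_owner."""
    total_bytes = sum(t["bytes"] for t in tables)
    if total_bytes == 0:
        return 0.0
    # Weight of each table: square root of (byte count + 1).
    weights = [math.sqrt(t["bytes"] + 1) for t in tables]
    mean = sum(w * t["first_value"] for w, t in zip(weights, tables)) / sum(weights)
    # Management rate: second data volume / total data volume.
    management_rate = sum(t["bytes"] for t in tables if t["managed"]) / total_bytes
    # Responsible person completeness rate: third data volume / total data volume.
    owner_rate = sum(t["bytes"] for t in tables if t["has_owner"]) / total_bytes
    return mean * management_rate * owner_rate
```

Because both ratios lie between 0 and 1, a unit whose tables lack life cycle settings or responsible persons ends up with a lower second value even when the individual first values are high.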
The other parts of this embodiment are identical to the first embodiment of the data management method of the application and are not repeated here.
It should be noted that S104 may be performed before S103 or after S103; the application places no restriction on this.
The data management method embodiment disclosed here builds on the first method embodiment and additionally determines the second value of a first unit corresponding to an individual, an application or a business unit. The second value intuitively reflects the data validity of the data tables associated with the first unit, which helps in managing those data tables.
The data management device embodiments of the application are introduced below.
Fig. 3 shows a module diagram of an embodiment of the data management device of the application. As shown in Fig. 3, the data management device may comprise: a matching module 301, a first value calculation module 302 and a processing module 303. Wherein,
the matching module 301 may be configured to obtain table attribute information of a data table, match the table attribute information against a first information set to obtain a matching result, and determine the first value of each data table whose matching result is successful;
the first value calculation module 302 may be configured to, for each data table whose matching result in the matching module 301 is unsuccessful, determine the first period of the data table according to the table attribute information, and calculate the first value of the data table according to the first period of the data table and the life cycle in the table attribute information of the data table;
the processing module 303 may be configured to determine the to-be-processed data tables among the data tables according to the first values determined by the matching module 301 and the first value calculation module 302, and to process the to-be-processed data tables according to preset rules.
Fig. 4 shows a module diagram of another embodiment of the data management device of the application. As shown in Fig. 4, this embodiment differs from the first embodiment of the data management device in that the data management device may further comprise a first unit module 304.
The first unit module 304 may be configured to determine the second value of a first unit according to the first values of the data tables determined by the matching module 301 and the first value calculation module 302 and the table attribute information of the data tables.
The other parts of this embodiment are identical to the first embodiment of the data management device of the application and are not repeated here.
The data management device disclosed in the above embodiments corresponds to the data management method embodiments of the application and can achieve the technical effects of those method embodiments.
In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (for example, an improvement to circuit structures such as diodes, transistors and switches) or an improvement in software (an improvement to a method flow). With the development of technology, however, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be realized with a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers program by themselves to "integrate" a digital system onto a single PLD, without needing a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of making integrated circuit chips by hand, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art should also understand that a hardware circuit implementing the logical method flow can easily be obtained merely by logically programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
Controller can realize by any suitable mode, such as, controller can be taked such as microprocessor or treater and store the computer-readable medium of the computer readable program code (such as software or firmware) that can perform by this (micro-) treater, logical gate, switch, application specific integrated circuit (ApplicationSpecificIntegratedCircuit, ASIC), the form of programmable logic controller and embedding microcontroller, the example of controller includes but not limited to following microcontroller: ARC625D, AtmelAT91SAM, MicrochipPIC18F26K20 and SiliconeLabsC8051F320, storer controller can also be implemented as a part for the steering logic of storer.
Those skilled in the art also know, except realizing controller in pure computer readable program code mode, controller can be made to realize identical function with the form of logical gate, switch, application specific integrated circuit, programmable logic controller and embedding microcontroller etc. by method steps carries out logic programming completely. Therefore this kind of controller can be considered as a kind of hardware component, and to the structure that can also be considered as in hardware component for realizing the device of various function comprised in it. Or even, it is possible to be considered as the device being used for realizing various function not only can being the software module of implementation method but also can be the structure in hardware component.
The systems, devices, modules or units set forth in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function.
For convenience of description, the above device is described in terms of various units divided by function. Of course, when implementing the application, the functions of the units may be realized in one or more pieces of software and/or hardware.
From the above description of the embodiments, those skilled in the art can clearly understand that the application can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the application in essence, or the part of it that contributes over the prior art, can be embodied in the form of a software product. In a typical configuration, a computing device comprises one or more processors (CPU), an input/output interface, a network interface and memory. The computer software product may include a number of instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform the methods described in the embodiments of the application or in certain parts of the embodiments. The computer software product may be stored in memory, which may include volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium. Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
The application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.
The application may be described in the general context of computer-executable instructions, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
Although the application has been described by way of embodiments, those of ordinary skill in the art know that the application admits many variations and changes without departing from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of the application.

Claims (25)

1. A data management method, characterized by comprising:
Obtaining table property information of data sheets, matching the table property information against a first information set to obtain a matching result, and determining a first value for each data sheet whose matching result is successful;
For each data sheet whose matching result is unsuccessful, determining a first period of the data sheet according to the table property information, and calculating the first value of the data sheet according to the first period of the data sheet and the life cycle in the table property information of the data sheet;
Determining to-be-processed data sheets among the data sheets according to the first values of the data sheets, and processing the to-be-processed data sheets according to preset rules.
2. The data management method as claimed in claim 1, characterized in that matching the table property information against the first information set comprises: taking the intersection of the table property information and the first information set.
3. The data management method as claimed in claim 2, characterized in that the matching result being successful comprises: the intersection of the table property information and the first information set is not the empty set.
4. The data management method as claimed in claim 3, characterized in that determining the first value of a data sheet whose matching result is successful comprises: setting the first value of the data sheet equal to the maximum value of a first preset range; wherein the first preset range is the value range of the first value.
5. The data management method as claimed in claim 1, characterized in that the first information set comprises: the table type is source table; the table has passed manual review; the table has been accessed within a first time interval; the table type is non-partitioned table and the table has been accessed within a second time interval; and the output time is less than a first preset time length.
6. The data management method as claimed in claim 1, characterized in that the table property information comprises at least one of the following: table type information, manual review information, information on access within the first time interval, output time information, life cycle information, span information, byte count information, and responsible-person information.
7. The data management method as claimed in claim 6, characterized in that the span information of the data sheet comprises: one or more span values and the time information corresponding to each span value.
8. The data management method as claimed in claim 7, characterized in that determining the first period of the data sheet according to the table property information comprises: obtaining the maximum span value whose time information falls within the second time interval before the current time, determining the interval range to which the maximum span value belongs, and determining the first period of the data sheet according to the correspondence between interval ranges and first periods.
9. The data management method as claimed in claim 8, characterized in that the larger the maximum span value of the data sheet within the second time interval, the longer the corresponding first period of the data sheet.
10. The data management method as claimed in claim 1, characterized in that calculating the first value of the data sheet according to the first period of the data sheet and the life cycle in the table property information of the data sheet comprises:
Calculating the difference between the life cycle and the first period, and subtracting the difference from the maximum value of the first preset range to obtain a first candidate result;
Judging whether the first candidate result is greater than or equal to the second preset value; if the first candidate result is less than the first preset value, setting the first value of the data sheet equal to the first preset value; if the first candidate result is greater than the first preset value, setting the first value of the data sheet equal to the first candidate result;
Wherein the first preset range is the value range of the first value, and the first preset value belongs to the first preset range.
11. The data management method as claimed in claim 10, characterized in that determining the to-be-processed data sheets among the data sheets according to the first values of the data sheets specifically comprises using any one of the following methods, or a combination of several of them:
Sorting the data sheets by first value, and selecting the m data sheets with the smallest first values as the to-be-processed data sheets; wherein m is a positive integer less than or equal to the total number of data sheets;
Sorting the data sheets by first value, and selecting the p% of the data sheets with the smallest first values as the to-be-processed data sheets; wherein p ranges from 0 to 100;
Comparing the first value of each data sheet with a preset reference value, and taking the data sheets whose first value is less than the preset reference value as the to-be-processed data sheets; wherein the preset reference value is greater than the first preset value.
12. The data management method as claimed in claim 1, characterized in that processing the to-be-processed data sheets according to preset rules comprises:
Deleting the to-be-processed data sheet; or,
Deleting part of the data in the to-be-processed data sheet; or,
Changing the life cycle of the to-be-processed data sheet.
13. The data management method as claimed in claim 12, characterized in that changing the life cycle of the to-be-processed data sheet comprises: changing the life cycle of the to-be-processed data sheet to the first period.
14. The data management method as claimed in claim 1, characterized by further comprising:
Determining a second value of a first unit according to the first values of the data sheets and the table property information of the data sheets.
15. The data management method as claimed in claim 14, characterized in that the first unit comprises: an individual, an application or a business unit.
16. The data management method as claimed in claim 15, characterized in that, when the first unit is an individual, determining the second value of the first unit comprises:
Performing weighted superposition on the first values of the data sheets associated with the first unit and taking the average, and multiplying the average by the management coverage rate of the data sheets; the result obtained is the second value.
17. The data management method as claimed in claim 16, characterized in that the management coverage rate of the data sheets is calculated as: a first data volume divided by a total data volume;
Wherein the total data volume is the data volume of the data sheets associated with the first unit;
The first data volume is the sum of the data volumes of the data sheets that satisfy a first rule.
18. The data management method as claimed in claim 17, characterized in that the first rule comprises:
The table property information of the data sheet satisfies at least one of the following: it contains life cycle information; the table type is source table; the table has passed manual review; limit storage has been applied to the table.
19. The data management method as claimed in claim 15, characterized in that, when the first unit is an application or a business unit, determining the second value of the first unit comprises:
Performing weighted superposition on the first values of the data sheets associated with the first unit and taking the average, multiplying the average by the management coverage rate of the data sheets, and multiplying the resulting product by the responsible-person completeness rate; the result obtained is the second value.
20. The data management method as claimed in claim 16 or 19, characterized in that the weight of a data sheet is calculated by: adding 1 to the byte count of the data sheet and taking the root of the resulting sum.
21. The data management method as claimed in claim 19, characterized in that the management coverage rate of the data sheets is calculated as: a second data volume divided by a total data volume;
Wherein the total data volume is the data volume of the data sheets associated with the first unit;
The second data volume is the sum of the data volumes of the data sheets that satisfy a second rule.
22. The data management method as claimed in claim 21, characterized in that the second rule comprises:
The table property information of the data sheet satisfies at least one of the following: it contains life cycle information; the table has passed manual review; limit storage has been applied to the table.
23. The data management method as claimed in claim 19, characterized in that the responsible-person completeness rate is calculated as: a third data volume divided by a total data volume;
Wherein the total data volume is the data volume of the data sheets associated with the first unit;
The third data volume comprises: the sum of the data volumes of those data sheets associated with the first unit whose table property information contains responsible-person information.
24. A data management device, characterized by comprising: a matching module, a first value calculation module and a processing module; wherein,
The matching module is configured to obtain table property information of data sheets, match the table property information against a first information set to obtain a matching result, and determine a first value for each data sheet whose matching result is successful;
The first value calculation module is configured to, for each data sheet whose matching result in the matching module is unsuccessful, determine a first period of the data sheet according to the table property information, and calculate the first value of the data sheet according to the first period of the data sheet and the life cycle in the table property information of the data sheet;
The processing module is configured to determine to-be-processed data sheets among the data sheets according to the first values determined in the matching module and the first value calculation module, and to process the to-be-processed data sheets according to preset rules.
25. The data management device as claimed in claim 24, characterized by further comprising: a first unit module;
The first unit module is configured to determine a second value of a first unit according to the first values of the data sheets determined in the matching module and the first value calculation module and the table property information of the data sheets.
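The scoring steps recited in claims 2 to 4, 10, 11, 16, 17 and 20 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `DataSheet`, `SCORE_MAX`, `SCORE_FLOOR`, `first_value`, `select_lowest`, `select_below` and `second_value` are hypothetical, as are the 0 to 100 value range, the encoding of table property information as a set of tags, the use of byte count as "data volume", the reading of "root" as square root, and the reading of "weighted superposition and averaging" as a weighted mean. The first period is assumed to have been derived already from the span information.

```python
import math
from dataclasses import dataclass

SCORE_MAX = 100.0    # assumed maximum of the first preset range
SCORE_FLOOR = 0.0    # assumed first preset value (lower bound of the first value)


@dataclass
class DataSheet:
    name: str
    properties: frozenset   # hypothetical tag encoding of table property information
    life_cycle: float       # life cycle, assumed to be in days
    first_period: float     # first period, assumed already derived from span values
    byte_count: int         # byte count, also used here as the "data volume"


def first_value(sheet, first_information_set):
    """First value of one data sheet (claims 2-4 and 10)."""
    # Successful match: the intersection with the first information set is non-empty.
    if sheet.properties & first_information_set:
        return SCORE_MAX
    # Unsuccessful match: maximum minus (life cycle - first period),
    # clamped from below at the first preset value. No upper clamp is recited.
    candidate = SCORE_MAX - (sheet.life_cycle - sheet.first_period)
    return max(candidate, SCORE_FLOOR)


def select_lowest(sheets, scores, m):
    """Claim 11, first option: the m data sheets with the smallest first values."""
    return sorted(sheets, key=lambda s: scores[s.name])[:m]


def select_below(sheets, scores, reference):
    """Claim 11, third option: data sheets whose first value is below a reference."""
    return [s for s in sheets if scores[s.name] < reference]


def second_value(sheets, scores, is_managed):
    """Second value of an individual (claims 16, 17 and 20).

    is_managed is a caller-supplied predicate standing in for the first rule.
    """
    total_bytes = sum(s.byte_count for s in sheets)
    if not sheets or total_bytes == 0:
        return 0.0
    # Claim 20: weight = root of (byte count + 1); square root is assumed.
    weights = {s.name: math.sqrt(s.byte_count + 1) for s in sheets}
    weighted_mean = (
        sum(scores[s.name] * weights[s.name] for s in sheets)
        / sum(weights.values())
    )
    # Claim 17: first data volume divided by total data volume.
    coverage = sum(s.byte_count for s in sheets if is_managed(s)) / total_bytes
    return weighted_mean * coverage
```

For an application or business unit (claim 19), the same result would additionally be multiplied by the responsible-person completeness rate, computed in the same way as the coverage ratio but with a predicate that checks for responsible-person information.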
CN201410659318.0A 2014-11-19 2014-11-19 Data management method and device Active CN105677652B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410659318.0A CN105677652B (en) 2014-11-19 2014-11-19 Data management method and device

Publications (2)

Publication Number Publication Date
CN105677652A true CN105677652A (en) 2016-06-15
CN105677652B CN105677652B (en) 2019-01-04

Family

ID=56944649

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410659318.0A Active CN105677652B (en) 2014-11-19 2014-11-19 Data management method and device

Country Status (1)

Country Link
CN (1) CN105677652B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100257144A1 (en) * 2009-04-01 2010-10-07 Touchstone Systems, Inc. Method and system for data aggregation, targeting and acquisition
CN102141996A (en) * 2010-01-29 2011-08-03 国际商业机器公司 Data access method and configuration management database system
CN102651008A (en) * 2011-02-28 2012-08-29 国际商业机器公司 Method and equipment for organizing data records in relational data base
CN103577455A (en) * 2012-07-31 2014-02-12 国际商业机器公司 Data processing method and system for database aggregating operation
CN104111936A (en) * 2013-04-18 2014-10-22 阿里巴巴集团控股有限公司 Method and system for querying data

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112365244A (en) * 2020-11-27 2021-02-12 深圳前海微众银行股份有限公司 Data life cycle management method and device
CN112365244B (en) * 2020-11-27 2024-04-26 深圳前海微众银行股份有限公司 Data life cycle management method and device

Also Published As

Publication number Publication date
CN105677652B (en) 2019-01-04

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20211109

Address after: Room 507, floor 5, building 3, No. 969, Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province

Patentee after: ZHEJIANG TMALL TECHNOLOGY Co.,Ltd.

Address before: Greater Cayman, British Cayman Islands

Patentee before: ALIBABA GROUP HOLDING Ltd.