CN102411594B - Method and device for obtaining information - Google Patents


Info

Publication number
CN102411594B
Authority
CN
China
Prior art keywords
data
information
collection
information entropy
time period
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201010292828
Other languages
Chinese (zh)
Other versions
CN102411594A (en)
Inventor
李少年
蔡俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Group Hunan Co Ltd
Original Assignee
China Mobile Group Hunan Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Group Hunan Co Ltd
Priority to CN 201010292828
Publication of CN102411594A
Application granted
Publication of CN102411594B
Legal status: Active
Anticipated expiration


Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for obtaining information. The main technical scheme is as follows: the data used for obtaining information is determined in advance, and the time period in which the data is generated is divided into a plurality of sub-time periods; the following steps are then executed for each sub-time period: loading the data generated in the current sub-time period; determining the first information entropy corresponding to each itemset obtained by combining at least one preset data attribute in the loaded data; determining the second information entropy corresponding to each itemset in the data generated in all sub-time periods before the current sub-time period; and updating, according to the first information entropy and the second information entropy corresponding to each itemset, the itemset set used to identify the obtained information. This technical scheme improves the efficiency of obtaining information on the one hand and reduces system overhead on the other.

Description

Method and device for obtaining information
Technical field
The present invention relates to the field of data processing, and in particular to a method and device for obtaining information.
Background
As society becomes increasingly information-driven, the volume of data held by information systems keeps growing, and different industries need to process and analyze large, continuously updated data streams. The problem every industry currently faces is that the data volume is very large while the truly valuable information in it is scarce. How to mine valuable information from large, continuously updated data so as to guide subsequent business has therefore become a difficulty across industries.
Data mining is the data processing technique that emerged to meet the need to obtain valuable information from massive data. Data mining, also called knowledge discovery in databases, refers to extracting implicit, previously unknown, non-trivial and potentially useful information or patterns from large amounts of incomplete, noisy and fuzzy data; it draws on the theory and techniques of several fields, such as databases, artificial intelligence, machine learning and statistics. Data mining tools can predict future trends and behaviors and thus support decision-making well.
To obtain valuable information from massive data, the usual practice at present is to use a relational database. The specific process is: the massive data used for obtaining information is loaded into the relational database in one batch, and data mining is then performed on the loaded data in the hope of finding useful information. A relational database is a database based on the relational model, in which various data relations are defined and used to describe data; a relation can describe an entity and its attributes, or the association between entities. To process data with a relational database, the data source file is therefore first loaded completely to form a data set that satisfies the relational database's normal-form checks, and projection calculations over combined attributes are then performed on the database tables to obtain count statistics. In practice, obtaining information from massive data with a relational database requires waiting until all the data used for obtaining information has been generated and then loading it into the relational database in one pass, so the volume of data on which relational calculations must be performed is concentrated. This causes two problems: on the one hand, large amounts of system resources such as CPU, I/O and memory are consumed, so system overhead is very high; on the other hand, the volume of data to be processed at once is huge and processing takes a long time, so the efficiency of obtaining information is low.
In summary, the prior art obtains information from data on the basis of a relational database, which results in low efficiency of obtaining information and high system overhead.
Summary of the invention
In view of this, embodiments of the present invention provide a method and device for obtaining information. This technical scheme improves the efficiency of obtaining information on the one hand and reduces system overhead on the other.
The embodiments of the present invention are implemented through the following technical schemes:
According to one aspect of the embodiments of the present invention, a method for obtaining information is provided.
In the method for obtaining information provided by the embodiments of the present invention, the data used for obtaining information is determined in advance, and the time period in which the data is generated is divided into a plurality of sub-time periods;
The following is performed for each sub-time period:
loading the data generated in the current sub-time period;
determining the first information entropy corresponding to each itemset obtained by combining at least one preset data attribute in the loaded data;
determining the second information entropy corresponding to each itemset in the data generated in all sub-time periods before the current sub-time period;
updating, according to the first information entropy and the second information entropy corresponding to each itemset, the itemset set used to identify the obtained information;
grouping the itemsets stored in the itemset set used to identify the obtained information, according to the configured data attributes to be extracted and the data attributes corresponding to each itemset in that itemset set.
According to another aspect of the embodiments of the present invention, a device for obtaining information is also provided.
The device for obtaining information provided by the embodiments of the present invention comprises:
a data loading unit, configured to determine the data used for obtaining information, divide the time period in which the data is generated into a plurality of sub-time periods, and load the data generated in the current sub-time period;
a first information entropy determining unit, configured to determine the first information entropy corresponding to each itemset obtained by combining at least one preset data attribute in the data loaded by the data loading unit;
a second information entropy determining unit, configured to determine the second information entropy corresponding to each itemset in the data loaded by the data loading unit for all sub-time periods before the current sub-time period;
an itemset set updating unit, configured to update the itemset set used to identify the obtained information according to the first information entropy determined for each itemset by the first information entropy determining unit and the second information entropy determined by the second information entropy determining unit;
a grouping unit, configured to, after the itemset set used to identify the obtained information has been updated according to the first information entropy and the second information entropy corresponding to each itemset, group the itemsets stored in that itemset set according to the configured data attributes to be extracted and the data attributes corresponding to each itemset in that itemset set.
With at least one of the above technical schemes provided by the embodiments of the present invention, the data used for obtaining information is determined in advance, and the time period in which the data is generated is divided into a plurality of sub-time periods. For each sub-time period, the data generated in the current sub-time period is loaded; the first information entropy corresponding to each itemset obtained by combining at least one preset data attribute in the loaded data is determined; the second information entropy corresponding to each itemset in the data generated in all sub-time periods before the current sub-time period is determined; and the itemset set used to identify the obtained information is updated according to the first information entropy and the second information entropy corresponding to each itemset. In this scheme, the data used for obtaining information is divided into a plurality of sub-time periods according to its generation time, the data of only one sub-time period is loaded at a time, and the itemset set used to identify the obtained information is updated on the basis of the data generated in that sub-time period. Compared with the prior art, the task of obtaining information from the data is distributed over multiple executions, which greatly reduces the volume of data processed each time, thereby improving the efficiency of obtaining information and reducing system overhead.
Other features and advantages of the present invention will be set forth in the following description and will in part become apparent from the description, or be understood by implementing the present invention. The objectives and other advantages of the present invention can be realized and obtained through the structures particularly pointed out in the written description, the claims and the accompanying drawings.
Brief description of the drawings
The accompanying drawings are provided for further understanding of the present invention and form part of the description; together with the embodiments of the present invention they serve to explain the present invention and do not limit it. In the drawings:
Fig. 1 is the first flowchart of the method for obtaining information provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of determining the first information entropy provided by Embodiment 1 of the present invention;
Fig. 3 is a flowchart of determining the second information entropy provided by Embodiment 1 of the present invention;
Fig. 4 is a flowchart of updating the itemset set used to identify the obtained information provided by Embodiment 1 of the present invention;
Fig. 5 is the second flowchart of the method for obtaining information provided by Embodiment 1 of the present invention;
Fig. 6 is the second flowchart of the method for obtaining information provided by Embodiment 3 of the present invention;
Fig. 7 is the third flowchart of the method for obtaining information provided by Embodiment 3 of the present invention;
Fig. 8 is the fourth flowchart of the method for obtaining information provided by Embodiment 3 of the present invention;
Fig. 9 is the fifth flowchart of the method for obtaining information provided by Embodiment 3 of the present invention;
Fig. 10 is the sixth flowchart of the method for obtaining information provided by Embodiment 3 of the present invention;
Fig. 11 is the seventh flowchart of the method for obtaining information provided by Embodiment 3 of the present invention;
Fig. 12 is a flowchart of the method for obtaining information provided by Embodiment 4 of the present invention.
Detailed description
To provide an implementation that improves the efficiency of obtaining information and reduces system overhead, embodiments of the present invention provide a method and device for obtaining information. Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here serve only to describe and explain the present invention and are not intended to limit it. Where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another.
Embodiment 1
Embodiment 1 of the present invention provides a method for obtaining information. The method divides the data used for obtaining information into a plurality of sub-time periods according to its generation time, loads the data of only one sub-time period at a time, and updates the itemset set used to identify the obtained information on the basis of the data generated in that sub-time period. The task of obtaining information from the data is thus distributed over multiple executions, which improves the efficiency of obtaining information and reduces system overhead.
In the method for obtaining information provided by Embodiment 1, the data used for obtaining information must be determined in advance, and the time period in which the data is generated must be divided into a plurality of sub-time periods. Preferably, this time period can be divided into W sub-time periods of equal duration T, where T is greater than or equal to the estimated time required to obtain information from the data of one sub-time period. This ensures that, before the data generated in the current sub-time period is loaded, obtaining information from the data generated in the previous sub-time period has finished. This preferred approach avoids the situation in which, after the data for the current sub-time period has been loaded, its processing is delayed because the data for the previous sub-time period has not yet been fully processed (that is, information has not yet been fully obtained from it). It thereby guarantees the continuity of data processing and improves its efficiency.
It should be understood that the method of dividing sub-time periods given above is only the preferred implementation provided by Embodiment 1. In a specific application, the division can be determined flexibly in view of factors such as system processing capacity and the actual volume of data to be processed; the possibilities are not enumerated here.
After the data used for obtaining information has been determined and the division into sub-time periods is complete, the method for obtaining information provided by Embodiment 1, as shown in Fig. 1, performs the following steps 101 to 104 for each sub-time period obtained by the division:
Step 101: load the data generated in the current sub-time period.
In step 101, after the sub-time periods have been determined, each sub-time period is timed. When the sub-time period ends, the data to be generated in it is complete, and the data generated in the current sub-time period is loaded. In practice, the timing can be done by a time controller, which triggers loading of the data generated in each sub-time period.
Step 102: determine the first information entropy corresponding to each itemset obtained by combining at least one preset data attribute in the loaded data generated in the current sub-time period.
Before step 102 is performed, the data attributes of the data from which information is to be obtained are set in advance. One or more data attributes can be set as needed, and itemsets are obtained by combining the set data attributes. In the special case in which only one data attribute is set, there is a single itemset, which corresponds to that data attribute. If N data attributes are set (N greater than or equal to 2), they can be combined to obtain several itemsets, each corresponding to a different combination of data attributes. For example, three data attributes A, B and C can be combined into 7 itemsets: {A}, {B}, {C}, {A, B}, {B, C}, {A, C}, {A, B, C}.
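The combination of data attributes into itemsets can be illustrated with a short sketch. The function name `build_itemsets` is hypothetical; the sketch simply enumerates every non-empty combination of the configured attributes, which for three attributes gives the 7 itemsets listed above.

```python
from itertools import combinations


def build_itemsets(attributes):
    """Return every non-empty combination of the given data attributes."""
    itemsets = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            itemsets.append(frozenset(combo))
    return itemsets


itemsets = build_itemsets(["A", "B", "C"])
print(len(itemsets))  # 7 itemsets for 3 attributes
```

In general N attributes yield 2^N - 1 itemsets, so the number of attributes should be kept small in practice.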
The detailed process of determining the first information entropy in step 102 is described later and is not elaborated here.
Step 103: determine the second information entropy corresponding to each itemset in the data generated in all sub-time periods before the current sub-time period.
In step 103, if the current sub-time period is the first sub-time period, the second information entropy corresponding to each itemset in the data generated in all sub-time periods before the current sub-time period is 0.
The detailed process of determining the second information entropy in step 103 is described later and is not elaborated here.
Step 104: update the itemset set used to identify the obtained information according to the determined first information entropy and second information entropy corresponding to each itemset.
The detailed process of updating the itemset set used to identify the obtained information in step 104 is described later and is not elaborated here.
This completes one pass of updating the itemset set used to identify the obtained information according to the data generated in one sub-time period; information has been obtained from the data generated in the current sub-time period. In the above flow, steps 102 and 103 have no strict execution order: in practice, step 103 may be performed before step 102, or the two may be performed in parallel.
In Embodiment 1, after the data used for obtaining information has been determined, the data generated in each sub-time period is processed in turn according to the flow of Fig. 1, which completes the process of obtaining information from the data generated in every sub-time period.
In step 102 of the flow of Fig. 1, the process of determining the first information entropy corresponding to each itemset in the loaded data comprises, as shown in Fig. 2, the following steps:
Step 201: determine the data volume in the loaded data that satisfies the data attributes corresponding to the itemset.
Step 202: determine the total data volume of the loaded data.
Step 203: determine the first information entropy corresponding to the itemset according to the determined data volume satisfying the data attributes corresponding to the itemset and the total data volume.
This completes the process of determining the first information entropy corresponding to one itemset. In the above flow, steps 201 and 202 have no strict execution order: in practice, step 202 may be performed before step 201, or the two may be performed in parallel.
In steps 201 and 202 of the flow shown in Fig. 2, the data volume can be the number of data records or the storage space occupied by the data.
In step 203 of the flow shown in Fig. 2, determining the first information entropy corresponding to the itemset according to the determined data volume satisfying the data attributes corresponding to the itemset and the total data volume comprises:
determining the ratio of the data volume satisfying the data attributes corresponding to the itemset to the total data volume;
multiplying this ratio by the logarithm of the ratio, and taking the negative of the resulting product as the first information entropy corresponding to the itemset.
In the embodiments of the present invention, taking the logarithm of the ratio can be regarded as using the logarithmic function to spread the ratio over an interval. Because the ratio does not exceed 1, the value of the logarithm is not positive, so the negative of the product is taken as the first information entropy corresponding to the itemset.
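Steps 201 to 203 can be sketched as a single function. The name `information_entropy` is hypothetical; the natural logarithm and the convention that the entropy is 0 when the ratio is 0 follow the formula given for the implicit information entropy later in this description.

```python
import math


def information_entropy(matching, total):
    """Return -p * ln(p), where p = matching / total.

    Returns 0.0 when p is 0 (or when there is no data at all).
    """
    if total == 0 or matching == 0:
        return 0.0
    p = matching / total
    return -p * math.log(p)


# 250 of 1000 loaded records satisfy the itemset's attributes, so p = 0.25.
e = information_entropy(250, 1000)
print(round(e, 4))
```

Note that the value is also 0 when every record matches (p = 1), since ln 1 = 0.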
In step 103 of the flow of Fig. 1, the process of determining the second information entropy corresponding to each itemset in the data generated in all sub-time periods before the current sub-time period comprises, as shown in Fig. 3, the following steps:
Step 301: determine the data volume, in the data generated in all sub-time periods before the current sub-time period, that satisfies the data attributes corresponding to the itemset.
Step 302: determine the total data volume of the data generated in all sub-time periods before the current sub-time period.
Step 303: determine the second information entropy corresponding to the itemset according to the data volume satisfying the data attributes corresponding to the itemset and the total data volume.
This completes the process of determining the second information entropy corresponding to one itemset. In the above flow, steps 301 and 302 have no strict execution order: in practice, step 302 may be performed before step 301, or the two may be performed in parallel.
In steps 301 and 302 of the flow shown in Fig. 3, the data volume can be the number of data records or the storage space occupied by the data.
In step 303 of the flow shown in Fig. 3, determining the second information entropy corresponding to the itemset according to the data volume satisfying the data attributes corresponding to the itemset and the total data volume comprises:
determining the ratio of the data volume satisfying the data attributes corresponding to the itemset to the total data volume;
multiplying this ratio by the logarithm of the ratio, and taking the negative of the resulting product as the second information entropy corresponding to the itemset.
In the embodiments of the present invention, taking the logarithm of the ratio can be regarded as using the logarithmic function to spread the ratio over an interval. Because the ratio does not exceed 1, the value of the logarithm is not positive, so the negative of the product is taken as the second information entropy corresponding to the itemset.
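One possible way to carry out steps 301 to 303 without reloading earlier data is to keep running counts across sub-time periods. The sketch below assumes this bookkeeping; the class `CumulativeCounts` and its method names are hypothetical, and the text does not prescribe how the earlier data volumes are retained.

```python
import math
from collections import Counter


class CumulativeCounts:
    """Running data volumes over all sub-time periods processed so far."""

    def __init__(self):
        self.matching = Counter()  # itemset -> matching data volume so far
        self.total = 0             # total data volume so far

    def second_entropy(self, itemset):
        """Entropy of the itemset over all earlier sub-time periods."""
        if self.total == 0:        # first sub-time period: no earlier data
            return 0.0
        p = self.matching[itemset] / self.total
        return 0.0 if p == 0 else -p * math.log(p)

    def absorb(self, window_matching, window_total):
        """Fold the counts of a finished sub-time period into the history."""
        self.matching.update(window_matching)
        self.total += window_total


history = CumulativeCounts()
first_period_value = history.second_entropy("A")   # 0.0 before any data
history.absorb({"A": 250}, 1000)
history.absorb({"A": 250}, 1000)
later_value = history.second_entropy("A")          # p = 500 / 2000
```

Calling `second_entropy` before `absorb` for the current sub-time period keeps the current data out of the "all earlier sub-time periods" figure.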
In step 104 of the flow of Fig. 1, the process of updating the itemset set used to identify the obtained information according to the determined first information entropy and second information entropy corresponding to each itemset comprises, as shown in Fig. 4, the following steps:
Step 401: determine a first itemset set consisting of the itemsets for which the sum of the corresponding first information entropy and second information entropy reaches the first threshold, where each first itemset in the first itemset set is identified by its corresponding first information entropy and second information entropy;
Step 402: update the itemset set used to identify the obtained information using the determined first itemset set.
This completes the process of updating the itemset set used to identify the obtained information according to the determined first information entropy and second information entropy corresponding to each itemset.
In step 402 of the flow shown in Fig. 4, updating the itemset set used to identify the obtained information using the determined first itemset set specifically comprises:
if a first itemset in the first itemset set is included in the itemset set used to identify the obtained information, replacing the corresponding itemset in that set with the first itemset, identified by its corresponding first information entropy and second information entropy;
if an itemset in the itemset set used to identify the obtained information is not included in the first itemset set, deleting that itemset from the itemset set used to identify the obtained information.
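Steps 401 and 402 can be sketched as follows, under one reading of the replacement and deletion rules: itemsets already in the identifying set are kept (with refreshed entropies) when their entropy sum reaches the first threshold and are dropped otherwise. The function name, the dictionary layout and the example values are assumptions made for illustration.

```python
def update_identified_set(identified, first_entropy, second_entropy, first_threshold):
    """Re-evaluate the itemsets currently used to identify obtained information.

    identified:     dict itemset -> (second entropy, first entropy)
    first_entropy:  dict itemset -> entropy in the current sub-time period
    second_entropy: dict itemset -> entropy over all earlier sub-time periods
    """
    updated = {}
    for itemset in identified:
        e1 = first_entropy.get(itemset, 0.0)
        e2 = second_entropy.get(itemset, 0.0)
        if e1 + e2 >= first_threshold:
            updated[itemset] = (e2, e1)  # replace with the refreshed entropies
        # otherwise the itemset is dropped from the identifying set
    return updated


identified = {"A": (0.0, 0.3), "B": (0.0, 0.2)}
result = update_identified_set(
    identified,
    first_entropy={"A": 0.3, "B": 0.05},
    second_entropy={"A": 0.25, "B": 0.1},
    first_threshold=0.4,
)
print(result)
```

Here "A" stays because 0.3 + 0.25 reaches the threshold 0.4, while "B" is removed because 0.05 + 0.1 does not.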
Further, if the current sub-time period is not the last sub-time period of the time period in which the data is generated, then in step 104 above, updating the itemset set used to identify the obtained information according to the first information entropy and second information entropy corresponding to each itemset also comprises:
determining a second itemset set consisting of the itemsets whose corresponding first information entropy reaches the second threshold, where each second itemset in the second itemset set is identified by its corresponding first information entropy;
updating the itemset set used to identify the obtained information using the second itemset set.
Updating the itemset set used to identify the obtained information using the second itemset set comprises:
adding to the itemset set used to identify the obtained information those second itemsets in the second itemset set that are not yet included in it.
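The addition of second itemsets can be sketched in the same style. The function name and dictionary layout are hypothetical; storing a newly added itemset with a second information entropy of 0 follows the triple form (itemset, 0, implicit information entropy) used in the worked flow later in this description.

```python
def add_new_candidates(identified, first_entropy, second_threshold):
    """Add itemsets whose first information entropy reaches the second threshold.

    identified:    dict itemset -> (second entropy, first entropy), modified in place
    first_entropy: dict itemset -> entropy in the current sub-time period
    """
    for itemset, e1 in first_entropy.items():
        if e1 >= second_threshold and itemset not in identified:
            identified[itemset] = (0.0, e1)  # new entry starts with second entropy 0
    return identified


identified = {"A": (0.25, 0.3)}
add_new_candidates(identified, {"A": 0.35, "B": 0.32, "C": 0.1}, second_threshold=0.3)
print(sorted(identified))
```

"A" is left untouched because it is already present, "B" is added, and "C" stays out because its entropy is below the threshold.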
The above embodiment describes in detail the process of obtaining information from data in the technical scheme provided by the present invention. For a better understanding of the embodiments of the present invention, the complete processing flow is described below for the case in which the data used for obtaining information is user call detail records.
Before the method for obtaining information provided by Embodiment 1 is performed, the following settings are made:
Set the itemset implicit information entropy threshold $E_{p0}$, where the implicit information entropy corresponds to the first information entropy described above and the threshold $E_{p0}$ corresponds to the second threshold described above;
Set the itemset information entropy threshold $E_p$, where the information entropy is the sum of the implicit information entropy and the cumulative information entropy, the cumulative information entropy corresponds to the second information entropy described above, and the threshold $E_p$ corresponds to the first threshold described above;
Set the number of time windows $|W|$, where a time window corresponds to a sub-time period described above and is used to time each sub-time period; that is, the sliding interval of the time window corresponds to the time interval of a sub-time period;
Set the time-window database tables, each corresponding to one of the configured time windows and used to load the data generated in that time window;
Set the output potential frequent itemset set ITEM, where each itemset in the set can be represented by a triple {itemset, cumulative information entropy, implicit information entropy}; this potential frequent itemset set ITEM corresponds to the itemset set used to identify the obtained information described above.
In the above settings, the threshold $E_{p0}$ can be set with reference to the following:
1. Segment the probability distribution interval of the implicit information entropy of all itemsets ($p'_i$, $i = 1, 2, \ldots, n$), then obtain the itemset implicit information entropy threshold $E_{p0}$ according to the following formula:
Where:
The probability distribution interval of the implicit information entropy of all itemsets represents the distribution of the implicit information entropy of all itemsets; its endpoints are the minimum and the maximum of the implicit information entropy over all itemsets;
Segmenting the probability distribution interval of the implicit information entropy of all itemsets means dividing the determined interval into several sub-intervals. The number of sub-intervals can be chosen according to the actual length of the probability distribution interval; for example, if the probability distribution interval is [0, 0.5], it can be divided into 5 sub-intervals, each of length 0.1;
$p'_i$ is the right endpoint of the i-th sub-interval, and n is the number of sub-intervals.
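The segmentation itself can be illustrated with a short sketch that computes the right endpoints of equal sub-intervals. The function name `right_endpoints` is hypothetical, and the sketch deliberately stops at the endpoints: it does not compute the threshold, because the threshold formula is not reproduced in the text.

```python
def right_endpoints(low, high, n):
    """Split [low, high] into n equal sub-intervals and return their right endpoints."""
    if n <= 0 or high <= low:
        raise ValueError("need n > 0 and high > low")
    width = high - low
    return [low + width * i / n for i in range(1, n + 1)]


# The example from the text: [0, 0.5] divided into 5 sub-intervals of length 0.1.
endpoints = right_endpoints(0.0, 0.5, 5)
print(endpoints)
```

The i-th value in the returned list plays the role of the right endpoint of the i-th sub-interval.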
2. In the stable situation (every time window is loaded and running), segment the probability distribution interval of the cumulative information entropy of all itemsets ($p''_i$, $i = 1, 2, \ldots, n$), then obtain the itemset information entropy threshold according to the following formula:
[Formula image GDA00002738035900102, not reproduced in the text]
Where:
$p''_i$ is the right endpoint of the i-th sub-interval, and n is the number of sub-intervals.
After the above settings are complete, obtaining information from the data mainly comprises, as shown in Fig. 5, the following steps 501 to 508:
Step 501: the initial frequent candidate set ITEM is empty; the concurrent loading process is started, and the user call detail records are imported into the first time-window database table.
Step 502: according to the attributes selected by the data source attribute control generator, calculate the implicit information entropy of each itemset obtained from each attribute combination, and merge each itemset $item_{1t}$ whose implicit information entropy is $\ge E_{p0}$ into ITEM in the form ($item_{1t}$, 0, implicit information entropy).
In step 502, the implicit information entropy is calculated with the following formula:
implicit information entropy $= -p_i \ln p_i$ (in particular, when $p_i = 0$, the implicit information entropy is 0),
where $p_i$ = (data volume corresponding to the itemset in the current time window) / (total data volume of the current time window).
In step 502, the attributes selected by the data source attribute control generator are the predefined data attributes used for obtaining information.
In step 502, the process of calculating the implicit information entropy of each itemset obtained from each attribute combination is the process of calculating the first information entropy of each itemset in step 102 above; it has been described in detail in the above embodiment and is not repeated here.
Step 503: the time window slides. Import the user call detail records into the next time-window database table, and take that table as the current time-window database table.
Step 504: from the data in the current time-window database table, and according to the attributes selected by the data-source attribute control generator, calculate the implicit information entropy of each itemset obtained from each attribute combination.
Step 505: merge each itemset $item_{it}$ whose implicit information entropy calculated in step 504 is >= $E_{p0}$ and which is not yet in ITEM into ITEM in the form ($item_{it}$, 0, implicit information entropy).
Step 506: for the itemsets in ITEM, calculate the cumulative information entropy of each itemset. For each itemset $item_t$ whose cumulative information entropy plus implicit information entropy in the current time window is >= $E_p$, replace the corresponding entry in ITEM with ($item_t$, cumulative information entropy, implicit information entropy of the current time window); otherwise, delete the itemset $item_t$ from ITEM.
In step 506, the cumulative information entropy is calculated by the following formula:
Cumulative information entropy = $-p_{i-1} \ln p_{i-1}$ (in particular, when $p_{i-1} = 0$, the cumulative information entropy is 0),
where $p_{i-1}$ = (amount of data corresponding to the itemset before the current time window) / (total amount of data before the current time window).
Specifically, in steps 502 and 506, the amount of data can be expressed as the number of call detail records.
Step 507: once the itemsets of the current time-window database table have been processed, return to step 503, until the data of all sub time periods have been processed.
In particular, in step 507, when the number of sliding time windows exceeds |W|, the earliest time-window database table is replaced in queue order: the data in the first time-window database table are deleted and the data of the (|W|+1)-th sub time period are imported into it, and so on.
Step 508: the time window stops sliding; output the potential frequent itemset set ITEM.
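Steps 501 to 508 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: each time window is modelled as an in-memory list of already-combined itemsets rather than a database table, the names `entropy_term` and `mine_itemsets` are hypothetical, step 506 is applied only to itemsets that were in ITEM before the current window was loaded, and running counts are kept for all earlier windows instead of modelling the replacement of old tables.

```python
import math
from collections import Counter


def entropy_term(count, total):
    """Return -p ln p for p = count / total, taking the value as 0 when p is 0."""
    if count == 0 or total == 0:
        return 0.0
    p = count / total
    return -p * math.log(p)


def mine_itemsets(windows, e_p0, e_p):
    """Return ITEM as {itemset: (cumulative entropy, implicit entropy)}.

    windows: one list of hashable itemsets per time window (assumed input shape).
    e_p0, e_p: the two thresholds named E_p0 and E_p in the text.
    """
    item = {}                 # step 501: ITEM starts empty
    past_counts = Counter()   # occurrences of each itemset in all earlier windows
    past_total = 0            # number of records in all earlier windows
    for index, records in enumerate(windows):
        counts = Counter(records)
        total = len(records)
        # Steps 502 / 504: implicit entropy of every itemset in this window.
        implicit = {s: entropy_term(c, total) for s, c in counts.items()}
        already_in_item = list(item)
        # Steps 502 / 505: admit new candidates as (itemset, 0, implicit entropy).
        for s, h in implicit.items():
            if s not in item and h >= e_p0:
                item[s] = (0.0, h)
        # Step 506 (from the second window on): keep or delete existing entries.
        if index > 0:
            for s in already_in_item:
                cumulative = entropy_term(past_counts[s], past_total)
                current = implicit.get(s, 0.0)
                if cumulative + current >= e_p:
                    item[s] = (cumulative, current)
                else:
                    del item[s]
        past_counts.update(counts)
        past_total += total
    return item               # step 508: output ITEM


# Toy data, not taken from the patent: two windows of ten records each.
windows = [["a"] * 3 + ["b"] * 7, ["a"] * 5 + ["c"] * 5]
print(sorted(mine_itemsets(windows, e_p0=0.3, e_p=0.6)))  # ['a', 'c']
```

In the toy run, "a" enters ITEM in the first window and survives step 506 in the second, "b" never reaches the admission threshold, and "c" is admitted in the second window.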
Following the flow shown in Fig. 5, the real-time data stream is first imported into the 1st time-window database table. For each itemset $item_{1t}$ obtained from the at least one data attribute combination output by the attribute control generator, the implicit information entropy is calculated, and the triple of each itemset satisfying implicit information entropy >= $E_{p0}$ is merged into ITEM. Then, when the data of the 2nd sub time period arrive, the user call detail record stream is imported into the 2nd time-window database table. As with the 1st table, the implicit information entropy of each itemset obtained from the at least one data attribute combination output by the attribute control generator is first calculated for the 2nd table, and the itemsets not yet in ITEM are merged into ITEM. The cumulative information entropy is then calculated: the triple of each itemset satisfying (cumulative information entropy + implicit information entropy of the current time window) >= $E_p$ replaces the original frequent candidate triple; otherwise, that frequent candidate triple is deleted from ITEM. This process is repeated until the |W|-th time-window database table has been processed, at which point data have been imported into all of the time-window database tables. This completes obtaining information from the determined data.
At the next data stream import time, when new data for obtaining information are determined and processed, the oldest time-window database table is cleared (at this point the 1st time-window database table, and so on) and the latest data stream is imported into it.
By calculating the information entropy of each itemset under each sliding time window, the above process determines the potential frequent itemset set ITEM. It mines frequent itemsets over consecutive time windows while largely preserving the result that knowledge discovery over the global database would give. The algorithm greatly reduces the complexity of obtaining information.
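The reuse of the oldest time-window table can be sketched as a fixed set of slots written in circular order. This is an illustrative Python sketch, assuming the tables are held as in-memory slots; the class name `WindowTables` and its method `load` are hypothetical and not taken from the patent.

```python
class WindowTables:
    """A fixed number of time-window slots that are reused in circular order."""

    def __init__(self, size):
        self.slots = [None] * size   # one slot per time-window table
        self.loaded = 0              # number of batches loaded so far

    def load(self, batch):
        """Store a batch in the next slot, overwriting the oldest once all are full."""
        slot = self.loaded % len(self.slots)
        self.slots[slot] = batch     # clearing and re-importing in one step
        self.loaded += 1
        return slot


tables = WindowTables(5)
for name in ["w1", "w2", "w3", "w4", "w5", "w6"]:
    used = tables.load(name)
print(used, tables.slots)  # 0 ['w6', 'w2', 'w3', 'w4', 'w5']
```

With five slots, the sixth batch lands in slot 0, which corresponds to clearing the 1st table and importing the latest data into it.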
An embodiment is described in detail below, taking the mining of voice data in the communications field as an example. Suppose 5 temporary tables are set up in the database, one per time window (time_win1, time_win2, ..., time_win5; |W| = 5). The main process of obtaining information is as follows:
1. Import 30,000 user call detail records into the 1st time-window temporary table time_win1 through 6 concurrent channels;
2. According to the data attributes output by the attribute controller, such as customer brand, number of calls, call type, call cell and average call duration, calculate the implicit information entropy of each itemset obtained from each data attribute combination. For example, one combination yields the itemset: Global Link Olympic 88 customer_88 (customer brand)_local call (call type)_23005_03133 (call cell code)_300~600sec (average call duration). This itemset occurs 120 times in the call detail records of this base station cell (group), and the base station cell (group) has 2500 call records in total in this time window, so the implicit information entropy of this attribute combination in this time window = -(120/2500) ln(120/2500) = 0.146. If $E_{p0}$ = 0.12, the itemset "Global Link Olympic 88 customer_88_local call_23005_03133 (call cell code)_300~600sec" can be merged into ITEM;
3. After an interval of 10 minutes (the configured time window length), import the 30,000 newly generated customer voice records into the 2nd time-window temporary table time_win2;
4. According to the data attributes output by the attribute controller, such as customer brand, number of calls, call type, call cell and average call duration, calculate the implicit information entropy of each itemset obtained from each attribute combination. For each itemset $item_{it}$ whose calculated implicit information entropy is >= $E_{p0}$ and which is not in ITEM, merge ($item_{it}$, 0, implicit information entropy) into ITEM. For example, the itemset "M-ZONE standard customer_156_local call_23014_04165 (call cell code)_0~300sec" in the 2nd time-window temporary table time_win2 is not in ITEM, but its implicit information entropy is >= $E_{p0}$, so this attribute combination is merged into ITEM.
For each itemset $item_{it}$ already in ITEM, calculate the cumulative information entropy. Each itemset with (cumulative information entropy + implicit information entropy of the current time window) >= $E_p$ updates its corresponding entry in ITEM; each itemset for which this sum is below $E_p$ is deleted from ITEM. For example, the implicit information entropy of "Global Link Olympic 88 customer_88_local call_23005_03133 (call cell code)_300~600sec" in the 2nd time window = -(180/2500) ln(180/2500) = 0.189438, so for this attribute combination cumulative information entropy + implicit information entropy = 0.146 + 0.189 ≈ 0.335. Note that because this is the 2nd window, the cumulative information entropy can be taken directly from the implicit information entropy of the 1st window; for the 3rd, 4th and 5th windows it must be calculated with the cumulative information entropy formula.
In this way, the itemsets in ITEM are kept up to date effectively.
5. Proceed in the same way until all 5 windows have been calculated.
This completes the processing of the current batch of data for obtaining information. If the next batch of data needs to be processed, the 6th time interval is deemed to have arrived: the window that has been occupied longest, time_win1, is emptied, the customer records of the 6th time window are imported, and ITEM is updated again according to step 4 above, until the data stream ends.
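The figures in the worked example above can be checked with a few lines of Python. The helper name `implicit_entropy` is a hypothetical choice; the counts (120 and 180 occurrences out of 2500 records) and the threshold $E_{p0}$ = 0.12 are the ones given in the text.

```python
import math


def implicit_entropy(count, total):
    """Return -p ln p for p = count / total, taking the value as 0 when p is 0."""
    if count == 0 or total == 0:
        return 0.0
    p = count / total
    return -p * math.log(p)


first = implicit_entropy(120, 2500)    # 1st time window
second = implicit_entropy(180, 2500)   # 2nd time window
print(round(first, 3), round(second, 3), round(first + second, 3))  # 0.146 0.189 0.335
print(first >= 0.12)                   # True: the itemset is admitted to ITEM
```

For the 2nd window the cumulative information entropy equals the implicit information entropy of the 1st window, so the sum compared against $E_p$ is about 0.335.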
Embodiment 2
Embodiment 2 of the present invention provides a method for obtaining information which, building on Embodiment 1 above, optimizes the itemset set used to identify the obtained information that Embodiment 1 produces.
Specifically, in the method provided by Embodiment 1, after the itemset set used to identify the obtained information is updated according to the first information entropy and the second information entropy corresponding to each itemset (step 104 above), the following step is further performed:
According to the configured data attributes to be extracted and the data attributes corresponding to each itemset in the itemset set used to identify the obtained information, group the itemsets stored in that itemset set.
In the technical scheme of Embodiment 2, after the data for obtaining information have been processed as in the above embodiment (called primary processing), a potential frequent itemset set ITEM satisfying the information entropy conditions is obtained. The itemsets in ITEM are then further classified according to the requirements of the data stream analysis, and the frequent itemsets of greater practical significance are extracted (called secondary processing), so that the obtained information is presented more intuitively. The analysis results can also be summarized as data knowledge and incorporated into an expert knowledge base, to further advance the knowledge discovery of potential frequent itemsets (that is, valuable information) in the data stream.
For example, the information in the potential frequent itemset set ITEM obtained through primary processing is shown in the following table:
Figure GDA00002738035900151
The explicit rule-type knowledge above then undergoes knowledge externalization: a file management system is created that classifies, organizes and stores it according to the classification framework or standard of the knowledge demand groups, so as to identify the similarity of the knowledge contained in the itemsets. The file organization of the itemsets after knowledge externalization is shown in the following table:
Figure GDA00002738035900152
where Class_1, ..., Class_M each correspond to a different data attribute; each may be a single data attribute or a combination of several data attributes, and the specific data attributes are determined by business requirements.
The itemsets classified, organized and stored according to the classification framework of the knowledge demand groups already form preliminary decision-support knowledge that serves different knowledge consumers, achieving real-time and effective knowledge generation. The classified itemsets obtained at this point can further undergo internalization: the preliminary decision-support knowledge is merged with, and used to update, the historical knowledge in the expert knowledge base, adding more effective and composite knowledge descriptions for the itemsets. In the IT system, the unstructured information is handled centrally in the expert knowledge base, organized and stored in hierarchical and list structures, and presented to different knowledge consumers with suitable semantic descriptions.
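The grouping by Class_1, ..., Class_M can be sketched in Python as follows. This is an illustration under assumptions: each itemset is represented as a dictionary of attribute names and values, a class is a tuple of attribute names, and the function name `group_itemsets` and the attribute keys are hypothetical.

```python
from collections import defaultdict


def group_itemsets(itemsets, class_attributes):
    """Group itemsets by their values for the attributes that define a class."""
    groups = defaultdict(list)
    for itemset in itemsets:
        key = tuple(itemset.get(name) for name in class_attributes)
        groups[key].append(itemset)
    return dict(groups)


# Illustrative itemsets; the attribute names are assumptions for this sketch.
item = [
    {"brand": "Global Link Olympic 88", "call_type": "local call", "duration": "300~600sec"},
    {"brand": "Global Link Olympic 88", "call_type": "local call", "duration": "0~300sec"},
    {"brand": "M-ZONE standard", "call_type": "local call", "duration": "0~300sec"},
]
by_brand = group_itemsets(item, ("brand",))
print({key: len(members) for key, members in by_brand.items()})
```

A class built from several attributes, for example `("brand", "call_type")`, works the same way, which matches the statement that a class may be a single attribute or a combination.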
The itemsets contained in the ITEM obtained according to Embodiments 1 and 2 of the present invention can guide subsequent business. For example, ITEM contains the itemsets:
Global Link Olympic 88 customer_88_local call_23005_03133 (call cell code)_300~600sec;
Global Link Olympic 88 customer_356_local call_23005_03133 (call cell code)_2300~300sec;
From these two itemsets, one can compare the call durations of Global Link Olympic 88 plan customers in the specified base station cell within the specified time period. This gives business personnel a call duration comparison for customers of the same plan at a specified location, which helps guide the choice of call duration intervals when designing plan products.
The above is only a simple illustration. In practice, subsequent business adjustments can be made flexibly with reference to the information in the obtained ITEM, according to specific business needs. For example, network management personnel can use itemsets that combine the base station cell call volume and base station equipment availability attributes to obtain reference information for base station cell expansion planning; further examples are not enumerated here.
Embodiment 3
Corresponding to Embodiment 1 above, Embodiment 3 of the present invention provides a device for obtaining information. As shown in Fig. 6, the device comprises:
a data loading unit 601, a first information entropy determining unit 602, a second information entropy determining unit 603 and an itemset set updating unit 604;
where:
the data loading unit 601 is configured to determine the data for obtaining information, divide the time period in which the data are generated into a plurality of sub time periods, and load the data generated in the current sub time period;
the first information entropy determining unit 602 is configured to determine the first information entropy corresponding to each itemset obtained from at least one predefined data attribute combination in the data loaded by the data loading unit 601;
the second information entropy determining unit 603 is configured to determine the second information entropy corresponding to each itemset in the data loaded by the data loading unit 601 in all sub time periods before the current sub time period;
the itemset set updating unit 604 is configured to update the itemset set used to identify the obtained information according to the first information entropy determined by the first information entropy determining unit 602 and the second information entropy determined by the second information entropy determining unit 603 for each itemset.
As shown in Fig. 7, in a preferred embodiment of the present invention, the data loading unit 601 of the device shown in Fig. 6 may specifically comprise:
a time period dividing module 601A, configured to determine the data for obtaining information and divide the time period in which the data are generated into a plurality of sub time periods of equal interval, where the interval is greater than or equal to the estimated time needed to obtain information from the data of each resulting sub time period;
a loading module 601B, configured to time each sub time period divided by the time period dividing module 601A and, after the current sub time period ends, load the data generated in the current sub time period.
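The division performed by module 601A can be sketched in Python as follows. This is a minimal sketch, assuming the time period is given as two `datetime` values; the function name `split_period` and its parameters are hypothetical. The check at the top reflects the stated condition that the interval is at least the estimated processing time for one sub time period.

```python
from datetime import datetime, timedelta


def split_period(start, end, interval, estimated_processing):
    """Divide [start, end) into consecutive sub time periods of length `interval`."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if interval < estimated_processing:
        raise ValueError("interval must cover the estimated processing time")
    periods = []
    cursor = start
    while cursor < end:
        nxt = min(cursor + interval, end)   # the last period may be shorter
        periods.append((cursor, nxt))
        cursor = nxt
    return periods


periods = split_period(
    datetime(2010, 1, 1, 0, 0),
    datetime(2010, 1, 1, 0, 50),
    interval=timedelta(minutes=10),
    estimated_processing=timedelta(minutes=2),
)
print(len(periods))  # 5
```

The dates and the 2-minute processing estimate are made up for the example; the 10-minute interval matches the time window length used in the voice data example.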
As shown in Fig. 8, in a preferred embodiment of the present invention, the first information entropy determining unit 602 of the device shown in Fig. 6 comprises:
a first data amount determining module 602A, configured to determine, in the loaded data, the amount of data matching the data attributes corresponding to the itemset, and the total amount of the loaded data;
a first information entropy determining module 602B, configured to determine the first information entropy corresponding to the itemset from the amount of data matching the data attributes corresponding to the itemset and the total amount of data, both determined by the first data amount determining module 602A.
Further, the first information entropy determining module 602B shown in Fig. 8 is specifically configured to:
determine the ratio of the amount of data matching the data attributes corresponding to the itemset to the total amount of data;
multiply the ratio by the logarithm of the ratio, and take the negative of the resulting product as the first information entropy corresponding to the itemset.
As shown in Fig. 9, in a preferred embodiment of the present invention, the second information entropy determining unit 603 of the device shown in Fig. 6 comprises:
a second data amount determining module 603A, configured to determine, in the data generated in all sub time periods before the current sub time period, the amount of data matching the data attributes corresponding to the itemset, and the total amount of data generated in all sub time periods before the current sub time period;
a second information entropy determining module 603B, configured to determine the second information entropy corresponding to the itemset from the amount of data matching the data attributes corresponding to the itemset and the total amount of data, both determined by the second data amount determining module 603A.
Further, the second information entropy determining module 603B shown in Fig. 9 is specifically configured to:
determine the ratio of the amount of data matching the data attributes corresponding to the itemset to the total amount of data;
multiply the ratio by the logarithm of the ratio, and take the negative of the resulting product as the second information entropy corresponding to the itemset.
As shown in Fig. 10, in a preferred embodiment of the present invention, the itemset set updating unit 604 of the device shown in Fig. 6 comprises:
a first itemset set determining module 604A, configured to determine a first itemset set in which the sum of the corresponding first information entropy and second information entropy reaches a first threshold, where each first itemset in the first itemset set is identified by its corresponding first information entropy and second information entropy;
a first updating module 604B, configured to update the itemset set used to identify the obtained information with the first itemset set determined by the first itemset set determining module 604A.
Further, the first updating module 604B shown in Fig. 10 is specifically configured to:
when a first itemset in the first itemset set is included in the itemset set used to identify the obtained information, replace the corresponding itemset in that itemset set with the first itemset identified by its corresponding first information entropy and second information entropy;
when a first itemset in the first itemset set is not included in the itemset set used to identify the obtained information, delete the corresponding itemset from the itemset set used to identify the obtained information.
As shown in Fig. 11, in a preferred embodiment of the present invention, the itemset set updating unit 604 of the device shown in Fig. 10 further comprises:
a second itemset set determining module 604C, configured to, when the current sub time period is not the last sub time period of the time period in which the data are generated, determine a second itemset set in which the corresponding first information entropy reaches a second threshold, where each second itemset in the second itemset set is identified by its corresponding first information entropy;
a second updating module 604D, configured to update the itemset set used to identify the obtained information with the second itemset set determined by the second itemset set determining module.
Further, the second updating module 604D shown in Fig. 11 is configured to:
add to the itemset set used to identify the obtained information each second itemset in the second itemset set that is not yet included in that itemset set.
It should be understood that the units and modules of the above device for obtaining information are only a logical division according to the functions the device implements; in practice, these units and modules may be combined or split. The functions implemented by the device provided in Embodiment 3 correspond one-to-one to the method flow provided in Embodiment 1 above. The more detailed processing flow implemented by the device has been described in detail in Embodiment 1 and is not repeated here.
Embodiment 4
Corresponding to Embodiment 2 above, Embodiment 4 of the present invention provides a device for obtaining information. As shown in Fig. 12, the device builds on the device of Fig. 6 provided in Embodiment 3 and further comprises:
a grouping processing unit 605, configured to, after the itemset set used to identify the obtained information has been updated according to the first information entropy and the second information entropy corresponding to each itemset, group the itemsets stored in that itemset set according to the configured data attributes to be extracted and the data attributes corresponding to each itemset in that itemset set.
It should be understood that the units and modules of the above device for obtaining information are only a logical division according to the functions the device implements; in practice, these units and modules may be combined or split. The functions implemented by the device provided in Embodiment 4 correspond one-to-one to the method flow provided in Embodiment 2 above. The more detailed processing flow implemented by the device has been described in detail in Embodiment 2 and is not repeated here.
In the embodiments of the present invention, the devices for obtaining information provided in Embodiments 3 and 4 can be deployed on a single machine, for example in a small network environment or a test system. They can also be deployed in a cluster, for example in a medium or large network environment: the units that perform primary processing (the units of Embodiment 3) can be deployed on each processing node, and the unit that performs secondary processing (the unit further included in Embodiment 4) can be deployed on the management node.
In at least one of the technical schemes provided by the embodiments of the present invention, the data for obtaining information are determined in advance, and the time period in which the data are generated is divided into a plurality of sub time periods. For each sub time period, the following is performed: load the data generated in the current sub time period; determine the first information entropy corresponding to each itemset obtained from at least one predefined data attribute combination in the loaded data; determine the second information entropy corresponding to each itemset in the data generated in all sub time periods before the current sub time period; and update the itemset set used to identify the obtained information according to the first information entropy and the second information entropy corresponding to each itemset. With this technical scheme, the data for obtaining information are divided into a plurality of sub time periods according to when they were generated, only the data of one sub time period are loaded at a time, and the itemset set used to identify the obtained information is updated from the data generated in that sub time period. Compared with the prior art, the task of obtaining information from the data is split into multiple executions, which greatly reduces the amount of data handled each time, thereby improving the efficiency of obtaining information and reducing system overhead.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (21)

1. A method for obtaining information, characterized in that data for obtaining information are determined in advance, and the time period in which the data are generated is divided into a plurality of sub time periods;
for each sub time period, the following is performed:
loading the data generated in the current sub time period;
determining the first information entropy corresponding to each itemset obtained from at least one predefined data attribute combination in the loaded data;
determining the second information entropy corresponding to each itemset in the data generated in all sub time periods before the current sub time period;
updating the itemset set used to identify the obtained information according to the first information entropy and the second information entropy corresponding to each itemset;
grouping the itemsets stored in the itemset set used to identify the obtained information, according to the configured data attributes to be extracted and the data attributes corresponding to each itemset in that itemset set.
2. The method of claim 1, characterized in that dividing the time period in which the data are generated into a plurality of sub time periods comprises:
dividing the time period in which the data are generated into a plurality of sub time periods of equal interval;
where the interval is greater than or equal to the estimated time needed to obtain information from the data of each resulting sub time period.
3. The method of claim 1, characterized in that determining the first information entropy corresponding to each itemset in the loaded data comprises:
determining, in the loaded data, the amount of data matching the data attributes corresponding to the itemset, and the total amount of the loaded data;
determining the first information entropy corresponding to the itemset from the amount of data matching the data attributes corresponding to the itemset and the total amount of data.
4. The method of claim 3, characterized in that determining the first information entropy corresponding to the itemset from the amount of data matching the data attributes corresponding to the itemset and the total amount of data comprises:
determining the ratio of the amount of data matching the data attributes corresponding to the itemset to the total amount of data;
multiplying the ratio by the logarithm of the ratio, and taking the negative of the resulting product as the first information entropy corresponding to the itemset.
5. The method of claim 1, characterized in that determining the second information entropy corresponding to each itemset in the data generated in all sub time periods before the current sub time period comprises:
determining, in the data generated in all sub time periods before the current sub time period, the amount of data matching the data attributes corresponding to the itemset, and the total amount of data generated in all sub time periods before the current sub time period;
determining the second information entropy corresponding to the itemset from the amount of data matching the data attributes corresponding to the itemset and the total amount of data.
6. The method of claim 5, characterized in that determining the second information entropy corresponding to the itemset from the amount of data matching the data attributes corresponding to the itemset and the total amount of data comprises:
determining the ratio of the amount of data matching the data attributes corresponding to the itemset to the total amount of data;
multiplying the ratio by the logarithm of the ratio, and taking the negative of the resulting product as the second information entropy corresponding to the itemset.
7. The method of claim 3 or 5, characterized in that the amount of data is:
the number of data records; or
the storage space occupied by the data.
8. The method of claim 1, characterized in that updating the itemset set used to identify the obtained information according to the first information entropy and the second information entropy corresponding to each itemset comprises:
determining a first itemset set in which the sum of the corresponding first information entropy and second information entropy reaches a first threshold, where each first itemset in the first itemset set is identified by its corresponding first information entropy and second information entropy;
updating the itemset set used to identify the obtained information with the first itemset set.
9. The method of claim 8, characterized in that updating the itemset set used to identify the obtained information with the first itemset set comprises:
if a first itemset in the first itemset set is included in the itemset set used to identify the obtained information, replacing the corresponding itemset in that itemset set with the first itemset identified by its corresponding first information entropy and second information entropy;
if a first itemset in the first itemset set is not included in the itemset set used to identify the obtained information, deleting the corresponding itemset from the itemset set used to identify the obtained information.
10. The method of claim 8, characterized in that, if the current sub-time period is not the last sub-time period of the time period in which the data is generated, updating the set of itemsets used to identify the obtained information according to the first information entropy and the second information entropy corresponding to each itemset further comprises:
Determining a second set of itemsets whose corresponding first information entropy reaches a second threshold, wherein each second itemset in the second set of itemsets is labeled with its corresponding first information entropy;
Updating the set of itemsets used to identify the obtained information using the second set of itemsets.
11. The method of claim 10, characterized in that updating the set of itemsets used to identify the obtained information using the second set of itemsets comprises:
Adding to the set of itemsets used to identify the obtained information each second itemset in the second set of itemsets that is not already included in it.
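Claims 10 and 11 together describe an additive update that applies only before the last sub-time period. A minimal Python sketch follows, assuming the tracked set is a dict mapping each itemset to a tuple of entropy labels; the name add_second_itemsets, the dict layout, and the is_last_sub_period flag are hypothetical and not taken from the patent.

```python
from typing import Dict, FrozenSet, Tuple

Itemset = FrozenSet[Tuple[str, str]]  # hypothetical itemset representation


def add_second_itemsets(
    tracked: Dict[Itemset, Tuple[float, ...]],
    first_entropy: Dict[Itemset, float],
    second_threshold: float,
    is_last_sub_period: bool,
) -> Dict[Itemset, Tuple[float, ...]]:
    """Add itemsets whose first entropy reaches the second threshold.

    Itemsets already tracked are left untouched; new ones are labeled with
    their first information entropy only. Nothing is added in the last
    sub-time period.
    """
    updated = dict(tracked)  # do not mutate the caller's dict
    if is_last_sub_period:
        return updated
    for itemset, h1 in first_entropy.items():
        if h1 >= second_threshold and itemset not in updated:
            updated[itemset] = (h1,)
    return updated


if __name__ == "__main__":
    a = frozenset({("plan", "A")})
    b = frozenset({("plan", "B")})
    tracked = {a: (0.3, 0.3)}
    print(add_second_itemsets(tracked, {a: 0.4, b: 0.25}, 0.2, False))
```

One plausible reading of the "not the last sub-time period" condition is that a newly added candidate only matters if a later sub-time period can still confirm it.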
12. A device for obtaining information, characterized in that it comprises:
A data loading unit, configured to determine the data to be used for obtaining information, divide the time period in which the data is generated into a plurality of sub-time periods, and load the data generated in the current sub-time period;
A first information entropy determining unit, configured to determine the first information entropy corresponding to each itemset obtained from at least one preset combination of data attributes in the data loaded by the data loading unit;
A second information entropy determining unit, configured to determine the second information entropy corresponding to each itemset in the data loaded by the data loading unit in all sub-time periods before the current sub-time period;
An itemset set updating unit, configured to update the set of itemsets used to identify the obtained information according to the first information entropy corresponding to each itemset determined by the first information entropy determining unit and the second information entropy determined by the second information entropy determining unit;
A grouping processing unit, configured to, after the set of itemsets used to identify the obtained information has been updated according to the first information entropy and the second information entropy corresponding to each itemset, group the itemsets stored in that set according to the preset data attribute to be extracted and the data attributes corresponding to each itemset in that set.
13. The device of claim 12, characterized in that the data loading unit comprises:
A time period division module, configured to determine the data to be used for obtaining information and divide the time period in which the data is generated into a plurality of sub-time periods at equal time intervals, wherein the interval is greater than or equal to the estimated duration required to obtain information from the data of each resulting sub-time period;
A loading module, configured to time each sub-time period produced by the time period division module and, after the current sub-time period ends, load the data generated in the current sub-time period.
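The division performed by the time period division module might be sketched as below. The function name split_time_period and its parameters are hypothetical, and the handling of a time period that is not an exact multiple of the interval (a shorter final sub-time period) is an assumption the claim does not address.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def split_time_period(
    start: datetime,
    end: datetime,
    interval: timedelta,
    estimated_processing: timedelta,
) -> List[Tuple[datetime, datetime]]:
    """Split [start, end) into consecutive sub-time periods of equal length.

    The interval must be at least the estimated time needed to process one
    sub-time period, so processing can finish before the next one ends.
    """
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if interval < estimated_processing:
        raise ValueError("interval is shorter than the estimated processing time")
    periods = []
    cursor = start
    while cursor < end:
        # Assumption: a trailing remainder becomes a shorter last sub-period.
        nxt = min(cursor + interval, end)
        periods.append((cursor, nxt))
        cursor = nxt
    return periods


if __name__ == "__main__":
    day_start = datetime(2010, 9, 25, 0, 0)
    for sub in split_time_period(day_start, day_start + timedelta(hours=4),
                                 timedelta(hours=1), timedelta(minutes=30)):
        print(sub)
```

A loading module would then wait until the end of each returned sub-time period before loading the data generated in it.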
14. The device of claim 12, characterized in that the first information entropy determining unit comprises:
A first data volume determination module, configured to determine, in the loaded data, the volume of data that satisfies the data attributes corresponding to the itemset, and the total data volume of the loaded data;
A first information entropy determination module, configured to determine the first information entropy corresponding to the itemset according to the volume of data satisfying the data attributes corresponding to the itemset and the total data volume determined by the first data volume determination module.
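The two quantities produced by the first data volume determination module can be illustrated with a short Python sketch. It assumes that data volume is measured as a number of records, that each record is a dict of attribute values, and that an itemset is a set of (attribute, value) pairs; the name count_matching and the sample attributes are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Itemset = FrozenSet[Tuple[str, str]]  # hypothetical itemset representation


def count_matching(records: List[Dict[str, str]], itemset: Itemset) -> Tuple[int, int]:
    """Return (matching record count, total record count) for one itemset.

    A record matches when it carries every (attribute, value) pair of the itemset.
    """
    matching = sum(
        1
        for record in records
        if all(record.get(attr) == value for attr, value in itemset)
    )
    return matching, len(records)


if __name__ == "__main__":
    loaded = [
        {"region": "north", "plan": "A"},
        {"region": "north", "plan": "B"},
        {"region": "south", "plan": "A"},
        {"region": "north", "plan": "A"},
    ]
    target = frozenset({("region", "north"), ("plan", "A")})
    print(count_matching(loaded, target))
```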
15. The device of claim 14, characterized in that the first information entropy determination module is specifically configured to:
Determine the ratio of the volume of data satisfying the data attributes corresponding to the itemset to the total data volume;
Multiply the ratio by the logarithm of the ratio, and take the negative of the resulting product as the first information entropy corresponding to the itemset.
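Written out, the computation in claim 15 is H = -p * log(p), where p is the ratio of the matching data volume to the total data volume. A minimal Python sketch follows; the function name entropy_term is hypothetical, and both the base-2 logarithm and the convention that p = 0 yields 0 are assumptions, since the claim specifies neither.

```python
import math


def entropy_term(matching: float, total: float, base: float = 2.0) -> float:
    """Return -p * log(p), where p = matching / total.

    Assumptions: base-2 logarithm by default, and a term of 0.0 when p is 0
    (the usual limiting convention) or 1.
    """
    if total <= 0:
        raise ValueError("total data volume must be positive")
    p = matching / total
    if p == 0.0 or p == 1.0:
        return 0.0
    return -p * math.log(p, base)


if __name__ == "__main__":
    # 2 matching records out of 4 loaded records gives p = 0.5.
    print(entropy_term(2, 4))
```

With p = 0.5 and base 2 the logarithm is -1, so the term evaluates to 0.5.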
16. The device of claim 12, characterized in that the second information entropy determining unit comprises:
A second data volume determination module, configured to determine, in the data generated in all sub-time periods before the current sub-time period, the volume of data that satisfies the data attributes corresponding to the itemset, and the total data volume of the data generated in all sub-time periods before the current sub-time period;
A second information entropy determination module, configured to determine the second information entropy corresponding to the itemset according to the volume of data satisfying the data attributes corresponding to the itemset and the total data volume determined by the second data volume determination module.
17. The device of claim 16, characterized in that the second information entropy determination module is specifically configured to:
Determine the ratio of the volume of data satisfying the data attributes corresponding to the itemset to the total data volume;
Multiply the ratio by the logarithm of the ratio, and take the negative of the resulting product as the second information entropy corresponding to the itemset.
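Because the second information entropy covers all sub-time periods before the current one, one way to avoid rescanning old data is to keep running counts and fold each finished sub-time period into them. The Python sketch below illustrates that idea only; the class HistoryCounts, its methods, and the base-2 logarithm are hypothetical choices, and the patent text here does not say how the earlier data volumes are stored.

```python
import math
from collections import Counter
from typing import Dict, FrozenSet, Tuple

Itemset = FrozenSet[Tuple[str, str]]  # hypothetical itemset representation


class HistoryCounts:
    """Running data volumes over all sub-time periods processed so far."""

    def __init__(self) -> None:
        self.matching: Counter = Counter()  # itemset -> matching record count
        self.total = 0                      # total record count

    def second_entropy(self, itemset: Itemset, base: float = 2.0) -> float:
        """Return -p * log(p) over the earlier sub-time periods (0.0 if none)."""
        if self.total == 0:
            return 0.0
        p = self.matching[itemset] / self.total
        if p == 0.0 or p == 1.0:
            return 0.0
        return -p * math.log(p, base)

    def absorb(self, period_matching: Dict[Itemset, int], period_total: int) -> None:
        """Fold a finished sub-time period into the history."""
        self.matching.update(period_matching)  # Counter.update adds counts
        self.total += period_total


if __name__ == "__main__":
    a = frozenset({("plan", "A")})
    history = HistoryCounts()
    history.absorb({a: 1}, 4)
    history.absorb({a: 1}, 4)
    print(history.second_entropy(a))  # p = 2 / 8
```

Calling second_entropy before absorb for the current sub-time period keeps the history limited to earlier sub-time periods, as the claim requires.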
18. The device of claim 12, characterized in that the itemset set updating unit comprises:
A first itemset set determination module, configured to determine a first set of itemsets for which the sum of the corresponding first information entropy and second information entropy reaches a first threshold, wherein each first itemset in the first set of itemsets is labeled with its corresponding first information entropy and second information entropy;
A first update module, configured to update the set of itemsets used to identify the obtained information using the first set of itemsets determined by the first itemset set determination module.
19. The device of claim 18, characterized in that the first update module is specifically configured to:
When a first itemset in the first set of itemsets is included in the set of itemsets used to identify the obtained information, replace the corresponding itemset in the set of itemsets used to identify the obtained information with that first itemset, labeled with its corresponding first information entropy and second information entropy;
When a first itemset in the first set of itemsets is not included in the set of itemsets used to identify the obtained information, delete the corresponding itemset from the set of itemsets used to identify the obtained information.
20. The device of claim 18, characterized in that the itemset set updating unit further comprises:
A second itemset set determination module, configured to, when the current sub-time period is not the last sub-time period of the time period in which the data is generated, determine a second set of itemsets whose corresponding first information entropy reaches a second threshold, wherein each second itemset in the second set of itemsets is labeled with its corresponding first information entropy;
A second update module, configured to update the set of itemsets used to identify the obtained information using the second set of itemsets determined by the second itemset set determination module.
21. The device of claim 20, characterized in that the second update module is configured to: add to the set of itemsets used to identify the obtained information each second itemset in the second set of itemsets that is not already included in it.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010292828 CN102411594B (en) 2010-09-25 2010-09-25 Method and device for obtaining information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201010292828 CN102411594B (en) 2010-09-25 2010-09-25 Method and device for obtaining information

Publications (2)

Publication Number Publication Date
CN102411594A CN102411594A (en) 2012-04-11
CN102411594B true CN102411594B (en) 2013-06-26

Family

ID=45913668

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010292828 Active CN102411594B (en) 2010-09-25 2010-09-25 Method and device for obtaining information

Country Status (1)

Country Link
CN (1) CN102411594B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104809140A (en) * 2014-01-29 2015-07-29 中国银联股份有限公司 Method and system for counting trading data
CN105975623A (en) * 2016-05-30 2016-09-28 中国能源建设集团湖南火电建设有限公司 Method and system for obtaining organization information by means of query expressions
CN107290967B (en) * 2017-08-07 2018-08-31 合肥工业大学 A kind of automatic loading method and system of hydraulic press service function
CN107729571B (en) * 2017-11-23 2020-04-14 北京天广汇通科技有限公司 Relationship discovery method and device
US10922139B2 (en) * 2018-10-11 2021-02-16 Visa International Service Association System, method, and computer program product for processing large data sets by balancing entropy between distributed data segments
CN109739880A (en) * 2018-12-20 2019-05-10 中国联合网络通信集团有限公司 Method for computing data and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3491147B2 (en) * 2000-03-30 2004-01-26 Jfeスチール株式会社 Defect detection method and defect detection device
US6918099B2 (en) * 2003-02-19 2005-07-12 Sun Microsystems, Inc. Method and system for entropy driven verification
CN1770094A (en) * 2005-10-17 2006-05-10 浙江大学 High quality true random number generator
CN101378394A (en) * 2008-09-26 2009-03-04 成都市华为赛门铁克科技有限公司 Detection defense method for distributed reject service and network appliance
CN101494049A (en) * 2009-03-11 2009-07-29 北京邮电大学 Method for extracting audio characteristic parameter of audio monitoring system

Also Published As

Publication number Publication date
CN102411594A (en) 2012-04-11

Similar Documents

Publication Publication Date Title
CN102411594B (en) Method and device for obtaining information
Yang et al. Postprocessing decision trees to extract actionable knowledge
CN104679595B (en) A kind of application oriented IaaS layers of dynamic resource allocation method
Mori et al. Optimal regression tree based rule discovery for short-term load forecasting
CN109242170A (en) A kind of City Road Management System and method based on data mining technology
CN105721279A (en) Relationship circle excavation method and system of telecommunication network users
CN107832291A (en) Client service method, electronic installation and the storage medium of man-machine collaboration
CN110069629A (en) House transaction task processing method, equipment, storage medium and device
CN108399553A (en) It is a kind of to consider geographical and circuit subordinate relation user characteristics label setting method
CN109656898A (en) Distributed large-scale complex community detection method and device based on node degree
Zhang et al. Logistics service supply chain order allocation mixed K-Means and Qos matching
Meisels et al. Combining rules and constraints for employee timetabling
CN116680090B (en) Edge computing network management method and platform based on big data
Shaw et al. The critical‐item, upper bounds, and a branch‐and‐bound algorithm for the tree knapsack problem
CN102209369B (en) Method based on wireless network interface selection to improve a smart phone user experience
CN113641654B (en) Marketing treatment rule engine method based on real-time event
CN111292201A (en) Method for pushing field operation and maintenance information of power communication network based on Apriori and RETE
CN113641705B (en) Marketing disposal rule engine method based on calculation engine
CN110532366A (en) A kind of pattern rule management method, language generation method, apparatus and storage equipment
CN113538011B (en) Method for associating non-booked contact information with booked user in electric power system
CN112507213B (en) Method for recommending optimized system scheme based on behavior big data analysis
CN115293479A (en) Public opinion analysis workflow system and method thereof
Cheng et al. A unified approach to finding good stable matchings in the hospitals/residents setting
Levin et al. Towards modular redesign of networked system
CN112000389A (en) Configuration recommendation method, system, device and computer storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant