CN102314491A - Method for identifying similar behavior mode users in multicore environment based on massive logs - Google Patents


Info

Publication number
CN102314491A
Authority
CN
China
Prior art keywords
similar behavior
behavior pattern
local
log
log data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201110242122A
Other languages
Chinese (zh)
Other versions
CN102314491B (en)
Inventor
俞东进
李万清
郑苏杭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Haining Dingcheng Intelligent Equipment Co ltd
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University
Priority to CN 201110242122 (CN102314491B)
Publication of CN102314491A
Application granted
Publication of CN102314491B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for identifying users with similar behavior patterns from massive logs in a multi-core environment. Conventional methods suffer from a heavy computational load and frequent, expensive I/O (input/output) operations. The method comprises the following steps: setting up a separate log database on the Web server side to store a log data set that records user access information; reading part of the log information in the log data set into the memory of a general-purpose computer with a built-in multi-core CPU (central processing unit); splitting the log data set according to the number of threads configured in the multi-core environment to obtain a plurality of local log data sets, each serving as the data source of one thread; having each thread search its own local log data set, obtain local similar behavior patterns, and reduce them; and merging, in parallel, the local similar behavior pattern sets obtained by the threads into a global similar behavior pattern set to obtain the users that share similar behavior patterns. With this method, the identification of similar behavior patterns runs efficiently and achieves a high speed-up ratio.

Description

Method for identifying users with similar behavior patterns from massive logs in a multi-core environment
Technical field
The invention belongs to the technical field of data mining, and in particular relates to a method for identifying users with similar behavior patterns from massive logs in a multi-core environment.
Background
The Internet has become a huge, widely distributed, global information service center and is gradually penetrating people's daily work, life and other fields. A large number of users look up information and purchase goods through e-commerce websites. Analyzing the access log files on a Web server reveals the rules by which users access and browse a website and helps in understanding the behavior patterns of different users, which ultimately helps to improve the Web site and to obtain greater economic benefit.
Studying the consumption habits of different users often reveals that several different users may have similar behavior patterns. For example, they may all browse promotional information every Thursday evening, shop online every Friday evening, and confirm receipt of goods and pay online every Sunday evening; or they may all read online every Friday evening, update their blogs every Saturday evening, and plan their work every Sunday evening. The main characteristic of such behavior can be summarized as follows: multiple different users perform similar actions at close points in time; in other words, they share a similar behavior pattern with a temporal feature. Identifying such groups of users helps a website provide accurate, personalized services, for example arranging group-buying activities aimed at a specific crowd, or releasing popular service content at a suitable point in time.
However, identifying such similar access patterns generally involves terabyte-scale historical data. Although computer technology has developed rapidly, and the introduction of multi-core technology in particular has raised the computing power of conventional computer systems to some degree, without application-level optimization of the massive-log analysis process the huge computational load and heavy I/O operations can still prevent a multi-core system from achieving the expected results in function and performance.
Summary of the invention
In view of the shortcomings of the prior art, the present invention provides a method for identifying users with similar behavior patterns from massive logs in a multi-core environment.
The specific steps of the method are as follows:
Step (1): set up a separate log database on the Web server side to store the log data set that records user access information. Each log entry in the log data set contains a user ID, access time, access IP, requested page, and request function number.
Step (2): within the limit of available memory, read part of the log information in the log data set into the memory of a general-purpose computer with a built-in multi-core CPU.
Step (3): according to the number of threads configured in the multi-core environment, divide the log data set equally using the horizontal equidistant static projection method to obtain multiple local log data sets, each serving as the data source of one thread.
Step (4): each thread searches the local log data set obtained in step (3), obtains local similar behavior patterns, and reduces them.
Step (5): repeat steps (2), (3) and (4) until all log information in the log data set has been processed.
Step (6): merge, in parallel, the local similar behavior pattern sets obtained by the threads into a global similar behavior pattern set, yielding the users that share similar behavior patterns.
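Steps (1), (2) and (5) can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the comma-separated line format, the names `LogRecord`, `parse_record` and `read_batches`, and the use of a fixed record count as a stand-in for "available memory" are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class LogRecord:
    """One log entry with the five fields listed in step (1)."""
    user_id: str
    access_time: float  # assumed: seconds since some epoch
    access_ip: str
    page: str
    function_no: int


def parse_record(line: str) -> LogRecord:
    # Assumed layout: user_id,access_time,access_ip,page,function_no
    user_id, access_time, access_ip, page, function_no = line.rstrip("\n").split(",")
    return LogRecord(user_id, float(access_time), access_ip, page, int(function_no))


def read_batches(lines: Iterable[str], batch_size: int) -> Iterator[List[LogRecord]]:
    """Yield successive batches of at most batch_size records (steps 2 and 5).

    batch_size is a simple proxy for the memory limit mentioned in step (2).
    """
    it = iter(lines)
    while True:
        batch = [parse_record(line) for line in islice(it, batch_size)]
        if not batch:
            return
        yield batch
```

Each batch yielded by `read_batches` would then be handed to steps (3) and (4); the loop ends when the log source is exhausted, which corresponds to step (5).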
The method provided by the invention is implemented by a group of functional modules: a log set batch reading module, a log set equal division module, a local similar pattern set generation module, and a local similar pattern set summarization module.
The log set batch reading module reads part of the log information in the log data set in batches, within the limit of available memory. Each entry includes the user ID, access time, access IP, requested page, and request function number.
The log set equal division module takes the log data read by the batch reading module and, according to the number of threads configured in the multi-core environment, divides it equally using the horizontal equidistant static projection method to obtain multiple local log data sets.
The local similar pattern set generation module works in a multi-threaded, parallel manner. Each thread sorts its pending local log data set by log access time, obtains the local similar behavior patterns and their supports, and builds its local similar pattern set. If the size of a local similar behavior pattern set exceeds the predefined memory upper limit, the set is swapped out to hard disk as a file.
The local similar pattern set summarization module also works in a multi-threaded, parallel manner. It accumulates the supports of the similar behavior patterns across all local similar pattern sets and outputs, in formatted form, the information of the users whose similar behavior patterns have high support.
The method proposed by the invention combines data parallelism with task parallelism: each thread generates local similar behavior patterns and then cooperates with the other threads to obtain all global similar behavior patterns. The method eliminates the repeated generation and counting of local similar behavior patterns through parallel local reduction, and uses a dynamic task allocation mechanism to address static load imbalance among processors. When analyzing massive access logs, compared with the conventional approach that uses a multi-core processor directly without multithreading optimization, the method gives the similar access pattern identification process higher running efficiency and a higher speed-up ratio.
Brief description of the drawings
Fig. 1 is a data flow diagram;
Fig. 2 is a structure diagram of the pattern data storage;
Fig. 3 is a flow chart of the reduction.
Embodiment
The embodiment of the method for identifying users with similar behavior patterns from massive logs in a multi-core environment is divided into three main steps (as shown in Fig. 1):
(1) According to the number of threads, divide the global log data set equally into local log data sets using the horizontal equidistant static projection method; each local log data set serves as the data source of one thread.
(2) Sort each local log data set by log access time, and search in parallel for $k$ different user IDs (where $k \ge 2$) whose log access times for the same request function number differ by less than a preset time window; save them to files as local similar behavior patterns, applying the local reduction method.
(3) Merge the local similar behavior patterns in parallel using the dynamic task allocation mechanism, compare their support with a preset minimum support threshold (min_sup), and mine the target similar behavior patterns whose support is greater than the threshold.
For convenience, the related symbols are defined as follows:
$T_i$: the $i$-th thread.
$C_k$: the set of all local similar behavior patterns that contain $k$ different user IDs whose log access times for the same request function number differ by less than the preset time window.
$C_k^j$: the set of all local similar behavior patterns that contain $k$ such different user IDs and whose ID values are $j$.
$L_k$: the set of target similar behavior patterns, that is, the local similar behavior patterns containing $k$ different user IDs whose support is greater than min_sup: $L_k = \{\, c \in C_k \mid \mathrm{sup}(c) > \mathit{min\_sup} \,\}$.
$D_i$: the sorted log data set assigned to the $i$-th thread.
$C_{k,i}$: the set of local similar behavior patterns containing $k$ different user IDs that is assigned to $T_i$, with [equation image].
$C_{k,i}^j$: the set of local similar behavior patterns assigned to $T_i$ that contain $k$ different user IDs and whose ID values are $j$.
$L_{k,i}$: the set of target similar behavior patterns assigned to $T_i$, that is, the local similar behavior patterns containing $k$ different user IDs whose support is greater than min_sup, with [equation image].
(1) Static equal division of the global log data set
To reduce data skew and to let multiple threads search for local similar behavior patterns, the horizontal equidistant static projection data decomposition mode is first used to divide the global log data set equally, so that the data load is balanced across threads. Suppose the global log data set contains $R$ records. The horizontal equidistant static projection distribution method divides the complete log data set into $n$ parts ($n$ = number of threads), so that each thread is allocated one local log data set $D_i$. The data set $D_i$ of the $i$-th thread is expressed as:
[equation image]
Here $r_{i,j}$ denotes the $j$-th record of the $i$-th thread, which the equation maps to a record of the global log data set. Through this transformation, the global log data set $D$ is divided into $n$ local sets of size $R/n$, i.e. $D = \bigcup_{i=1}^{n} D_i$.
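The exact index mapping of the "horizontal equidistant static projection" is not recoverable from the text above. One plausible reading, used here purely as an assumption, is a round-robin split with stride $n$: thread $i$ receives records $i, i+n, i+2n, \dots$ of the batch. The function name `split_equidistant` is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_equidistant(records: Sequence[T], n_threads: int) -> List[List[T]]:
    """Split records into n_threads local sets using a stride of n_threads.

    Assumed interpretation: thread i takes records i, i + n, i + 2n, ...
    The resulting local sets differ in size by at most one record.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    return [list(records[i::n_threads]) for i in range(n_threads)]
```

A block split (the first $R/n$ records to thread 1, the next $R/n$ to thread 2, and so on) would balance the load equally well; the strided variant additionally spreads time-adjacent records over all threads.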
(2) Search, reduction and storage of local similar behavior patterns
After thread $T_i$ obtains its local log data set $D_i$, it first searches $D_i$ independently for similar behavior patterns, that is, it finds and stores the $k$ different user IDs whose log access times for the same request function number differ by less than the preset time window. To reduce contention for the limited memory bandwidth in a multi-core environment, if $D_i$ fits entirely in memory, it is sorted by log access time and written to memory; if $D_i$ is too large to be stored completely, it is partitioned a second time into smaller subsets, and each subset is sorted and then written to memory. For the current MDB (memory database), the algorithm searches for all local similar behavior patterns and stores them in the corresponding data structure (as shown in Fig. 2), then searches the next part of $D_i$ in turn, and repeats until no new local similar behavior pattern is generated. During this process, if the volume of the generated local similar behavior patterns reaches the predefined memory upper limit, that part is first saved to hard disk as a file, after which new local similar behavior patterns are computed and stored.
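The per-thread search can be sketched as follows. This is a minimal illustration, not the patented procedure: the record type, the function name `local_patterns`, and in particular the counting rule (each record anchors one window, and every group of $k$ distinct users seen inside that window gets one unit of support) are assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Iterable, NamedTuple


class Access(NamedTuple):
    user_id: str
    access_time: float
    function_no: int


def local_patterns(records: Iterable[Access], window: float, k: int = 2) -> Counter:
    """Count groups of k distinct users that hit the same function within `window`."""
    by_function = defaultdict(list)
    # Sort by access time first, as the description requires.
    for rec in sorted(records, key=lambda r: r.access_time):
        by_function[rec.function_no].append(rec)

    support = Counter()
    for recs in by_function.values():
        for i, anchor in enumerate(recs):
            others = set()
            for later in recs[i + 1:]:
                if later.access_time - anchor.access_time >= window:
                    break  # records are sorted, so nothing later can qualify
                if later.user_id != anchor.user_id:
                    others.add(later.user_id)
            # Combine the anchor user with every (k-1)-subset of nearby users.
            for combo in combinations(sorted(others), k - 1):
                pattern = tuple(sorted((anchor.user_id,) + combo))
                support[pattern] += 1
    return support
```

Storing each pattern as a sorted tuple of user IDs makes identical groups compare equal regardless of the order in which the users appeared in the log.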
To improve mining efficiency and to reduce both the number of local pattern sets $C_{k,i}$ and the computational complexity, the algorithm applies local reduction to eliminate the repeated generation and counting of $C_{k,i}$. That is, while scanning $D_i$, thread $T_i$ keeps only one entry for each unique ID value $j$ and accumulates the support counts of identical similar behavior patterns $C_{k,i}^j$. For thread $T_i$, the reduction procedure is shown in Fig. 3.
The reduction operation places all identical patterns in the same linked list, so that the support counts of identical local similar behavior patterns can be accumulated quickly. It ensures that the $C_{k,i}$ obtained in $T_i$ includes not only local similar behavior pattern sets with support (sup) equal to 1 but also those with $\mathrm{sup} > 1$. The first scan of the algorithm therefore finds the $C_{k,i}$ that contain $k$ different user IDs and have different sup values; the $n$-th scan then uses the sup set of $C_{k,i}$ from the $(n-1)$-th scan as a subset to generate $C_{k,i}$ with new sup values, which in turn serve as the subset for the next scan. This loop continues until all $C_{k,i}$ have been mined, and the result is finally saved to a file. To prevent read-write or write-write contention on shared file data in a multi-core environment, the files are named after the threads that write them.
(3) Merging of local similar behavior patterns
The local similar behavior patterns obtained by $T_i$ are exchanged with those of the other threads in order to summarize the local similar behavior patterns: across all files, the supports sup of local similar behavior patterns $C_{k,i}^j$ whose $k$ ID values $j$ are identical are accumulated, yielding the target similar behavior patterns $L_k$. This stage additionally uses the task decomposition mode of multithreaded programming. Using a dynamic task allocation mechanism, the summarization tasks for the files, each starting at a different file number, are first placed in order in a global task list; each thread then processes its assigned tasks concurrently, independently of the others and without interference. At a fixed time interval, the method checks in a loop whether an idle processor core exists; if so, it reads a new task from the global task list and processes it on that core. This is repeated until all summarization tasks in the list have been processed. The $C_k$ obtained at that point are the mined global similar behavior patterns. Finally, the support counts of $C_k$ are compared with min_sup to produce $L_k$; the user IDs in $L_k$ are the users who share a similar behavior pattern. To reduce the computation needed for summarization, the starting file number of the current task is always 1 greater than the starting file number of the previous summarization task.
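The merge stage can be approximated with a shared task queue from which worker threads pull work as soon as they are free. This replaces the periodic idle-core polling described above with a pull model, and it passes local pattern sets as in-memory mappings rather than files; both are simplifications for the example. The function name `merge_local_sets` is hypothetical. In CPython, threads illustrate the scheduling but do not provide CPU parallelism because of the global interpreter lock.

```python
import queue
import threading
from collections import Counter
from typing import Dict, Iterable, Mapping, Tuple

Pattern = Tuple[str, ...]


def merge_local_sets(local_sets: Iterable[Mapping[Pattern, int]],
                     n_workers: int,
                     min_sup: int) -> Dict[Pattern, int]:
    """Sum supports of identical patterns and keep those with support > min_sup."""
    tasks = queue.Queue()  # the global task list
    for local in local_sets:
        tasks.put(local)

    # One private accumulator per worker avoids locking on a shared table.
    partials = [Counter() for _ in range(n_workers)]

    def worker(slot: int) -> None:
        while True:
            try:
                local = tasks.get_nowait()  # an idle worker takes the next task
            except queue.Empty:
                return
            partials[slot].update(local)  # adds counts for matching patterns

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    total = Counter()
    for part in partials:
        total.update(part)
    return {pattern: sup for pattern, sup in total.items() if sup > min_sup}
```

The final dictionary plays the role of the target pattern set: each key is a group of user IDs that share a similar behavior pattern, and each value is its accumulated support.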
The invention can be used to mine the massive logs of e-commerce websites in order to quickly identify users with similar behavior patterns.

Claims (1)

1. A method for identifying users with similar behavior patterns from massive logs in a multi-core environment, characterized in that the method comprises the following steps:
Step (1): set up a separate log database on the Web server side to store the log data set that records user access information, each log entry in the log data set containing a user ID, access time, access IP, requested page and request function number;
Step (2): within the limit of available memory, read part of the log information in the log data set into the memory of a general-purpose computer with a built-in multi-core CPU;
Step (3): according to the number of threads configured in the multi-core environment, divide the log data set equally using the horizontal equidistant static projection method to obtain multiple local log data sets, each serving as the data source of one thread;
If the global log data set contains R records, the horizontal equidistant static projection distribution method divides the complete log data set into n parts, where n = number of threads, so that the local log data set allocated to each thread is [equation image], where [equation image];
Step (4): each thread searches the local log data set obtained in step (3), obtains local similar behavior patterns, and reduces them;
Each thread first sorts the local log data set it has to process by log access time. If k different user IDs have log access times for the same request function number that differ by less than the preset window time and have not yet been inserted into the local similar behavior pattern set, the k user IDs are inserted into the local similar behavior pattern set as one item and the support of this item is set to 1. If k different user IDs have log access times for the same request function number that differ by less than the preset window time and the corresponding item has already been inserted into the local similar behavior pattern set, the support of this item is incremented by 1, where k >= 2;
During this process, if the volume of the generated local similar behavior pattern set reaches the predefined memory upper limit, the local similar behavior pattern set is first saved to hard disk as a file;
Step (5): repeat steps (2), (3) and (4) until all log information in the log data set has been processed;
Step (6): merge, in parallel, the local similar behavior pattern sets obtained by the threads into a global similar behavior pattern set, yielding the users that share similar behavior patterns;
An idle core is selected to merge some of the local similar behavior pattern sets into one new local similar behavior pattern set, that is, the supports of identical items in the local similar behavior pattern sets are accumulated to form one new local similar behavior pattern set; the cores perform this work in parallel until a single global similar behavior pattern set is finally obtained; if the support of an item in it exceeds the threshold, the corresponding k users are users who share a similar behavior pattern.
CN 201110242122 2011-08-23 2011-08-23 Method for identifying similar behavior mode users in multicore environment based on massive logs Expired - Fee Related CN102314491B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110242122 CN102314491B (en) 2011-08-23 2011-08-23 Method for identifying similar behavior mode users in multicore environment based on massive logs

Publications (2)

Publication Number    Publication Date
CN102314491A (en)    2012-01-11
CN102314491B (en)    2013-03-13

Family

ID=45427656

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110242122 Expired - Fee Related CN102314491B (en) 2011-08-23 2011-08-23 Method for identifying similar behavior mode users in multicore environment based on massive logs

Country Status (1)

Country Link
CN (1) CN102314491B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009211375A (en) * 2008-03-04 2009-09-17 Nippon Telegr & Teleph Corp <Ntt> Network management server
CN101572629A (en) * 2009-05-31 2009-11-04 腾讯科技(深圳)有限公司 Method and device for processing IP data
CN101582817A (en) * 2009-06-29 2009-11-18 华中科技大学 Method for extracting network interactive behavioral pattern and analyzing similarity
CN101729288A (en) * 2008-10-31 2010-06-09 中国科学院计算机网络信息中心 Method and device for counting network access behaviours of internet users

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102999572A (en) * 2012-11-09 2013-03-27 同济大学 User behavior mode digging system and user behavior mode digging method
CN102999572B (en) * 2012-11-09 2015-11-04 同济大学 User's behavior pattern mining system and method thereof
CN103428217A (en) * 2013-08-19 2013-12-04 中国航空动力机械研究所 Method and system for dispatching distributed parallel computing job
CN103428217B (en) * 2013-08-19 2016-05-25 中国航空动力机械研究所 Operation distribution method and distribution system that distributed parallel calculates
CN104731796A (en) * 2013-12-19 2015-06-24 北京思博途信息技术有限公司 Data storage computing method and system
CN104731796B (en) * 2013-12-19 2017-12-19 秒针信息技术有限公司 Data storage computational methods and system
CN104239133B (en) * 2014-09-26 2018-03-30 北京国双科技有限公司 A kind of log processing method, device and server
CN104239133A (en) * 2014-09-26 2014-12-24 北京国双科技有限公司 Log processing method, device and server
CN104715076A (en) * 2015-04-13 2015-06-17 东信和平科技股份有限公司 Multi-threaded data processing method and device
CN107948234A (en) * 2016-10-13 2018-04-20 北京国双科技有限公司 The processing method and processing device of data
CN107948234B (en) * 2016-10-13 2021-02-12 北京国双科技有限公司 Data processing method and device
CN110737531A (en) * 2019-09-27 2020-01-31 山东英信计算机技术有限公司 fault diagnosis method, device, equipment and medium
CN113342744A (en) * 2021-06-02 2021-09-03 北京优特捷信息技术有限公司 Parallel construction method, device and equipment of call chain and storage medium
CN113360313A (en) * 2021-07-07 2021-09-07 时代云英(深圳)科技有限公司 Behavior analysis method based on massive system logs

Also Published As

Publication number Publication date
CN102314491B (en) 2013-03-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20120111

Assignee: ZHEJIANG XIETONG DATA SYSTEM Co.,Ltd.

Assignor: HANGZHOU DIANZI University

Contract record no.: 2013330000098

Denomination of invention: Method for identifying similar behavior mode users in multicore environment based on massive logs

Granted publication date: 20130313

License type: Common License

Record date: 20130424

Application publication date: 20120111

Assignee: ZHEJIANG TOPTHINKING INFORMATION TECHNOLOGY Co.,Ltd.

Assignor: HANGZHOU DIANZI University

Contract record no.: 2013330000097

Denomination of invention: Method for identifying similar behavior mode users in multicore environment based on massive logs

Granted publication date: 20130313

License type: Common License

Record date: 20130424

LICC Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model
TR01 Transfer of patent right

Effective date of registration: 20210630

Address after: 314400 room 609, block a, 128 Shuanglian Road, Haining Economic Development Zone, Haining City, Jiaxing City, Zhejiang Province

Patentee after: Haining Dingcheng Intelligent Equipment Co.,Ltd.

Address before: 310018 No. 2 street, Xiasha Higher Education Zone, Hangzhou, Zhejiang

Patentee before: HANGZHOU DIANZI University

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130313