CN110096529B - Network data mining method and system based on multidimensional vector data - Google Patents

Info

Publication number
CN110096529B
CN110096529B CN201910305243.9A CN201910305243A CN110096529B CN 110096529 B CN110096529 B CN 110096529B CN 201910305243 A CN201910305243 A CN 201910305243A CN 110096529 B CN110096529 B CN 110096529B
Authority
CN
China
Prior art keywords
information
data
family
network
behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910305243.9A
Other languages
Chinese (zh)
Other versions
CN110096529A (en)
Inventor
张俊曦
邢国贤
王石
赵学豪
吴坤鹏
朱翼署
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Jinlian Beijing Technology Co ltd
Original Assignee
Zhongke Jinlian Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Jinlian Beijing Technology Co ltd
Priority to CN201910305243.9A
Publication of CN110096529A
Application granted
Publication of CN110096529B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a network data mining method and system based on multidimensional vector data. The method comprises the following steps: vectorizing network information in a plurality of network information data sources to form a plurality of multidimensional vector data sources, searching the multidimensional vector data sources according to set conditions, summarizing search results, performing cluster analysis on the summarized search results to generate an information data family set, counting the spatial vector distribution of each information data family in the information data family set, and obtaining the relevance of the network data through relevance analysis. By vectorizing the network information in the network information data source, the complexity of clustering analysis and association degree analysis operation is reduced, the rapid convergence of an information data family is ensured, the multi-angle association degree analysis is realized, and the data mining efficiency is improved.

Description

Network data mining method and system based on multidimensional vector data
Technical Field
The invention belongs to the technical field of data mining, and particularly relates to a network data mining method and system based on multidimensional vector data.
Background
In the internet era, with the popularization and wide application of the mobile internet, any event can generate a large amount of network information in cyberspace, including but not limited to public accounts, microblogs, friend-circle posts, short videos, pictures, and other related content from media users. This information is characterized by large volume, complex content, diverse forms, rapid growth, fast propagation, and strong interactivity. However, because this network information is fragmented, widely distributed, multilingual, unordered, and lacks unified database management, it is difficult to manually reconstruct the evolution of an event from the data, discover key links, and eliminate adverse effects on public opinion.
In the prior art, a technical scheme for analyzing network information of a hotspot event by adopting a data mining technology is provided. The hot event keywords extracted from the network space are used as a basis, collaborative clustering is carried out on the keywords and the data set of the physical space, and information samples related to the hot event in the physical space are extracted according to clustering results, so that a user can quickly and comprehensively know the related information of the hot event.
However, as the number of keywords increases, on one hand, the complexity of collaborative clustering operation increases, and it is difficult to obtain a clustering result quickly, and on the other hand, correlation analysis among different types of keywords is lacking, which results in incomplete analysis and low data mining efficiency.
Disclosure of Invention
In order to solve the technical problems that the clustering operation complexity is high, the clustering result is difficult to obtain quickly, the data analysis is not comprehensive enough, and the data mining efficiency is low, the invention provides a network data mining method and system based on multi-dimensional vector data.
A network data mining method based on multidimensional vector data comprises the following steps: vectorizing network information in a plurality of network information data sources to form a plurality of multidimensional vector data sources, searching the multidimensional vector data sources according to set conditions, summarizing search results, performing cluster analysis on the summarized search results to generate an information data family set, counting the spatial vector distribution of each information data family in the information data family set, and obtaining the relevance of the network data through relevance analysis.
Further, the multidimensional vector DATA source is represented as DATA (a, r, p), a is a behavior information component, r is a relationship information component, and p is a position information component.
Further, the obtaining of the relevance of the network data through relevance analysis includes calculating the relevance between every two information data families, and determining the information data family with high relevance to the event.
Further, obtaining the relevance of the network data through relevance analysis includes counting the distribution of the behavior, relationship and position components in the information data family, and determining the behavior, relationship and/or position information with high relevance to the event.
Further, the method also includes: calculating the degree of overlap between the keywords representing the behavior, relationship and/or position components in the information data family and the keywords representing the behavior, relationship and/or position components in the event, performing normalization, and taking the keywords with a high normalized degree of overlap as the behavior, relationship and/or position information with a high degree of association with the event.
A multidimensional vector data based network information mining system, comprising: the system comprises a vectorization module, a search module, a cluster analysis module and a relevancy analysis module, wherein the vectorization module is used for vectorizing network information in a plurality of network information data sources to form a plurality of multidimensional vector data sources, the search module is used for searching the multidimensional vector data sources according to set conditions and summarizing search results, the cluster analysis module is used for carrying out cluster analysis on the summarized search results to generate an information data family set, and the relevancy analysis module is used for counting the space vector distribution of each information data family in the information data family set and obtaining the relevancy of the network data through relevancy analysis.
Further, the multidimensional vector DATA source is represented as DATA (a, r, p), a is a behavior information component, r is a relationship information component, and p is a position information component.
Further, the obtaining of the relevance of the network data through relevance analysis includes calculating the relevance between every two information data families, and determining the information data family with high relevance to the event.
Further, obtaining the relevance of the network data through relevance analysis includes counting the distribution of the behavior, relationship and position components in the information data family, and determining the behavior, relationship and/or position information with high relevance to the event.
Further, the relevance analysis module is further configured to calculate the degree of overlap between the keywords representing the behavior, relationship and/or position components in the information data family and the keywords representing the behavior, relationship and/or position components in the event, perform normalization, and take the keywords with a high normalized degree of overlap as the behavior, relationship and/or position information with a high degree of association with the event.
The invention has the beneficial effects that: by vectorizing the network information in the network information data source, the complexity of clustering analysis and association degree analysis operation is reduced, the rapid convergence of an information data family is ensured, the multi-angle association degree analysis is realized, and the data mining efficiency is improved. The method and the system provided by the embodiment of the invention can be used for controlling network information, for example, providing relevant information of hot spot events or possibly interested contents for users.
Drawings
FIG. 1 is a schematic diagram of the three-dimensional information space of an event according to an embodiment of the invention;
FIG. 2 is a flowchart of a network data mining method based on multidimensional vector data according to an embodiment of the invention;
FIG. 3 is a schematic diagram of a two-dimensional distribution of clustering results according to an embodiment of the invention;
FIG. 4 is a block diagram of a network data mining system based on multidimensional vector data according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings. Those skilled in the art will appreciate that the present invention is not limited to the drawings and the following examples.
The embodiment of the invention provides a network data mining method based on multi-dimensional vector data.
In the mobile internet era, any event that has an impact on society generates a large amount of network information. Each event may contain multiple sets of information, such as a set of behaviors, a set of relationships, and a set of positions. Behavior refers to all purposeful human activities; it is composed of a series of simple actions and is a general term for all actions that are usually performed. Relationships refer to the interactions between people, between people and things, and between things. Position refers to a determined geographic location. The information set $e_i$ of an event can be expressed as the sum of three subsets, namely: $\{\sum a_k + \sum r_n + \sum p_m\} \in e_i$, where $i, k, n, m = 1, 2, \ldots, n$, $\sum a_k$ is the behavior information subset, $\sum r_n$ is the relationship information subset, and $\sum p_m$ is the position information subset. As shown in FIG. 1, by taking behaviors, relationships, and positions as the X, Y, and Z axes of a three-dimensional space, the information set of an event can be described in that space. The event information is derived from information sources, and thus the event information set $e_i$ is likewise derived from information sources. Each information source may contain one or more of a behavior information subset, a relationship information subset, and a position information subset.
FIG. 2 is a flowchart of a network data mining method based on multidimensional vector data according to an embodiment of the present invention. As shown in FIG. 2, in step 210, network information in a plurality of network information data sources is vectorized to form a plurality of multidimensional vector data sources. Taking a three-dimensional vector data source as an example, with behaviors, relationships and positions as the X, Y and Z axes of a three-dimensional space, the three-dimensional vector data source may be represented as DATA (a, r, p), i.e., a point in the three-dimensional space. The vector from the origin of the event information to this point represents the vectorization of the network information. If the network information data source contains only one or two of the behavior information subset, the relationship information subset and the position information subset, the component of each subset that is not contained is represented as 0. For example, if the network information data source contains only a behavior information subset and a relationship information subset, the three-dimensional vector data source is represented as DATA (a, r, 0). More dimensions can be selected to construct the multidimensional vector data.
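The zero-filling rule for DATA (a, r, p) can be sketched as follows. This is only an illustration under stated assumptions: the `Data` tuple, the `vectorize` function, and the idea of passing each component as an optional numeric score are hypothetical, since the description does not specify how a, r and p are computed.

```python
from typing import NamedTuple, Optional


class Data(NamedTuple):
    """Hypothetical stand-in for the DATA (a, r, p) point."""
    a: float  # behavior information component
    r: float  # relationship information component
    p: float  # position information component


def vectorize(behavior: Optional[float] = None,
              relationship: Optional[float] = None,
              position: Optional[float] = None) -> Data:
    """Build DATA (a, r, p); a component the source lacks is set to 0."""
    return Data(
        a=0.0 if behavior is None else behavior,
        r=0.0 if relationship is None else relationship,
        p=0.0 if position is None else position,
    )


# A source with behavior and relationship information but no position,
# corresponding to DATA (a, r, 0) in the description.
point = vectorize(behavior=2.0, relationship=1.5)
print(point)
```

Using a named tuple keeps the three axes explicit while still letting the point be treated as an ordinary coordinate triple by later clustering code.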
In step 220, the plurality of multidimensional vector data sources are searched according to the set conditions, and vector information of the plurality of network information data sources is obtained. The search may be iterated several times; for example, after the first search is completed, the obtained result is used as a search element for the next search. The number of iterations typically does not exceed 3. After the search is finished, the search results are summarized.
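A minimal sketch of such an iterative search is given below. The record layout (`id`, `keywords`), the keyword-intersection test standing in for the "set conditions", and the function name `iterative_search` are all assumptions made for illustration; only the cap of three iterations comes from the description.

```python
from typing import Dict, List, Set

MAX_ITERATIONS = 3  # the description says iterations typically do not exceed 3


def iterative_search(sources: List[List[Dict]], seed_terms: Set[str],
                     max_iterations: int = MAX_ITERATIONS) -> List[Dict]:
    """Search every source, then reuse the hits' keywords as new search terms.

    Each record is assumed to look like {"id": str, "keywords": set of str}.
    Returns the summarized (deduplicated) hits, each tagged with the
    iteration at which it was first found.
    """
    terms = set(seed_terms)
    found: Dict[str, Dict] = {}
    for iteration in range(1, max_iterations + 1):
        new_terms: Set[str] = set()
        for source in sources:
            for record in source:
                if record["id"] in found:
                    continue
                if record["keywords"] & terms:  # hypothetical "set condition"
                    found[record["id"]] = {**record, "iteration": iteration}
                    new_terms |= record["keywords"]
        if not new_terms - terms:
            break  # nothing new to search with
        terms |= new_terms
    return list(found.values())


source_a = [{"id": "a1", "keywords": {"flood", "rescue"}},
            {"id": "a2", "keywords": {"rescue", "donation"}}]
source_b = [{"id": "b1", "keywords": {"donation", "charity"}},
            {"id": "b2", "keywords": {"sports"}}]
hits = iterative_search([source_a, source_b], {"flood"})
print([(h["id"], h["iteration"]) for h in hits])
```

Recording the iteration number with each hit is a design choice made here so that a later grid-based clustering step can use it as a coordinate.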
In step 230, cluster analysis with various parameters is performed on the summarized search results to obtain clustering results with multiple distribution patterns, and a set of information data families is generated. As shown in FIG. 3, cluster analysis is performed using a density-based method and a grid-based method (with the number of search iterations used as the grid scale), respectively, to obtain a two-dimensionally distributed clustering result. The clustering result includes a plurality of information data families, and each information data family includes information elements that carry vector information.
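The description does not name a specific density-based algorithm. As one possible illustration, the sketch below uses DBSCAN, a well-known density-based method, written in plain Python. The parameters `eps` and `min_pts` and the sample points are assumptions, not values from the patent.

```python
import math
from typing import List, Optional, Sequence

NOISE = -1


def dbscan(points: Sequence[Sequence[float]], eps: float, min_pts: int) -> List[int]:
    """Label each point with a cluster index (0, 1, ...) or NOISE."""
    def neighbors(i: int) -> List[int]:
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels: List[Optional[int]] = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return [NOISE if label is None else label for label in labels]


# Two tight groups of DATA (a, r, p) points and one isolated point.
points = [(0, 0, 0), (0.1, 0, 0), (0, 0.1, 0),
          (5, 5, 0), (5.1, 5, 0), (5, 5.1, 0),
          (10, 0, 3)]
labels = dbscan(points, eps=0.5, min_pts=3)
print(labels)
```

Each cluster index corresponds to one information data family; points labeled as noise belong to no family.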
In step 240, the spatial vector distribution of each information data family is counted, and relevance analysis is performed to obtain the relevance of the network data. The relevance analysis determines which information families have a high degree of association with the event, and the main information elements within those families, so that a correct judgment can be made about the event. The degree of association between every two information data families is calculated to determine which information families are highly associated with the event; the closer the distance between two information families, the higher their degree of association. The distribution of the behavior, relationship and position components within each information data family is counted to determine which behaviors, relationships and positions are more highly associated with the event. The degree of overlap between the keywords representing the behavior/relationship/position components in the information data family and the keywords representing the behavior/relationship/position components in the event is calculated and normalized. The keywords with a higher normalized degree of overlap are taken as the behavior/relationship/position elements with a higher degree of association with the event.
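The rule that closer families are more strongly associated can be illustrated with a short sketch. The description gives no formula, so the choices here are assumptions: each family is summarized by its centroid, distance is Euclidean, and the association score is defined as 1 / (1 + distance).

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def centroid(family: List[Vector]) -> Tuple[float, ...]:
    """Mean point of a family of DATA (a, r, p) vectors."""
    n = len(family)
    return tuple(sum(v[d] for v in family) / n for d in range(len(family[0])))


def pairwise_association(families: Dict[str, List[Vector]]) -> Dict[Tuple[str, str], float]:
    """Score every pair of families; a smaller centroid distance scores higher."""
    centers = {name: centroid(members) for name, members in families.items()}
    return {
        (x, y): 1.0 / (1.0 + math.dist(centers[x], centers[y]))
        for x, y in combinations(sorted(centers), 2)
    }


families = {
    "A": [(0, 0, 0), (2, 0, 0)],   # centroid (1, 0, 0)
    "B": [(1, 0, 3), (1, 0, 5)],   # centroid (1, 0, 4)
    "C": [(1, 3, 4)],              # centroid (1, 3, 4)
}
scores = pairwise_association(families)
closest = max(scores, key=scores.get)
print(scores, closest)
```

Any monotonically decreasing function of distance would serve the same purpose; 1 / (1 + d) is used only because it maps distances into the range (0, 1].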
By vectorizing the network information in the network information data source, the complexity of clustering analysis and association degree analysis operation is reduced, the rapid convergence of an information data family is ensured, the multi-angle association degree analysis is realized, and the data mining efficiency is improved.
The embodiment of the invention also provides a network data mining system based on the multidimensional vector data.
FIG. 4 is a block diagram of a network data mining system based on multidimensional vector data according to an embodiment of the present invention. As shown in FIG. 4, the network data mining system includes a vectorization module 410, configured to vectorize network information in a plurality of network information data sources to form a plurality of multidimensional vector data sources. Taking a three-dimensional vector data source as an example, with behaviors, relationships and positions as the X, Y and Z axes of a three-dimensional space, the three-dimensional vector data source may be represented as DATA (a, r, p), i.e., a point in the three-dimensional space. The vector from the origin of the event information to this point represents the vectorization of the network information. If the network information data source contains only one or two of the behavior information subset, the relationship information subset and the position information subset, the component of each subset that is not contained is represented as 0. For example, if the network information data source contains only a behavior information subset and a relationship information subset, the three-dimensional vector data source is represented as DATA (a, r, 0). More dimensions can be selected to construct the multidimensional vector data.
The network data mining system includes a search module 420, configured to search the plurality of multidimensional vector data sources according to the set conditions and obtain vector information of the plurality of network information data sources. The search may be iterated several times; for example, after the first search is completed, the obtained result is used as a search element for the next search. The number of iterations typically does not exceed 3. After the search is finished, the search results are summarized.
The network data mining system includes a cluster analysis module 430, configured to perform cluster analysis with various parameters on the summarized search results to obtain clustering results with multiple distribution patterns and generate a set of information data families. Cluster analysis is performed using a density-based method and a grid-based method (with the number of search iterations used as the grid scale), so as to obtain a two-dimensionally distributed clustering result. The clustering result includes a plurality of information data families, and each information data family includes information elements that carry vector information.
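One way to read "the number of search iterations is used as the grid scale" is sketched below. This reading is an assumption, as are the function name `grid_cluster`, the second axis (a binned vector magnitude), and the parameters `bin_width` and `min_count`. The sketch bins items into cells, keeps the dense cells, and merges adjacent dense cells into families, which is the general pattern of grid-based clustering.

```python
from collections import defaultdict, deque
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def grid_cluster(items: List[Tuple[int, float]], bin_width: float,
                 min_count: int) -> List[List[int]]:
    """Cluster (iteration, magnitude) items on a 2-D grid.

    Returns lists of item indices, one list per family; items that fall
    in sparse cells are left out.
    """
    cells: Dict[Cell, List[int]] = defaultdict(list)
    for idx, (iteration, magnitude) in enumerate(items):
        # X axis: search iteration; Y axis: magnitude bin.
        cells[(iteration, int(magnitude // bin_width))].append(idx)

    dense = {c for c, members in cells.items() if len(members) >= min_count}
    families: List[List[int]] = []
    seen = set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, family = deque([start]), []
        while queue:  # breadth-first merge of edge-adjacent dense cells
            x, y = queue.popleft()
            family.extend(cells[(x, y)])
            for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nxt in dense and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        families.append(sorted(family))
    return families


# (iteration at which the element was found, vector magnitude)
items = [(1, 0.2), (1, 0.4), (2, 0.3), (2, 0.1),
         (3, 5.5), (3, 5.9), (1, 9.0)]
grid_families = grid_cluster(items, bin_width=1.0, min_count=2)
print(grid_families)
```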
The network data mining system includes a relevance analysis module 440, configured to count the spatial vector distribution of each information data family and perform relevance analysis to obtain the relevance of the network data. The relevance analysis determines which information families have a high degree of association with the event, and the main information elements within those families, so that a correct judgment can be made about the event. The degree of association between every two information data families is calculated to determine which information families are highly associated with the event; the closer the distance between two information families, the higher their degree of association. The distribution of the behavior, relationship and position components within each information data family is counted to determine which behaviors, relationships and positions are more highly associated with the event.
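Counting the distribution of components within a family might look like the following sketch. It assumes, purely for illustration, that each information element carries a text label per component (or `None` when the component is absent); the element layout and the function name `component_distribution` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional

Element = Dict[str, Optional[str]]


def component_distribution(family: List[Element]) -> Dict[str, Dict[str, float]]:
    """Relative frequency of each behavior/relationship/position label."""
    result: Dict[str, Dict[str, float]] = {}
    for component in ("behavior", "relationship", "position"):
        # Skip elements whose component is missing or None.
        counts = Counter(e[component] for e in family if e.get(component))
        total = sum(counts.values())
        result[component] = {k: v / total for k, v in counts.items()} if total else {}
    return result


family = [
    {"behavior": "donate", "relationship": "donor-charity", "position": "Beijing"},
    {"behavior": "donate", "relationship": None, "position": "Beijing"},
    {"behavior": "repost", "relationship": "donor-charity", "position": None},
    {"behavior": "donate", "relationship": "friend", "position": "Shanghai"},
]
dist = component_distribution(family)
print(dist["behavior"])
```

The most frequent label on each axis is then a candidate for the behavior, relationship or position most associated with the event.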
In an embodiment, the relevance analysis module 440 is further configured to calculate the degree of overlap between the keywords representing the behavior/relationship/position components in the information data family and the keywords representing the behavior/relationship/position components in the event, and to perform normalization. The keywords with a higher normalized degree of overlap are taken as the behavior/relationship/position elements with a higher degree of association with the event.
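The description does not define how the degree of overlap is measured or normalized. The sketch below is one possible reading under explicit assumptions: overlap is the Jaccard similarity of word tokens, summed over the event keywords, and normalization divides by the largest raw score. The function names and the 0.5 selection threshold are hypothetical.

```python
from typing import Dict, Iterable, List


def jaccard(x: str, y: str) -> float:
    """Token-set Jaccard similarity between two keyword phrases."""
    a, b = set(x.lower().split()), set(y.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def normalized_overlap(family_keywords: Iterable[str],
                       event_keywords: Iterable[str]) -> Dict[str, float]:
    """Overlap of each family keyword with the event keywords, scaled to [0, 1]."""
    event = list(event_keywords)
    raw = {k: sum(jaccard(k, e) for e in event) for k in family_keywords}
    top = max(raw.values(), default=0.0)
    return {k: (v / top if top else 0.0) for k, v in raw.items()}


def select_keywords(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Keywords whose normalized overlap reaches the (assumed) threshold."""
    return sorted(k for k, v in scores.items() if v >= threshold)


overlap = normalized_overlap(["flood rescue", "rescue team", "concert"],
                             ["flood", "rescue"])
selected = select_keywords(overlap)
print(overlap, selected)
```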
By vectorizing the network information in the network information data source, the complexity of clustering analysis and association degree analysis operation is reduced, the rapid convergence of an information data family is ensured, the multi-angle association degree analysis is realized, and the data mining efficiency is improved.
Embodiments of the present invention further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the above method.
Embodiments of the present invention further provide a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the program.
Those of skill in the art will understand that the logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be viewed as implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The embodiments of the present invention have been described above. However, the present invention is not limited to the above embodiment. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (4)

1. A network data mining method based on multidimensional vector data is characterized by comprising the following steps:
vectorizing network information in a plurality of network information data sources to form a plurality of multidimensional vector data sources,
searching the multi-dimensional vector data sources according to set conditions, summarizing search results,
performing cluster analysis on the summarized search results to generate an information data family set,
counting the space vector distribution of each information data family in the information data family set, obtaining the correlation of network data through correlation degree analysis, calculating the correlation degree between every two information data families, determining the information family with high correlation degree with the event and main information elements in the information families, counting the distribution conditions of behavior, relationship and position components in the information data families, and determining the behavior, relationship and/or position information with high correlation degree with the event, thereby making correct judgment on the event;
the behavior, the relationship and the position are respectively used as the X, Y and Z axes of a three-dimensional space, the multidimensional vector data source is represented as DATA (a, r, p), a is a behavior information component, r is a relationship information component, and p is a position information component.
2. The data mining method of claim 1, further comprising: calculating the degree of overlap between the keywords representing the behavior, relationship and/or position components in the information data family and the keywords representing the behavior, relationship and/or position components in the event, performing normalization, and taking the keywords with a high normalized degree of overlap as the behavior, relationship and/or position information with a high degree of association with the event.
3. A system for network data mining based on multidimensional vector data, comprising:
a vectorization module for vectorizing the network information in the plurality of network information data sources to form a plurality of multidimensional vector data sources,
a searching module for searching the multi-dimensional vector data sources according to the set conditions and summarizing the searching results,
a cluster analysis module for performing cluster analysis on the summarized search results to generate an information data family set,
an association degree analysis module for counting the space vector distribution of each information data family in the information data family set, obtaining the association of network data through association degree analysis, calculating the association degree between every two information data families, determining the information family with high association degree with the event and main information elements in the information family, counting the distribution of behavior, relationship and position components in the information data families, and determining the behavior, relationship and/or position information with high association degree with the event, thereby making correct judgment on the event;
the behavior, the relationship and the position are respectively used as the X, Y and Z axes of a three-dimensional space, the multidimensional vector data source is represented as DATA (a, r, p), a is a behavior information component, r is a relationship information component, and p is a position information component.
4. The data mining system of claim 3, wherein the relevance analysis module is further configured to calculate a degree of overlap between a plurality of keywords representing behavior, relationship and/or location components in the information data family and a plurality of keywords representing behavior, relationship and/or location components in the event, and perform normalization processing to obtain a keyword with a high degree of overlap after normalization as behavior, relationship and/or location information with a high degree of relevance to the event.
CN201910305243.9A 2019-04-16 2019-04-16 Network data mining method and system based on multidimensional vector data Active CN110096529B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910305243.9A CN110096529B (en) 2019-04-16 2019-04-16 Network data mining method and system based on multidimensional vector data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910305243.9A CN110096529B (en) 2019-04-16 2019-04-16 Network data mining method and system based on multidimensional vector data

Publications (2)

Publication Number Publication Date
CN110096529A CN110096529A (en) 2019-08-06
CN110096529B CN110096529B (en) 2021-07-16

Family

ID=67444890

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910305243.9A Active CN110096529B (en) 2019-04-16 2019-04-16 Network data mining method and system based on multidimensional vector data

Country Status (1)

Country Link
CN (1) CN110096529B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102331995A (en) * 2011-07-08 2012-01-25 华东师范大学 Point source-based information acquisition method and system for three-dimensional geo-information model
CN105488628A (en) * 2015-11-30 2016-04-13 国网天津市电力公司 Electric power big data visualization oriented data mining method
CN106304015A (en) * 2015-05-28 2017-01-04 中兴通讯股份有限公司 The determination method and device of subscriber equipment
CN109344212A (en) * 2018-08-24 2019-02-15 武汉中地数码科技有限公司 A kind of geographical big data of subject-oriented feature excavates the method and system of recommendation
CN109389158A (en) * 2018-09-19 2019-02-26 成都城电电力工程设计有限公司 It early can system architecture method based on the dispatching of power netwoks of data mining and human-computer interaction

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102880719A (en) * 2012-10-16 2013-01-16 四川大学 User trajectory similarity mining method for location-based social network
CN103812719B (en) * 2012-11-12 2018-05-18 华为技术有限公司 The failure prediction method and device of group system
CN105608219B (en) * 2016-01-07 2019-06-18 上海通创信息技术有限公司 A kind of streaming recommended engine, recommender system and recommended method based on cluster
CN106599436A (en) * 2016-12-08 2017-04-26 湖南大学 User in-room behavior prediction method for office building
CN106844585A (en) * 2017-01-10 2017-06-13 广东精规划信息科技股份有限公司 A kind of time-space relationship analysis system based on multi-source Internet of Things location aware
CN107133632A (en) * 2017-02-27 2017-09-05 国网冀北电力有限公司 A kind of wind power equipment fault diagnosis method and system
US11621969B2 (en) * 2017-04-26 2023-04-04 Elasticsearch B.V. Clustering and outlier detection in anomaly and causation detection for computing environments
CN108345660A (en) * 2018-01-31 2018-07-31 山东汇贸电子口岸有限公司 A kind of data analysing method based on government data
CN109376185A (en) * 2018-10-25 2019-02-22 广州市金禧信息技术服务有限公司 Data digging system and its application under big data environment

Also Published As

Publication number Publication date
CN110096529A (en) 2019-08-06

Similar Documents

Publication Publication Date Title
Dhanaraj et al. Random forest bagging and x-means clustered antipattern detection from sql query log for accessing secure mobile data
WO2019214245A1 (en) Information pushing method and apparatus, and terminal device and storage medium
Skoutas et al. Ranking and clustering web services using multicriteria dominance relationships
Jiang et al. Clustering uncertain data based on probability distribution similarity
Liu et al. U-skyline: A new skyline query for uncertain databases
US20150100596A1 (en) System and method for performing set operations with defined sketch accuracy distribution
US20080104089A1 (en) System and method for distributing queries to a group of databases and expediting data access
CN111627552B (en) Medical streaming data blood-edge relationship analysis and storage method and device
CN104573130A (en) Entity resolution method based on group calculation and entity resolution device based on group calculation
Jiang et al. Probabilistic skylines on uncertain data: model and bounding-pruning-refining methods
CN105320764A (en) 3D model retrieval method and 3D model retrieval apparatus based on slow increment features
US11868346B2 (en) Automated linear clustering recommendation for database zone maps
Singh et al. Nearest keyword set search in multi-dimensional datasets
Yu et al. Effective algorithms for vertical mining probabilistic frequent patterns in uncertain mobile environments
Lin et al. BigIN4: Instant, interactive insight identification for multi-dimensional big data
US8650180B2 (en) Efficient optimization over uncertain data
CN115905630A (en) Graph database query method, device, equipment and storage medium
Saad et al. Efficient skyline computation on uncertain dimensions
Zhou et al. Summarisation of weighted networks
CN106126681B (en) A kind of increment type stream data clustering method and system
US20200257684A1 (en) Higher-order data sketching for ad-hoc query estimation
Gao et al. Efficient algorithms for finding the most desirable skyline objects
Singh et al. Knowledge based retrieval scheme from big data for aviation industry
Zhang et al. Mac: A probabilistic framework for query answering with machine-crowd collaboration
CN110096529B (en) Network data mining method and system based on multidimensional vector data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 100102 room 16b557, 16 / F, 101, floor 4-33, building 13, District 4, Wangjing Dongyuan, Chaoyang District, Beijing

Patentee after: ZHONGKE JINLIAN (BEIJING) TECHNOLOGY Co.,Ltd.

Address before: 100102 605, 6th floor, building 13, yard 18, ziyue Road, a 1 Laiguangying middle street, Chaoyang District, Beijing

Patentee before: ZHONGKE JINLIAN (BEIJING) TECHNOLOGY Co.,Ltd.