CN110096529B - Network data mining method and system based on multidimensional vector data - Google Patents
Network data mining method and system based on multidimensional vector data
- Publication number
- CN110096529B (application CN201910305243.9A)
- Authority
- CN
- China
- Prior art keywords
- information
- data
- family
- network
- behavior
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/283—Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to a network data mining method and system based on multidimensional vector data. The method comprises the following steps: vectorizing network information in a plurality of network information data sources to form a plurality of multidimensional vector data sources, searching the multidimensional vector data sources according to set conditions, summarizing search results, performing cluster analysis on the summarized search results to generate an information data family set, counting the spatial vector distribution of each information data family in the information data family set, and obtaining the relevance of the network data through relevance analysis. By vectorizing the network information in the network information data source, the complexity of clustering analysis and association degree analysis operation is reduced, the rapid convergence of an information data family is ensured, the multi-angle association degree analysis is realized, and the data mining efficiency is improved.
Description
Technical Field
The invention belongs to the technical field of data mining, and particularly relates to a network data mining method and system based on multidimensional vector data.
Background
In the internet era, with the popularization and wide application of the mobile internet, any event can generate a great amount of network information in cyberspace, including but not limited to content from media users such as official accounts, microblogs, friend-circle posts, short videos, and pictures. This information is characterized by large volume, complex content, varied forms, rapid growth, rapid propagation, and strong interactivity. However, because this network information is fragmented, widely distributed, multilingual, unordered, and lacks unified database management, it is difficult to manually reconstruct how an event evolved from these data, discover key links, and eliminate adverse effects on public opinion.
The prior art provides a technical scheme that analyzes the network information of hotspot events using data mining. Taking hot-event keywords extracted from cyberspace as a basis, the scheme performs collaborative clustering on the keywords and a data set of the physical space, and extracts information samples related to the hot event in the physical space according to the clustering results, so that a user can quickly and comprehensively learn the information related to the hot event.
However, as the number of keywords increases, the complexity of the collaborative clustering operation increases, making it difficult to obtain a clustering result quickly. In addition, correlation analysis among different types of keywords is lacking, which results in incomplete analysis and low data mining efficiency.
Disclosure of Invention
In order to solve the technical problems that the clustering operation complexity is high, the clustering result is difficult to obtain quickly, the data analysis is not comprehensive enough, and the data mining efficiency is low, the invention provides a network data mining method and system based on multi-dimensional vector data.
A network data mining method based on multidimensional vector data comprises the following steps: vectorizing network information in a plurality of network information data sources to form a plurality of multidimensional vector data sources, searching the multidimensional vector data sources according to set conditions, summarizing search results, performing cluster analysis on the summarized search results to generate an information data family set, counting the spatial vector distribution of each information data family in the information data family set, and obtaining the relevance of the network data through relevance analysis.
Further, the multidimensional vector data source is represented as DATA(a, r, p), where a is the behavior information component, r is the relationship information component, and p is the position information component.
Further, obtaining the relevance of the network data through relevance analysis includes calculating the relevance between every two information data families and determining the information data families with high relevance to the event.
Further, obtaining the relevance of the network data through relevance analysis includes counting the distribution of the behavior, relationship and position components in the information data families and determining the behavior, relationship and/or position information with high relevance to the event.
Further, the method also includes: calculating the degree of overlap between the keywords representing the behavior, relationship and/or position components in the information data family and the keywords representing the behavior, relationship and/or position components in the event, performing normalization, and taking the keywords with a high normalized degree of overlap as the behavior, relationship and/or position information with high relevance to the event.
A multidimensional vector data based network information mining system, comprising: the system comprises a vectorization module, a search module, a cluster analysis module and a relevancy analysis module, wherein the vectorization module is used for vectorizing network information in a plurality of network information data sources to form a plurality of multidimensional vector data sources, the search module is used for searching the multidimensional vector data sources according to set conditions and summarizing search results, the cluster analysis module is used for carrying out cluster analysis on the summarized search results to generate an information data family set, and the relevancy analysis module is used for counting the space vector distribution of each information data family in the information data family set and obtaining the relevancy of the network data through relevancy analysis.
Further, the multidimensional vector data source is represented as DATA(a, r, p), where a is the behavior information component, r is the relationship information component, and p is the position information component.
Further, obtaining the relevance of the network data through relevance analysis includes calculating the relevance between every two information data families and determining the information data families with high relevance to the event.
Further, obtaining the relevance of the network data through relevance analysis includes counting the distribution of the behavior, relationship and position components in the information data families and determining the behavior, relationship and/or position information with high relevance to the event.
Further, the relevancy analysis module is further configured to calculate the degree of overlap between the keywords representing the behavior, relationship and/or position components in the information data family and the keywords representing the behavior, relationship and/or position components in the event, perform normalization, and take the keywords with a high normalized degree of overlap as the behavior, relationship and/or position information with high relevance to the event.
The invention has the beneficial effects that: by vectorizing the network information in the network information data source, the complexity of clustering analysis and association degree analysis operation is reduced, the rapid convergence of an information data family is ensured, the multi-angle association degree analysis is realized, and the data mining efficiency is improved. The method and the system provided by the embodiment of the invention can be used for controlling network information, for example, providing relevant information of hot spot events or possibly interested contents for users.
Drawings
FIG. 1 is a schematic diagram of the three-dimensional information space of an event according to an embodiment of the invention;
FIG. 2 is a flowchart of a network data mining method based on multidimensional vector data according to an embodiment of the invention;
FIG. 3 is a schematic diagram of a two-dimensional distribution of clustering results according to an embodiment of the invention;
FIG. 4 is a block diagram of a network data mining system based on multidimensional vector data according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings. Those skilled in the art will appreciate that the present invention is not limited to the drawings and the following examples.
The embodiment of the invention provides a network data mining method based on multi-dimensional vector data.
In the era of the mobile internet, an event that has a social impact generates a large amount of network information. Each event may contain multiple information sets, such as a behavior set, a relationship set, and a position set. Behavior refers to all purposeful human activity; it is composed of a series of simple actions and is a general term for all the actions people usually perform. Relationships refer to the interactions between people, between people and things, and between things. Position refers to a determined geographic location. The information set $e_i$ of an event can be expressed as the sum of three subsets, namely $\{\sum a_k + \sum r_n + \sum p_m\} \in e_i$, where $i, k, n, m = 1, 2, \ldots, n$; $\sum a_k$ is the behavior information subset, $\sum r_n$ is the relationship information subset, and $\sum p_m$ is the position information subset. As shown in FIG. 1, by taking behavior, relationship, and position as the X, Y, and Z axes of a three-dimensional space, the information set of an event can be described in that space. Event information is derived from information sources, and thus the event information set $e_i$ likewise comes from information sources. Each information source may contain one or more of the behavior information subset, the relationship information subset, and the position information subset.
FIG. 2 shows a network data mining method based on multidimensional vector data according to an embodiment of the present invention. As shown in FIG. 2, in step 210, network information in a plurality of network information data sources is vectorized to form a plurality of multidimensional vector data sources. Taking a three-dimensional vector data source as an example, with behavior, relationship and position as the X, Y and Z axes of a three-dimensional space, the data source may be represented as DATA(a, r, p), i.e., a point in that space. The vector from the origin of the event information to this point represents the vectorized network information. If a network information data source contains only one or two of the behavior information subset, the relationship information subset and the position information subset, the component of each missing subset is represented as 0. For example, if the data source contains a behavior information subset and a relationship information subset, the three-dimensional vector data source is represented as DATA(a, r, 0). More dimensions can be selected to construct the multidimensional vector data.
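The vectorization of step 210 can be illustrated with a minimal Python sketch. The text does not specify how behavior, relationship and position information are converted into numeric components, so the keyword vocabularies (`BEHAVIOR_IDS`, `RELATION_IDS`, `POSITION_IDS`) and the helper `vectorize` are hypothetical. Only the DATA(a, r, p) layout and the rule that a missing subset is represented as 0 come from the description.

```python
from typing import NamedTuple, Optional


class Data(NamedTuple):
    """A point DATA(a, r, p) in the behavior/relationship/position space."""
    a: float  # behavior information component (X axis)
    r: float  # relationship information component (Y axis)
    p: float  # position information component (Z axis)


# Hypothetical vocabularies mapping keywords to numeric coordinates.
# The source does not define this encoding; it is assumed for illustration.
BEHAVIOR_IDS = {"post": 1.0, "forward": 2.0, "comment": 3.0}
RELATION_IDS = {"friend": 1.0, "follower": 2.0}
POSITION_IDS = {"Beijing": 1.0, "Shanghai": 2.0}


def vectorize(behavior: Optional[str] = None,
              relation: Optional[str] = None,
              position: Optional[str] = None) -> Data:
    """Map one item of network information to DATA(a, r, p).

    A subset that the data source does not contain is represented as 0.
    In this sketch, a keyword missing from the vocabulary also maps to 0.
    """
    return Data(
        a=BEHAVIOR_IDS.get(behavior, 0.0) if behavior else 0.0,
        r=RELATION_IDS.get(relation, 0.0) if relation else 0.0,
        p=POSITION_IDS.get(position, 0.0) if position else 0.0,
    )


full = vectorize("forward", "friend", "Beijing")
partial = vectorize("post", "follower")  # no position information: p is 0
```

Here `partial` corresponds to the DATA(a, r, 0) case described above.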
In step 220, the plurality of multidimensional vector data sources are searched according to set conditions, and vector information of the plurality of network information data sources is obtained. The search may be iterative: for example, after the first search is completed, the results obtained are used as search elements for a further search. The number of iterations typically does not exceed 3. After the search is finished, the search results are summarized.
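The iterative search of step 220 might look roughly like the following sketch. The description does not define the search conditions or how results become new search elements, so this example assumes that each record carries a set of keywords and that the keywords of newly found records become the search terms of the next round. The `Source` layout and the function `iterative_search` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Vector = Tuple[float, float, float]
# Hypothetical data source: record id -> (keywords, DATA(a, r, p) vector).
Source = Dict[str, Tuple[Set[str], Vector]]

MAX_ITERATIONS = 3  # the text says iterations typically do not exceed 3


def iterative_search(sources: List[Source], seeds: Set[str],
                     max_iterations: int = MAX_ITERATIONS
                     ) -> Dict[str, Tuple[Vector, int]]:
    """Search all sources, feeding each round's hits into the next round.

    Returns {record id: (vector, iteration at which it was first found)}.
    """
    found: Dict[str, Tuple[Vector, int]] = {}
    terms = set(seeds)
    for iteration in range(1, max_iterations + 1):
        new_terms: Set[str] = set()
        for source in sources:
            for rec_id, (keywords, vector) in source.items():
                if rec_id not in found and keywords & terms:
                    found[rec_id] = (vector, iteration)
                    new_terms |= keywords - terms
        if not new_terms:  # nothing new to search for: stop early
            break
        terms = new_terms
    return found


weibo = {"w1": ({"flood", "rescue"}, (1.0, 0.0, 2.0)),
         "w2": ({"rescue", "donation"}, (2.0, 1.0, 0.0))}
news = {"n1": ({"donation", "charity"}, (3.0, 1.0, 1.0)),
        "n2": ({"football"}, (1.0, 1.0, 1.0))}
summary = iterative_search([weibo, news], {"flood"})
```

The iteration number is kept with each hit because the clustering step refers to the number of search iterations as the scale of the grid.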
In step 230, cluster analysis with various parameters is performed on the summarized search results to obtain clustering results with distributions of various forms, and a set of information data families is generated. As shown in FIG. 3, cluster analysis is performed using a density-based method and a grid-based method (with the number of search iterations used as the scale of the grid), respectively, to obtain a two-dimensionally distributed clustering result. The clustering result includes a plurality of information data families, each of which includes information elements carrying vector information.
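The description names density-based and grid-based clustering but gives no concrete algorithm. As an illustration only, the sketch below implements a simplified grid-based scheme: points are binned into cells, cells holding at least `min_points` points are treated as dense, and adjacent dense cells are merged into one family. The parameters `cell_size` and `min_points` are assumptions, and the use of the search iteration count as the grid scale is not reproduced here.

```python
from collections import defaultdict, deque
from itertools import product
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]
Cell = Tuple[int, ...]


def grid_cluster(points: List[Point], cell_size: float = 1.0,
                 min_points: int = 2) -> List[List[int]]:
    """Group point indices into families of connected dense grid cells."""
    # 1. Assign every point to the grid cell that contains it.
    cells: Dict[Cell, List[int]] = defaultdict(list)
    for idx, pt in enumerate(points):
        cells[tuple(int(c // cell_size) for c in pt)].append(idx)

    # 2. Keep only cells holding at least min_points points.
    dense = {c for c, members in cells.items() if len(members) >= min_points}

    # 3. Merge dense cells that touch (including diagonally) via BFS.
    dims = len(points[0]) if points else 0
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]
    families: List[List[int]] = []
    seen = set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, family = deque([start]), []
        while queue:
            cell = queue.popleft()
            family.extend(cells[cell])
            for off in offsets:
                nb = tuple(c + o for c, o in zip(cell, off))
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        families.append(sorted(family))
    return families


points = [(0.1, 0.2, 0.0), (0.4, 0.9, 0.0), (1.2, 0.5, 0.0), (1.8, 0.1, 0.0),
          (5.0, 5.0, 5.0), (5.5, 5.2, 5.1), (9.0, 0.0, 0.0)]
families = grid_cluster(points)
```

Points that fall in sparse cells, such as the last one in the example, are left out of every family and can be treated as noise.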
In step 240, the spatial vector distribution of each information data family is counted, and association degree analysis is performed to obtain the association of the network data. The analysis determines the information families that are highly associated with the event and the main information elements within them, so that a correct judgment can be made about the event. The degree of association between every two information data families is calculated to determine which families are highly associated with the event; the closer two information families are, the higher their degree of association. The distribution of the behavior, relationship and position components in each information data family is counted to determine which behaviors, relationships and positions are more highly associated with the event. The degree of overlap between the keywords representing the behavior/relationship/position components in the information data family and the keywords representing the behavior/relationship/position components in the event is calculated and normalized, and the keywords with a higher normalized degree of overlap are taken as the behavior/relationship/position elements more highly associated with the event.
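Step 240 states only that closer families are more highly associated and that keyword overlap is normalized; it gives no formulas. The sketch below fills these in with assumed choices: Euclidean distance between family centroids, the mapping 1 / (1 + distance) to turn distance into an association value, and normalization by the largest keyword count. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Sequence


def centroid(family: List[Sequence[float]]) -> List[float]:
    """Mean vector of the information elements in one family."""
    return [sum(axis) / len(family) for axis in zip(*family)]


def association(family_a: List[Sequence[float]],
                family_b: List[Sequence[float]]) -> float:
    """Association in (0, 1]: the closer two families, the higher the value."""
    distance = math.dist(centroid(family_a), centroid(family_b))
    return 1.0 / (1.0 + distance)


def normalized_overlap(family_keywords: Iterable[str],
                       event_keywords: Iterable[str]) -> Dict[str, float]:
    """Count how often each event keyword occurs in the family,
    then scale the counts to [0, 1] by the largest count."""
    event = set(event_keywords)
    counts = Counter(k for k in family_keywords if k in event)
    top = max(counts.values(), default=0)
    return {k: (counts[k] / top if top else 0.0) for k in event}


event_family = [(1.0, 1.0, 1.0), (3.0, 1.0, 1.0)]
near_family = [(2.0, 1.0, 4.0), (2.0, 1.0, 6.0)]
far_family = [(2.0, 1.0, 10.0)]
near_score = association(event_family, near_family)
far_score = association(event_family, far_family)

overlap = normalized_overlap(
    ["rescue", "rescue", "donation", "football"],
    ["rescue", "donation", "flood"],
)
```

Keywords whose normalized overlap is high, such as "rescue" in this example, would be taken as the elements most associated with the event.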
By vectorizing the network information in the network information data source, the complexity of clustering analysis and association degree analysis operation is reduced, the rapid convergence of an information data family is ensured, the multi-angle association degree analysis is realized, and the data mining efficiency is improved.
The embodiment of the invention also provides a network data mining system based on the multidimensional vector data.
FIG. 4 shows a network data mining system based on multidimensional vector data according to an embodiment of the present invention. As shown in FIG. 4, the network data mining system includes a vectorization module 410, configured to vectorize network information in a plurality of network information data sources to form a plurality of multidimensional vector data sources. Taking a three-dimensional vector data source as an example, with behavior, relationship and position as the X, Y and Z axes of a three-dimensional space, the data source may be represented as DATA(a, r, p), i.e., a point in that space. The vector from the origin of the event information to this point represents the vectorized network information. If a network information data source contains only one or two of the behavior information subset, the relationship information subset and the position information subset, the component of each missing subset is represented as 0. For example, if the data source contains a behavior information subset and a relationship information subset, the three-dimensional vector data source is represented as DATA(a, r, 0). More dimensions can be selected to construct the multidimensional vector data.
The network data mining system includes a search module 420, configured to search the plurality of multidimensional vector data sources according to set conditions and obtain vector information of the plurality of network information data sources. The search may be iterative: for example, after the first search is completed, the results obtained are used as search elements for a further search. The number of iterations typically does not exceed 3. After the search is finished, the search results are summarized.
The network data mining system includes a cluster analysis module 430, configured to perform cluster analysis with various parameters on the summarized search results to obtain clustering results with distributions of various forms and to generate a set of information data families. Cluster analysis is performed using a density-based method and a grid-based method (with the number of search iterations used as the scale of the grid), so as to obtain a two-dimensionally distributed clustering result. The clustering result includes a plurality of information data families, each of which includes information elements carrying vector information.
The network data mining system includes a relevancy analysis module 440, configured to count the spatial vector distribution of each information data family and perform association degree analysis to obtain the association of the network data. The analysis determines the information families that are highly associated with the event and the main information elements within them, so that a correct judgment can be made about the event. The degree of association between every two information data families is calculated to determine which families are highly associated with the event; the closer two information families are, the higher their degree of association. The distribution of the behavior, relationship and position components in each information data family is counted to determine which behaviors, relationships and positions are more highly associated with the event.
In an embodiment, the relevancy analysis module 440 is further configured to calculate the degree of overlap between the keywords representing the behavior/relationship/position components in the information data family and the keywords representing the behavior/relationship/position components in the event, and to perform normalization. The keywords with a higher normalized degree of overlap are taken as the behavior/relationship/position elements more highly associated with the event.
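The way modules 410 to 440 hand data to one another can be sketched as a small pipeline. This is only an illustration of the data flow: the class `MiningSystem` is hypothetical, each module is injected as a plain callable, and the toy lambdas stand in for real vectorization, search, clustering and analysis logic that the description does not specify.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class MiningSystem:
    """Wires the four modules (410-440) into one pipeline.

    Each module is injected as a plain callable; the concrete behavior of
    every stage is an assumption supplied by the caller.
    """
    vectorize: Callable[[Any], Any]            # vectorization module 410
    search: Callable[[List[Any]], List[Any]]   # search module 420
    cluster: Callable[[List[Any]], List[Any]]  # cluster analysis module 430
    analyze: Callable[[List[Any]], Any]        # relevancy analysis module 440

    def run(self, raw_sources: List[Any]) -> Any:
        vector_sources = [self.vectorize(src) for src in raw_sources]
        summarized = self.search(vector_sources)
        families = self.cluster(summarized)
        return self.analyze(families)


# Toy stand-ins, only to show the data flow between modules.
system = MiningSystem(
    vectorize=lambda src: [(float(len(word)), 0.0, 0.0) for word in src],
    search=lambda sources: [v for src in sources for v in src if v[0] > 3],
    cluster=lambda vectors: [[v for v in vectors if v[0] < 6],
                             [v for v in vectors if v[0] >= 6]],
    analyze=lambda families: [len(f) for f in families],
)
result = system.run([["flood", "aid"], ["rescue", "donation"]])
```

Injecting the stages keeps each module replaceable, which matches the description's point that more dimensions or other clustering parameters can be chosen.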
By vectorizing the network information in the network information data source, the complexity of clustering analysis and association degree analysis operation is reduced, the rapid convergence of an information data family is ensured, the multi-angle association degree analysis is realized, and the data mining efficiency is improved.
Embodiments of the present invention further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the above method.
Embodiments of the present invention further provide a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the program.
Those of skill in the art will understand that the logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be viewed as implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The embodiments of the present invention have been described above. However, the present invention is not limited to the above embodiment. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (4)
1. A network data mining method based on multidimensional vector data is characterized by comprising the following steps:
vectorizing network information in a plurality of network information data sources to form a plurality of multidimensional vector data sources,
searching the multi-dimensional vector data sources according to set conditions, summarizing search results,
performing cluster analysis on the summarized search results to generate an information data family set,
counting the space vector distribution of each information data family in the information data family set, obtaining the correlation of network data through correlation degree analysis, calculating the correlation degree between every two information data families, determining the information family with high correlation degree with the event and main information elements in the information families, counting the distribution conditions of behavior, relationship and position components in the information data families, and determining the behavior, relationship and/or position information with high correlation degree with the event, thereby making correct judgment on the event;
wherein behavior, relationship and position are respectively used as the X, Y and Z axes of a three-dimensional space, and the multidimensional vector data source is represented as DATA(a, r, p), where a is the behavior information component, r is the relationship information component, and p is the position information component.
2. The data mining method of claim 1, further comprising: calculating the degree of overlap between the keywords representing the behavior, relationship and/or position components in the information data family and the keywords representing the behavior, relationship and/or position components in the event, performing normalization, and taking the keywords with a high normalized degree of overlap as the behavior, relationship and/or position information highly associated with the event.
3. A system for network data mining based on multidimensional vector data, comprising:
a vectorization module for vectorizing the network information in the plurality of network information data sources to form a plurality of multidimensional vector data sources,
a searching module for searching the multi-dimensional vector data sources according to the set conditions and summarizing the searching results,
a cluster analysis module for performing cluster analysis on the summarized search results to generate an information data family set,
the association degree analysis module is used for counting the space vector distribution of each information data family in the information data family set, obtaining the association of network data through association degree analysis, calculating the association degree between every two information data families, determining the information family with high association degree with the event and main information elements in the information family, counting the distribution condition of behavior, relationship and position components in the information data families, and determining the behavior, relationship and/or position information with high association degree with the event, thereby making correct judgment on the event;
wherein behavior, relationship and position are respectively used as the X, Y and Z axes of a three-dimensional space, and the multidimensional vector data source is represented as DATA(a, r, p), where a is the behavior information component, r is the relationship information component, and p is the position information component.
4. The data mining system of claim 3, wherein the relevance analysis module is further configured to calculate a degree of overlap between a plurality of keywords representing behavior, relationship and/or location components in the information data family and a plurality of keywords representing behavior, relationship and/or location components in the event, and perform normalization processing to obtain a keyword with a high degree of overlap after normalization as behavior, relationship and/or location information with a high degree of relevance to the event.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910305243.9A CN110096529B (en) | 2019-04-16 | 2019-04-16 | Network data mining method and system based on multidimensional vector data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910305243.9A CN110096529B (en) | 2019-04-16 | 2019-04-16 | Network data mining method and system based on multidimensional vector data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110096529A (en) | 2019-08-06 |
CN110096529B (en) | 2021-07-16 |
Family
ID=67444890
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910305243.9A Active CN110096529B (en) | 2019-04-16 | 2019-04-16 | Network data mining method and system based on multidimensional vector data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110096529B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102331995A (en) * | 2011-07-08 | 2012-01-25 | 华东师范大学 | Point source-based information acquisition method and system for three-dimensional geo-information model |
CN105488628A (en) * | 2015-11-30 | 2016-04-13 | 国网天津市电力公司 | Electric power big data visualization oriented data mining method |
CN106304015A (en) * | 2015-05-28 | 2017-01-04 | 中兴通讯股份有限公司 | The determination method and device of subscriber equipment |
CN109344212A (en) * | 2018-08-24 | 2019-02-15 | 武汉中地数码科技有限公司 | A kind of geographical big data of subject-oriented feature excavates the method and system of recommendation |
CN109389158A (en) * | 2018-09-19 | 2019-02-26 | 成都城电电力工程设计有限公司 | It early can system architecture method based on the dispatching of power netwoks of data mining and human-computer interaction |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102880719A (en) * | 2012-10-16 | 2013-01-16 | 四川大学 | User trajectory similarity mining method for location-based social network |
CN103812719B (en) * | 2012-11-12 | 2018-05-18 | 华为技术有限公司 | The failure prediction method and device of group system |
CN105608219B (en) * | 2016-01-07 | 2019-06-18 | 上海通创信息技术有限公司 | A kind of streaming recommended engine, recommender system and recommended method based on cluster |
CN106599436A (en) * | 2016-12-08 | 2017-04-26 | 湖南大学 | User in-room behavior prediction method for office building |
CN106844585A (en) * | 2017-01-10 | 2017-06-13 | 广东精规划信息科技股份有限公司 | A kind of time-space relationship analysis system based on multi-source Internet of Things location aware |
CN107133632A (en) * | 2017-02-27 | 2017-09-05 | 国网冀北电力有限公司 | A kind of wind power equipment fault diagnosis method and system |
US11621969B2 (en) * | 2017-04-26 | 2023-04-04 | Elasticsearch B.V. | Clustering and outlier detection in anomaly and causation detection for computing environments |
CN108345660A (en) * | 2018-01-31 | 2018-07-31 | 山东汇贸电子口岸有限公司 | A kind of data analysing method based on government data |
CN109376185A (en) * | 2018-10-25 | 2019-02-22 | 广州市金禧信息技术服务有限公司 | Data digging system and its application under big data environment |
Also Published As
Publication number | Publication date |
---|---|
CN110096529A (en) | 2019-08-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP02 | Change in the address of a patent holder | Address after: 100102 room 16b557, 16 / F, 101, floor 4-33, building 13, District 4, Wangjing Dongyuan, Chaoyang District, Beijing; Patentee after: ZHONGKE JINLIAN (BEIJING) TECHNOLOGY Co.,Ltd.; Address before: 100102 605, 6th floor, building 13, yard 18, ziyue Road, a 1 Laiguangying middle street, Chaoyang District, Beijing; Patentee before: ZHONGKE JINLIAN (BEIJING) TECHNOLOGY Co.,Ltd. |