CN107862081A - Network Information Sources lookup method, device and server - Google Patents

Info

Publication number
CN107862081A
Authority
CN
China
Prior art keywords
phrase
matrix
probability
semantic
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711223777.4A
Other languages
Chinese (zh)
Other versions
CN107862081B (en
Inventor
肖仕刚
黄勇
陈航
宋国志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Silent Information Technology Co Ltd
Original Assignee
Sichuan Silent Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Silent Information Technology Co Ltd
Priority to CN201711223777.4A
Publication of CN107862081A
Application granted
Publication of CN107862081B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3346 Query execution using probabilistic model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention provide a network information source lookup method, device and server, relating to the field of computer security. Semantic recognition of public sentiment information and viscosity association analysis of social network nodes give the method the ability to discover information source networks and identify their properties. Compared with traditional keyword-based semantic analysis and extraction of fixed information source relationship networks, the method combines phrase probability space and semantic joint matrix partitioning, builds node semantic trees with a Naive Bayes classifier, extracts the information source network through node depth detection and vector-conversion viscosity matching, and identifies the final information source through a viscosity clustering algorithm and cross-association. For the same public sentiment characteristics, it captures information sources more accurately and reasonably, offers more analysis dimensions, analyzes social networks and identifies public sentiment characteristics in greater depth, and presents data more intuitively. The system targets its detection objects precisely, can analyze deep features of the data, detects the public sentiment source network, and readily finds information sources in social networks.

Description

Network information source lookup method, device and server
Technical field
The present invention relates to the field of computer security, and in particular to a network information source lookup method, device and server.
Background
With the rapid development of the internet, the security of ideology on the network has received unprecedented attention. As the distribution center of contemporary ideology and culture and the amplifier of social public sentiment, social networks have reached an unprecedented level of activity on the internet, and their directness, suddenness and bias make them a focus of attention for society and government and a key object of monitoring. Quickly grasping public sentiment information, accurately predicting public sentiment trends, and quickly mining and identifying the sources of public sentiment threats have become the keys to public sentiment security attack and defense. However, faced with expansion across many fields, a huge user base and a fast-changing network environment, traditional public sentiment supervision is completely helpless. At present, most public sentiment identification and analysis methods are based on traditional statistical analysis. They generally rely on a manually maintained threat keyword library, do not consider the associations between phrases, and do not analyze in depth how the information propagates or how its relevance changes over time.
Summary of the invention
In view of this, the purpose of the embodiments of the present invention is to provide a network information source lookup method, device and server, to solve the problem that the sources of threatening public sentiment information cannot be mined quickly and accurately.
An embodiment of the present invention provides a network information source lookup method, including: building a public sentiment phrase probability space from a public sentiment phrase database; extracting the phrase sequence of a single piece of public sentiment information and, in combination with the public sentiment phrase probability space, building a semantic joint probability matrix; obtaining the threat coefficient of the single piece of public sentiment information using the semantic joint probability matrix and the Naive Bayes classification algorithm, and building a node semantic tree in combination with the semantic joint probability matrix; obtaining the node network topology distribution from the social node network by a depth detection algorithm, building a bidirectional node association matrix, and calculating viscosity matching coefficients from the bidirectional node association matrix and the node semantic tree; performing vector conversion on the bidirectional node association matrix to form an initial matrix to be analyzed, and obtaining the information source network from the initial matrix using a hierarchical extraction algorithm and the viscosity matching coefficients; building an information source semantic tree for the information source network and, in combination with the node network topology distribution, drawing an information source phrase viscosity distribution map using a viscosity extension algorithm; and extracting the information source semantic feature phrases from the information source phrase viscosity distribution map using a viscosity clustering algorithm, performing association analysis on the semantic tree of each node in the information source network, and extracting the information source.
Preferably, the step of building a public sentiment phrase probability space from a public sentiment phrase database further includes: calculating the reference probability of each phrase in the public sentiment phrase database, calculating the versatility probability from the distribution of phrases in the public sentiment phrase database, and calculating the timeliness coefficient from the distribution over time of the use of each phrase in the public sentiment phrase database; and building the public sentiment phrase probability space from the reference probability, the versatility probability and the timeliness coefficient.
Preferably, the step of extracting the phrase sequence of a single piece of public sentiment information and, in combination with the public sentiment phrase probability space, building a semantic joint probability matrix further includes: extracting the phrase sequence of the single piece of public sentiment information; building a frequency matrix from the frequency with which any two phrases in the phrase sequence occur together, building a threat weight distribution matrix from the threat weight distribution, in the phrase probability space, of the public sentiment information formed by any two phrases in the phrase sequence, building an individual weight product matrix from the product of the threat weights of any two phrases in the phrase sequence, and building an individual probability matrix from the probability space characteristics of any two phrases in the phrase sequence; and combining the frequency matrix, the threat weight distribution matrix, the individual weight product matrix and the individual probability matrix to build the semantic joint probability matrix.
Preferably, the step of obtaining the threat coefficient of the single piece of public sentiment information using the semantic joint probability matrix and the Naive Bayes classification algorithm, and building a node semantic tree in combination with the semantic joint probability matrix, further includes: assessing the overall reasonableness of the single piece of public sentiment information using the conditional independence assumption, assessing its semantic reasonableness using the Markov random field chain joint probability assumption, obtaining the threat coefficient from the overall reasonableness and the semantic reasonableness, and building the node semantic tree in combination with the semantic joint probability matrix.
Preferably, the step of obtaining the node network topology distribution from the social node network by a depth detection algorithm further includes: the threat coefficient of each user is the average threat coefficient of all the public sentiment information the user has operated on, and the threat coefficient of each piece of public sentiment information is in turn converted and accumulated from the threat coefficients of the users who operate on it. Users are treated as first nodes and pieces of public sentiment information as second nodes; an edge is created whenever a user operates on a piece of public sentiment information. Diffusion is repeated in this way until the node network topology distribution is obtained.
An embodiment of the present invention also provides a network information source lookup device, including: a probability space building module, for building a public sentiment phrase probability space from a public sentiment phrase database; a probability matrix building module, for extracting the phrase sequence of a single piece of public sentiment information and, in combination with the public sentiment phrase probability space, building a semantic joint probability matrix; a node semantic tree building module, for obtaining the threat coefficient of the single piece of public sentiment information using the semantic joint probability matrix and the Naive Bayes classification algorithm, and building a node semantic tree in combination with the semantic joint probability matrix; a computing module, for obtaining the node network topology distribution from the social node network by a depth detection algorithm, building a bidirectional node association matrix, and calculating viscosity matching coefficients from the bidirectional node association matrix and the node semantic tree; an acquisition module, for performing vector conversion on the bidirectional node association matrix to form an initial matrix to be analyzed, and obtaining the information source network using a hierarchical extraction algorithm and the viscosity matching coefficients; a drawing module, for building an information source semantic tree for the information source network and, in combination with the node network topology distribution, drawing an information source phrase viscosity distribution map using a viscosity extension algorithm; and an extraction module, for extracting the information source semantic feature phrases from the information source phrase viscosity distribution map using a viscosity clustering algorithm, performing association analysis on the semantic tree of each node in the information source network, and extracting the information source.
Preferably, the probability space building module is further used for: calculating the reference probability of each phrase in the public sentiment phrase database, calculating the versatility probability from the distribution of phrases in the public sentiment phrase database, and calculating the timeliness coefficient from the distribution over time of the use of each phrase in the public sentiment phrase database; and building the public sentiment phrase probability space from the reference probability, the versatility probability and the timeliness coefficient.
Preferably, the probability matrix building module is further used for: extracting the phrase sequence of the single piece of public sentiment information; building a frequency matrix from the frequency with which any two phrases in the phrase sequence occur together, building a threat weight distribution matrix from the threat weight distribution, in the phrase probability space, of the public sentiment information formed by any two phrases in the phrase sequence, building an individual weight product matrix from the product of the threat weights of any two phrases in the phrase sequence, and building an individual probability matrix from the probability space characteristics of any two phrases in the phrase sequence; and combining the frequency matrix, the threat weight distribution matrix, the individual weight product matrix and the individual probability matrix to build the semantic joint probability matrix.
Preferably, the node semantic tree building module is further used for: assessing the overall reasonableness of the single piece of public sentiment information using the conditional independence assumption, assessing its semantic reasonableness using the Markov random field chain joint probability assumption, obtaining the threat coefficient from the overall reasonableness and the semantic reasonableness, and building the node semantic tree in combination with the semantic joint probability matrix.
An embodiment of the present invention also provides a server, including: one or more processors; and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the network information source lookup method described above.
Compared with the prior art, the network information source lookup method, device and server provided by the embodiments of the present invention use semantic recognition of public sentiment information and viscosity association analysis of social network nodes to discover information source networks and identify their properties. Compared with traditional keyword-based semantic analysis and extraction of fixed information source relationship networks, the method combines phrase probability space and semantic joint matrix partitioning, builds node semantic trees with a Naive Bayes classifier, extracts the information source network through node depth detection and vector-conversion viscosity matching, and identifies the final information source through a viscosity clustering algorithm and cross-association. For the same public sentiment characteristics, it captures information sources more accurately and reasonably, offers more analysis dimensions, analyzes social networks and identifies public sentiment characteristics in greater depth, and presents data more intuitively. The system targets its detection objects precisely, can analyze deep features of the data, detects the public sentiment source network, and readily finds information sources in social networks.
To make the above objects, features and advantages of the present invention clearer, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. The following drawings show only certain embodiments of the present invention and should therefore not be regarded as limiting its scope. A person of ordinary skill in the art can derive other related drawings from them without creative effort.
Fig. 1 is a process structure diagram of the network information source lookup method provided by an embodiment of the present invention.
Fig. 2 is a flow chart of the network information source lookup method provided by an embodiment of the present invention.
Fig. 3 is a schematic diagram of the public sentiment phrase probability space provided by an embodiment of the present invention.
Fig. 4 is a schematic diagram of the node network topology distribution provided by an embodiment of the present invention.
Fig. 5 is the information source phrase viscosity distribution map provided by an embodiment of the present invention.
Fig. 6 is a schematic diagram of how the viscosity clustering algorithm provided by an embodiment of the present invention obtains the information source.
Fig. 7 is a structural diagram of the server provided by an embodiment of the present invention.
Fig. 8 is a functional block diagram of the network information source lookup device provided by an embodiment of the present invention.
Reference numerals: 10: server; 101: processor; 102: memory; 103: bus; 104: communication interface; 200: network information source lookup device; 201: probability space building module; 202: probability matrix building module; 203: node semantic tree building module; 204: computing module; 205: acquisition module; 206: drawing module; 207: extraction module.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. The components of the embodiments described and illustrated in the drawings can generally be arranged and designed in a variety of configurations. The detailed description of the embodiments provided in the drawings is therefore not intended to limit the scope of the claimed invention, but only represents selected embodiments of it. All other embodiments obtained by those skilled in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the invention.
It should be noted that similar reference numerals and letters denote similar items in the following drawings, so once an item is defined in one drawing it does not need to be defined or explained again in subsequent drawings. In the description of the present invention, the terms "first", "second" and so on are used only to distinguish items and are not to be understood as indicating or implying relative importance.
Fig. 1 is a process structure diagram of the network information source lookup method provided by an embodiment of the present invention. The method of this embodiment is applied to a server and is used to find the information sources of threatening public sentiment information. It combines phrase probability space and semantic joint matrix partitioning, builds node semantic trees by Naive Bayes classification, extracts the information source network through node depth detection and vector-conversion viscosity matching, and identifies the final information source through a viscosity clustering algorithm and cross-association. For the same public sentiment characteristics, it captures information sources more accurately and reasonably, offers more analysis dimensions, analyzes social networks and identifies public sentiment characteristics in greater depth, and presents data more intuitively, as described in detail below.
Fig. 2 is a flow chart of the network information source lookup method provided by an embodiment of the present invention. The method is not limited to the particular order shown in Fig. 2 and described below. The flow and steps shown in Fig. 2 are described in detail below. The network information source lookup method includes:
Step S101: build a public sentiment phrase probability space from a public sentiment phrase database.
The public sentiment phrase database is a preset database that contains pre-stored threatening phrases. The reference probability of each phrase in the database is calculated, the versatility probability is calculated from the distribution of phrases in the database, and the timeliness coefficient is calculated from the distribution over time of the use of each phrase in the database.
The public sentiment phrase probability space is then built from the reference probability, the versatility probability and the timeliness coefficient. Fig. 3 shows the distribution of public sentiment phrases in the three-dimensional coordinate system formed by the reference probability, the versatility probability and the timeliness coefficient. The three-dimensional phrase probability space is partitioned by extracting the features and probability coefficients of the public sentiment phrases and is displayed as a distribution in three-dimensional space, which simplifies the later analysis of public sentiment semantic features and improves the accuracy of feature identification.
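The three coordinates of step S101 can be sketched as follows. The text above does not give formulas for them, so every definition here is an assumption made for illustration: the reference probability is taken as a phrase's share of all phrase occurrences, the versatility probability as the share of messages containing the phrase, and the timeliness coefficient as an exponential decay of the time since the phrase was last used. `PhrasePoint`, `build_probability_space` and `half_life_days` are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PhrasePoint:
    """One phrase as a point in the three-dimensional probability space."""
    reference: float    # share of all phrase occurrences (assumed definition)
    versatility: float  # share of messages containing the phrase (assumed)
    timeliness: float   # decay of the most recent use, in (0, 1] (assumed)


def build_probability_space(messages, now, half_life_days=30.0):
    """messages: list of (day, [phrases]) pairs; `now` is the current day."""
    occurrences = Counter()
    containing = Counter()
    last_used = {}
    for day, phrases in messages:
        occurrences.update(phrases)
        containing.update(set(phrases))
        for phrase in phrases:
            last_used[phrase] = max(day, last_used.get(phrase, day))
    total = sum(occurrences.values())
    space = {}
    for phrase, count in occurrences.items():
        age = now - last_used[phrase]
        space[phrase] = PhrasePoint(
            reference=count / total,
            versatility=containing[phrase] / len(messages),
            timeliness=0.5 ** (age / half_life_days),
        )
    return space
```

Each phrase becomes one point with three coordinates, which is the kind of data a plot like Fig. 3 would need. A real system would likely replace the decay rule with whatever the usage-time distribution of the database suggests.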
Step S102: extract the phrase sequence of a single piece of public sentiment information and, in combination with the public sentiment phrase probability space, build a semantic joint probability matrix.
From all the public sentiment information to be analyzed, the phrase sequence of each single piece of public sentiment information is extracted.
A frequency matrix is built from the frequency with which any two phrases in the phrase sequence occur together. A threat weight distribution matrix is built from the threat weight distribution, in the phrase probability space, of the public sentiment information formed by any two phrases in the phrase sequence. An individual weight product matrix is built from the product of the threat weights of any two phrases in the phrase sequence. An individual probability matrix is built from the probability space characteristics of any two phrases in the phrase sequence.
Finally, the frequency matrix, the threat weight distribution matrix, the individual weight product matrix and the individual probability matrix are combined to build the semantic joint probability matrix.
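A minimal sketch of how the four matrices of step S102 might be assembled is shown below. The text does not say how they are combined, so the element-wise product used here is an assumption, as is the use of sentence-like windows for counting co-occurrence. The function names and the `pair_threat` input (the threat weight distribution matrix, taken as given) are hypothetical.

```python
from itertools import combinations


def cooccurrence_matrix(phrases, windows):
    """Count, for every phrase pair, how many windows contain both."""
    index = {p: i for i, p in enumerate(phrases)}
    n = len(phrases)
    freq = [[0] * n for _ in range(n)]
    for window in windows:
        present = sorted({index[p] for p in window if p in index})
        for i, j in combinations(present, 2):
            freq[i][j] += 1
            freq[j][i] += 1
    return freq


def outer(values):
    """Pairwise product matrix of per-phrase values."""
    return [[a * b for b in values] for a in values]


def semantic_joint_matrix(freq, pair_threat, weights, probs):
    """Element-wise product of the four matrices (assumed combination rule)."""
    weight_prod = outer(weights)  # individual weight product matrix
    prob_prod = outer(probs)      # individual probability matrix
    n = len(freq)
    return [
        [freq[i][j] * pair_threat[i][j] * weight_prod[i][j] * prob_prod[i][j]
         for j in range(n)]
        for i in range(n)
    ]
```

With this rule, a pair of phrases that never occur together contributes zero to the joint matrix regardless of their individual threat weights.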
Step S103: obtain the threat coefficient of the single piece of public sentiment information using the semantic joint probability matrix and the Naive Bayes classification algorithm, and build a node semantic tree in combination with the semantic joint probability matrix.
The overall reasonableness of the single piece of public sentiment information is assessed using the conditional independence assumption, and its semantic reasonableness is assessed using the Markov random field (Markov Random Field, MRF) chain joint probability assumption. The threat coefficient is obtained from the overall reasonableness and the semantic reasonableness, and the node semantic tree is built in combination with the semantic joint probability matrix.
Specifically, the Naive Bayes classification algorithm states that posterior probability = standardized likelihood × prior probability. Let the public sentiment information be D, whose threat characteristic is derived from the N phrases that make it up, and let $H^{+}$ denote threat information. The Naive Bayes classification algorithm can then be written as $P(H^{+} \mid D) \propto P(H^{+})\,P(D \mid H^{+})$. Because D consists of N phrases, its overall reasonableness and semantic reasonableness are assessed in the following two ways.
Conditional independence assumption: we assume that the phrases making up the public sentiment information do not directly affect one another, and judge the overall reasonableness from their joint probability:
$P(H^{+} \mid D) \propto P(H^{+})\,P(N_1 \mid H^{+})\,P(N_2 \mid H^{+}) \cdots P(N_n \mid H^{+})$. The prior condition of being threatening is replaced by the threat probability of each phrase, finally giving the overall reasonableness coefficient.
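The conditional independence formula above can be evaluated as in the sketch below. It works in log space to avoid underflow on long phrase sequences. The function name, the `threat_prob` lookup table and the `floor` value used for phrases missing from the table are assumptions made for illustration.

```python
import math


def overall_reasonableness(phrases, prior, threat_prob, floor=1e-6):
    """Log of P(H+) * prod_i P(N_i | H+); unseen phrases use `floor`."""
    score = math.log(prior)
    for phrase in phrases:
        score += math.log(threat_prob.get(phrase, floor))
    return score
```

Because the result is only proportional to the posterior, scores are meaningful for comparing messages with each other, not as calibrated probabilities.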
MRF chain joint probability assumption: by the MRF chain principle, the value of each state in a state sequence depends on the N states before it. Applied to public sentiment information, each phrase is related to the N phrases before it, which matches the semantic characteristics of language. We therefore assume N = 1, which gives:
$P(H^{+} \mid D) \propto P(H^{+})\,P(N_1)\,P(N_2 \mid N_1)\,P(N_3 \mid N_2) \cdots P(N_n \mid N_{n-1})$. The threat probability of the current phrase is formed from the joint probability of the combined phrases, finally giving the semantic reasonableness coefficient.
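The first-order chain formula above can be sketched in the same style. The `unigram` and `bigram` tables, the function name and the `floor` fallback for unseen entries are hypothetical; the text does not say how the conditional probabilities are estimated.

```python
import math


def semantic_reasonableness(phrases, prior, unigram, bigram, floor=1e-6):
    """Log of P(H+) * P(N_1) * prod_i P(N_i | N_{i-1}) for a first-order chain."""
    if not phrases:
        return math.log(prior)
    score = math.log(prior) + math.log(unigram.get(phrases[0], floor))
    # Each phrase after the first is conditioned only on its predecessor.
    for prev, cur in zip(phrases, phrases[1:]):
        score += math.log(bigram.get((prev, cur), floor))
    return score
```

Unlike the conditional independence score, this one is sensitive to phrase order, since the bigram table is keyed by ordered pairs.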
The threat coefficient is obtained from the overall reasonableness and the semantic reasonableness. When the threat coefficient meets a preset threat condition, the node semantic tree is built from the semantic joint probability matrix.
Step S104: obtain the node network topology distribution from the social node network by a depth detection algorithm, build a bidirectional node association matrix, and calculate viscosity matching coefficients from the bidirectional node association matrix and the node semantic tree.
Step S103 yields the reasonableness coefficient (threat coefficient) of every piece of public sentiment information. Following a "heat transfer" model, the threat coefficient of each user is the average threat coefficient of all the public sentiment information the user has operated on, and the threat coefficient of each piece of public sentiment information is in turn converted and accumulated from the threat coefficients of the users who operate on it. Users are treated as first nodes and pieces of public sentiment information as second nodes; an edge is created whenever a user operates on a piece of public sentiment information. Diffusion is repeated in this way until the node network topology distribution is obtained. Fig. 4 is a schematic diagram of the node network topology distribution provided by an embodiment of the present invention. In it, X1, X2 and X3 are first nodes representing users; Y1, Y2, Y3 and Y4 are second nodes representing public sentiment information; b11 to b34 are the edges between users and public sentiment information, indicating that a user can establish a relationship with each piece of public sentiment information; and a12, a21 and a23 indicate that public sentiment information can be transmitted among the three users.
Interaction between nodes arises through public sentiment information, and the threat characteristics of nodes and of public sentiment information are determined by how they spread in the network. The approach is targeted at public sentiment information in the social network that has a high threat level and a wide diffusion range; the algorithm is accurate, fast to compute and highly parallelizable.
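The user and message network of step S104 can be sketched with plain dictionaries. The averaging rule for a user's threat coefficient follows the text. The rule for linking two users, which here connects any two users who operated on the same message and ignores direction, is an assumption, and the feedback from users back to message threat coefficients is left out. Function names are hypothetical.

```python
from collections import defaultdict


def user_threat_coefficients(edges, message_threat):
    """edges: (user, message) pairs, one per operation on a message.

    Returns each user's threat coefficient as the mean threat coefficient
    of the distinct messages that user operated on.
    """
    operated = defaultdict(set)
    for user, message in edges:
        operated[user].add(message)
    return {
        user: sum(message_threat[m] for m in msgs) / len(msgs)
        for user, msgs in operated.items()
    }


def user_links(edges):
    """Connect two users when both operated on the same message (assumed)."""
    by_message = defaultdict(set)
    for user, message in edges:
        by_message[message].add(user)
    links = set()
    for users in by_message.values():
        ordered = sorted(users)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                links.add((a, b))
    return links
```

The `edges` list plays the role of the b-edges in Fig. 4, and the output of `user_links` is an undirected stand-in for the a-edges between users.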
Step S105: perform vector conversion on the bidirectional node association matrix to form an initial matrix to be analyzed, and obtain the information source network using a hierarchical extraction algorithm and the viscosity matching coefficients.
Vector conversion is first applied to the bidirectional node association matrix to form the initial matrix to be analyzed. Row i and column j of the initial matrix represent node users; taking four users as an example, the initial matrix has four rows and four columns.
Each entry $\alpha_{ij}$ of the matrix is a viscosity matching coefficient. The k node users with the largest viscosity matching coefficients are first selected as seed nodes. For each seed node, the nodes its associations point to are found, forming a set of candidate parent nodes. From the candidate parent nodes of the k seed nodes, the node N with the highest frequency and viscosity matching coefficient is extracted, and a node tree based on N is created according to the direction of N's associations. All child nodes associated with N are found and, combining the seed node features, the child nodes similar to the seed nodes are extracted by semantic clustering analysis. For each seed node, its child nodes are obtained, the strongly associated child nodes are extracted by semantic clustering analysis of the seed node and its child nodes, and the node tree is redrawn. This continues until weak associations cause the node tree to converge and close, producing the final information source network.
The viscosity matching coefficient accurately represents the linkage between nodes; with its support, the hierarchical extraction algorithm can extract the hierarchical structure of the network with low error. For a social network with hierarchical structure, key nodes and key networks are extracted by analyzing the network topology, reducing the network to a classification tree with hierarchical structure, so that its data structure can support more data analysis models.
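The first two moves of the hierarchical extraction in step S105, choosing seed nodes and picking the node N from their candidate parents, can be sketched as follows. The per-node score `node_score` (for example a row sum of the coefficient matrix), the `points_to` adjacency map and both function names are assumptions; the semantic clustering and convergence steps are not covered.

```python
from collections import Counter


def select_seeds(node_score, k):
    """Return the k nodes with the largest viscosity matching coefficient."""
    return sorted(node_score, key=lambda n: (-node_score[n], n))[:k]


def pick_root(seeds, points_to, node_score):
    """Choose the candidate parent referenced by the most seeds.

    Ties are broken by the larger viscosity matching coefficient, then by name.
    """
    votes = Counter()
    for seed in seeds:
        votes.update(set(points_to.get(seed, ())))
    if not votes:
        return None
    return min(votes, key=lambda n: (-votes[n], -node_score.get(n, 0.0), n))
```

Ordering by frequency first and coefficient second is one reading of "highest frequency and viscosity matching coefficient"; a weighted combination of the two would be an equally plausible choice.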
Step S106: build an information source semantic tree for the information source network, and draw the information source phrase viscosity distribution map using the viscosity extension algorithm in combination with the node network topology distribution.
A node semantic tree is extracted for every node in the information source network, and together these form the information source semantic tree. In combination with the node network topology distribution, the information source phrase viscosity distribution map is then drawn using the viscosity extension algorithm.
Fig. 5 shows the information source phrase viscosity distribution map provided by an embodiment of the present invention. The map is based on a two-dimensional plane coordinate system. The root node of the information source network is first placed at the coordinate origin O, and a deviation angle is calculated from the number of its child nodes to construct reference vectors. The viscosity matching coefficient between each child node and the root node determines the child node's position offset along its reference vector; the viscosity matching coefficients of that child node's associated child nodes then determine its position offset along the vector perpendicular to the reference vector. Together these fix the position of the child node in the coordinate system, for example point O1. The same algorithm is applied to all child node trees to extend and complete the information source phrase viscosity distribution map.
The viscosity extension algorithm converts the semantic tree into a scatter distribution of nodes in two-dimensional space. By building multi-level coordinate systems according to the levels of the node tree and determining each node's differential position from the reference vectors, it distributes the nodes in an accurate and reasonable state in accordance with their viscosity matching coefficients.
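One level of this placement can be sketched in a few lines. The text does not give the formulas, so the following is an illustration under explicit assumptions: the deviation angle is 2π divided by the number of children, the offset along the reference vector is `1 - α` (stickier nodes sit closer to the root), and the perpendicular offset is a scaled mean of the coefficients of the child's own associated nodes. The function name and the `perp_scale` parameter are hypothetical.

```python
import math


def layout_children(root_alpha, grandchild_alpha, perp_scale=0.1):
    """Place the children of a root located at the origin O.

    root_alpha: dict child -> viscosity coefficient with the root.
    grandchild_alpha: dict child -> list of coefficients of its own children.
    """
    names = sorted(root_alpha)
    if not names:
        return {}
    step = 2 * math.pi / len(names)  # assumed deviation angle between children
    positions = {}
    for idx, name in enumerate(names):
        theta = idx * step
        ux, uy = math.cos(theta), math.sin(theta)  # reference vector
        vx, vy = -uy, ux                           # perpendicular vector
        radial = 1.0 - root_alpha[name]            # assumed: sticky means close
        kids = grandchild_alpha.get(name, [])
        lateral = perp_scale * (sum(kids) / len(kids) if kids else 0.0)
        positions[name] = (radial * ux + lateral * vx,
                           radial * uy + lateral * vy)
    return positions


# Hypothetical data: two children of the root, one of which has two children.
positions = layout_children({"n1": 0.8, "n2": 0.5}, {"n1": [0.4, 0.6]})
```

Applying the same function recursively, with each child acting as the origin of its own local coordinate system, would give the multi-level coordinate systems the paragraph describes.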
Step S107: extract information source semantic characteristic phrases from the information source phrase viscosity distribution map using the viscosity clustering algorithm, perform association analysis on the semantic tree of each node in the information source network, and extract the information source.
In the phrase viscosity distribution map, as shown in Fig. 6, a normal distribution circle is drawn for every phrase node, centered on that node and containing as many surrounding nodes as possible, for example g(m11), g(m22) and g(m33), while ensuring that the ratio of the number of nodes to the circle radius meets a threshold. K normal distribution circles are extracted in this way. If the center nodes of two circles are contained in each other's circles, the circles are fused into a new normal distribution circle; for example, g(m22) and g(m33) are fused into a new normal distribution circle, and the new center node is marked as the combination of the center node phrases of the two original circles. Finally, the node phrases within the normal distribution circles are extracted to form the information source semantic characteristic phrases. Using these phrases, association analysis is performed on the node semantic tree of each node in the information source network, nodes whose similarity does not satisfy the condition are deleted, and the final information source is extracted.
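The circle extraction and fusion can be illustrated with a simplified sketch. It assumes a single fixed radius for every circle (the text instead lets each circle grow to contain the most nodes), so that "mutually contained centers" reduces to each center being a member of the other circle. The function names, parameters and sample points are hypothetical.

```python
import math


def dense_circles(points, radius, min_density, k):
    """Return up to k (center, members) circles meeting the density threshold."""
    circles = []
    for name, (x, y) in points.items():
        members = {
            other for other, (ox, oy) in points.items()
            if math.hypot(ox - x, oy - y) <= radius
        }
        # Ratio of node count to circle radius must meet the threshold.
        if len(members) / radius >= min_density:
            circles.append((name, members))
    circles.sort(key=lambda c: (-len(c[1]), c[0]))
    return circles[:k]


def merge_mutual(circles):
    """Fuse circles whose centers lie inside each other's circle."""
    groups = []  # each entry: (list of fused centers, union of members)
    for center, members in circles:
        for centers, union in groups:
            # Equal radii: c inside this circle implies the reverse as well.
            if any(c in members for c in centers):
                centers.append(center)
                union |= members
                break
        else:
            groups.append(([center], set(members)))
    return groups


# Hypothetical phrase positions taken from a viscosity distribution map.
points = {"a": (0, 0), "b": (0.5, 0), "c": (0, 0.5),
          "d": (5, 5), "e": (5.4, 5), "f": (20, 20)}
groups = merge_mutual(dense_circles(points, radius=1.0, min_density=2.0, k=4))
```

Each resulting group keeps the list of fused center phrases, mirroring the way the new center node is marked as a combination of the original centers.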
In other embodiments, the timeliness diffusion model of the information source network and the public opinion phrase interaction transformation model may also be combined to analyze how the information source network itself changes, so as to identify information source phrases whose threat coefficient is increasing and include them in the scope of information source detection and analysis.
Referring to Fig. 7, a schematic structural diagram of the server 10 provided by an embodiment of the present invention is shown. The server 10 may be a computer or any other computing device with data processing capability, and includes a processor 101, a memory 102, a bus 103 and a communication interface 104; the processor 101, the communication interface 104 and the memory 102 are connected via the bus 103. The processor 101 is configured to execute executable modules, such as computer programs, stored in the memory 102.
The memory 102 may include high-speed random access memory (RAM), and may also include non-volatile memory, for example at least one disk storage. The communication connection between this system network element and at least one other network element is implemented through at least one communication interface 104 (which may be wired or wireless).
The bus 103 may be an ISA bus, a PCI bus, an EISA bus, or the like. It is represented in Fig. 7 by only one double-headed arrow, but this does not mean that there is only one bus or only one type of bus.
The memory 102 is used to store programs, such as the network information source lookup device 200 shown in Fig. 8. The network information source lookup device 200 includes at least one software function module that can be stored in the memory 102 in the form of software or firmware, or solidified in the operating system (OS) of the server 10. After receiving an execution instruction, the processor 101 executes the program to implement the network information source lookup method disclosed in the embodiments of the present invention.
The processor 101 may be an integrated circuit chip with signal processing capability. In implementation, each step of the above method can be completed by an integrated logic circuit of hardware in the processor 101 or by instructions in the form of software. The processor 101 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Referring to Fig. 8, a functional block diagram of the network information source lookup device 200 provided by an embodiment of the present invention is shown. The network information source lookup device 200 includes a probability space construction module 201, a probability matrix construction module 202, a node semantic tree construction module 203, a calculation module 204, an acquisition module 205, a drawing module 206 and an extraction module 207.
The probability space construction module 201 is configured to build a public opinion phrase probability space according to a public opinion phrase database.
In the embodiment of the present invention, the probability space construction module 201 may perform step S101.
The probability matrix construction module 202 is configured to extract the phrase sequence of a single piece of public opinion information and, in combination with the public opinion phrase probability space, build a semantic joint probability matrix.
In the embodiment of the present invention, the probability matrix construction module 202 may perform step S102.
The node semantic tree construction module 203 is configured to obtain the threat coefficient of the single piece of public opinion information using the semantic joint probability matrix and a naive Bayes classification algorithm, and to build a node semantic tree in combination with the semantic joint probability matrix.
In the embodiment of the present invention, the node semantic tree construction module 203 may perform step S103.
The calculation module 204 is configured to obtain the node network topology distribution from the social node network by a depth probing algorithm, build a bidirectional node association matrix, and calculate the viscosity matching coefficient according to the bidirectional node association matrix and the node semantic tree.
In the embodiment of the present invention, the calculation module 204 may perform step S104.
The acquisition module 205 is configured to perform vector conversion on the bidirectional node association matrix to form an initial matrix to be analyzed, and to obtain the information source network using the hierarchical extraction algorithm and the viscosity matching coefficient.
In the embodiment of the present invention, the acquisition module 205 may perform step S105.
The drawing module 206 is configured to build an information source semantic tree for the information source network, and to draw the information source phrase viscosity distribution map using the viscosity extension algorithm in combination with the node network topology distribution.
In the embodiment of the present invention, the drawing module 206 may perform step S106.
The extraction module 207 is configured to extract information source semantic characteristic phrases from the information source phrase viscosity distribution map using the viscosity clustering algorithm, perform association analysis on the semantic tree of each node in the information source network, and extract the information source.
In the embodiment of the present invention, the extraction module 207 may perform step S107.
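The correspondence between modules 201 to 207 and steps S101 to S107 can be pictured as a sequential pipeline in which each module consumes the output of the previous one. The sketch below only illustrates that wiring: the `Module` record, the `MODULES` table, `run_pipeline` and the tracing handlers are hypothetical names and stand-ins, not the disclosed implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    ref: int    # reference numeral used in Fig. 8
    name: str
    step: str   # method step the module may perform


MODULES = [
    Module(201, "probability space construction", "S101"),
    Module(202, "probability matrix construction", "S102"),
    Module(203, "node semantic tree construction", "S103"),
    Module(204, "calculation", "S104"),
    Module(205, "acquisition", "S105"),
    Module(206, "drawing", "S106"),
    Module(207, "extraction", "S107"),
]


def run_pipeline(handlers, data):
    """Run one handler per step, in module order, threading the data through."""
    for module in MODULES:
        data = handlers[module.step](data)
    return data


def make_tracer(step):
    # Stand-in handler that only records which step ran.
    return lambda trace: trace + [step]


handlers = {m.step: make_tracer(m.step) for m in MODULES}
trace = run_pipeline(handlers, [])
```

Replacing each tracer with a real implementation of the corresponding step would keep the same control flow.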
In summary, the network information source lookup method, device and server provided by the embodiments of the present invention discover and characterize information source networks through semantic recognition of public opinion information and viscosity association analysis of social network nodes. Compared with traditional keyword-based semantic analysis and extraction of relational networks around fixed information sources, the method combines a phrase probability space and semantic joint matrix partitioning, a naive Bayes classifier for building node semantic trees, node depth probing and vector-conversion viscosity matching for extracting the information source network, and a viscosity clustering algorithm with cross-association for identifying the final information source. As a result, it captures information sources more accurately and reasonably and, for the same public opinion characteristics, offers more analysis dimensions, deeper social network analysis and public opinion characteristic identification, and more intuitive data presentation. The system has well-targeted detection objects, can analyze deep-level features of the data, can detect the source network of public opinion, and can readily find information sources in social networks.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architecture, functions and operations of devices, methods and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, program segment or portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each block in the block diagrams and/or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.
If the functions are implemented in the form of software function modules and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc. It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent substitution, improvement and the like made within the spirit and principles of the present invention shall fall within its scope of protection. It should be noted that similar reference numerals and letters denote similar items in the accompanying drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
The foregoing is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall be covered by the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of the claims.

Claims (10)

  1. A network information source lookup method, characterized by comprising:
    building a public opinion phrase probability space according to a public opinion phrase database;
    extracting the phrase sequence of a single piece of public opinion information and, in combination with the public opinion phrase probability space, building a semantic joint probability matrix;
    obtaining the threat coefficient of the single piece of public opinion information using the semantic joint probability matrix and a naive Bayes classification algorithm, and building a node semantic tree in combination with the semantic joint probability matrix;
    obtaining a node network topology distribution from a social node network by a depth probing algorithm, building a bidirectional node association matrix, and calculating a viscosity matching coefficient according to the bidirectional node association matrix and the node semantic tree;
    performing vector conversion on the bidirectional node association matrix to form an initial matrix to be analyzed, and obtaining an information source network using a hierarchical extraction algorithm and the viscosity matching coefficient;
    building an information source semantic tree for the information source network, and drawing an information source phrase viscosity distribution map using a viscosity extension algorithm in combination with the node network topology distribution;
    extracting information source semantic characteristic phrases from the information source phrase viscosity distribution map using a viscosity clustering algorithm, performing association analysis on the semantic tree of each node in the information source network, and extracting the information source.
  2. The network information source lookup method according to claim 1, characterized in that the step of building a public opinion phrase probability space according to a public opinion phrase database further comprises:
    calculating a reference probability of each phrase in the public opinion phrase database, calculating a versatility probability according to the distribution state of the phrases in the public opinion phrase database, and calculating a timeliness coefficient according to the usage time distribution of each phrase in the public opinion phrase database;
    building the public opinion phrase probability space according to the reference probability, the versatility probability and the timeliness coefficient.
  3. The network information source lookup method according to claim 1 or 2, characterized in that the step of extracting the phrase sequence of a single piece of public opinion information and, in combination with the public opinion phrase probability space, building a semantic joint probability matrix further comprises:
    extracting the phrase sequence of the single piece of public opinion information;
    building a frequency matrix according to the co-occurrence frequency of any two phrases in the phrase sequence, building a threat weight distribution matrix according to the threat weight distribution, in the phrase probability space, of the public opinion information formed by any two phrases in the phrase sequence, building an individual weight product matrix according to the product of the individual threat weights of any two phrases in the phrase sequence, and building an individual probability matrix according to the individual probability space characteristics of any two phrases in the phrase sequence;
    building the semantic joint probability matrix by combining the frequency matrix, the threat weight distribution matrix, the individual weight product matrix and the individual probability matrix.
  4. The network information source lookup method according to claim 3, characterized in that the step of obtaining the threat coefficient of the single piece of public opinion information using the semantic joint probability matrix and a naive Bayes classification algorithm, and building a node semantic tree in combination with the semantic joint probability matrix, further comprises:
    assessing the overall reasonableness of the single piece of public opinion information using the conditional independence assumption, assessing the semantic reasonableness of the single piece of public opinion information using the Markov random field chain joint probability assumption, obtaining the threat coefficient according to the overall reasonableness and the semantic reasonableness, and building the node semantic tree in combination with the semantic joint probability matrix.
  5. The network information source lookup method according to claim 4, characterized in that the step of obtaining a node network topology distribution from a social node network by a depth probing algorithm further comprises:
    the threat coefficient of each user being the average of the threat coefficients of all public opinion information the user has operated on, and the threat coefficient of each piece of public opinion information being obtainable by cumulative conversion from the threat coefficients of the users who operate on it; with users as first nodes and public opinion information as second nodes, an edge is generated whenever a user operates on a piece of public opinion information, and diffusion proceeds cyclically in this way to finally obtain the node network topology distribution.
  6. A network information source lookup device, characterized by comprising:
    a probability space construction module, configured to build a public opinion phrase probability space according to a public opinion phrase database;
    a probability matrix construction module, configured to extract the phrase sequence of a single piece of public opinion information and, in combination with the public opinion phrase probability space, build a semantic joint probability matrix;
    a node semantic tree construction module, configured to obtain the threat coefficient of the single piece of public opinion information using the semantic joint probability matrix and a naive Bayes classification algorithm, and to build a node semantic tree in combination with the semantic joint probability matrix;
    a calculation module, configured to obtain a node network topology distribution from a social node network by a depth probing algorithm, build a bidirectional node association matrix, and calculate a viscosity matching coefficient according to the bidirectional node association matrix and the node semantic tree;
    an acquisition module, configured to perform vector conversion on the bidirectional node association matrix to form an initial matrix to be analyzed, and to obtain an information source network using a hierarchical extraction algorithm and the viscosity matching coefficient;
    a drawing module, configured to build an information source semantic tree for the information source network, and to draw an information source phrase viscosity distribution map using a viscosity extension algorithm in combination with the node network topology distribution;
    an extraction module, configured to extract information source semantic characteristic phrases from the information source phrase viscosity distribution map using a viscosity clustering algorithm, perform association analysis on the semantic tree of each node in the information source network, and extract the information source.
  7. The network information source lookup device according to claim 6, characterized in that the probability space construction module is further configured to: calculate a reference probability of each phrase in the public opinion phrase database, calculate a versatility probability according to the distribution of the phrases in the public opinion phrase database, and calculate a timeliness coefficient according to the usage time distribution of each phrase in the public opinion phrase database;
    and build the public opinion phrase probability space according to the reference probability, the versatility probability and the timeliness coefficient.
  8. The network information source lookup device according to claim 6 or 7, characterized in that the probability matrix construction module is further configured to: extract the phrase sequence of the single piece of public opinion information;
    build a frequency matrix according to the co-occurrence frequency of any two phrases in the phrase sequence, build a threat weight distribution matrix according to the threat weight distribution, in the phrase probability space, of the public opinion information formed by any two phrases in the phrase sequence, build an individual weight product matrix according to the product of the individual threat weights of any two phrases in the phrase sequence, and build an individual probability matrix according to the individual probability space characteristics of any two phrases in the phrase sequence;
    and build the semantic joint probability matrix by combining the frequency matrix, the threat weight distribution matrix, the individual weight product matrix and the individual probability matrix.
  9. The network information source lookup device according to claim 8, characterized in that the node semantic tree construction module is further configured to: assess the overall reasonableness of the single piece of public opinion information using the conditional independence assumption, assess the semantic reasonableness of the single piece of public opinion information using the Markov random field chain joint probability assumption, obtain the threat coefficient according to the overall reasonableness and the semantic reasonableness, and build the node semantic tree in combination with the semantic joint probability matrix.
  10. A server, characterized by comprising:
    one or more processors;
    a memory, configured to store one or more programs,
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-5.
CN201711223777.4A 2017-11-29 2017-11-29 Network information source searching method and device and server Active CN107862081B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711223777.4A CN107862081B (en) 2017-11-29 2017-11-29 Network information source searching method and device and server

Publications (2)

Publication Number Publication Date
CN107862081A true CN107862081A (en) 2018-03-30
CN107862081B CN107862081B (en) 2021-07-16

Family

ID=61704267

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711223777.4A Active CN107862081B (en) 2017-11-29 2017-11-29 Network information source searching method and device and server

Country Status (1)

Country Link
CN (1) CN107862081B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508376A (en) * 2020-11-30 2021-03-16 中国科学院深圳先进技术研究院 Index system construction method
CN112861956A (en) * 2021-02-01 2021-05-28 浪潮云信息技术股份公司 Water pollution model construction method based on data analysis

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508376A (en) * 2020-11-30 2021-03-16 中国科学院深圳先进技术研究院 Index system construction method
CN112861956A (en) * 2021-02-01 2021-05-28 浪潮云信息技术股份公司 Water pollution model construction method based on data analysis

Also Published As

Publication number Publication date
CN107862081B (en) 2021-07-16

Similar Documents

Publication Publication Date Title
Wang et al. Extreme clustering–a clustering method via density extreme points
CN108334574B (en) Cross-modal retrieval method based on collaborative matrix decomposition
CN111612041B (en) Abnormal user identification method and device, storage medium and electronic equipment
CN107153713A (en) Overlapping community detection method and system based on similarity between nodes in social networks
WO2019041521A1 (en) Apparatus and method for extracting user keyword, and computer-readable storage medium
CN106776562A (en) Keyword extraction method and extraction system
CN108664574A (en) Information input method, terminal device and medium
CN106503086A (en) Distributed local outlier detection method
CN104239553A (en) Entity recognition method based on Map-Reduce framework
CN112148843B (en) Text processing method and device, terminal equipment and storage medium
JP5057474B2 (en) Method and system for calculating competition index between objects
Lian et al. Reverse skyline search in uncertain databases
Xiong et al. Affective impression: Sentiment-awareness POI suggestion via embedding in heterogeneous LBSNs
Yuan et al. Privacy‐preserving mechanism for mixed data clustering with local differential privacy
CN107862081A (en) Network Information Sources lookup method, device and server
CN110019763B (en) Text filtering method, system, equipment and computer readable storage medium
Yuan et al. Research of deceptive review detection based on target product identification and metapath feature weight calculation
CN114092729A (en) Heterogeneous electricity consumption data publishing method based on cluster anonymization and differential privacy protection
Liu et al. A network-based CNN model to identify the hidden information in text data
Tijare et al. Correlation between k-means clustering and topic modeling methods on twitter datasets
US20200142910A1 (en) Data clustering apparatus and method based on range query using cf tree
CN113988878A (en) Graph database technology-based anti-fraud method and system
WO2021142968A1 (en) Multilingual-oriented semantic similarity calculation method for general place names, and application thereof
Hong et al. High-quality noise detection for knowledge graph embedding with rule-based triple confidence
Wang et al. Edcleaner: Data cleaning for entity information in social network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant