CN107862081A - Network Information Sources lookup method, device and server - Google Patents
Network Information Sources lookup method, device and server
- Publication number: CN107862081A
- Application number: CN201711223777.4A
- Authority: CN (China)
- Prior art keywords: phrase, matrix, probability, semantic, node
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3334—Selection or weighting of terms from queries, including natural language queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An embodiment of the present invention provides a method, device and server for finding network information sources, relating to the field of computer security. Semantic recognition of public opinion information and viscosity association analysis of social network nodes give the method the ability to discover information source networks and identify their nature. Compared with traditional keyword-based semantic analysis and information source extraction from a fixed relational network, the method combines a phrase probability space with a semantic joint matrix partitioning method, builds node semantic trees with a naive Bayes classifier, extracts the information source network through node depth probing and vector-conversion viscosity matching, and identifies the final information sources with a viscosity clustering algorithm and cross-association. It captures information sources more accurately and reasonably and, working from the same public opinion characteristics, offers diverse analysis dimensions, in-depth social network analysis and public opinion characteristic identification, and more intuitive data representation. The system is well targeted at its detection objects, can analyze deep features of the data, detects the network in which public opinion originates, and makes social network information sources easy to find.
Description
Technical field
The present invention relates to the field of computer security, and in particular to a method, device and server for finding network information sources.
Background
With the rapid development of the Internet, network ideological security has received unprecedented attention. As a distribution center for contemporary ideology and culture and an amplifier of public opinion, social networks have reached an unprecedented level of activity on the Internet. Their directness, suddenness and bias have made them a key object of attention and monitoring for society and government. Rapidly grasping public opinion information, accurately predicting public opinion trends, and quickly mining and identifying the sources of public opinion threats have become the key to the offense and defense of public opinion security. However, faced with expansion across many domains, a huge user base and a rapidly changing network environment, traditional forms of public opinion supervision are helpless. At present, most public opinion identification and analysis is based on traditional statistical analysis and usually relies on a manually maintained library of threat keywords. It does not consider the associations between phrases, nor does it deeply consider or analyze their propagation and timeliness.
Summary of the invention
In view of this, the purpose of the embodiments of the present invention is to provide a method, device and server for finding network information sources, so as to solve the problem that the sources of threatening public opinion information cannot be mined quickly and accurately.
An embodiment of the present invention provides a method for finding network information sources, comprising: building a public opinion phrase probability space according to a public opinion phrase database; extracting the phrase sequence of a single piece of public opinion information and, in combination with the public opinion phrase probability space, building a semantic joint probability matrix; obtaining a threat coefficient of the single piece of public opinion information using the semantic joint probability matrix and a naive Bayes classification algorithm, and building a node semantic tree in combination with the semantic joint probability matrix; obtaining a node network topology distribution from the social node network by a depth probing algorithm, building a bidirectional node association matrix, and calculating a viscosity matching coefficient according to the bidirectional node association matrix and the node semantic tree; performing vector conversion on the bidirectional node association matrix to form an initial matrix to be analyzed, and obtaining an information source network from the initial matrix using a hierarchical extraction algorithm and the viscosity matching coefficient; building an information source semantic tree for the information source network and, in combination with the node network topology distribution, drawing an information source phrase viscosity distribution map using a viscosity extension algorithm; and extracting information source semantic characteristic phrases from the information source phrase viscosity distribution map using a viscosity clustering algorithm, performing association analysis on the semantic tree of each node in the information source network, and extracting the information sources.
Preferably, the step of building the public opinion phrase probability space according to the public opinion phrase database further comprises: calculating the citation probability of each phrase in the public opinion phrase database, calculating a versatility probability according to the distribution of phrases in the public opinion phrase database, and calculating a timeliness coefficient according to the usage-time distribution of each phrase in the public opinion phrase database; and building the public opinion phrase probability space from the citation probability, the versatility probability and the timeliness coefficient.
Preferably, the step of extracting the phrase sequence of a single piece of public opinion information and building the semantic joint probability matrix in combination with the public opinion phrase probability space further comprises: extracting the phrase sequence of the single piece of public opinion information; building a frequency matrix from the frequency with which any two phrases in the phrase sequence occur together; building a threat weight distribution matrix from the threat weight distribution, in the phrase probability space, of the public opinion information formed by any two phrases in the phrase sequence; building an individual weight product matrix by integrating the products of the individual threat weights of any two phrases in the phrase sequence; building an individual probability matrix from the individual probability-space characteristics of any two phrases in the phrase sequence; and building the semantic joint probability matrix by combining the frequency matrix, the threat weight distribution matrix, the individual weight product matrix and the individual probability matrix.
Preferably, the step of obtaining the threat coefficient of the single piece of public opinion information using the semantic joint probability matrix and the naive Bayes classification algorithm, and building the node semantic tree in combination with the semantic joint probability matrix, further comprises: evaluating the overall reasonableness of the single piece of public opinion information using the conditional independence assumption; evaluating its semantic reasonableness using the Markov random field chain joint probability assumption; obtaining the threat coefficient from the resulting overall reasonableness and semantic reasonableness; and building the node semantic tree in combination with the semantic joint probability matrix.
Preferably, the step of obtaining the node network topology distribution from the social node network by the depth probing algorithm further comprises: taking the threat coefficient of each user as the average of the threat coefficients of all public opinion information that the user has operated on, the threat coefficient of each piece of public opinion information being derivable from the accumulated threat coefficients of the users who have operated on it; with users as first nodes and pieces of public opinion information as second nodes, an edge is generated whenever a user operates on a piece of public opinion information; diffusion is iterated in this way until the node network topology distribution is obtained.
An embodiment of the present invention further provides a device for finding network information sources, comprising: a probability space building module, configured to build a public opinion phrase probability space according to a public opinion phrase database; a probability matrix building module, configured to extract the phrase sequence of a single piece of public opinion information and, in combination with the public opinion phrase probability space, build a semantic joint probability matrix; a node semantic tree building module, configured to obtain the threat coefficient of the single piece of public opinion information using the semantic joint probability matrix and a naive Bayes classification algorithm, and to build a node semantic tree in combination with the semantic joint probability matrix; a computing module, configured to obtain a node network topology distribution from the social node network by a depth probing algorithm, build a bidirectional node association matrix, and calculate a viscosity matching coefficient according to the bidirectional node association matrix and the node semantic tree; an acquisition module, configured to perform vector conversion on the bidirectional node association matrix to form an initial matrix to be analyzed, and to obtain an information source network using a hierarchical extraction algorithm and the viscosity matching coefficient; a drawing module, configured to build an information source semantic tree for the information source network and, in combination with the node network topology distribution, draw an information source phrase viscosity distribution map using a viscosity extension algorithm; and an extraction module, configured to extract information source semantic characteristic phrases from the information source phrase viscosity distribution map using a viscosity clustering algorithm, perform association analysis on the semantic tree of each node in the information source network, and extract the information sources.
Preferably, the probability space building module is further configured to: calculate the citation probability of each phrase in the public opinion phrase database, calculate a versatility probability according to the distribution of phrases in the public opinion phrase database, and calculate a timeliness coefficient according to the usage-time distribution of each phrase in the public opinion phrase database; and build the public opinion phrase probability space from the citation probability, the versatility probability and the timeliness coefficient.
Preferably, the probability matrix building module is further configured to: extract the phrase sequence of a single piece of public opinion information; build a frequency matrix from the frequency with which any two phrases in the phrase sequence occur together; build a threat weight distribution matrix from the threat weight distribution, in the phrase probability space, of the public opinion information formed by any two phrases in the phrase sequence; build an individual weight product matrix by integrating the products of the individual threat weights of any two phrases in the phrase sequence; build an individual probability matrix from the individual probability-space characteristics of any two phrases in the phrase sequence; and build the semantic joint probability matrix by combining the frequency matrix, the threat weight distribution matrix, the individual weight product matrix and the individual probability matrix.
Preferably, the node semantic tree building module is further configured to: evaluate the overall reasonableness of the single piece of public opinion information using the conditional independence assumption; evaluate its semantic reasonableness using the Markov random field chain joint probability assumption; obtain the threat coefficient from the resulting overall reasonableness and semantic reasonableness; and build the node semantic tree in combination with the semantic joint probability matrix.
An embodiment of the present invention further provides a server, comprising: one or more processors; and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for finding network information sources described above.
Compared with the prior art, the method, device and server for finding network information sources provided by the embodiments of the present invention gain the ability to discover information source networks and identify their nature through semantic recognition of public opinion information and viscosity association analysis of social network nodes. Compared with traditional keyword-based semantic analysis and information source extraction from a fixed relational network, the method combines a phrase probability space with a semantic joint matrix partitioning method, builds node semantic trees with a naive Bayes classifier, extracts the information source network through node depth probing and vector-conversion viscosity matching, and identifies the final information sources with a viscosity clustering algorithm and cross-association. It captures information sources more accurately and reasonably and, working from the same public opinion characteristics, offers diverse analysis dimensions, in-depth social network analysis and public opinion characteristic identification, and more intuitive data representation. The system is well targeted at its detection objects, can analyze deep features of the data, detects the network in which public opinion originates, and makes social network information sources easy to find.
To make the above objects, features and advantages of the present invention more apparent and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present invention and should therefore not be regarded as limiting its scope. Those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 is a process structure diagram of the method for finding network information sources provided by an embodiment of the present invention.
Fig. 2 is a flow chart of the method for finding network information sources provided by an embodiment of the present invention.
Fig. 3 is a schematic diagram of the public opinion phrase probability space provided by an embodiment of the present invention.
Fig. 4 is a schematic diagram of the node network topology distribution provided by an embodiment of the present invention.
Fig. 5 is an information source phrase viscosity distribution map provided by an embodiment of the present invention.
Fig. 6 is a schematic diagram of obtaining information sources with the viscosity clustering algorithm provided by an embodiment of the present invention.
Fig. 7 is a structural schematic diagram of the server provided by an embodiment of the present invention.
Fig. 8 is a functional block diagram of the device for finding network information sources provided by an embodiment of the present invention.
Reference numerals: 10, server; 101, processor; 102, memory; 103, bus; 104, communication interface; 200, device for finding network information sources; 201, probability space building module; 202, probability matrix building module; 203, node semantic tree building module; 204, computing module; 205, acquisition module; 206, drawing module; 207, extraction module.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the drawings herein may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should be noted that similar reference numerals and letters denote similar items in the following drawings. Therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. Also, in the description of the present invention, the terms "first", "second" and so on are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
Refer to Fig. 1, which is a process structure diagram of the method for finding network information sources provided by an embodiment of the present invention. The method of this embodiment is applied to a server and is used to find the information sources of threatening public opinion information. The method combines a phrase probability space with a semantic joint matrix partitioning method, builds node semantic trees with naive Bayes classification, extracts the information source network through node depth probing and vector-conversion viscosity matching, and identifies the final information sources with a viscosity clustering algorithm and cross-association. It captures information sources more accurately and reasonably and, working from the same public opinion characteristics, offers diverse analysis dimensions, in-depth social network analysis and public opinion characteristic identification, and more intuitive data representation, as described in detail below.
Refer to Fig. 2, which is a flow chart of the method for finding network information sources provided by an embodiment of the present invention. It should be noted that the method of the present invention is not limited to Fig. 2 or to the particular order shown below. The specific flow and steps shown in Fig. 2 are described in detail below. The method for finding network information sources includes:
Step S101: build a public opinion phrase probability space according to a public opinion phrase database.
The public opinion phrase database is a preset database that contains pre-stored threatening phrases. The citation probability of each phrase in the public opinion phrase database is calculated, a versatility probability is calculated according to the distribution of phrases in the database, and a timeliness coefficient is calculated according to the usage-time distribution of each phrase in the database.
The public opinion phrase probability space is built from the citation probability, the versatility probability and the timeliness coefficient.
Fig. 3 shows how public opinion phrases are distributed in the three-dimensional coordinate system built from the citation probability, the versatility probability and the timeliness coefficient. The three-dimensional phrase probability space is divided by extracting public opinion phrase features and probability coefficients and is displayed as a distribution in three-dimensional space, which simplifies the subsequent semantic characteristic analysis of public opinion and improves the accuracy of characteristic identification.
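As an illustration of step S101, the following minimal sketch maps each phrase to a point in a three-dimensional space. The patent does not give formulas for the three coordinates, so every definition here is an assumption made for illustration: the citation probability is taken as the phrase's share of all phrase uses, the versatility probability as the share of records that contain the phrase, and the timeliness coefficient as a half-life decay on the time since the phrase was last used. The names `PhrasePoint`, `build_probability_space` and `half_life` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhrasePoint:
    citation: float     # assumed: share of all phrase uses
    versatility: float  # assumed: share of records containing the phrase
    timeliness: float   # assumed: half-life decay since last use, in (0, 1]


def build_probability_space(records, now, half_life=30.0):
    """records: list of (timestamp_in_days, [phrases]); a hypothetical layout."""
    total_uses = sum(len(phrases) for _, phrases in records)
    uses, docs, last_seen = {}, {}, {}
    for ts, phrases in records:
        for p in phrases:
            uses[p] = uses.get(p, 0) + 1
            last_seen[p] = max(last_seen.get(p, ts), ts)
        for p in set(phrases):
            docs[p] = docs.get(p, 0) + 1
    space = {}
    for p in uses:
        age = now - last_seen[p]
        space[p] = PhrasePoint(
            citation=uses[p] / total_uses,
            versatility=docs[p] / len(records),
            timeliness=0.5 ** (age / half_life),
        )
    return space
```

Each phrase then has three coordinates that could be plotted in the manner of Fig. 3. Other definitions of the three axes would fit the description equally well.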
Step S102: extract the phrase sequence of a single piece of public opinion information and, in combination with the public opinion phrase probability space, build a semantic joint probability matrix.
Among all the public opinion information to be analyzed, the phrase sequence is extracted for each single piece of public opinion information.
A frequency matrix is built from the frequency with which any two phrases in the phrase sequence occur together. A threat weight distribution matrix is built from the threat weight distribution, in the phrase probability space, of the public opinion information formed by any two phrases in the phrase sequence. An individual weight product matrix is built by integrating the products of the individual threat weights of any two phrases in the phrase sequence. An individual probability matrix is built from the individual probability-space characteristics of any two phrases in the phrase sequence.
Finally, the semantic joint probability matrix is built by combining the frequency matrix, the threat weight distribution matrix, the individual weight product matrix and the individual probability matrix.
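The patent does not state how the four matrices of step S102 are combined. The sketch below is therefore only a simplified illustration under stated assumptions: it models two of the four matrices (the frequency matrix and the individual weight product matrix), combines them element-wise, and normalizes the result so the entries sum to 1. The function names, the `window` parameter and the `threat_weight` lookup are hypothetical.

```python
def cooccurrence_matrix(seq, window=2):
    """Count how often two distinct phrases appear within `window` positions."""
    vocab = sorted(set(seq))
    idx = {p: k for k, p in enumerate(vocab)}
    n = len(vocab)
    freq = [[0.0] * n for _ in range(n)]
    for a in range(len(seq)):
        for b in range(a + 1, min(a + window, len(seq))):
            i, j = idx[seq[a]], idx[seq[b]]
            if i != j:
                freq[i][j] += 1
                freq[j][i] += 1
    return vocab, freq


def joint_matrix(seq, threat_weight, window=2):
    """Assumed combination: frequency * (w_i * w_j), normalized to sum to 1."""
    vocab, freq = cooccurrence_matrix(seq, window)
    n = len(vocab)
    # Individual weight product matrix: product of the two phrases' own weights.
    prod = [[threat_weight[vocab[i]] * threat_weight[vocab[j]] for j in range(n)]
            for i in range(n)]
    raw = [[freq[i][j] * prod[i][j] for j in range(n)] for i in range(n)]
    total = sum(map(sum, raw))
    joint = [[v / total if total else 0.0 for v in row] for row in raw]
    return vocab, joint
```

The threat weight distribution matrix and the individual probability matrix could be folded in the same way, as further element-wise factors, once their definitions are fixed.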
Step S103: obtain the threat coefficient of the single piece of public opinion information using the semantic joint probability matrix and the naive Bayes classification algorithm, and build a node semantic tree in combination with the semantic joint probability matrix.
The overall reasonableness of the single piece of public opinion information is evaluated using the conditional independence assumption, and its semantic reasonableness is evaluated using the Markov random field (MRF) chain joint probability assumption. The threat coefficient is obtained from the resulting overall reasonableness and semantic reasonableness, and the node semantic tree is built in combination with the semantic joint probability matrix.
Specifically, the naive Bayes classification algorithm states that posterior probability = standardized likelihood × prior probability. Let the public opinion information be D. The threat characteristic of D is derived from the N phrases that make up the information. With $H^{+}$ denoting threatening information, the naive Bayes classification algorithm can be written as:
$P(H^{+}\mid D) \propto P(H^{+})\,P(D\mid H^{+})$
Because D is composed of N phrases, its overall reasonableness and semantic reasonableness are evaluated in the following two ways.
Conditional independence assumption: assume that the phrases making up the public opinion information do not directly affect one another, and judge the overall reasonableness from their joint probability:
$P(H^{+}\mid D) \propto P(H^{+})\,P(N_1\mid H^{+})\,P(N_2\mid H^{+})\cdots P(N_n\mid H^{+})$
The condition of constituting a threat is replaced by the threat probability of each phrase, which finally yields the overall reasonableness coefficient.
MRF chain joint probability assumption: according to the MRF chain principle, the value of each state in a state sequence depends on the preceding N states. For public opinion information, relating each phrase to the preceding N phrases matches the characteristics of semantics. Taking N = 1, this can be expressed as:
$P(H^{+}\mid D) \propto P(H^{+})\,P(N_1)\,P(N_2\mid N_1)\,P(N_3\mid N_2)\cdots P(N_n\mid N_{n-1})$
The threat probability of the current phrase is formed from the joint probability of the combined phrases, which finally yields the semantic reasonableness coefficient.
The threat coefficient is obtained from the resulting overall reasonableness and semantic reasonableness. When the threat coefficient satisfies a preset threat condition, the node semantic tree is built from the semantic joint probability matrix.
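The two evaluations in step S103 can be sketched as follows. This is a minimal illustration, not the patent's implementation: scores are accumulated in log space to avoid underflow, unseen phrases or transitions fall back to a small `floor` probability, and the lookup tables passed in are hypothetical inputs. How the two scores are merged into a single threat coefficient is not specified in the text, so the sketch leaves them separate.

```python
from math import exp, log


def overall_score(phrases, prior, p_phrase_given_threat, floor=1e-6):
    """Log of P(H+) * prod_i P(N_i | H+), the conditional independence form."""
    score = log(prior)
    for p in phrases:
        score += log(p_phrase_given_threat.get(p, floor))
    return score


def semantic_score(phrases, prior, p_first, p_next, floor=1e-6):
    """Log of P(H+) * P(N_1) * prod_i P(N_i | N_{i-1}), the first-order chain form."""
    score = log(prior)
    if not phrases:
        return score
    score += log(p_first.get(phrases[0], floor))
    for prev, cur in zip(phrases, phrases[1:]):
        score += log(p_next.get((prev, cur), floor))
    return score


def as_probability(log_score):
    """Convert a log score back to the unnormalized probability."""
    return exp(log_score)
```

Both values are proportional to the posterior rather than equal to it, so they are meaningful only when compared across messages or against the corresponding scores for the non-threat class.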
Step S104: obtain the node network topology distribution from the social node network by a depth probing algorithm, build a bidirectional node association matrix, and calculate a viscosity matching coefficient according to the bidirectional node association matrix and the node semantic tree.
Step S103 yields the reasonableness coefficient (threat coefficient) of every piece of public opinion information. Under the "heat conduction model", the threat coefficient of each user is the average of the threat coefficients of all public opinion information that the user has operated on, and the threat coefficient of each piece of public opinion information can be derived from the accumulated threat coefficients of the users who have operated on it. With users as first nodes and pieces of public opinion information as second nodes, an edge is generated whenever a user operates on a piece of public opinion information. Diffusion is iterated in this way until the node network topology distribution is obtained. Fig. 4 is a schematic diagram of the node network topology distribution provided by an embodiment of the present invention. In it, X1, X2 and X3 are first nodes representing users; Y1, Y2, Y3 and Y4 are second nodes representing pieces of public opinion information; b11 to b34 are the edges between users and public opinion information, indicating that a user can establish a relationship with each piece of public opinion information; and a12, a21 and a23 indicate that public opinion information can be transmitted among the three users.
Nodes interact through public opinion information, and the threat characteristics of nodes and of public opinion information are determined by their own diffusion in the network. The approach is therefore well targeted at public opinion information in social networks that is highly threatening and widely diffused. In addition, the algorithm is accurate, fast and highly parallelizable.
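The user-side half of the "heat conduction model" in step S104 can be sketched as below: each operation is an edge between a user node and a public opinion information node, and a user's threat coefficient is the mean threat coefficient of the information that user has operated on. The edge list format and the function name `user_threat` are assumptions for illustration. The reverse conversion from users back to information, and the iteration over several diffusion rounds, are not shown because the text does not specify them.

```python
def user_threat(edges, info_threat):
    """edges: iterable of (user, info) pairs; returns user -> mean threat coefficient."""
    touched = {}
    for user, info in edges:
        # A set ignores repeated operations by one user on the same information.
        touched.setdefault(user, set()).add(info)
    return {
        user: sum(info_threat[i] for i in infos) / len(infos)
        for user, infos in touched.items()
    }
```

For example, with edges X1-Y1, X1-Y2 and X2-Y2 and threat coefficients 0.2 for Y1 and 0.6 for Y2, user X1 averages the two values while user X2 takes the value of Y2 alone.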
Step S105: perform vector conversion on the bidirectional node association matrix to form an initial matrix to be analyzed, and obtain the information source network using a hierarchical extraction algorithm and the viscosity matching coefficient.
Vector conversion is first applied to the bidirectional node association matrix to form the initial matrix to be analyzed. In the initial matrix, row i and column j represent node users. Taking four users as an example, the initial matrix is a 4×4 matrix whose entry $\alpha_{ij}$ is the viscosity matching coefficient.
First, the k node users with the largest viscosity matching coefficients are selected as seed nodes. For each seed node, the nodes it points to are found to form a set of candidate parent nodes. From the candidate parent nodes of the k seed nodes, the node N with the highest frequency and the highest viscosity matching coefficient is extracted, and a node tree based on N is created according to N's association directions. All child nodes associated with N are found and, in combination with the seed node features, the child nodes similar to the seed nodes are extracted by semantic clustering analysis. The child nodes of each seed node are then obtained, the strongly associated child nodes are extracted by semantic clustering analysis of the seed node and its child nodes, and the node tree is redrawn. This continues until weak associations cause the node tree to converge and close, which generates the final information source network.
The viscosity matching coefficient accurately represents the linkage between nodes. With its support, the hierarchical extraction algorithm can extract the network hierarchy with low error. For a social network with a hierarchical structure, key nodes and the key network are extracted by analyzing the network topology, reducing the network to a classification tree with a hierarchical structure, a data structure that supports more data analysis models.
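The first stage of the hierarchical extraction in step S105, choosing seed nodes and the root node N, might look like the sketch below. Several points are assumptions because the text leaves them open: a node's overall viscosity is taken as the sum of its row in the matrix, `alpha[i][j] > 0` is read as "node i points to node j", and ties are broken by the lower node index. The function name `pick_root` is hypothetical, and the later semantic clustering and tree redrawing stages are not shown.

```python
from collections import Counter


def pick_root(alpha, k):
    """Return (seed nodes, root node) from a square viscosity matrix alpha."""
    n = len(alpha)
    strength = [sum(row) for row in alpha]  # assumed per-node viscosity
    seeds = sorted(range(n), key=lambda i: (-strength[i], i))[:k]
    freq, weight = Counter(), Counter()
    for s in seeds:
        for target in range(n):
            if target != s and alpha[s][target] > 0:
                freq[target] += 1                 # how many seeds point here
                weight[target] += alpha[s][target]  # accumulated viscosity
    if not freq:
        return seeds, None
    # Highest frequency first, then highest viscosity, then lowest index.
    root = max(freq, key=lambda p: (freq[p], weight[p], -p))
    return seeds, root
```

In a four-user matrix where users 1 and 2 both point strongly at user 0, users 1 and 2 become the seeds and user 0 is chosen as the root of the node tree.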
Step S106, information source semantic tree is built for described information source network, and combine node Internet topology point
Cloth and utilization viscosity extended algorithm draw information source phrase viscosity distribution map.
For its node semantics tree of all Node extractions in information source network, and realize and all construct information source language
Justice tree.With reference to meshed network topology distribution, information source phrase viscosity distribution map is drawn using viscosity extended algorithm.
Fig. 5 shows the information source phrase viscosity distribution map provided by an embodiment of the present invention. The map is based on a two-dimensional plane coordinate system. First, the root node of the information source network is placed at the center O of the coordinate system, and a deviation angle is calculated from the number of its child nodes to build the reference vectors. The viscosity matching coefficient between each child node and the root node determines the child node's positional deviation along its reference vector; the viscosity matching coefficient of the child node's associated child node then determines its positional deviation along the perpendicular reference vector. Together these fix the position of the child node in the coordinate system, for example point O1. By analogy, the same algorithm is applied to all child node trees to extend and complete the information source phrase viscosity distribution map.
The viscosity extension algorithm converts the semantic tree into a scatter distribution of nodes in two-dimensional space. It builds multi-level coordinate systems according to the levels of the node tree and determines each node's differential position from the reference vectors, so that the nodes, combined with their viscosity matching coefficients, are distributed in an accurate and reasonable state.
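One level of this placement can be sketched as below. The text does not state how a coefficient maps to a distance, so the mappings used here (offset along the reference vector equal to `1 - coefficient`, perpendicular offset proportional to the coefficient) are assumptions, as are the names `place_children`, `root_links`, `assoc_links` and `scale`.

```python
import math

def place_children(root_links, assoc_links, scale=1.0):
    """Place the children of a root node located at the origin O.

    root_links:  dict child -> viscosity coefficient between the child and the root
    assoc_links: dict child -> viscosity coefficient of the child's associated
                 child node (treated as 0.0 when absent)
    Returns a dict child -> (x, y).
    """
    positions = {}
    n = len(root_links)
    if n == 0:
        return positions
    step = 2 * math.pi / n                       # deviation angle between reference vectors
    for i, (child, k_root) in enumerate(root_links.items()):
        theta = i * step
        ux, uy = math.cos(theta), math.sin(theta)    # reference vector
        vx, vy = -uy, ux                             # perpendicular reference vector
        along = scale * (1.0 - k_root)               # assumed: stronger link, closer to the root
        across = scale * assoc_links.get(child, 0.0) # assumed: offset grows with the coefficient
        positions[child] = (along * ux + across * vx, along * uy + across * vy)
    return positions
```

To cover a whole tree, the same function would be applied again with each placed child as the new local origin, which corresponds to the multi-level coordinate systems mentioned above.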
Step S107: extract the information source semantic characteristic phrases from the information source phrase viscosity distribution map using the viscosity clustering algorithm, perform association analysis on the semantic tree of each node in the information source network, and extract the information source.
In the phrase viscosity distribution map, as shown in Fig. 6, for every phrase node a normal distribution circle is drawn centered on that node so as to contain as many surrounding nodes as possible, such as g(m1, σ1), g(m2, σ2) and g(m3, σ3), while ensuring that the ratio of the number of nodes to the circle radius meets a threshold. K normal distribution circles are extracted in this way. If the center nodes of two circles are contained in each other's circles, the circles are fused into a new normal distribution circle; for example, g(m2, σ2) and g(m3, σ3) are fused into a new normal distribution circle, and the new center node is marked as the combination of the center node phrases of the two original circles. Finally, the node phrases within the normal distribution circles are extracted to form the information source semantic characteristic phrases. Using these phrases, association analysis is performed on the node semantic tree of each node in the information source network, nodes whose similarity does not satisfy the condition are deleted, and the final information source is extracted.
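The circle selection and fusion described above can be illustrated with a simplified sketch. It treats each circle as a plain disc rather than a fitted normal distribution, picks radii from a caller-supplied candidate list, and fuses circles greedily; all of these are assumptions, and the names `best_circle`, `viscosity_clusters`, `radii` and `min_density` are hypothetical.

```python
import math

def best_circle(center, points, radii, min_density):
    """Return (count, radius) for the candidate circle around `center` that covers
    the most nodes while its count-to-radius ratio meets `min_density`, else None."""
    best = None
    for r in sorted(radii):
        count = sum(1 for p in points if math.dist(center, p) <= r)
        if count / r >= min_density and (best is None or count > best[0]):
            best = (count, r)
    return best

def viscosity_clusters(points, radii, min_density, k):
    """Pick the k fullest circles and fuse those whose centers contain each other.

    Returns groups of point indices; each group lists the fused circle centers.
    """
    circles = []
    for i, center in enumerate(points):
        found = best_circle(center, points, radii, min_density)
        if found:
            circles.append((found[0], found[1], i))
    circles.sort(key=lambda c: -c[0])          # stable: ties keep index order
    groups = []
    for _count, r, i in circles[:k]:
        for group in groups:
            # Two centers contain each other when their distance is within both radii.
            if any(math.dist(points[i], points[j]) <= min(r, rj) for j, rj in group):
                group.append((i, r))
                break
        else:
            groups.append([(i, r)])
    return [sorted(j for j, _ in group) for group in groups]
```

Each returned group plays the role of one fused circle whose center is the combination of the original center phrases.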
In other embodiments, the timeliness diffusion model of the information source network and the public opinion phrase interaction transformation model can also be combined to analyze the transformation characteristics of the information source network itself, summarize the information source phrases whose threat coefficient is increasing, and bring them within the scope of information source detection and analysis.
Fig. 7 is a structural schematic of the server 10 provided by an embodiment of the present invention. The server 10 can be a computer or any other computing device with data processing capability, and includes a processor 101, a memory 102, a bus 103 and a communication interface 104. The processor 101, the communication interface 104 and the memory 102 are connected by the bus 103. The processor 101 is used to execute executable modules stored in the memory 102, such as computer programs.
The memory 102 may include high-speed random access memory (RAM), and may also include non-volatile memory, for example at least one magnetic disk storage. The communication connection between this system network element and at least one other network element is realized through at least one communication interface 104 (which can be wired or wireless).
The bus 103 can be an ISA bus, a PCI bus, an EISA bus or the like. It is represented by only one bidirectional arrow in Fig. 7, but this does not mean that there is only one bus or one type of bus.
The memory 102 is used to store programs, such as the network information source lookup device 200 shown in Fig. 8. The network information source lookup device 200 includes at least one software function module that can be stored in the memory 102 in the form of software or firmware, or built into the operating system (OS) of the server 10. After receiving an execution instruction, the processor 101 executes the program to implement the network information source lookup method disclosed in the embodiments of the present invention.
The processor 101 may be an integrated circuit chip with signal processing capability. In implementation, each step of the above method can be completed by integrated logic circuits of hardware in the processor 101 or by instructions in the form of software. The processor 101 can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Fig. 8 is a functional block diagram of the network information source lookup device 200 provided by an embodiment of the present invention. The network information source lookup device 200 includes a probability space building module 201, a probability matrix building module 202, a node semantic tree building module 203, a computing module 204, an acquisition module 205, a drawing module 206 and an extraction module 207.
The probability space building module 201 is used to build a public opinion phrase probability space according to a public opinion phrase database.
In the embodiment of the present invention, the probability space building module 201 can perform step S101.
The probability matrix building module 202 is used to extract the phrase sequence of a single piece of public opinion information and, with reference to the public opinion phrase probability space, build a semantic joint probability matrix.
In the embodiment of the present invention, the probability matrix building module 202 can perform step S102.
The node semantic tree building module 203 is used to obtain the threat coefficient of the single piece of public opinion information using the semantic joint probability matrix and a naive Bayes classification algorithm, and to build a node semantic tree with reference to the semantic joint probability matrix.
In the embodiment of the present invention, the node semantic tree building module 203 can perform step S103.
The computing module 204 is used to obtain the node network topology distribution from the social node network by a depth probe algorithm, build a bidirectional node association matrix, and calculate the viscosity matching coefficient according to the bidirectional node association matrix and the node semantic tree.
In the embodiment of the present invention, the computing module 204 can perform step S104.
The acquisition module 205 is used to perform vector conversion on the bidirectional node association matrix to form an initial matrix to be analyzed, and to obtain the information source network using the hierarchical extraction algorithm and the viscosity matching coefficient.
In the embodiment of the present invention, the acquisition module 205 can perform step S105.
The drawing module 206 is used to build an information source semantic tree for the information source network, and to draw an information source phrase viscosity distribution map by combining the node network topology distribution with the viscosity extension algorithm.
In the embodiment of the present invention, the drawing module 206 can perform step S106.
The extraction module 207 is used to extract information source semantic characteristic phrases from the information source phrase viscosity distribution map using the viscosity clustering algorithm, perform association analysis on the semantic tree of each node in the information source network, and extract the information source.
In the embodiment of the present invention, the extraction module 207 can perform step S107.
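The correspondence between modules 201 to 207 and steps S101 to S107 can be pictured as a simple pipeline. The sketch below only illustrates that structure: the `LookupDevice` class, the shared `context` dictionary and the assumption that the modules run strictly in sequence are hypothetical and not taken from the description.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Handler = Callable[[Context], Context]

# (module number, module name, step performed), as listed in the description.
MODULES: List[Tuple[int, str, str]] = [
    (201, "probability space building module", "S101"),
    (202, "probability matrix building module", "S102"),
    (203, "node semantic tree building module", "S103"),
    (204, "computing module", "S104"),
    (205, "acquisition module", "S105"),
    (206, "drawing module", "S106"),
    (207, "extraction module", "S107"),
]

class LookupDevice:
    """Hypothetical wrapper that runs one handler per module in step order."""

    def __init__(self, handlers: Dict[int, Handler]):
        missing = [number for number, _, _ in MODULES if number not in handlers]
        if missing:
            raise ValueError(f"no handler for modules: {missing}")
        self.handlers = handlers

    def run(self, context: Context) -> Context:
        for number, _name, step in MODULES:
            context = self.handlers[number](context)      # each module enriches the shared context
            context.setdefault("trace", []).append(step)  # record which step has run
        return context
```

Passing intermediate results through one dictionary keeps each module independent of the others' internals, which matches the statement that modules may exist separately or be integrated.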
In summary, the network information source lookup method, device and server provided by the embodiments of the present invention provide information source network discovery and property identification through semantic recognition of public opinion information and viscosity association analysis of social network nodes. Compared with traditional keyword semantic analysis and fixed-point information source relationship network extraction, this method combines the phrase probability space with the semantic joint matrix division method, builds node semantic trees with a naive Bayes classifier, extracts the information source network through node depth detection and vector conversion viscosity matching, and identifies the final information source through the viscosity clustering algorithm and cross-correlation. For the same public opinion characteristics, it captures information sources more accurately and reasonably, offers more analysis dimensions, analyzes social networks and identifies public opinion characteristics in greater depth, and presents data more intuitively. The system has well-targeted detection objects, can analyze deep features of the data, detect public opinion source networks, and readily find social network information sources.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method can also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions and operations of devices, methods and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram can represent a module, a program segment or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks can occur in an order different from that marked in the drawings. For example, two consecutive blocks can in fact be executed substantially in parallel, and they can sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention can be integrated to form an independent part, each module can exist separately, or two or more modules can be integrated to form an independent part.
If the functions are implemented in the form of software function modules and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which can be a personal computer, a server, a network device or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc. It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" or any other variant are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the process, method, article or device. Without further restrictions, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention can have various modifications and variations. Any modification, equivalent substitution, improvement and the like made within the spirit and principles of the present invention shall fall within its scope of protection. It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined and explained in subsequent drawings.
The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited to them. Any change or replacement that a person skilled in the art could readily conceive of within the technical scope disclosed by the present invention shall be covered by the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of the claims.
Claims (10)
- 1. A network information source lookup method, characterized by comprising: building a public opinion phrase probability space according to a public opinion phrase database; extracting the phrase sequence of a single piece of public opinion information and, with reference to the public opinion phrase probability space, building a semantic joint probability matrix; obtaining the threat coefficient of the single piece of public opinion information using the semantic joint probability matrix and a naive Bayes classification algorithm, and building a node semantic tree with reference to the semantic joint probability matrix; obtaining a node network topology distribution from the social node network by a depth probe algorithm, building a bidirectional node association matrix, and calculating a viscosity matching coefficient according to the bidirectional node association matrix and the node semantic tree; performing vector conversion on the bidirectional node association matrix to form an initial matrix to be analyzed, and obtaining an information source network using a hierarchical extraction algorithm and the viscosity matching coefficient; building an information source semantic tree for the information source network, and drawing an information source phrase viscosity distribution map by combining the node network topology distribution with a viscosity extension algorithm; and extracting information source semantic characteristic phrases from the information source phrase viscosity distribution map using a viscosity clustering algorithm, performing association analysis on the semantic tree of each node in the information source network, and extracting the information source.
- 2. The network information source lookup method according to claim 1, characterized in that the step of building a public opinion phrase probability space according to a public opinion phrase database further comprises: calculating the reference probability of each phrase in the public opinion phrase database, calculating a versatility probability according to the distribution state of the phrases in the public opinion phrase database, and calculating a timeliness coefficient according to the usage time distribution of each phrase in the public opinion phrase database; and building the public opinion phrase probability space according to the reference probability, the versatility probability and the timeliness coefficient.
- 3. The network information source lookup method according to claim 1 or 2, characterized in that the step of extracting the phrase sequence of a single piece of public opinion information and, with reference to the public opinion phrase probability space, building a semantic joint probability matrix further comprises: extracting the phrase sequence of the single piece of public opinion information; building a frequency matrix according to the frequency with which any two phrases in the phrase sequence occur together, building a threat weight distribution matrix according to the threat weight distribution, in the phrase probability space, of the public opinion information formed by any two phrases in the phrase sequence, building an individual weight product matrix according to the product of the individual threat weights of any two phrases in the phrase sequence, and building an individual probability matrix according to the probability space characteristics of any two phrases in the phrase sequence; and building the semantic joint probability matrix by combining the frequency matrix, the threat weight distribution matrix, the individual weight product matrix and the individual probability matrix.
- 4. The network information source lookup method according to claim 3, characterized in that the step of obtaining the threat coefficient of the single piece of public opinion information using the semantic joint probability matrix and a naive Bayes classification algorithm, and building a node semantic tree with reference to the semantic joint probability matrix, further comprises: assessing the overall reasonableness of the single piece of public opinion information using the conditional independence assumption, assessing the semantic reasonableness of the single piece of public opinion information using the Markov random field chain joint probability assumption, obtaining the threat coefficient according to the overall reasonableness and the semantic reasonableness, and building the node semantic tree with reference to the semantic joint probability matrix.
- 5. The network information source lookup method according to claim 4, characterized in that the step of obtaining the node network topology distribution from the social node network by the depth probe algorithm further comprises: the threat coefficient of each user is the average of the threat coefficients of all public opinion information the user has operated on, and the threat coefficient of each piece of public opinion information can in turn be cumulatively converted into the threat coefficient of the users who operate on it; with users as first nodes and public opinion information as second nodes, an edge is generated whenever a user operates on a piece of public opinion information, and diffusion proceeds cyclically in this way to finally obtain the node network topology distribution.
- 6. A network information source lookup device, characterized by comprising: a probability space building module, used to build a public opinion phrase probability space according to a public opinion phrase database; a probability matrix building module, used to extract the phrase sequence of a single piece of public opinion information and, with reference to the public opinion phrase probability space, build a semantic joint probability matrix; a node semantic tree building module, used to obtain the threat coefficient of the single piece of public opinion information using the semantic joint probability matrix and a naive Bayes classification algorithm, and to build a node semantic tree with reference to the semantic joint probability matrix; a computing module, used to obtain a node network topology distribution from the social node network by a depth probe algorithm, build a bidirectional node association matrix, and calculate a viscosity matching coefficient according to the bidirectional node association matrix and the node semantic tree; an acquisition module, used to perform vector conversion on the bidirectional node association matrix to form an initial matrix to be analyzed, and to obtain an information source network using a hierarchical extraction algorithm and the viscosity matching coefficient; a drawing module, used to build an information source semantic tree for the information source network, and to draw an information source phrase viscosity distribution map by combining the node network topology distribution with a viscosity extension algorithm; and an extraction module, used to extract information source semantic characteristic phrases from the information source phrase viscosity distribution map using a viscosity clustering algorithm, perform association analysis on the semantic tree of each node in the information source network, and extract the information source.
- 7. The network information source lookup device according to claim 6, characterized in that the probability space building module is further used to: calculate the reference probability of each phrase in the public opinion phrase database, calculate a versatility probability according to the distribution of the phrases in the public opinion phrase database, and calculate a timeliness coefficient according to the usage time distribution of each phrase in the public opinion phrase database; and build the public opinion phrase probability space according to the reference probability, the versatility probability and the timeliness coefficient.
- 8. The network information source lookup device according to claim 6 or 7, characterized in that the probability matrix building module is further used to: extract the phrase sequence of the single piece of public opinion information; build a frequency matrix according to the frequency with which any two phrases in the phrase sequence occur together, build a threat weight distribution matrix according to the threat weight distribution, in the phrase probability space, of the public opinion information formed by any two phrases in the phrase sequence, build an individual weight product matrix according to the product of the individual threat weights of any two phrases in the phrase sequence, and build an individual probability matrix according to the probability space characteristics of any two phrases in the phrase sequence; and build the semantic joint probability matrix by combining the frequency matrix, the threat weight distribution matrix, the individual weight product matrix and the individual probability matrix.
- 9. The network information source lookup device according to claim 8, characterized in that the node semantic tree building module is further used to: assess the overall reasonableness of the single piece of public opinion information using the conditional independence assumption, assess the semantic reasonableness of the single piece of public opinion information using the Markov random field chain joint probability assumption, obtain the threat coefficient according to the overall reasonableness and the semantic reasonableness, and build the node semantic tree with reference to the semantic joint probability matrix.
- 10. A server, characterized by comprising: one or more processors; and a memory for storing one or more programs, wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711223777.4A CN107862081B (en) | 2017-11-29 | 2017-11-29 | Network information source searching method and device and server |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711223777.4A CN107862081B (en) | 2017-11-29 | 2017-11-29 | Network information source searching method and device and server |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107862081A (en) | 2018-03-30 |
CN107862081B (en) | 2021-07-16 |
Family
ID=61704267
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711223777.4A (Active; CN107862081B) | 2017-11-29 | 2017-11-29 | Network information source searching method and device and server |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107862081B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112508376A (en) * | 2020-11-30 | 2021-03-16 | 中国科学院深圳先进技术研究院 | Index system construction method |
CN112861956A (en) * | 2021-02-01 | 2021-05-28 | 浪潮云信息技术股份公司 | Water pollution model construction method based on data analysis |
Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2001080080A2 (en) * | 2000-04-14 | 2001-10-25 | Rightnow Technologies, Inc. | Usage based strength between related help topics and context based mapping thereof in a help information retrieval system |
CN1766871A (en) * | 2004-10-29 | 2006-05-03 | 中国科学院研究生院 | The processing method of the semi-structured data extraction of semantics of based on the context |
CN1853180A (en) * | 2003-02-14 | 2006-10-25 | 尼维纳公司 | System and method for semantic knowledge retrieval, management, capture, sharing, discovery, delivery and presentation |
US20070073748A1 (en) * | 2005-09-27 | 2007-03-29 | Barney Jonathan A | Method and system for probabilistically quantifying and visualizing relevance between two or more citationally or contextually related data objects |
CN101122909A (en) * | 2006-08-10 | 2008-02-13 | 株式会社日立制作所 | Text message indexing unit and text message indexing method |
US7702635B2 (en) * | 2002-04-04 | 2010-04-20 | Microsoft Corporation | System and methods for constructing personalized context-sensitive portal pages or views by analyzing patterns of users' information access activities |
CN102012929A (en) * | 2010-11-26 | 2011-04-13 | 北京交通大学 | Network consensus prediction method and system |
CN102411611A (en) * | 2011-10-15 | 2012-04-11 | 西安交通大学 | Instant interactive text oriented event identifying and tracking method |
US8209331B1 (en) * | 2008-04-02 | 2012-06-26 | Google Inc. | Context sensitive ranking |
CN102521291A (en) * | 2011-11-29 | 2012-06-27 | 浙江大学 | ANTLR (Another Tool for Language Recognition)-based importing method for LDF (Log Data File) of description file of LIN (Local Interconnect Network) |
CN102789498A (en) * | 2012-07-16 | 2012-11-21 | 钱钢 | Method and system for carrying out sentiment classification on Chinese comment text on basis of ensemble learning |
US20140221014A1 (en) * | 2013-02-05 | 2014-08-07 | Nec (China) Co., Ltd. | Device and method for mobility pattern mining |
WO2014127673A1 (en) * | 2013-02-25 | 2014-08-28 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for acquiring hot topics |
CN105677873A (en) * | 2016-01-11 | 2016-06-15 | 中国电子科技集团公司第十研究所 | Text information associating and clustering collecting processing method based on domain knowledge model |
US20170075877A1 (en) * | 2015-09-16 | 2017-03-16 | Marie-Therese LEPELTIER | Methods and systems of handling patent claims |
CN106980385A (en) * | 2017-04-07 | 2017-07-25 | 吉林大学 | A kind of Virtual assemble device, system and method |
CN107066256A (en) * | 2017-02-24 | 2017-08-18 | 中国人民解放军海军大连舰艇学院 | A kind of object based on tense changes the modeling method of model |
Patent Citations (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2001080080A2 (en) * | 2000-04-14 | 2001-10-25 | Rightnow Technologies, Inc. | Usage based strength between related help topics and context based mapping thereof in a help information retrieval system |
US7702635B2 (en) * | 2002-04-04 | 2010-04-20 | Microsoft Corporation | System and methods for constructing personalized context-sensitive portal pages or views by analyzing patterns of users' information access activities |
CN1853180A (en) * | 2003-02-14 | 2006-10-25 | 尼维纳公司 | System and method for semantic knowledge retrieval, management, capture, sharing, discovery, delivery and presentation |
CN1766871A (en) * | 2004-10-29 | 2006-05-03 | 中国科学院研究生院 | The processing method of the semi-structured data extraction of semantics of based on the context |
US20070073748A1 (en) * | 2005-09-27 | 2007-03-29 | Barney Jonathan A | Method and system for probabilistically quantifying and visualizing relevance between two or more citationally or contextually related data objects |
US8131701B2 (en) * | 2005-09-27 | 2012-03-06 | Patentratings, Llc | Method and system for probabilistically quantifying and visualizing relevance between two or more citationally or contextually related data objects |
CN101122909A (en) * | 2006-08-10 | 2008-02-13 | 株式会社日立制作所 | Text message indexing unit and text message indexing method |
US8209331B1 (en) * | 2008-04-02 | 2012-06-26 | Google Inc. | Context sensitive ranking |
CN102012929A (en) * | 2010-11-26 | 2011-04-13 | 北京交通大学 | Network consensus prediction method and system |
CN102411611A (en) * | 2011-10-15 | 2012-04-11 | 西安交通大学 | Instant interactive text oriented event identifying and tracking method |
CN102521291A (en) * | 2011-11-29 | 2012-06-27 | 浙江大学 | ANTLR (Another Tool for Language Recognition)-based importing method for LDF (Log Data File) of description file of LIN (Local Interconnect Network) |
CN102789498A (en) * | 2012-07-16 | 2012-11-21 | 钱钢 | Method and system for carrying out sentiment classification on Chinese comment text on basis of ensemble learning |
US20140221014A1 (en) * | 2013-02-05 | 2014-08-07 | Nec (China) Co., Ltd. | Device and method for mobility pattern mining |
WO2014127673A1 (en) * | 2013-02-25 | 2014-08-28 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for acquiring hot topics |
US20140280242A1 (en) * | 2013-02-25 | 2014-09-18 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for acquiring hot topics |
US20170075877A1 (en) * | 2015-09-16 | 2017-03-16 | Marie-Therese LEPELTIER | Methods and systems of handling patent claims |
CN105677873A (en) * | 2016-01-11 | 2016-06-15 | 中国电子科技集团公司第十研究所 | Text information associating and clustering collecting processing method based on domain knowledge model |
CN107066256A (en) * | 2017-02-24 | 2017-08-18 | 中国人民解放军海军大连舰艇学院 | A kind of object based on tense changes the modeling method of model |
CN106980385A (en) * | 2017-04-07 | 2017-07-25 | 吉林大学 | A kind of Virtual assemble device, system and method |
Non-Patent Citations (2)
Title |
---|
GRAHAM MCDONALD ET.AL: ""Enhancing Sensitivity Classification with Semantic"", 《COMPUTER SCIENCE》 * |
冯颖: ""网络舆情敏感话题发现平台的研究"", 《中国优秀硕士学位论文全文数据库息科技辑》 * |
Also Published As
Publication number | Publication date |
---|---|
CN107862081B (en) | 2021-07-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | Extreme clustering–a clustering method via density extreme points | |
CN108334574B (en) | Cross-modal retrieval method based on collaborative matrix decomposition | |
CN111612041B (en) | Abnormal user identification method and device, storage medium and electronic equipment | |
CN107153713A (en) | Overlapping community detection method and system based on similitude between node in social networks | |
WO2019041521A1 (en) | Apparatus and method for extracting user keyword, and computer-readable storage medium | |
CN106776562A (en) | Keyword extraction method and extraction system | |
CN108664574A (en) | Information input method, terminal device and medium | |
CN106503086A (en) | Detection method for distributed local outliers | |
CN104239553A (en) | Entity recognition method based on Map-Reduce framework | |
CN112148843B (en) | Text processing method and device, terminal equipment and storage medium | |
JP5057474B2 (en) | Method and system for calculating competition index between objects | |
Lian et al. | Reverse skyline search in uncertain databases | |
Xiong et al. | Affective impression: Sentiment-awareness POI suggestion via embedding in heterogeneous LBSNs | |
Yuan et al. | Privacy‐preserving mechanism for mixed data clustering with local differential privacy | |
CN107862081A (en) | Network Information Sources lookup method, device and server | |
CN110019763B (en) | Text filtering method, system, equipment and computer readable storage medium | |
Yuan et al. | Research of deceptive review detection based on target product identification and metapath feature weight calculation | |
CN114092729A (en) | Heterogeneous electricity consumption data publishing method based on cluster anonymization and differential privacy protection | |
Liu et al. | A network-based CNN model to identify the hidden information in text data | |
Tijare et al. | Correlation between k-means clustering and topic modeling methods on twitter datasets | |
US20200142910A1 (en) | Data clustering apparatus and method based on range query using cf tree | |
CN113988878A (en) | Graph database technology-based anti-fraud method and system | |
WO2021142968A1 (en) | Multilingual-oriented semantic similarity calculation method for general place names, and application thereof | |
Hong et al. | High-quality noise detection for knowledge graph embedding with rule-based triple confidence | |
Wang et al. | Edcleaner: Data cleaning for entity information in social network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||