CN109783582A - Knowledge base alignment method, apparatus, computer device and storage medium - Google Patents
Knowledge base alignment method, apparatus, computer device and storage medium
- Publication number
- CN109783582A (application number CN201811474699.XA)
- Authority
- CN
- China
- Prior art keywords
- knowledge
- entity
- similarity
- cluster
- entities
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
An embodiment of the invention discloses a knowledge base alignment method, apparatus, computer device and storage medium. The method comprises the following steps: obtaining a knowledge entity vector set, where the knowledge entity vector set is the vectorized representation of the knowledge entities in the knowledge base to be aligned; inputting the knowledge entity vector set into a preset knowledge entity clustering model to obtain a clustering result for the knowledge entities in the knowledge base to be aligned; according to the clustering result, selecting any two knowledge entities belonging to the same class and calculating the similarity between the two knowledge entities; and merging the two knowledge entities when the similarity is greater than a set first threshold. Restricting similarity comparisons to entities within the same class greatly reduces the amount of computation; clustering is realized with artificial-intelligence techniques so that the clustering result better matches expectations; and the similarity calculation combines entity attribute similarity with vector similarity, making the calculation more reasonable and more effective at finding and removing redundant information.
Description
Technical field
The present invention relates to the field of knowledge base processing technology, and in particular to a knowledge base alignment method, apparatus, computer device and storage medium.
Background technique
With the development of the Internet, more and more knowledge bases are being constructed in every field, and these knowledge bases are widely used in Internet applications such as search services and automatic question answering. Knowledge bases have a positive effect on the sharing and dissemination of information. However, the information in a single knowledge base is limited and in some cases cannot meet users' needs. Moreover, a knowledge base is usually extended continuously, and the storage resources it occupies keep growing; the continuously added data may contain redundancy. Such redundancy wastes storage resources, increases the amount of search computation, and causes repeated search results, which is inconvenient for users.
Knowledge base alignment refers to finding, among entities from different sources, those that correspond to the same real-world thing. Here an entity is anything that objectively exists and can be distinguished from other things, including specific people, events, objects, abstract concepts and relationships. Knowledge base alignment, that is, extracting entity information and removing redundant information, is therefore a key problem in constructing a high-quality knowledge base.
A common knowledge base alignment method uses entity attribute information to determine whether entities from different sources can be aligned. However, because entity data often belongs to user-generated content (User Generated Content, UGC) and the quality of data edited by different users is uneven, it is difficult to accurately determine whether two entities are the same using only user-edited attribute information.
Summary of the invention
The present invention provides a knowledge base alignment method, apparatus, computer device and storage medium.
To solve the above technical problems, the present invention proposes a knowledge base alignment method comprising the following steps:
obtaining a knowledge entity vector set, where the knowledge entity vector set is the vectorized representation of the knowledge entities in the knowledge base to be aligned;
inputting the knowledge entity vector set into a preset knowledge entity clustering model to obtain a clustering result for the knowledge entities in the knowledge base to be aligned;
according to the clustering result, selecting any two knowledge entities belonging to the same class and calculating the similarity between the two knowledge entities;
and merging the two knowledge entities when the similarity is greater than a set first threshold.
Optionally, before the step of obtaining the knowledge entity vector set, the method further comprises:
obtaining the knowledge entities in the knowledge base to be aligned;
vectorizing the knowledge entities based on the TF-IDF algorithm to obtain the knowledge entity vector set.
Optionally, the preset knowledge entity clustering model uses the DBSCAN density-based clustering algorithm.
Optionally, the preset knowledge entity clustering model uses a clustering model based on a convolutional neural network, and training the clustering model based on a convolutional neural network comprises the following steps:
obtaining training samples labelled with clustering judgment information, where the clustering judgment information of a training sample is the class of the sample knowledge entity;
inputting the training samples into the convolutional neural network model to obtain the model clustering reference information of the training samples;
comparing, through a loss function, whether the model clustering reference information of the different samples is consistent with the clustering judgment information;
and when the model clustering reference information is inconsistent with the clustering judgment information, iteratively updating the weights in the convolutional neural network model in a loop, ending when the model clustering reference information is consistent with the clustering judgment information.
Optionally, the step of selecting any two knowledge entities belonging to the same class according to the clustering result and calculating the similarity between the two knowledge entities specifically comprises:
obtaining the attributes of the two knowledge entities, where a knowledge entity attribute is data describing the corresponding knowledge entity;
calculating the attribute similarity and the vector similarity of the two knowledge entities;
and calculating the weighted sum of the attribute similarity and the vector similarity according to the following formula to obtain the similarity between the two knowledge entities:
S = aX + bY
where S is the similarity between the two knowledge entities, X is the attribute similarity, Y is the vector similarity, and a and b are the weights of the attribute similarity and the vector similarity respectively.
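The weighted combination above can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation; the function names, the equal default weights and the example threshold value of 0.8 are assumptions:

```python
def combined_similarity(attr_sim, vec_sim, a=0.5, b=0.5):
    # S = a*X + b*Y: weighted sum of attribute similarity X and
    # vector similarity Y, as in the formula above.
    return a * attr_sim + b * vec_sim

def should_merge(attr_sim, vec_sim, first_threshold=0.8):
    # Merge two same-class entities when S exceeds the first threshold.
    return combined_similarity(attr_sim, vec_sim) > first_threshold
```

The weights a and b would in practice be tuned to reflect how much the user-edited attributes can be trusted relative to the text vectors.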
Optionally, the step of merging the two knowledge entities when the similarity is greater than the set first threshold further comprises:
when the similarity is greater than a set second threshold, where the second threshold is greater than the first threshold, deleting either one of the two knowledge entities from the knowledge base to be aligned.
Optionally, the step of merging the two knowledge entities when the similarity is greater than the set first threshold further comprises:
a. splitting the two knowledge entities into several sub-entities;
b. selecting any two of the sub-entities and calculating the similarity between the two sub-entities;
c. when the similarity between the two sub-entities is greater than a preset third threshold, deleting either one of the two sub-entities, where the third threshold is greater than the first threshold;
d. repeating steps b and c until the similarity between any two retained sub-entities is less than or equal to the preset third threshold;
e. merging the retained sub-entities as the aligned entity of the two knowledge entities.
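Steps b through d above can be sketched as a simple deduplication loop. This is a hedged illustration, not the patented implementation: the patent does not specify the sub-entity similarity measure, so token-set Jaccard similarity is used here purely as a placeholder, and the threshold value is invented:

```python
def jaccard(a, b):
    # Hypothetical sub-entity similarity: Jaccard overlap of token sets.
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def dedup_sub_entities(sub_entities, third_threshold=0.8, sim=jaccard):
    # Steps b-d: repeatedly delete one of any pair of sub-entities whose
    # similarity exceeds the third threshold, until no such pair remains.
    kept = list(sub_entities)
    changed = True
    while changed:
        changed = False
        for i in range(len(kept)):
            for j in range(i + 1, len(kept)):
                if sim(kept[i], kept[j]) > third_threshold:
                    del kept[j]     # step c: drop either one of the pair
                    changed = True
                    break
            if changed:
                break
    return kept  # step e would merge these retained sub-entities
```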
To solve the above problems, the present invention also provides a knowledge base alignment apparatus, comprising:
an obtaining module, for obtaining a knowledge entity vector set, where the knowledge entity vector set is the vectorized representation of the knowledge entities in the knowledge base to be aligned;
a processing module, for inputting the knowledge entity vector set into a preset knowledge entity clustering model to obtain the clustering result of the knowledge entities in the knowledge base to be aligned;
a computing module, for selecting, according to the clustering result, any two knowledge entities belonging to the same class and calculating the similarity between the two knowledge entities;
and an execution module, for merging the two knowledge entities when the similarity is greater than a set first threshold.
Optionally, the knowledge base alignment apparatus further comprises:
a first acquisition submodule, for obtaining the knowledge entities in the knowledge base to be aligned;
and a first processing submodule, for vectorizing the knowledge entities based on the TF-IDF algorithm to obtain the knowledge entity vector set.
Optionally, in the knowledge base alignment apparatus, the preset knowledge entity clustering model uses the DBSCAN density-based clustering algorithm.
Optionally, in the knowledge base alignment apparatus, the preset knowledge entity clustering model uses a clustering model based on a convolutional neural network.
Optionally, the computing module comprises:
a second acquisition submodule, for obtaining the attributes of the two knowledge entities, where a knowledge entity attribute is data describing the corresponding knowledge entity;
a first computation submodule, for calculating the attribute similarity and the vector similarity of the two knowledge entities;
and a second computation submodule, for calculating the weighted sum of the attribute similarity and the vector similarity of the two knowledge entities according to the following formula to obtain the similarity between the two knowledge entities:
S = aX + bY
where S is the similarity between the two knowledge entities, X is the attribute similarity, Y is the vector similarity, and a and b are the weights of the attribute similarity and the vector similarity respectively.
Optionally, the execution module comprises:
a first execution submodule, for deleting either one of the two knowledge entities from the knowledge base to be aligned when the similarity is greater than a set second threshold, where the second threshold is greater than the first threshold.
Optionally, the execution module comprises:
a first segmentation submodule, for splitting the two knowledge entities into several sub-entities;
a third computation submodule, for selecting any two of the sub-entities and calculating the similarity between the two sub-entities;
a second execution submodule, for deleting either one of the two sub-entities when the similarity between the two sub-entities is greater than a preset third threshold, where the third threshold is greater than the first threshold;
a first loop submodule, for making the third computation submodule and the second execution submodule run repeatedly until the similarity between any two retained sub-entities is less than or equal to the preset third threshold;
and a third execution submodule, for merging the retained sub-entities as the aligned entity of the two knowledge entities.
To solve the above technical problems, an embodiment of the present invention also provides a computer device comprising a memory and a processor, where computer-readable instructions are stored in the memory, and when the computer-readable instructions are executed by the processor, the processor executes the steps of the knowledge base alignment method described above.
To solve the above technical problems, an embodiment of the present invention also provides a computer-readable storage medium on which computer-readable instructions are stored, and when the computer-readable instructions are executed by a processor, the processor executes the steps of the knowledge base alignment method described above.
The embodiments of the present invention have the following beneficial effects: a knowledge entity vector set is obtained and input into a preset knowledge entity clustering model to obtain the clustering result of the knowledge entities in the knowledge base to be aligned; according to the clustering result, any two knowledge entities belonging to the same class are selected and the similarity between them is calculated; and the two knowledge entities are merged when the similarity is greater than a set first threshold. Restricting similarity comparisons to entities of the same class greatly reduces the amount of computation, and the similarity calculation combines entity attribute similarity with vector similarity, making the calculation more reasonable and more effective at finding and removing redundant information.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required in the description of the embodiments are briefly introduced below. It is apparent that the drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a basic flow diagram of a knowledge base alignment method of an embodiment of the present invention;
Fig. 2 is a schematic diagram of knowledge entity vectorization based on the TF-IDF algorithm in an embodiment of the present invention;
Fig. 3 is a training flow diagram of the clustering model based on a convolutional neural network in an embodiment of the present invention;
Fig. 4 is a flow diagram of knowledge entity similarity calculation in an embodiment of the present invention;
Fig. 5 is a flow diagram of knowledge entity merging in an embodiment of the present invention;
Fig. 6 is a basic structural block diagram of a knowledge base alignment apparatus of an embodiment of the present invention;
Fig. 7 is a basic structural block diagram of a computer device of an embodiment of the present invention.
Specific embodiment
To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings.
Some of the processes described in the specification, claims and drawings contain multiple operations that appear in a particular order, but it should be clearly understood that these operations may be executed out of the order in which they appear herein or in parallel. Operation serial numbers such as 101 and 102 are only used to distinguish different operations and do not themselves represent any execution order. In addition, these processes may include more or fewer operations, which may be executed sequentially or in parallel. It should be noted that descriptions such as "first" and "second" herein are used to distinguish different messages, devices, modules and so on; they do not represent an order, nor do they restrict "first" and "second" to being of different types.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of the present invention.
Embodiment
Those skilled in the art will appreciate that "terminal" and "terminal device" as used herein include both devices with only a wireless signal receiver and no transmitting capability, and devices with receiving and transmitting hardware capable of two-way communication over a bidirectional communication link. Such devices may include: cellular or other communication devices, with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio-frequency receiver, pager, Internet/intranet access, web browser, notepad, calendar and/or GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that include a radio-frequency receiver. "Terminal" and "terminal device" as used herein may be portable, transportable, installed in a vehicle (air, sea and/or land), or suitable for and/or configured to run locally and/or in distributed form at any location on earth and/or in space. "Terminal" and "terminal device" may also be a communication terminal, an Internet terminal or a music/video playback terminal, for example a PDA, an MID (Mobile Internet Device) and/or a mobile phone with music/video playback functions, or a device such as a smart television or set-top box.
The terminal in this embodiment is the above-described terminal.
Specifically, referring to Fig. 1, which is a basic flow diagram of the knowledge base alignment method of this embodiment, the knowledge base alignment method comprises the following steps:
S101, obtaining a knowledge entity vector set, where the knowledge entity vector set is the vectorized representation of the knowledge entities in the knowledge base to be aligned;
The knowledge entities stored in a knowledge base are usually text or pictures. Aligning knowledge entities usually requires calculating the similarity between them, so to facilitate computer processing and understanding, the knowledge entities need to be converted into vectors. For example, the vectorized representation of text can be realized with a vector space model, also called a bag-of-words model. The simplest form is word-based one-hot encoding: each word serves as a dimension key; the position corresponding to a word that occurs is 1, all other positions are 0, and the vector length equals the dictionary size.
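The one-hot bag-of-words encoding described above can be sketched as follows. This is a minimal illustration under the stated scheme (whitespace tokenization is an assumption; the patent does not specify the word segmentation method):

```python
def one_hot_bag_of_words(texts):
    # Build a dictionary over all texts, then represent each text as a
    # 0/1 vector: 1 at the position of every word that occurs, 0 elsewhere.
    # The vector length equals the dictionary size, as described above.
    vocab = sorted({w for t in texts for w in t.split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for t in texts:
        v = [0] * len(vocab)
        for w in t.split():
            v[index[w]] = 1
        vectors.append(v)
    return vocab, vectors
```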
S102, inputting the knowledge entity vector set into a preset knowledge entity clustering model to obtain the clustering result of the knowledge entities in the knowledge base to be aligned;
The vector set representing the knowledge entities is input into the preset knowledge entity clustering model. The clustering model uses a density-based clustering algorithm. Density-based clustering does not require the number of clusters to be determined in advance, can find clusters of arbitrary shape, can recognize noise points, and is robust to outliers and can detect them. DBSCAN is one of the most typical representative algorithms of this kind. Its core idea is to first find points of higher density and then gradually connect nearby high-density points together, thereby generating clusters. The concrete algorithm works as follows: for each data point, draw a circle with the point as centre and eps as radius (called the eps-neighbourhood), then count how many points fall inside this circle; this count is the density value of the point. Then choose a density threshold MinPts: a centre point whose circle contains fewer than MinPts points is a low-density point, and a centre point whose circle contains at least MinPts points is a high-density point (called a core point). If one high-density point lies inside the circle of another high-density point, the two points are connected, and in this way many points can be chained together. Afterwards, if a low-density point lies inside the circle of a high-density point, it is connected to the nearest high-density point and is called a boundary point. All points connected together in this way form a cluster, and a low-density point that lies in no high-density point's circle is an outlier.
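The DBSCAN procedure just described can be sketched in plain Python. This is a simplified illustration, not a production implementation; the point coordinates and parameter values in the usage below are invented for the example, and -1 marks outliers:

```python
import math

def dbscan(points, eps, min_pts):
    # A point is a core (high-density) point when its eps-neighbourhood
    # contains at least min_pts points (counting the point itself).
    n = len(points)
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None or len(neighbours[i]) < min_pts:
            continue
        # Start a new cluster from this core point and chain outward.
        cluster += 1
        labels[i] = cluster
        frontier = list(neighbours[i])
        while frontier:
            j = frontier.pop()
            if labels[j] is None:
                labels[j] = cluster
                if len(neighbours[j]) >= min_pts:
                    frontier.extend(neighbours[j])  # core point: keep expanding
                # otherwise j is a boundary point: joined but not expanded
    # Points reached by no core point are outliers.
    return [lbl if lbl is not None else -1 for lbl in labels]
```

Applied to two tight groups of points plus one distant point, this yields two clusters and one outlier, matching the behaviour described in the text.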
In some embodiments, clustering is realized with a trained convolutional neural network model. The convolutional neural network learns the features of the manually clustered training samples, so that the model can cluster knowledge entities as expected.
S103, according to the clustering result, selecting any two knowledge entities belonging to the same class and calculating the similarity between the two knowledge entities;
Through step S102 the knowledge entities in the knowledge base are clustered; then, within the same class, the similarity of any two knowledge entities is calculated to determine whether redundant entities exist. This narrows the range of entity comparisons, reduces the amount of computation, and improves the efficiency of detecting redundant entities.
The similarity of two knowledge entities is obtained by calculating the similarity between the vectors that represent them. The similarity between two vectors can be the cosine similarity, which measures the similarity between two vectors through the cosine of the angle between them. The cosine of a 0-degree angle is 1, the cosine of any other angle is not greater than 1, and the minimum value is -1. The cosine of the angle between two vectors therefore indicates whether the two vectors point in roughly the same direction: when they point in the same direction the cosine similarity is 1; when the angle between them is 90 degrees it is 0; and when they point in exactly opposite directions it is -1. The result is independent of the lengths of the vectors and depends only on their directions. Cosine similarity applies to vector spaces of any dimension and is commonly used in high-dimensional positive spaces, so it is well suited to comparing text documents.
The similarity between two vectors can also be measured by calculating the Euclidean distance between them. To avoid scale effects, the vectors are first normalized, and then the distance between two points X1 and X2 in the vector space is computed according to the following formula:
d(X1, X2) = sqrt( Σi (x1i - x2i)² )
where x1i and x2i are the values of each dimension of X1 and X2 after normalization.
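Both similarity measures described above can be sketched directly from their definitions. This is an illustrative sketch (the function names are invented; a production system would guard against zero-length vectors):

```python
import math

def cosine_similarity(x, y):
    # cos(theta) = x·y / (|x| |y|): depends only on direction, not length.
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny)

def normalized_euclidean(x, y):
    # Normalize both vectors to unit length first (to avoid scale
    # effects), then take the ordinary Euclidean distance.
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    u = [a / nx for a in x]
    v = [b / ny for b in y]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Note that two vectors pointing the same way at different scales score 1 in cosine similarity and 0 in normalized Euclidean distance, illustrating that both measures ignore vector length.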
S104, merging the two knowledge entities when the similarity is greater than a set first threshold.
A threshold, referred to here as the first threshold, is preset. When the similarity of two knowledge entities is greater than the set first threshold, part of the content of the two knowledge entities is considered to be duplicated, and the two knowledge entities are merged into one entity.
As shown in Fig. 2, the following steps are further included before S101:
S111, obtaining the knowledge entities in the knowledge base to be aligned;
The knowledge entities are obtained by accessing the server where the knowledge base is located; they may belong to the same knowledge base or come from multiple knowledge bases.
S112, vectorizing the knowledge entities based on the TF-IDF algorithm to obtain the knowledge entity vector set.
Besides the bag-of-words vectorization described above, knowledge entities can also be vectorized based on the TF-IDF algorithm. TF-IDF is a statistical method for assessing the importance of a word to a document in a document collection or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases in inverse proportion to its frequency in the corpus. TF-IDF is in fact TF*IDF, where TF (Term Frequency) is the frequency with which a term appears in document d, and IDF is the Inverse Document Frequency. To vectorize text with TF-IDF, a dictionary is likewise constructed, and the TF-IDF value of each word is used as the weight of that word.
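The TF-IDF weighting just described can be sketched as follows. This is a minimal illustration of the scheme; the exact TF and IDF normalizations vary between implementations, and the plain forms used here (relative term frequency, natural-log IDF) are assumptions, as is whitespace tokenization:

```python
import math

def tf_idf_vectors(docs):
    # TF-IDF = TF * IDF: term frequency within the document times the
    # inverse document frequency over the whole corpus.  A dictionary
    # is built over all documents, and each document becomes a vector
    # of TF-IDF weights over that dictionary.
    tokenized = [d.split() for d in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    n_docs = len(docs)
    df = {w: sum(1 for doc in tokenized if w in doc) for w in vocab}
    idf = {w: math.log(n_docs / df[w]) for w in vocab}
    vectors = []
    for doc in tokenized:
        tf = {w: doc.count(w) / len(doc) for w in set(doc)}
        vectors.append([tf.get(w, 0.0) * idf[w] for w in vocab])
    return vocab, vectors
```

A word that appears in every document gets IDF = log(1) = 0, so its weight vanishes, which matches the intuition that corpus-wide common words carry little distinguishing information.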
As shown in Fig. 3, training the clustering model based on a convolutional neural network comprises the following steps:
S121, obtaining training samples labelled with clustering judgment information, where the clustering judgment information of a training sample is the class of the sample knowledge entity;
In this embodiment of the present invention, the training objective of the convolutional neural network is to identify the class to which a knowledge entity belongs. During training, the convolutional neural network model learns the manually labelled class features of the samples and thereby realizes the clustering of knowledge entities.
S122, inputting the training samples into the convolutional neural network model to obtain the model clustering reference information of the training samples;
The convolutional neural network model consists of convolutional layers, pooling layers, fully connected layers and a classification layer. The convolutional layers perceive the knowledge entity vector locally and are usually connected in a cascade; convolutional layers positioned later in the cascade can perceive more global information.
The fully connected layer acts as the "classifier" of the whole convolutional neural network. If the convolutional, pooling and activation-function layers map the raw data into a hidden-layer feature space, the fully connected layer maps the learned "distributed feature representation" into the sample label space. The fully connected layer is connected to the output of the convolutional layers and can perceive the global features of the knowledge entity vector.
The training samples are input into the convolutional neural network model to obtain the model's output clustering reference information.
S123: compare, by means of a loss function, the model clustering reference information of different samples in the training set with the cluster judgment information to determine whether they are consistent;
The loss function compares the cluster reference information with the cluster judgment information labeled on the samples. This embodiment of the present invention uses the softmax cross-entropy loss function. Specifically:
Suppose there are N training samples in total; the input feature of the i-th sample at the final layer of the network is X_i, its corresponding label Y_i is the final classification result, and h = (h_1, h_2, ..., h_C) is the final output of the network, i.e. the prediction for sample i, where C is the total number of classes.
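With the symbols above, the softmax cross-entropy loss is conventionally L = -(1/N) Σ_i log(exp(h_{Y_i}) / Σ_c exp(h_c)); the exact formula is not reproduced in the text, so the averaging over N is an assumption. A minimal sketch in plain Python:

```python
import math

def softmax_cross_entropy(outputs, labels):
    """Mean softmax cross-entropy loss.

    outputs: list of N score vectors h = (h_1, ..., h_C), one per sample.
    labels:  list of N integer class indices Y_i.
    """
    total = 0.0
    for h, y in zip(outputs, labels):
        m = max(h)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(v - m) for v in h))
        total += log_z - h[y]  # equals -log softmax(h)[y]
    return total / len(outputs)

loss = softmax_cross_entropy([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]], [0, 1])
```

The loss is small when the score of the labeled class dominates the others, and large otherwise, which is what drives the weight updates in step S124.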
S124: when the model clustering reference information is inconsistent with the cluster judgment information, iteratively update the weights in the convolutional neural network model, terminating when the model clustering reference information is consistent with the cluster judgment information.
During training, the weights of each node in the convolutional neural network model are adjusted so that the softmax cross-entropy loss function converges as far as possible; that is, the weights are adjusted continuously until the value of the loss function no longer decreases (or begins to increase), at which point training of the convolutional neural network can be considered finished. The weights of each node are adjusted by gradient descent, an optimization algorithm used recursively in machine learning and artificial intelligence to approach a model's minimum deviation.
Clustering knowledge entities with the trained convolutional neural network model brings the clustering results closer to the user's expectations.
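The iterative update of S124, with its "stop when the loss no longer decreases" rule, can be sketched as plain gradient descent. A toy quadratic loss stands in here for the network's cross-entropy loss, and the learning rate and iteration cap are assumptions:

```python
def train(gradient_fn, loss_fn, w, lr=0.1, max_iters=1000):
    """Iteratively update weights by gradient descent, stopping once the
    loss stops decreasing (the termination rule described in S124)."""
    prev = loss_fn(w)
    for _ in range(max_iters):
        w = [wi - lr * g for wi, g in zip(w, gradient_fn(w))]
        cur = loss_fn(w)
        if cur >= prev:  # loss no longer decreases: training can end
            break
        prev = cur
    return w

# toy example: minimize (w0 - 3)^2 + (w1 + 1)^2, minimum at (3, -1)
loss = lambda w: (w[0] - 3) ** 2 + (w[1] + 1) ** 2
grad = lambda w: [2 * (w[0] - 3), 2 * (w[1] + 1)]
w = train(grad, loss, [0.0, 0.0])
```

In a real CNN the gradient would come from backpropagation through the convolutional and fully connected layers rather than from a closed-form expression.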
As shown in Fig. 4, step S103 further includes the following steps:
S131: obtain the attributes of the two knowledge entities, where a knowledge entity's attributes are data describing the corresponding knowledge entity;
In some cases, although two knowledge entities are not highly similar in content, they both correspond to the same entity in reality; that is, the two knowledge entities each describe part of the information about some real-world entity. For convenience of use, these two parts of information should be combined, so attribute similarity is introduced here. First the knowledge entity attributes are obtained; an attribute is data describing a knowledge entity and may also be called a label.
S132: calculate the attribute similarity and vector similarity of the two knowledge entities;
In this embodiment of the present invention, the attribute similarity between two knowledge entities is measured by edit distance. The edit distance is the minimum number of character operations required to transform string A into string B, where a character operation is deleting a character, modifying a character, or inserting a character. With the cost of each operation set to 1, the attribute similarity can be calculated by the following formula:
attribute similarity = 1 - edit distance / maximum length of the two attribute strings
The vector similarity is the aforementioned cosine similarity or Euclidean distance used to measure the similarity of two knowledge entity vectors.
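The edit-distance-based attribute similarity above can be sketched as follows (the function names are illustrative; the unit cost per insertion, deletion and modification matches the text):

```python
def edit_distance(a, b):
    """Minimum number of single-character insertions, deletions or
    modifications (each with cost 1) turning string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete a character
                           cur[j - 1] + 1,           # insert a character
                           prev[j - 1] + (ca != cb)))  # modify a character
        prev = cur
    return prev[-1]

def attribute_similarity(a, b):
    """attribute similarity = 1 - edit distance / max string length."""
    if not a and not b:
        return 1.0
    return 1 - edit_distance(a, b) / max(len(a), len(b))
```

For example, the distance between "kitten" and "sitting" is 3 (two modifications and one insertion), so their attribute similarity is 1 - 3/7.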
S133: calculate the weighted sum of the attribute similarity and vector similarity of the two knowledge entities according to the following formula to obtain the similarity between the two knowledge entities, namely:
S=aX+bY
where S is the similarity between the two knowledge entities, X is the attribute similarity, Y is the vector similarity, and a and b are the weights of the attribute similarity and the vector similarity respectively.
Combining attribute similarity and vector similarity makes it possible, even when content similarity is not high, to find two knowledge entities that describe the same real-world entity and to merge them, which facilitates both use by the user and maintenance of the knowledge base.
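Steps S132 and S133 can be sketched together as follows. The weights a = b = 0.5 are illustrative assumptions (the patent leaves the weights open), and cosine similarity is used for the vector similarity:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two knowledge entity vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = (math.sqrt(sum(x * x for x in u))
            * math.sqrt(sum(y * y for y in v)))
    return dot / norm if norm else 0.0

def combined_similarity(attr_sim, vec_sim, a=0.5, b=0.5):
    """Weighted sum S = aX + bY of attribute similarity X and
    vector similarity Y."""
    return a * attr_sim + b * vec_sim

vec_sim = cosine_similarity([1.0, 2.0, 0.0], [2.0, 4.0, 0.0])  # parallel vectors
s = combined_similarity(0.8, vec_sim)
```

Parallel vectors give a cosine similarity of 1.0, so with an attribute similarity of 0.8 the combined score is 0.9, which would exceed a first threshold of, say, 0.7.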
Step S104 further includes the following step:
S141: when the similarity is greater than a set second threshold, where the second threshold is greater than the first threshold, delete either one of the two knowledge entities from the knowledge base to be aligned.
For the case where the similarity of two knowledge entities is very high, a second threshold is set here, greater than the above first threshold, for example 0.95. When the similarity exceeds the second threshold, the two knowledge entities are considered essentially identical; deleting either one of them from the knowledge base is then an effective way to remove redundancy.
As shown in Fig. 5, step S104 further includes the following steps:
S151: split the two knowledge entities into several sub-entities;
When the similarity of two knowledge entities is greater than the preset first threshold, the two knowledge entities are considered partially redundant. To pick out the duplicated content, the two knowledge entities are first divided into several sub-entities according to certain rules, for example by paragraph.
S152: select any two sub-entities among the several sub-entities and calculate the similarity between the two sub-entities;
Any two sub-entities after splitting are selected and the similarity between them is calculated: as described above, each sub-entity is first vectorized, and the similarity between the vectors representing the sub-entities is then computed, using either cosine similarity or Euclidean distance.
S153: when the similarity between the two sub-entities is greater than a preset third threshold, delete either one of the two sub-entities, where the third threshold is greater than the first threshold;
When the similarity between two sub-entities is greater than the preset threshold, here called the third threshold, the contents of the two sub-entities are considered essentially duplicated, and either one is deleted. To avoid deleting too much content, the third threshold is required to be greater than the above first threshold.
S154: repeat step S152 and step S153 until the similarity between any two of the retained sub-entities is less than or equal to the preset third threshold;
The similarity comparison between sub-entities is repeated and highly overlapping sub-entities are deleted, so that the similarity between any two retained sub-entities is less than or equal to the preset third threshold.
S155: merge the retained sub-entities as the aligned entity of the two knowledge entities.
The retained sub-entities are merged as the alignment result of the two knowledge entities to be aligned.
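Steps S151 to S155 can be sketched as a pairwise deduplication loop. The paragraph-based split, the function names and the toy similarity function are illustrative assumptions; any similarity measure in [0, 1] would serve:

```python
from itertools import combinations

def deduplicate_sub_entities(entity_a, entity_b, similarity, third_threshold):
    """Split two knowledge entities by paragraph (S151), repeatedly drop one
    of any pair of sub-entities whose similarity exceeds the third threshold
    (S152-S154), and merge what remains into the aligned entity (S155)."""
    subs = entity_a.split("\n") + entity_b.split("\n")
    changed = True
    while changed:
        changed = False
        for x, y in combinations(subs, 2):
            if similarity(x, y) > third_threshold:
                subs.remove(y)  # delete either one of the pair
                changed = True
                break           # restart: the pair set has changed
    return "\n".join(subs)

# toy similarity: 1.0 for identical paragraphs, 0.0 otherwise
sim = lambda x, y: 1.0 if x == y else 0.0
aligned = deduplicate_sub_entities("intro\nfacts", "facts\nhistory", sim, 0.9)
```

The loop terminates because every iteration either removes a sub-entity or finds no pair above the threshold, at which point the retained sub-entities satisfy the condition of S154.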
To solve the above technical problem, an embodiment of the present invention further provides a knowledge base alignment device. Referring specifically to Fig. 6, Fig. 6 is a basic structural block diagram of the knowledge base alignment device of this embodiment.
As shown in Fig. 6, a knowledge base alignment device comprises: an obtaining module 210, a processing module 220, a computing module 230 and an execution module 240. The obtaining module 210 is configured to obtain a knowledge entity vector set, where the knowledge entity vector set is the vectorized representation of the knowledge entities in the knowledge base to be aligned; the processing module 220 is configured to input the knowledge entity vector set into a preset knowledge entity clustering model to obtain a clustering result for the knowledge entities in the knowledge base to be aligned; the computing module 230 is configured to select, according to the clustering result, any two knowledge entities belonging to the same class and calculate the similarity between the two knowledge entities; and the execution module 240 is configured to merge the two knowledge entities when the similarity is greater than a set first threshold.
In this embodiment of the present invention, a knowledge entity vector set is obtained and input into a preset knowledge entity clustering model to obtain a clustering result for the knowledge entities in the knowledge base to be aligned; according to the clustering result, any two knowledge entities belonging to the same class are selected and the similarity between them is calculated; and when the similarity is greater than the set first threshold, the two knowledge entities are merged. Restricting the similarity comparison to knowledge entities of the same class greatly reduces the amount of calculation, and because the similarity calculation combines entity attribute similarity with vector similarity, it is more reasonable and can find and remove redundant information more effectively.
In some embodiments, the knowledge base alignment device further includes a first obtaining submodule and a first processing submodule. The first obtaining submodule is configured to obtain the knowledge entities in the knowledge base to be aligned; the first processing submodule is configured to vectorize the knowledge entities based on the TF-IDF algorithm to obtain the knowledge entity vector set.
In some embodiments, the preset knowledge entity clustering model in the knowledge base alignment device uses the DBSCAN density clustering algorithm.
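The DBSCAN option can be illustrated with a minimal pure-Python sketch (the eps and min_pts values, the toy points and the function names are illustrative; a production system would use a library implementation):

```python
def dbscan(points, eps, min_pts, dist):
    """Minimal DBSCAN: label each point with a cluster id, or -1 for noise.
    A core point has at least min_pts neighbours within eps (including
    itself); clusters grow by expanding from core points."""
    labels = [None] * len(points)
    cluster = -1

    def neighbours(i):
        return [j for j in range(len(points))
                if dist(points[i], points[j]) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # noise (may become a border point later)
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            jn = neighbours(j)
            if len(jn) >= min_pts:   # j is a core point: keep expanding
                seeds.extend(jn)
    return labels

euclid = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
labels = dbscan(pts, eps=2.0, min_pts=2, dist=euclid)
```

Here the first three points form one density-connected cluster, the next two another, and the isolated point is labeled noise; applied to knowledge entity vectors, each cluster delimits the candidates for pairwise similarity comparison.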
In some embodiments, the preset knowledge entity clustering model in the knowledge base alignment device uses a clustering model based on a convolutional neural network.
In some embodiments, the computing module 230 includes a second obtaining submodule, a first calculating submodule and a second calculating submodule. The second obtaining submodule is configured to obtain the attributes of the two knowledge entities, where a knowledge entity's attributes are data describing the corresponding knowledge entity; the first calculating submodule is configured to calculate the attribute similarity and vector similarity of the two knowledge entities; and the second calculating submodule is configured to calculate the weighted sum of the attribute similarity and vector similarity of the two knowledge entities according to the following formula to obtain the similarity between the two knowledge entities, namely:
S=aX+bY
where S is the similarity between the two knowledge entities, X is the attribute similarity, Y is the vector similarity, and a and b are the weights of the attribute similarity and the vector similarity respectively.
In some embodiments, the execution module 240 includes a first executing submodule configured to delete, when the similarity is greater than a set second threshold, where the second threshold is greater than the first threshold, either one of the two knowledge entities from the knowledge base to be aligned.
In some embodiments, the execution module 240 includes a first splitting submodule, a third calculating submodule, a second executing submodule, a first looping submodule and a third executing submodule. The first splitting submodule is configured to split the two knowledge entities into several sub-entities; the third calculating submodule is configured to select any two sub-entities among the several sub-entities and calculate the similarity between the two sub-entities; the second executing submodule is configured to delete, when the similarity between the two sub-entities is greater than a preset third threshold, either one of the two sub-entities, where the third threshold is greater than the first threshold; the first looping submodule is configured to repeat the operations of the third calculating submodule and the second executing submodule until the similarity between any two of the retained sub-entities is less than or equal to the preset third threshold; and the third executing submodule is configured to merge the retained sub-entities as the aligned entity of the two knowledge entities.
To solve the above technical problem, an embodiment of the present invention further provides a computer device. Referring specifically to Fig. 7, Fig. 7 is a basic structural block diagram of the computer device of this embodiment.
Fig. 7 is a schematic diagram of the internal structure of the computer device. As shown in Fig. 7, the computer device includes a processor, a non-volatile storage medium, a memory and a network interface connected by a system bus. The non-volatile storage medium of the computer device stores an operating system, a database and computer-readable instructions; the database may store control information sequences, and when the computer-readable instructions are executed by the processor, the processor is caused to implement a knowledge base alignment method. The processor of the computer device provides computing and control capability and supports the operation of the entire computer device. The memory of the computer device may store computer-readable instructions which, when executed by the processor, cause the processor to execute a knowledge base alignment method. The network interface of the computer device is used to connect and communicate with a terminal. Those skilled in the art can understand that the structure shown in Fig. 7 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In this implementation, the processor is configured to execute the specific content of the obtaining module 210, the processing module 220, the computing module 230 and the execution module 240 in Fig. 6, and the memory stores the program code and various kinds of data needed to execute the above modules. The network interface is used for data transmission to and from a user terminal or server. The memory in this implementation stores the program code and data needed to execute all the submodules of the knowledge base alignment method, and the server can invoke that program code and data to execute the functions of all the submodules.
The computer device obtains a knowledge entity vector set and inputs it into a preset knowledge entity clustering model to obtain a clustering result for the knowledge entities in the knowledge base to be aligned; according to the clustering result, any two knowledge entities belonging to the same class are selected and the similarity between them is calculated; and when the similarity is greater than the set first threshold, the two knowledge entities are merged. Restricting the similarity comparison to knowledge entities of the same class greatly reduces the amount of calculation, and because the similarity calculation combines entity attribute similarity with vector similarity, it is more reasonable and can find and remove redundant information more effectively.
The present invention further provides a storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to execute the steps of the knowledge base alignment method described in any of the above embodiments.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by instructing relevant hardware through a computer program; the program can be stored in a computer-readable storage medium, and when executed, may include the processes of the embodiments of each of the above methods. The aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc or a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
It should be understood that although the steps in the flowcharts of the drawings are shown in sequence as indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order restricting the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the drawings may include multiple sub-steps or stages; these are not necessarily completed at the same moment but may be executed at different times, and their execution order is not necessarily sequential: they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
The above are only some embodiments of the present invention. It should be noted that, for those of ordinary skill in the art, various improvements and modifications may be made without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A knowledge base alignment method, characterized by comprising the following steps:
obtaining a knowledge entity vector set, wherein the knowledge entity vector set is the vectorized representation of the knowledge entities in a knowledge base to be aligned;
inputting the knowledge entity vector set into a preset knowledge entity clustering model to obtain a clustering result for the knowledge entities in the knowledge base to be aligned;
according to the clustering result, selecting any two knowledge entities belonging to the same class and calculating the similarity between the two knowledge entities;
when the similarity is greater than a set first threshold, merging the two knowledge entities.
2. The knowledge base alignment method according to claim 1, characterized in that, before the step of obtaining the knowledge entity vector set, the method further comprises the following steps:
obtaining the knowledge entities in the knowledge base to be aligned;
vectorizing the knowledge entities based on the TF-IDF algorithm to obtain the knowledge entity vector set.
3. The knowledge base alignment method according to claim 1, characterized in that the preset knowledge entity clustering model uses the DBSCAN density clustering algorithm.
4. The knowledge base alignment method according to claim 1, characterized in that the preset knowledge entity clustering model uses a clustering model based on a convolutional neural network, and training the clustering model based on the convolutional neural network comprises the following steps:
obtaining training samples labeled with cluster judgment information, wherein the cluster judgment information of a training sample is the class of the sample knowledge entity;
inputting the training samples into a convolutional neural network model to obtain model clustering reference information for the training samples;
comparing, by means of a loss function, the model clustering reference information of different samples in the training set with the cluster judgment information to determine whether they are consistent;
when the model clustering reference information is inconsistent with the cluster judgment information, iteratively updating the weights in the convolutional neural network model, terminating when the model clustering reference information is consistent with the cluster judgment information.
5. The knowledge base alignment method according to claim 1, characterized in that the step of selecting, according to the clustering result, any two knowledge entities belonging to the same class and calculating the similarity between the two knowledge entities specifically comprises the following steps:
obtaining the attributes of the two knowledge entities, wherein a knowledge entity's attributes are data describing the corresponding knowledge entity;
calculating the attribute similarity and vector similarity of the two knowledge entities;
calculating the weighted sum of the attribute similarity and vector similarity of the two knowledge entities according to the following formula to obtain the similarity between the two knowledge entities, namely:
S=aX+bY
wherein S is the similarity between the two knowledge entities, X is the attribute similarity, Y is the vector similarity, and a and b are the weights of the attribute similarity and the vector similarity respectively.
6. The knowledge base alignment method according to claim 1, characterized in that the step of merging the two knowledge entities when the similarity is greater than the set first threshold further comprises the following step:
when the similarity is greater than a set second threshold, wherein the second threshold is greater than the first threshold, deleting either one of the two knowledge entities from the knowledge base to be aligned.
7. The knowledge base alignment method according to claim 1, characterized in that the step of merging the two knowledge entities when the similarity is greater than the set first threshold further comprises the following steps:
a. splitting the two knowledge entities into several sub-entities;
b. selecting any two sub-entities among the several sub-entities and calculating the similarity between the two sub-entities;
c. when the similarity between the two sub-entities is greater than a preset third threshold, deleting either one of the two sub-entities, wherein the third threshold is greater than the first threshold;
d. repeating step b and step c until the similarity between any two of the retained sub-entities is less than or equal to the preset third threshold;
e. merging the retained sub-entities as the aligned entity of the two knowledge entities.
8. A knowledge base alignment device, characterized by comprising:
an obtaining module, configured to obtain a knowledge entity vector set, wherein the knowledge entity vector set is the vectorized representation of the knowledge entities in a knowledge base to be aligned;
a processing module, configured to input the knowledge entity vector set into a preset knowledge entity clustering model to obtain a clustering result for the knowledge entities in the knowledge base to be aligned;
a computing module, configured to select, according to the clustering result, any two knowledge entities belonging to the same class and calculate the similarity between the two knowledge entities;
an execution module, configured to merge the two knowledge entities when the similarity is greater than a set first threshold.
9. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, cause the processor to execute the steps of the knowledge base alignment method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the steps of the knowledge base alignment method according to any one of claims 1 to 7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811474699.XA CN109783582B (en) | 2018-12-04 | 2018-12-04 | Knowledge base alignment method, device, computer equipment and storage medium |
PCT/CN2019/103487 WO2020114022A1 (en) | 2018-12-04 | 2019-08-30 | Knowledge base alignment method and apparatus, computer device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811474699.XA CN109783582B (en) | 2018-12-04 | 2018-12-04 | Knowledge base alignment method, device, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109783582A true CN109783582A (en) | 2019-05-21 |
CN109783582B CN109783582B (en) | 2023-08-15 |
Family
ID=66496644
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811474699.XA Active CN109783582B (en) | 2018-12-04 | 2018-12-04 | Knowledge base alignment method, device, computer equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109783582B (en) |
WO (1) | WO2020114022A1 (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110377906A (en) * | 2019-07-15 | 2019-10-25 | 出门问问信息科技有限公司 | Entity alignment schemes, storage medium and electronic equipment |
CN110427436A (en) * | 2019-07-31 | 2019-11-08 | 北京百度网讯科技有限公司 | The method and device of entity similarity calculation |
CN111026865A (en) * | 2019-10-18 | 2020-04-17 | 平安科技(深圳)有限公司 | Relation alignment method, device and equipment of knowledge graph and storage medium |
CN111159420A (en) * | 2019-12-12 | 2020-05-15 | 西安交通大学 | Entity optimization method based on attribute calculation and knowledge template |
WO2020114022A1 (en) * | 2018-12-04 | 2020-06-11 | 平安科技(深圳)有限公司 | Knowledge base alignment method and apparatus, computer device and storage medium |
CN111488461A (en) * | 2020-03-24 | 2020-08-04 | 腾讯科技(深圳)有限公司 | Data processing method and device, electronic equipment and storage medium |
CN111563192A (en) * | 2020-04-28 | 2020-08-21 | 腾讯科技(深圳)有限公司 | Entity alignment method and device, electronic equipment and storage medium |
CN112541054A (en) * | 2020-12-15 | 2021-03-23 | 平安科技(深圳)有限公司 | Method, device, equipment and storage medium for governing questions and answers of knowledge base |
CN112579770A (en) * | 2019-09-30 | 2021-03-30 | 北京国双科技有限公司 | Knowledge graph generation method, device, storage medium and equipment |
CN112699909A (en) * | 2019-10-23 | 2021-04-23 | 中移物联网有限公司 | Information identification method and device, electronic equipment and computer readable storage medium |
CN113536796A (en) * | 2021-07-15 | 2021-10-22 | 北京明略昭辉科技有限公司 | Entity alignment auxiliary method, device, equipment and storage medium |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112445876B (en) * | 2020-11-25 | 2023-12-26 | 中国科学院自动化研究所 | Entity alignment method and system for fusing structure, attribute and relationship information |
CN112541360A (en) * | 2020-12-07 | 2021-03-23 | 国泰君安证券股份有限公司 | Cross-platform anomaly identification and translation method, device, processor and storage medium for clustering by using hyper-parametric self-adaptive DBSCAN (direct media Access controller area network) |
CN113095948B (en) * | 2021-03-24 | 2023-06-06 | 西安交通大学 | Multi-source heterogeneous network user alignment method based on graph neural network |
CN113361263B (en) * | 2021-06-04 | 2023-10-20 | 中国人民解放军战略支援部队信息工程大学 | Character entity attribute alignment method and system based on attribute value distribution |
CN114329003A (en) * | 2021-12-27 | 2022-04-12 | 北京达佳互联信息技术有限公司 | Media resource data processing method and device, electronic equipment and storage medium |
CN114676267A (en) * | 2022-04-01 | 2022-06-28 | 北京明略软件系统有限公司 | Method and device for entity alignment and electronic equipment |
CN115563350A (en) * | 2022-10-22 | 2023-01-03 | 山东浪潮新基建科技有限公司 | Alignment and completion method and system for multi-source heterogeneous power grid equipment data |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104239553A (en) * | 2014-09-24 | 2014-12-24 | 江苏名通信息科技有限公司 | Entity recognition method based on Map-Reduce framework |
CN105279277A (en) * | 2015-11-12 | 2016-01-27 | 百度在线网络技术(北京)有限公司 | Knowledge data processing method and device |
CN108154198A (en) * | 2018-01-25 | 2018-06-12 | 北京百度网讯科技有限公司 | Knowledge base entity normalizing method, system, terminal and computer readable storage medium |
CN108363810A (en) * | 2018-03-09 | 2018-08-03 | 南京工业大学 | A kind of file classification method and device |
CN108804567A (en) * | 2018-05-22 | 2018-11-13 | 平安科技(深圳)有限公司 | Method, equipment, storage medium and device for improving intelligent customer service response rate |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9430738B1 (en) * | 2012-02-08 | 2016-08-30 | Mashwork, Inc. | Automated emotional clustering of social media conversations |
CN103699663B (en) * | 2013-12-27 | 2017-02-08 | 中国科学院自动化研究所 | Hot event mining method based on large-scale knowledge base |
CN109783582B (en) * | 2018-12-04 | 2023-08-15 | 平安科技(深圳)有限公司 | Knowledge base alignment method, device, computer equipment and storage medium |
CN109739939A (en) * | 2018-12-29 | 2019-05-10 | 颖投信息科技(上海)有限公司 | The data fusion method and device of knowledge mapping |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020114022A1 (en) * | 2018-12-04 | 2020-06-11 | 平安科技(深圳)有限公司 | Knowledge base alignment method and apparatus, computer device and storage medium |
CN110377906A (en) * | 2019-07-15 | 2019-10-25 | 出门问问信息科技有限公司 | Entity alignment schemes, storage medium and electronic equipment |
CN110427436B (en) * | 2019-07-31 | 2022-03-22 | 北京百度网讯科技有限公司 | Method and device for calculating entity similarity |
CN110427436A (en) * | 2019-07-31 | 2019-11-08 | 北京百度网讯科技有限公司 | The method and device of entity similarity calculation |
CN112579770A (en) * | 2019-09-30 | 2021-03-30 | 北京国双科技有限公司 | Knowledge graph generation method, device, storage medium and equipment |
WO2021072891A1 (en) * | 2019-10-18 | 2021-04-22 | 平安科技(深圳)有限公司 | Knowledge graph relationship alignment method, apparatus and device, and storage medium |
CN111026865A (en) * | 2019-10-18 | 2020-04-17 | 平安科技(深圳)有限公司 | Relation alignment method, device and equipment of knowledge graph and storage medium |
CN111026865B (en) * | 2019-10-18 | 2023-07-21 | 平安科技(深圳)有限公司 | Knowledge graph relationship alignment method, device, equipment and storage medium |
CN112699909A (en) * | 2019-10-23 | 2021-04-23 | 中移物联网有限公司 | Information identification method and device, electronic equipment and computer readable storage medium |
CN112699909B (en) * | 2019-10-23 | 2024-03-19 | 中移物联网有限公司 | Information identification method, information identification device, electronic equipment and computer readable storage medium |
CN111159420A (en) * | 2019-12-12 | 2020-05-15 | 西安交通大学 | Entity optimization method based on attribute calculation and knowledge template |
CN111159420B (en) * | 2019-12-12 | 2023-04-28 | 西安交通大学 | Entity optimization method based on attribute calculation and knowledge template |
CN111488461A (en) * | 2020-03-24 | 2020-08-04 | 腾讯科技(深圳)有限公司 | Data processing method and device, electronic equipment and storage medium |
CN111563192A (en) * | 2020-04-28 | 2020-08-21 | 腾讯科技(深圳)有限公司 | Entity alignment method and device, electronic equipment and storage medium |
CN112541054A (en) * | 2020-12-15 | 2021-03-23 | 平安科技(深圳)有限公司 | Method, device, equipment and storage medium for governing questions and answers of knowledge base |
CN112541054B (en) * | 2020-12-15 | 2023-08-29 | 平安科技(深圳)有限公司 | Knowledge base question and answer management method, device, equipment and storage medium |
CN113536796A (en) * | 2021-07-15 | 2021-10-22 | 北京明略昭辉科技有限公司 | Entity alignment auxiliary method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109783582B (en) | 2023-08-15 |
WO2020114022A1 (en) | 2020-06-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109783582A (en) | Knowledge base alignment method, device, computer equipment and storage medium | |
US9542454B2 (en) | Object-based information storage, search and mining system | |
US20100088342A1 (en) | Incremental feature indexing for scalable location recognition | |
CN113127632B (en) | Text summarization method and device based on heterogeneous graph, storage medium and terminal | |
CN110222709A (en) | Multi-label intelligent labeling method and system | |
CN111353303A (en) | Word vector construction method and device, electronic equipment and storage medium | |
CN112199600A (en) | Target object identification method and device | |
CN114329029B (en) | Object retrieval method, device, equipment and computer storage medium | |
CN112131261B (en) | Community query method and device based on community network and computer equipment | |
CN114706987B (en) | Text category prediction method, device, equipment, storage medium and program product | |
CN114065048A (en) | Article recommendation method based on multiple heterogeneous graph neural networks | |
CN115600017A (en) | Feature coding model training method and device and media object recommendation method and device | |
Wang et al. | Latency-aware adaptive video summarization for mobile edge clouds | |
CN116703531B (en) | Article data processing method, apparatus, computer device and storage medium | |
CN113095901A (en) | Recommendation method, training method of related model, electronic equipment and storage device | |
CN112765481A (en) | Data processing method and device, computer and readable storage medium | |
US20240005170A1 (en) | Recommendation method, apparatus, electronic device, and storage medium | |
Vrigkas et al. | Active privileged learning of human activities from weakly labeled samples | |
Fushimi et al. | Accelerating Greedy K-Medoids Clustering Algorithm with Distance by Pivot Generation | |
JP4963341B2 (en) | Document relationship visualization method, visualization device, visualization program, and recording medium recording the program | |
CN110688508A (en) | Image-text data expansion method and device and electronic equipment | |
CN115455306B (en) | Push model training method, information push device and storage medium | |
CN113392257B (en) | Image retrieval method and device | |
CN117312533B (en) | Text generation method, device, equipment and medium based on artificial intelligent model | |
US20230306291A1 (en) | Methods, apparatuses and computer program products for generating synthetic data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||