CN111400448A - Method and device for analyzing incidence relation of objects - Google Patents
Method and device for analyzing incidence relation of objects
- Publication number
- CN111400448A (application CN202010169167.6A)
- Authority
- CN
- China
- Prior art keywords
- analyzed
- text data
- keywords
- target object
- relationship
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Economics (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides an object incidence relation analysis method and device, wherein the method comprises the following steps: acquiring text data of a target object and an object to be analyzed, wherein the text data is text data exchanged between the target object and the object to be analyzed; extracting a plurality of keywords in the text data; carrying out community classification analysis on the target object and the object to be analyzed; and if the community classification analysis result is that the target object and the object to be analyzed are classified in the same community, matching the extracted keywords with a predefined object relation type, and determining the incidence relation between the target object and the object to be analyzed. The method and the device can judge the incidence relation of the target object and have high accuracy.
Description
Technical Field
The invention relates to the technical field of object data analysis, in particular to an incidence relation analysis method and device of an object.
Background
Analysis of the association relationships between objects is increasingly important in enterprises. In human resource management, for example, employee relationships are an important kind of association relationship: good employee relationships make employees psychologically satisfied, help improve their working efficiency and initiative, and to a certain extent ensure that the strategy and goals of the enterprise are carried out effectively.
Employee relationships are key factors affecting employees' attitudes, work efficiency and execution capability. However, the rules currently used to determine relationship types, relationship names and the various relationships between employees are not standardized. Each organization defines its own rules, which hinders overall integration and comparison; as a result, the employee relationship information is not sufficiently valid, and the finally determined relationships between employees are not sufficiently accurate.
Disclosure of Invention
The embodiment of the invention provides an incidence relation analysis method of an object, which is used for judging the incidence relation of a target object and has high accuracy, and the method comprises the following steps:
acquiring text data of a target object and an object to be analyzed, wherein the text data is text data exchanged between the target object and the object to be analyzed;
extracting a plurality of keywords in the text data;
carrying out community classification analysis on the target object and the object to be analyzed;
and if the community classification analysis result is that the target object and the object to be analyzed are classified in the same community, matching the extracted keywords with a predefined object relation type, and determining the incidence relation between the target object and the object to be analyzed.
The embodiment of the invention provides an incidence relation analysis device of an object, which is used for judging the incidence relation of a target object and has high accuracy, and the device comprises:
the data acquisition module is used for acquiring text data of a target object and an object to be analyzed, wherein the text data is text data exchanged between the target object and the object to be analyzed;
the keyword extraction module is used for extracting a plurality of keywords in the text data;
the community classification analysis module is used for carrying out community classification analysis on the target object and the object to be analyzed;
and the relation analysis module is used for matching the extracted keywords with a predefined object relation type and determining the incidence relation between the target object and the object to be analyzed if the community classification analysis result indicates that the target object and the object to be analyzed are classified in the same community.
The embodiment of the invention also provides computer equipment, which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the processor realizes the incidence relation analysis method of the objects when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program for executing the method for analyzing association between objects is stored in the computer-readable storage medium.
In the embodiment of the invention, text data of a target object and an object to be analyzed is acquired, wherein the text data is text data exchanged between the target object and the object to be analyzed; a plurality of keywords in the text data are extracted; community classification analysis is carried out on the target object and the object to be analyzed; and if the community classification analysis result indicates that the target object and the object to be analyzed are in the same community, the extracted keywords are matched with a predefined object relation type to determine the incidence relation between the target object and the object to be analyzed. In this process, determining whether the target object and the object to be analyzed are classified in the same community is the first judgment. Only when they are classified in the same community are the extracted keywords matched with the predefined object relation types to determine the relation between the target object and the object to be analyzed, which is the second judgment. Through these two judgments, the incidence relation between the target object and the object to be analyzed can be determined more accurately.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 is a flowchart of a method for analyzing an association relationship between objects according to an embodiment of the present invention;
FIG. 2 is a detailed flowchart of an association analysis method of objects according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an apparatus for analyzing association between objects according to an embodiment of the present invention;
FIG. 4 is a diagram of a computer device in an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
In the description of the present specification, the terms "comprising," "including," "having," "containing," and the like are used in an open-ended fashion, i.e., to mean including, but not limited to. Reference to the description of the terms "one embodiment," "a particular embodiment," "some embodiments," "for example," etc., means that a particular feature, structure, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. The sequence of steps involved in the embodiments is for illustrative purposes to illustrate the implementation of the present application, and the sequence of steps is not limited and can be adjusted as needed.
Fig. 1 is a flowchart of an association analysis method of an object in an embodiment of the present invention. As shown in Fig. 1, the method includes:
step 101, acquiring text data of a target object and an object to be analyzed, wherein the text data is text data exchanged between the target object and the object to be analyzed;
step 102, extracting a plurality of keywords in the text data;
step 103, carrying out community classification analysis on the target object and the object to be analyzed;
and step 104, if the community classification analysis result shows that the target object and the object to be analyzed are in the same community, matching the extracted keywords with a predefined object relation type, and determining the incidence relation between the target object and the object to be analyzed.
In the embodiment of the invention, a plurality of keywords in the text data are extracted. Determining whether the target object and the object to be analyzed are in the same community classification is the first judgment. Only when they are classified in the same community are the extracted keywords matched with the predefined object relationship types to determine the relationship between the target object and the object to be analyzed, which is the second judgment. Through these two judgments, the association relationship between the target object and the object to be analyzed can be determined more accurately.
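The two judgments can be sketched as a small gating function. This is only an illustrative sketch: the function names, the dictionary layout of the relationship types, and the `keyword_overlap` placeholder similarity are assumptions made here and are not taken from the patent.

```python
from typing import Callable, Dict, Optional, Sequence


def keyword_overlap(keywords: Sequence[str], type_keywords: Sequence[str]) -> float:
    """Placeholder similarity (an assumption): share of the type's keywords found in the text."""
    type_set = set(type_keywords)
    if not type_set:
        return 0.0
    return len(set(keywords) & type_set) / len(type_set)


def analyze_association(
    in_same_community: bool,
    keywords: Sequence[str],
    relation_types: Dict[str, Sequence[str]],
    similarity: Callable[[Sequence[str], Sequence[str]], float] = keyword_overlap,
) -> Optional[str]:
    """Return the best-matching relationship type, or None if there is no association."""
    # First judgment: objects outside a common community are treated as unrelated.
    if not in_same_community or not relation_types:
        return None
    # Second judgment: pick the predefined type whose keywords match best.
    return max(relation_types, key=lambda name: similarity(keywords, relation_types[name]))
```

Keeping the community check in front means the keyword matching is only ever run for pairs that have already passed the first judgment.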
In step 101, text data of a target object and an object to be analyzed is obtained. For example, the target object may be an employee, and the object to be analyzed may be anyone the employee communicates with, either an employee inside the company or a client outside the company. The text data may be email correspondence records, chat records, or the like between the employee and the object to be analyzed.
There are various methods for extracting a plurality of keywords from the text data in step 102, and one example is given below.
In an embodiment, before extracting the plurality of keywords in the text data, the method further includes:
preprocessing the text data to obtain the text data which accords with a preset standard;
segmenting words of the text data which accord with the preset specification to obtain a plurality of decomposed words;
extracting a plurality of keywords in the text data, including:
a plurality of keywords are extracted from the plurality of decomposed words.
In the above embodiment, the preset specification includes the removal of unnecessary words, and preprocessing the text data includes:
S1: removing person names and times from the text data;
for example, Python may be used to remove a name and time such as "Lix _ Total 2019-01-02 16:01:52" from the text data;
S2: filtering out English words in the text data and keeping its Chinese content;
S3: removing meaningless words and punctuation marks from the text data by using a word bag that lists them;
for example, removing "go", "got", "?" and the like.
Through the steps, the text data meeting the preset specification is obtained.
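Steps S1 to S3 can be sketched in Python with the standard `re` module. This is a minimal sketch under stated assumptions: the stop-word set, the sample sender name, and the choice to keep spaces as boundaries between surviving fragments are all illustrative and not specified by the patent.

```python
import re
from typing import Set

# Hypothetical word bag of meaningless words (an assumption for illustration).
STOP_WORDS = {"的", "了", "吗", "啊"}
# Timestamps shaped like "2019-01-02 16:01:52".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")
# Any run of characters outside the basic CJK ideograph range.
NON_CHINESE = re.compile(r"[^\u4e00-\u9fff]+")


def preprocess(text: str, names: Set[str]) -> str:
    # S1: remove known person names and timestamps.
    for name in names:
        text = text.replace(name, " ")
    text = TIMESTAMP.sub(" ", text)
    # S2: drop English words, digits and punctuation, keeping the Chinese content.
    text = NON_CHINESE.sub(" ", text)
    # S3: remove meaningless words (crude substring removal before segmentation).
    for word in STOP_WORDS:
        text = text.replace(word, "")
    # Collapse the leftover spaces into single separators.
    return " ".join(text.split())
```

For instance, `preprocess("李某 2019-01-02 16:01:52 明天开会吗? OK 好的", {"李某"})` leaves only the Chinese content without the name, the timestamp, the English word, the punctuation and the stop words.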
The text data meeting the preset specification is then segmented to obtain a plurality of decomposed words; specifically, the Chinese word segmentation model cws may be used. During segmentation, individual words are sometimes split inaccurately, most commonly person names. For example, "Liu Yuan aromatic mountain" may be split into "Liu Yuan", "aromatic" and "mountain", whereas "Liu Yuan" is a person name and "aromatic mountain" is a single word. This problem can be solved with the custom dictionary Customword.
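The effect of a custom dictionary can be illustrated with a toy segmenter. The sketch below uses forward maximum matching over a plain vocabulary set, which is a deliberately simple stand-in and not the cws model named above; the sample sentence and the words "刘源" and "香山" are hypothetical.

```python
from typing import List, Set


def segment(text: str, vocabulary: Set[str], max_len: int = 4) -> List[str]:
    """Forward maximum matching: at each position take the longest known word."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; a single character always matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


base_vocabulary = {"去"}
# Hypothetical custom entries: a person name and a place name.
custom_vocabulary = base_vocabulary | {"刘源", "香山"}
```

With `base_vocabulary` the sample "刘源去香山" falls apart into single characters, while with `custom_vocabulary` the name and the place name are kept whole.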
Word segmentation yields a plurality of decomposed words, from which keywords can then be extracted. There are various extraction methods; one embodiment is given below.
In one embodiment, extracting a plurality of keywords from a plurality of decomposed words comprises:
calculating the TF-IDF value of each decomposed word by adopting a TF-IDF algorithm;
sequencing the decomposed words from large to small according to the TF-IDF value;
and determining a preset number of decomposition words in the sorted decomposition words as the keywords.
In the above embodiment, the TF-IDF value is calculated as follows:
$$\mathrm{tf}_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}, \qquad \mathrm{idf}_{i} = \log\frac{|D|}{|\{j : t_i \in d_j\}|}, \qquad \mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_{i}$$
wherein $n_{i,j}$ is the number of occurrences of the decomposed word $t_i$ in the file $d_j$, and the denominator $\sum_{k} n_{k,j}$ is the sum of the occurrences of all decomposed words in the file $d_j$;
$|D|$ is the total number of text data; for example, a plurality of text data may be obtained, each text data being segmented into a plurality of decomposed words;
$|\{j : t_i \in d_j\}|$ is the number of text data containing the word $t_i$; if the decomposed word is not in any text data, the divisor would be zero, so $1 + |\{j : t_i \in d_j\}|$ is typically used.
For example, if one text data has 100 decomposed words in total and the word "eat" occurs 5 times, the term frequency (TF) of "eat" in that text data is 5/100 = 0.05. The inverse document frequency (IDF) is obtained by counting how many text data contain the word "eat", dividing the total number of text data in the collection by that count, and taking the logarithm. If "eat" appears in 65 text data and the total number of text data is 1,000, the inverse document frequency is log10(1,000/65) ≈ 1.187, and the final TF-IDF value is 0.05 × 1.187 ≈ 0.0594.
After the TF-IDF value of each decomposed word is obtained, the decomposed words are sorted by TF-IDF value from large to small, and a preset number of the sorted decomposed words are determined as the keywords. For example, if the preset number is 30, the top 30 sorted decomposed words are determined as keywords. The preset number is related to the predefined object relationship types, of which there are several; for example, if one predefined object relationship type is a work relationship type containing 30 keywords, the preset number is 30.
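The TF-IDF ranking described above can be sketched as follows. The function names and the list-of-lists input layout are assumptions for illustration, and the base-10 logarithm is also an assumption, since the text does not state the base. Because every scored word comes from a text in the collection, the count of texts containing it is at least 1, so this sketch uses the unsmoothed divisor.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]], doc_index: int) -> Dict[str, float]:
    """TF-IDF value of every decomposed word in documents[doc_index]."""
    doc = documents[doc_index]
    counts = Counter(doc)
    total_words = len(doc)
    scores = {}
    for word, count in counts.items():
        tf = count / total_words
        # Number of text data that contain the word (at least 1 here).
        containing = sum(1 for d in documents if word in d)
        idf = math.log10(len(documents) / containing)
        scores[word] = tf * idf
    return scores


def top_keywords(scores: Dict[str, float], preset_number: int) -> List[str]:
    """Sort words by TF-IDF value from large to small and keep the first few."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:preset_number]]
```

A word that appears in every text data gets an IDF of zero and therefore sinks to the bottom of the ranking, which is the intended effect of the inverse document frequency.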
In one embodiment, the community classification analysis of the target object and the object to be analyzed includes:
and carrying out community classification analysis on the target object and the object to be analyzed by adopting a community classification algorithm.
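The text does not name a specific community classification algorithm. As a rough stand-in, the sketch below groups objects by connected components of the communication graph using union-find; a real deployment would likely use a proper community detection algorithm instead. The edge-list input and the function names are assumptions.

```python
from typing import Dict, Iterable, Tuple


def communities(edges: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map every object to a community label (the root of its connected component)."""
    parent: Dict[str, str] = {}

    def find(node: str) -> str:
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a
    return {node: find(node) for node in list(parent)}


def same_community(labels: Dict[str, str], a: str, b: str) -> bool:
    """First judgment: both objects are known and share a community label."""
    return a in labels and b in labels and labels[a] == labels[b]
```

An object that never appears in any edge has no label, so `same_community` returns False for it, matching the case where the objects are not in any community classification.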
In an embodiment, matching the extracted keywords with a predefined object relationship type, and determining an association relationship between the target object and the object to be analyzed includes:
for each predefined object relationship type, analyzing the similarity of a plurality of keywords and the object relationship type;
and determining the object relationship type with the maximum similarity as the incidence relationship between the target object and the object to be analyzed.
In the above embodiments, a plurality of object relationship types are predefined in the embodiments of the present invention, and table 1 is an example of the plurality of object relationship types.
TABLE 1 examples of multiple object relationship types
In each object relationship type in table 1, each object relationship type includes a plurality of keywords, and the number of keywords in the extracted text data corresponds to the number of keywords in each object relationship type, so that the similarity can be calculated. And if the community classification analysis result is that the target object and the object to be analyzed are not in any community classification, the target object and the object to be analyzed have no association relation.
In one embodiment, the similarity is a cosine similarity;
analyzing the similarity of a plurality of keywords and the object relation type, comprising:
calculating the similarity between the plurality of keywords and the object relation type by adopting the following formula:
$$\cos(\theta) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\,\sqrt{\sum_{i=1}^{n} y_i^{2}}}$$
wherein $\cos(\theta)$ is the similarity;
$n$ is the total number of the keywords in the text data and the keywords in the object relation type;
$x_i$ is the number of occurrences of the $i$-th keyword in the text data;
$y_i$ is the number of occurrences of the $i$-th keyword in the object relationship type.
In the above-described embodiment, for example, the number of occurrences of each keyword among the keywords in the extracted text data is [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0], and the number of occurrences of each keyword in an object relationship type is listbcodeeonehot = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]; the similarity is then calculated from these two vectors using the formula above.
Finally, the object relationship type with the maximum similarity is determined as the incidence relation between the target object and the object to be analyzed; once obtained, this incidence relation is used directly, and no further community classification analysis of the target object and the object to be analyzed is required.
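The cosine-similarity matching can be sketched as follows. The helper `occurrence_vectors`, which builds the two count vectors over the union of both keyword lists, is an assumption about how the vectors are aligned; the text only gives the finished vectors.

```python
import math
from typing import List, Sequence, Tuple


def occurrence_vectors(
    text_keywords: Sequence[str], type_keywords: Sequence[str]
) -> Tuple[List[int], List[int]]:
    """Count each keyword of the combined vocabulary in both keyword lists."""
    vocabulary = sorted(set(text_keywords) | set(type_keywords))
    x = [list(text_keywords).count(word) for word in vocabulary]
    y = [list(type_keywords).count(word) for word in vocabulary]
    return x, y


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """cos(theta) = sum(x_i * y_i) / (sqrt(sum(x_i^2)) * sqrt(sum(y_i^2)))."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_x * norm_y)
```

To choose the relationship type, one would compute this similarity once per predefined type and keep the type with the largest value.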
In the above embodiment, the association relationships between the target object and a plurality of objects to be analyzed can be determined. A relationship determined in this way is direct and accurate and is also referred to as a first-degree relationship. Suspected relationships may also exist; in that case, a community classification algorithm and a second-degree relationship model may be applied to mine the existing relationships to the second degree.
Second-degree relation model: if A has an association relationship with B, A has an association relationship with C, the two relationships are of the same type, and B has no first-degree relationship with C, then B and C can be considered to have a suspected relationship.
For example, "Zhang Xi Xiang", "Li Xi Quan" and "Zhao Xi Bi" are in the same community classification; there is a schoolmate relationship between "Zhang Xi Xiang" and "Li Xi Quan" and between "Li Xi Quan" and "Zhao Xi Bi", and there is no first-degree relationship between "Zhang Xi Xiang" and "Zhao Xi Bi". In this case, a suspected schoolmate relationship can be said to exist between "Zhang Xi Xiang" and "Zhao Xi Bi". The probability score of the suspected schoolmate relationship can be calculated by combining the number of relationship edges between "Zhang Xi Xiang" and "Li Xi Quan", the number of relationship edges between "Li Xi Quan" and "Zhao Xi Bi", and the like.
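The second-degree relation model can be sketched as a search for pairs that share a neighbour through the same relationship type but have no first-degree relationship themselves. The data layout (a dictionary keyed by unordered pairs) and the short names are assumptions for illustration; the probability score is not computed because the text does not define it.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple


def suspected_relations(
    first_degree: Dict[FrozenSet[str], str]
) -> List[Tuple[str, str, str]]:
    """Return (b, c, relation) triples for suspected second-degree relations."""
    # neighbours[a][relation] = objects that a is related to by that relation
    neighbours: Dict[str, Dict[str, Set[str]]] = {}
    for pair, relation in first_degree.items():
        a, b = tuple(pair)  # each key is assumed to hold two distinct objects
        neighbours.setdefault(a, {}).setdefault(relation, set()).add(b)
        neighbours.setdefault(b, {}).setdefault(relation, set()).add(a)

    found = set()
    for relations in neighbours.values():
        for relation, others in relations.items():
            for b, c in combinations(sorted(others), 2):
                # Only pairs without any first-degree relationship qualify.
                if frozenset((b, c)) not in first_degree:
                    found.add((b, c, relation))
    return sorted(found)
```

In the schoolmate example, the two first-degree edges through the shared middle person yield one suspected schoolmate pair, and adding any direct relationship between that pair removes it from the result.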
After the incidence relations between the target object and the objects to be analyzed are obtained, the incidence relations can be displayed, the displaying mode comprises single-node displaying, multi-node displaying and knowledge graph displaying, wherein the single-node displaying shows all the incidence relations of one object, the multi-node displaying shows the incidence relations among a selected number of objects, and the knowledge graph displaying shows the incidence relations among all the objects which are logged in the human resource system at present.
Based on the above embodiment, the present invention provides the following embodiment to explain a detailed flow of an association relationship analysis method of an object, and fig. 2 is a detailed flow chart of an association relationship analysis method of an object according to an embodiment of the present invention, as shown in fig. 2, including:
and step 211, determining that the target object has no association relation with the object to be analyzed.
Of course, it is understood that other variations of the above detailed flow can be made, and all such variations are intended to fall within the scope of the present invention.
In summary, in the method provided in the embodiment of the present invention, text data of a target object and an object to be analyzed is acquired, where the text data is text data exchanged between the target object and the object to be analyzed; a plurality of keywords in the text data are extracted; community classification analysis is carried out on the target object and the object to be analyzed; and if the community classification analysis result indicates that the target object and the object to be analyzed are in the same community, the extracted keywords are matched with a predefined object relation type to determine the incidence relation between the target object and the object to be analyzed. In this process, determining whether the target object and the object to be analyzed are classified in the same community is the first judgment. Only when they are classified in the same community are the extracted keywords matched with the predefined object relation types to determine the relation between the target object and the object to be analyzed, which is the second judgment. Through these two judgments, the incidence relation between the target object and the object to be analyzed can be determined more accurately. In addition, the method of the embodiment of the invention can adapt to continuously changing business requirements, supports the application and display of employee relationship analysis on terminals, and makes human resource management more intelligent, scientific and accurate.
The method of the invention is expandable and is suitable for analyzing the incidence relation between the objects which are continuously added; the method can meet the related requirements of employee relationship management in various services to a certain extent; the method can provide a relatively friendly application and display interface, and enables a user to operate simply, conveniently and easily.
The embodiment of the present invention further provides an apparatus for analyzing an association relationship between objects, which has a similar principle to the method for analyzing an association relationship between objects and is not described herein again.
Fig. 3 is a schematic diagram of an association analysis apparatus for objects according to an embodiment of the present invention, as shown in fig. 3, including:
the data obtaining module 301 is configured to obtain text data of a target object and an object to be analyzed, where the text data is text data exchanged between the target object and the object to be analyzed;
a keyword extraction module 302, configured to extract a plurality of keywords in the text data;
the community classification analysis module 303 is configured to perform community classification analysis on the target object and the object to be analyzed;
and the relationship analysis module 304 is configured to, if the community classification analysis result indicates that the target object and the object to be analyzed are classified in the same community, match the extracted multiple keywords with a predefined object relationship type, and determine an association relationship between the target object and the object to be analyzed.
In one embodiment, the apparatus further comprises a preprocessing module 305 for:
preprocessing the text data to obtain the text data which accords with a preset standard;
segmenting words of the text data which accord with the preset specification to obtain a plurality of decomposed words;
the keyword extraction module 302 is specifically configured to:
a plurality of keywords are extracted from the plurality of decomposed words.
In an embodiment, the keyword extraction module 302 is specifically configured to:
calculating the TF-IDF value of each decomposed word by adopting a TF-IDF algorithm;
sequencing the decomposed words from large to small according to the TF-IDF value;
and determining a preset number of decomposition words in the sorted decomposition words as the keywords.
In an embodiment, the community classification analysis module 303 is specifically configured to:
and carrying out community classification analysis on the target object and the object to be analyzed by adopting a community classification algorithm.
In an embodiment, the relationship analysis module 304 is specifically configured to:
for each predefined object relationship type, analyzing the similarity of a plurality of keywords and the object relationship type;
and determining the object relationship type with the maximum similarity as the incidence relationship between the target object and the object to be analyzed.
In one embodiment, the similarity is a cosine similarity;
the relationship analysis module 304 is specifically configured to:
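calculating the similarity between the plurality of keywords and the object relation type by adopting the following formula:
$$\cos(\theta) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\,\sqrt{\sum_{i=1}^{n} y_i^{2}}}$$
wherein $\cos(\theta)$ is the similarity;
$n$ is the total number of the keywords in the text data and the keywords in the object relation type;
$x_i$ is the number of occurrences of the $i$-th keyword in the text data;
$y_i$ is the number of occurrences of the $i$-th keyword in the object relationship type.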
In summary, in the apparatus provided in the embodiment of the present invention, text data of a target object and an object to be analyzed is obtained, where the text data is text data exchanged between the target object and the object to be analyzed; extracting a plurality of keywords in the text data; carrying out community classification analysis on the target object and the object to be analyzed; and if the community classification analysis result indicates that the target object and the object to be analyzed are in the same community, matching the extracted keywords with a predefined object relation type, and determining the incidence relation between the target object and the object to be analyzed. In the process, a plurality of keywords in the text data are extracted, the target object and the object to be analyzed are determined to be in the same community classification, the first judgment is carried out, the extracted keywords are matched with the predefined object relation type only when the target object and the object to be analyzed are classified in the same community, the relation between the target object and the communication object is determined, the second judgment is carried out, and the association relation between the target object and the object to be analyzed can be determined more accurately through the two judgments. In addition, the method of the embodiment of the invention can adapt to continuously changing business requirements, analyze the application and display of the staff relation terminal, and improve the intellectualization, the scientization and the accuracy of the human resource management. 
The apparatus is scalable and is suitable for analyzing association relationships among a continuously growing number of objects; it can, to a certain extent, meet the requirements of employee relationship management in various services; and it can provide a user-friendly application and display interface that is simple and convenient to operate.
An embodiment of the present application further provides a computer device. Fig. 4 is a schematic diagram of the computer device in the embodiment of the present invention. The computer device can implement all the steps of the object association relationship analysis method in the above embodiment, and specifically includes:
a processor 401, a memory 402, a communication interface 403, and a bus 404;
the processor 401, the memory 402, and the communication interface 403 communicate with one another through the bus 404; the communication interface 403 is used for transmitting information between related devices such as server-side devices, detection devices, and user-side devices;
the processor 401 is configured to call the computer program in the memory 402, and when executing the computer program, the processor implements all the steps of the object association relationship analysis method in the above embodiment.
An embodiment of the present application further provides a computer-readable storage medium that stores a computer program; when executed by a processor, the computer program implements all the steps of the object association relationship analysis method in the above embodiment.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (14)
1. An object association analysis method is characterized by comprising the following steps:
acquiring text data of a target object and an object to be analyzed, wherein the text data is text data exchanged between the target object and the object to be analyzed;
extracting a plurality of keywords in the text data;
carrying out community classification analysis on the target object and the object to be analyzed;
and if the community classification analysis result indicates that the target object and the object to be analyzed belong to the same community, matching the extracted keywords against predefined object relationship types to determine the association relationship between the target object and the object to be analyzed.
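The control flow of these steps, in which keyword matching runs only after the community check passes, can be sketched as follows. This is an illustrative outline only: `analyze_association` and its three callable parameters are hypothetical names, and the lambdas in the usage lines are toy stand-ins for the real extraction, community, and matching steps.

```python
from typing import Callable, List, Optional


def analyze_association(
    text: str,
    target: str,
    candidate: str,
    extract_keywords: Callable[[str], List[str]],
    in_same_community: Callable[[str, str], bool],
    match_relation_type: Callable[[List[str]], str],
) -> Optional[str]:
    """Return a relationship type, or None when the community check fails."""
    keywords = extract_keywords(text)
    # First judgment: are the two objects classified in the same community?
    if not in_same_community(target, candidate):
        return None
    # Second judgment: only now match the keywords against the predefined types.
    return match_relation_type(keywords)


# Toy stand-ins (hypothetical data) to show how the pieces plug together.
communities = {"alice": 0, "bob": 0, "carol": 1}
same = lambda a, b: communities[a] == communities[b]
match = lambda kws: "supervisor" if "approve" in kws else "colleague"
result = analyze_association(
    "please approve the budget report", "alice", "bob", str.split, same, match
)
```

Passing the three steps in as callables keeps the gate logic separate from whichever keyword extractor, community algorithm, and similarity measure are actually used.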
2. The method for analyzing the association relationship of the objects according to claim 1, further comprising, before extracting the plurality of keywords in the text data:
preprocessing the text data to obtain text data that conforms to a preset specification;
segmenting the text data that conforms to the preset specification into words to obtain a plurality of decomposed words;
wherein extracting a plurality of keywords in the text data comprises:
extracting a plurality of keywords from the plurality of decomposed words.
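A minimal sketch of preprocessing followed by word segmentation is shown below. The claim does not say what the preset specification is, so the normalization rules here (lowercasing, punctuation removal, whitespace collapsing), the stop-word list, and the function names `preprocess` and `segment` are all assumptions. Whitespace splitting also only suits space-delimited languages; Chinese text would need a dedicated word segmenter.

```python
import re


def preprocess(text: str) -> str:
    """Normalize raw text: lowercase, replace punctuation with spaces, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)  # \w is Unicode-aware, so CJK characters are kept
    return re.sub(r"\s+", " ", text).strip()


def segment(text: str, stop_words=frozenset({"the", "a", "an", "to", "of", "and"})):
    """Split normalized text into decomposed words, dropping assumed stop words."""
    return [word for word in text.split() if word not in stop_words]


decomposed = segment(preprocess("  Send the REPORT, to the manager!! "))
```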
3. The method of analyzing the association relationship of the objects according to claim 2, wherein extracting a plurality of keywords from a plurality of decomposed words comprises:
calculating the TF-IDF value of each decomposed word by adopting a TF-IDF algorithm;
sorting the decomposed words in descending order of TF-IDF value;
and determining a preset number of the sorted decomposed words as the keywords.
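The three steps of this claim (score, sort in descending order, keep a preset number) can be sketched in pure Python. The claim does not fix a particular TF-IDF variant, so the smoothed IDF used here is one common choice and an assumption, as are the function name `top_keywords`, the representation of the corpus as a list of word lists, and the alphabetical tie-break.

```python
import math
from collections import Counter


def top_keywords(doc_words, corpus, k):
    """Return the k words of doc_words with the highest TF-IDF values."""
    tf = Counter(doc_words)
    n_docs = len(corpus)
    scores = {}
    for word, count in tf.items():
        df = sum(1 for doc in corpus if word in doc)  # documents containing the word
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF (assumed variant)
        scores[word] = (count / len(doc_words)) * idf
    # Sort from largest to smallest TF-IDF; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:k]


corpus = [
    ["salary", "meeting", "project"],
    ["lunch", "meeting"],
    ["project", "deadline", "meeting"],
]
keywords = top_keywords(["project", "deadline", "deadline", "meeting"], corpus, 2)
```

In this toy corpus "meeting" appears in every document and so scores lowest, while "deadline" is both frequent in the message and rare in the corpus.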
4. The method for analyzing association relationship between objects according to claim 1, wherein performing community classification analysis on the target object and the object to be analyzed comprises:
and carrying out community classification analysis on the target object and the object to be analyzed by adopting a community classification algorithm.
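The claim does not name a specific community classification algorithm. As one possible illustration, the sketch below uses a simple deterministic label propagation over a graph of who exchanges messages with whom; the graph construction, the function names `label_propagation` and `same_community`, and the tie-breaking rule are all assumptions, not the patent's method.

```python
from collections import Counter


def label_propagation(edges, max_rounds=20):
    """Assign a community label to every node by simple label propagation."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    labels = {node: node for node in neighbors}  # each node starts in its own community
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbors):
            counts = Counter(labels[n] for n in neighbors[node])
            best = max(counts.values())
            # Deterministic tie-break: smallest label among the most frequent ones.
            new_label = min(label for label, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


def same_community(labels, u, v):
    """True when both objects are known and carry the same community label."""
    return u in labels and v in labels and labels[u] == labels[v]


edges = [("a", "b"), ("b", "c"), ("a", "c"), ("x", "y"), ("y", "z"), ("x", "z")]
labels = label_propagation(edges)
```

The `same_community` result is what would gate the later keyword-matching step.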
5. The method of analyzing association between objects according to claim 1, wherein the step of matching the extracted keywords with predefined object relationship types to determine the association between the target object and the object to be analyzed comprises:
for each predefined object relationship type, analyzing the similarity of a plurality of keywords and the object relationship type;
and determining the object relationship type with the maximum similarity as the association relationship between the target object and the object to be analyzed.
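Selecting the relationship type with the maximum similarity amounts to an argmax over the predefined types. In the sketch below the similarity measure is passed in as a parameter, because this claim leaves it open; the Jaccard overlap used as the default is only a stand-in, and the type names and keyword lists are made up for illustration.

```python
from typing import Callable, Dict, List, Tuple


def jaccard(a: List[str], b: List[str]) -> float:
    """Stand-in similarity: shared keywords divided by all distinct keywords."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def best_relation_type(
    keywords: List[str],
    relation_types: Dict[str, List[str]],
    similarity: Callable[[List[str], List[str]], float] = jaccard,
) -> Tuple[str, float]:
    """Score every predefined type and return the (name, score) with the highest score."""
    if not relation_types:
        raise ValueError("at least one relationship type is required")
    scores = {name: similarity(keywords, words) for name, words in relation_types.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical relationship types and their characteristic keywords.
types = {
    "supervisor": ["approval", "report", "review"],
    "friend": ["lunch", "weekend"],
}
choice = best_relation_type(["report", "approval", "deadline"], types)
```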
6. The method of analyzing the association of an object according to claim 5, wherein the similarity is a cosine similarity;
analyzing the similarity of a plurality of keywords and the object relation type, comprising:
calculating the similarity between the plurality of keywords and the object relationship type using the following formula:
$$\cos(\theta) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\,\sqrt{\sum_{i=1}^{n} y_i^{2}}}$$
where cos(θ) is the similarity;
n is the total number of the keywords in the text data and the keywords in the object relationship type;
x_i is the number of occurrences of the i-th keyword;
y_i is the number of occurrences of the i-th keyword in the object relationship type.
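As a small worked illustration with made-up counts (not taken from the patent), suppose n = 3, the keyword counts are x = (2, 1, 0), and the counts in the object relationship type are y = (1, 1, 1). Taking the formula to be the standard cosine similarity of the two count vectors gives:

$$\cos(\theta) = \frac{2\cdot 1 + 1\cdot 1 + 0\cdot 1}{\sqrt{2^{2}+1^{2}+0^{2}}\,\sqrt{1^{2}+1^{2}+1^{2}}} = \frac{3}{\sqrt{5}\,\sqrt{3}} = \frac{3}{\sqrt{15}} \approx 0.775$$

A value near 1 would indicate that the keywords closely match the relationship type, and a value of 0 would indicate that no keyword is shared.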
7. An apparatus for analyzing an association relation of objects, comprising:
the data acquisition module is used for acquiring text data of a target object and an object to be analyzed, wherein the text data is text data exchanged between the target object and the object to be analyzed;
the keyword extraction module is used for extracting a plurality of keywords in the text data;
the community classification analysis module is used for carrying out community classification analysis on the target object and the object to be analyzed;
and the relationship analysis module is used for matching the extracted keywords against predefined object relationship types to determine the association relationship between the target object and the object to be analyzed if the community classification analysis result indicates that the target object and the object to be analyzed belong to the same community.
8. The apparatus for analyzing the association relationship of objects according to claim 7, further comprising a preprocessing module for:
preprocessing the text data to obtain text data that conforms to a preset specification;
segmenting the text data that conforms to the preset specification into words to obtain a plurality of decomposed words;
the keyword extraction module is specifically configured to:
extract a plurality of keywords from the plurality of decomposed words.
9. The apparatus for analyzing association between objects according to claim 7, wherein the keyword extraction module is specifically configured to:
calculating the TF-IDF value of each decomposed word by adopting a TF-IDF algorithm;
sorting the decomposed words in descending order of TF-IDF value;
and determining a preset number of the sorted decomposed words as the keywords.
10. The apparatus for analyzing association relationship between objects according to claim 7, wherein the community classification analysis module is specifically configured to:
and carrying out community classification analysis on the target object and the object to be analyzed by adopting a community classification algorithm.
11. The apparatus for analyzing association relationship between objects according to claim 7, wherein the relationship analysis module is specifically configured to:
for each predefined object relationship type, analyzing the similarity of a plurality of keywords and the object relationship type;
and determining the object relationship type with the maximum similarity as the association relationship between the target object and the object to be analyzed.
12. The apparatus for analyzing the association relationship of objects according to claim 11, wherein the similarity is a cosine similarity;
the relationship analysis module is specifically configured to:
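calculating the similarity between the plurality of keywords and the object relationship type using the following formula:
$$\cos(\theta) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\,\sqrt{\sum_{i=1}^{n} y_i^{2}}}$$
where cos(θ) is the similarity;
n is the total number of the keywords in the text data and the keywords in the object relationship type;
x_i is the number of occurrences of the i-th keyword;
y_i is the number of occurrences of the i-th keyword in the object relationship type.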
13. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of claims 1 to 6 when executing the computer program.
14. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010169167.6A CN111400448A (en) | 2020-03-12 | 2020-03-12 | Method and device for analyzing incidence relation of objects |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010169167.6A CN111400448A (en) | 2020-03-12 | 2020-03-12 | Method and device for analyzing incidence relation of objects |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111400448A (en) | 2020-07-10 |
Family
ID=71434201
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010169167.6A Pending CN111400448A (en) | 2020-03-12 | 2020-03-12 | Method and device for analyzing incidence relation of objects |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111400448A (en) |
Patent Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140188830A1 (en) * | 2012-12-27 | 2014-07-03 | Sas Institute Inc. | Social Community Identification for Automatic Document Classification |
CN107436875A (en) * | 2016-05-25 | 2017-12-05 | 华为技术有限公司 | File classification method and device |
CN106649422A (en) * | 2016-06-12 | 2017-05-10 | 中国移动通信集团湖北有限公司 | Keyword extraction method and apparatus |
CN108280458A (en) * | 2017-01-05 | 2018-07-13 | 腾讯科技(深圳)有限公司 | Group relation kind identification method and device |
CN107729466A (en) * | 2017-10-12 | 2018-02-23 | 杭州中奥科技有限公司 | Construction method, device and the electronic equipment of relational network |
CN109543078A (en) * | 2018-10-18 | 2019-03-29 | 深圳云天励飞技术有限公司 | Social relationships determine method, apparatus, equipment and computer readable storage medium |
CN110188191A (en) * | 2019-04-08 | 2019-08-30 | 北京邮电大学 | A kind of entity relationship map construction method and system for Web Community's text |
CN110532451A (en) * | 2019-06-26 | 2019-12-03 | 平安科技(深圳)有限公司 | Search method and device for policy text, storage medium, electronic device |
CN110390039A (en) * | 2019-07-25 | 2019-10-29 | 广州汇智通信技术有限公司 | Social networks analysis method, device and the equipment of knowledge based map |
CN110457603A (en) * | 2019-08-16 | 2019-11-15 | 中国电子信息产业集团有限公司第六研究所 | Customer relationship abstracting method, device, electronic equipment and readable storage medium storing program for executing |
CN110555172A (en) * | 2019-08-30 | 2019-12-10 | 京东数字科技控股有限公司 | user relationship mining method and device, electronic equipment and storage medium |
CN110610434A (en) * | 2019-09-04 | 2019-12-24 | 成都威嘉软件有限公司 | Community discovery method based on artificial intelligence |
CN110647590A (en) * | 2019-09-23 | 2020-01-03 | 税友软件集团股份有限公司 | Target community data identification method and related device |
CN110705301A (en) * | 2019-09-30 | 2020-01-17 | 京东城市(北京)数字科技有限公司 | Entity relationship extraction method and device, storage medium and electronic equipment |
Non-Patent Citations (1)
Title |
---|
刘锦文等: "基于信息关联拓扑的互联网社交关系挖掘", 《计算机应用》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112883733A (en) * | 2020-12-09 | 2021-06-01 | 成都中科大旗软件股份有限公司 | Analysis method for quickly constructing event relation based on text entity extraction |
CN112560480A (en) * | 2020-12-24 | 2021-03-26 | 北京百度网讯科技有限公司 | Task community discovery method, device, equipment and storage medium |
CN112560480B (en) * | 2020-12-24 | 2023-08-15 | 北京百度网讯科技有限公司 | Task community discovery method, device, equipment and storage medium |
CN114416990A (en) * | 2022-01-17 | 2022-04-29 | 北京百度网讯科技有限公司 | Object relationship network construction method and device and electronic equipment |
CN114416990B (en) * | 2022-01-17 | 2024-05-21 | 北京百度网讯科技有限公司 | Method and device for constructing object relation network and electronic equipment |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US9477747B2 (en) | Method and apparatus for acquiring hot topics | |
CN110929125B (en) | Search recall method, device, equipment and storage medium thereof | |
US20150302197A1 (en) | Apparatus and Method for Identifying Similarity Via Dynamic Decimation of Token Sequence N-Grams | |
CN108491388B (en) | Data set acquisition method, classification method, device, equipment and storage medium | |
CN107872454B (en) | Threat information monitoring and analyzing system and method for ultra-large Internet platform | |
CN111400448A (en) | Method and device for analyzing incidence relation of objects | |
CN106874253A (en) | Recognize the method and device of sensitive information | |
CN107145516B (en) | Text clustering method and system | |
CN110362601B (en) | Metadata standard mapping method, device, equipment and storage medium | |
CN108549723B (en) | Text concept classification method and device and server | |
CN113076735A (en) | Target information acquisition method and device and server | |
CN112199588A (en) | Public opinion text screening method and device | |
CN111680498B (en) | Entity disambiguation method, device, storage medium and computer equipment | |
CN113486664A (en) | Text data visualization analysis method, device, equipment and storage medium | |
CN112328805A (en) | Entity mapping method of vulnerability description information and database table based on NLP | |
CN114817243A (en) | Method, device and equipment for establishing database joint index and storage medium | |
CN112579781B (en) | Text classification method, device, electronic equipment and medium | |
CN114116997A (en) | Knowledge question answering method, knowledge question answering device, electronic equipment and storage medium | |
CN108073567B (en) | Feature word extraction processing method, system and server | |
CN113705164A (en) | Text processing method and device, computer equipment and readable storage medium | |
CN105512270B (en) | Method and device for determining related objects | |
CN104462439A (en) | Event recognizing method and device | |
CN116680401A (en) | Document processing method, document processing device, apparatus and storage medium | |
CN107992501B (en) | Social network information identification method, processing method and device | |
CN112435151B (en) | Government information data processing method and system based on association analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right |
Effective date of registration: 20220920 Address after: 25 Financial Street, Xicheng District, Beijing 100033 Applicant after: CHINA CONSTRUCTION BANK Corp. Address before: 25 Financial Street, Xicheng District, Beijing 100033 Applicant before: CHINA CONSTRUCTION BANK Corp. Applicant before: Jianxin Financial Science and Technology Co.,Ltd. |
|
TA01 | Transfer of patent application right |