CN111581384B - Enterprise policy text clustering method - Google Patents

Enterprise policy text clustering method

Info

Publication number
CN111581384B
Authority
CN
China
Prior art keywords
enterprise
agent
psz
guided
setting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010367581.8A
Other languages
Chinese (zh)
Other versions
CN111581384A (en)
Inventor
郭肇禄
陈远存
谭力江
张文生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oking Information Industry Co ltd
Original Assignee
Guangdong Oking Information Industry Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oking Information Industry Co ltd
Priority to CN202010367581.8A
Publication of CN111581384A
Application granted
Publication of CN111581384B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/10: Tax strategies
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/26: Government or public services

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • Primary Health Care (AREA)
  • Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Technology Law (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for clustering enterprise-benefiting policy texts and relates to the technical field of text clustering. The method first collects enterprise-benefiting policy texts, preprocesses them, and extracts their feature vectors, and then optimizes the clustering centers of the texts with a guided sine cosine algorithm. In the guided sine cosine algorithm, the guided crossover rate is adjusted adaptively according to feedback from the search, a guided search direction is generated using the guided crossover rate, and this direction is then used to improve the performance of the algorithm. By clustering enterprise-benefiting policy texts with the guided sine cosine algorithm, the method can improve the clustering accuracy for such texts.

Description

Enterprise policy text clustering method
Technical Field
The invention relates to the technical field of text clustering, and in particular to a method for clustering enterprise-benefiting policy texts.
Background
To better serve small and medium-sized enterprises and accelerate economic development, government departments at all levels have issued many enterprise-benefiting policies, including tax exemption policies, tax reduction policies, interest support policies, and reward policies for increasing production and efficiency. However, as more and more such policies are issued, many small and medium-sized enterprises find it difficult to identify the policies that match their own circumstances, and interpreting these policies for them is a challenging task. Researchers have therefore tried to use artificial intelligence techniques to recommend enterprise-benefiting policies that fit the characteristics and development needs of small and medium-sized enterprises.
To recommend suitable enterprise-benefiting policies to small and medium-sized enterprises, the policy texts first need to be grouped into clusters. Clustering a large number of policy texts by hand usually consumes a great deal of manpower, so researchers have proposed clustering them with text clustering techniques. However, when traditional text clustering techniques are applied to enterprise-benefiting policy texts, the clustering accuracy tends to be low.
Disclosure of Invention
The invention provides a method for clustering enterprise-benefiting policy texts. To a certain extent, it overcomes the tendency of traditional text clustering techniques to yield low clustering accuracy on enterprise-benefiting policy texts, and it can improve the accuracy of clustering such texts.
The technical scheme of the invention is as follows: a method for clustering enterprise-benefiting policy texts comprises the following steps:
step 1, collecting enterprise-benefiting policy texts;
step 2, preprocessing the enterprise-benefiting policy texts;
step 3, extracting the feature vectors of the enterprise-benefiting policy texts;
step 4, taking the obtained feature vectors as the enterprise-benefiting policy text data set;
step 5, optimizing the clustering centers of the enterprise-benefiting policy text data set by using a guided sine cosine algorithm;
step 6, dividing the enterprise-benefiting policy text data set into clusters by using the obtained clustering centers, thereby obtaining the clustering result of the enterprise-benefiting policy texts;
wherein optimizing the clustering centers of the enterprise-benefiting policy text data set by using the guided sine cosine algorithm in step 5 comprises the following steps:
step 5.1, setting the number of agents PSZ and the maximum number of iterations MaxIT;
step 5.2, setting the current iteration count CIt = 0;
step 5.3, setting the number CCN of enterprise-benefiting policy text clusters;
step 5.4, randomly generating PSZ agents AC_i, wherein each agent stores CCN clustering centers and the agent subscript i = 1, 2, …, PSZ;
step 5.5, forming the generated PSZ agents into a population;
step 5.6, calculating the fitness values of the PSZ agents in the population according to formula (1):
[Formula (1): image not reproduced in the source text]
wherein afv_i denotes the fitness value of the i-th agent; si is the sample subscript; the cluster subscript ci = 1, 2, …, CCN; CX_si denotes the si-th sample in the enterprise-benefiting policy text data set; DC_ci denotes the ci-th cluster; AC_{i,ci} denotes the ci-th clustering center stored by the i-th agent;
step 5.7, finding the agent with the minimum fitness value among the PSZ agents of the population, and storing it as the optimal agent gBA;
step 5.8, initializing the retained crossover rate KCR_i = 0.5;
step 5.9, generating PSZ guided agents DIA_i by setting DIA_i = AC_i, wherein the agent subscript i = 1, 2, …, PSZ;
step 5.10, setting the temporary agents TIA_i = DIA_i, wherein the agent subscript i = 1, 2, …, PSZ;
step 5.11, setting a counter tsi = 1;
step 5.12, randomly generating a positive integer ei in the range [1, PSZ], and then setting the ei-th temporary agent TIA_ei = gBA;
step 5.13, setting the counter tsi = tsi + 1;
step 5.14, if the counter tsi is less than PSZ × 0.1, going to step 5.12; otherwise going to step 5.15;
step 5.15, calculating the guided crossover rate DCR_i according to formula (2):
[Formula (2): image not reproduced in the source text]
wherein rand denotes a random real number generating function and tep is a random real number in [0, 1];
step 5.16, calculating the foreground agent NIA_i according to formula (3):
[Formula (3): image not reproduced in the source text]
wherein rid is a random positive integer in [1, PSZ]; atp is a random real number in [0, 1]; trp is a random real number in [0, 1];
step 5.17, if the fitness value of the foreground agent NIA_i is smaller than that of the guided agent DIA_i, setting DIA_i = NIA_i; otherwise keeping the guided agent DIA_i unchanged;
step 5.18, executing the guided sine cosine operator according to formula (4):
[Formula (4): image not reproduced in the source text]
wherein
[Supplementary expression for formula (4): image not reproduced in the source text]
r2 is a random real number in [0, 2π], and π is the ratio of a circle's circumference to its diameter; r3 is a random real number in [0, 2]; r4 is a random real number in [0, 1]; sin is the sine function; cos is the cosine function; GX_i is the sampling agent;
step 5.19, if the fitness value of the sampling agent GX_i is smaller than that of AC_i, setting AC_i = GX_i; otherwise keeping AC_i unchanged;
step 5.20, if the fitness value of the sampling agent GX_i is smaller than that of AC_i, setting the retained crossover rate KCR_i = DCR_i; otherwise keeping the retained crossover rate KCR_i unchanged;
step 5.21, finding the agent with the minimum fitness value in the population and storing it as the optimal agent gBA;
step 5.22, setting the current iteration count CIt = CIt + 1; if CIt is less than the maximum number of iterations MaxIT, going to step 5.10; otherwise going to step 5.23;
step 5.23, extracting the clustering centers stored in the optimal agent gBA, thereby obtaining the clustering centers of the enterprise-benefiting policy text data set.
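Formula (1) is not reproduced in this text, so the following Python sketch rests on an assumption: that the fitness value afv_i is the sum, over all samples, of the Euclidean distance from each sample to the nearest of the CCN clustering centers stored by agent i. This is a common clustering objective and is consistent with the symbols defined in step 5.6, but the source does not confirm it. The function name fitness_value and the toy data are hypothetical.

```python
import math


def fitness_value(agent_centers, dataset):
    """Assumed form of formula (1): sum of distances to the nearest center.

    agent_centers: the CCN clustering centers (AC_{i,ci}) stored by one agent.
    dataset: the feature vectors (CX_si) of the policy texts.
    """
    total = 0.0
    for sample in dataset:
        # Assumption: a sample belongs to the cluster DC_ci whose center is closest.
        total += min(math.dist(sample, center) for center in agent_centers)
    return total


# Toy data for illustration only.
dataset = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)]
good_agent = [(0.0, 1.0), (10.0, 0.0)]
poor_agent = [(5.0, 5.0), (6.0, 6.0)]

print(fitness_value(good_agent, dataset))  # smaller values mean tighter clusters
print(fitness_value(poor_agent, dataset))
```

Under this assumption a smaller value means tighter clusters, which matches the minimization in steps 5.7 and 5.21.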
The method optimizes the clustering centers of enterprise-benefiting policy texts with the guided sine cosine algorithm and divides the texts into clusters using the obtained clustering centers, thereby clustering the enterprise-benefiting policy texts. The guided sine cosine algorithm includes an adaptive adjustment mechanism for the guided crossover rate; guided information generated with this rate improves the performance of the sine cosine algorithm and thereby the clustering accuracy for enterprise-benefiting policy texts.
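Steps 5.10 to 5.14 are fully described in the text and can be sketched in Python as follows. The sketch assumes each agent is a flat list of floats and uses 0-based indices instead of the range [1, PSZ]; the function name build_temporary_agents is hypothetical.

```python
import random


def build_temporary_agents(guided_agents, best_agent, rng=random):
    """Steps 5.10 to 5.14: copy DIA into TIA, then inject gBA at random slots."""
    psz = len(guided_agents)
    # Step 5.10: TIA_i = DIA_i for every agent (copied so DIA is not modified).
    temporary = [list(agent) for agent in guided_agents]
    tsi = 1  # step 5.11
    while True:
        # Step 5.12: pick a random slot ei and set TIA_ei = gBA.
        ei = rng.randrange(psz)
        temporary[ei] = list(best_agent)
        tsi += 1  # step 5.13
        # Step 5.14: repeat step 5.12 only while tsi < PSZ * 0.1.
        if not tsi < psz * 0.1:
            break
    return temporary


rng = random.Random(1)
dia = [[float(i)] for i in range(120)]  # toy guided agents, PSZ = 120
tia = build_temporary_agents(dia, [-1.0], rng)
injected = sum(1 for agent in tia if agent == [-1.0])
print(injected)
```

Following the step order, step 5.12 always runs at least once, and with PSZ = 120 it runs 11 times. Because ei may repeat, the number of distinct slots holding gBA can be lower.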
Drawings
FIG. 1 is a flow chart of the guided sine and cosine algorithm of the present invention.
Detailed Description
The technical scheme of the invention is further described below through a specific embodiment with reference to the accompanying drawing.
Example:
In this embodiment, with reference to the accompanying drawing, the invention is implemented by the following steps:
step 1, collecting enterprise-benefiting policy texts, which include but are not limited to tax exemption policy texts, tax reduction policy texts, interest support policy texts, and texts of reward policies for increasing production and efficiency;
step 2, preprocessing the enterprise-benefiting policy texts, wherein the preprocessing includes but is not limited to deleting garbled characters, removing punctuation marks, word segmentation, and removing stop words;
step 3, extracting the feature vectors of the enterprise-benefiting policy texts, wherein the extraction methods include but are not limited to term frequency-inverse document frequency (TF-IDF), Word2Vec, and LDA;
step 4, taking the obtained feature vectors as the enterprise-benefiting policy text data set, wherein each row of the data set is the feature vector of one enterprise-benefiting policy text;
step 5, optimizing the clustering centers of the enterprise-benefiting policy text data set by using a guided sine cosine algorithm;
step 6, calculating in turn the Euclidean distance between the feature vector of each enterprise-benefiting policy text in the data set and each obtained clustering center, and assigning each feature vector to the cluster whose clustering center is at the smallest Euclidean distance, thereby obtaining the clustering result of the enterprise-benefiting policy texts;
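As an illustration of steps 3, 4, and 6, the following Python sketch builds a small TF-IDF data set and assigns each row to its nearest clustering center. It assumes the texts have already been preprocessed and segmented into non-empty token lists (step 2), and it uses one plain TF-IDF variant (term frequency times log(N / document frequency)); the text does not specify which variant is used. The function names and sample tokens are hypothetical, and the centers are picked by hand rather than produced by step 5.

```python
import math
from collections import Counter


def tfidf_dataset(tokenized_texts):
    """Steps 3 and 4: one TF-IDF feature vector (row) per policy text."""
    vocabulary = sorted({tok for doc in tokenized_texts for tok in doc})
    n_docs = len(tokenized_texts)
    # Document frequency: the number of texts that contain each term.
    doc_freq = Counter(tok for doc in tokenized_texts for tok in set(doc))
    rows = []
    for doc in tokenized_texts:
        counts = Counter(doc)
        rows.append([
            (counts[tok] / len(doc)) * math.log(n_docs / doc_freq[tok])
            for tok in vocabulary
        ])
    return vocabulary, rows


def assign_clusters(rows, centers):
    """Step 6: index of the center at the smallest Euclidean distance."""
    return [
        min(range(len(centers)), key=lambda ci: math.dist(row, centers[ci]))
        for row in rows
    ]


texts = [
    ["tax", "exemption", "small", "enterprise"],
    ["tax", "reduction", "small", "enterprise"],
    ["loan", "interest", "subsidy"],
]
vocabulary, rows = tfidf_dataset(texts)
# Hypothetical centers: reuse the first and third rows as two centers.
labels = assign_clusters(rows, [rows[0], rows[2]])
print(labels)
```

In the method itself, the centers passed to the assignment would be those extracted from the optimal agent gBA in step 5.23.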
wherein optimizing the clustering centers of the enterprise-benefiting policy text data set by using the guided sine cosine algorithm in step 5 comprises the following steps:
step 5.1, setting the number of agents PSZ = 120 and the maximum number of iterations MaxIT = 2000;
step 5.2, setting the current iteration count CIt = 0;
step 5.3, setting the number of enterprise-benefiting policy text clusters CCN = 4;
step 5.4, randomly generating PSZ agents AC_i, wherein each agent stores CCN clustering centers and the agent subscript i = 1, 2, …, PSZ;
step 5.5, forming the generated PSZ agents into a population;
step 5.6, calculating the fitness values of the PSZ agents in the population according to formula (1):
[Formula (1): image not reproduced in the source text]
wherein afv_i denotes the fitness value of the i-th agent; si is the sample subscript; the cluster subscript ci = 1, 2, …, CCN; CX_si denotes the si-th sample in the enterprise-benefiting policy text data set; DC_ci denotes the ci-th cluster; AC_{i,ci} denotes the ci-th clustering center stored by the i-th agent;
step 5.7, finding the agent with the minimum fitness value among the PSZ agents of the population, and storing it as the optimal agent gBA;
step 5.8, initializing the retained crossover rate KCR_i = 0.5;
step 5.9, generating PSZ guided agents DIA_i by setting DIA_i = AC_i, wherein the agent subscript i = 1, 2, …, PSZ;
step 5.10, setting the temporary agents TIA_i = DIA_i, wherein the agent subscript i = 1, 2, …, PSZ;
step 5.11, setting a counter tsi = 1;
step 5.12, randomly generating a positive integer ei in the range [1, PSZ], and then setting the ei-th temporary agent TIA_ei = gBA;
step 5.13, setting the counter tsi = tsi + 1;
step 5.14, if the counter tsi is less than PSZ × 0.1, going to step 5.12; otherwise going to step 5.15;
step 5.15, calculating the guided crossover rate DCR_i according to formula (2):
[Formula (2): image not reproduced in the source text]
wherein rand denotes a random real number generating function and tep is a random real number in [0, 1];
step 5.16, calculating the foreground agent NIA_i according to formula (3):
[Formula (3): image not reproduced in the source text]
wherein rid is a random positive integer in [1, PSZ]; atp is a random real number in [0, 1]; trp is a random real number in [0, 1];
step 5.17, if the fitness value of the foreground agent NIA_i is smaller than that of the guided agent DIA_i, setting DIA_i = NIA_i; otherwise keeping the guided agent DIA_i unchanged;
step 5.18, executing the guided sine cosine operator according to formula (4):
[Formula (4): image not reproduced in the source text]
wherein
[Supplementary expression for formula (4): image not reproduced in the source text]
r2 is a random real number in [0, 2π], and π is the ratio of a circle's circumference to its diameter; r3 is a random real number in [0, 2]; r4 is a random real number in [0, 1]; sin is the sine function; cos is the cosine function; GX_i is the sampling agent;
step 5.19, if the fitness value of the sampling agent GX_i is smaller than that of AC_i, setting AC_i = GX_i; otherwise keeping AC_i unchanged;
step 5.20, if the fitness value of the sampling agent GX_i is smaller than that of AC_i, setting the retained crossover rate KCR_i = DCR_i; otherwise keeping the retained crossover rate KCR_i unchanged;
step 5.21, finding the agent with the minimum fitness value in the population and storing it as the optimal agent gBA;
step 5.22, setting the current iteration count CIt = CIt + 1; if CIt is less than the maximum number of iterations MaxIT, going to step 5.10; otherwise going to step 5.23;
step 5.23, extracting the clustering centers stored in the optimal agent gBA, thereby obtaining the clustering centers of the enterprise-benefiting policy text data set.
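The initialization in steps 5.1 to 5.9 can be sketched in Python with the embodiment's values PSZ = 120 and CCN = 4. Two parts are assumptions rather than statements from the text: the agents are drawn uniformly within the per-dimension bounds of the data set (the text only says "randomly"), and toy_fitness is a stand-in for formula (1), which is not reproduced here. All function names are hypothetical.

```python
import math
import random


def toy_fitness(agent, data):
    # Stand-in for formula (1): sum of distances to the nearest center (assumed).
    return sum(min(math.dist(row, c) for c in agent) for row in data)


def initialize_population(dataset, psz, ccn, fitness, rng=random):
    """Steps 5.4 to 5.9, assuming uniform sampling inside the data bounds."""
    dim = len(dataset[0])
    lows = [min(row[d] for row in dataset) for d in range(dim)]
    highs = [max(row[d] for row in dataset) for d in range(dim)]

    def random_center():
        return [rng.uniform(lows[d], highs[d]) for d in range(dim)]

    # Steps 5.4 and 5.5: PSZ agents, each storing CCN clustering centers.
    population = [[random_center() for _ in range(ccn)] for _ in range(psz)]
    # Steps 5.6 and 5.7: evaluate every agent and keep the minimum as gBA.
    values = [fitness(agent, dataset) for agent in population]
    best_index = min(range(psz), key=values.__getitem__)
    gba = [list(center) for center in population[best_index]]
    # Step 5.8: every retained crossover rate KCR_i starts at 0.5.
    kcr = [0.5] * psz
    # Step 5.9: the guided agents DIA_i start as copies of AC_i.
    dia = [[list(center) for center in agent] for agent in population]
    return population, values, gba, kcr, dia


rng = random.Random(0)
data = [[rng.random(), rng.random()] for _ in range(30)]  # toy feature vectors
population, values, gba, kcr, dia = initialize_population(
    data, 120, 4, toy_fitness, rng
)
print(min(values))
```

The guided agents and gBA are stored as copies so that later updates to AC_i in step 5.19 do not silently change them.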
The method optimizes the clustering centers of enterprise-benefiting policy texts with the guided sine cosine algorithm and divides the texts into clusters using the obtained clustering centers, thereby clustering the enterprise-benefiting policy texts. The guided sine cosine algorithm includes an adaptive adjustment mechanism for the guided crossover rate; guided information generated with this rate improves the performance of the sine cosine algorithm and thereby the clustering accuracy for enterprise-benefiting policy texts.
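The adaptive adjustment described in steps 5.19 and 5.20 amounts to a success-based rule: a guided crossover rate is retained only when the sample it helped produce improves on the current agent. A minimal Python sketch for one agent follows. The sampling agent GX_i and the rate DCR_i are taken as inputs because formulas (2) to (4) are not reproduced in this text; the function name accept_sample and the numbers in the example are hypothetical.

```python
def accept_sample(ac, ac_value, gx, gx_value, kcr, dcr):
    """Steps 5.19 and 5.20 for one agent i.

    Returns the (possibly replaced) agent, its fitness value, and KCR_i.
    """
    if gx_value < ac_value:
        # GX_i improved on AC_i: accept it and retain the rate that was used.
        return gx, gx_value, dcr
    # No improvement: AC_i and KCR_i both stay unchanged.
    return ac, ac_value, kcr


# Improving sample: the agent is replaced and KCR_i takes the value of DCR_i.
agent, value, rate = accept_sample([1.0, 2.0], 5.0, [0.9, 2.1], 4.2, 0.5, 0.8)
print(agent, value, rate)

# Non-improving sample: nothing changes.
agent, value, rate = accept_sample([1.0, 2.0], 5.0, [3.0, 3.0], 7.5, 0.5, 0.8)
print(agent, value, rate)
```

Both steps share the same condition, so one comparison covers the agent update and the rate update.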
The technical principle of the present invention is described above in connection with specific embodiments. The description is made for the purpose of illustrating the principles of the invention and should not be construed in any way as limiting the scope of the invention. Based on the explanations herein, those skilled in the art will be able to conceive of other embodiments of the present invention without inventive effort, which would fall within the scope of the present invention.

Claims (1)

1. A method for clustering enterprise-benefiting policy texts, characterized by comprising the following steps:
step 1, collecting enterprise-benefiting policy texts;
step 2, preprocessing the enterprise-benefiting policy texts;
step 3, extracting the feature vectors of the enterprise-benefiting policy texts;
step 4, taking the obtained feature vectors as the enterprise-benefiting policy text data set;
step 5, optimizing the clustering centers of the enterprise-benefiting policy text data set by using a guided sine cosine algorithm;
step 6, dividing the enterprise-benefiting policy text data set into clusters by using the obtained clustering centers to obtain the clustering result of the enterprise-benefiting policy texts;
wherein optimizing the clustering centers of the enterprise-benefiting policy text data set by using the guided sine cosine algorithm in step 5 comprises the following steps:
step 5.1, setting the number of agents PSZ and the maximum number of iterations MaxIT;
step 5.2, setting the current iteration count CIt = 0;
step 5.3, setting the number CCN of enterprise-benefiting policy text clusters;
step 5.4, randomly generating PSZ agents AC_i, wherein each agent stores CCN clustering centers and the agent subscript i = 1, 2, …, PSZ;
step 5.5, forming the generated PSZ agents into a population;
step 5.6, calculating the fitness values of the PSZ agents in the population according to formula (1):
[Formula (1): image not reproduced in the source text]
wherein afv_i denotes the fitness value of the i-th agent; si is the sample subscript; the cluster subscript ci = 1, 2, …, CCN; CX_si denotes the si-th sample in the enterprise-benefiting policy text data set; DC_ci denotes the ci-th cluster; AC_{i,ci} denotes the ci-th clustering center stored by the i-th agent;
step 5.7, finding the agent with the minimum fitness value among the PSZ agents of the population, and storing it as the optimal agent gBA;
step 5.8, initializing the retained crossover rate KCR_i = 0.5;
step 5.9, generating PSZ guided agents DIA_i by setting DIA_i = AC_i, wherein the agent subscript i = 1, 2, …, PSZ;
step 5.10, setting the temporary agents TIA_i = DIA_i, wherein the agent subscript i = 1, 2, …, PSZ;
step 5.11, setting a counter tsi = 1;
step 5.12, randomly generating a positive integer ei in the range [1, PSZ], and then setting the ei-th temporary agent TIA_ei = gBA;
step 5.13, setting the counter tsi = tsi + 1;
step 5.14, if the counter tsi is less than PSZ × 0.1, going to step 5.12; otherwise going to step 5.15;
step 5.15, calculating the guided crossover rate DCR_i according to formula (2):
[Formula (2): image not reproduced in the source text]
wherein rand denotes a random real number generating function and tep is a random real number in [0, 1];
step 5.16, calculating the foreground agent NIA_i according to formula (3):
[Formula (3): image not reproduced in the source text]
wherein rid is a random positive integer in [1, PSZ]; atp is a random real number in [0, 1]; trp is a random real number in [0, 1];
step 5.17, if the fitness value of the foreground agent NIA_i is smaller than that of the guided agent DIA_i, setting DIA_i = NIA_i; otherwise keeping the guided agent DIA_i unchanged;
step 5.18, executing the guided sine cosine operator according to formula (4):
[Formula (4): image not reproduced in the source text]
wherein
[Supplementary expression for formula (4): image not reproduced in the source text]
r2 is a random real number in [0, 2π], and π is the ratio of a circle's circumference to its diameter; r3 is a random real number in [0, 2]; r4 is a random real number in [0, 1]; sin is the sine function; cos is the cosine function; GX_i is the sampling agent;
step 5.19, if the fitness value of the sampling agent GX_i is smaller than that of AC_i, setting AC_i = GX_i; otherwise keeping AC_i unchanged;
step 5.20, if the fitness value of the sampling agent GX_i is smaller than that of AC_i, setting the retained crossover rate KCR_i = DCR_i; otherwise keeping the retained crossover rate KCR_i unchanged;
step 5.21, finding the agent with the minimum fitness value in the population and storing it as the optimal agent gBA;
step 5.22, setting the current iteration count CIt = CIt + 1; if CIt is less than the maximum number of iterations MaxIT, going to step 5.10; otherwise going to step 5.23;
step 5.23, extracting the clustering centers stored in the optimal agent gBA, thereby obtaining the clustering centers of the enterprise-benefiting policy text data set.
CN202010367581.8A 2020-04-30 2020-04-30 Enterprise policy text clustering method Active CN111581384B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010367581.8A CN111581384B (en) 2020-04-30 2020-04-30 Enterprise policy text clustering method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010367581.8A CN111581384B (en) 2020-04-30 2020-04-30 Enterprise policy text clustering method

Publications (2)

Publication Number Publication Date
CN111581384A (en) 2020-08-25
CN111581384B (en) 2022-06-10

Family

ID=72120370

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010367581.8A Active CN111581384B (en) 2020-04-30 2020-04-30 Enterprise policy text clustering method

Country Status (1)

Country Link
CN (1) CN111581384B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109902129A (en) * 2019-01-25 2019-06-18 平安科技(深圳)有限公司 Insurance agent's classifying method and relevant device based on big data analysis
CN110263156A (en) * 2019-05-22 2019-09-20 广东奥博信息产业股份有限公司 Intelligent worksheet processing method towards government and enterprises' service big data

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101776765B (en) * 2009-11-16 2012-09-05 北京航空航天大学 Multisystem compatible receiver frequency point selecting method
CN110472046B (en) * 2019-07-11 2022-02-22 广东奥博信息产业股份有限公司 Government and enterprise service text clustering method
CN111061871B (en) * 2019-11-26 2022-02-22 广东奥博信息产业股份有限公司 Method for analyzing tendency of government and enterprise service text

Also Published As

Publication number Publication date
CN111581384A (en) 2020-08-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant