CN117149859B - Urban waterlogging point information recommendation method based on government user portrait - Google Patents

Info

Publication number
CN117149859B
Authority
CN
China
Prior art keywords
government
user
vector
interest
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311403044.4A
Other languages
Chinese (zh)
Other versions
CN117149859A (en)
Inventor
孟昭辉
吴凡松
郭婉茜
郭亮
张宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China Municipal Engineering Design and Research Institute Co Ltd
Original Assignee
North China Municipal Engineering Design and Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China Municipal Engineering Design and Research Institute Co Ltd
Priority to CN202311403044.4A
Publication of CN117149859A
Application granted
Publication of CN117149859B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Tourism & Hospitality (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Educational Administration (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for recommending urban waterlogging-prone point information based on government user portraits. The method comprises: constructing a government user feature tag system and an interest tag system; extracting government user basic feature tags from the basic information of government users; calculating the government user basic feature vector and the long-term and short-term interest vectors of the government user for urban waterlogging-prone point information; fusing the long-term and short-term interest vectors by weighting to obtain the government user's interest vector for urban waterlogging-prone point information; connecting to an urban waterlogging-prone point information database and constructing data item feature vectors; and calculating the similarity between user interests and data features from the government user's interest vector and the data item feature vectors, and providing personalized recommendations to government users based on the vector similarity.

Description

Urban waterlogging point information recommendation method based on government user portrait
Technical Field
The invention belongs to the technical field of data mining and information recommendation, and particularly relates to an urban waterlogging point information recommendation method based on government user portraits.
Background
User portrait technology represents a user's basic characteristics and dynamic behavior symbolically, usually by assigning labels to the user, and is an effective way to cluster users and find target users. With the development of emerging technologies such as the Internet of Things, cloud computing and artificial intelligence, user portrait technology has begun to be applied to item recommendation. Among earlier recommendation algorithms, the user-based collaborative filtering algorithm computes the similarity between users and recommends items liked by other users with similar interests, while the content-based collaborative filtering algorithm computes the similarity between items and recommends other items similar to those the user already likes.
Urban waterlogging-prone point information is a general term for the various kinds of state information about urban waterlogging-prone points, including their distribution data, ground elevation data, terrain data, traffic condition data and the like. Against the background of frequent urban waterlogging, government users need this information to carry out emergency management of urban waterlogging-prone points. Because the information comes from many sources, is large in volume and is multidimensional, government users find it difficult and inefficient to obtain the information they need from the mass of data.
In the invention, a user portrait is constructed so that the similarity between user interest preferences and data item features is established directly, personalized recommendations are made to the user, and the cold-start problem for new users is solved on the basis of user similarity. The method can greatly improve the efficiency with which government users obtain urban waterlogging-prone point information, and it assists urban waterlogging emergency management.
Disclosure of Invention
In view of the shortcomings of the prior art, the application provides a method for recommending urban waterlogging-prone point information based on government user portraits.
In a first aspect, the application provides a method for recommending urban waterlogging-prone point information based on government user portraits, comprising the following steps:
acquiring basic information, explicit feedback information and implicit feedback information of government users, and storing the data in a government user information database;
constructing a government user feature tag system and an interest tag system, extracting government user basic feature tags according to the government user feature tag system, and extracting long-term interest tags and short-term interest tags of the government users based on the interest tag system;
calculating the long-term interest vector and the short-term interest vector of a government user for urban waterlogging-prone point information, and fusing them by weighting to obtain the government user's interest vector for urban waterlogging-prone point information;
connecting to an urban waterlogging-prone point information database, acquiring multi-source heterogeneous data of urban waterlogging-prone points, identifying data item features by data mining, and constructing data item feature vectors;
calculating the similarity between user interests and data features from the government user's interest vector and the data item feature vectors, and providing personalized recommendations to the government user based on the vector similarity;
calculating government user basic feature vectors, calculating user similarity from them, extracting the interest vectors of the TOP-N most similar users and fusing them by weighting to serve as the interest vector of a new user, and then calculating the similarity between user interests and data features as the basis for information recommendation at user cold start.
In some optional implementations of some embodiments, acquiring the basic information, explicit feedback information and implicit feedback information of government users and storing the data in a government user information database includes:
acquiring basic information data of government users from the government user registration information of the urban waterlogging-prone point information management system, and storing the data in the user information database;
acquiring explicit feedback information of government users from user preference questionnaire data collected by the urban waterlogging-prone point information management system and from the users' ratings of data items;
acquiring implicit feedback information of government users from the government user log information of the urban waterlogging-prone point information management system;
the implicit feedback information of government users at least comprises: the access log of the user's accesses to data related to urban waterlogging-prone points, wherein the access log comprises access behaviors, access durations and access time intervals.
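As a rough illustration of the implicit feedback described above, one access-log record could be modeled as follows. The field names (`user_id`, `item_id`, `action`, `duration_s`, `accessed_at`) are assumptions made for this sketch and are not specified in the source; "access time interval" is interpreted here as the gap between consecutive accesses, which is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccessLogEntry:
    """One implicit-feedback record (hypothetical field names)."""
    user_id: str
    item_id: str           # data item about a waterlogging-prone point
    action: str            # access behavior, e.g. "view" or "download"
    duration_s: float      # access duration in seconds
    accessed_at: datetime  # when the access happened


def access_intervals(entries):
    """Return the gaps, in seconds, between consecutive accesses."""
    times = sorted(e.accessed_at for e in entries)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


log = [
    AccessLogEntry("u1", "elevation_map", "view", 42.0,
                   datetime(2023, 7, 1, 9, 0)),
    AccessLogEntry("u1", "traffic_status", "view", 15.5,
                   datetime(2023, 7, 1, 9, 10)),
]
intervals = access_intervals(log)
```

Keeping the record immutable (`frozen=True`) is a design choice for the sketch: log entries are facts about past behavior and should not be edited after collection.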
In some optional implementations of some embodiments, constructing the government user feature tag system and the interest tag system includes:
extracting text data from the basic information data of government users and from the government users' access logs for data related to urban waterlogging-prone points, respectively, and preprocessing the text data to obtain a first word segmentation set for the text data of the government user basic information and a second word segmentation set for the text data of the access logs;
inputting the first word segmentation set and the second word segmentation set separately into the tag construction model LDA to obtain the probability distribution of the first main tag words corresponding to the first word segmentation set and the probability distribution of the second main tag words corresponding to the second word segmentation set;
determining a first main tag name according to the probability distribution of the first main tag words, and determining a second main tag name according to the probability distribution of the second main tag words;
generating, according to the first main tag name, a first tag data source of the government user feature tag class to which the first main tag name belongs, and generating, according to the second main tag name, a second tag data source of the interest tag class to which the second main tag name belongs;
constructing the government user feature tag system from the first tag data source by using a preset tag system construction method, and constructing the interest tag system from the second tag data source by using the preset tag system construction method.
In some optional implementations of some embodiments, extracting the government user basic feature tags according to the government user feature tag system and extracting the long-term interest tags and short-term interest tags of the government users based on the interest tag system includes:
extracting, from the government user feature tag system, the government user basic feature tags that characterize a government user, wherein the government user basic feature tags comprise: the work area, work institution, work department and authority of the government user;
extracting the long-term interest tags and the short-term interest tags of the government user from the interest tag system, wherein the long-term and short-term interest tags comprise: government user attribute features, the operation area of the government user on business pages of the urban waterlogging-prone point information management system, browsing speed, page size and request duration.
In some optional implementations of some embodiments, calculating the long-term interest vector and the short-term interest vector of the government user for urban waterlogging-prone point information includes:
extracting the core semantics of the government user attribute features in the long-term interest tags and the short-term interest tags to obtain attribute feature semantics;
calculating the attribute feature frequency of each attribute feature semantic by using a feature frequency formula;
selecting the attribute feature semantic with the highest attribute feature frequency as the calculation factor for the long-term interest vector and the short-term interest vector;
applying a duration formula to the operation area, browsing speed, page size and request duration in the long-term interest tags and the short-term interest tags to obtain the browsing durations of the government user, wherein the browsing durations comprise a longest browsing duration and a shortest browsing duration;
calculating the long-term interest vector from the longest browsing duration and the attribute feature semantic with the highest attribute feature frequency, and calculating the short-term interest vector from the shortest browsing duration and the attribute feature semantic with the highest attribute feature frequency.
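The source does not give the feature frequency formula. The sketch below assumes the simplest reading, relative frequency (count divided by total), and shows how the attribute feature semantic with the highest frequency could be selected. The function names and the sample terms are hypothetical.

```python
from collections import Counter


def feature_frequencies(semantics):
    """Relative frequency of each attribute feature semantic (assumed formula)."""
    counts = Counter(semantics)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def top_feature(semantics):
    """Return the semantic with the highest frequency.

    Raises ValueError on an empty input, because there is nothing to select.
    """
    freqs = feature_frequencies(semantics)
    return max(freqs, key=freqs.get)


observed = ["elevation", "traffic", "elevation", "drainage"]
freqs = feature_frequencies(observed)
best = top_feature(observed)
```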
In some optional implementations of some embodiments, fusing the long-term interest vector and the short-term interest vector of the government user by weighting to obtain the government user's interest vector for urban waterlogging-prone point information includes:
calculating a first weighting factor of the long-term interest vector relative to the short-term interest vector based on a multi-label classifier;
calculating a second weighting factor of the short-term interest vector relative to the long-term interest vector based on the multi-label classifier;
weighting the long-term interest vector by the first weighting factor, weighting the short-term interest vector by the second weighting factor, and fusing the two weighted vectors to obtain the government user's interest vector for urban waterlogging-prone point information.
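The fusion step can be sketched as an element-wise weighted sum. In the source the two weighting factors come from a multi-label classifier; here they are passed in as plain numbers, and the values 0.75 and 0.25 are purely illustrative assumptions.

```python
def fuse_interest_vectors(long_term, short_term, w_long, w_short):
    """Element-wise weighted sum of two equal-length interest vectors."""
    if len(long_term) != len(short_term):
        raise ValueError("vectors must have the same length")
    return [w_long * l + w_short * s for l, s in zip(long_term, short_term)]


long_vec = [0.5, 0.0, 1.0]    # hypothetical long-term interest vector
short_vec = [0.0, 1.0, 0.5]   # hypothetical short-term interest vector
interest = fuse_interest_vectors(long_vec, short_vec, w_long=0.75, w_short=0.25)
```

With these inputs the fused vector leans toward the long-term profile while still reflecting the recent interest in the second dimension.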
In some optional implementations of some embodiments, connecting to the urban waterlogging-prone point information database, acquiring multi-source heterogeneous data of urban waterlogging-prone points, identifying data item features by data mining and constructing the data item feature vectors includes:
connecting to the urban waterlogging-prone point information database to obtain multi-source heterogeneous data of urban waterlogging-prone points;
applying to the data items the same tag system as is used for the government user interest tags;
extracting the weights of the data item feature tags from the text information of the urban waterlogging-prone point data items by using NLP techniques, wherein the text information comprises: the title, description, manually annotated tags/keywords, contributor and creation time of the data item;
identifying data item features from the picture information of the urban waterlogging-prone point data items by using a deep learning method;
combining the extracted weights of the data item feature tags with the identified data item features to complete the construction of the data item feature vector.
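The source does not say how the text-derived tag weights and the image-derived features are combined. One possible sketch, assuming both are expressed over the same tag vocabulary and that each tag recognized in a picture adds a fixed bonus (`image_weight`, a hypothetical parameter), is:

```python
def build_item_vector(tags, text_weights, image_tags, image_weight=1.0):
    """Build a data item feature vector over a shared, ordered tag list.

    text_weights: tag -> weight mined from the item's text (assumed given).
    image_tags:   set of tags recognized in the item's pictures (assumed given).
    """
    vector = []
    for tag in tags:
        value = text_weights.get(tag, 0.0)
        if tag in image_tags:
            value += image_weight  # additive combination is an assumption
        vector.append(value)
    return vector


tags = ["elevation", "traffic", "drainage"]
item_vec = build_item_vector(
    tags,
    text_weights={"elevation": 0.5, "traffic": 0.25},
    image_tags={"drainage", "elevation"},
    image_weight=0.5,
)
```

Because the data items use the same tag system as the user interest tags, a vector built this way has the same dimensions as a user interest vector and can be compared with it directly.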
In some optional implementations of some embodiments, calculating the similarity between user interests and data features from the government user's interest vector and the data item feature vectors, and providing personalized recommendations to the government user based on the vector similarity includes:
calculating, by cosine similarity, the similarity between the government user's interest vector for urban waterlogging-prone point information and the feature vector of each urban waterlogging-prone point data item;
sorting the data items by the calculated vector similarity, and recommending the TOP-N data items with the highest similarity to the user.
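A minimal sketch of the cosine-similarity ranking described above follows. The item identifiers and vector values are made up for illustration, and returning 0.0 for an all-zero vector is a choice made for this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_top_n(interest, items, n):
    """Return the ids of the n items most similar to the interest vector."""
    ranked = sorted(
        items.items(),
        key=lambda pair: cosine_similarity(interest, pair[1]),
        reverse=True,
    )
    return [item_id for item_id, _ in ranked[:n]]


items = {
    "elevation_map": [1.0, 0.0, 0.0],
    "traffic_status": [0.0, 1.0, 0.0],
    "drainage_plan": [1.0, 1.0, 0.0],
}
top = recommend_top_n([1.0, 0.2, 0.0], items, n=2)
```

Cosine similarity depends only on the direction of the vectors, so a user with a few strong interests and a data item with many weak tags can still match well if their tag profiles point the same way.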
In some optional implementations of some embodiments, calculating the government user basic feature vector includes:
extracting keywords from the government user basic feature tags and mining their weights by using the term frequency-inverse document frequency (TF-IDF) algorithm from natural language processing to obtain a keyword-weight matrix;
performing feature training on the keyword-weight matrix by using the Word2vec method from deep learning to obtain the government user basic feature vector.
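The TF-IDF step can be sketched in plain Python as below. This covers only the keyword-weight matrix; the subsequent Word2vec training is not shown. The unsmoothed variant tf = count / length, idf = log(N / df) is an assumption, since the source does not state which TF-IDF variant is used, and the sample keywords are hypothetical.

```python
import math
from collections import Counter


def tfidf_matrix(documents):
    """Return one {keyword: weight} row per document.

    documents: list of keyword lists, one per government user.
    """
    n_docs = len(documents)
    # Number of documents in which each keyword appears at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    matrix = []
    for doc in documents:
        counts = Counter(doc)
        row = {
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        matrix.append(row)
    return matrix


users = [
    ["tianjin", "drainage", "bureau"],
    ["tianjin", "traffic", "bureau"],
]
weights = tfidf_matrix(users)
```

Keywords shared by every user, such as the common work area in this example, receive weight zero, so the matrix highlights what distinguishes one government user from another.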
In some optional implementations of some embodiments, calculating the user similarity from the government user basic feature vectors, extracting the interest vectors of the TOP-N most similar users and fusing them by weighting to serve as the interest vector of a new user, and then calculating the similarity between user interests and data features as the basis for information recommendation at user cold start includes:
step J1: calculating, from the government user basic feature vectors, the vector similarity between the basic features of the new user and those of other users;
step J2: sorting the similar users by vector similarity, and selecting the TOP-N most similar users to obtain their interest vectors;
step J3: fusing the interest vectors of the similar users by weighting to serve as the interest vector of the new user, wherein the weights of the weighted fusion are the user similarities calculated in step J1;
step J4: calculating the similarity between the new user's interest vector obtained in step J3 and the data item feature vectors, and recommending the TOP-N data items with the highest similarity to the new user.
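Steps J1 to J4 can be sketched as follows. Cosine similarity is assumed for J1 (the source only says "vector similarity"), and the fused vector is normalized by the sum of the similarities, which is also an assumption. All names and sample values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cold_start_interest(new_features, known_users, n):
    """known_users: list of (basic_feature_vector, interest_vector) pairs."""
    # J1: similarity between the new user and every known user.
    scored = [(cosine(new_features, feats), interest)
              for feats, interest in known_users]
    # J2: keep the n most similar users.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:n]
    # J3: similarity-weighted average of their interest vectors.
    total = sum(sim for sim, _ in top)
    if total == 0:
        raise ValueError("no known user has non-zero similarity")
    dim = len(top[0][1])
    return [sum(sim * vec[i] for sim, vec in top) / total for i in range(dim)]


def rank_items(interest, items, n):
    """J4: ids of the n data items most similar to the interest vector."""
    return sorted(items, key=lambda k: cosine(interest, items[k]), reverse=True)[:n]


known = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.0, 1.0], [5.0, 5.0]),
]
new_interest = cold_start_interest([1.0, 0.0], known, n=2)
recommended = rank_items(
    new_interest,
    {"elevation_map": [1.0, 0.0], "drainage_plan": [0.6, 0.8]},
    n=1,
)
```

The third known user has a different basic profile and therefore contributes nothing to the new user's interest vector, which is the intended effect of selecting only the most similar users.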
The beneficial effects of the invention are as follows:
Basic information and implicit feedback information of government users are acquired, and the data are stored in a user information database. A government user feature tag system and an interest tag system are constructed; government user basic feature tags are extracted from the government user basic information, and government user long-term interest tags are extracted from the implicit feedback information. The government user portrait is computed, including the government user basic feature vector and the long-term and short-term interest vectors of the government user for urban waterlogging-prone point information, and the long-term and short-term interest vectors are fused by weighting to obtain the government user's interest vector for urban waterlogging-prone point information. An urban waterlogging-prone point information database is connected, multi-source heterogeneous data of urban waterlogging-prone points are acquired, data item features are identified by data mining, and data item feature vectors are constructed. The similarity between user interests and data features is calculated from the government user's interest vector and the data item feature vectors, and personalized recommendations are provided to government users based on the vector similarity. For the user cold-start problem, user similarity is calculated from the government user basic feature vectors, the interest vectors of the TOP-N most similar users are extracted and fused by weighting to serve as the interest vector of a new user, and the similarity between user interests and data features is then calculated as the basis for information recommendation at user cold start.
Drawings
Fig. 1 is a general flow chart of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In a first aspect, the application provides a method for recommending urban waterlogging-prone point information based on government user portraits. As shown in Fig. 1, the method includes the following steps:
S100: acquiring basic information, explicit feedback information and implicit feedback information of government users, and storing the data in a government user information database;
In some optional implementations of some embodiments, acquiring the basic information, explicit feedback information and implicit feedback information of government users and storing the data in a government user information database includes:
acquiring basic information data of government users from the government user registration information of the urban waterlogging-prone point information management system, and storing the data in the user information database;
acquiring explicit feedback information of government users from user preference questionnaire data collected by the urban waterlogging-prone point information management system and from the users' ratings of data items;
acquiring implicit feedback information of government users from the government user log information of the urban waterlogging-prone point information management system;
the implicit feedback information of government users at least comprises: the access log of the user's accesses to data related to urban waterlogging-prone points, wherein the access log comprises access behaviors, access durations and access time intervals.
S200: constructing a government user feature tag system and an interest tag system, extracting government user basic feature tags according to the government user feature tag system, and extracting long-term interest tags and short-term interest tags of the government users based on the interest tag system;
In some optional implementations of some embodiments, constructing the government user feature tag system and the interest tag system includes:
extracting text data from the basic information data of government users and from the government users' access logs for data related to urban waterlogging-prone points, respectively, and preprocessing the text data to obtain a first word segmentation set for the text data of the government user basic information and a second word segmentation set for the text data of the access logs;
Repeated, incomplete and erroneous data are screened out of the text data, and the screened text data are segmented by a preset word segmenter. Segmented words that characterize the basic features of users are selected to form the first word segmentation set, and segmented words that characterize the data items of interest to government users regarding urban waterlogging-prone points are selected to form the second word segmentation set. The Stanford word segmentation package may be used as the word segmenter.
In one implementation, after the word segmentation sets are formed, stop words can be screened out according to a preset stop-word list before the subsequent tag system construction is performed. Screening out stop words reduces the number of words in each word segmentation set and thereby simplifies the subsequent tag system construction. Stop words are characters or words that are automatically filtered out when text data are processed, in order to save storage space and improve search efficiency.
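Stop-word screening amounts to a set lookup over the segmented words. The stop-word list below is a tiny hypothetical English example; the source uses a preset stop-word list whose contents are not given.

```python
# Hypothetical stop-word list; a real deployment would load a preset list.
STOP_WORDS = frozenset({"the", "of", "a", "and"})


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not stop words, preserving their order."""
    return [t for t in tokens if t.lower() not in stop_words]


tokens = ["The", "elevation", "of", "waterlogging", "point"]
filtered = remove_stop_words(tokens)
```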
inputting the first word segmentation set and the second word segmentation set separately into the tag construction model LDA to obtain the probability distribution of the first main tag words corresponding to the first word segmentation set and the probability distribution of the second main tag words corresponding to the second word segmentation set;
The probability distribution can be understood as the frequency with which a main tag word occurs in the word segmentation set to which it belongs. LDA is a three-layer Bayesian probability model, whose role here is to generate topics for the text tags;
The number of tags used to construct the tag system is determined according to the perplexity of the LDA model, which is given by:
$$\mathrm{perplexity} = \exp\left(-\frac{\sum_{w} \log P(w)}{M}\right)$$
where M is the number of segmented words contained in the word segmentation set and P(w) is the probability of a main tag word w. P(w) is obtained by multiplying the probability P(z|d) of topic z in the word segmentation set d by the probability P(w|z) of the word under topic z and summing over the topics:
$$P(w) = \sum_{z} P(z \mid d)\, P(w \mid z)$$
The smaller the perplexity, the better the model has been trained. The final number of tags can be determined by comparing the perplexity line graphs obtained when the number of tags is 25, 35, 45, and so on. An accurate number of tags allows the tag dimensions of the text data to be mined comprehensively and supports the construction of a more complete tag system.
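The selection measure described above can be sketched numerically, assuming it is the standard LDA perplexity exp(-sum(log P(w)) / M) with P(w) = sum over z of P(z|d) P(w|z). The topic and word probabilities below are made-up values, not the output of a trained model.

```python
import math


def word_probability(word, topic_given_doc, word_given_topic):
    """P(w) = sum over topics z of P(z|d) * P(w|z)."""
    return sum(
        p_z * word_given_topic[z].get(word, 0.0)
        for z, p_z in topic_given_doc.items()
    )


def perplexity(words, topic_given_doc, word_given_topic):
    """exp(-sum(log P(w)) / M) over the M segmented words.

    Assumes a non-empty word list and P(w) > 0 for every word.
    """
    log_sum = sum(
        math.log(word_probability(w, topic_given_doc, word_given_topic))
        for w in words
    )
    return math.exp(-log_sum / len(words))


topic_given_doc = {"t1": 0.5, "t2": 0.5}
word_given_topic = {
    "t1": {"flood": 0.5, "road": 0.5},
    "t2": {"flood": 0.5, "road": 0.5},
}
score = perplexity(["flood", "road"], topic_given_doc, word_given_topic)
```

In practice the same computation would be repeated for models trained with 25, 35, 45 and more tags, and the tag count with the lowest value on the resulting curve would be chosen.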
determining a first main tag name according to the probability distribution of the first main tag words corresponding to the first word segmentation set, and determining a second main tag name according to the probability distribution of the second main tag words corresponding to the second word segmentation set;
generating, according to the first main tag name, a first tag data source of the government user feature tag class to which the first main tag name belongs, and generating, according to the second main tag name, a second tag data source of the interest tag class to which the second main tag name belongs;
the tag data sources can be understood as the corresponding government user interest tags and government user basic feature tags in the tag systems;
constructing the government user feature tag system from the first tag data source by using a preset tag system construction method, and constructing the interest tag system from the second tag data source by using the preset tag system construction method;
In view of government users' needs regarding the interest features of data related to urban waterlogging-prone points and regarding government user basic feature information, the important basic information concepts and interest elements for the tag system are screened out in combination with the corresponding tag data sources. The selected concepts and terms are then sorted and grouped according to the concepts and relations in an open-source ontology lexicon, and main tag words with strong correlation are grouped into sub-fields. Using a top-down approach that starts from the platform's top-level tag concepts, the lower-level branches of each tag are worked out and refined sub-class concepts are added. All tags are organized by a tree structure into their respective hierarchical tag systems. The preset tag system construction method includes the skeletal method (Skeletal Methodology).
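The tree-structured, top-down tag hierarchy described above could be represented as in the sketch below. The class name `TagNode` and the grouping of tags under a "work" branch are assumptions for illustration; only the leaf tag names (work area, work department, authority) come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class TagNode:
    """One node of a hierarchical tag system (hypothetical structure)."""
    name: str
    children: list = field(default_factory=list)

    def add(self, name):
        """Attach a sub-tag below this tag and return it."""
        child = TagNode(name)
        self.children.append(child)
        return child

    def paths(self, prefix=()):
        """Return every root-to-leaf path as a tuple of tag names."""
        current = prefix + (self.name,)
        if not self.children:
            return [current]
        result = []
        for child in self.children:
            result.extend(child.paths(current))
        return result


# Build top-down: start from the top-level concept, then refine.
root = TagNode("government user features")
work = root.add("work")
work.add("work area")
work.add("work department")
root.add("authority")
leaf_paths = root.paths()
```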
In some optional implementations of some embodiments, extracting the government user basic feature tags according to the government user feature tag system and extracting the long-term interest tags and short-term interest tags of the government users based on the interest tag system includes:
extracting, from the government user feature tag system, the government user basic feature tags that characterize a government user, wherein the government user basic feature tags comprise: the work area, work institution, work department and authority of the government user;
extracting the long-term interest tags and the short-term interest tags of the government user from the interest tag system, wherein the long-term and short-term interest tags comprise: government user attribute features, the operation area of the government user on business pages of the urban waterlogging-prone point information management system, browsing speed, page size and request duration.
Further, as another embodiment of step S200: constructing a government user feature tag system and an interest tag system, extracting a government user basic feature tag according to the government user feature tag system, and respectively extracting a long-term interest tag and a short-term interest tag of the government user based on the interest tag system; the method further comprises the following implementation steps:
S201: constructing a government user feature tag system and an interest tag system of the government user to the urban waterlogging point information according to the government user basic features and the data item features;
S202: mining the weights of the government user basic feature tags from the basic information data of the government users using natural language processing (NLP) technology;
S203: dividing the sessions of the government user on the urban waterlogging point information into a long-term session and a short-term session according to the feedback time of the implicit feedback information of the government user, and extracting the weights of the interest tags of the government user from the long-term session and the short-term session respectively based on NLP technology;
S204: generating a government user basic feature vector using the weight extraction result of step S202;
S205: generating a long-term interest vector and a short-term interest vector of the government user using the weight extraction result of step S203, and performing weighted fusion on the long-term interest vector and the short-term interest vector of the government user to obtain the interest vector of the government user for the urban waterlogging point information, wherein the weights used in the weighted fusion are determined by training a plurality of models and comparing their performance.
S300: calculating a long-term interest vector and a short-term interest vector of the government user for the urban waterlogging point information, and performing weighted fusion on the long-term interest vector and the short-term interest vector of the government user to obtain the interest vector of the government user for the urban waterlogging point information;
In some optional implementations of some embodiments, the calculating of the long-term interest vector and the short-term interest vector of the government user for the urban waterlogging point information includes:
extracting core semantics of the government user attribute features in the long-term interest tag and the short-term interest tag to obtain attribute feature semantics;
calculating attribute feature frequency of the attribute feature semantics by using a feature frequency formula;
selecting attribute feature semantics with highest attribute feature frequency as calculation factors of long-term interest vectors and short-term interest vectors;
calculating, with a duration formula, the browsing duration of the government user from the operation areas, the browsing speed, the page size and the request duration in the long-term interest tag and the short-term interest tag, wherein the browsing duration of the government user comprises a longest browsing duration and a shortest browsing duration;
and calculating a long-term interest vector according to the longest browsing duration and the attribute feature semantic with the highest attribute feature frequency, and calculating a short-term interest vector according to the shortest browsing duration and the attribute feature semantic with the highest attribute feature frequency.
A pre-built semantic analysis model may be used to extract the core semantics of the target information to obtain the information semantics, wherein the semantic analysis model includes, but is not limited to, an NLP (Natural Language Processing) model and an HMM (Hidden Markov Model). For example, the pre-built semantic analysis model applies operations such as convolution and pooling to the attribute features to extract their low-dimensional feature expressions, maps the extracted low-dimensional feature expressions to a pre-built high-dimensional space to obtain high-dimensional feature expressions, and selectively outputs the high-dimensional feature expressions through a preset activation function to obtain the attribute feature semantics.
Specifically, the attribute feature frequency of the attribute feature semantics is calculated according to the feature frequency formula, and the feature preference of the government user is inferred from the attribute feature frequency. In general, the number of occurrences of a word in the historical data and the proportion of the historical data in which the word is used can represent the degree of feature preference of the government user. The attribute feature frequency is therefore calculated based on the feature frequency formula in order to select the attribute feature semantics that can represent the target government user; that is, the more frequently an attribute feature semantic occurs in an attribute document, the closer it is to the preference of the government user. The feature frequency formula is as follows:
wherein f is the attribute feature frequency, kw(k_it) is the number of occurrences of the attribute feature semantic k_t in the i-th tag, dw(k_t) is the number of tags in which the attribute feature semantic k_t occurs, N is the total number of tags in the tag system, and log is the logarithmic function;
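A minimal Python sketch of this calculation follows. It assumes that the feature frequency takes a TF-IDF-style form, f = kw(k_it) · log(N / dw(k_t)), which is consistent with the terms defined above but is an assumption rather than a confirmed statement of the formula. The function names and the sample tags are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple


def feature_frequency(tags: List[List[str]]) -> List[Dict[str, float]]:
    """For each tag i, map each semantic k_t to f = kw * log(N / dw) (assumed form)."""
    n_tags = len(tags)  # N: total number of tags
    # dw(k_t): number of tags in which the semantic k_t occurs
    dw: Dict[str, int] = {}
    for tag in tags:
        for semantic in set(tag):
            dw[semantic] = dw.get(semantic, 0) + 1
    result = []
    for tag in tags:
        scores = {}
        for semantic in set(tag):
            kw = tag.count(semantic)  # kw(k_it): occurrences of k_t in tag i
            scores[semantic] = kw * math.log(n_tags / dw[semantic])
        result.append(scores)
    return result


def top_semantic(tags: List[List[str]]) -> Tuple[Optional[str], float]:
    """Select the attribute feature semantic with the highest frequency f."""
    best_semantic, best_f = None, float("-inf")
    for scores in feature_frequency(tags):
        for semantic, f in scores.items():
            if f > best_f:
                best_semantic, best_f = semantic, f
    return best_semantic, best_f


# Hypothetical semantics extracted from three tags.
sample_tags = [
    ["drainage", "drainage", "rainfall"],
    ["rainfall", "pump"],
    ["rainfall", "road"],
]
best, score = top_semantic(sample_tags)
print(best, round(score, 4))
```

Under this assumed form, a semantic that appears in every tag (here "rainfall") scores zero, so the selected calculation factor is one that is both frequent and distinctive.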
the duration formula is as follows:
wherein the quantities in the formula are, in order: the browsing duration of the k-th operation area; the page size of the i-th operation area; the browsing speed of the i-th operation area; the operation area that follows the current operation area; and the request duration of the k-th operation area;
and determining the page operation intention of the target user according to the browsing time length.
In detail, the operation areas in which the target user may act on the business page can be determined from the basic behavior features. For example, when the basic behavior feature of a government user is querying the waterlogging-prone data page, the operation areas of the target user on the business page are areas such as the waterlogging-prone area, the precipitation histogram, or the precipitation data graphs of each area;
further, interest features of the government user on the business page are generated according to the page operation intention, and interest keywords of the interest features are extracted, wherein the longest browsing duration is selected as the time factor of the long-term interest vector and, combined with the attribute feature frequency f of the government user, the interest features of the government user are classified to obtain the long-term interest category; the shortest browsing duration is selected as the time factor of the short-term interest vector and, combined with the attribute feature frequency f of the government user, the interest features of the government user are classified to obtain the short-term interest category; and vector conversion is performed on the long-term interest category and the short-term interest category respectively to obtain the long-term interest vector and the short-term interest vector. The degree of interest of a government user in a web page is closely related to the user's browsing behavior on that page, and many browsing behaviors, such as querying, browsing pages, adding bookmarks and giving feedback, indicate the user's preferences and interests.
The dwell time, the number of visits, saving and other actions of a government user when accessing a page also represent the user's interests. That is, the operation areas of the page in which the government user is interested can be determined from the user's page operation intention on the business page. Given the longest browsing duration and the shortest browsing duration of the user in the k-th operation area, it can be determined, from the browsing duration and the attribute features of the government user, which part of the content the user is interested in over the long term or the short term.
In some optional implementations of some embodiments, the weighting and fusing the long-term interest vector and the short-term interest vector of the government users to obtain the interest vector of the government users for the urban waterlogging point information includes:
calculating a first weighting factor of the long-term interest vector relative to a short-term interest vector based on a multi-label classifier;
calculating a second weighting factor of the short-term interest vector relative to the long-term interest vector based on the multi-label classifier;
and weighting the long-term interest vector by a first weighting factor, weighting the short-term interest vector by a second weighting factor, and fusing the weighted long-term interest vector and short-term interest vector to obtain an interest vector of the government users on urban waterlogging point information.
The calculation formula of the first weighting factor is as follows:
the calculation formula of the second weighting factor is as follows:
wherein V_1 denotes the long-term interest vector, V_2 denotes the short-term interest vector, the product of the two vectors is the dot product, the classifier term denotes the probability values under each label obtained after an interest vector passes through the multi-label classifier, the summation term denotes the sum of the probability values over all labels, the distance term denotes the distance between the long-term and short-term interest vectors, and exp() denotes the exponential operation on a vector, i.e. the natural exponential function evaluated at the feature value of each position in the vector.
Specifically, after the first weighting factor and the second weighting factor are obtained, they are used to weight the long-term interest vector V_1 and the short-term interest vector V_2 respectively, and the weighted long-term interest vector and the weighted short-term interest vector are fused to obtain the interest vector of the government user for the urban waterlogging point information. Accordingly, in one specific example, the position-wise weighted sum of the weighted long-term interest vector and the weighted short-term interest vector may be calculated to obtain the interest vector of the government user for the urban waterlogging point information. It should be understood that the weighting factors allow the vectors to interact spatially through a self-attention mechanism between them, and the interest vector of the government user for the urban waterlogging point information is obtained by fusing the interest vectors according to a measure of the feature dissimilarity between them.
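The position-wise weighted sum in the specific example above can be sketched in Python as follows. The two weighting factors are simply passed in as numbers here, because their derivation from the multi-label classifier is defined by the formulas for the first and second weighting factors; the function name `fuse_interest_vectors` and all numeric values are illustrative assumptions.

```python
from typing import List, Sequence


def fuse_interest_vectors(
    v_long: Sequence[float],
    v_short: Sequence[float],
    w_long: float,
    w_short: float,
) -> List[float]:
    """Position-wise weighted sum of the long-term and short-term interest vectors."""
    if len(v_long) != len(v_short):
        raise ValueError("interest vectors must have the same dimension")
    return [w_long * a + w_short * b for a, b in zip(v_long, v_short)]


# Hypothetical vectors and weighting factors.
v1 = [0.8, 0.1, 0.4]  # long-term interest vector V_1
v2 = [0.2, 0.9, 0.4]  # short-term interest vector V_2
fused = fuse_interest_vectors(v1, v2, w_long=0.6, w_short=0.4)
print([round(x, 2) for x in fused])
```

Positions on which both vectors agree (the third component here) are left unchanged when the two weights sum to one, while positions on which they differ are pulled toward the vector with the larger weight.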
S400: connecting an urban waterlogging point information database, acquiring multi-source heterogeneous data of the urban waterlogging points, identifying data item features by adopting a data mining method, and constructing a data item feature vector;
in some optional implementations of some embodiments, the connecting the urban easy waterlogging point information database to obtain multi-source heterogeneous data of the urban easy waterlogging point, identifying data item features by adopting a data mining method, and constructing a data item feature vector includes:
connecting an urban waterlogging point information database to obtain multi-source heterogeneous data of the urban waterlogging points;
adopting a label system which is the same as the interest label of the government users for the data item;
extracting the weights of the data item feature tags from the text information of the urban waterlogging point data items using NLP technology, wherein the text information comprises: the title, description information, manually annotated tags/keywords, contributors, creation time, etc. of the data item;
identifying data item characteristics from the picture information of the urban waterlogging-prone point data item by using a deep learning method;
and combining the weight extraction result of the data item feature tag with the recognition result of the data item feature to complete the construction of the data item feature vector.
Specifically: an urban waterlogging point information database is connected to obtain the data related to urban waterlogging points; features are extracted from the text data of the urban waterlogging points using the TF-IDF algorithm and the Word2vec method to generate the text information feature vector of each data item, wherein the text data comprises: the title, description information, manually annotated tags/keywords, contributors and creation time of the data item; features are extracted from the picture data of the urban waterlogging points using a convolutional neural network (CNN) algorithm in deep learning to generate a feature tag set, and the picture information feature vector of the data item is generated based on the Word2vec method, wherein the features of the picture data comprise: color features, shape features, texture features and spatial relationship features; and the text information feature vector and the picture information feature vector of each data item of the urban waterlogging point information data are fused by weighting to generate the final data item feature vector;
S500: calculating the similarity of the user interests and the data characteristics by using interest vectors and data item characteristic vectors of the government users on the urban waterlogging point information, and providing personalized recommendation for the government users based on the vector similarity;
in some optional implementations of some embodiments, the calculating the similarity between the user interests and the data features by using the interest vector and the data item feature vector of the city waterlogging point information of the government users, and providing personalized recommendation for the government users based on the vector similarity includes:
calculating the similarity of interest vectors of government users on the urban easy waterlogging point information and feature vectors of the urban easy waterlogging point information data items by adopting cosine similarity;
and sorting the data items by the calculated vector similarity, and recommending the Top-N data items with the highest similarity to the user,
wherein the two vectors compared by the cosine similarity are the interest feature vector of the government user and the feature vector of the urban waterlogging point information data item.
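The cosine-similarity ranking described above can be sketched in Python as follows. Cosine similarity is the dot product of two vectors divided by the product of their Euclidean norms. The function names, the item identifiers and the vector values are assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def recommend_top_n(
    user_interest: Sequence[float],
    item_vectors: Dict[str, Sequence[float]],
    n: int,
) -> List[Tuple[str, float]]:
    """Rank data items by similarity to the user's interest vector and keep the top n."""
    scored = [
        (item_id, cosine_similarity(user_interest, vector))
        for item_id, vector in item_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Hypothetical interest vector and data item feature vectors.
user = [1.0, 0.0, 1.0]
items = {
    "A": [1.0, 0.0, 1.0],
    "B": [0.0, 1.0, 0.0],
    "C": [1.0, 1.0, 1.0],
}
top = recommend_top_n(user, items, n=2)
print([(item_id, round(sim, 3)) for item_id, sim in top])
```

Because cosine similarity depends only on direction, a data item whose feature vector is a scaled copy of the interest vector ranks as highly as an exact match.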
S600: calculating the government user basic feature vectors, calculating user similarity according to the government user basic feature vectors, extracting the interest vectors of the Top-N most similar users and performing weighted fusion on them to serve as the interest vector of a new user, and then calculating the similarity between the user interests and the data features as the basis for information recommendation during user cold start.
In some optional implementations of some embodiments, the calculating of the government user basic feature vector includes:
extracting keywords from the basic feature tags of the government users and mining their weights using the term frequency-inverse document frequency (TF-IDF) algorithm from natural language processing (NLP) to obtain a keyword-weight matrix;
and performing feature training on the keyword-weight matrix using the Word2vec method in deep learning to obtain the government user basic feature vector.
Specifically, keyword extraction and weight mining are first performed on the basic feature tags of the government user using the TF-IDF algorithm. The term frequency (TF) represents how frequently a word occurs in a text and is calculated according to the following formula:
$$TF_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$
wherein the numerator $n_{i,j}$ is the number of occurrences of word i in text j, and the denominator is the sum of the numbers of occurrences of all words in text j.
The inverse document frequency (IDF) reflects how widely a word occurs across the whole text library and thus characterizes how common the word is; the specific calculation formula is as follows:
$$IDF_{i} = \log\frac{|D|}{|\{j : t_i \in d_j\}|}$$
wherein the numerator $|D|$ is the number of all texts in the text library, and the denominator is the number of texts $d_j$ in the text library that contain word $t_i$.
The TF-IDF weight is based on TF and IDF, and the specific formula is as follows:
$$TFIDF_{i,j} = TF_{i,j} \times IDF_{i}$$
Based on TF-IDF, the keywords of the basic feature tags of the government user and their weights are obtained and a keyword-weight matrix is generated; feature training is then performed on the keyword-weight matrix using the Word2vec method to generate the government user basic feature vector.
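The keyword weighting step can be sketched in plain Python as follows, assuming the standard definitions: TF is the count of a word in a text divided by the total word count of that text, and IDF is the logarithm of the number of texts divided by the number of texts containing the word. The function name and the sample texts are hypothetical, and the subsequent Word2vec training step is not shown.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf_weights(texts: List[List[str]]) -> List[Dict[str, float]]:
    """Return one keyword -> weight map per text (one row of the keyword-weight matrix)."""
    n_texts = len(texts)
    # Number of texts that contain each word (the IDF denominator).
    doc_freq: Counter = Counter()
    for words in texts:
        doc_freq.update(set(words))
    weights = []
    for words in texts:
        counts = Counter(words)
        total = sum(counts.values())  # TF denominator: all word occurrences in the text
        weights.append({
            word: (count / total) * math.log(n_texts / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


# Hypothetical, already tokenized basic feature tags of two users.
texts = [
    ["drainage", "department", "drainage"],
    ["water", "department"],
]
matrix = tf_idf_weights(texts)
for row in matrix:
    print({word: round(weight, 3) for word, weight in row.items()})
```

A word shared by every text, such as "department" here, receives a weight of zero, so only words that distinguish one user from another contribute to the basic feature vector.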
In some optional implementations of some embodiments, the calculating of user similarity according to the government user basic feature vectors, extracting the interest vectors of the Top-N most similar users and performing weighted fusion on them to serve as the interest vector of a new user, and then calculating the similarity between the user interests and the data features as the basis for information recommendation during user cold start includes:
step J1: calculating the vector similarity between the basic features of the new user and the basic features of other users based on the government user basic feature vectors;
step J2: sorting the similar users by vector similarity, and selecting the Top-N most similar users to obtain their interest vectors;
step J3: performing weighted fusion on the interest vectors of the similar users to serve as the interest vector of the new user, wherein the weights of the weighted fusion are the user similarities calculated in step J1;
step J4: calculating the similarity between the new user's interest vector obtained in step J3 and the data item feature vectors, and recommending the Top-N data items with the highest similarity to the new user.
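Steps J1 to J3 can be sketched in Python as follows. Cosine similarity is assumed as the vector similarity of step J1, and the weights are normalized to sum to one before fusion; both choices, along with the function names and sample data, are assumptions for illustration rather than details stated by the method. Step J4 then ranks data items against the returned vector in the same way as for an existing user.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cold_start_interest(
    new_user_basic: Sequence[float],
    known_users: Dict[str, Tuple[Sequence[float], Sequence[float]]],
    n: int,
) -> List[float]:
    """Build a new user's interest vector from the n most similar known users.

    known_users maps a user id to (basic feature vector, interest vector).
    """
    # J1: similarity between the new user's and every known user's basic features.
    scored = [
        (cosine(new_user_basic, basic), interest)
        for basic, interest in known_users.values()
    ]
    # J2: keep the n most similar users.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:n]
    total = sum(similarity for similarity, _ in top)
    if not top or total == 0.0:
        raise ValueError("no similar users available for cold start")
    # J3: similarity-weighted fusion of the selected interest vectors.
    dim = len(top[0][1])
    return [
        sum(similarity * interest[i] for similarity, interest in top) / total
        for i in range(dim)
    ]


# Hypothetical known users: (basic feature vector, interest vector).
users = {
    "u1": ([1.0, 0.0], [1.0, 0.0]),
    "u2": ([0.0, 1.0], [0.0, 1.0]),
    "u3": ([1.0, 1.0], [0.5, 0.5]),
}
new_interest = cold_start_interest([1.0, 0.0], users, n=2)
print([round(x, 4) for x in new_interest])
```

Normalizing by the total similarity keeps the fused vector on the same scale as the interest vectors it was built from, so it can be compared with data item feature vectors without further adjustment.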
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and improvements made by those skilled in the art without departing from the present technical solution shall be considered as falling within the scope of the claims.

Claims (8)

1. A city waterlogging point information recommendation method based on government user portraits is characterized by comprising the following steps:
acquiring basic information, explicit feedback information and implicit feedback information of a government user, and storing data in a government user information database;
constructing a government user feature tag system and an interest tag system, extracting a government user basic feature tag according to the government user feature tag system, and respectively extracting a long-term interest tag and a short-term interest tag of the government user based on the interest tag system;
calculating a long-term interest vector and a short-term interest vector of the government user for the urban waterlogging point information, and performing weighted fusion on the long-term interest vector and the short-term interest vector of the government user to obtain the interest vector of the government user for the urban waterlogging point information, wherein calculating the long-term interest vector and the short-term interest vector of the government user for the urban waterlogging point information specifically comprises: extracting core semantics of the government user attribute features in the long-term interest tag and the short-term interest tag to obtain attribute feature semantics; calculating the attribute feature frequency of the attribute feature semantics using a feature frequency formula; selecting the attribute feature semantic with the highest attribute feature frequency as a calculation factor of the long-term interest vector and the short-term interest vector; calculating, with a duration formula, the browsing duration of the government user from the operation areas, the browsing speed, the page size and the request duration in the long-term interest tag and the short-term interest tag, wherein the browsing duration of the government user comprises a longest browsing duration and a shortest browsing duration; calculating the long-term interest vector according to the longest browsing duration and the attribute feature semantic with the highest attribute feature frequency, and calculating the short-term interest vector according to the shortest browsing duration and the attribute feature semantic with the highest attribute feature frequency; wherein the attribute feature frequency is calculated based on the feature frequency formula in order to select the attribute feature semantics representing the target government user, and the feature frequency formula is as follows:
wherein f is the attribute feature frequency, kw(k_it) is the number of occurrences of the attribute feature semantic k_t in the i-th tag, dw(k_t) is the number of tags in which the attribute feature semantic k_t occurs, N is the total number of tags in the tag system, and log is the logarithmic function;
the formula of the duration is as follows:
wherein the quantities in the formula are, in order: the browsing duration of the k-th operation area; the page size of the i-th operation area; the browsing speed of the i-th operation area; the operation area that follows the current operation area; and the request duration of the k-th operation area;
determining the page operation intention of the target user according to the browsing duration, generating interest features of the government user on the business page according to the page operation intention, and extracting interest keywords of the interest features, wherein the longest browsing duration is selected as the time factor of the long-term interest vector and, combined with the attribute feature frequency f of the government user, the interest features of the government user are classified to obtain the long-term interest category; the shortest browsing duration is selected as the time factor of the short-term interest vector and, combined with the attribute feature frequency f of the government user, the interest features of the government user are classified to obtain the short-term interest category; and vector conversion is performed on the long-term interest category and the short-term interest category respectively to obtain the long-term interest vector and the short-term interest vector;
wherein performing weighted fusion on the long-term interest vector and the short-term interest vector of the government user to obtain the interest vector of the government user for the urban waterlogging point information specifically comprises:
calculating a first weighting factor of the long-term interest vector relative to a short-term interest vector based on a multi-label classifier;
calculating a second weighting factor of the short-term interest vector relative to the long-term interest vector based on the multi-label classifier;
weighting the long-term interest vector through a first weighting factor, weighting the short-term interest vector through a second weighting factor, and fusing the weighted long-term interest vector and short-term interest vector to obtain an interest vector of the government users on urban waterlogging point information;
the calculation formula of the first weighting factor is as follows:
the calculation formula of the second weighting factor is as follows:
wherein V_1 denotes the long-term interest vector, V_2 denotes the short-term interest vector, the product of the two vectors is the dot product, the classifier term denotes the probability values under each label obtained after an interest vector passes through the multi-label classifier, the summation term denotes the sum of the probability values over all labels, the distance term denotes the distance between the long-term and short-term interest vectors, and exp() denotes the exponential operation on a vector, i.e. the natural exponential function evaluated at the feature value of each position in the vector;
after the first weighting factor and the second weighting factor are obtained, the first weighting factor and the second weighting factor are used to weight the long-term interest vector V_1 and the short-term interest vector V_2 respectively, and the weighted long-term interest vector and the weighted short-term interest vector are fused to obtain the interest vector of the government user for the urban waterlogging point information; the position-wise weighted sum of the weighted long-term interest vector and the weighted short-term interest vector is calculated to obtain the interest vector of the government user for the urban waterlogging point information;
connecting an urban waterlogging point information database, acquiring multi-source heterogeneous data of the urban waterlogging points, identifying data item features by adopting a data mining method, and constructing a data item feature vector;
calculating the similarity of the user interests and the data characteristics by using interest vectors and data item characteristic vectors of the government users on the urban waterlogging point information, and providing personalized recommendation for the government users based on the vector similarity;
calculating the government user basic feature vectors, calculating user similarity according to the government user basic feature vectors, extracting the interest vectors of the Top-N most similar users and performing weighted fusion on them to serve as the interest vector of a new user, and then calculating the similarity between the user interests and the data features as the basis for information recommendation during user cold start.
2. The method according to claim 1, characterized in that: the acquiring the basic information, the explicit feedback information and the implicit feedback information of the government users, and storing the data in a government user information database comprises the following steps:
acquiring basic information data of government users through government user registration information of the urban waterlogging point information management system, and storing the data in a user information database;
acquiring explicit feedback information of government users through user preference questionnaire data collected by the urban waterlogging point information management system and scoring of the users on data items;
acquiring implicit feedback information of government users through log information of the government users of the urban waterlogging point information management system;
the implicit feedback information of the government users at least comprises: the user accesses the access log of the related data of the urban waterlogging-prone point, wherein the access log comprises access behaviors, access duration and access time intervals.
3. The method according to claim 2, characterized in that: the construction of the government user characteristic tag system and the interest tag system comprises the following steps:
extracting text data from the basic information data of the government users and the access logs of the government users to the relevant data of the urban waterlogging points respectively, preprocessing the text data to obtain a first word segmentation set of the text data comprising the basic information of the government users and a second word segmentation set of the text data of the access logs of the government users to the relevant data of the urban waterlogging points;
Respectively inputting the first word segmentation set and the second word segmentation set into a label construction model LDA to obtain probability distribution of a first main label word corresponding to the first word segmentation set and probability distribution of a second main label word corresponding to the second word segmentation set;
determining a first main tag name according to the probability distribution of the first main tag word corresponding to the first word segmentation set, and determining a second main tag name according to the probability distribution of the second main tag word corresponding to the second word segmentation set;
generating a first tag data source of a government user feature tag class to which the first main tag name belongs according to the first main tag name, and generating a second tag data source of an interest tag class to which the second main tag name belongs according to the second main tag name;
and constructing the government user feature tag system from the first tag data source using a preset tag system construction method, and constructing the interest tag system from the second tag data source using the preset tag system construction method.
4. A method according to claim 3, characterized in that: the method for extracting the basic feature labels of the government users according to the feature label system of the government users, and extracting the long-term interest labels and the short-term interest labels of the government users based on the interest label system respectively comprises the following steps:
Extracting a government user basic feature tag representing a government user from a government user feature tag system, wherein the government user basic feature tag comprises: work areas, work institutions, work departments and authority of government users;
extracting a long-term interest tag and a short-term interest tag of the government user from the interest tag system respectively, wherein the long-term interest tag and the short-term interest tag of the government user comprise: the government user attribute features, the operation area of the government user on a business page of the urban waterlogging point information management system, the browsing speed, the page size and the request duration.
5. The method according to claim 4, wherein: the method for constructing the data item feature vector comprises the following steps of:
connecting an urban waterlogging point information database to obtain multi-source heterogeneous data of the urban waterlogging points;
adopting a label system which is the same as the interest label of the government users for the data item;
extracting the weights of the data item feature tags from the text information of the urban waterlogging point data items using NLP technology, wherein the text information comprises: the title, description information, manually annotated tags/keywords, contributors and creation time of the data item;
Identifying data item characteristics from the picture information of the urban waterlogging-prone point data item by using a deep learning method;
and combining the weight extraction result of the data item feature tag with the recognition result of the data item feature to complete the construction of the data item feature vector.
6. The method according to claim 5, wherein calculating the similarity between user interests and data features from the government user's interest vector for urban waterlogging point information and the data item feature vectors, and providing personalized recommendations to the government user based on vector similarity, comprises:
calculating, using cosine similarity, the similarity between the government user's interest vector for urban waterlogging-prone point information and the feature vector of each urban waterlogging-prone point information data item;
and sorting the data items by the calculated vector similarity, and recommending the TOP-N data items with the highest similarity to the user.
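The two steps of claim 6, cosine similarity followed by a TOP-N ranking, can be sketched as follows. The function names, the three-dimensional tag space, and the item identifiers are hypothetical and are not taken from the patent.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_top_n(interest_vector, item_vectors, n):
    """Rank data items by similarity to the interest vector and return the first n ids."""
    scored = [
        (cosine_similarity(interest_vector, vec), item_id)
        for item_id, vec in item_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:n]]


# Hypothetical tag space: [drainage, rainfall, traffic].
user_interest = [0.9, 0.1, 0.0]
items = {
    "pump_station_report": [1.0, 0.0, 0.0],
    "rain_gauge_feed": [0.0, 1.0, 0.0],
    "road_closure_notice": [0.0, 0.2, 1.0],
}
top_items = recommend_top_n(user_interest, items, n=2)
```

Cosine similarity depends only on the direction of the vectors, so a user with uniformly larger tag weights is ranked against items in the same way as a user with smaller weights in the same proportions.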
7. The method according to claim 6, wherein calculating the government user basic feature vector comprises:
extracting keywords from the basic feature tags of government users using the term frequency-inverse document frequency (TF-IDF) algorithm from natural language processing, and mining their weights to obtain a keyword-weight matrix;
and performing feature training on the keyword-weight matrix using the Word2vec method from deep learning to obtain the government user basic feature vector.
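The keyword-weight matrix of claim 7 can be illustrated with a small TF-IDF computation. This is a sketch under stated assumptions: it uses one common TF-IDF variant (relative term frequency multiplied by log(N/df)), which the claim does not specify, and the tag tokens are invented. The subsequent Word2vec training step is not shown, since it would need an external library and training settings that the claim does not give.

```python
import math
from collections import Counter


def tfidf_matrix(documents):
    """Build a keyword-weight matrix from tokenised documents.

    documents: list of token lists, one per government user.
    Returns (vocabulary, rows), where rows[i][j] is the TF-IDF weight
    of vocabulary[j] for user i.
    """
    vocabulary = sorted({tok for doc in documents for tok in doc})
    n_docs = len(documents)
    # Document frequency: number of users whose tags contain the term.
    doc_freq = Counter(tok for doc in documents for tok in set(doc))
    rows = []
    for doc in documents:
        counts = Counter(doc)
        row = []
        for term in vocabulary:
            tf = counts[term] / len(doc) if doc else 0.0
            idf = math.log(n_docs / doc_freq[term])
            row.append(tf * idf)
        rows.append(row)
    return vocabulary, rows


# Hypothetical basic feature tags (work area, institution, department, authority).
user_tags = [
    ["district_a", "water_bureau", "drainage_dept", "admin"],
    ["district_a", "emergency_bureau", "dispatch_dept", "viewer"],
]
vocabulary, weights = tfidf_matrix(user_tags)
```

In this variant a tag shared by every user, such as `district_a` here, receives a weight of zero, so the matrix emphasises the tags that distinguish one user from another.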
8. The method according to claim 7, wherein calculating user similarity from the government user basic feature vectors, extracting the interest vectors of the TOP-N most similar users and fusing them by weighting to serve as the interest vector of a new user, and then calculating the similarity between user interests and data features as the basis for information recommendation during user cold start, comprises:
step J1: calculating, based on the government user basic feature vectors, the vector similarity between the basic features of the new user and those of other users;
step J2: sorting the other users by vector similarity, and selecting the TOP-N most similar users to obtain their interest vectors;
step J3: performing weighted fusion of the interest vectors of the similar users to serve as the interest vector of the new user, wherein the weights are the user similarities calculated in step J1;
step J4: calculating the similarity between the new user interest vector obtained in step J3 and the data item feature vectors, and recommending the TOP-N data items with the highest similarity to the new user.
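Steps J1 to J3 of the cold-start procedure can be sketched as follows. Two points are assumptions rather than part of the claim: cosine similarity is used for the user-to-user comparison in step J1, and the fused vector is normalised by the sum of the similarities. The function names and example users are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cold_start_interest(new_user_basic, known_users, n):
    """Estimate an interest vector for a new user from similar existing users.

    known_users: dict user_id -> (basic_feature_vector, interest_vector).
    """
    # Step J1: similarity between the new user's basic features and each known user.
    scored = [
        (cosine_similarity(new_user_basic, basic), interest)
        for basic, interest in known_users.values()
    ]
    # Step J2: keep the n most similar users.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    neighbours = scored[:n]
    if not neighbours:
        return []
    # Step J3: similarity-weighted fusion (normalising by the weight sum is an assumption).
    dim = len(neighbours[0][1])
    total = sum(sim for sim, _ in neighbours)
    if total == 0.0:
        return [0.0] * dim
    return [
        sum(sim * interest[i] for sim, interest in neighbours) / total
        for i in range(dim)
    ]


# Hypothetical existing users: (basic feature vector, interest vector).
existing = {
    "user_a": ([1.0, 0.0], [1.0, 0.0]),
    "user_b": ([0.0, 1.0], [0.0, 1.0]),
    "user_c": ([1.0, 0.0], [0.0, 1.0]),
}
new_interest = cold_start_interest([1.0, 0.0], existing, n=2)
```

For step J4, the fused vector is then compared with the data item feature vectors in the same way as the interest vector of an established user.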
CN202311403044.4A 2023-10-27 2023-10-27 Urban waterlogging point information recommendation method based on government user portrait Active CN117149859B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311403044.4A CN117149859B (en) 2023-10-27 2023-10-27 Urban waterlogging point information recommendation method based on government user portrait

Publications (2)

Publication Number Publication Date
CN117149859A CN117149859A (en) 2023-12-01
CN117149859B CN117149859B (en) 2024-02-23

Family

ID=88912385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311403044.4A Active CN117149859B (en) 2023-10-27 2023-10-27 Urban waterlogging point information recommendation method based on government user portrait

Country Status (1)

Country Link
CN (1) CN117149859B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110110221A (en) * 2019-03-22 2019-08-09 浙江非线数联科技有限公司 Government data intelligent recommendation method and system
CN112084334A (en) * 2020-09-04 2020-12-15 中国平安财产保险股份有限公司 Corpus label classification method and device, computer equipment and storage medium
CN112380426A (en) * 2020-10-23 2021-02-19 南京邮电大学 Interest point recommendation method and system based on graph embedding and user long-term and short-term interest fusion
CN115310425A (en) * 2022-10-08 2022-11-08 浙江浙里信征信有限公司 Policy text analysis method based on policy text classification and key information identification
CN115719164A (en) * 2022-11-22 2023-02-28 中国长江三峡集团有限公司 Urban waterlogging-prone point identification method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
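Fu Xiaokang. Urban rainstorm disaster situational awareness based on social media topic evolution and multimodal feature fusion. China Doctoral Dissertations Full-text Database, Engineering Science and Technology II; abstract and sections 2.3, 3.2, 5.2, 6.2 *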

Also Published As

Publication number Publication date
CN117149859A (en) 2023-12-01

Similar Documents

Publication Publication Date Title
Yang et al. A hybrid retrieval-generation neural conversation model
CN110222160B (en) Intelligent semantic document recommendation method and device and computer readable storage medium
CN110633409B (en) Automobile news event extraction method integrating rules and deep learning
CN106997382B (en) Innovative creative tag automatic labeling method and system based on big data
CN109829104B (en) Semantic similarity based pseudo-correlation feedback model information retrieval method and system
CN114238573B (en) Text countercheck sample-based information pushing method and device
Kaushik et al. A comprehensive study of text mining approach
CN111680173A (en) CMR model for uniformly retrieving cross-media information
CN111950273A (en) Network public opinion emergency automatic identification method based on emotion information extraction analysis
CN111061939B (en) Scientific research academic news keyword matching recommendation method based on deep learning
CN113962293A (en) LightGBM classification and representation learning-based name disambiguation method and system
Anoop et al. A topic modeling guided approach for semantic knowledge discovery in e-commerce
Sandhiya et al. A review of topic modeling and its application
Hu et al. Retrieval-based language model adaptation for handwritten Chinese text recognition
CN112685440B (en) Structural query information expression method for marking search semantic role
CN117149859B (en) Urban waterlogging point information recommendation method based on government user portrait
CN111625722B (en) Talent recommendation method, system and storage medium based on deep learning
CN111339303B (en) Text intention induction method and device based on clustering and automatic abstracting
Nsaif et al. Political Post Classification based on Firefly and XG Boost
CN113516202A (en) Webpage accurate classification method for CBL feature extraction and denoising
Foong et al. Document clustering using hybrid lda-kmeans
Li et al. Deep recommendation based on dual attention mechanism
Khondokar et al. Boosting Text Classification Performance for Unlabeled Data with Semi-Supervised Learning
Guo et al. CNN-Based Model for Chinese Information Processing and Its Application in Large-Scale Book Purchasing
Gupta Progress in Information Retrieval: An Extensive Analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant