CN113722582A - Recommendation method, system, program product and medium based on pet feature tag - Google Patents

Publication number
CN113722582A
Authority
CN
China
Prior art keywords
pet
lost
similarity
feature
keywords
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110867851.6A
Other languages
Chinese (zh)
Inventor
王星伟
付磊
张可欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Heilongjiang Advanced Information Technology Co ltd
Original Assignee
Heilongjiang Advanced Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Heilongjiang Advanced Information Technology Co ltd
Priority to CN202110867851.6A
Publication of CN113722582A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3346 Query execution using probabilistic model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates

Abstract

The invention discloses a recommendation method, a recommendation system, a program product and a medium based on pet feature labels, wherein the method comprises the following steps: acquiring a lost pet feature tag based on lost pet information input by a user; calculating the similarity between the lost pet feature label and the pet feature labels in the lost pet database; if the similarity is higher than a similarity threshold value, pushing pet information corresponding to the pet feature labels in the lost pet database to a user; the invention aims to accelerate the speed of finding lost pets.

Description

Recommendation method, system, program product and medium based on pet feature tag
Technical Field
The invention relates to the field of big data processing, and in particular to a recommendation method, system, program product and medium based on pet feature tags.
Background
Pets have increasingly become part of our lives and our companions, and when one is unfortunately lost, its owner searches for it anxiously. With the rapid development of science and technology, people can publish information on the internet to widen the search. However, the pet search applications currently in wide use on the market cannot accurately extract and filter, from massive information data resources, the information that best meets the user's needs according to the pet information the user provides, so the problem of information overload remains unsolved.
Because pet information with high similarity to the lost pet cannot be accurately recommended to the user, the user cannot be helped efficiently: the time the user spends screening massive amounts of pet information increases considerably, the search for the pet slows down, and users make less use of the application.
Disclosure of Invention
In view of this, embodiments of the present application provide a recommendation method, system, program product and medium based on pet feature tags, aiming to accelerate the speed of finding lost pets.
The embodiment of the application provides a recommendation method based on pet feature labels, which comprises the following steps:
acquiring a lost pet feature tag based on lost pet information input by a user;
calculating the similarity between the lost pet feature label and the pet feature labels in the lost pet database;
and if the similarity is higher than a similarity threshold value, pushing pet information corresponding to the pet feature tag in the lost pet database to a user.
In one embodiment, the acquiring of the lost pet feature tag based on the lost pet information input by the user includes:
performing basic text-analysis processing on the lost pet information input by the user to obtain candidate keywords;
extracting keywords from the candidate keywords by using a keyword extraction algorithm;
and classifying the keywords by using a machine learning-based classification algorithm to obtain the lost pet feature tag.
In an embodiment, the extracting of keywords from the candidate keywords by using a keyword extraction algorithm includes:
counting the word frequency (term frequency, TF) of the candidate keywords in the lost pet information;
calculating the inverse document frequency (IDF) of the candidate keywords in the lost pet information;
calculating the product of the word frequency and the inverse document frequency as the weight of each candidate keyword;
constructing a graph model based on the candidate keywords, and normalizing the weights of the candidate keywords to serve as the initial weights of the graph model;
iterating on the initial weights until they converge to obtain final weight values;
and sorting the candidate keywords in descending order of final weight value, and taking a preset number of top-ranked candidate keywords as the keywords.
In an embodiment, the classifying of the keywords by using a machine learning-based classification algorithm to obtain the lost pet feature tag includes:
classifying the keywords by Gaussian naive Bayes and binomial distribution to obtain the lost pet feature tag, which specifically comprises:
acquiring the prior probability, marginal likelihood and likelihood of the keywords;
calculating the posterior probability of each keyword for each category by using the Gaussian naive Bayes formula, based on the prior probability, the marginal likelihood and the likelihood of the keyword;
and assigning each keyword to the category with the maximum posterior probability to obtain the lost pet feature tag.
In an embodiment, the performing of basic text-analysis processing on the lost pet information input by the user to obtain candidate keywords includes:
segmenting the lost pet information input by the user by using a word segmentation technique based on string matching to generate a word segmentation result;
performing part-of-speech tagging on the word segmentation result by using a machine learning stochastic tagging algorithm to generate a part-of-speech tagging result;
and removing stop words from the part-of-speech tagging result to obtain the candidate keywords.
In an embodiment, the performing of part-of-speech tagging on the word segmentation result by using a machine learning stochastic tagging algorithm to generate a part-of-speech tagging result includes:
performing part-of-speech tagging on the word segmentation result by using a hidden Markov model to generate the part-of-speech tagging result, which specifically comprises:
acquiring an initial state probability vector, a state transition probability matrix and an observation probability matrix from a part-of-speech tagging database;
calculating, by using the Viterbi algorithm, the state sequence with the maximum part-of-speech probability for the word segmentation result, based on the initial state probability vector, the state transition probability matrix and the observation probability matrix;
and performing part-of-speech tagging on the word segmentation result by using the state sequence to generate the part-of-speech tagging result.
In one embodiment, said calculating the similarity between said lost pet feature tag and the pet feature tags in the lost pet database comprises:
and calculating the similarity between the lost pet feature label and the pet feature labels in the lost pet database by using a cosine similarity algorithm.
In order to achieve the above object, there is also provided a recommendation system based on pet feature tags, the system including:
the data acquisition module is used for acquiring a lost pet feature tag based on lost pet information input by a user;
the similarity calculation module is used for calculating the similarity between the lost pet feature tag and the pet feature tags in the lost pet database;
and the recommending module is used for pushing the pet information corresponding to the pet feature label in the lost pet database to a user if the similarity is higher than a similarity threshold value.
To achieve the above object, there is also provided a computer program product comprising a computer program, which when executed by a processor, implements the steps of any of the pet characteristic tag-based recommendation methods described above.
In order to achieve the above object, there is also provided a computer storage medium having a program for a recommendation method based on pet feature tags stored thereon, wherein the program for a recommendation method based on pet feature tags is executed by a processor to implement any of the steps of the recommendation method based on pet feature tags.
One or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages: acquiring a lost pet feature tag based on lost pet information input by a user; by carrying out related operations of natural language processing on lost pet information input by a user, the lost pet feature label is correctly obtained, and the correctness of subsequent similarity calculation is ensured.
Calculating the similarity between the lost pet feature label and the pet feature labels in the lost pet database; by utilizing the cosine similarity method, the similarity between the lost pet feature tag and the pet feature tag in the lost pet database is correctly calculated, so that the correctness of the qualified pet information in the screened lost pet database is ensured.
And if the similarity is higher than a similarity threshold value, pushing pet information corresponding to the pet feature tag in the lost pet database to a user. And through the screening of the similarity threshold, the pet information meeting the conditions in the lost pet database is pushed to the user, so that the speed of finding the lost pet by the user is increased.
Drawings
FIG. 1 is a schematic flowchart illustrating a first embodiment of a recommendation method based on pet feature tags according to the present application;
FIG. 2 is a flowchart illustrating a specific implementation step of step S110 in a first embodiment of the recommendation method based on pet feature tags according to the present application;
FIG. 3 is a flowchart illustrating the detailed implementation steps of step S112 of the recommendation method based on pet feature tags according to the present application;
FIG. 4 is a flowchart illustrating the detailed implementation steps of step S113 of the recommendation method based on pet feature tags according to the present application;
FIG. 5 is a flowchart illustrating the detailed implementation steps of step S111 of the recommendation method based on pet feature tags according to the present application;
FIG. 6 is a flowchart illustrating the detailed implementation steps of step S1112 of the recommendation method based on pet feature tags according to the present application;
FIG. 7 is a flowchart illustrating the detailed implementation step of step S120 in the first embodiment of the recommendation method based on pet feature tags according to the present application;
FIG. 8 is a schematic diagram of a recommendation system based on pet feature tags according to the present application;
FIG. 9 is a schematic diagram of a recommendation device based on pet feature tags according to the present application.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The main solution of the embodiment of the invention is as follows: acquiring a lost pet feature tag based on lost pet information input by a user; calculating the similarity between the lost pet feature label and the pet feature labels in the lost pet database; if the similarity is higher than a similarity threshold value, pushing pet information corresponding to the pet feature labels in the lost pet database to a user; the invention aims to accelerate the speed of finding lost pets.
In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.
Referring to FIG. 1, FIG. 1 shows a first embodiment of the recommendation method based on pet feature tags according to the present application. The method includes:
Step S110: acquiring the lost pet feature tag based on the lost pet information input by the user.
Specifically, the user can type a text description of the lost pet into the application program; alternatively, the user's spoken description of the lost pet can be converted into text by speech recognition; alternatively, the lost pet feature tag can be obtained by applying image recognition to a picture of the lost pet. For example, the user uploads a picture of the lost pet to the application program, and image recognition extracts feature tags such as breed, color, hair length, height, weight and age.
Step S120: calculating the similarity between the lost pet feature tag and the pet feature tags in the lost pet database.
Specifically, the lost pet database may be a database constructed from a large amount of lost pet data, and may include textual descriptions and pictures of lost pets.
Specifically, in an embodiment, the information about the pet lost by the current user is obtained and its lost pet feature tag is extracted; the pet feature tags in the lost pet database are extracted as well. The similarity between the current user's lost pet feature tag and each pet feature tag in the lost pet database is calculated, and the pet information in the lost pet database is screened by similarity so that the best-matching lost pets are found, which further improves the user's retrieval experience.
Step S130: if the similarity is higher than a similarity threshold, pushing the pet information corresponding to the pet feature tag in the lost pet database to the user.
Specifically, in this embodiment the similarity threshold may be 0.7, but it is not limited to this value and may be adjusted dynamically according to the user's choice. The higher the threshold, the more precise the screened pet information; the lower the threshold, the broader the screened pet information. Through similarity calculation, the information that best meets the user's needs is extracted and filtered from the massive information data resources in the lost pet database, which speeds up finding the lost pet.
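By way of non-limiting illustration, steps S120 and S130 can be sketched as follows. The binary encoding of feature tags into vectors, the function names and the sample data are assumptions made for this sketch; the application does not specify how tags are vectorized.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty tag set is treated as matching nothing
    return dot / (norm_a * norm_b)

def encode(tags, vocabulary):
    """Binary bag-of-tags vector (assumed encoding, not from the source)."""
    return [1.0 if tag in tags else 0.0 for tag in vocabulary]

def recommend(query_tags, database, threshold=0.7):
    """Return (pet_id, similarity) pairs above the threshold, best first.

    `database` is a hypothetical mapping of pet id to its set of feature tags.
    """
    vocabulary = sorted(set(query_tags).union(*database.values()))
    query_vector = encode(query_tags, vocabulary)
    hits = []
    for pet_id, tags in database.items():
        score = cosine_similarity(query_vector, encode(tags, vocabulary))
        if score > threshold:  # step S130: push only sufficiently similar pets
            hits.append((pet_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical sample data
database = {
    "A": {"corgi", "brown", "short hair"},
    "B": {"husky", "grey", "long hair"},
    "C": {"corgi", "brown", "long hair"},
}
query = {"corgi", "brown", "short hair"}
print(recommend(query, database))        # threshold 0.7
print(recommend(query, database, 0.6))   # a lower threshold returns more pets
```

Lowering the threshold from 0.7 to 0.6 in this sample lets pet "C", which shares two of three tags with the query, into the result, matching the trade-off between precision and breadth described above.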
In the above embodiment, there are advantageous effects of: acquiring a lost pet feature tag based on lost pet information input by a user; by carrying out related operations of natural language processing on lost pet information input by a user, the lost pet feature label is correctly obtained, and the correctness of subsequent similarity calculation is ensured.
Calculating the similarity between the lost pet feature label and the pet feature labels in the lost pet database; by utilizing the cosine similarity method, the similarity between the lost pet feature tag and the pet feature tag in the lost pet database is correctly calculated, so that the correctness of the qualified pet information in the screened lost pet database is ensured.
And if the similarity is higher than a similarity threshold value, pushing pet information corresponding to the pet feature tag in the lost pet database to a user. And through the screening of the similarity threshold, the pet information meeting the conditions in the lost pet database is pushed to the user, so that the speed of finding the lost pet by the user is increased.
Referring to FIG. 2, FIG. 2 shows the specific implementation steps of step S110 in the first embodiment of the recommendation method based on pet feature tags according to the present application. The acquiring of the lost pet feature tag based on the lost pet information input by the user includes:
Step S111: performing basic text-analysis processing on the lost pet information input by the user to obtain candidate keywords.
Specifically, the basic text-analysis processing may consist of Chinese word segmentation, part-of-speech tagging and stop-word removal; the candidate keywords are obtained through this series of operations.
Step S112: extracting keywords from the candidate keywords by using a keyword extraction algorithm.
Specifically, the keyword extraction method may be based on statistical features, on a word graph model, or on a topic model; these methods may also be combined.
Step S113: classifying the keywords by using a machine learning-based classification algorithm to obtain the lost pet feature tag.
Specifically, the machine learning-based classification algorithm may be naive Bayes classification, a support vector machine (SVM), a k-nearest neighbor (KNN) algorithm, or an artificial neural network algorithm. An artificial neural network (ANN) is a computational model formed by a large number of interconnected nodes (also called neurons). Each node represents a particular output function, called the activation function. Each connection between two nodes carries a weighting value, called a weight, for the signal passing through it; the weights are the equivalent of the network's memory. The output of the network depends on its connection topology, its weights and its activation functions. Specifically, in this embodiment, the neural network-based classification algorithm may be a classification algorithm based on a recurrent neural network, or the like, which is not limited herein and may be adjusted dynamically according to service requirements.
In the above embodiment, there are advantageous effects of: extracting lost pet feature labels from natural language input by a user, and performing in a lost pet database with the lost pet feature labels
Referring to FIG. 3, FIG. 3 shows the specific implementation steps of step S112 of the recommendation method based on pet feature tags according to the present application. The extracting of keywords from the candidate keywords by using a keyword extraction algorithm includes:
Step S1121: counting the word frequency (term frequency, TF) of the candidate keywords in the lost pet information.
Specifically, the word frequency is computed with the formula shown in the following figure of the original publication:
Figure BDA0003187680160000071
Step S1122: calculating the inverse document frequency (IDF) of the candidate keywords over the lost pet information in the lost pet database.
Specifically, the inverse document frequency is computed with the formula shown in the following figure of the original publication:
Figure BDA0003187680160000072
Step S1123: calculating the product of the word frequency and the inverse document frequency as the weight of each candidate keyword.
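Since the two formulas are available here only as image references, the following is a minimal sketch of steps S1121 to S1123 under stated assumptions: the word frequency is taken as the count of a word divided by the number of words in the user's text, and the inverse document frequency uses one common smoothed variant, ln((1 + N) / (1 + n)) + 1, where N is the number of database entries and n the number of entries containing the word. The formulas in the original figures may differ, and all function names are hypothetical.

```python
import math
from collections import Counter

def term_frequency(tokens):
    """TF of each word: its count divided by the number of tokens."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}

def inverse_document_frequency(word, documents):
    """Smoothed IDF (assumed variant): ln((1 + N) / (1 + n)) + 1."""
    containing = sum(1 for doc in documents if word in doc)
    return math.log((1 + len(documents)) / (1 + containing)) + 1

def tfidf_weights(tokens, documents):
    """Step S1123: weight of each candidate keyword = TF * IDF."""
    tf = term_frequency(tokens)
    return {w: tf[w] * inverse_document_frequency(w, documents) for w in tf}

# Hypothetical data: the user's candidate keywords and a tiny database
user_tokens = ["corgi", "lost", "corgi", "brown"]
database_docs = [["corgi", "brown"], ["husky", "grey"], ["corgi", "grey"]]
weights = tfidf_weights(user_tokens, database_docs)
for word, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
    print(word, round(weight, 4))
```

The smoothing keeps every weight strictly positive, which is convenient for the normalization that follows; a word that never appears in the database receives the largest inverse document frequency.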
Step S1124: constructing a graph model based on the candidate keywords, and normalizing the weights of the candidate keywords to obtain the initial weights of the graph model.
Specifically, a graph model is built from the candidate keywords following the TextRank approach, and the weights of the candidate keywords are normalized to serve as the initial weights of the graph model. Normalization maps the data to be processed into a required range by some algorithm; it makes later data processing more convenient and speeds up convergence when the program runs. In this embodiment, the normalized weight of each candidate keyword may be a decimal in the interval (0, 1).
It should be further noted that, in another embodiment, the word frequency of the candidate keywords alone may be normalized to serve as the initial weights of the graph model; alternatively, the inverse document frequency of the candidate keywords in the lost pet database may be normalized to serve as the initial weights of the graph model.
Step S1125: iterating on the initial weights until they converge to obtain final weight values.
Specifically, if the weights have not converged, another iteration is performed; once the weights converge, the final weight values are obtained.
Step S1126: sorting the candidate keywords in descending order of final weight value, and taking a preset number of top-ranked candidate keywords as the keywords.
Specifically, the top 4 candidate keywords may be taken as the keywords, but the preset number is not limited herein and may be adjusted according to user requirements.
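Steps S1124 to S1126 can be sketched roughly as below. The co-occurrence window, the damping factor of 0.85, the convergence tolerance and all function names are assumptions borrowed from the usual unweighted TextRank formulation; the application does not fix these details.

```python
def build_graph(tokens, window=2):
    """Undirected co-occurrence graph: words within `window` positions are linked."""
    graph = {word: set() for word in tokens}
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph

def normalize(weights):
    """Scale the weights so that they sum to 1."""
    total = sum(weights.values())
    return {word: value / total for word, value in weights.items()}

def textrank(graph, initial, damping=0.85, tol=1e-6, max_iter=200):
    """Iterate the TextRank update from `initial` until the scores stop changing."""
    scores = dict(initial)
    for _ in range(max_iter):
        updated = {}
        for word in graph:
            incoming = sum(scores[n] / len(graph[n]) for n in graph[word])
            updated[word] = (1 - damping) + damping * incoming
        converged = max(abs(updated[w] - scores[w]) for w in graph) < tol
        scores = updated
        if converged:
            break
    return scores

def top_keywords(scores, k=4):
    """Step S1126: sort in descending order of score and keep the first k words."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical candidate keywords and hypothetical TF-IDF weights
tokens = ["corgi", "brown", "corgi", "short", "corgi", "collar"]
graph = build_graph(tokens, window=1)
initial = normalize({"corgi": 3.0, "brown": 1.0, "short": 1.0, "collar": 1.0})
scores = textrank(graph, initial)
print(top_keywords(scores, k=2))
```

In this plain formulation the initial weights influence how quickly the iteration settles rather than where it settles, which is consistent with the remark above that normalization accelerates convergence.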
In the above embodiment, there are advantageous effects of: the keywords are extracted by combining the TF-IDF algorithm and the TextRank algorithm, so that the accuracy of keyword extraction is improved.
Referring to FIG. 4, FIG. 4 shows the specific implementation steps of step S113 of the recommendation method based on pet feature tags according to the present application. The classifying of the keywords by using a machine learning-based classification algorithm to obtain the lost pet feature tag includes:
classifying the keywords by Gaussian naive Bayes and binomial distribution to obtain the lost pet feature tag, which specifically comprises the following steps.
Specifically, in this embodiment, the lost pet feature tag can also be obtained by classifying the keywords using Gaussian naive Bayes and normal distribution.
Step S1131: acquiring the prior probability, marginal likelihood and likelihood of the keywords.
Specifically, the prior probability is calculated as P(class) = number of keywords in each class / total number of keywords;
the marginal likelihood is calculated as P(data) = number of words similar to the keyword / total number of keywords;
the likelihood is calculated as P(data | class) = number of words similar to the keyword in each class / total number of words in each class.
Step S1132: calculating the posterior probability of each keyword for each category by using the Gaussian naive Bayes formula, based on the prior probability, the marginal likelihood and the likelihood of the keyword.
Specifically, the posterior probability P(class | data) of the keyword for each category is calculated by using the Gaussian naive Bayes formula, which is as follows:
Figure BDA0003187680160000081
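The formula itself survives here only as an image reference. Assuming it is the standard Bayes rule applied to the three quantities named in step S1131 (prior probability, marginal likelihood and likelihood), it would read:

```latex
P(\mathrm{class} \mid \mathrm{data})
  = \frac{P(\mathrm{data} \mid \mathrm{class}) \, P(\mathrm{class})}{P(\mathrm{data})}
```

Because P(data) is the same for every category, it does not change which category has the largest posterior probability.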
Step S1133: assigning each keyword to the category with the maximum posterior probability to obtain the lost pet feature tag.
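A minimal sketch of steps S1131 to S1133 is given below under stated assumptions. It uses discrete, count-style likelihoods rather than Gaussian densities, and it computes the marginal likelihood by the law of total probability instead of the counting rule given above; the category names, probabilities and function name are hypothetical.

```python
def classify(keyword, priors, likelihoods):
    """Return (best_category, posteriors) for one keyword.

    priors:      {category: P(category)}
    likelihoods: {category: {keyword: P(keyword | category)}}
    """
    # Numerator of Bayes' rule for every category
    joint = {c: likelihoods[c].get(keyword, 0.0) * priors[c] for c in priors}
    marginal = sum(joint.values())  # P(data), assumed via total probability
    if marginal == 0:
        return None, {}  # keyword unseen in every category
    posteriors = {c: value / marginal for c, value in joint.items()}
    best = max(posteriors, key=posteriors.get)  # step S1133: maximum posterior
    return best, posteriors

# Hypothetical model with two feature categories
priors = {"breed": 0.5, "color": 0.5}
likelihoods = {
    "breed": {"corgi": 0.40, "brown": 0.01},
    "color": {"corgi": 0.01, "brown": 0.50},
}
for word in ["corgi", "brown", "unknown"]:
    print(word, classify(word, priors, likelihoods)[0])
```

A keyword that has zero likelihood under every category is left unclassified here; a practical system would more likely apply smoothing, which is outside what the text describes.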
In the above embodiment, there are advantageous effects of: through classifying the keywords, more accurate pet feature labels are obtained, so that the correctness of similarity calculation is ensured, the pet information recommended to a user is ensured to meet the user requirements better, and the search experience of the user is improved.
Referring to FIG. 5, FIG. 5 shows the specific implementation steps of step S111 of the recommendation method based on pet feature tags according to the present application. The performing of basic text-analysis processing on the lost pet information input by the user to obtain candidate keywords includes:
Step S1111: segmenting the lost pet information input by the user by using a word segmentation technique based on string matching to generate a word segmentation result.
Specifically, the word segmentation can be performed with a forward maximum matching algorithm or a reverse maximum matching algorithm.
The forward maximum matching algorithm comprises the following steps:
acquiring a word list (dictionary) for the lost pet information input by the user;
obtaining the maximum length from the lengths of the words in the word list;
starting from the first character of the lost pet information input by the user, sliding a window of the maximum length forward over the text; when the characters in the window match at least one word in the word list, splitting those characters off as a word;
and when the characters in the window match no word in the word list and the window has slid to the last character of the lost pet information input by the user, reducing the maximum length by one and repeating the above steps until the maximum length is zero.
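For illustration, the sketch below implements the common greedy variant of forward maximum matching, in which the window shrinks at each position rather than over whole passes of the text as described above. The word list and sample sentence are hypothetical.

```python
def forward_max_match(text, vocabulary):
    """Greedy forward maximum matching over `text` using `vocabulary`."""
    max_len = max((len(word) for word in vocabulary), default=1)
    result, start = [], 0
    while start < len(text):
        # Try the longest window first, shrinking by one character each time
        for size in range(min(max_len, len(text) - start), 0, -1):
            piece = text[start:start + size]
            # A single character is always accepted so the scan can advance
            if size == 1 or piece in vocabulary:
                result.append(piece)
                start += size
                break
    return result

# Hypothetical word list: brown, golden retriever (short and long form), lost
vocabulary = {"棕色", "金毛", "金毛犬", "走失"}
print(forward_max_match("棕色金毛犬走失", vocabulary))
```

Trying the longest window first is what makes the three-character word win over its two-character prefix in this sample.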
The reverse maximum matching algorithm comprises the following steps:
acquiring a word list (dictionary) for the lost pet information input by the user;
obtaining the maximum length from the lengths of the words in the word list;
starting from the last character of the lost pet information input by the user, sliding a window of the maximum length backward over the text; when the characters in the window match at least one word in the word list, splitting those characters off as a word;
and when the characters in the window match no word in the word list and the window has slid to the first character of the lost pet information input by the user, reducing the maximum length by one and repeating the above steps until the maximum length is zero.
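The reverse direction can be sketched in the same greedy style, scanning from the end of the text; as before, this is the common per-position variant rather than a literal transcription of the steps above, and the word list is hypothetical.

```python
def reverse_max_match(text, vocabulary):
    """Greedy reverse maximum matching: scan `text` from its last character."""
    max_len = max((len(word) for word in vocabulary), default=1)
    result, end = [], len(text)
    while end > 0:
        # Try the longest window ending at `end` first
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            # A single character is always accepted so the scan can advance
            if size == 1 or piece in vocabulary:
                result.append(piece)
                end -= size
                break
    result.reverse()  # words were collected back to front
    return result

# Classic ambiguous sentence: "study the origin of life"
vocabulary = {"研究", "研究生", "生命", "的", "起源"}
print(reverse_max_match("研究生命的起源", vocabulary))
```

On this sentence the reverse scan keeps "生命" (life) intact, whereas a forward scan would first consume "研究生" (graduate student); differences of this kind are what make comparing the two directions informative.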
In this embodiment, the lost pet information input by the user may also be segmented by combining the forward and reverse maximum matching algorithms. Specifically, where the segmentation result of the forward maximum matching algorithm and that of the reverse maximum matching algorithm contain the same segments, the lost pet information input by the user may be segmented on the basis of those shared segments.
Step S1112: performing part-of-speech tagging on the word segmentation result by using a machine learning stochastic tagging algorithm to generate a part-of-speech tagging result.
Specifically, part-of-speech tagging is essentially a classification problem: the words in a corpus are classified by part of speech. The part of speech of a word is determined by its meaning, its form and its grammatical function in the language it belongs to. Taking Chinese as an example, its part-of-speech system has 18 subcategories: 7 kinds of substantives, 4 kinds of predicates, 5 kinds of function words, plus pronouns and interjections.
Specifically, in this embodiment, the machine learning algorithm for part-of-speech tagging may be a sequence model, namely at least one of the hidden Markov model (HMM), the maximum entropy Markov model (MEMM), conditional random fields (CRFs) and other members of the generalized Markov model family, or a deep learning algorithm represented by the recurrent neural network (RNN). In addition, some conventional machine learning classifiers, such as support vector machines (SVMs), can also be used for part-of-speech tagging after adaptation.
In this embodiment, the part-of-speech tagging method is not limited to the above-described method, and may be set according to specific situations.
It should be further noted that, after part-of-speech tagging, the part-of-speech tags of the word segmentation result can also be used to screen candidate keywords. For example, nouns and adjectives in the lost pet information input by the user tend to represent the characteristics of the lost pet best, so they are given priority when candidate keywords are selected subsequently, which helps ensure that the candidate keywords are correct.
Step S1113: removing stop words from the part-of-speech tagging result to obtain the candidate keywords.
Specifically, the part-of-speech tagging result may be compared with a stop-word list; if a segmented word in the part-of-speech tagging result is the same as a word in the stop-word list, it is removed from the part-of-speech tagging result. In this way, modal particles and other stop words in the lost pet information input by the user, such as "what" and "o", can be removed.
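Step S1113, together with the optional preference for nouns and adjectives mentioned earlier, might look like the sketch below. The stop-word list, the tag names "n" and "a" and the sample input are assumptions; English tokens are used only to keep the example readable.

```python
STOP_WORDS = {"what", "o", "the", "a"}   # hypothetical stop-word list
KEEP_POS = {"n", "a"}                    # assumed tags for noun and adjective

def remove_stop_words(tagged, stop_words=STOP_WORDS):
    """Drop every (word, tag) pair whose word appears in the stop-word list."""
    return [(word, tag) for word, tag in tagged if word.lower() not in stop_words]

def prefer_content_words(tagged, keep_pos=KEEP_POS):
    """Keep only words whose part-of-speech tag is in `keep_pos`."""
    return [word for word, tag in tagged if tag in keep_pos]

# Hypothetical part-of-speech tagging result
tagged = [("o", "e"), ("brown", "a"), ("corgi", "n"), ("what", "r"), ("lost", "v")]
filtered = remove_stop_words(tagged)
print(filtered)
print(prefer_content_words(filtered))
```

Keeping the tag alongside each word until the last step lets the same result feed both the stop-word filter and the part-of-speech screening.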
In the above embodiment, there are advantageous effects of: through word segmentation operation, part-of-speech tagging operation and stop word removing operation, candidate keywords are correctly obtained, so that correct obtaining of the lost pet feature tag is guaranteed.
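As a non-limiting illustration, the three operations above (word segmentation, part-of-speech tagging, stop-word removal) can be sketched in Python as follows. The lexicon, the stop-word list, the tag names and the function names are all hypothetical and are not taken from the application. Part-of-speech tagging is reduced here to a dictionary lookup as a stand-in for the machine learning tagger, and nouns and adjectives are kept by a strict filter rather than merely being given priority.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicon: word -> part-of-speech tag (illustrative only).
LEXICON: Dict[str, str] = {
    "我": "pronoun", "的": "particle", "白色": "adjective", "小狗": "noun",
    "狗": "noun", "丢": "verb", "了": "particle", "啊": "interjection",
}
# Hypothetical stop-word list and the tags assumed to be kept as candidates.
STOP_WORDS: Set[str] = {"我", "的", "了", "啊"}
KEEP_TAGS: Set[str] = {"noun", "adjective"}


def segment(text: str, lexicon: Dict[str, str]) -> List[str]:
    """Forward maximum matching: take the longest lexicon word at each position."""
    max_len = max(len(word) for word in lexicon)
    words: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan makes progress.
            if piece in lexicon or size == 1:
                words.append(piece)
                i += size
                break
    return words


def tag(words: List[str], lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Stand-in tagger: look each word up in the lexicon."""
    return [(word, lexicon.get(word, "unknown")) for word in words]


def candidate_keywords(text: str) -> List[str]:
    """Segment, tag, drop stop words, then keep nouns and adjectives."""
    tagged = tag(segment(text, LEXICON), LEXICON)
    without_stop_words = [(w, t) for w, t in tagged if w not in STOP_WORDS]
    return [w for w, t in without_stop_words if t in KEEP_TAGS]


print(candidate_keywords("我的白色小狗丢了啊"))
```

With this toy data, the input "我的白色小狗丢了啊" is segmented into seven words, the four stop words are dropped, and the verb "丢" is filtered out by its tag, leaving "白色" and "小狗" as candidate keywords.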
Referring to fig. 6, fig. 6 shows the specific implementation steps of step S1112 of the recommendation method based on pet feature tags according to the present application, where performing part-of-speech tagging on the word segmentation result by using a machine learning stochastic tagging algorithm to generate a part-of-speech tagging result includes:
performing part-of-speech tagging on the word segmentation result by using a hidden Markov model to generate the part-of-speech tagging result.
In particular, a Hidden Markov Model (HMM) is a statistical model used to describe a Markov process with hidden, unknown parameters. The difficulty lies in determining the hidden parameters of the process from the observable parameters. The method specifically comprises the following steps:
Step S11121: acquiring an initial state probability vector, a state transition probability matrix and an observation probability matrix from the part-of-speech tagging database.
Specifically, the hidden Markov model is trained on the part-of-speech tagging database, and the initial state probability vector, the state transition probability matrix and the observation probability matrix are obtained by statistical counting.
Step S11122: calculating, by using the Viterbi algorithm, the state sequence with the maximum part-of-speech probability for the word segmentation result, based on the initial state probability vector, the state transition probability matrix and the observation probability matrix.
Specifically, the solving process uses the Viterbi algorithm to find the state sequence with the highest part-of-speech probability.
Step S11123: performing part-of-speech tagging on the word segmentation result by using the state sequence to generate the part-of-speech tagging result.
Specifically, each segmented word in the word segmentation result is labeled with the part of speech given by the maximum-probability state sequence, thereby generating the part-of-speech tagging result.
The above embodiment has the following advantageous effects: accurate part-of-speech tagging is performed on the word segmentation result by using a hidden Markov model, which helps ensure that the candidate keywords are extracted correctly.
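As a non-limiting illustration of steps S11121 to S11123, the following Python sketch applies the Viterbi algorithm to a three-word segmentation result. The probability values, the tag set and the function name `viterbi` are hypothetical. In practice the three parameter sets would be estimated by counting over a part-of-speech tagging database, and log-probabilities would normally be used to avoid underflow on long sentences.

```python
from typing import Dict, List


def viterbi(
    obs: List[str],
    states: List[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the most probable hidden state (tag) sequence for the observed words."""
    # best[t][s]: probability of the best path that ends in state s at position t.
    best = [{s: start_p[s] * emit_p[s].get(obs[0], 0.0) for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back: List[Dict[str, str]] = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score * emit_p[s].get(obs[t], 0.0)
            back[t][s] = prev
    # Trace back from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical parameters (initial vector, transition matrix, observation matrix).
STATES = ["adjective", "noun", "verb"]
START_P = {"adjective": 0.5, "noun": 0.4, "verb": 0.1}
TRANS_P = {
    "adjective": {"adjective": 0.2, "noun": 0.7, "verb": 0.1},
    "noun": {"adjective": 0.1, "noun": 0.3, "verb": 0.6},
    "verb": {"adjective": 0.3, "noun": 0.4, "verb": 0.3},
}
EMIT_P = {
    "adjective": {"白色": 0.8, "丢": 0.2},
    "noun": {"小狗": 0.9, "白色": 0.1},
    "verb": {"丢": 0.9, "小狗": 0.1},
}

words = ["白色", "小狗", "丢"]
tags = viterbi(words, STATES, START_P, TRANS_P, EMIT_P)
print(list(zip(words, tags)))
```

With these toy parameters the decoded sequence is adjective, noun, verb, so each segmented word receives the tag at the corresponding position of the state sequence.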
Referring to fig. 7, fig. 7 shows the specific implementation step of step S120 in a first embodiment of the recommendation method based on pet feature tags according to the present application, where calculating the similarity between the lost pet feature tag and the pet feature tags in the lost pet database includes:
Step S121: calculating the similarity between the lost pet feature tag and the pet feature tags in the lost pet database by using a cosine similarity algorithm.
Specifically, a vector of the lost pet feature tag and a vector of each pet feature tag in the lost pet database can be obtained, and the similarity is calculated by using the following cosine similarity formula:
$$\cos\theta = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\,\sqrt{\sum_{i=1}^{n} y_i^{2}}}$$
where x_i is the i-th component of the vector of the lost pet feature tag, y_i is the i-th component of the vector of the pet feature tag in the lost pet database, and n is the dimension of the two vectors.
The above embodiment has the following advantageous effects: the similarity between the lost pet feature tag and the pet feature tags in the lost pet database is obtained through a cosine similarity algorithm, which ensures the accuracy of the similarity, so that pet information better matching the user's needs is provided and the lost pet can be found more quickly.
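As a non-limiting illustration of step S121, cosine similarity between two tag vectors can be computed as in the following Python sketch. How feature tags are turned into vectors is not specified in this passage. The binary encoding over a fixed tag vocabulary used here, as well as the names `TAG_VOCABULARY`, `to_vector` and `cosine_similarity`, are assumptions made only for the example.

```python
import math
from typing import List, Sequence, Set

# Hypothetical tag vocabulary; each position of a vector corresponds to one tag.
TAG_VOCABULARY = ["dog", "cat", "white", "black", "small", "red collar"]


def to_vector(tags: Set[str], vocabulary: Sequence[str]) -> List[float]:
    """Binary encoding: 1.0 if the tag is present, 0.0 otherwise (an assumption)."""
    return [1.0 if tag in tags else 0.0 for tag in vocabulary]


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Dot product of x and y divided by the product of their Euclidean norms."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_x * norm_y)


lost = to_vector({"dog", "white", "small"}, TAG_VOCABULARY)
found = to_vector({"dog", "white", "red collar"}, TAG_VOCABULARY)
print(round(cosine_similarity(lost, found), 3))
```

In this example the two vectors share two of their three non-zero components, so the similarity is 2/3.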
The present application further provides a recommendation system based on pet feature tags, the system comprising:
a data acquisition module, used for acquiring a lost pet feature tag based on lost pet information input by a user;
a similarity calculation module, used for calculating the similarity between the lost pet feature tag and the pet feature tags in the lost pet database;
and a recommending module, used for pushing the pet information corresponding to the pet feature tag in the lost pet database to the user if the similarity is higher than a similarity threshold value.
The recommendation system 20 based on pet feature tags shown in fig. 8 includes a data acquisition module 21, a similarity calculation module 22 and a recommending module 23. The system 20 may execute the method of the embodiments shown in fig. 1 to 7; for parts of this embodiment not described in detail, reference may be made to the related description of the embodiments shown in fig. 1 to 7. The implementation process and technical effects of this technical solution are likewise described in the embodiments shown in fig. 1 to 7 and are not repeated here.
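As a non-limiting illustration, the cooperation of the three modules can be sketched in Python as follows. All names (`PetRecord`, `acquire_tags`, `tag_similarity`, `recommend`), the sample records and the threshold value are hypothetical. The data acquisition step is reduced to simple substring matching against a known tag set, as a stand-in for the keyword extraction and classification described in the method embodiments.

```python
import math
from dataclasses import dataclass
from typing import List, Set


@dataclass
class PetRecord:
    """One entry of a hypothetical lost pet database."""
    pet_id: str
    tags: Set[str]
    info: str


def acquire_tags(user_text: str, known_tags: Set[str]) -> Set[str]:
    """Data acquisition module (stand-in): keep the known tags found in the text."""
    text = user_text.lower()
    return {tag for tag in known_tags if tag in text}


def tag_similarity(a: Set[str], b: Set[str]) -> float:
    """Similarity calculation module: cosine similarity of binary tag vectors,
    which equals |a & b| / sqrt(|a| * |b|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def recommend(query: Set[str], database: List[PetRecord], threshold: float) -> List[PetRecord]:
    """Recommending module: return records whose similarity exceeds the threshold,
    most similar first."""
    scored = [(tag_similarity(query, record.tags), record) for record in database]
    hits = [(score, record) for score, record in scored if score > threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in hits]


KNOWN_TAGS = {"white", "black", "dog", "cat", "small", "red collar"}
DATABASE = [
    PetRecord("A", {"white", "dog", "small"}, "Small white dog found near the park"),
    PetRecord("B", {"black", "cat"}, "Black cat found at the station"),
]

query_tags = acquire_tags("Lost: small white dog with a red collar", KNOWN_TAGS)
for record in recommend(query_tags, DATABASE, threshold=0.6):
    print(record.pet_id, record.info)
```

Here only record A exceeds the assumed threshold of 0.6, so only its pet information would be pushed to the user.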
The application also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of any of the pet feature tag-based recommendation methods described above.
The application also provides a computer storage medium on which a recommendation method program based on the pet feature tag is stored; when executed by a processor, the program implements the steps of any of the pet feature tag-based recommendation methods described above.
The application further relates to a recommendation device 10 based on pet feature tags which, as shown in fig. 9, comprises at least one processor 12 and a memory 11.
The processor 12 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 12. The processor 12 may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, and may implement or perform the various methods, steps and logic blocks disclosed in the embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 11; the processor 12 reads the information in the memory 11 and completes the steps of the method in combination with its hardware.
It will be appreciated that the memory 11 in embodiments of the invention may be either volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a Read Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DR RAM). The memory 11 of the systems and methods described in connection with the embodiments of the invention is intended to comprise, without being limited to, these and any other suitable types of memory.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering; these words may be interpreted as names.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A recommendation method based on pet feature labels is characterized by comprising the following steps:
acquiring a lost pet feature tag based on lost pet information input by a user;
calculating the similarity between the lost pet feature label and the pet feature labels in the lost pet database;
and if the similarity is higher than a similarity threshold value, pushing pet information corresponding to the pet feature tag in the lost pet database to a user.
2. The recommendation method based on pet feature tags of claim 1, wherein the acquiring a lost pet feature tag based on lost pet information input by a user comprises:
performing basic text analysis processing on the lost pet information input by the user to obtain candidate keywords;
extracting keywords from the candidate keywords by using a keyword extraction algorithm;
and classifying the keywords by using a machine learning-based classification algorithm to obtain the lost pet feature tag.
3. The pet feature tag-based recommendation method of claim 2, wherein said extracting keywords from said candidate keywords using a keyword extraction algorithm comprises:
counting the word frequency of the candidate keywords in the lost pet information;
calculating the inverse document frequency of the candidate keywords in the lost pet information;
calculating the product of the word frequency and the inverse document frequency as the weight of the candidate keyword;
constructing a graph model based on the candidate keywords, and normalizing the weights of the candidate keywords to serve as initial weights of the graph model;
iterating from the initial weights until the weights converge to obtain final weight values;
and sorting in descending order of the final weight values, and taking a preset number of top-ranked candidate keywords as the keywords.
4. The pet feature tag-based recommendation method of claim 2, wherein said classifying said keywords using a machine learning-based classification algorithm to obtain said lost pet feature tag comprises:
classifying the keywords by Gaussian naive Bayes and binomial distribution to obtain the lost pet feature tag; the method specifically comprises the following steps:
acquiring the prior probability, marginal likelihood and likelihood of the keywords;
calculating the posterior probability of the keyword for each category by using a Gaussian naive Bayes calculation formula based on the prior probability, the marginal likelihood and the likelihood of the keyword;
and attributing the keyword to the category corresponding to the maximum posterior probability to obtain the lost pet feature tag.
5. The pet feature tag-based recommendation method according to claim 2, wherein the performing basic text analysis processing on the lost pet information input by the user to obtain candidate keywords comprises:
segmenting the lost pet information input by the user by using a word segmentation technique based on string matching to generate a word segmentation result;
performing part-of-speech tagging on the word segmentation result by using a machine learning stochastic tagging algorithm to generate a part-of-speech tagging result;
and removing stop words from the part-of-speech tagging result to obtain the candidate keywords.
6. The pet feature tag-based recommendation method of claim 5, wherein the generating a part-of-speech tagging result by part-of-speech tagging of the word segmentation result using a machine learning stochastic tagging algorithm comprises:
performing part-of-speech tagging on the word segmentation result by using a hidden Markov model to generate a part-of-speech tagging result; the method specifically comprises the following steps:
acquiring an initial state probability vector, a state transition probability matrix and an observation probability matrix in a part-of-speech tagging database;
calculating a state sequence with the maximum part-of-speech probability of the word segmentation result by utilizing a Viterbi algorithm based on the initial state probability vector, the state transition probability matrix and the observation probability matrix;
and performing part-of-speech tagging on the word segmentation result by using the state sequence to generate a part-of-speech tagging result.
7. The pet feature tag-based recommendation method of claim 1, wherein said calculating the similarity between said lost pet feature tag and the pet feature tags in a lost pet database comprises:
calculating the similarity between the lost pet feature tag and the pet feature tags in the lost pet database by using a cosine similarity algorithm.
8. A recommendation system based on pet characteristic tags, the system comprising:
the data acquisition module is used for acquiring a lost pet feature tag based on lost pet information input by a user;
the similarity calculation module is used for calculating the similarity between the lost pet feature tag and the pet feature tags in the lost pet database;
and the recommending module is used for pushing the pet information corresponding to the pet feature label in the lost pet database to a user if the similarity is higher than a similarity threshold value.
9. A computer program product comprising a computer program which, when executed by a processor, carries out the steps of the pet characteristic tag-based recommendation method of any one of claims 1 to 7.
10. A computer storage medium, wherein the computer storage medium stores a pet feature tag-based recommendation method program, and when the pet feature tag-based recommendation method program is executed by a processor, the steps of the pet feature tag-based recommendation method of any one of claims 1-7 are implemented.
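
As a non-limiting illustration of the keyword extraction recited in claim 3, the following Python sketch computes TF-IDF weights, normalizes them as the initial weights of a word co-occurrence graph, iterates a PageRank-style update until convergence, and returns the top-ranked words. The claim does not fix the graph construction, the update rule or the smoothing of the inverse document frequency. The co-occurrence window, the damping factor, the smoothed IDF formula and all function names are assumptions made for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Set


def tfidf_weights(doc: List[str], corpus: List[List[str]]) -> Dict[str, float]:
    """Weight of each word = term frequency * smoothed inverse document frequency."""
    counts = Counter(doc)
    n_docs = len(corpus)
    weights: Dict[str, float] = {}
    for word, count in counts.items():
        doc_freq = sum(1 for other in corpus if word in other)
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0  # smoothing is an assumption
        weights[word] = (count / len(doc)) * idf
    return weights


def build_graph(doc: List[str], window: int = 2) -> Dict[str, Set[str]]:
    """Undirected graph linking words that co-occur within `window` positions."""
    graph: Dict[str, Set[str]] = {word: set() for word in doc}
    for i, word in enumerate(doc):
        for j in range(i + 1, min(i + window + 1, len(doc))):
            if doc[j] != word:
                graph[word].add(doc[j])
                graph[doc[j]].add(word)
    return graph


def rank_keywords(doc: List[str], corpus: List[List[str]], top_k: int = 3,
                  damping: float = 0.85, tol: float = 1e-6,
                  max_iter: int = 100) -> List[str]:
    """Return the top_k words after iterating the graph weights to convergence."""
    if not doc:
        return []
    weights = tfidf_weights(doc, corpus)
    total = sum(weights.values())
    score = {word: value / total for word, value in weights.items()}  # normalized start
    graph = build_graph(doc)
    n = len(score)
    for _ in range(max_iter):
        updated = {}
        for word in score:
            incoming = sum(score[u] / len(graph[u]) for u in graph[word])
            updated[word] = (1 - damping) / n + damping * incoming
        delta = max(abs(updated[w] - score[w]) for w in score)
        score = updated
        if delta < tol:
            break
    # Descending order of final weight; keep the preset number of top words.
    return sorted(score, key=score.get, reverse=True)[:top_k]


doc = ["white", "dog", "red", "collar", "lost", "park", "white", "dog"]
corpus = [doc, ["black", "cat", "lost"], ["brown", "dog", "found", "park"]]
print(rank_keywords(doc, corpus, top_k=3))
```

One design observation, not taken from the application: with a damping factor below 1, this kind of iteration converges to the same values regardless of the starting weights, so in this sketch the TF-IDF seeding mainly affects how many iterations are needed. An implementation that wants the TF-IDF weights to influence the final ranking might, for example, use them as the teleport distribution instead.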
CN202110867851.6A 2021-07-29 2021-07-29 Recommendation method, system, program product and medium based on pet feature tag Pending CN113722582A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110867851.6A CN113722582A (en) 2021-07-29 2021-07-29 Recommendation method, system, program product and medium based on pet feature tag

Publications (1)

Publication Number Publication Date
CN113722582A true CN113722582A (en) 2021-11-30

Family

ID=78674438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110867851.6A Pending CN113722582A (en) 2021-07-29 2021-07-29 Recommendation method, system, program product and medium based on pet feature tag

Country Status (1)

Country Link
CN (1) CN113722582A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572775A (en) * 2013-10-28 2015-04-29 深圳市腾讯计算机系统有限公司 Advertisement classification method, device and server
CN108287916A (en) * 2018-02-11 2018-07-17 北京方正阿帕比技术有限公司 A kind of resource recommendation method
CN110348920A (en) * 2018-04-02 2019-10-18 中移(杭州)信息技术有限公司 A kind of method and device of recommended products
WO2021114810A1 (en) * 2020-05-29 2021-06-17 平安科技(深圳)有限公司 Graph structure-based official document recommendation method, apparatus, computer device, and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
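游凤芹; 钟芳; 周展: "中文多类别情感分类模型中特征选择方法" [Feature selection methods in Chinese multi-class sentiment classification models], 计算机应用 [Journal of Computer Applications], no. 2, pages 243-244 *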

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination