CN113569118B - Self-media pushing method, device, computer equipment and storage medium - Google Patents

Info

Publication number
CN113569118B
Authority
CN
China
Prior art keywords
media
self
target
score
public opinion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110741715.2A
Other languages
Chinese (zh)
Other versions
CN113569118A (en)
Inventor
刘杨
熊焕卫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Donson Times Information Technology Co ltd
Original Assignee
Donson Times Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Donson Times Information Technology Co ltd
Priority to CN202110741715.2A
Publication of CN113569118A
Application granted
Publication of CN113569118B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a self-media pushing method, a self-media pushing device, computer equipment and a storage medium. The method comprises: collecting public opinion data of works of a target self-media on each basic platform to obtain the public opinion data of the target self-media on each basic platform; analyzing each piece of public opinion data to obtain an analysis result; determining a target classified crowd corresponding to the target self-media based on the analysis result; determining a comprehensive score of the target self-media based on a preset scoring weight corresponding to each basic platform; determining a marketing index of the target self-media according to the comprehensive score; determining a target platform to be recommended from the basic platforms based on the marketing index; and recommending the target self-media to the target classified crowd in the target platform. The invention can improve the accuracy of target self-media pushing.

Description

Self-media pushing method, device, computer equipment and storage medium
Technical Field
The present invention relates to the field of data processing, and in particular, to a self-media pushing method, device, computer device, and medium.
Background
The internet provides users with abundant information resources. With the rapid development of internet technology, more and more self-media (individuals or brands) share information with users through various platforms by drawing on their own resources. Existing approaches usually evaluate and position a self-media according to its performance on a single platform, and push the self-media to the correspondingly classified crowd of that platform according to the evaluation and positioning result. In the process of realizing the invention, the inventor found that the existing approaches have at least the following problem: when a self-media shares information on a plurality of platforms, how to accurately position and recommend the self-media is a difficult problem to be solved.
Disclosure of Invention
The embodiment of the invention provides a self-media pushing method, a self-media pushing device, computer equipment and a storage medium, so as to improve the accuracy of self-media pushing.
In order to solve the above technical problems, an embodiment of the present application provides a self-media pushing method, including:
collecting public opinion data of works of target self-media on each basic platform to obtain the public opinion data of the target self-media on each basic platform;
analyzing each piece of public opinion data to obtain an analysis result, determining a target classification crowd corresponding to the target self-media based on the analysis result, and determining the comprehensive score of the target self-media based on a preset scoring weight corresponding to each basic platform;
determining a marketing index of the target self-media according to the comprehensive score;
and determining a target platform to be recommended from each basic platform based on the marketing index, and recommending the target self-media to target classified crowd in the target platform.
Optionally, the comprehensive score is represented in the form of a multi-dimensional score graph.
Optionally, collecting public opinion data of works of the target self-media on each basic platform to obtain the public opinion data of the target self-media on each basic platform includes:
acquiring a uniform resource locator corresponding to each basic platform;
for each basic platform, crawling and parsing the page files corresponding to the uniform resource locator by means of a web crawler to obtain the page files of the works of the target self-media as target pages;
and, for each basic platform, extracting public opinion information related to the works of the target self-media from the content contained in the target pages by fuzzy matching, and taking the public opinion information as the public opinion data of the target self-media on the basic platform.
Optionally, the public opinion data includes at least one of interactive data, work content and comment data.
Optionally, the analyzing each piece of public opinion data to obtain an analysis result includes:
according to the preset weight of each interactive data, carrying out statistical weighting on the interactive data to obtain a first score, wherein the interactive data comprises at least one of praise, collection, browsing and forwarding;
analyzing the content of the work, and grading the quality of the work according to the analysis result to obtain a second score;
carrying out semantic recognition on the comment data, and scoring according to the obtained semantic recognition result to obtain a third score;
and determining the evaluation information of the public opinion data based on the first score, the second score and the third score as the analysis result.
Optionally, performing semantic recognition on the comment data and scoring according to the obtained semantic recognition result to obtain the third score includes:
for the same user name, if the number of user evaluations corresponding to the user name exceeds a preset threshold, selecting the user evaluations with the same number as the preset threshold as effective evaluations of the user name, and if the number of user evaluations corresponding to the user name does not exceed the preset threshold, taking each user evaluation corresponding to the user name as one effective evaluation;
carrying out evaluation emotion analysis on each effective evaluation by adopting a semantic analysis mode to obtain the approval degree corresponding to each effective evaluation;
and comprehensively evaluating the approval degree corresponding to each effective evaluation according to a preset evaluation mode to obtain a third score.
Optionally, carrying out evaluation emotion analysis on each effective evaluation by adopting a semantic analysis mode to obtain the approval degree corresponding to each effective evaluation includes:
extracting keywords contained in the effective evaluations by adopting a preset word segmentation mode;
training the keywords in a word vector mode to obtain space word vectors corresponding to the keywords;
performing cluster analysis on the space word vectors based on a K-Means clustering algorithm to obtain a cluster analysis result;
and calculating the Euclidean distance between the cluster analysis result and each preset approval degree in a preset approval degree set, and taking the preset approval degree with the smallest Euclidean distance value as the approval degree corresponding to the effective evaluation.
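The clustering and nearest-approval-degree steps above can be illustrated with a minimal sketch. It assumes that keywords have already been converted to word vectors, that each preset approval degree is represented by a reference vector in the same space, and that the centroid of the largest cluster stands in for the "cluster analysis result"; none of these details are specified in the text, and the names `k_means`, `approval_degree` and the sample vectors are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def k_means(vectors, k, iterations=20):
    """Plain K-Means. The first k vectors seed the centroids (deterministic)."""
    k = min(k, len(vectors))
    centroids = [list(v) for v in vectors[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: euclidean(v, centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


def approval_degree(keyword_vectors, preset_degrees, k=2):
    """Return the preset approval degree closest to the dominant cluster."""
    centroids, clusters = k_means(keyword_vectors, k)
    # Assumption: the largest cluster's centroid is the cluster analysis result.
    largest = max(range(len(centroids)), key=lambda i: len(clusters[i]))
    result = centroids[largest]
    return min(preset_degrees, key=lambda name: euclidean(result, preset_degrees[name]))


# Hypothetical 2-D word vectors for the keywords of one effective evaluation.
keyword_vectors = [[0.9, 0.8], [0.8, 0.9], [1.0, 0.7], [-0.5, -0.4]]
# Hypothetical reference vectors for the preset approval degree set.
preset_degrees = {
    "approve": [1.0, 1.0],
    "neutral": [0.0, 0.0],
    "disapprove": [-1.0, -1.0],
}
print(approval_degree(keyword_vectors, preset_degrees))
```

In practice the word vectors would come from a trained word-embedding model and have many more dimensions; the two-dimensional values here only keep the example readable.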
In order to solve the above technical problem, an embodiment of the present application further provides a self-media pushing device, including:
the data acquisition module is used for acquiring public opinion data of works of target self-media on each basic platform to obtain the public opinion data of the target self-media on each basic platform;
the data evaluation module is used for analyzing each piece of public opinion data to obtain an analysis result, determining a target classification crowd corresponding to the target self-media based on the analysis result, and determining the comprehensive score of the target self-media based on a preset score weight corresponding to each basic platform;
the index determining module is used for determining the marketing index of the target self-media according to the comprehensive score;
and the target recommendation module is used for determining a target platform to be recommended from each basic platform based on the marketing index, and recommending the target self-media to target classified crowd in the target platform.
Optionally, the data acquisition module includes:
the resource positioning unit is used for acquiring the uniform resource locator corresponding to each basic platform;
the page determining unit is used for carrying out crawling analysis on the page files corresponding to the uniform resource locators in a web crawler mode aiming at each basic platform to obtain the page files of the target self-media corresponding works as target pages;
and the data crawling unit is used for extracting public opinion information related to the target self-media corresponding works from the contents contained in the target pages in a fuzzy matching mode aiming at each basic platform, and taking the public opinion information as the public opinion data of the target self-media on the basic platform.
Optionally, the data evaluation module includes:
the interactive score evaluation unit is used for carrying out statistical weighting on the interactive data according to the preset weight of each interactive data to obtain a first score, wherein the interactive data comprises at least one of praise, collection, browsing and forwarding;
the quality score evaluation unit is used for analyzing the content of the works, scoring the quality of the works according to the analysis result and obtaining a second score;
the comment score evaluation unit is used for carrying out semantic recognition on the comment data and scoring according to the obtained semantic recognition result to obtain a third score;
and a result generation unit for determining the evaluation information of the public opinion data based on the first score, the second score and the third score as the analysis result.
Optionally, the comment score evaluation unit includes:
the effective comment screening subunit is configured to select, for the same user name, the same number of user evaluations as the preset threshold value as the effective evaluations of the user name if the number of user evaluations corresponding to the user name exceeds the preset threshold value, and use each user evaluation corresponding to the user name as one effective evaluation if the number of user evaluations corresponding to the user name does not exceed the preset threshold value;
the semantic analysis subunit is used for carrying out evaluation emotion analysis on each effective evaluation in a semantic analysis mode to obtain the approval degree corresponding to each effective evaluation;
and the score evaluation subunit is used for comprehensively evaluating the approval degree corresponding to each effective evaluation according to a preset evaluation mode to obtain a third score.
Optionally, the semantic analysis subunit includes:
the word segmentation extraction component is used for extracting keywords contained in the effective evaluations by adopting a preset word segmentation mode;
the word vector generation component is used for training the keywords in a word vector mode to obtain space word vectors corresponding to the keywords;
the word segmentation and clustering component is used for carrying out cluster analysis on the space word vectors based on a K-Means clustering algorithm to obtain a cluster analysis result;
and the approval degree calculation component is used for calculating the Euclidean distance between the cluster analysis result and each preset approval degree in a preset approval degree set, and taking the preset approval degree with the smallest Euclidean distance value as the approval degree corresponding to the effective evaluation.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the steps of the self-media pushing method are implemented when the processor executes the computer program.
In order to solve the above technical problem, the embodiments of the present application further provide a computer readable storage medium, where a computer program is stored, where the computer program implements the steps of the self-media pushing method described above when executed by a processor.
According to the self-media pushing method, device, computer equipment and storage medium provided by the embodiment of the invention, public opinion data of works of the target self-media on each basic platform is collected to obtain the public opinion data of the target self-media on each basic platform; each piece of public opinion data is analyzed to obtain an analysis result; the target classified crowd corresponding to the target self-media is determined based on the analysis result; the comprehensive score of the target self-media is determined based on the preset scoring weight corresponding to each basic platform; the marketing index of the target self-media is determined according to the comprehensive score; the target platform to be recommended is determined from the basic platforms based on the marketing index; and the target self-media is recommended to the target classified crowd in the target platform. In this way the target self-media is positioned quickly and is recommended to the target classified crowd of the target platform that matches this positioning, which improves the accuracy of target self-media pushing.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a self-media push method of the present application;
FIG. 3 is a schematic diagram of one embodiment of a self-media pushing device according to the present application;
FIG. 4 is a schematic structural diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description and claims of the present application and in the description of the figures above are intended to cover non-exclusive inclusions. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in FIG. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, electronic book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the self-media pushing method provided in the embodiment of the present application is executed by a server, and accordingly, the self-media pushing device is disposed in the server.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. Any number of terminal devices, networks and servers may be provided according to implementation requirements, and the terminal devices 101, 102 and 103 in the embodiments of the present application may specifically correspond to application systems in actual production.
Referring to fig. 2, fig. 2 shows a self-media pushing method according to an embodiment of the present invention, and the method is applied to the server in fig. 1 for illustration, and is described in detail as follows:
S201: Collect public opinion data of the works of the target self-media on each basic platform to obtain the public opinion data of the target self-media on each basic platform.
Optionally, the public opinion data includes at least one of interactive data, work content and comment data.
The target self-media includes, but is not limited to, individual authors, brands and the like, and the works it publishes include, but are not limited to: text, pictures, short videos, activity copy, etc.
For specific ways of collecting public opinion data, reference may be made to the description of the following embodiments, and in order to avoid repetition, details are not repeated here.
S202: Analyze each piece of public opinion data to obtain an analysis result, determine the target classified crowd corresponding to the target self-media based on the analysis result, and determine the comprehensive score of the target self-media based on the preset scoring weight corresponding to each basic platform.
Optionally, the comprehensive score is represented in the form of a multi-dimensional score graph.
The multi-dimensional score graph is a graph generated from score data of multiple dimensions. The number of dimensions can be set according to actual needs and is not limited herein. For example, in one specific implementation a six-dimensional score graph is used: evaluation scores for the six dimensions are obtained from the analysis of the public opinion data, and a hexagonal score graph is then generated and stored, in visual form, in the profile data corresponding to the target self-media.
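As an illustration of how a hexagonal score graph could be derived from six scores, the following minimal sketch converts per-dimension scores into the vertex coordinates of a radar-style polygon. The text does not name the six dimensions or the score range, so the labels `dim_1` to `dim_6`, the 0 to 100 scale and the function name `score_polygon` are assumptions made purely for illustration.

```python
import math


def score_polygon(scores, max_score=100.0):
    """Map each dimension's score to a polygon vertex on a unit radar chart.

    Vertices start at the top of the chart and proceed clockwise; the
    distance from the origin is the score normalized by max_score.
    """
    n = len(scores)
    vertices = []
    for i, (name, value) in enumerate(scores.items()):
        angle = math.pi / 2 - 2 * math.pi * i / n
        radius = value / max_score
        vertices.append((name, radius * math.cos(angle), radius * math.sin(angle)))
    return vertices


# Hypothetical scores for six unnamed dimensions.
scores = {"dim_1": 80, "dim_2": 65, "dim_3": 90, "dim_4": 50, "dim_5": 70, "dim_6": 60}
for name, x, y in score_polygon(scores):
    print(f"{name}: ({x:.3f}, {y:.3f})")
```

The resulting vertices can be passed to any plotting tool to draw and store the hexagon; the choice of tool is left open here.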
S203: Determine the marketing index of the target self-media according to the comprehensive score.
The marketing index refers to the market positioning index of the target self-media on each basic platform.
S204: Determine a target platform to be recommended from the basic platforms based on the marketing index, and recommend the target self-media to the target classified crowd in the target platform.
In this embodiment, public opinion data of the works of the target self-media on each basic platform is collected to obtain the public opinion data of the target self-media on each basic platform; each piece of public opinion data is analyzed to obtain an analysis result; the target classified crowd corresponding to the target self-media is determined based on the analysis result; the comprehensive score of the target self-media is determined based on the preset scoring weight corresponding to each basic platform; the marketing index of the target self-media is determined according to the comprehensive score; the target platform to be recommended is determined from the basic platforms based on the marketing index; and the target self-media is recommended to the target classified crowd in the target platform. The target self-media is thereby positioned quickly and recommended to the target classified crowd of the target platform that matches this positioning, which helps improve the accuracy of target self-media pushing.
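Steps S202 to S204 can be sketched as follows. The text does not give formulas for the comprehensive score or the marketing index, so this sketch assumes a weighted sum of per-platform analysis scores for the former and each platform's share of that sum for the latter; the platform names, weights and function names are hypothetical.

```python
def composite_score(platform_scores, platform_weights):
    """Weighted sum of per-platform scores (assumed form of the comprehensive score)."""
    return sum(platform_scores[p] * platform_weights[p] for p in platform_scores)


def marketing_indexes(platform_scores, platform_weights):
    """Assumed marketing index: each platform's share of the comprehensive score."""
    total = composite_score(platform_scores, platform_weights)
    if total == 0:
        return {p: 0.0 for p in platform_scores}
    return {p: platform_scores[p] * platform_weights[p] / total for p in platform_scores}


def target_platforms(indexes, top_n=1):
    """Pick the top_n platforms with the highest marketing index."""
    return sorted(indexes, key=indexes.get, reverse=True)[:top_n]


# Hypothetical per-platform analysis scores and preset scoring weights.
platform_scores = {"platform_a": 80.0, "platform_b": 60.0, "platform_c": 40.0}
platform_weights = {"platform_a": 0.5, "platform_b": 0.3, "platform_c": 0.2}

indexes = marketing_indexes(platform_scores, platform_weights)
print(composite_score(platform_scores, platform_weights))
print(target_platforms(indexes, top_n=2))
```

Any other monotone mapping from the comprehensive score to a per-platform index would fit the described flow equally well; the share-based index is only one simple choice.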
In a specific optional embodiment, in step S201, collecting public opinion data of a work of the target self-media on each base platform, and obtaining public opinion data of the target self-media on each base platform includes:
acquiring a uniform resource locator corresponding to each basic platform;
aiming at each basic platform, carrying out crawling analysis on the page files corresponding to the uniform resource locators in a web crawler mode to obtain page files of the target self-media corresponding works as target pages;
and extracting public opinion information related to the target self-media corresponding work from the content contained in the target page by a fuzzy matching mode aiming at each basic platform, and taking the public opinion information as public opinion data of the target self-media on the basic platform.
Specifically, before the works of the target self-media on each basic platform are crawled, the uniform resource locator corresponding to each basic platform needs to be acquired. The uniform resource locator of each basic platform corresponds to a plurality of page files, each page file corresponds to one work, and the public opinion data of a work can be acquired through the page file corresponding to the uniform resource locator.
The uniform resource locator (Uniform Resource Locator, URL) is a concise representation of the location and access method of resources available on the internet, and is the address of standard resources on the internet.
Because a web crawler covers a very large crawling range and number of pages, it places high demands on crawling speed and storage space, while its requirements on the order in which pages are crawled are relatively low. Because there are also too many pages to refresh, crawlers generally work in parallel. The structure of a web crawler can be roughly divided into a page crawling module, a page analysis module, a link filtering module, a page database, a URL queue and an initial URL set. To improve working efficiency, a general web crawler adopts a certain crawling strategy. Common crawling strategies are the depth-first strategy and the breadth-first strategy.
The basic method of the depth-first strategy is to access the links of the next level of web pages in order from low depth to high depth until no deeper link can be followed. After completing one crawling branch, the crawler returns to the previous link node to search for other links. When all links have been traversed, the crawling task ends.
The breadth-first strategy is to crawl pages according to the depth of the content directory hierarchy of the web page, and the pages in the shallower directory hierarchy are crawled first. After the crawling of the pages in the same layer is completed, the crawler goes deep into the next layer to continue crawling. The strategy can effectively control the crawling depth of the page, avoid the problem that crawling cannot be finished when an infinite deep branch is encountered, and is convenient to implement without storing a large number of intermediate nodes.
Preferably, the crawling strategy adopted in the embodiment of the invention is the breadth-first strategy: the uniform resource locators corresponding to the basic platforms are crawled first to obtain the page files corresponding to each preset uniform resource locator, and each page file is then crawled to obtain the public opinion information of the work it contains. This avoids the extra time overhead of crawling excessive useless information and improves crawling efficiency.
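The breadth-first strategy can be sketched as a queue-based traversal with a depth limit. To keep the example self-contained, pages are looked up in a small in-memory link map instead of being fetched over the network; the URLs, the `SITE` map and the `get_links` callback are hypothetical stand-ins for a real page-fetching and link-extraction step.

```python
from collections import deque


def crawl_breadth_first(seed_urls, get_links, max_depth):
    """Visit pages level by level, never going deeper than max_depth."""
    visited = set()
    queue = deque()
    for url in seed_urls:
        if url not in visited:
            visited.add(url)
            queue.append((url, 0))
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # do not expand links beyond the depth limit
        for link in get_links(url):
            if link not in visited:
                visited.add(link)
                queue.append((link, depth + 1))
    return order


# Hypothetical link structure of one basic platform.
SITE = {
    "https://platform.example/": [
        "https://platform.example/work/1",
        "https://platform.example/work/2",
    ],
    "https://platform.example/work/1": [
        "https://platform.example/work/1/comments",
        "https://platform.example/",
    ],
    "https://platform.example/work/2": [],
    "https://platform.example/work/1/comments": [],
}

pages = crawl_breadth_first(
    ["https://platform.example/"], lambda url: SITE.get(url, []), max_depth=1
)
print(pages)
```

With `max_depth=1` the traversal stops at the work pages linked from the platform URL, which mirrors the described order: platform URLs first, then the page file of each work.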
Fuzzy matching methods include, but are not limited to: fuzzy matching based on the Horspool string pattern matching algorithm, fuzzy matching of search terms based on a Trie, fuzzy matching based on jQuery selectors, and the like.
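As an illustration of the first option, the following minimal sketch implements Horspool string search and uses it to locate a field label in page text. Horspool by itself finds exact occurrences of a pattern; how it is combined with other steps to achieve fuzzy matching is not specified in the text, and the sample page text and the `read_count_after` helper are hypothetical.

```python
def horspool_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    if m > n:
        return -1
    # Bad-character shift table: distance from the rightmost occurrence of
    # each character (excluding the last pattern position) to the pattern end.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Shift by the table entry for the text character aligned with the
        # last pattern position; unknown characters allow a full shift of m.
        pos += shift.get(text[pos + m - 1], m)
    return -1


def read_count_after(text, label):
    """Hypothetical helper: read the integer that follows 'label:' in text."""
    start = horspool_search(text, label)
    if start < 0:
        return None
    digits = ""
    for ch in text[start + len(label):].lstrip(": "):
        if not ch.isdigit():
            break
        digits += ch
    return int(digits) if digits else None


page_text = "likes: 1024 comments: 57 shares: 8"
print(horspool_search(page_text, "comments"))
print(read_count_after(page_text, "comments"))
```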
In this embodiment, the uniform resource locator corresponding to each basic platform is obtained. For each basic platform, the page files corresponding to the uniform resource locator are crawled and parsed by a web crawler to obtain the page files of the works of the target self-media as target pages, and public opinion information related to those works is extracted from the content of the target pages by fuzzy matching and taken as the public opinion data of the target self-media on the basic platform. The public opinion data of the target self-media is thus collected from the network automatically, which saves collection time and improves the efficiency of public opinion data collection.
In a specific optional embodiment, in step S202, analyzing each piece of public opinion data to obtain an analysis result includes:
according to the preset weight of each interactive data, carrying out statistical weighting on the interactive data to obtain a first score, wherein the interactive data comprises at least one of praise, collection, browsing and forwarding;
analyzing the content of the work, and grading the quality of the work according to the analysis result to obtain a second score;
carrying out semantic recognition on the comment data, and scoring according to the obtained semantic recognition result to obtain a third score;
and determining the evaluation information of the public opinion data based on the first score, the second score and the third score as an analysis result.
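The first score and the combination of the three scores can be sketched as below. The preset weights per interaction type and the way the three scores are merged into the evaluation information are not specified in the text; this sketch assumes a weighted sum for the first score and a weighted average for the combination. The keys `likes`, `favorites`, `views` and `shares` are used here for praise, collection, browsing and forwarding, and all numeric values are hypothetical.

```python
def first_score(interactions, weights):
    """Weighted sum of interaction counts; unknown interaction types count as 0."""
    return sum(weights.get(kind, 0.0) * count for kind, count in interactions.items())


def evaluation_info(first, second, third, mix=(0.4, 0.3, 0.3)):
    """Assumed combination: weighted average of the three scores."""
    combined = mix[0] * first + mix[1] * second + mix[2] * third
    return {"first": first, "second": second, "third": third, "combined": combined}


# Hypothetical interaction counts for one work and preset per-type weights.
interactions = {"likes": 120, "favorites": 30, "views": 1000, "shares": 15}
weights = {"likes": 0.3, "favorites": 0.3, "views": 0.01, "shares": 0.5}

score_1 = first_score(interactions, weights)
print(score_1)
print(evaluation_info(score_1, second=70.0, third=80.0))
```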
In a specific optional implementation manner, performing semantic recognition on the comment data and scoring according to the obtained semantic recognition result to obtain the third score includes:
for the same user name, if the number of user evaluations corresponding to the user name exceeds a preset threshold, selecting the user evaluations with the same number as the preset threshold as effective evaluations of the user name, and if the number of user evaluations corresponding to the user name does not exceed the preset threshold, taking each user evaluation corresponding to the user name as an effective evaluation;
carrying out evaluation emotion analysis on each effective evaluation by adopting a semantic analysis mode to obtain the corresponding approval degree of each effective evaluation;
and comprehensively evaluating the corresponding approval degree of each effective evaluation according to a preset evaluation mode to obtain a third score.
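The screening of effective evaluations can be sketched as follows. The text does not say which of a user's evaluations are kept once the threshold is exceeded; this sketch assumes the earliest ones in input order are kept. The function name and the `(user_name, text)` tuple layout are hypothetical.

```python
from collections import defaultdict


def select_effective_evaluations(evaluations, threshold):
    """Keep at most `threshold` evaluations per user name.

    `evaluations` is an iterable of (user_name, text) pairs for one work.
    Keeping the first ones encountered is an assumption of this sketch.
    """
    kept_per_user = defaultdict(int)
    effective = []
    for user_name, text in evaluations:
        if kept_per_user[user_name] < threshold:
            kept_per_user[user_name] += 1
            effective.append((user_name, text))
    return effective


comments = [("alice", "great"), ("alice", "great!!"), ("alice", "so great"), ("bob", "boring")]
effective = select_effective_evaluations(comments, threshold=2)
```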
Specifically, a user may comment on the same work repeatedly. To prevent such repeated comments from skewing the analysis, this embodiment sets a preset threshold on the number of evaluations a single user may contribute for the same work. When the number of a user's evaluations of the same work exceeds the preset threshold, a number of that user's evaluations equal to the preset threshold is selected as the user's effective evaluations of the work; when the number of user evaluations corresponding to the user name does not exceed the preset threshold, each user evaluation corresponding to the user name is taken as an effective evaluation. After the effective evaluations are determined, the semantics of each effective evaluation are analyzed by semantic analysis to obtain the user's approval degree of the work expressed in that evaluation. A corresponding evaluation mode is preset for each approval degree, and the approval degrees of all effective evaluations of the same work are combined into a comprehensive evaluation of the work, yielding the third score of the work corresponding to the target self-media.
Implementations of the semantic analysis include, but are not limited to: natural language processing (Natural Language Processing, NLP) algorithms, natural language semantic analysis algorithms based on N-Gram models, clustering algorithms based on word vectors, and the like.
Preferably, the present embodiment employs a word vector based clustering algorithm to achieve semantic analysis of effective evaluations.
The preset evaluation mode may be set according to actual requirements, for example, different scores are set for different approval degrees, and the like, which is not particularly limited herein.
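As one possible preset evaluation mode, each approval degree can be mapped to a score and the scores of all effective evaluations averaged. The labels, the score table, and the use of an arithmetic mean are assumptions for illustration only, since the text leaves the evaluation mode open.

```python
def third_score(approval_degrees, score_table):
    """Average the preset scores of the approval degrees of all effective evaluations."""
    if not approval_degrees:
        return 0.0
    return sum(score_table[degree] for degree in approval_degrees) / len(approval_degrees)


# Hypothetical approval degrees and their preset scores.
score_table = {"like": 100, "neutral": 50, "dislike": 0}
result = third_score(["like", "like", "dislike", "neutral"], score_table)
```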
In a specific alternative embodiment, performing evaluation emotion analysis on each effective evaluation by adopting a semantic analysis mode, and obtaining the corresponding approval degree of each effective evaluation includes:
extracting keywords contained in the effective comments by adopting a preset word segmentation mode;
training the keywords by adopting a word vector mode to obtain space word vectors corresponding to the keywords;
carrying out cluster analysis on the space word vectors based on a K-Means clustering algorithm to obtain a cluster analysis result;
and calculating the Euclidean distance between the clustering analysis result and each preset approval degree in the preset approval degree set, and taking the preset approval degree with the smallest Euclidean distance value as the approval degree corresponding to the effective evaluation.
Specifically, word segmentation processing is carried out on the effective comments through a third-party word segmentation tool or word segmentation algorithm to obtain at least one keyword, and the specific number of the keywords is determined according to word segmentation results.
Common third-party word segmentation tools include, but are not limited to: the Stanford NLP segmenter, the ICTCLAS segmenter, the ansj segmenter, the HanLP Chinese segmenter, etc.
Word segmentation algorithms include, but are not limited to: the forward Maximum Matching (Maximum Matching, MM) algorithm, the Reverse Maximum Matching (Reverse Direction Maximum Matching Method, RMM) algorithm, the Bi-directional Maximum Matching (Bi-direction Matching Method, BM) algorithm, the hidden Markov model (Hidden Markov Model, HMM), the N-gram model, and the like.
It is easy to understand that keywords are extracted by word segmentation, on one hand, some nonsensical words in the effective comments can be filtered, and on the other hand, the method is also beneficial to generating space word vectors by using the keywords subsequently.
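Keyword extraction by word segmentation can be sketched with the forward maximum matching algorithm mentioned above. The dictionary, the stop-word list, and the maximum word length below are hypothetical values; a production system would more likely call one of the third-party segmenters.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Segment text by repeatedly taking the longest dictionary word at the current position.

    Characters that start no dictionary word are emitted as single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


def extract_keywords(text, dictionary, stop_words):
    """Segment the text and drop tokens listed as stop words."""
    return [token for token in forward_max_match(text, dictionary) if token not in stop_words]


dictionary = {"这个", "视频", "非常", "精彩"}
stop_words = {"这个"}
keywords = extract_keywords("这个视频非常精彩", dictionary, stop_words)
```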
In artificial intelligence, word vector representations refer primarily to formal or mathematical descriptions of languages in order to represent the languages in a computer and to enable automatic processing by a computer program. The word vector in this embodiment is expressed in terms of a vector.
Specifically, each keyword is mapped into vectors according to a preset corpus, the vectors are connected together to form a word vector space, each vector is equivalent to a point in the space, and each vector is used as a space word vector.
For example, a product name contains two segmented words to be matched, "BMW" and "Benz", and all possible classes of the two words are obtained from the preset corpus: "automobile", "luxury", "animal", "action", and "food". A vector representation is therefore introduced for the two words:
<automobile, luxury, animal, action, food>
Using a statistical learning method, the probability that each of the two words belongs to each class is calculated, and the probabilities learned by the computer are:
BMW = <0.5, 0.2, 0.2, 0.0, 0.1>
Benz = <0.7, 0.2, 0.0, 0.1, 0.0>
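Because later steps compare spatial word vectors by Euclidean distance, it may help to see that distance computed for the two example vectors. The dimension order follows the class list given in the text; the function name is illustrative.

```python
import math

# Dimension order taken from the text: automobile, luxury, animal, action, food.
bmw = (0.5, 0.2, 0.2, 0.0, 0.1)
benz = (0.7, 0.2, 0.0, 0.1, 0.0)


def euclidean_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


distance = euclidean_distance(bmw, benz)
```

The squared differences are 0.04, 0, 0.04, 0.01 and 0.01, so the distance is the square root of 0.10, roughly 0.316.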
It will be appreciated that the value of each dimension of the spatial word vector represents a feature that has some semantic and grammatical interpretability.
By presetting a corpus, a spatial word vector is constructed for each keyword, so that words a machine cannot understand directly are converted into word vectors that a machine can readily recognize and operate on, and the approval degree of the work contained in the effective evaluation can then be obtained by analyzing the keywords in the effective evaluation.
Further, after the spatial word vectors are constructed, for each spatial word vector corresponding to the effective evaluation, the spatial distance between that vector and the other spatial word vectors is calculated. A spatial word vector whose spatial distance to the other spatial word vectors exceeds a preset spatial distance threshold is identified as an invalid word vector and eliminated, so that each remaining spatial word vector represents, as correctly as possible, the meaning of its keyword in the effective evaluation.
Further, the spatial word vectors corresponding to the same effective evaluation are clustered to obtain the clustering result corresponding to that effective evaluation. Preferably, this solution uses the K-Means clustering algorithm to cluster the spatial word vectors.
The K-Means algorithm is a distance-based clustering algorithm that uses distance as the measure of similarity: the closer two objects are, the more similar they are considered to be. The algorithm regards a cluster as a group of objects that are close together, and therefore takes compact, well-separated clusters as its final objective.
In this embodiment, the cluster analysis of the spatial word vectors using the K-Means clustering algorithm is described in detail as follows:
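The elimination of invalid word vectors can be sketched as below. The text does not say whether "the other vectors" means all of them or any of them; this sketch assumes a vector is invalid when even its nearest neighbor lies farther away than the threshold. The function name and the sample data are hypothetical.

```python
import math


def remove_invalid_vectors(vectors, distance_threshold):
    """Drop vectors whose nearest neighbor is farther away than the threshold.

    The nearest-neighbor criterion is an assumption; the source only says the
    distance to the other vectors exceeds a preset threshold.
    """
    if len(vectors) < 2:
        return list(vectors)
    valid = []
    for i, vector in enumerate(vectors):
        nearest = min(
            math.dist(vector, other) for j, other in enumerate(vectors) if j != i
        )
        if nearest <= distance_threshold:
            valid.append(vector)
    return valid


vectors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
cleaned = remove_invalid_vectors(vectors, distance_threshold=1.0)
```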
taking word vectors corresponding to preset parts of speech as cluster centers;
for each space word vector in the effective evaluation, calculating a first distance between the space word vector and each current cluster center, and putting the space word vector into the cluster where the cluster center corresponding to the minimum first distance is located, to obtain m temporary clusters;
for each temporary cluster, calculating the mean value of the temporary cluster, and selecting the space word vector corresponding to the minimum second distance as the new cluster center of the temporary cluster, to obtain m updated temporary clusters, wherein the second distance is the distance between a space word vector in the temporary cluster and the mean value;
the standard deviation of each updated temporary cluster is calculated according to the following formula:
$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(A_i - \mu\right)^2}$$
wherein σ is the standard deviation, A_i is the i-th spatial word vector in the updated temporary cluster, n is the number of spatial word vectors in the updated temporary cluster, μ is the mean value of the updated temporary cluster in which the spatial word vector A_i is located, i ∈ [1, n], and i and n are positive integers;
if at least one standard deviation in the m updated temporary clusters is greater than or equal to a preset standard deviation threshold, returning to execute the step of calculating a first distance between each spatial word vector in the effective evaluation and each current cluster center, and putting the spatial word vector into the cluster where the cluster center corresponding to the minimum first distance is located to obtain m temporary clusters;
and if the standard deviations of all m updated temporary clusters are smaller than the preset standard deviation threshold, taking the cluster centers of the m updated temporary clusters as the cluster analysis result.
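The clustering steps above can be sketched in plain Python. Note that, as described, the new center of each cluster is the member vector closest to the cluster mean rather than the mean itself. Three things in this sketch are assumptions not stated in the text: the standard deviation treats (A_i − μ)² as the squared Euclidean norm, iteration also stops when the centers no longer change or after `max_iter` rounds (to guarantee termination), and an empty cluster keeps its previous center.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(component) / n for component in zip(*vectors))


def cluster_std(vectors, mu):
    """Standard deviation of a cluster: sqrt of the mean squared distance to mu."""
    return math.sqrt(sum(math.dist(v, mu) ** 2 for v in vectors) / len(vectors))


def cluster_word_vectors(vectors, initial_centers, std_threshold, max_iter=100):
    """Return the m cluster centers once every cluster's std is below the threshold."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iter):
        # Assign every vector to the cluster of its nearest current center.
        clusters = [[] for _ in centers]
        for v in vectors:
            nearest = min(range(len(centers)), key=lambda k: math.dist(v, centers[k]))
            clusters[nearest].append(tuple(v))
        # Re-center each cluster on the member closest to the cluster mean.
        new_centers, stds = [], []
        for old_center, members in zip(centers, clusters):
            if not members:
                new_centers.append(old_center)
                stds.append(0.0)
                continue
            mu = mean_vector(members)
            new_centers.append(min(members, key=lambda v: math.dist(v, mu)))
            stds.append(cluster_std(members, mu))
        unchanged = new_centers == centers
        centers = new_centers
        if all(s < std_threshold for s in stds) or unchanged:
            break
    return centers


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = cluster_word_vectors(points, [(0, 0), (10, 10)], std_threshold=2.0)
```

With these well-separated sample points the loop stops after the first round, because both clusters already have a standard deviation below the threshold.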
Further, each preset approval degree in the preset approval degree set is converted into a word vector, the Euclidean distance between the cluster analysis result and the word vector corresponding to each preset approval degree is calculated, and the preset approval degree whose word vector has the smallest Euclidean distance is taken as the approval degree corresponding to the effective evaluation.
The approval degree refers to the preference and approval attitude toward the work expressed in the user evaluation, and may be set according to actual requirements, which is not specifically limited herein.
In this embodiment, keywords contained in an effective evaluation are extracted by a preset word segmentation mode, and the keywords are trained in a word vector mode to obtain the corresponding spatial word vectors. The spatial word vectors are then clustered with the K-Means clustering algorithm to obtain a cluster analysis result, the Euclidean distance between the cluster analysis result and each preset approval degree in the preset approval degree set is calculated, and the preset approval degree with the smallest Euclidean distance is taken as the approval degree corresponding to the effective evaluation. Converting user evaluations into spatial word vectors and clustering them thus yields the user emotion contained in each effective evaluation, that is, the approval degree of the work, which realizes intelligent analysis of effective evaluations, speeds up that analysis, and improves the efficiency of work evaluation.
It should be understood that the sequence numbers of the steps in the foregoing embodiment do not imply an execution order; the execution order of each process should be determined by its function and internal logic, and the sequence numbers should not limit the implementation process of the embodiments of the present invention in any way.
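Selecting the approval degree by minimum Euclidean distance can be sketched as follows. The cluster analysis result consists of several cluster centers, and the text does not say how they are compared with a single preset vector; this sketch assumes the mean distance over all centers. The labels and vectors are hypothetical.

```python
import math


def nearest_approval_degree(cluster_centers, preset_vectors):
    """Return the preset approval degree whose vector is closest to the cluster centers.

    `preset_vectors` maps an approval-degree label to its word vector. Averaging the
    distance over all centers is an assumption of this sketch.
    """
    def mean_distance(vector):
        return sum(math.dist(center, vector) for center in cluster_centers) / len(cluster_centers)

    return min(preset_vectors, key=lambda label: mean_distance(preset_vectors[label]))


presets = {"like": (1.0, 0.0), "neutral": (0.5, 0.5), "dislike": (0.0, 1.0)}
degree = nearest_approval_degree([(0.9, 0.1), (0.8, 0.2)], presets)
```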
Fig. 3 shows a schematic block diagram of a self-media pushing device in one-to-one correspondence with the self-media pushing method of the above embodiment. As shown in fig. 3, the self-media pushing device includes a data acquisition module 31, a data evaluation module 32, an index determination module 33, and a target recommendation module 34. The functional modules are described in detail as follows:
the data acquisition module 31 is configured to acquire public opinion data of a work of the target self-media on each basic platform, and obtain public opinion data of the target self-media on each basic platform;
the data evaluation module 32 is configured to analyze each piece of public opinion data to obtain an analysis result, determine a target classification crowd corresponding to the target self-media based on the analysis result, and determine a comprehensive score of the target self-media based on a preset score weight corresponding to each basic platform;
an index determination module 33 for determining a marketing index of the target self-media based on the composite score;
the target recommendation module 34 is configured to determine a target platform to be recommended from each basic platform based on the marketing index, and recommend the target self-media to the target classification crowd in the target platform.
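The comprehensive score handled by the data evaluation module 32 can be illustrated as a weighted combination of per-platform scores. This passage gives no formula, so the normalized weighted sum, the platform names, and the values below are assumptions for illustration only.

```python
def composite_score(platform_scores, platform_weights):
    """Weighted average of per-platform scores using preset platform weights."""
    total_weight = sum(platform_weights[name] for name in platform_scores)
    if total_weight == 0:
        raise ValueError("platform weights must not sum to zero")
    weighted = sum(platform_scores[name] * platform_weights[name] for name in platform_scores)
    return weighted / total_weight


# Hypothetical per-platform scores and preset scoring weights.
scores = {"platform_a": 80.0, "platform_b": 60.0}
weights = {"platform_a": 0.75, "platform_b": 0.25}
overall = composite_score(scores, weights)
```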
Optionally, the data acquisition module 31 includes:
the resource positioning unit is used for acquiring a uniform resource locator corresponding to each basic platform;
the page determining unit is used for carrying out crawling analysis on the page files corresponding to the uniform resource locators in a web crawler mode aiming at each basic platform to obtain the page files of the target self-media corresponding works as target pages;
the data crawling unit is used for extracting public opinion information related to the target self-media corresponding works from contents contained in the target pages in a fuzzy matching mode aiming at each basic platform, and taking the public opinion information as the public opinion data of the target self-media on the basic platform.
Optionally, the data evaluation module 32 includes:
the interactive score evaluation unit is used for carrying out statistical weighting on the interactive data according to the preset weight of each type of interactive data to obtain a first score, wherein the interactive data comprises at least one of praise, collection, browsing and forwarding;
the quality score evaluation unit is used for analyzing the content of the works, scoring the quality of the works according to the analysis result and obtaining a second score;
the comment score evaluation unit is used for carrying out semantic recognition on the comment data and scoring according to the obtained semantic recognition result to obtain a third score;
and the result generation unit is used for determining evaluation information of the public opinion data based on the first score, the second score and the third score as an analysis result.
Optionally, the comment score evaluation unit includes:
the effective comment screening subunit is used for selecting the user evaluations with the same number as the preset threshold value as the effective evaluation of the user name if the number of the user evaluations with the user name exceeds the preset threshold value, and taking the user evaluation with each user name as an effective evaluation if the number of the user evaluations with the user name does not exceed the preset threshold value;
the semantic analysis subunit is used for carrying out evaluation emotion analysis on each effective evaluation in a semantic analysis mode to obtain the corresponding approval degree of each effective evaluation;
and the score evaluation subunit is used for comprehensively evaluating the corresponding approval degree of each effective evaluation according to a preset evaluation mode to obtain a third score.
Optionally, the semantic analysis subunit comprises:
the word segmentation extraction component is used for extracting keywords contained in the effective comments by adopting a preset word segmentation mode;
the word vector generation component is used for training the keywords in a word vector mode to obtain space word vectors corresponding to the keywords;
the word segmentation and clustering component is used for carrying out clustering analysis on the space word vectors based on a K-Means clustering algorithm to obtain a clustering analysis result;
and the acceptance degree calculation component is used for calculating the Euclidean distance between the clustering analysis result and each preset acceptance degree in the preset acceptance degree set, and taking the preset acceptance degree with the smallest Euclidean distance value as the acceptance degree corresponding to the effective evaluation.
For specific limitations of the self-media pushing device, reference may be made to the limitations of the self-media pushing method above, which are not repeated here. The modules in the self-media pushing device described above may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in or independent of a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor may call and execute the operations corresponding to the above modules.
In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 4, fig. 4 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 4 comprises a memory 41, a processor 42, and a network interface 43 communicatively connected to each other via a system bus. It is noted that only a computer device 4 having the components memory 41, processor 42, and network interface 43 is shown in the figure, but it should be understood that not all of the illustrated components are required to be implemented, and that more or fewer components may be implemented instead. It will be appreciated by those skilled in the art that the computer device herein is a device capable of automatically performing numerical calculations and/or information processing in accordance with predetermined or stored instructions, the hardware of which includes, but is not limited to, microprocessors, application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), digital signal processors (Digital Signal Processor, DSP), embedded devices, etc.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 41 includes at least one type of readable storage medium including flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or a memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card, or the like provided on the computer device 4. Of course, the memory 41 may also comprise both an internal storage unit of the computer device 4 and an external storage device. In this embodiment, the memory 41 is typically used for storing an operating system and various application software installed on the computer device 4, such as program codes for controlling electronic files, etc. Further, the memory 41 may be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute a program code stored in the memory 41 or process data, such as a program code for executing control of an electronic file.
The network interface 43 may comprise a wireless network interface or a wired network interface, which network interface 43 is typically used for establishing a communication connection between the computer device 4 and other electronic devices.
The present application also provides another embodiment, namely, a computer readable storage medium, where an interface display program is stored, where the interface display program is executable by at least one processor, so that the at least one processor performs the steps of the self-media pushing method as described above.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method described in the embodiments of the present application.
It is apparent that the embodiments described above are only some, not all, of the embodiments of the present application. The drawings show preferred embodiments of the present application but do not limit its patent scope. This application may be embodied in many different forms; rather, these embodiments are provided so that the present disclosure will be understood more thoroughly. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their technical features. All equivalent structures made using the specification and drawings of the present application, applied directly or indirectly in other related technical fields, likewise fall within the protection scope of the present application.

Claims (10)

1. A self-media pushing method, characterized in that the self-media pushing method comprises:
collecting public opinion data of works of target self-media on each basic platform to obtain the public opinion data of the target self-media on each basic platform;
analyzing each piece of public opinion data to obtain an analysis result, determining a target classification crowd corresponding to the target self-media based on the analysis result, and determining the comprehensive score of the target self-media based on a preset scoring weight corresponding to each basic platform;
determining a marketing index of the target self-media according to the comprehensive score;
and determining a target platform to be recommended from each basic platform based on the marketing index, and recommending the target self-media to target classified crowd in the target platform.
2. The self-media pushing method of claim 1, wherein the comprehensive score is represented in the form of a multi-dimensional score graph.
3. The self-media pushing method as claimed in claim 1, wherein the collecting public opinion data of the target self-media work on each base platform, and obtaining the public opinion data of the target self-media work on each base platform comprises:
acquiring a uniform resource locator corresponding to each basic platform;
aiming at each basic platform, carrying out crawling analysis on the page files corresponding to the uniform resource locators in a web crawler mode to obtain the page files of the target self-media corresponding works as target pages;
and extracting public opinion information related to the target self-media corresponding works from contents contained in the target page in a fuzzy matching mode aiming at each basic platform, and taking the public opinion information as public opinion data of the target self-media on the basic platform.
4. The self-media pushing method of any of claims 1 to 3, wherein the public opinion data comprises at least one of interactive data, work content, and comment data.
5. The self-media pushing method of claim 4, wherein analyzing each piece of public opinion data to obtain an analysis result comprises:
according to the preset weight of each interactive data, carrying out statistical weighting on the interactive data to obtain a first score, wherein the interactive data comprises at least one of praise, collection, browsing and forwarding;
analyzing the content of the work, and grading the quality of the work according to the analysis result to obtain a second score;
carrying out semantic recognition on the comment data, and scoring according to the obtained semantic recognition result to obtain a third score;
and determining the evaluation information of the public opinion data based on the first score, the second score and the third score as the analysis result.
6. The self-media pushing method of claim 5, wherein said performing semantic recognition on the comment data and scoring based on the obtained semantic recognition result, the obtaining a third score comprising:
for the same user name, if the number of user evaluations corresponding to the user name exceeds a preset threshold, selecting the user evaluations with the same number as the preset threshold as effective evaluations of the user name, and if the number of user evaluations corresponding to the user name does not exceed the preset threshold, taking each user evaluation corresponding to the user name as one effective evaluation;
carrying out evaluation emotion analysis on each effective evaluation by adopting a semantic analysis mode to obtain the corresponding approval degree of each effective evaluation;
and comprehensively evaluating the corresponding approval degree of each effective evaluation according to a preset evaluation mode to obtain the third score.
7. The method of claim 6, wherein said performing, by means of semantic analysis, an emotion analysis on each of said valid evaluations to obtain a corresponding approval level for each of said valid evaluations comprises:
extracting keywords contained in the effective comments by adopting a preset word segmentation mode;
training the keywords in a word vector mode to obtain space word vectors corresponding to the keywords;
performing cluster analysis on the space word vectors based on a K-Means clustering algorithm to obtain a cluster analysis result;
and calculating the Euclidean distance between the clustering analysis result and each preset approval degree in a preset approval degree set, and taking the preset approval degree with the smallest Euclidean distance value as the approval degree corresponding to the effective evaluation.
8. A self-media pushing device, the self-media pushing device comprising:
the data acquisition module is used for acquiring public opinion data of works of target self-media on each basic platform to obtain the public opinion data of the target self-media on each basic platform;
the data evaluation module is used for analyzing each piece of public opinion data to obtain an analysis result, determining a target classification crowd corresponding to the target self-media based on the analysis result, and determining the comprehensive score of the target self-media based on a preset score weight corresponding to each basic platform;
the index determining module is used for determining the marketing index of the target self-media according to the comprehensive score;
and the target recommendation module is used for determining a target platform to be recommended from each basic platform based on the marketing index, and recommending the target self-media to target classified crowd in the target platform.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the self-media push method as claimed in any of claims 1 to 7 when the computer program is executed by the processor.
10. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the self-media push method of any of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110741715.2A CN113569118B (en) 2021-06-30 2021-06-30 Self-media pushing method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113569118A CN113569118A (en) 2021-10-29
CN113569118B true CN113569118B (en) 2023-12-22

Similar Documents

Publication Publication Date Title
CN112632385B (en) Course recommendation method, course recommendation device, computer equipment and medium
CN110162593B (en) Search result processing and similarity model training method and device
CN107679039B (en) Method and device for determining statement intention
CN110032632A (en) Intelligent customer service answering method, device and storage medium based on text similarity
CN113822067A (en) Key information extraction method and device, computer equipment and storage medium
WO2020237856A1 (en) Smart question and answer method and apparatus based on knowledge graph, and computer storage medium
CN111813905B (en) Corpus generation method, corpus generation device, computer equipment and storage medium
CN113392209B (en) Text clustering method based on artificial intelligence, related equipment and storage medium
CN106294618A (en) Searching method and device
CN113569118B (en) Self-media pushing method, device, computer equipment and storage medium
CN111538931A (en) Big data-based public opinion monitoring method and device, computer equipment and medium
CN111625715B (en) Information extraction method and device, electronic equipment and storage medium
CN111325030A (en) Text label construction method and device, computer equipment and storage medium
CN112926308B (en) Method, device, equipment, storage medium and program product for matching text
CN112085091B (en) Short text matching method, device, equipment and storage medium based on artificial intelligence
CN114780746A (en) Knowledge graph-based document retrieval method and related equipment thereof
CN115203421A (en) Method, device and equipment for generating label of long text and storage medium
CN109271624A (en) Target word determination method, apparatus and storage medium
CN115130601A (en) Two-stage academic data webpage classification method and system based on multi-dimensional feature fusion
CN110019763B (en) Text filtering method, system, equipment and computer readable storage medium
CN114328800A (en) Text processing method and device, electronic equipment and computer readable storage medium
CN113626704A (en) Method, device and equipment for recommending information based on word2vec model
CN117312535A (en) Method, device, equipment and medium for processing problem data based on artificial intelligence
CN104408036A (en) Correlated topic recognition method and device
CN114647739B (en) Entity linking method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant