CN116484101A - Information pushing method and device - Google Patents

Info

Publication number
CN116484101A
Authority
CN
China
Prior art keywords
classification
article
information
user
articles
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310474733.8A
Other languages
Chinese (zh)
Inventor
罗秉安
朱乐和
张扬
洪欢江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202310474733.8A
Publication of CN116484101A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides an information pushing method and device that can be used in the financial field or other fields. The method comprises the following steps: acquiring article information of a batch of articles, and determining the primary classification of each article according to the article information of the article; determining the secondary classification of each article by applying the article information to the secondary classification model corresponding to the primary classification of the article, wherein each secondary classification model is obtained in advance by training a text classification algorithm on the historical article information of a batch of historical articles that have the same primary classification as the model, together with their actual secondary classification labels; and determining the users corresponding to each article according to a preset classification and user relationship table and the primary classification and secondary classification of each article, and pushing the article information of each article to the terminal devices of the corresponding users. The method and the device can improve the match between pushed information and users, meet the requirements of different users, and enable users to quickly obtain the information they follow.

Description

Information pushing method and device
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to an information pushing method and apparatus.
Background
At present, in the age of information explosion, a mass of information is generated every day. Taking the banking industry as an example, in the digital transformation stage, hot-spot information about the industry's technologies and services is published every day on numerous media platforms (including websites, blogs, official accounts and the like). However, this information is currently unorganized and not reasonably classified, so a user cannot quickly obtain the information that matters to his or her current work.
Disclosure of Invention
In view of at least one problem in the prior art, the application provides an information pushing method and device, which can improve the match between pushed information and users, meet the requirements of different users, and enable users to quickly obtain the information they follow.
In order to solve the above technical problems, the application provides the following technical solutions:
In a first aspect, the present application provides an information pushing method, including:
acquiring article information of a batch of articles, and determining the primary classification of each article according to the article information of the article;
determining the secondary classification of each article by applying the article information to the secondary classification model corresponding to the primary classification of the article, wherein each secondary classification model is obtained in advance by training a text classification algorithm on the historical article information of a batch of historical articles that have the same primary classification as the model, together with their actual secondary classification labels;
and determining the users corresponding to each article according to a preset classification and user relationship table and the primary classification and secondary classification of each article, and pushing the article information of each article to the terminal devices of the corresponding users.
In one embodiment, before determining the secondary classification of each article by applying the corresponding secondary classification model and the article information, the method further includes:
acquiring a training set for each primary classification, wherein each training set comprises: the historical article information of a batch of historical articles that share that primary classification, and their corresponding actual secondary classification labels;
and training the text classification algorithm with each training set to obtain the secondary classification model corresponding to each primary classification.
In one embodiment, determining the users corresponding to each article according to the preset classification and user relationship table and the primary classification and secondary classification of each article, and pushing the article information of each article to the terminal devices of the corresponding users, includes:
generating, from the article information of the articles that have the same primary classification and the same secondary classification, an article information document corresponding to that primary classification and secondary classification;
and determining the users corresponding to each primary classification and secondary classification according to the preset classification and user relationship table, and pushing the article information document to the email boxes of the corresponding users.
In one embodiment, acquiring article information of a batch of articles includes:
crawling article information of a batch of articles from a target website and from the recommended websites on the pages of the target website.
In one embodiment, determining the primary classification of each article according to the article information of the article includes:
obtaining the keywords corresponding to each primary classification;
and screening the article information against the keywords corresponding to each primary classification to determine the primary classification of each article.
In one embodiment, training the text classification algorithm with each training set to obtain the secondary classification model corresponding to each primary classification includes:
performing word segmentation on the historical article information of the batch of historical articles to obtain word vectors of each historical article;
extracting features based on the word vectors of each historical article to obtain key information of each historical article;
and training the text classification algorithm with the key information of each historical article and its corresponding actual secondary classification label to obtain the secondary classification model.
In one embodiment, before determining the users corresponding to each article according to the preset classification and user relationship table and the primary classification and secondary classification of each article, and pushing the article information of each article to the terminal devices of the corresponding users, the method further includes:
receiving an article subscription request, the article subscription request comprising: a unique identification of a user, and a primary classification and a secondary classification selected by the user;
and storing the unique identification of the user and the primary classification and secondary classification selected by the user in the classification and user relationship table.
In a second aspect, the present application provides an information pushing device, including:
an acquisition module, configured to acquire article information of a batch of articles and determine the primary classification of each article according to the article information of the article;
an application module, configured to determine the secondary classification of each article by applying the article information to the secondary classification model corresponding to the primary classification of the article, wherein each secondary classification model is obtained in advance by training a text classification algorithm on the historical article information of a batch of historical articles that have the same primary classification as the model, together with their actual secondary classification labels;
and a pushing module, configured to determine the users corresponding to each article according to the preset classification and user relationship table and the primary classification and secondary classification of each article, and to push the article information of each article to the terminal devices of the corresponding users.
In one embodiment, the information pushing device further includes:
a training set acquisition module, configured to acquire a training set for each primary classification, wherein each training set comprises: the historical article information of a batch of historical articles that share that primary classification, and their corresponding actual secondary classification labels;
and a training module, configured to train the text classification algorithm with each training set to obtain the secondary classification model corresponding to each primary classification.
In one embodiment, the pushing module includes:
a document generation unit, configured to generate, from the article information of the articles that have the same primary classification and the same secondary classification, an article information document corresponding to that primary classification and secondary classification;
and a pushing unit, configured to determine the users corresponding to each primary classification and secondary classification according to the preset classification and user relationship table, and to push the article information document to the email boxes of the corresponding users.
In one embodiment, the acquisition module includes:
a crawling unit, configured to crawl article information of a batch of articles from a target website and from the recommended websites on the pages of the target website.
In one embodiment, the acquisition module includes:
a keyword acquisition unit, configured to obtain the keywords corresponding to each primary classification;
and a screening unit, configured to screen the article information against the keywords corresponding to each primary classification to determine the primary classification of each article.
In one embodiment, the training module includes:
a word segmentation unit, configured to perform word segmentation on the historical article information of the batch of historical articles to obtain word vectors of each historical article;
a feature extraction unit, configured to extract features based on the word vectors of each historical article to obtain key information of each historical article;
and a training unit, configured to train the text classification algorithm with the key information of each historical article and its corresponding actual secondary classification label to obtain the secondary classification model.
In one embodiment, the information pushing device further includes:
a subscription module, configured to receive an article subscription request, the article subscription request comprising: a unique identification of a user, and a primary classification and a secondary classification selected by the user;
and a storage module, configured to store the unique identification of the user and the primary classification and secondary classification selected by the user in the classification and user relationship table.
In a third aspect, the present application provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the information pushing method when executing the program.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon computer instructions that, when executed, implement the information pushing method.
As can be seen from the above technical solutions, the present application provides an information pushing method and device. The method comprises the following steps: acquiring article information of a batch of articles, and determining the primary classification of each article according to the article information of the article; determining the secondary classification of each article by applying the article information to the secondary classification model corresponding to the primary classification of the article, wherein each secondary classification model is obtained in advance by training a text classification algorithm on the historical article information of a batch of historical articles that have the same primary classification as the model, together with their actual secondary classification labels; and determining the users corresponding to each article according to a preset classification and user relationship table and the primary classification and secondary classification of each article, and pushing the article information of each article to the terminal devices of the corresponding users. The method can improve the match between pushed information and users, meet the requirements of different users, and enable users to quickly obtain the information they follow. Specifically, the latest information can be crawled automatically, and the information articles can be classified automatically by an artificial intelligence algorithm, so that subscribers are quickly provided with the article information they need; a unified information aggregation point is also established, so that information can be consulted on the aggregated official account at any time, without reading multiple official accounts or websites one by one.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a first flow diagram of an information pushing method in an embodiment of the present application;
Fig. 2 is a second flow diagram of an information pushing method in an embodiment of the present application;
Fig. 3 is a third flow diagram of an information pushing method in an embodiment of the present application;
Fig. 4 is a flow diagram of an information pushing method in an application example of the present application;
Fig. 5 is a first structural schematic diagram of an information pushing device in an embodiment of the present application;
Fig. 6 is a second structural schematic diagram of an information pushing device in an embodiment of the present application;
Fig. 7 is a schematic block diagram of a system configuration of an electronic device according to an embodiment of the present application.
Detailed Description
In order to better understand the technical solutions in the present specification, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
It should be noted that, the information pushing method and device disclosed in the present application may be used in the financial technical field, and may also be used in any field other than the financial technical field, and the application field of the information pushing method and device disclosed in the present application is not limited. In the technical schemes of the embodiments of the application, the acquisition, storage, use, processing and the like of the data all conform to relevant regulations of laws and regulations.
The following examples are presented in detail.
In order to improve the match between pushed information and users and meet the requirements of different users, so that users can quickly obtain the information they follow, this embodiment provides an information pushing method whose execution subject is an information pushing device, including but not limited to a server. As shown in fig. 1, the method specifically includes the following:
Step 100: acquiring article information of a batch of articles, and determining the primary classification of each article according to the article information of the article.
Specifically, hot-spot information articles may be acquired periodically from a plurality of media platforms. The article information may include a web page address and article content, and the article content may include a title, an abstract, a body (including text and pictures), and the like. Taking banking as an example, the primary classifications may include a technology class and a business class.
Step 200: determining the secondary classification of each article by applying the article information to the secondary classification model corresponding to the primary classification of the article, wherein each secondary classification model is obtained in advance by training a text classification algorithm on the historical article information of a batch of historical articles that have the same primary classification as the model, together with their actual secondary classification labels.
Specifically, the primary classifications and the secondary classification models may be in one-to-one correspondence; a secondary classification represents a subclass under a primary classification, for example, as shown in table 1:
TABLE 1
Step 300: determining the users corresponding to each article according to a preset classification and user relationship table and the primary classification and secondary classification of each article, and pushing the article information of each article to the terminal devices of the corresponding users.
Specifically, the preset classification and user relationship table may include a correspondence between each primary classification, each secondary classification, and a unique identifier of the user; the preset classification and user relationship table may be stored locally on the information pushing device; the terminal equipment can be a desktop computer, a notebook computer, a mobile phone and the like of the user.
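The lookup in step 300 can be sketched as follows. This is a minimal illustration only, assuming the relationship table is held in memory as a mapping from a (primary classification, secondary classification) pair to a set of user identifiers; the names `relation_table` and `users_for_article`, the example addresses, and the "new product" subclass are hypothetical and do not come from the application.

```python
from collections import defaultdict

# Hypothetical in-memory form of the classification and user relationship table:
# (primary classification, secondary classification) -> set of user identifiers.
relation_table = defaultdict(set)
relation_table[("technology class", "new technology application")].update(
    {"alice@example.com", "bob@example.com"}
)
relation_table[("business class", "new product")].add("carol@example.com")


def users_for_article(article, table):
    """Return the users subscribed to the article's (primary, secondary) pair."""
    key = (article["primary"], article["secondary"])
    # .get() avoids creating empty entries for categories nobody subscribed to.
    return sorted(table.get(key, set()))


article = {
    "url": "https://example.com/a/1010",
    "title": "Research and exploration of blockchain",
    "primary": "technology class",
    "secondary": "new technology application",
}
recipients = users_for_article(article, relation_table)
print(recipients)
```

An article whose category has no subscribers simply yields an empty recipient list, so nothing is pushed for it.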
To further improve the reliability of training the secondary classification models, as shown in fig. 2, in one embodiment, before step 200, the method further includes:
Step 021: obtaining a plurality of training sets, each training set comprising: the historical article information of a batch of historical articles and their corresponding actual secondary classification labels, wherein the historical articles in the same training set have the same primary classification, and the historical articles in different training sets have different primary classifications.
Step 022: training the text classification algorithm with each training set to obtain the secondary classification model of each primary classification.
Specifically, the text classification algorithm may be the TextCNN classification algorithm, which learns classification rules by computing over and analyzing a training set of known classes and then predicts the class of new data.
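The structure of steps 021 and 022, one training set and one model per primary classification, can be sketched as below. This is not TextCNN: the classifier here is a deliberately simple word-count stand-in so the example runs with the standard library alone, and the class `KeywordCountClassifier`, the function names, the sample texts, and every secondary label other than "new technology application" are hypothetical.

```python
from collections import Counter, defaultdict


class KeywordCountClassifier:
    """Stand-in for TextCNN: scores each label by word counts seen in training."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        return self

    def predict(self, text):
        words = text.lower().split()
        scores = {
            label: sum(counts[w] for w in words)
            for label, counts in self.word_counts.items()
        }
        return max(scores, key=scores.get)


def train_secondary_models(training_sets):
    """training_sets maps a primary classification to (texts, secondary labels)."""
    return {
        primary: KeywordCountClassifier().fit(texts, labels)
        for primary, (texts, labels) in training_sets.items()
    }


# Each training set holds only historical articles of one primary classification.
training_sets = {
    "technology class": (
        ["blockchain cross-border finance", "data quality governance platform"],
        ["new technology application", "data management"],
    ),
    "business class": (
        ["new credit card product launch", "branch service hours notice"],
        ["new product", "customer service"],
    ),
}
models = train_secondary_models(training_sets)


def classify_secondary(primary, text):
    """Step 200: pick the model by primary classification, then predict."""
    return models[primary].predict(text)


print(classify_secondary("technology class", "blockchain research in finance"))
```

Keeping a separate model per primary classification means each model only has to distinguish the subclasses of its own parent class.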
To help users quickly obtain the information they follow and browse the article information conveniently, in one embodiment, step 300 includes:
Step 301: generating, from the article information of the articles that have the same primary classification and the same secondary classification, an article information document corresponding to that primary classification and secondary classification.
Specifically, the article information document may include the article information of all articles under that primary classification and secondary classification. For example, the 10 articles classified as "technology class" - "new technology application" are gathered into one summary document, which contains each article's address link, title, abstract, original text and pictures, and the like.
Step 302: determining the users corresponding to each primary classification and secondary classification according to the preset classification and user relationship table, and pushing the article information document to the email boxes of the corresponding users.
For example, according to the preset classification and user relationship table, all users corresponding to the category "technology class" - "new technology application" may be determined, and the article information document for that category is sent to the email boxes of all of those users.
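Steps 301 and 302 can be sketched as follows, assuming plain-text documents and a relationship table keyed by (primary, secondary) pairs. The function names, the sender address, and the sample data are hypothetical; the sketch only builds the messages with the standard `email.message.EmailMessage` class and leaves actual delivery to the caller.

```python
from collections import defaultdict
from email.message import EmailMessage


def build_documents(articles):
    """Group classified articles by (primary, secondary) and render one document each."""
    grouped = defaultdict(list)
    for a in articles:
        grouped[(a["primary"], a["secondary"])].append(a)
    documents = {}
    for key, items in grouped.items():
        entries = [f"{a['title']}\n{a['url']}\n{a['abstract']}" for a in items]
        documents[key] = "\n\n".join(entries)
    return documents


def build_emails(documents, relation_table, sender="digest@example.com"):
    """Create one message per (category, subscribed user); sending is not done here."""
    messages = []
    for (primary, secondary), body in documents.items():
        for user in sorted(relation_table.get((primary, secondary), ())):
            msg = EmailMessage()
            msg["From"] = sender
            msg["To"] = user
            msg["Subject"] = f"{primary} - {secondary}"
            msg.set_content(body)
            messages.append(msg)
    return messages


articles = [
    {"title": "Blockchain in cross-border finance", "url": "https://example.com/1",
     "abstract": "Survey.", "primary": "technology class",
     "secondary": "new technology application"},
    {"title": "Chaos testing a distributed core", "url": "https://example.com/2",
     "abstract": "Practice notes.", "primary": "technology class",
     "secondary": "new technology application"},
]
relation_table = {
    ("technology class", "new technology application"):
        {"alice@example.com", "bob@example.com"},
}
documents = build_documents(articles)
messages = build_emails(documents, relation_table)
print(len(messages), messages[0]["Subject"])
```

Because both sample articles share one category, they end up in a single document that is addressed once to each subscriber of that category.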
In order to improve the efficiency of obtaining article information, in one embodiment, acquiring article information of a batch of articles in step 100 includes: crawling article information of a batch of articles from a target website and from the recommended websites on the pages of the target website.
Specifically, the target websites may be a plurality of websites preselected according to actual conditions. The recommended websites on the pages of a target website, i.e. its sibling websites, can also be obtained automatically and their information crawled as well; the sibling websites are added to the information sources, and unsuitable websites can be manually screened out later.
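How sibling websites might be discovered from a target page can be sketched as below. The application does not say how recommended websites are identified; this sketch assumes, purely for illustration, that they appear as ordinary links pointing to a different host. It parses an inline HTML sample with the standard `html.parser` module and performs no network access; `LinkCollector`, `recommended_sites`, and all URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def recommended_sites(page_url, html):
    """Return the distinct external sites linked from the page (assumed 'recommended')."""
    parser = LinkCollector()
    parser.feed(html)
    own_host = urlparse(page_url).netloc
    sites = set()
    for href in parser.hrefs:
        parsed = urlparse(urljoin(page_url, href))
        if parsed.scheme in ("http", "https") and parsed.netloc != own_host:
            sites.add(f"{parsed.scheme}://{parsed.netloc}")
    return sorted(sites)


sample_html = """
<a href="/news/1001.html">Article</a>
<a href="https://sibling.example.org/home">Recommended site</a>
<a href="mailto:editor@target.example.com">Contact</a>
"""
# The information source starts with the manually chosen target site,
# then grows with the sibling sites found on its pages.
sources = {"https://target.example.com"}
sources.update(recommended_sites("https://target.example.com/index.html", sample_html))
print(sorted(sources))
```

The resulting source list would still need the manual review described above, since an external link is not necessarily a suitable information source.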
To improve the efficiency of primary classification, in one embodiment, determining the primary classification of each article according to the article information of the article in step 100 includes:
Step 101: obtaining the keywords corresponding to each primary classification.
Step 102: screening the article information against the keywords corresponding to each primary classification to determine the primary classification of each article.
Specifically, the primary classification of each article may be obtained in a rule-based manner. For example, articles whose content contains the fields "new service" and "new product" are screened out and assigned to the business class, and articles whose content contains the fields "technology" and "digitization" are screened out and assigned to the technology class.
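The rule-based screening of steps 101 and 102 can be sketched as follows. The keyword lists reuse the examples given above; treating a single keyword hit as sufficient, and letting the first matching class win, are assumptions of this sketch, as are the names `PRIMARY_KEYWORDS` and `primary_classification`.

```python
# Step 101: keywords per primary classification (taken from the examples above).
PRIMARY_KEYWORDS = {
    "business class": ["new service", "new product"],
    "technology class": ["technology", "digitization"],
}


def primary_classification(article_content, keywords=PRIMARY_KEYWORDS):
    """Step 102: return the first primary classification whose keyword appears.

    Assumption: any single keyword hit is enough, and classes are checked in
    dictionary order. Returns None when no keyword matches.
    """
    text = article_content.lower()
    for primary, words in keywords.items():
        if any(word in text for word in words):
            return primary
    return None


print(primary_classification("The bank launches a new product for retirees"))
print(primary_classification("Digitization of core banking systems"))
```

Articles that match no keyword return `None` and could be set aside for manual review rather than being forced into a class.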
To further improve the reliability of model training, in one embodiment, step 022 includes:
Step 0221: performing word segmentation on the historical article information of the batch of historical articles to obtain word vectors of each historical article.
Step 0222: extracting features based on the word vectors of each historical article to obtain the key information of each historical article.
Step 0223: training the text classification algorithm with the key information of each historical article and its corresponding actual secondary classification label to obtain the secondary classification model.
Specifically, the article content may be segmented into words, and word vectors may be constructed from the segmented words, so that natural language is digitized for subsequent processing. The TextRank algorithm may be used to extract key information from the article content, that is, keywords or key phrases are extracted from the text, and key sentences of the text are extracted using an extractive automatic summarization method. The extracted key information may be as shown in table 2:
TABLE 2
Original text | Key information
Machine learning: artificial intelligence how to enable financial practices | Machine learning artificial intelligence finance
Cryptographic techniques: the more developed, the more secure the financial data | Cryptographic financial data
The point of view: distributed core system chaos test exploration and practice | Distributed chaos test
The construction of new financial systems in new times of assistance by adhering to the drive of technological innovation | Science and technology innovation finance
Innovative development of finance assistance technology | Innovative finance and technology
Research and exploration of blockchain in cross-border financial field | Blockchain cross-border finance
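Keyword extraction of the kind shown in table 2 can be sketched with a small TextRank-style ranking: words that co-occur within a sliding window are linked in an undirected graph, and a PageRank-style iteration scores them. This is a simplified illustration, assuming whitespace-separated English text and a tiny hand-written stop-word list; Chinese article text would first need a word segmenter, and the function name, window size, and sample sentence are hypothetical.

```python
from collections import defaultdict

STOPWORDS = {"and", "of", "in", "the", "a", "to", "how"}


def textrank_keywords(text, top_k=3, window=2, damping=0.85, iterations=50):
    """Rank words by a PageRank-style score over a co-occurrence graph."""
    words = [w.strip(".,:;").lower() for w in text.split()]
    words = [w for w in words if w and w not in STOPWORDS]

    # Link each word to the words that follow it within the window.
    neighbors = defaultdict(set)
    for i, w in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != w:
                neighbors[w].add(other)
                neighbors[other].add(w)

    # Each word repeatedly receives an equal share of its neighbors' scores.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


title = (
    "Research and exploration of blockchain in cross-border finance: "
    "blockchain settlement and blockchain identity"
)
keywords = textrank_keywords(title)
print(keywords)
```

In this sample the word "blockchain" is linked to every other word, so it receives the highest score; the remaining ranks depend on the window size and stop-word list.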
To improve the flexibility with which users subscribe to article information, as shown in fig. 3, in one embodiment, before step 300, the method further includes:
Step 301: receiving an article subscription request, the article subscription request comprising: a unique identification of the user, and a primary classification and a secondary classification selected by the user.
Specifically, an article subscription request sent by a terminal device of a user may be received. The primary classification and secondary classification are those the user selects through the terminal device, that is, the primary classification and secondary classification of the article information the user wishes to follow. The unique identification of the user distinguishes different users and may be a character string of letters and digits, such as an identity card number or an email address.
Step 302: storing the unique identification of the user and the primary classification and secondary classification selected by the user in the classification and user relationship table.
Specifically, the user can customize the article categories to follow. For example, if the user's current work concerns a new technology research direction such as blockchain, the user may choose to follow "technology class" - "new technology application", and the user's mailbox address is saved.
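Receiving and storing a subscription can be sketched as follows, assuming the request has already been parsed into its three fields and that the user's email address serves as the unique identification. The class names `SubscriptionRequest` and `RelationTable` and the sample address are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SubscriptionRequest:
    """The three fields carried by an article subscription request."""
    user_id: str      # unique identification, here assumed to be an email address
    primary: str      # primary classification selected by the user
    secondary: str    # secondary classification selected by the user


class RelationTable:
    """Hypothetical in-memory classification and user relationship table."""

    def __init__(self):
        self._users = defaultdict(set)

    def subscribe(self, request):
        # A set keeps repeated requests from creating duplicate entries.
        self._users[(request.primary, request.secondary)].add(request.user_id)

    def users(self, primary, secondary):
        return sorted(self._users.get((primary, secondary), ()))


table = RelationTable()
request = SubscriptionRequest(
    "alice@example.com", "technology class", "new technology application"
)
table.subscribe(request)
table.subscribe(request)  # repeated request, stored once
print(table.users("technology class", "new technology application"))
```

A persistent deployment would keep the same three columns in a database table instead of a dictionary.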
In order to further explain the scheme, the application example of the information pushing method is provided, personalized hot spot information collection based on a TextCNN model is performed, various hot spot information articles of a media platform are automatically crawled, the content of the articles is automatically classified and integrated by using the TextCNN model, and the information is automatically released to a stem system person, as shown in fig. 4, specifically described as follows:
Step S01: define the information sources. Define the website addresses from which information is to be obtained. A series of fixed websites, such as "finance computers" and "banking networks", is specified manually. During the crawling in step S02, the system can also automatically collect the websites recommended on pages of the fixed websites, that is, sibling websites of the specified websites from which related information can likewise be obtained. These sibling websites are added to the information sources, and unsuitable ones can be screened out manually later.
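The discovery of recommended (sibling) websites can be sketched as follows, using only the Python standard library. The function name `discover_sibling_sites`, the sample HTML, and the example hosts are assumptions for illustration; the sketch parses an already-downloaded page and performs no network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the <a href="..."> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page address.
            self.links.append(urljoin(self.base_url, href))


def discover_sibling_sites(page_html, page_url, known_hosts):
    """Return hosts linked from the page that are not yet information sources."""
    collector = LinkCollector(page_url)
    collector.feed(page_html)
    found = []
    for link in collector.links:
        host = urlparse(link).netloc
        if host and host not in known_hosts and host not in found:
            found.append(host)
    return found


sample_html = """
<a href="/news/1001.html">Internal article</a>
<a href="https://partner.example.org/index.html">Recommended site</a>
<a href="https://partner.example.org/about.html">Same site again</a>
"""
candidates = discover_sibling_sites(
    sample_html, "https://source.example.com/", {"source.example.com"}
)
```

The returned hosts are only candidates; as the step describes, unsuitable ones would still be screened out manually.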
Step S02: crawl the original information. Information on all articles under each media platform in the information sources is obtained by automatic daily crawling. The articles obtained from one website are shown in Table 3 below:
Table 3
No. | Article title
1001 | Case-data analysis platform design
1002 | Direct broadcast-foreign resource line alternate mine treading, information protection is developed at the first place
1003 | * And the application scene of the application in the financial field is wide in an incoming RPA enterprise
1004 | The data quality has a plurality of outstanding problems, and the bank insurance mechanism is notified by the silver insurance prison
1005 | * Successful online transaction processing capability improvement of new generation core system of farm banking
1006 | * Compliance response concept for personal financial information protection
1007 | * Strategic investment in state information will deepen financial technology collaboration
1008 | Report interpretation-event draws attention, study bureau vs. algorithm steady coin development study
1009 | * Bank planning annual end of time pushing out scrip carbon credit
1010 | Research and exploration of blockchain in the field
Step S03: integrate the original information. Records of the crawled information, including the web page address, title, abstract, and body of each article (text and pictures), are stored in a local database.
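A minimal sketch of the local storage in step S03, assuming SQLite as the local database. The table name `article`, its columns, and the example URL are hypothetical; the application does not name a database product or schema.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the hypothetical local article database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS article (
               url TEXT PRIMARY KEY,
               title TEXT NOT NULL,
               abstract TEXT,
               body TEXT
           )"""
    )
    return conn


def save_article(conn, url, title, abstract, body):
    """Store one crawled record; a URL seen on an earlier crawl is skipped."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO article (url, title, abstract, body) "
            "VALUES (?, ?, ?, ?)",
            (url, title, abstract, body),
        )


conn = open_store()
save_article(conn, "https://source.example.com/1010",
             "Research and exploration of blockchain in the field",
             "abstract text", "body text")
# The daily crawl may revisit the same page; the duplicate is ignored.
save_article(conn, "https://source.example.com/1010", "duplicate", "", "")
stored = conn.execute("SELECT title FROM article").fetchall()
```

Using the web page address as the primary key is one simple way to keep repeated daily crawls from storing the same article twice.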
Step S04: define the information categories. Information categories are specified according to the current working direction. Take as an example a two-level classification of the work-related information that banking staff follow: the first level is divided into a technology class and a business class, and under the technology class the second level can be divided into technical transformation, creation, digitization, new technology application, and the like.
Step S05: preprocess the information, specifically:
The first step: directly remove unnecessary information such as "recruitment", "personnel", "bidding", and "ticket", so that content unrelated to the current work focus is screened out.
The second step: label the data set. A subset of the data is selected, with one part used as the training set and the other part as the test set, and the category of each article is defined manually. For example, the article "Research and exploration of blockchain in the field" is assigned to "technology class" - "new technology application".
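The two preprocessing steps can be sketched as below. The excluded terms come from the text above; the function names, the title-only matching, the 20% test ratio, and the fixed random seed are assumptions for illustration.

```python
import random

# Terms named in step S05; matching on titles only is an assumption.
EXCLUDED_TERMS = ("recruitment", "personnel", "bidding", "ticket")


def drop_irrelevant(articles):
    """First step: remove articles whose title contains an excluded term."""
    return [
        a for a in articles
        if not any(term in a["title"].lower() for term in EXCLUDED_TERMS)
    ]


def split_labeled(samples, test_ratio=0.2, seed=0):
    """Second step: shuffle labeled samples and split into train and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


articles = [
    {"title": "Recruitment notice for 2023"},
    {"title": "Research and exploration of blockchain in the field"},
]
kept = drop_irrelevant(articles)

# Manually assigned labels: (article text, (primary, secondary)).
labeled = [
    (f"article {i}", ("technology", "new technology application"))
    for i in range(10)
]
train_set, test_set = split_labeled(labeled)
```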
Step S06: automatically classify the information.
Two layers of models are established according to the classification standard: the first-layer model determines the major category (primary classification) to which an article belongs, and the second-layer model determines the secondary classification within that major category.
The first-layer model mainly determines the major category of an article. Because the subjects of this classification are relatively fixed, a rule-based approach may be used. For example, "new business" and "new product" are assigned to the business class, and "technology" and "digitization" are assigned to the technology class.
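A rule-based first-layer model of this kind can be sketched as a keyword count. The keywords are the ones given above; the function name `classify_primary`, the scoring by occurrence count, and the tie-breaking by dictionary order are assumptions, since the application does not specify the rule details.

```python
# Keyword rules taken from the examples in the text; labels are illustrative.
PRIMARY_RULES = {
    "business": ("new business", "new product"),
    "technology": ("technology", "digitization"),
}


def classify_primary(text, rules=PRIMARY_RULES):
    """Return the primary class whose keywords occur most often, or None."""
    lowered = text.lower()
    scores = {
        category: sum(lowered.count(keyword) for keyword in keywords)
        for category, keywords in rules.items()
    }
    best = max(scores, key=scores.get)  # ties resolve to the first rule listed
    return best if scores[best] > 0 else None


primary = classify_primary("Digitization and blockchain technology in banking")
```

Articles that match no rule return `None` here; how such articles are handled is not described in the source and is left open.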
The second-layer model subdivides each major category. The specific steps are as follows:
The first step: segment the article content into words. Word segmentation allows word vectors to be constructed, converting natural language into numerical form for subsequent processing.
The second step: extract features. This application example uses TextRank to extract the key information of the article content, that is, keywords or key phrases are extracted from the text using an extractive automatic summarization method.
The third step: classify. This application example uses the TextCNN classification algorithm, which learns the classification rules from a training set with known categories through computation and analysis and predicts the category of new data. The main steps are:
1) Convolve the word vectors obtained in the first step. Convolution is a mathematical operation used here to extract features of the text; a new matrix is obtained after the convolution operation.
2) Apply max-pooling to the result, taking the maximum from a group of values. This keeps the main features while reducing the number of parameters, which speeds up computation and lowers the risk of overfitting.
3) Perform K-class classification with a softmax layer. The max-pooling results are concatenated and fed into the softmax layer to obtain the probability of each category. During training, a loss function is computed from the predicted and actual labels, the gradients of the network parameters to be updated are obtained, and the parameters in the above steps are updated in turn, which completes one round of training.
The trained model is then used to classify the article information.
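The three TextCNN stages described above (convolution, max-pooling, softmax) can be sketched as a forward pass in plain Python. This is an illustrative sketch, not the application's implementation: the embedding size, filter heights, ReLU activation, dense layer before the softmax, and random weights are all assumptions, and the gradient-based parameter update is omitted.

```python
import math
import random


def conv1d_relu(seq, kernel, bias):
    """1) Slide one filter of height len(kernel) over the word-vector sequence."""
    height = len(kernel)
    outputs = []
    for start in range(len(seq) - height + 1):
        total = bias
        for i in range(height):
            total += sum(w * x for w, x in zip(kernel[i], seq[start + i]))
        outputs.append(max(0.0, total))  # ReLU activation (an assumption)
    return outputs


def softmax(logits):
    """Numerically stable softmax over K class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def textcnn_forward(seq, filters, dense_w, dense_b):
    # 2) Max-pooling: keep one maximum per filter, then concatenate.
    pooled = [max(conv1d_relu(seq, kernel, bias)) for kernel, bias in filters]
    # 3) Dense layer on the pooled features, then softmax over K classes.
    logits = [
        sum(w * p for w, p in zip(row, pooled)) + b
        for row, b in zip(dense_w, dense_b)
    ]
    return softmax(logits)


def cross_entropy(probs, label):
    """Loss between the predicted distribution and the actual label index."""
    return -math.log(probs[label])


rng = random.Random(0)
EMBED_DIM, NUM_CLASSES = 4, 3


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


sentence = rand_matrix(6, EMBED_DIM)  # six tokens, one word vector each
filters = [(rand_matrix(h, EMBED_DIM), 0.0) for h in (2, 3, 4)]
dense_w = rand_matrix(NUM_CLASSES, len(filters))
dense_b = [0.0] * NUM_CLASSES

probs = textcnn_forward(sentence, filters, dense_w, dense_b)
loss = cross_entropy(probs, label=1)
```

In practice a deep-learning framework would compute the gradients of `loss` and update the filter and dense-layer weights; the sketch stops at the loss so that it stays dependency-free.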
Step S07: subscribe to category information. Information subscribers customize the article categories they need to follow. For example, if a subscriber's current work concerns a new-technology research direction such as blockchain, the subscriber can choose to follow the "technology class" - "new technology application" category, and the subscriber's email address is saved.
Step S08: integrate the information. The articles classified in step S06 are automatically integrated by category. For example, if 10 articles are classified as "technology class" - "new technology application" on a given day, those 10 articles are compiled into one summary document, and the entry for each article includes its address link, title, abstract, and original text-and-picture content.
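The integration step can be sketched as grouping classified articles by their category pair and rendering one plain-text summary per group. The dictionary keys (`title`, `url`, `abstract`, `primary`, `secondary`), the text layout, and the example URLs are assumptions for illustration.

```python
from collections import defaultdict


def build_summary_documents(articles):
    """Group classified articles by (primary, secondary) and render summaries."""
    grouped = defaultdict(list)
    for article in articles:
        grouped[(article["primary"], article["secondary"])].append(article)

    documents = {}
    for (primary, secondary), items in grouped.items():
        lines = [f"{primary} - {secondary}: {len(items)} article(s)"]
        for article in items:
            lines.append(
                f"{article['title']}\n  {article['url']}\n  {article['abstract']}"
            )
        documents[(primary, secondary)] = "\n".join(lines)
    return documents


classified = [
    {"title": "A", "url": "https://source.example.com/a", "abstract": "aa",
     "primary": "technology", "secondary": "new technology application"},
    {"title": "B", "url": "https://source.example.com/b", "abstract": "bb",
     "primary": "technology", "secondary": "new technology application"},
    {"title": "C", "url": "https://source.example.com/c", "abstract": "cc",
     "primary": "business", "secondary": "new product"},
]
summary_docs = build_summary_documents(classified)
```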
Step S09: automatically push the information. According to the categories each information subscriber has subscribed to, the integrated summary documents of those categories are sent to the subscriber by email every day. In addition, an "information summary" public account is established with the same two-level classification, and each sub-classification holds the integrated summary documents of the corresponding category, so that people can quickly review the information of the categories they are interested in.
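Preparing the daily emails can be sketched with the standard-library `email.message.EmailMessage` class. The function name `build_digest_emails`, the sender address, and the subject format are hypothetical; actual delivery (for example through an SMTP server) is deliberately left out.

```python
from email.message import EmailMessage


def build_digest_emails(documents, subscribers, sender="digest@example.com"):
    """Create one message per (category, subscriber) pair; nothing is sent."""
    messages = []
    for category, body in documents.items():
        for address in subscribers.get(category, ()):
            message = EmailMessage()
            message["From"] = sender
            message["To"] = address
            message["Subject"] = f"Daily digest: {category[0]} - {category[1]}"
            message.set_content(body)
            messages.append(message)
    return messages


category = ("technology", "new technology application")
documents = {category: "10 articles classified today"}
subscribers = {category: ["alice@example.com", "bob@example.com"]}
outbox = build_digest_emails(documents, subscribers)
```

Categories without subscribers produce no messages, so only subscribed summary documents reach each mailbox.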
In order to improve the matching degree between the pushed information and the user and meet the requirements of different users, so that the user can quickly obtain the information focused by himself, the application provides an embodiment of an information pushing device for implementing all or part of the content in the information pushing method, referring to fig. 5, the information pushing device specifically includes the following contents:
The acquiring module 10 is configured to acquire article information of a batch of articles, and determine the primary classification of each article according to the article information of the article;
the application module 20 is configured to apply a secondary classification model corresponding to the primary classification of each article and the article information, determine a secondary classification of the article, where each secondary classification model is obtained by training in advance a text classification algorithm according to the historical article information of a batch of historical articles with the same primary classification as the secondary classification model and their corresponding actual secondary classification labels;
and the pushing module 30 is configured to determine a user corresponding to each article according to a preset classification and user relationship table, and the first class classification and the second class classification of each article, and push article information of each article to a terminal device of the corresponding user.
As shown in fig. 6, in one embodiment, the information pushing device further includes:
the training set acquisition module 40 is configured to acquire a training set corresponding to each primary classification, where each training set includes: historical article information of a batch of historical articles whose primary classification is the same as that of the training set, and their corresponding actual secondary classification labels;
The training module 50 is configured to apply each training set to train the text classification algorithm to obtain secondary classification models corresponding to each of the primary classifications.
In one embodiment, the pushing module includes:
the document generation unit is used for generating article information documents corresponding to the primary classification and the secondary classification according to the article information of the articles with the same primary classification and the same secondary classification;
and the pushing unit is used for determining the users corresponding to each primary classification and each secondary classification according to the preset classification and user relation table and pushing the article information document to the corresponding electronic mailbox of the user.
In one embodiment, the acquisition module includes:
and the crawling unit is used for crawling article information of the batch of articles from the target website and the recommended website on the page of the target website.
In one embodiment, the acquisition module includes:
the keyword acquisition unit is used for acquiring keywords corresponding to each primary classification;
and the screening unit is used for screening according to the keywords corresponding to each primary classification and the article information to determine the primary classification of each article.
In one embodiment, the training module comprises:
The word segmentation unit is used for carrying out word segmentation processing by applying the history article information of the batch of history articles to obtain word vectors of each history article;
the feature extraction unit is used for extracting features based on word vectors of each history article to obtain key information of each history article;
and the training unit is used for training the text classification algorithm by applying the key information of each historical article and the corresponding actual secondary classification labels thereof to obtain the secondary classification model.
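The application example names TextRank for the feature extraction step. A simplified keyword-level TextRank can be sketched as follows, assuming the text has already been segmented into tokens; the window size, damping factor, iteration count, and unweighted co-occurrence graph are assumptions, not values taken from the application.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=30, top_k=3):
    """Rank words by a PageRank-style score on a co-occurrence graph."""
    # Undirected graph: words appearing within `window` positions are linked.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        # Each word receives an equal share of every neighbor's score.
        scores = {
            word: (1 - damping) + damping * sum(
                scores[n] / len(neighbors[n]) for n in neighbors[word]
            )
            for word in neighbors
        }
    # Highest score first; alphabetical order breaks ties deterministically.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


tokens = ["blockchain", "bank", "blockchain", "finance",
          "blockchain", "payment", "blockchain", "settlement"]
top_keyword = textrank_keywords(tokens, window=2, top_k=1)
```

In this toy input every other word co-occurs only with "blockchain", so it forms the hub of the graph and receives the highest score.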
In one embodiment, the information pushing device further includes:
the subscription module is used for receiving an article subscription request, and the article subscription request comprises: a unique identification of a user, a primary classification and a secondary classification selected by the user;
and the storage module is used for storing the unique identification of the user, the primary classification and the secondary classification selected by the user in the classification and user relationship table.
The embodiment of the information pushing device provided in the present disclosure may be specifically used to execute the processing flow of the embodiment of the information pushing method. Its functions are not described again here; refer to the detailed description of the embodiment of the information pushing method.
In order to further explain the scheme, in combination with the application example of the information pushing method, the application example of the information pushing device is provided, and the specific description is as follows:
The acquisition module: obtains various hot-spot articles from the media platforms by crawling and aggregates the downloaded original information data.
The application module: preprocesses the aggregated original information and then automatically classifies the information data by applying a TextRank + TextCNN model.
The pushing module: automatically pushes the classified information to the information subscribers of the corresponding categories.
In order to improve matching degree between pushed information and users and meet requirements of different users in terms of hardware level, so that the users can quickly obtain information focused on themselves, the application provides an embodiment of electronic equipment for realizing all or part of contents in the information pushing method, and the electronic equipment specifically comprises the following contents:
a processor (processor), a memory (memory), a communication interface (Communications Interface), and a bus; the processor, the memory and the communication interface complete communication with each other through the bus; the communication interface is used for realizing information transmission between the information pushing device and related equipment such as a user terminal; the electronic device may be a desktop computer, a tablet computer, a mobile terminal, etc., and the embodiment is not limited thereto. In this embodiment, the electronic device may be implemented with reference to an embodiment for implementing the information pushing method and an embodiment for implementing the information pushing apparatus according to the embodiments, and the contents thereof are incorporated herein, and are not repeated here.
Fig. 7 is a schematic block diagram of a system configuration of an electronic device 9600 of an embodiment of the present application. As shown in fig. 7, the electronic device 9600 may include a central processor 9100 and a memory 9140; the memory 9140 is coupled to the central processor 9100. Notably, this fig. 7 is exemplary; other types of structures may also be used in addition to or in place of the structures to implement telecommunications functions or other functions.
In one or more embodiments of the present application, the information push functionality may be integrated into the central processor 9100. The central processor 9100 may be configured to perform the following control:
step 100: acquiring article information of a batch of articles, and determining the primary classification of each article according to the article information of the article;
step 200: applying a secondary classification model corresponding to the primary classification of each article and the article information to determine the secondary classification of the article, wherein each secondary classification model is obtained by training a text classification algorithm in advance according to the historical article information of the batch of historical articles with the same primary classification as the secondary classification model and the corresponding actual secondary classification labels;
step 300: and determining the corresponding users of each article according to a preset classification and user relation table, the primary classification and the secondary classification of each article, and pushing the article information of each article to the corresponding terminal equipment of the user.
From the above description, the electronic device provided by the embodiment of the present application can improve the matching degree between the pushed information and the user, and satisfy the requirements of different users, so that the user can quickly obtain the information focused by himself.
In another embodiment, the information pushing device may be configured separately from the central processor 9100, for example, the information pushing device may be configured as a chip connected to the central processor 9100, and the information pushing function is implemented by control of the central processor.
As shown in fig. 7, the electronic device 9600 may further include: a communication module 9110, an input unit 9120, an audio processor 9130, a display 9160, and a power supply 9170. It is noted that the electronic device 9600 need not include all of the components shown in fig. 7; in addition, the electronic device 9600 may further include components not shown in fig. 7, and reference may be made to the related art.
As shown in fig. 7, the central processor 9100, sometimes referred to as a controller or operational control, may include a microprocessor or other processor device and/or logic device, which central processor 9100 receives inputs and controls the operation of the various components of the electronic device 9600.
The memory 9140 may be, for example, one or more of a buffer, a flash memory, a hard drive, a removable media, a volatile memory, a non-volatile memory, or other suitable device. The information about failure may be stored, and a program for executing the information may be stored. And the central processor 9100 can execute the program stored in the memory 9140 to realize information storage or processing, and the like.
The input unit 9120 provides input to the central processor 9100. The input unit 9120 is, for example, a key or a touch input device. The power supply 9170 is used to provide power to the electronic device 9600. The display 9160 is used for displaying display objects such as images and characters. The display may be, for example, but not limited to, an LCD display.
The memory 9140 may be a solid state memory such as a Read Only Memory (ROM), a Random Access Memory (RAM), a SIM card, or the like. It may also be a memory that retains information even when powered off, can be selectively erased, and can be provided with further data; an example of such a memory is sometimes referred to as an EPROM. The memory 9140 may also be some other type of device. The memory 9140 includes a buffer memory 9141 (sometimes referred to as a buffer). The memory 9140 may include an application/function storage portion 9142, which stores application programs and function programs or a flow for executing operations of the electronic device 9600 by the central processor 9100.
The memory 9140 may also include a data store 9143, the data store 9143 for storing data, such as contacts, digital data, pictures, sounds, and/or any other data used by an electronic device. The driver storage portion 9144 of the memory 9140 may include various drivers of the electronic device for communication functions and/or for performing other functions of the electronic device (e.g., messaging applications, address book applications, etc.).
The communication module 9110 is a transmitter/receiver 9110 that transmits and receives signals via an antenna 9111. A communication module (transmitter/receiver) 9110 is coupled to the central processor 9100 to provide input signals and receive output signals, as in the case of conventional mobile communication terminals.
Based on different communication technologies, a plurality of communication modules 9110, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, etc., may be provided in the same electronic device. The communication module (transmitter/receiver) 9110 is also coupled to a speaker 9131 and a microphone 9132 via an audio processor 9130 to provide audio output via the speaker 9131 and to receive audio input from the microphone 9132 to implement usual telecommunications functions. The audio processor 9130 can include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 9130 is also coupled to the central processor 9100 so that sound can be recorded locally through the microphone 9132 and sound stored locally can be played through the speaker 9131.
As can be seen from the above description, the electronic device provided by the embodiment of the present application can improve the matching degree between the pushed information and the user, and satisfy the requirements of different users, so that the user can quickly obtain the information focused by himself.
The embodiments of the present application also provide a computer-readable storage medium capable of implementing all the steps of the information pushing method in the above embodiments, the computer-readable storage medium storing thereon a computer program that, when executed by a processor, implements all the steps of the information pushing method in the above embodiments, for example, the processor implements the following steps when executing the computer program:
step 100: acquiring article information of a batch of articles, and determining the primary classification of each article according to the article information of the article;
step 200: applying a secondary classification model corresponding to the primary classification of each article and the article information to determine the secondary classification of the article, wherein each secondary classification model is obtained by training a text classification algorithm in advance according to the historical article information of the batch of historical articles with the same primary classification as the secondary classification model and the corresponding actual secondary classification labels;
step 300: and determining the corresponding users of each article according to a preset classification and user relation table, the primary classification and the secondary classification of each article, and pushing the article information of each article to the corresponding terminal equipment of the user.
As can be seen from the above description, the computer readable storage medium provided by the embodiments of the present application can improve the matching degree between the pushed information and the user, and satisfy the requirements of different users, so that the user can quickly obtain the information focused by himself.
The embodiments in this description are described in a progressive manner; identical and similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. For related details, see the description of the method embodiments.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principles and embodiments of the present application are described herein with reference to specific examples; the description of the above embodiments is intended only to aid understanding of the method of the present application and its core ideas. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (10)

1. An information pushing method is characterized by comprising the following steps:
acquiring article information of a batch of articles, and determining the primary classification of each article according to the article information of the article;
applying a secondary classification model corresponding to the primary classification of each article and the article information to determine the secondary classification of the article, wherein each secondary classification model is obtained by training a text classification algorithm in advance according to the historical article information of the batch of historical articles with the same primary classification as the secondary classification model and the corresponding actual secondary classification labels;
and determining the corresponding users of each article according to a preset classification and user relation table, the primary classification and the secondary classification of each article, and pushing the article information of each article to the corresponding terminal equipment of the user.
2. The method for pushing information according to claim 1, further comprising, before said applying the secondary classification model corresponding to the primary classification of each article and the article information to determine the secondary classification of the article:
acquiring a training set corresponding to each primary classification, wherein each training set comprises: historical article information of a batch of historical articles whose primary classification is the same as that of the training set, and their corresponding actual secondary classification labels;
And training the text classification algorithm by applying each training set to obtain secondary classification models corresponding to each type of primary classification.
3. The information pushing method according to claim 1, wherein the determining the user corresponding to each article according to the preset classification and user relationship table and the primary classification and secondary classification of each article, and pushing the article information of each article to the terminal device of the corresponding user, includes:
generating article information documents corresponding to the primary classification and the secondary classification according to the article information of the articles with the same primary classification and the same secondary classification;
and determining users corresponding to each primary classification and each secondary classification according to the preset classification and user relation table, and pushing the article information document to an email box of the corresponding user.
4. The method for pushing information according to claim 1, wherein the obtaining article information of a batch of articles includes:
article information of a batch of articles is crawled from a target website and a recommended website on a page of the target website.
5. The method for pushing information according to claim 1, wherein determining the primary classification of each article according to the article information of the article comprises:
obtaining keywords corresponding to each primary classification;
and screening according to the keywords corresponding to each primary classification and the article information to determine the primary classification of each article.
6. The information pushing method according to claim 2, wherein the training the text classification algorithm by applying each training set to obtain a secondary classification model corresponding to each of the primary classifications, respectively, includes:
word segmentation processing is carried out by applying the historical article information of the batch of historical articles to obtain word vectors of all the historical articles;
extracting features based on word vectors of each historical article to obtain key information of each historical article;
and training a text classification algorithm by applying key information of each historical article and corresponding actual secondary classification labels thereof to obtain the secondary classification model.
7. The information pushing method according to claim 1, wherein before determining the user corresponding to each article according to the preset classification and user relationship table and the primary classification and secondary classification of each article, and pushing the article information of each article to the terminal device of the corresponding user, the method further comprises:
Receiving an article subscription request, the article subscription request comprising: a unique identification of a user, a primary classification and a secondary classification selected by the user;
storing the unique identification of the user, the primary classification and the secondary classification selected by the user in the classification and user relationship table.
8. An information pushing apparatus, characterized by comprising:
the acquisition module is used for acquiring article information of a batch of articles and determining the primary classification of each article according to the article information of the article;
the application module is used for applying a secondary classification model corresponding to the primary classification of each article and the article information to determine the secondary classification of the article, and each secondary classification model is obtained by training a text classification algorithm in advance according to the historical article information of the batch of historical articles with the same primary classification as the secondary classification model and the corresponding actual secondary classification labels;
and the pushing module is used for determining the users corresponding to the articles according to the preset classification and user relation table, the primary classification and the secondary classification of the articles and pushing the article information of the articles to the terminal equipment of the corresponding users.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the information pushing method of any one of claims 1 to 7 when executing the program.
10. A computer-readable storage medium having stored thereon computer instructions which, when executed, implement the information pushing method of any one of claims 1 to 7.
CN202310474733.8A 2023-04-27 2023-04-27 Information pushing method and device Pending CN116484101A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310474733.8A CN116484101A (en) 2023-04-27 2023-04-27 Information pushing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310474733.8A CN116484101A (en) 2023-04-27 2023-04-27 Information pushing method and device

Publications (1)

Publication Number Publication Date
CN116484101A true CN116484101A (en) 2023-07-25

Family

ID=87211663

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310474733.8A Pending CN116484101A (en) 2023-04-27 2023-04-27 Information pushing method and device

Country Status (1)

Country Link
CN (1) CN116484101A (en)

Similar Documents

Publication Publication Date Title
US20240078386A1 (en) Methods and systems for language-agnostic machine learning in natural language processing using feature extraction
US20230222366A1 (en) Systems and methods for semantic analysis based on knowledge graph
US9449271B2 (en) Classifying resources using a deep network
CN112507116A (en) Customer portrait method based on customer response corpus and related equipment thereof
CN110516057B (en) Petition question answering method and device
CN110866110A (en) Conference summary generation method, device, equipment and medium based on artificial intelligence
CN111753551B (en) Information generation method and device based on word vector generation model
US11586618B2 (en) Query parsing from natural language questions supported by captured subject matter knowledge
CN110798567A (en) Short message classification display method and device, storage medium and electronic equipment
CN112417121A (en) Client intention recognition method and device, computer equipment and storage medium
CN111582314A (en) Target user determination method and device and electronic equipment
CN112434501A (en) Work order intelligent generation method and device, electronic equipment and medium
CN112766825A (en) Enterprise financial service risk prediction method and device
CN117114514A (en) Talent information analysis management method, system and device based on big data
CN107766498A (en) Method and apparatus for generating information
CN113283984A (en) Personal loan information input method and device
CN110008318A (en) Problem distributing method and device
CN117520503A (en) Financial customer service dialogue generation method, device, equipment and medium based on LLM model
CN110046233A (en) Problem distributing method and device
CN116484101A (en) Information pushing method and device
CN112019675A (en) Address list contact person sorting method and device and electronic equipment
CN111311197A (en) Travel data processing method and device
CN112445955A (en) Business opportunity information management method, system and storage medium
CN108520334A (en) A kind of occupation reference method and apparatus
CN113192511B (en) Information input method, information input device, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination