CN110162796B - News thematic creation method and device - Google Patents

Info

Publication number
CN110162796B
Authority
CN
China
Prior art keywords
news
information
event
topic
pieces
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910471260.XA
Other languages
Chinese (zh)
Other versions
CN110162796A (en
Inventor
张莹楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd filed Critical Advanced New Technologies Co Ltd
Priority to CN201910471260.XA priority Critical patent/CN110162796B/en
Publication of CN110162796A publication Critical patent/CN110162796A/en
Application granted granted Critical
Publication of CN110162796B publication Critical patent/CN110162796B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/258Heading extraction; Automatic titling; Numbering
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application aims to provide a news topic creation method and device, wherein the method comprises the following steps: obtaining a plurality of pieces of news information provided by an information provider, clustering the plurality of pieces of news information through a pre-trained information clustering model to obtain a plurality of information sets, wherein the plurality of pieces of news information included in each information set correspond to the same news event, and creating corresponding news topics for each news event according to the plurality of pieces of news information corresponding to each news event through a news topic creation system, wherein the news topics are used for reporting the corresponding news event. By the embodiment of the application, the efficiency of creating news topics can be improved.

Description

News topic creation method and device
Technical Field
The present application relates to the field of computers, and in particular, to a method and apparatus for creating news topics.
Background
At present, news topics are pushed to users mainly as follows: after news information is acquired from the various news providers, staff browse its content, manually screen out the items that meet preset content requirements, combine them into news topics, and push the topics to users. Each news topic reports one news event and comprises a topic name and a plurality of pieces of news information.
Therefore, the current practice of manually combining news information into news topics makes topic creation inefficient.
Disclosure of Invention
The embodiment of the application aims to provide a news topic creation method and device so as to improve the creation efficiency of news topics.
In order to solve the technical problems, the embodiment of the application is realized as follows:
the embodiment of the application provides a news topic creation method, which comprises the following steps:
acquiring a plurality of pieces of news information provided by an information provider, wherein news contents corresponding to news events are recorded in the news information;
clustering the news information through a pre-trained information clustering model to obtain a plurality of information sets; the news content of a plurality of pieces of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
creating corresponding news topics for each news event according to a plurality of pieces of news information corresponding to each news event through a news topic creation system; the news topics are used for reporting the corresponding news events; the news topics comprise topic names and topic information, and the topic information is obtained by determining a plurality of pieces of news information corresponding to news events reported by the news topics.
The embodiment of the application provides a news topic creation method, which comprises the following steps:
acquiring a plurality of pieces of news information provided by an information provider, wherein news contents corresponding to news events are recorded in the news information;
clustering the news information through a pre-trained information clustering model to obtain a plurality of information sets; the news content of a plurality of pieces of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
determining event heat of each news event according to a plurality of pieces of news information corresponding to each news event, and selecting a target event from each news event according to the event heat of each news event;
creating a news topic for the target event according to a plurality of pieces of news information corresponding to the target event by a news topic creation system; wherein the news topic is used for reporting the target event; the news topic comprises topic names and topic information, and the topic information is determined and obtained according to a plurality of news information corresponding to the target event.
The embodiment of the application provides a news topic creation device, which comprises:
The first acquisition module is used for acquiring a plurality of pieces of news information provided by the information provider, wherein the news information is recorded with news content corresponding to a news event;
the first clustering module is used for clustering the plurality of news information through a pre-trained information clustering model to obtain a plurality of information sets; the news content of a plurality of pieces of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
the first creating module is used for creating corresponding news topics for each news event according to a plurality of pieces of news information corresponding to each news event through the news topic creating system; the news topics are used for reporting the corresponding news events; the news topics comprise topic names and topic information, and the topic information is obtained by determining a plurality of pieces of news information corresponding to news events reported by the news topics.
The embodiment of the application provides a news topic creation device, which comprises:
the second acquisition module is used for acquiring a plurality of pieces of news information provided by the information provider, wherein the news information is recorded with news content corresponding to a news event;
The second clustering module is used for clustering the plurality of news information through a pre-trained information clustering model to obtain a plurality of information sets; the news content of a plurality of pieces of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
the selecting module is used for determining the event heat of each news event according to a plurality of pieces of news information corresponding to each news event, and selecting a target event from each news event according to the event heat of each news event;
the second creation module is used for creating news topics for the target event according to the plurality of pieces of news information corresponding to the target event through a news topic creation system; wherein the news topic is used for reporting the target event; the news topics comprise topic names and topic information, and the topic information is determined and obtained according to a plurality of pieces of news information corresponding to the target event.
The embodiment of the application provides news topic creation equipment, which comprises the following components: a processor; and a memory arranged to store computer executable instructions that, when executed, cause the processor to implement the steps of the news topic creation method described above.
The embodiments of the present application provide a storage medium storing computer-executable instructions that, when executed, implement the steps of the news topic creation method described above.
In the embodiment of the application, a plurality of pieces of news information provided by an information provider is firstly obtained, then the plurality of pieces of news information are clustered through a pre-trained information clustering model to obtain a plurality of information sets, news contents of the plurality of pieces of news information included in each information set correspond to the same news event, and finally a corresponding news topic is created for each news event according to the plurality of pieces of news information corresponding to each news event through a news topic creation system. Therefore, according to the embodiment of the application, the news topics can be automatically created through the information clustering model and the news topic creation system, so that the problem of low efficiency of manually creating the news topics is solved, and the creation efficiency of the news topics is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and that other drawings may be obtained according to these drawings without inventive effort to a person skilled in the art.
Fig. 1 is a schematic view of a scenario of a news topic creation method according to an embodiment of the present application;
FIG. 2 is a flowchart of a news topic creation method according to an embodiment of the present disclosure;
FIG. 3a is a schematic diagram of an interface of a news topic creation system according to an embodiment of the present application;
FIG. 3b is a schematic diagram illustrating an interface of a news topic creation system in accordance with another embodiment of the present application;
FIG. 3c is a schematic diagram illustrating an interface of a news topic creation system in accordance with another embodiment of the present application;
fig. 4 is an interface schematic diagram of a user terminal according to an embodiment of the present application;
fig. 5 is a flowchart of a news topic creating method according to another embodiment of the present application;
FIG. 6 is a schematic diagram of module components of a news topic creating apparatus according to an embodiment of the present application;
fig. 7 is a schematic block diagram of a news topic creating apparatus according to another embodiment of the present application;
fig. 8 is a schematic structural diagram of a news topic creating apparatus provided in an embodiment of the present application.
Detailed Description
In order to better understand the technical solutions in the present application, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, based on the embodiments herein, which would be apparent to one of ordinary skill in the art without undue burden are intended to be within the scope of the present application.
The embodiment of the application aims to provide a news topic creation method and device so as to improve the creation efficiency of news topics. The news topic creating method in the embodiment of the application can be executed by a news topic creating device (such as a background server).
Fig. 1 is a schematic view of a scenario of a news topic creation method according to an embodiment of the present application, as shown in fig. 1, where the scenario includes a user terminal and a news topic creation device, where the user terminal includes, but is not limited to, a tablet 101, a mobile phone 102, a desktop 103, and a notebook 104 as shown in fig. 1, and the news topic creation device includes, but is not limited to, a server 200 as shown in fig. 1. In this scenario, the news topic creation device may execute the news topic creation method provided in the embodiment of the present application to generate a news topic, and push the news topic to the user terminal, where the user terminal may display the news topic.
Fig. 2 is a flow chart of a news topic creation method according to an embodiment of the present application, as shown in fig. 2, the flow chart includes the following steps:
step S202, obtaining a plurality of pieces of news information provided by an information provider, wherein news contents corresponding to news events are recorded in the news information;
Step S204, clustering the news information by a pre-trained information clustering model to obtain a plurality of information sets; the news content of the plurality of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
step S206, creating corresponding news topics for each news event according to the plurality of news information corresponding to each news event by a news topic creation system; the news topics are used for reporting corresponding news events; the news topics comprise topic names and topic information, and the topic information is determined and obtained according to a plurality of pieces of news information corresponding to the news events reported by the news topics.
In the embodiment of the application, a plurality of pieces of news information provided by an information provider is firstly obtained, then the plurality of pieces of news information are clustered through a pre-trained information clustering model to obtain a plurality of information sets, news contents of the plurality of pieces of news information included in each information set correspond to the same news event, and finally a corresponding news topic is created for each news event according to the plurality of pieces of news information corresponding to each news event through a news topic creation system. Therefore, according to the embodiment of the application, the news topics can be automatically created through the information clustering model and the news topic creation system, so that the problem of low efficiency of manually creating the news topics is solved, and the creation efficiency of the news topics is improved.
In step S202, a plurality of pieces of news information provided by information providers are acquired, each recording news content corresponding to a news event. Specifically, the information providers may be news providers such as First Financial and Central News, and the news topic creation device acquires the news information provided by each of them. For example, if the news event is the National Day (October 1) military parade, news information A may record news content about the troops reviewed in the parade, and news information B may record news content about the weaponry reviewed in the parade.
In the step S204, the acquired news information is clustered by a pre-trained information clustering model to obtain a plurality of information sets. Wherein, news content of a plurality of news information included in each information set corresponds to the same news event, and news events corresponding to different information sets are different.
For example, ten pieces of news information are obtained in step S202: three record news content related to the National Day military parade, three record news content related to a summit, and the remaining four record news content related to Dragon Boat Festival celebrations. In step S204, the ten pieces of news information are clustered by the pre-trained information clustering model to obtain three information sets: the three pieces in the first set all record news content related to the National Day military parade, the three pieces in the second set all record news content related to the summit, and the four pieces in the third set all record news content related to the Dragon Boat Festival celebrations.
The specific process of step S204 may be:
(a1) Extracting information titles of all news information through a pre-trained information clustering model, and calculating semantic similarity among the information titles of all news information;
(a2) Clustering each piece of news information according to semantic similarity among information titles of each piece of news information through a pre-trained information clustering model to obtain a plurality of information sets.
Specifically, the pre-trained information clustering model can be implemented with a clustering algorithm such as Affinity Propagation. In this embodiment, the model extracts the information title of each piece of news information, segments each title into words, and determines a word vector for each resulting word. For each title, it derives a title vector from the word vectors of the words in that title, calculates the distances between titles from their vectors, and determines the semantic similarity between titles from those distances. Of course, the information clustering model may also determine the semantic similarity between the information titles through other existing algorithms, which is not limited herein.
After determining the semantic similarity between the information titles, the information clustering model can group titles with high semantic similarity into one class, and the cluster center can be adjusted as titles are added to the class. After the titles are clustered into a plurality of classes, the model groups the news information corresponding to each title in the same way, so that the news information is clustered into a plurality of information sets. In this embodiment, the information clustering model may also be implemented by other existing algorithms, which are not limited herein.
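The title-based clustering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented model: whitespace splitting stands in for word segmentation, bag-of-words counts stand in for word vectors, cosine similarity stands in for the distance measure, and a greedy threshold rule is swapped in for Affinity Propagation. The function names and the `threshold` parameter are hypothetical.

```python
import math
from collections import Counter


def title_vector(title):
    """Bag-of-words vector for a title (whitespace split stands in for word segmentation)."""
    return Counter(title.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def cluster_titles(titles, threshold=0.4):
    """Greedy clustering: join the most similar cluster, else start a new one."""
    clusters = []  # each cluster: {"center": Counter, "members": [title index, ...]}
    for i, title in enumerate(titles):
        vec = title_vector(title)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["center"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"center": Counter(vec), "members": [i]})
        else:
            best["members"].append(i)
            best["center"].update(vec)  # the center shifts as titles are added
    return [c["members"] for c in clusters]
```

Each returned list holds the indices of the titles in one information set, so the corresponding news information can be grouped the same way.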
One training method for the information clustering model is as follows: a large amount of sample information is prepared in advance and the information title of each sample is labeled; the titles are manually grouped into a plurality of classes, which in turn groups the samples into a plurality of classes; the sample information, the title of each sample, and the class assigned to each title are then input into a neural network model for training, and the neural network model trained on this sample data serves as the information clustering model.
Therefore, according to the embodiment, the acquired plurality of news information can be clustered through the pre-trained information clustering model to obtain a plurality of information sets, and each information set can be understood to correspond to one news event, so that the effect of event mining is achieved.
In this embodiment, after the plurality of information sets is obtained by clustering with the pre-trained information clustering model, an event name may also be generated for the news event corresponding to each information set. The process is as follows: for each information set, word segmentation is performed on the information titles of all news information in the set, and the event name of the news event corresponding to the set is determined from the words obtained by word segmentation.
For example, if two information sets are obtained by clustering, word segmentation is performed on the information title of each piece of news information in one information set, and the event name of the corresponding news event is determined from the resulting words; the same is then done for the other information set.
According to a plurality of words obtained by word segmentation, determining the event name of the news event corresponding to the information set can be: removing the words with repeated semantics from the words obtained by word segmentation, determining the parts of speech of the remaining words, wherein the parts of speech comprise nouns, verbs, pronouns, prepositions and the like, connecting the words according to the parts of speech of the remaining words and a preset part of speech connection rule to obtain at least one phrase, and selecting the event name of a news event corresponding to the information set from the at least one phrase according to the similarity of each phrase and each phrase in a first preset phrase library.
Each phrase in the first preset phrase library is the event name of a historical news event collected manually in advance. Because current news events tend to resemble historical ones, the similarity between each candidate phrase and each phrase in the first preset phrase library is calculated, and the phrase with the most natural and coherent wording is selected as the event name according to that similarity.
For example, for a certain information set, word segmentation of the information titles of the news information in the set yields the following words: October 1, National Day, military parade, weapons, introduction. First, words with repeated meaning are removed, for example "October 1" is deleted because it duplicates "National Day". Then the parts of speech of the remaining words are determined, and the words are connected according to their parts of speech and a preset part-of-speech connection rule to obtain two phrases, "National Day military parade introduction" and "weapons military parade National Day introduction". Finally, the similarity between each phrase and each phrase in the first preset phrase library is calculated. The maximum similarity obtained is 98%, and the phrase corresponding to it is "National Day military parade weapons introduction", which is taken as the event name of the news event corresponding to the information set.
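The event-naming procedure above can be sketched as follows. This is a hedged illustration only: the `SYNONYMS` table, the use of `difflib.SequenceMatcher` as the similarity measure, and trying every word order in place of the part-of-speech connection rule are all assumptions, since the text does not specify the rule or the similarity metric.

```python
from difflib import SequenceMatcher
from itertools import permutations

# Hypothetical synonym table used to drop semantically repeated words.
SYNONYMS = {"October 1": "National Day"}


def dedupe(words):
    """Map synonyms to one canonical word and keep first occurrences only."""
    seen, kept = set(), []
    for w in words:
        canon = SYNONYMS.get(w, w)
        if canon not in seen:
            seen.add(canon)
            kept.append(canon)
    return kept


def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for the unspecified measure."""
    return SequenceMatcher(None, a, b).ratio()


def pick_event_name(words, phrase_library):
    """Return the candidate phrase closest to any historical event name."""
    # Stand-in for the part-of-speech connection rule: try every word order.
    candidates = [" ".join(p) for p in permutations(dedupe(words))]
    return max(candidates, key=lambda c: max(similarity(c, h) for h in phrase_library))
```

Enumerating all orderings is only practical for a handful of words; a real connection rule would generate far fewer candidates.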
In this embodiment, the process of generating the event name of the news event corresponding to each information set may also be implemented through an information clustering model. Therefore, according to the embodiment, not only can a plurality of information sets be clustered, but also the event name of the news event corresponding to each information set can be generated, so that the effect of automatically and efficiently mining news events from news information is achieved.
In the step S206, a news topic creating system creates a corresponding news topic for each news event according to the plurality of news information corresponding to each news event. The thematic information is selected from a plurality of news information corresponding to the news event reported by the news themes. In this embodiment, a news topic creation system is provided, where the news topic creation system is running in a news topic creation device and is configured to execute step S206, and create a corresponding news topic for each news event according to a plurality of pieces of news information corresponding to each news event.
In step S206, corresponding news topics are created for each news event according to the plurality of news information corresponding to each news event, specifically:
(b1) Selecting information which is not recorded with preset sensitive words and has information heat greater than preset heat from a plurality of pieces of news information corresponding to each news event, and taking the selected information as thematic information of news themes for reporting the news event;
(b2) For each news event, performing word segmentation on the information title of each piece of topic information corresponding to the news event, selecting, from the words obtained by word segmentation, the words whose term frequency-inverse document frequency (TF-IDF) value is greater than a preset threshold, and generating the topic name of the news topic for reporting the news event according to the selected words;
(b3) For each news event, a news topic for reporting the news event is created based on the topic information for reporting the news topic for the news event and the topic name for reporting the news topic for the news event.
In action (b1) above, for each news event, information that contains no preset sensitive words is first selected from the plurality of pieces of news information corresponding to the news event (that is, the news information in the information set corresponding to the news event). In this process, both the title and the body text of the news information are screened, so that the selected information contains no preset sensitive words in either. Then, information whose information heat is greater than the preset heat is further selected from the screened news information, and the result is the topic information of the news topic for reporting the news event.
This step can be implemented by the news topic creation system. Specifically, for each piece of news information, the system determines its source, forwarding count, and read count, and determines its information heat from these three values. The system may store an information heat formula in advance, for example S = A×a1 + B×b1 + C×c1, where A is the score corresponding to the source of the news information (a score may be preset for each source, for example 10 points for First Financial and 10 points for Central News), a1 is the preset weight coefficient for the source, B is the forwarding count of the news information, b1 is the preset weight coefficient for the forwarding count, C is the read count of the news information, and c1 is the preset weight coefficient for the read count. In this embodiment, the news topic creation system can calculate the information heat of each piece of news information with this formula.
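The screening in (b1) and the heat formula S = A×a1 + B×b1 + C×c1 can be sketched as follows. The weight values, the `NewsItem` fields, and the function names are assumptions made for illustration; the text only gives 10 points as an example source score and does not state any weights.

```python
from dataclasses import dataclass

# Hypothetical source scores and weights (a1, b1, c1); not taken from the source text.
SOURCE_SCORES = {"First Financial": 10, "Central News": 10}
A1, B1, C1 = 1.0, 0.5, 0.01


@dataclass
class NewsItem:
    title: str
    body: str
    source: str
    forwards: int  # forwarding count (B)
    reads: int     # read count (C)


def information_heat(item):
    """S = A*a1 + B*b1 + C*c1, with A the source score (0 for unknown sources)."""
    a = SOURCE_SCORES.get(item.source, 0)
    return a * A1 + item.forwards * B1 + item.reads * C1


def select_topic_items(items, sensitive_words, min_heat):
    """Keep items with no sensitive word in title or body and heat above min_heat."""
    clean = [
        it for it in items
        if not any(w in it.title or w in it.body for w in sensitive_words)
    ]
    return [it for it in clean if information_heat(it) > min_heat]
```

Filtering on sensitive words first means the heat is only computed for items that can still be published.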
In action (b2) above, for each news event, word segmentation is performed on the information title of each piece of topic information corresponding to the news event, the words whose TF-IDF (term frequency-inverse document frequency) value is greater than a preset threshold are selected from the words obtained by word segmentation, and the topic name of the news topic for reporting the news event is generated according to the selected words.
For example, two news events are shared, word segmentation is performed on the information title of each piece of topic information corresponding to one news event, and the TF-IDF value of each word obtained by the word segmentation is calculated, so that among a plurality of words obtained by the word segmentation, a word with the TF-IDF value larger than a preset threshold value is selected as a target word, and a topic name for reporting the news topic of the news event is generated according to the target word. Similarly, for another news event, word segmentation is performed on the information title of each piece of topic information corresponding to the news event, and the TF-IDF value of each word obtained by word segmentation is calculated, so that among a plurality of words obtained by word segmentation, a word with the TF-IDF value larger than a preset threshold value is selected as a target word, and a topic name for reporting the news topic of the news event is generated according to the target word.
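The target-word selection can be sketched as follows. The text does not say which corpus the inverse document frequency is computed over, so this sketch assumes term frequency is counted within one event's titles and document frequency over the titles of all events; the smoothed IDF variant, the whitespace tokenization, and the function name are likewise assumptions.

```python
import math
from collections import Counter


def tfidf_keywords(event_titles, all_titles, threshold):
    """Return words from one event's titles whose TF-IDF exceeds threshold."""
    event_words = [w for t in event_titles for w in t.lower().split()]
    tf = Counter(event_words)
    total = len(event_words)
    docs = [set(t.lower().split()) for t in all_titles]
    selected = []
    for word, count in tf.items():
        df = sum(1 for d in docs if word in d)       # titles containing the word
        idf = math.log(len(docs) / (1 + df)) + 1     # smoothed variant (an assumption)
        if (count / total) * idf > threshold:
            selected.append(word)
    return selected
```

Words that appear in every event's titles get a low IDF and fall below the threshold, which leaves the words that distinguish this event.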
Generating the topic name of the news topic for reporting the news event according to the target words specifically comprises: removing semantically duplicate words from the target words; determining the part of speech of each remaining word, where the parts of speech include nouns, verbs, pronouns, prepositions and the like; connecting the remaining words according to their parts of speech and a preset part-of-speech connection rule to obtain at least one phrase; and selecting the topic name of the news topic for reporting the news event from the phrases according to the similarity between each phrase and the phrases in a second preset phrase library.
Each phrase in the second preset phrase library is the topic name of a historical news topic collected manually in advance. Considering that current news topics resemble historical news topics, the similarity between each candidate phrase and each phrase in the second preset phrase library is calculated, and the phrase whose wording is more common and more logically sound is selected as the topic name according to the similarity. The process of generating the topic name is similar to the process of generating the event name described above and is not repeated here.
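The final selection step can be sketched as below. The text does not specify the similarity measure or the exact selection rule, so this sketch makes two assumptions: similarity is a character-level ratio from Python's standard `difflib.SequenceMatcher`, and the candidate with the highest best-match similarity against the historical library is chosen. The function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List


def similarity(a: str, b: str) -> float:
    # Character-level similarity in [0, 1]; a stand-in for a semantic measure.
    return SequenceMatcher(None, a, b).ratio()


def pick_topic_name(candidates: List[str], history_names: List[str]) -> str:
    """Pick the candidate phrase closest to any historical topic name."""
    def best_match(phrase: str) -> float:
        return max(similarity(phrase, name) for name in history_names)

    return max(candidates, key=best_match)
```

A candidate that reads like an existing topic name scores close to 1, while an unnatural word order scores lower, which matches the stated goal of preferring phrases with reasonable language logic.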
In the above action (b3), for each news event, the news topic creation system creates a news topic for reporting the news event based on the topic information and the topic name determined for that news topic. The news topic includes at least the topic name and the topic information. The news topic creation system may also display each created news topic for review by editors.
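As a minimal sketch of what a created news topic might look like as a data structure, the following assumes a hypothetical `NewsTopic` record holding only the two fields the text requires, with topic information represented by information titles for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NewsTopic:
    name: str               # topic name
    information: List[str]  # topic information (represented here by titles)


def create_news_topic(name: str, information: List[str]) -> NewsTopic:
    """Create a news topic; both a name and at least one item are required."""
    if not name:
        raise ValueError("a news topic needs a topic name")
    if not information:
        raise ValueError("a news topic needs at least one piece of topic information")
    return NewsTopic(name=name, information=list(information))
```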
Fig. 3a is a schematic diagram of an interface of a news topic creation system according to an embodiment of the present application. As shown in fig. 3a, the plurality of information sets obtained by clustering in step S204 are displayed in the form of news events, and the event name corresponding to each information set is displayed. Of course, in this embodiment, the "create topic" button may be omitted, and the news topic creation system may automatically create a corresponding news topic for the news event.
Fig. 3b is an interface schematic diagram of a news topic creation system according to another embodiment of the present application. After a user clicks the "create topic" button, the news topic creation system automatically creates a news topic, and the created news topic is shown in fig. 3b. Fig. 3c is an interface schematic diagram of a news topic creation system according to another embodiment of the present application. As shown in fig. 3a and fig. 3c, if a user clicks the event name of a news event in fig. 3a, the interface jumps to fig. 3c, and the plurality of pieces of news information corresponding to the news event are displayed.
In this embodiment, the news topic creation system may also display the event popularity of each news event, as shown in fig. 3a; the method for calculating the event popularity of a news event is described later. Fig. 3a also shows the reading amount of each news event, which is the sum of the reading amounts of the pieces of news information corresponding to the news event.
In this embodiment, after creating each news topic, the news topic creation system may also generate a corresponding topic description for the news topic, as shown in fig. 3b. The topic description may be written manually or generated by the news topic creation system based on the information titles of the plurality of pieces of topic information contained in the news topic. Specifically, for each news topic, the news topic creation system acquires the information titles of the pieces of topic information contained in the news topic, performs word segmentation on each information title to obtain a plurality of words, and generates the topic description of the news topic by concatenating the words.
In this embodiment, as shown in fig. 3c, the news topic creating system may also provide a "create topic" button in the interface of fig. 3c, and after clicking the event name of a certain news event, the editor may click the "create topic" button in the interface, so that the news topic is automatically created for the news event by the news topic creating system.
Further, the method in this embodiment further includes:
(c1) Determining the source, the transfer capacity and the reading quantity of each piece of news information corresponding to each news event, and determining the event heat of the news event according to the source, the transfer capacity and the reading quantity of each piece of news information corresponding to the news event;
(c2) Generating an event popularity ranking list according to the event popularity of each news event, and sending the event popularity ranking list to an information provider so that the information provider can provide news information according to the event popularity ranking list.
In the above action (c1), determining the event popularity of the news event according to the source, the transfer amount and the reading amount of each piece of news information corresponding to the news event specifically comprises: determining the information popularity of each piece of news information according to its source, transfer amount and reading amount, and taking the sum of the information popularity of the pieces of news information as the event popularity of the news event. The information popularity of each piece of news information may be calculated as described above, which is not repeated here. In another embodiment, the sum of the transfer amounts and reading amounts of the pieces of news information corresponding to the news event can be used as the event popularity of the news event.
In the above action (c2), an event popularity ranking list is generated according to the event popularity of each news event, and the event popularity ranking list is sent to the information provider, so that the information provider can learn the current popularity of each news event, provide popular information accordingly, and avoid wasting labor on providing unpopular information. The event popularity ranking list may be as shown in Table 1 below.
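Both ways of computing event popularity described here can be sketched as follows. The `NewsInfo` record and the weight values are hypothetical; the per-item popularity reuses the weighted-sum form given earlier in the text.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class NewsInfo:
    source_score: float  # preset score of the source
    forwards: int        # transfer amount
    reads: int           # reading amount


def info_heat(item: NewsInfo, a1: float = 0.5, b1: float = 0.25, c1: float = 0.25) -> float:
    # Weighted sum with illustrative weights (assumption).
    return item.source_score * a1 + item.forwards * b1 + item.reads * c1


def event_heat(items: Iterable[NewsInfo]) -> float:
    """Event popularity as the sum of the information popularity of its items."""
    return sum(info_heat(item) for item in items)


def event_heat_simple(items: Iterable[NewsInfo]) -> int:
    """Alternative embodiment: sum of transfer amounts and reading amounts."""
    return sum(item.forwards + item.reads for item in items)
```

The second variant ignores the source entirely, which makes it simpler to compute but removes the ability to favor trusted sources.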
TABLE 1
Event sequence number | Event name | Event heat
1 | Eleven reader | 100
2 | Some exercise | 98
... | ... | ...
10 | XX TV play start-up | 50
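A ranking list of the kind shown in Table 1 can be produced with a short sketch such as the following. The function name and the `top_n` cutoff are assumptions; the text does not say how many events the list contains.

```python
from typing import Dict, List, Tuple


def build_leaderboard(event_heats: Dict[str, float],
                      top_n: int = 10) -> List[Tuple[int, str, float]]:
    """Return (sequence number, event name, event heat) rows, hottest first."""
    ranked = sorted(event_heats.items(), key=lambda pair: pair[1], reverse=True)
    return [(rank, name, heat)
            for rank, (name, heat) in enumerate(ranked[:top_n], start=1)]
```

For example, `build_leaderboard({"A": 50, "B": 100, "C": 98}, top_n=2)` yields the rows `(1, "B", 100)` and `(2, "C", 98)`.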
Fig. 4 is an interface schematic diagram of a user terminal according to an embodiment of the present application. As shown in fig. 4, the user terminal may receive and display a news topic pushed by the news topic creation device. In fig. 4, the interface displays the topic name of the news topic, "the purchase intention of iPhone falls", together with an information icon for each piece of topic information. In this embodiment, the news topic creation device may store the created news topics in an HBase database, so as to improve the stability of data storage and the convenience of data retrieval.
In sum, through the embodiment of the application, the news topics can be automatically created through the information clustering model and the news topic creation system, so that the problem of low efficiency of manually creating the news topics is solved, and the creation efficiency of the news topics is improved. And because the event popularity ranking list can be generated according to the event popularity of each news event and is sent to the information provider, the information provider can conveniently master the current trend of the popular event, and the information provider can conveniently provide news information related to the popular event.
Fig. 5 is a flow chart of a news topic creation method according to another embodiment of the present application, as shown in fig. 5, the flow chart includes the following steps:
step S502, obtaining a plurality of pieces of news information provided by an information provider, wherein news contents corresponding to news events are recorded in the news information;
step S504, clustering the news information by a pre-trained information clustering model to obtain a plurality of information sets; the news content of the plurality of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
step S506, determining the event heat of each news event according to a plurality of pieces of news information corresponding to each news event, and selecting a target event from each news event according to the event heat of each news event;
step S508, creating news topics for the target event according to the plurality of pieces of news information corresponding to the target event by a news topic creation system; the news topics are used for reporting target events; the news topics comprise topic names and topic information, and the topic information is determined and obtained according to a plurality of pieces of news information corresponding to the target event.
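Steps S504 to S508 can be sketched as one small pipeline. The clustering, popularity and topic-creation steps are passed in as callables because the text leaves their internals to other passages; all names here are hypothetical, and selecting events whose popularity exceeds a threshold is the selection rule the text describes for step S506.

```python
from typing import Callable, Iterable, List, Sequence, TypeVar

News = TypeVar("News")
Topic = TypeVar("Topic")


def create_hot_topics(
    news: Sequence[News],                                   # result of step S502
    cluster: Callable[[Sequence[News]], Iterable[List[News]]],
    event_heat: Callable[[List[News]], float],
    create_topic: Callable[[List[News]], Topic],
    heat_threshold: float,
) -> List[Topic]:
    """Cluster news into events, keep the hot ones, and create a topic for each."""
    topics = []
    for info_set in cluster(news):                 # S504: one information set per event
        if event_heat(info_set) > heat_threshold:  # S506: select target events
            topics.append(create_topic(info_set))  # S508: create the news topic
    return topics
```

Passing the three steps as parameters keeps the flow of fig. 5 visible while allowing each step to be replaced independently.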
According to the embodiment of the application, the target event with higher heat can be automatically selected from all news events, and the news topics corresponding to the target event are automatically created through the information clustering model and the news topic creation system, so that the problem of low efficiency of manually creating the news topics is solved, automatic creation of popular topics is realized, and the creation efficiency of the news topics is improved.
The specific procedures of the above-described step S502 and step S504 are the same as those of the above-described step S202 and step S204, and are not repeated here.
In step S506, determining the event popularity of each news event according to the plurality of pieces of news information corresponding to each news event specifically comprises: for each news event, determining the source, the transfer amount and the reading amount of each piece of news information corresponding to the news event, and determining the event popularity of the news event according to the source, the transfer amount and the reading amount of each piece of news information corresponding to the news event. For this process, refer to the description of the aforementioned action (c1), which is not repeated here. In this embodiment, the source of a piece of news information refers to the information provider of that news information.
In the step S506, a target event is selected from the news events according to the event heat of each news event, specifically, an event with the event heat greater than a preset heat threshold is selected from the news events as the target event.
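The threshold rule for selecting target events is simple enough to state in a few lines. The function name is hypothetical, and event popularity values are assumed to have been computed already.

```python
from typing import Dict, List


def select_target_events(event_heats: Dict[str, float],
                         heat_threshold: float) -> List[str]:
    """Return the names of events whose heat is strictly above the threshold."""
    return [name for name, heat in event_heats.items() if heat > heat_threshold]
```

Note that the comparison is strict, following the wording "greater than a preset heat threshold"; an event whose heat equals the threshold is not selected.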
In the step S508, a news topic is created for the target event according to a plurality of news information corresponding to the target event by the news topic creation system, wherein the news topic includes a topic name and topic information, and the topic information is selected from the plurality of news information corresponding to the target event. It will be appreciated that from the foregoing description, news topics may also include a topic description.
The specific process of step S508 is:
(d1) Selecting information which is not recorded with preset sensitive words and has information heat degree larger than preset heat degree from a plurality of pieces of news information corresponding to the target event, and taking the selected information as the topic information of the news topic;
(d2) Word segmentation is carried out on the information titles of each topic information, words with word frequency-inverse text frequency index TF-IDF values larger than a preset threshold value are selected from a plurality of words obtained through word segmentation, and topic names of news topics are generated according to the selected words;
(d3) A news topic is created based on the topic name of the news topic and topic information of the news topic.
For details of this process, refer to the description of the aforementioned step S206, which is not repeated here.
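Action (d1) can be sketched as a filter over the news information of the target event. The `NewsInfo` record is hypothetical, and plain substring matching against the title and body is an assumed way of checking for sensitive words; the text does not state how the check is performed.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class NewsInfo:
    title: str
    content: str
    heat: float  # information popularity, computed beforehand


def select_topic_information(items: Iterable[NewsInfo],
                             sensitive_words: Iterable[str],
                             heat_threshold: float) -> List[NewsInfo]:
    """Keep items that contain no sensitive word and whose heat exceeds the threshold."""
    banned = list(sensitive_words)
    selected = []
    for item in items:
        text = item.title + " " + item.content
        has_sensitive = any(word in text for word in banned)
        if not has_sensitive and item.heat > heat_threshold:
            selected.append(item)
    return selected
```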
The method of fig. 5 may further include: generating an event popularity ranking list according to the event popularity of each news event, and sending the event popularity ranking list to an information provider so that the information provider can provide news information according to the event popularity ranking list. Specifically, an event popularity ranking list is generated according to the event popularity of each news event, and the event popularity ranking list is sent to an information provider, so that the information provider can know the popularity of each news event currently, thereby providing popular information and avoiding the waste of labor cost for providing non-popular information.
In sum, through the embodiment of the application, the target event with higher heat can be automatically selected from all news events, and the news topics corresponding to the target event are automatically created through the information clustering model and the news topic creation system, so that the problem of low efficiency of manually creating news topics is solved, automatic creation of popular topics is realized, and the creation efficiency of the news topics is improved. And because the event popularity ranking list can be generated according to the event popularity of each news event and is sent to the information provider, the information provider can conveniently master the current trend of the popular event, and the information provider can conveniently provide news information related to the popular event.
Fig. 6 is a schematic diagram of module composition of a news topic creation device according to an embodiment of the present application, as shown in fig. 6, where the device includes:
a first obtaining module 61, configured to obtain a plurality of pieces of news information provided by an information provider, where news content corresponding to a news event is recorded in the news information;
a first clustering module 62, configured to cluster the plurality of news information by using a pre-trained information clustering model, so as to obtain a plurality of information sets; the news content of a plurality of pieces of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
a first creation module 63, configured to create, by using a news topic creation system, a corresponding news topic for each news event according to a plurality of news information corresponding to each news event; wherein the news topic is used for reporting the corresponding news event; the news topics comprise topic names and topic information, and the topic information is obtained by determining a plurality of pieces of news information corresponding to news events reported by the news topics.
Optionally, the first clustering module 62 is specifically configured to:
extracting information titles of all the news information through a pre-trained information clustering model, and calculating semantic similarity among the information titles of all the news information;
clustering each piece of news information according to semantic similarity among information titles of each piece of news information through a pre-trained information clustering model to obtain a plurality of information sets.
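The two clustering steps above rely on a pre-trained information clustering model whose internals are not given here. As a stand-in only, the following sketch uses Jaccard overlap of title words as the similarity and a greedy single pass as the grouping rule; both choices, the threshold value and the function names are assumptions and not the patent's model.

```python
from typing import List, Set


def title_tokens(title: str) -> Set[str]:
    # Stand-in for title extraction plus word segmentation.
    return set(title.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_title(titles: List[str], threshold: float = 0.5) -> List[List[int]]:
    """Greedily group title indices; each group plays the role of one information set."""
    clusters: List[List[int]] = []
    for index, title in enumerate(titles):
        tokens = title_tokens(title)
        for cluster in clusters:
            # Compare against the first title of the cluster as its representative.
            if jaccard(tokens, title_tokens(titles[cluster[0]])) >= threshold:
                cluster.append(index)
                break
        else:
            clusters.append([index])
    return clusters
```

A real semantic model would also group titles that share no words but describe the same event, which word overlap cannot do.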
Optionally, the device further comprises a name generation module for:
for each information set, word segmentation is carried out on the information titles of all news information in the information set, and event names of news events corresponding to the information set are determined according to a plurality of words obtained by word segmentation.
Optionally, the first creation module 63 is specifically configured to:
selecting information which is not recorded with preset sensitive words and has information heat greater than preset heat from a plurality of pieces of news information corresponding to each news event, and taking the selected information as thematic information of news themes for reporting the news event;
for each news event, word segmentation is carried out on the information title of each topic information corresponding to the news event, words with word frequency-inverse text frequency index TF-IDF value larger than a preset threshold value are selected from a plurality of words obtained by word segmentation, and topic names for reporting news topics of the news event are generated according to the selected words;
for each of the news events, a news topic for reporting the news event is created based on topic information of the news topic for reporting the news event and a topic name of the news topic for reporting the news event.
Optionally, the device further comprises a heat determining module for:
determining, by a news topic creation system, the source, the transfer capacity and the reading quantity of each piece of news information, and determining the information heat of the news information according to the source, the transfer capacity and the reading quantity of the news information.
Optionally, the device further comprises a first sorting module for:
for each news event, determining the source, the transfer capacity and the reading quantity of each piece of news information corresponding to the news event, and determining the event heat of the news event according to the source, the transfer capacity and the reading quantity of each piece of news information corresponding to the news event;
generating an event popularity ranking list according to the event popularity of each news event, and sending the event popularity ranking list to the information provider so that the information provider can provide news information according to the event popularity ranking list.
In the embodiment of the application, a plurality of pieces of news information provided by an information provider are first obtained; the plurality of pieces of news information are then clustered through a pre-trained information clustering model to obtain a plurality of information sets, where the news contents of the pieces of news information included in each information set correspond to the same news event; finally, a corresponding news topic is created for each news event by a news topic creation system according to the plurality of pieces of news information corresponding to the news event. Therefore, according to the embodiment of the application, news topics can be created automatically through the information clustering model and the news topic creation system, which solves the problem of low efficiency in manually creating news topics and improves the efficiency of news topic creation.
It should be noted that, the news topic creating device in the embodiment of the present application can implement each process of the foregoing news topic creating method, and achieve the same effects and functions, which are not described herein again.
Fig. 7 is a schematic block diagram of a news topic creating apparatus according to another embodiment of the present application, where, as shown in fig. 7, the apparatus includes:
a second obtaining module 71, configured to obtain a plurality of pieces of news information provided by an information provider, where news content corresponding to a news event is recorded in the news information;
a second clustering module 72, configured to cluster the plurality of news information by using a pre-trained information clustering model, so as to obtain a plurality of information sets; the news content of a plurality of pieces of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
a selection module 73, configured to determine an event popularity of each news event according to a plurality of news information corresponding to each news event, and select a target event from each news event according to the event popularity of each news event;
a second creation module 74, configured to create, by using a news topic creation system, a news topic for the target event according to a plurality of pieces of news information corresponding to the target event; wherein the news topic is used for reporting the target event; the news topics comprise topic names and topic information, and the topic information is determined and obtained according to a plurality of pieces of news information corresponding to the target event.
Optionally, the second creating module 74 is specifically configured to:
selecting information which is not recorded with preset sensitive words and has information heat degree larger than preset heat degree from a plurality of pieces of news information corresponding to the target event, and taking the selected information as thematic information of the news thematic;
word segmentation is carried out on the information titles of the thematic information, words with word frequency-inverse text frequency index TF-IDF values larger than a preset threshold value are selected from a plurality of words obtained through word segmentation, and thematic names of the news thematic are generated according to the selected words;
creating the news topic based on the topic name of the news topic and topic information of the news topic.
Optionally, the selecting module 73 is specifically configured to:
determining the source, the transfer capacity and the reading quantity of each piece of news information corresponding to each news event;
and determining the event popularity of the news event according to the source, the transfer capacity and the reading quantity of each piece of news information corresponding to the news event.
Optionally, the device further comprises a second sorting module for:
generating an event popularity ranking list according to the event popularity of each news event, and sending the event popularity ranking list to the information provider so that the information provider can provide news information according to the event popularity ranking list.
According to the embodiment of the application, the target event with higher heat can be automatically selected from all news events, and the news topics corresponding to the target event are automatically created through the information clustering model and the news topic creation system, so that the problem of low efficiency of manually creating the news topics is solved, automatic creation of popular topics is realized, and the creation efficiency of the news topics is improved.
It should be noted that, the news topic creating device in the embodiment of the present application can implement each process of the foregoing news topic creating method, and achieve the same effects and functions, which are not described herein again.
The embodiment of the present application further provides a news topic creation device. Fig. 8 is a schematic structural diagram of the news topic creation device provided in an embodiment of the present application. As shown in fig. 8, the news topic creation device may vary considerably depending on its configuration or performance, and may include one or more processors 901 and a memory 902, where the memory 902 may store one or more applications or data. The memory 902 may be transient storage or persistent storage. An application stored in the memory 902 may include one or more modules (not shown in the figure), each of which may include a series of computer-executable instructions for the news topic creation device. Further, the processor 901 may be configured to communicate with the memory 902 and to execute, on the news topic creation device, the series of computer-executable instructions in the memory 902. The news topic creation device may also include one or more power supplies 903, one or more wired or wireless network interfaces 904, one or more input/output interfaces 905, one or more keyboards 906, and the like.
In a particular embodiment, a news topic creation device includes a memory and one or more programs, where the one or more programs are stored in the memory and the one or more programs may include one or more modules and each module may include a series of computer-executable instructions for the news topic creation device and configured to be executed by one or more processors and the one or more programs include computer-executable instructions for:
acquiring a plurality of pieces of news information provided by an information provider, wherein news contents corresponding to news events are recorded in the news information;
clustering the news information through a pre-trained information clustering model to obtain a plurality of information sets; the news content of a plurality of pieces of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
creating corresponding news topics for each news event according to a plurality of pieces of news information corresponding to each news event through a news topic creation system; the news topics are used for reporting the corresponding news events; the news topics comprise topic names and topic information, and the topic information is obtained by determining a plurality of pieces of news information corresponding to news events reported by the news topics.
Optionally, the computer executable instructions, when executed, cluster the plurality of news information by a pre-trained information cluster model to obtain a plurality of information sets, comprising:
extracting information titles of all the news information through a pre-trained information clustering model, and calculating semantic similarity among the information titles of all the news information;
clustering each piece of news information according to semantic similarity among information titles of each piece of news information through a pre-trained information clustering model to obtain a plurality of information sets.
Optionally, the computer executable instructions, when executed, further comprise:
for each information set, word segmentation is carried out on the information titles of all news information in the information set, and event names of news events corresponding to the information set are determined according to a plurality of words obtained by word segmentation.
Optionally, the computer executable instructions, when executed, create a corresponding news topic for each of the news events based on a plurality of news information corresponding to each of the news events, comprising:
selecting information which is not recorded with preset sensitive words and has information heat greater than preset heat from a plurality of pieces of news information corresponding to each news event, and taking the selected information as thematic information of news themes for reporting the news event;
for each news event, word segmentation is carried out on the information title of each topic information corresponding to the news event, words with word frequency-inverse text frequency index TF-IDF value larger than a preset threshold value are selected from a plurality of words obtained by word segmentation, and topic names for reporting news topics of the news event are generated according to the selected words;
for each of the news events, a news topic for reporting the news event is created based on topic information of the news topic for reporting the news event and a topic name of the news topic for reporting the news event.
Optionally, the computer executable instructions, when executed, further comprise:
determining, by a news topic creation system, the source, the transfer capacity and the reading quantity of each piece of news information, and determining the information heat of the news information according to the source, the transfer capacity and the reading quantity of the news information.
Optionally, the computer executable instructions, when executed, further comprise:
for each news event, determining the source, the transfer capacity and the reading quantity of each piece of news information corresponding to the news event, and determining the event heat of the news event according to the source, the transfer capacity and the reading quantity of each piece of news information corresponding to the news event;
generating an event popularity ranking list according to the event popularity of each news event, and sending the event popularity ranking list to the information provider so that the information provider can provide news information according to the event popularity ranking list.
In the embodiment of the application, a plurality of pieces of news information provided by an information provider are first obtained; the plurality of pieces of news information are then clustered through a pre-trained information clustering model to obtain a plurality of information sets, where the news contents of the pieces of news information included in each information set correspond to the same news event; finally, a corresponding news topic is created for each news event by a news topic creation system according to the plurality of pieces of news information corresponding to the news event. Therefore, according to the embodiment of the application, news topics can be created automatically through the information clustering model and the news topic creation system, which solves the problem of low efficiency in manually creating news topics and improves the efficiency of news topic creation.
It should be noted that, the news topic creating device in the embodiment of the present application can implement each process of the foregoing news topic creating method, and achieve the same effects and functions, which are not described herein again.
In another particular embodiment, a news topic creation device includes a memory and one or more programs, where the one or more programs are stored in the memory and the one or more programs may include one or more modules and each module may include a series of computer-executable instructions for the news topic creation device and configured to be executed by one or more processors and the one or more programs include computer-executable instructions for:
acquiring a plurality of pieces of news information provided by an information provider, wherein news contents corresponding to news events are recorded in the news information;
clustering the news information through a pre-trained information clustering model to obtain a plurality of information sets; the news content of a plurality of pieces of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
determining event heat of each news event according to a plurality of pieces of news information corresponding to each news event, and selecting a target event from each news event according to the event heat of each news event;
creating a news topic for the target event according to a plurality of pieces of news information corresponding to the target event by a news topic creation system; wherein the news topic is used for reporting the target event; the news topic comprises topic names and topic information, and the topic information is determined and obtained according to a plurality of news information corresponding to the target event.
Optionally, when the computer-executable instructions are executed, creating a news topic for the target event according to the plurality of pieces of news information corresponding to the target event includes:
selecting, from the plurality of pieces of news information corresponding to the target event, information that contains no preset sensitive word and whose information heat is greater than a preset heat, and taking the selected information as the topic information of the news topic;
performing word segmentation on the information titles of the topic information, selecting, from the plurality of words obtained by word segmentation, words whose term frequency-inverse document frequency (TF-IDF) value is greater than a preset threshold, and generating the topic name of the news topic according to the selected words;
creating the news topic based on the topic name of the news topic and topic information of the news topic.
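As a non-limiting illustration, the selection and naming steps above might be sketched as follows. The whitespace tokenizer, the item fields `title`, `body` and `heat`, the smoothed IDF formula, and the use of a background set of titles for the document frequencies are all assumptions made for this sketch; the embodiment does not specify them.

```python
import math
from collections import Counter


def tokenize(title):
    # Placeholder word segmentation: lowercase and split on whitespace.
    # Chinese titles would need a real segmenter; none is named in the text.
    return title.lower().split()


def contains_sensitive(item, sensitive_words):
    # Sensitive words are assumed to be given in lowercase.
    text = (item["title"] + " " + item.get("body", "")).lower()
    return any(word in text for word in sensitive_words)


def select_topic_info(items, sensitive_words, preset_heat):
    """Keep items with no sensitive word and heat above the preset heat."""
    return [
        item for item in items
        if item["heat"] > preset_heat
        and not contains_sensitive(item, sensitive_words)
    ]


def topic_name(topic_titles, background_titles, threshold, max_words=3):
    """Join the words whose TF-IDF value exceeds the threshold."""
    tokens = [w for title in topic_titles for w in tokenize(title)]
    tf = Counter(tokens)
    n_docs = len(background_titles)
    # Document frequency is counted over the (assumed) background titles.
    df = Counter(w for t in background_titles for w in set(tokenize(t)))
    scores = {
        w: (count / len(tokens)) * math.log((1 + n_docs) / (1 + df[w]))
        for w, count in tf.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    chosen = [w for w, score in ranked if score > threshold][:max_words]
    return " ".join(chosen)
```

In this sketch the topic name is simply the selected words joined in score order; a real system would presumably compose a more readable title from them.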
Optionally, when the computer-executable instructions are executed, determining the event popularity of each news event according to the plurality of pieces of news information corresponding to each news event includes:
determining the source, the repost count, and the read count of each piece of news information corresponding to each news event;
and determining the event popularity of the news event according to the source, the repost count, and the read count of each piece of news information corresponding to the news event.
Optionally, the computer-executable instructions, when executed, further perform the following:
generating an event popularity ranking list according to the event popularity of each news event, and sending the event popularity ranking list to the information provider so that the information provider can provide news information according to the event popularity ranking list.
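As a non-limiting illustration, the event popularity and the ranking list might be computed as sketched below. The source weights, the log damping, and the additive combination are assumptions chosen for this example; the embodiment only states that the source, the repost count, and the read count are taken into account.

```python
import math

# Hypothetical weights per source type; unknown sources get a small default.
SOURCE_WEIGHT = {"national": 1.0, "regional": 0.6, "blog": 0.3}
DEFAULT_WEIGHT = 0.1


def event_heat(items):
    """Sum per-item scores: source weight times log-damped reposts and reads."""
    return sum(
        SOURCE_WEIGHT.get(item["source"], DEFAULT_WEIGHT)
        * (math.log1p(item["reposts"]) + math.log1p(item["reads"]))
        for item in items
    )


def heat_ranking(events):
    """Map {event name: items} to [(event name, heat)], hottest first."""
    scored = [(name, event_heat(items)) for name, items in events.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

The resulting list is what would be sent to the information provider in this sketch; how it is transmitted is outside the scope of the example.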
According to the embodiment of the application, the target event with higher heat can be automatically selected from all news events, and the news topics corresponding to the target event are automatically created through the information clustering model and the news topic creation system, so that the problem of low efficiency of manually creating the news topics is solved, automatic creation of popular topics is realized, and the creation efficiency of the news topics is improved.
It should be noted that, the news topic creating device in the embodiment of the present application can implement each process of the foregoing news topic creating method, and achieve the same effects and functions, which are not described herein again.
Further, an embodiment of the present application also provides a storage medium for storing computer-executable instructions. In a specific embodiment, the storage medium may be a USB flash drive, an optical disc, a hard disk, or the like, and the computer-executable instructions stored in the storage medium, when executed by a processor, implement the following flow:
acquiring a plurality of pieces of news information provided by an information provider, wherein news contents corresponding to news events are recorded in the news information;
clustering the news information through a pre-trained information clustering model to obtain a plurality of information sets; the news content of a plurality of pieces of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
creating, by a news topic creation system, a corresponding news topic for each news event according to the plurality of pieces of news information corresponding to each news event; the news topic is used for reporting the corresponding news event; the news topic comprises a topic name and topic information, and the topic information is determined according to the plurality of pieces of news information corresponding to the news event reported by the news topic.
Optionally, when the computer-executable instructions stored in the storage medium are executed by the processor, clustering the plurality of pieces of news information through the pre-trained information clustering model to obtain a plurality of information sets includes:
extracting information titles of all the news information through a pre-trained information clustering model, and calculating semantic similarity among the information titles of all the news information;
clustering each piece of news information according to semantic similarity among information titles of each piece of news information through a pre-trained information clustering model to obtain a plurality of information sets.
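As a non-limiting illustration, clustering by title similarity might be sketched as follows. The embodiment uses a pre-trained information clustering model and semantic similarity; this sketch substitutes bag-of-words cosine similarity and a greedy single-pass grouping, which are stand-ins chosen only to keep the example self-contained. The threshold value is likewise an assumption.

```python
import math
from collections import Counter


def title_vector(title):
    # Bag-of-words stand-in for a learned semantic representation.
    return Counter(title.lower().split())


def cosine(a, b):
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_titles(titles, threshold=0.5):
    """Greedy single pass: join the first cluster whose seed title is
    similar enough, otherwise start a new cluster. Returns index lists."""
    clusters = []  # each entry: (seed vector, [title indices])
    for index, title in enumerate(titles):
        vector = title_vector(title)
        for seed, members in clusters:
            if cosine(seed, vector) >= threshold:
                members.append(index)
                break
        else:
            clusters.append((vector, [index]))
    return [members for _, members in clusters]
```

Each returned index list plays the role of one information set, that is, the news information assumed to report the same news event.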
Optionally, the computer-executable instructions stored in the storage medium, when executed by the processor, further perform the following:
for each information set, word segmentation is carried out on the information titles of all news information in the information set, and event names of news events corresponding to the information set are determined according to a plurality of words obtained by word segmentation.
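As a non-limiting illustration, an event name might be derived from the segmented titles of one information set as sketched below. Choosing the words that occur in the most titles, the stop-word list, and the whitespace segmentation are assumptions of this sketch; the embodiment only states that the name is determined from the words obtained by word segmentation.

```python
from collections import Counter

# Hypothetical stop-word list for the example.
STOPWORDS = {"the", "a", "of", "in", "and", "to"}


def event_name(titles, max_words=3):
    """Name the event with the words that appear in the most titles."""
    counts = Counter(
        word
        for title in titles
        for word in set(title.lower().split())  # count each word once per title
        if word not in STOPWORDS
    )
    # Most frequent first; ties are broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return " ".join(word for word, _ in ranked[:max_words])
```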
Optionally, when the computer-executable instructions stored in the storage medium are executed by the processor, creating a corresponding news topic for each news event according to the plurality of pieces of news information corresponding to each news event includes:
selecting, from the plurality of pieces of news information corresponding to each news event, information that contains no preset sensitive word and whose information heat is greater than a preset heat, and taking the selected information as the topic information of the news topic for reporting the news event;
for each news event, performing word segmentation on the information title of each piece of topic information corresponding to the news event, selecting, from the plurality of words obtained by word segmentation, words whose term frequency-inverse document frequency (TF-IDF) value is greater than a preset threshold, and generating, according to the selected words, the topic name of the news topic for reporting the news event;
for each of the news events, a news topic for reporting the news event is created based on topic information of the news topic for reporting the news event and a topic name of the news topic for reporting the news event.
Optionally, the computer-executable instructions stored in the storage medium, when executed by the processor, further perform the following:
for each piece of news information, determining, by the news topic creation system, the source, the repost count, and the read count of the news information, and determining the information heat of the news information according to its source, repost count, and read count.
Optionally, the computer-executable instructions stored in the storage medium, when executed by the processor, further perform the following:
for each news event, determining the source, the repost count, and the read count of each piece of news information corresponding to the news event, and determining the event heat of the news event according to the source, the repost count, and the read count of each piece of news information corresponding to the news event;
generating an event popularity ranking list according to the event popularity of each news event, and sending the event popularity ranking list to the information provider so that the information provider can provide news information according to the event popularity ranking list.
In the embodiment of the application, a plurality of pieces of news information provided by an information provider is firstly obtained, then the plurality of pieces of news information are clustered through a pre-trained information clustering model to obtain a plurality of information sets, news contents of the plurality of pieces of news information included in each information set correspond to the same news event, and finally a corresponding news topic is created for each news event according to the plurality of pieces of news information corresponding to each news event through a news topic creation system. Therefore, according to the embodiment of the application, the news topics can be automatically created through the information clustering model and the news topic creation system, so that the problem of low efficiency of manually creating the news topics is solved, and the creation efficiency of the news topics is improved.
It should be noted that, the storage medium in the embodiment of the present application can implement each process of the foregoing news topic creation method and achieve the same effects and functions, which are not described herein again.
In another specific embodiment, the storage medium may be a USB flash drive, an optical disc, a hard disk, or the like, and the computer-executable instructions stored in the storage medium, when executed by a processor, implement the following flow:
acquiring a plurality of pieces of news information provided by an information provider, wherein news contents corresponding to news events are recorded in the news information;
clustering the news information through a pre-trained information clustering model to obtain a plurality of information sets; the news content of a plurality of pieces of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
determining event heat of each news event according to a plurality of pieces of news information corresponding to each news event, and selecting a target event from each news event according to the event heat of each news event;
creating, by a news topic creation system, a news topic for the target event according to the plurality of pieces of news information corresponding to the target event; wherein the news topic is used for reporting the target event; the news topic comprises a topic name and topic information, and the topic information is determined according to the plurality of pieces of news information corresponding to the target event.
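As a non-limiting illustration, selecting target events from already computed event heat values might be sketched as follows. The combination of a minimum heat and an optional top-k cut-off is an assumption of this sketch; the embodiment only states that the target event is selected according to the event heat.

```python
def select_target_events(heat_by_event, min_heat=0.0, top_k=None):
    """Return event names with heat above min_heat, hottest first.

    heat_by_event maps an event name to its precomputed heat value.
    """
    ranked = sorted(heat_by_event.items(), key=lambda kv: (-kv[1], kv[0]))
    hot = [name for name, heat in ranked if heat > min_heat]
    return hot if top_k is None else hot[:top_k]
```

A news topic would then be created only for the returned events rather than for every clustered event.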
Optionally, when the computer-executable instructions stored in the storage medium are executed by the processor, creating a news topic for the target event according to the plurality of pieces of news information corresponding to the target event includes:
selecting, from the plurality of pieces of news information corresponding to the target event, information that contains no preset sensitive word and whose information heat is greater than a preset heat, and taking the selected information as the topic information of the news topic;
performing word segmentation on the information titles of the topic information, selecting, from the plurality of words obtained by word segmentation, words whose term frequency-inverse document frequency (TF-IDF) value is greater than a preset threshold, and generating the topic name of the news topic according to the selected words;
creating the news topic based on the topic name of the news topic and topic information of the news topic.
Optionally, when the computer-executable instructions stored in the storage medium are executed by the processor, determining the event popularity of each news event according to the plurality of pieces of news information corresponding to each news event includes:
determining the source, the repost count, and the read count of each piece of news information corresponding to each news event;
and determining the event popularity of the news event according to the source, the repost count, and the read count of each piece of news information corresponding to the news event.
Optionally, the computer-executable instructions stored in the storage medium, when executed by the processor, further perform the following:
generating an event popularity ranking list according to the event popularity of each news event, and sending the event popularity ranking list to the information provider so that the information provider can provide news information according to the event popularity ranking list.
According to the embodiment of the application, the target event with higher heat can be automatically selected from all news events, and the news topics corresponding to the target event are automatically created through the information clustering model and the news topic creation system, so that the problem of low efficiency of manually creating the news topics is solved, automatic creation of popular topics is realized, and the creation efficiency of the news topics is improved.
It should be noted that, the storage medium in the embodiment of the present application can implement each process of the foregoing news topic creation method and achieve the same effects and functions, which are not described herein again.
In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). However, as technology has developed, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be realized by a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs a digital system onto a single PLD without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing the logical method flow can be readily obtained by performing a little logic programming on the method flow in one of the hardware description languages described above and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, it is entirely possible to logically program the method steps so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing various functions may also be regarded as structures within the hardware component. Alternatively, the means for performing the various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having some function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above device is described as being divided into various units by function. Of course, when the present application is implemented, the functions of the units may be implemented in one or more pieces of software and/or hardware.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
Those skilled in the art will appreciate that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, the embodiments are described in a progressive manner; for parts that are the same or similar across embodiments, reference may be made between them, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding parts of the description of the method embodiments.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations will be apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (20)

1. A news topic creation method comprising:
acquiring a plurality of pieces of news information provided by an information provider, wherein news contents corresponding to news events are recorded in the news information;
clustering the news information through a pre-trained information clustering model to obtain a plurality of information sets; the news content of a plurality of pieces of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
extracting, by a news topic creation system, topic information and a topic name from the plurality of pieces of news information corresponding to each news event, and creating a corresponding news topic for each news event based on the extracted topic information and topic name; the news topic is used for reporting the corresponding news event; the news topic comprises the topic name and the topic information, and the topic information is determined according to the plurality of pieces of news information corresponding to the news event reported by the news topic.
2. The method of claim 1, wherein clustering the plurality of pieces of news information through the pre-trained information clustering model to obtain a plurality of information sets comprises:
extracting information titles of all the news information through a pre-trained information clustering model, and calculating semantic similarity among the information titles of all the news information;
clustering each piece of news information according to semantic similarity among information titles of each piece of news information through a pre-trained information clustering model to obtain a plurality of information sets.
3. The method of claim 2, further comprising:
for each information set, performing word segmentation on the information titles of all news information in the information set, and determining the event name of the news event corresponding to the information set according to the plurality of words obtained by word segmentation.
4. The method of claim 1, wherein extracting topic information and a topic name from the plurality of pieces of news information corresponding to each news event, and creating a corresponding news topic for each news event based on the extracted topic information and topic name, comprises:
selecting, from the plurality of pieces of news information corresponding to each news event, information that contains no preset sensitive word and whose information heat is greater than a preset heat, and taking the selected information as the topic information of the news topic for reporting the news event;
for each news event, performing word segmentation on the information title of each piece of topic information corresponding to the news event, selecting, from the plurality of words obtained by word segmentation, words whose term frequency-inverse document frequency (TF-IDF) value is greater than a preset threshold, and generating, according to the selected words, the topic name of the news topic for reporting the news event;
for each news event, creating the news topic for reporting the news event based on the topic information and the topic name of the news topic for reporting the news event.
5. The method of claim 4, further comprising:
for each piece of news information, determining, by the news topic creation system, the source, the repost count, and the read count of the news information, and determining the information heat of the news information according to its source, repost count, and read count.
6. The method of any one of claims 1 to 5, further comprising:
for each news event, determining the source, the repost count, and the read count of each piece of news information corresponding to the news event, and determining the event heat of the news event according to the source, the repost count, and the read count of each piece of news information corresponding to the news event;
generating an event popularity ranking list according to the event popularity of each news event, and sending the event popularity ranking list to the information provider so that the information provider can provide news information according to the event popularity ranking list.
7. A news topic creation method comprising:
acquiring a plurality of pieces of news information provided by an information provider, wherein news contents corresponding to news events are recorded in the news information;
clustering the news information through a pre-trained information clustering model to obtain a plurality of information sets; the news content of a plurality of pieces of news information contained in each information set corresponds to the same news event, and the news events corresponding to different information sets are different;
determining the event heat of each news event according to a plurality of pieces of news information corresponding to each news event, and selecting a target event from each news event according to the event heat of each news event;
extracting, by a news topic creation system, topic information and a topic name from the plurality of pieces of news information corresponding to the target event, and creating a news topic for the target event based on the extracted topic information and topic name; wherein the news topic is used for reporting the target event; the news topic comprises the topic name and the topic information, and the topic information is determined according to the plurality of pieces of news information corresponding to the target event.
8. The method of claim 7, wherein extracting topic information and a topic name from the plurality of pieces of news information corresponding to the target event, and creating a news topic for the target event based on the extracted topic information and topic name, comprises:
selecting, from the plurality of pieces of news information corresponding to the target event, information that contains no preset sensitive word and whose information heat is greater than a preset heat, and taking the selected information as the topic information of the news topic;
performing word segmentation on the information titles of the topic information, selecting, from the plurality of words obtained by word segmentation, words whose term frequency-inverse document frequency (TF-IDF) value is greater than a preset threshold, and generating the topic name of the news topic according to the selected words;
creating the news topic based on the topic name of the news topic and topic information of the news topic.
9. The method of claim 7 or 8, wherein determining the event heat of each news event according to the plurality of pieces of news information corresponding to each news event comprises:
determining the source, the repost count, and the read count of each piece of news information corresponding to each news event;
and determining the event heat of the news event according to the source, the repost count, and the read count of each piece of news information corresponding to the news event.
10. The method of claim 9, further comprising:
generating an event popularity ranking list according to the event popularity of each news event, and sending the event popularity ranking list to the information provider so that the information provider can provide news information according to the event popularity ranking list.
11. A news topic creation device comprising:
a first acquisition module, configured to acquire a plurality of pieces of news information provided by an information provider, wherein each piece of news information records news content corresponding to a news event;
a first clustering module, configured to cluster the plurality of pieces of news information through a pre-trained information clustering model to obtain a plurality of information sets, wherein the news content of the pieces of news information contained in each information set corresponds to the same news event, and different information sets correspond to different news events;
a first creation module, configured to extract, through a news topic creation system, topic information and a topic name from the plurality of pieces of news information corresponding to each news event, and to create a corresponding news topic for each news event based on the extracted topic information and topic name; wherein the news topic is used for reporting the corresponding news event, the news topic comprises the topic name and the topic information, and the topic information is determined from the plurality of pieces of news information corresponding to the news event reported by the news topic.
12. The apparatus of claim 11, wherein the first clustering module is specifically configured to:
extract the information title of each piece of news information through the pre-trained information clustering model, and calculate the semantic similarity between the information titles of the pieces of news information;
cluster the pieces of news information according to the semantic similarity between their information titles through the pre-trained information clustering model, to obtain the plurality of information sets.
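The title-based clustering described in this claim can be sketched as follows. The claim relies on a pre-trained information clustering model, which is not specified; as a stand-in, this sketch uses bag-of-words cosine similarity and a greedy single-pass grouping, both of which are assumptions and not the claimed model. The whitespace tokenizer and the `threshold` value are likewise illustrative.

```python
import math
from collections import Counter


def title_vector(title):
    # Stand-in for a learned title embedding: lower-cased word counts.
    return Counter(title.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_titles(titles, threshold=0.5):
    """Greedy clustering: join the first cluster whose first title is similar enough."""
    clusters = []  # each cluster is a list of indices into `titles`
    for index, title in enumerate(titles):
        vector = title_vector(title)
        for cluster in clusters:
            representative = title_vector(titles[cluster[0]])
            if cosine(vector, representative) >= threshold:
                cluster.append(index)
                break
        else:
            # No existing cluster matched, so this title starts a new information set.
            clusters.append([index])
    return clusters


titles = [
    "typhoon hits coastal city",
    "typhoon hits city overnight",
    "central bank cuts rates",
]
information_sets = cluster_titles(titles)
```

Here the two typhoon titles share three of four words (cosine similarity 0.75) and fall into one information set, while the unrelated title forms its own.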
13. The apparatus of claim 11, wherein the first creation module is specifically configured to:
for each news event, select, from the plurality of pieces of news information corresponding to the news event, information that contains no preset sensitive words and whose information popularity is greater than a preset popularity, and take the selected information as the topic information of the news topic for reporting the news event;
for each news event, perform word segmentation on the information title of each piece of topic information corresponding to the news event, select, from the words obtained by the segmentation, words whose term frequency-inverse document frequency (TF-IDF) value is greater than a preset threshold, and generate, from the selected words, the topic name of the news topic for reporting the news event;
for each news event, create the news topic for reporting the news event based on the topic information and the topic name of that news topic.
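The word-selection step behind the topic name can be sketched as below. The claim does not state which corpus the inverse document frequency is computed over or how the selected words are assembled into a name, so this sketch assumes a hypothetical background set of titles, a whitespace tokenizer in place of a real word segmenter, and a name formed by joining the selected words in order of first appearance.

```python
import math
from collections import Counter


def tokenize(title):
    # Placeholder segmentation; Chinese titles would need a real word segmenter.
    return title.lower().split()


def tfidf_scores(topic_titles, background_titles):
    """TF is measured over the topic titles, IDF over topic plus background titles."""
    topic_tokens = [tok for title in topic_titles for tok in tokenize(title)]
    term_counts = Counter(topic_tokens)
    total_terms = len(topic_tokens)
    documents = [set(tokenize(t)) for t in list(topic_titles) + list(background_titles)]
    scores = {}
    for word, count in term_counts.items():
        # doc_freq is at least 1 because every word comes from a topic title.
        doc_freq = sum(1 for doc in documents if word in doc)
        scores[word] = (count / total_terms) * math.log(len(documents) / doc_freq)
    return scores


def topic_name(topic_titles, background_titles, threshold):
    """Join the words scoring above the threshold, in order of first appearance."""
    scores = tfidf_scores(topic_titles, background_titles)
    chosen = []
    for title in topic_titles:
        for tok in tokenize(title):
            if scores[tok] > threshold and tok not in chosen:
                chosen.append(tok)
    return " ".join(chosen)


topic_titles = ["typhoon hits city", "typhoon damages city port"]
background = ["market news today", "city council news", "sports news today", "weather news"]
name = topic_name(topic_titles, background, threshold=0.3)
```

With this input, "typhoon" scores about 0.31 and is the only word above the 0.3 threshold; "city" is frequent in the topic titles but also appears in the background set, which lowers its score.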
14. The apparatus of any of claims 11 to 13, further comprising a first ranking module configured to:
for each news event, determine the source, the reprint count and the read count of each piece of news information corresponding to the news event, and determine the event popularity of the news event according to the source, the reprint count and the read count of each piece of news information corresponding to the news event;
generate an event popularity ranking list according to the event popularity of each news event, and send the event popularity ranking list to the information provider so that the information provider can provide news information according to the event popularity ranking list.
15. A news topic creation device comprising:
a second acquisition module, configured to acquire a plurality of pieces of news information provided by an information provider, wherein each piece of news information records news content corresponding to a news event;
a second clustering module, configured to cluster the plurality of pieces of news information through a pre-trained information clustering model to obtain a plurality of information sets, wherein the news content of the pieces of news information contained in each information set corresponds to the same news event, and different information sets correspond to different news events;
a selection module, configured to determine the event popularity of each news event according to the plurality of pieces of news information corresponding to each news event, and to select a target event from the news events according to the event popularity of each news event;
a second creation module, configured to extract, through a news topic creation system, topic information and a topic name from the plurality of pieces of news information corresponding to the target event, and to create a news topic for the target event based on the extracted topic information and topic name; wherein the news topic is used for reporting the target event, the news topic comprises the topic name and the topic information, and the topic information is determined from the plurality of pieces of news information corresponding to the target event.
16. The apparatus of claim 15, wherein the second creation module is specifically configured to:
select, from the plurality of pieces of news information corresponding to the target event, information that contains no preset sensitive words and whose information popularity is greater than a preset popularity, and take the selected information as the topic information of the news topic;
perform word segmentation on the information titles of the topic information, select, from the words obtained by the segmentation, words whose term frequency-inverse document frequency (TF-IDF) value is greater than a preset threshold, and generate the topic name of the news topic from the selected words;
create the news topic based on the topic name of the news topic and the topic information of the news topic.
17. The apparatus according to claim 15 or 16, wherein the selection module is specifically configured to:
determine the source, the reprint count and the read count of each piece of news information corresponding to each news event;
and determine the event popularity of the news event according to the source, the reprint count and the read count of each piece of news information corresponding to the news event.
18. The apparatus of claim 17, further comprising a second ranking module configured to:
generating an event popularity ranking list according to the event popularity of each news event, and sending the event popularity ranking list to the information provider so that the information provider can provide news information according to the event popularity ranking list.
19. A news topic creation device comprising: a processor; and a memory arranged to store computer executable instructions that when executed cause the processor to implement the steps of the news topic creation method of any one of the preceding claims 1 to 6 or the news topic creation method of any one of the preceding claims 7 to 10.
20. A storage medium storing computer-executable instructions which, when executed, implement the steps of the news topic creation method of any one of the preceding claims 1 to 6 or the steps of the news topic creation method of any one of the preceding claims 7 to 10.
CN201910471260.XA 2019-05-31 2019-05-31 News thematic creation method and device Active CN110162796B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910471260.XA CN110162796B (en) 2019-05-31 2019-05-31 News thematic creation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910471260.XA CN110162796B (en) 2019-05-31 2019-05-31 News thematic creation method and device

Publications (2)

Publication Number Publication Date
CN110162796A CN110162796A (en) 2019-08-23
CN110162796B true CN110162796B (en) 2023-07-18

Family

ID=67630956

Family Applications (1)

Application Number Priority Date Filing Date Title
CN201910471260.XA Active CN110162796B (en) 2019-05-31 2019-05-31 News thematic creation method and device

Country Status (1)

Country Link
CN (1) CN110162796B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111026990B (en) * 2019-12-05 2024-04-16 中国银行股份有限公司 Hot topic log information display method and device
CN111209390B (en) * 2020-01-06 2023-09-05 新方正控股发展有限责任公司 News display method and system and computer readable storage medium
CN111428049B (en) * 2020-03-20 2023-07-21 北京百度网讯科技有限公司 Event thematic generation method, device, equipment and storage medium
CN111460257B (en) * 2020-03-27 2023-10-31 北京百度网讯科技有限公司 Thematic generation method, apparatus, electronic device and storage medium
CN111698573B (en) * 2020-06-24 2021-10-01 四川长虹电器股份有限公司 Movie and television special topic creating method and device
CN111667023B (en) * 2020-06-30 2024-04-05 腾讯科技(深圳)有限公司 Method and device for acquiring articles of target category
CN112287172B (en) * 2020-10-29 2024-08-20 药渡经纬信息科技(北京)有限公司 Video album generation method and device
CN112926298B (en) * 2021-03-02 2024-08-06 北京百度网讯科技有限公司 News content identification method, related device and computer program product
CN114780712B (en) * 2022-04-06 2023-07-04 科技日报社 News thematic generation method and device based on quality evaluation
CN117076963B (en) * 2023-10-17 2024-01-02 北京国科众安科技有限公司 Information heat analysis method based on big data platform

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103870474B (en) * 2012-12-11 2018-06-08 北京百度网讯科技有限公司 A kind of news topic method for organizing and device
CN107066537A (en) * 2017-03-06 2017-08-18 广州神马移动信息科技有限公司 Hot news generation method, equipment, electronic equipment
CN109800413A (en) * 2018-12-11 2019-05-24 北京百度网讯科技有限公司 Recognition methods, device, equipment and the readable storage medium storing program for executing of media event

Also Published As

Publication number Publication date
CN110162796A (en) 2019-08-23

Similar Documents

Publication Publication Date Title
CN110162796B (en) News thematic creation method and device
CN117235226A (en) Question response method and device based on large language model
CN110569428B (en) Recommendation model construction method, device and equipment
CN116227474B (en) Method and device for generating countermeasure text, storage medium and electronic equipment
CN110457578B (en) Customer service demand identification method and device
CN110020427B (en) Policy determination method and device
CN110032582B (en) Data processing method, device, equipment and system
CN108171267A (en) User group partitioning method and device, information push method and device
CN117076650B (en) Intelligent dialogue method, device, medium and equipment based on large language model
CN114880489B (en) Data processing method, device and equipment
CN113079201B (en) Information processing system, method, device and equipment
CN111046304B (en) Data searching method and device
CN111209277B (en) Data processing method, device, equipment and medium
US20200234705A1 (en) Information processing system, method, device and equipment
CN115952859B (en) Data processing method, device and equipment
CN109584088B (en) Product information pushing method and device
CN116662657A (en) Model training and information recommending method, device, storage medium and equipment
CN104376034B (en) Information processing equipment, information processing method and program
CN112182116B (en) Data exploration method and device
CN114676257A (en) Conversation theme determining method and device
CN116340469B (en) Synonym mining method and device, storage medium and electronic equipment
CN111723567B (en) Text selection data processing method, device and equipment
CN117271611B (en) Information retrieval method, device and equipment based on large model
CN108804603B (en) Man-machine written dialogue method and system, server and medium
CN115600155B (en) Data processing method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200921

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, British Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: P.O. Box 847, fourth floor, Capital Building, Grand Cayman, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

TA01 Transfer of patent application right

Effective date of registration: 20200922

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, British Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, British Islands

Applicant before: Advanced innovation technology Co.,Ltd.

GR01 Patent grant