CN108897774B - Method, device and storage medium for acquiring news hotspots - Google Patents

Method, device and storage medium for acquiring news hotspots

Info

Publication number
CN108897774B
Authority
CN
China
Prior art keywords
news
hot news
initial
hot
field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810552297.0A
Other languages
Chinese (zh)
Other versions
CN108897774A (en)
Inventor
高锐
李�浩
吴伊竹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201810552297.0A
Publication of CN108897774A
Application granted
Publication of CN108897774B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The application discloses a method for acquiring hot news, which comprises the following steps: acquiring initial hot news from each news release source in a plurality of news release sources, wherein the initial hot news is the hot news of each news release source; clustering the initial hot news whose similarity is greater than a similarity threshold to obtain intermediate hot news; and determining, as target hot news, the intermediate hot news whose popularity is greater than a popularity threshold and the non-clustered initial hot news whose popularity is greater than the popularity threshold. With this technical scheme, the hot news of each news release source in the whole network can be acquired and processed, and the highest-ranked hot news among the hot news released by all news release sources in the whole network can then be obtained, which improves the efficiency of acquiring hot news across the whole network.

Description

Method, equipment and storage medium for acquiring news hotspots
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, a device, and a storage medium for obtaining a news hotspot.
Background
With the development of networks, the demand for news reading has long since shifted from traditional paper media to network media. News released on large network media portals such as Tencent, NetEase and Sina is characterized by a large total volume, a high update frequency, many readers, and a wide reader distribution. For a single user, however, the number of news items shown each day is extremely limited. To display the articles a user is most likely to read in the most prominent positions and obtain a higher click rate, the most common approach of network media portals is to sort by click rate, that is, to sort by popularity.
However, with the existing popularity ranking method, a user still has to browse the news sources one by one to learn the important news of each day.
Disclosure of Invention
The embodiment of the application provides a method for acquiring hot news, which can obtain the highest-ranked hot news among the hot news released by the news release sources in the whole network, thereby improving the efficiency of acquiring hot news across the whole network. The embodiment of the application also provides a corresponding server, a terminal and a computer readable storage medium.
An embodiment of the present application provides a method for obtaining hot news in one aspect, including:
acquiring initial hot news from each news release source in a plurality of news release sources, wherein the initial hot news is the hot news of each news release source;
clustering the initial hot news with the similarity larger than a similarity threshold value in the initial hot news to obtain intermediate hot news;
and determining the intermediate hot news with the heat degree larger than the heat degree threshold value and the initial hot news which is not clustered and has the heat degree larger than the heat degree threshold value as target hot news.
Another aspect of the embodiments of the present application provides a method for obtaining hot news, including:
responding to the selection operation of the news application;
sending a hotspot request message for the news application to a server;
receiving target hot news sent by the server, wherein the target hot news is initial hot news acquired by the server from each news release source in a plurality of news release sources and is obtained by processing the initial hot news, and the initial hot news is the hot news of each news release source;
and displaying the target hot news.
In another aspect, an embodiment of the present application provides a server, including:
the acquisition unit is used for acquiring initial hot news from each news release source in a plurality of news release sources, wherein the initial hot news is the hot news of each news release source;
the clustering unit is used for clustering the initial hot news with the similarity larger than a similarity threshold value in the initial hot news acquired by the acquiring unit to obtain intermediate hot news;
and the determining unit is used for determining the intermediate hot news obtained by clustering the clustering unit with the heat degree greater than the heat degree threshold value and the initial hot news which is not clustered and has the heat degree greater than the heat degree threshold value as the target hot news.
In combination with the above server, in a first possible implementation,
the acquisition unit is configured to:
determining a hot news layout corresponding to each news release source;
and acquiring the hot news of each news release source from the corresponding position of the hot news layout corresponding to each news release source.
In combination with the first possible implementation manner of the server, in a second possible implementation manner,
the acquisition unit is configured to:
and acquiring the hot news of each news release source in an acquisition time period, wherein the hot news in the acquisition time period comprises the hot news at the initial moment of the acquisition time period and newly-added hot news after the news release source page is refreshed in the acquisition process.
With reference to the server and the first or second possible implementation manner, in a third possible implementation manner,
the clustering unit is used for:
determining the field to which the initial hot news belongs;
and clustering the initial hot news with the similarity larger than the similarity threshold into intermediate hot news aiming at the initial hot news in the same field.
In combination with the third possible implementation manner, in a fourth possible implementation manner,
the clustering unit is further configured to:
aiming at initial hot news in the same field, determining an entity and a keyword, wherein the entity is a participant object in the news;
and determining the similarity of each initial hot news in the same field according to the entity and the keyword.
In combination with the above-mentioned fourth possible implementation manner, in a fifth possible implementation manner,
the clustering unit is configured to:
determining the occurrence times and positions of the same entity in each initial hot news in the same field;
determining initial hot news meeting initial clustering conditions in the same field according to the times and the positions;
and for each initial hot news in the initial hot news meeting the initial clustering conditions, performing similarity scoring on each initial hot news according to keywords.
In combination with the third possible implementation manner, in a sixth possible implementation manner,
the clustering unit is used for:
and determining the field of the initial hot news according to the link information of the news release source from which the initial hot news comes and the text keywords of the initial hot news.
With reference to the sixth possible implementation manner, in a seventh possible implementation manner,
the clustering unit is used for:
analyzing link information of a news release source from which the initial hot news comes, and determining an estimation field to which the initial hot news belongs;
calculating cosine similarity according to the text keywords of the initial hot news and the center vector of the estimated field;
and if the cosine similarity is larger than a preset threshold value, the estimated field is the field to which the initial hot news belongs.
With reference to the server and the first or second possible implementation manner, in an eighth possible implementation manner,
the determination unit is further configured to:
determining the popularity of the intermediate hot news according to the authority level of the news release source to which the initial hot news corresponding to the intermediate hot news belongs, the browsed quantity and the concerned quantity of the corresponding initial hot news;
and determining the popularity of the non-clustered initial hot news according to the authority level, the browsed amount and the concerned amount of the news release source to which the non-clustered initial hot news belongs.
With reference to the server and the first or second possible implementation manner, in a ninth possible implementation manner, the server further includes a receiving unit and a sending unit,
the receiving unit is used for receiving a hotspot request sent by a terminal;
the determining unit is further configured to determine the current target hot news according to the hot request;
the sending unit is configured to send the current target hot news to the terminal.
In combination with the above ninth possible implementation manner, in a tenth possible implementation manner,
the sending unit is further configured to send display style indication information of the current target hot news to the terminal, where the display style indication information is used to indicate that the terminal displays the current target hot news according to the display style.
In combination with the above ninth possible implementation manner, in an eleventh possible implementation manner,
the sending unit is further configured to send a popularity ranking of each target hot news in the current target hot news to the terminal, where the popularity ranking is used for the terminal to allocate the target hot news to a preset display style for display.
In another aspect, an embodiment of the present application provides a terminal, including:
an input unit for responding to a selection operation for a news application;
the sending unit is used for sending a hot spot request message aiming at the news application to a server according to the selection operation input by the input unit;
the receiving unit is configured to receive target hot news sent by the server, where the target hot news is initial hot news acquired by the server from each of a plurality of news distribution sources, and is obtained by processing the initial hot news, and the initial hot news is hot news of each news distribution source;
and the display unit is used for displaying the target hot news received by the receiving unit.
In combination with the above terminal, in a first possible implementation manner,
the receiving unit is further configured to receive display style indication information of the target hot news sent by the server;
and the display unit is used for displaying the target hot news according to the display style indication information.
In combination with the above terminal, in a second possible implementation,
the receiving unit is further configured to receive a popularity ranking of each target hot news in the target hot news sent by the server;
and the display unit is used for distributing the target hot news into a preset display style according to the popularity ranking for display.
In another aspect, an embodiment of the present application provides a server, where the server includes: the system comprises an input/output (I/O) interface, a processor and a memory, wherein the memory stores an instruction for acquiring the news hotspot at the server side;
the processor is used for executing the instruction for acquiring the news hotspot stored in the memory and executing the steps of the method for acquiring the news hotspot at the server side.
In another aspect, an embodiment of the present application provides a terminal, where the terminal includes an input device, a transceiver, a display, a processor, and a memory, where the memory stores an instruction for obtaining hot news, and the processor is configured to control the input device, the transceiver, and the display to perform corresponding operations;
the input equipment is used for responding to the selection operation at the terminal side;
the transceiver is used for executing the steps of message sending and information receiving at the terminal side;
the display is used for displaying the target hot news at the terminal side.
Yet another aspect of the present application provides a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to execute the method described above on the server or terminal side.
A further aspect of the present application provides a computer program product containing instructions which, when run on a computer, cause the computer to perform the method described above for a server or terminal.
According to the embodiment of the application, the hot news of each news release source in the whole network can be acquired and processed, and the highest-ranked hot news among the hot news released by all news release sources in the whole network can then be obtained. Therefore, the efficiency of acquiring hot news across the whole network is improved.
Drawings
FIG. 1 is a schematic diagram of an embodiment of a system for obtaining a news hotspot in an embodiment of the present application;
FIG. 2 is a schematic diagram of an embodiment of a method for obtaining a news hotspot in an embodiment of the present application;
FIG. 3A is a schematic diagram illustrating hot news arrangement on a publishing source website in an embodiment of the present application;
FIG. 3B is a schematic diagram illustrating hot news arrangement on another publishing source website in the embodiment of the present application;
FIG. 3C is a schematic diagram illustrating hot news arrangement on another publishing source website in the embodiment of the present application;
FIG. 4 is an exemplary diagram of a page before and after refresh in an embodiment of the present application;
FIG. 5 is a schematic diagram of another embodiment of a method for obtaining a news hotspot in the embodiment of the present application;
FIG. 6 is a schematic diagram of another embodiment of a method for obtaining a news hotspot in the embodiment of the present application;
FIG. 7 is a schematic diagram of another embodiment of the method for obtaining a news hotspot in the embodiment of the present application;
FIG. 8 is a schematic diagram of another embodiment of a system for obtaining a news hotspot in the embodiment of the present application;
FIG. 9 is a schematic interface diagram of a terminal displaying target hot news in the embodiment of the application;
FIG. 10 is a schematic diagram of another embodiment of a method for acquiring a news hotspot in the embodiment of the present application;
FIG. 11 is a schematic diagram of an embodiment of a server in an embodiment of the present application;
FIG. 12 is a schematic diagram of another embodiment of a server in the embodiment of the present application;
FIG. 13 is a schematic diagram of an embodiment of a terminal in the embodiment of the present application;
FIG. 14 is a schematic diagram of another embodiment of a server in the embodiment of the present application;
FIG. 15 is a schematic diagram of an embodiment of a terminal in the embodiment of the present application.
Detailed Description
Embodiments of the present application will now be described with reference to the accompanying drawings. It is to be understood that the described embodiments are merely some, but not all, of the embodiments of the present application. As those skilled in the art will appreciate, with the development of technology and the emergence of new scenarios, the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems.
The embodiment of the application provides a method for acquiring hot news, which can obtain the highest-ranked hot news among the hot news released by the news release sources in the whole network, thereby improving the efficiency of acquiring hot news across the whole network. The embodiment of the application also provides a corresponding server, a terminal and a computer readable storage medium. Detailed descriptions follow.
Fig. 1 is a schematic diagram of an embodiment of a system for acquiring a news hotspot in the embodiment of the present application.
As shown in fig. 1, an embodiment of the system for acquiring a news hotspot provided by the embodiment of the present application includes a server 10, a network 20, and a plurality of news distribution sources 30, where the server 10 and the plurality of news distribution sources 30 are communicatively connected through the network 20.
The plurality of news distribution sources 30 may include, for example, news distribution source 1, news distribution source 2, and news distribution source N, where N is an integer greater than 2. Each news distribution source may be understood as a news website, which may be a partner website or a news website used only for traffic diversion, such as existing news-publishing websites like Sohu News, Sina News, Xinhuanet and Southcn.com.
Each news distribution source publishes the hot news that it counts or maintains itself. The server 10 may obtain, through the network 20, the initial hot news on each news distribution source, where the initial hot news is the hot news of each news distribution source. The initial hot news published on different news distribution sources may be the same, different, or partially the same. Therefore, after acquiring the initial hot news from each news distribution source, the server 10 clusters the initial hot news, that is, it determines which hot news items are the same or similar. Hot news items that are the same or highly similar can be clustered into one news item. This clustered news may be understood as intermediate hot news, while initial hot news that has no identical or similar counterpart is non-clustered initial hot news. Then, the server 10 screens the intermediate hot news obtained by clustering and the non-clustered initial hot news according to popularity, and determines the target hot news.
Suppose the initial hot news acquired from all news distribution sources forms a set A that contains, for example, 10 items {a1, a2, a3, a4, a5, a6, a7, a8, a9, a10}. During clustering, if a1, a2, and a3 are the same or very similar news, the server 10 clusters a1, a2, and a3 into one intermediate hot news b1, where b1 may include all the contents of a1, a2, and a3 and only associates them as the progress of one event according to the order in which a1, a2, and a3 occurred. If a6, a7 and a8 are the same news or highly similar news, the server 10 clusters a6, a7 and a8 into one intermediate hot news b2. The remaining items a4, a5, a9 and a10 are all different and dissimilar, so a4, a5, a9 and a10 are non-clustered initial hot news. The hot news set B after clustering may then include b1, a4, a5, b2, a9, and a10; the hot news in set B may be understood as the intermediate hot news plus the non-clustered initial hot news. The server then evaluates the popularity of each hot news in set B. If the popularity of b1, a4, a5, b2 and a9 is greater than the popularity threshold, b1, a4, a5, b2 and a9 are determined to be the target hot news. The target hot news can also be understood as a set C, which includes b1, a4, a5, b2, and a9. It should be noted that this scenario is only an example. In practice, each news distribution source releases many initial hot news items, so sets A, B, and C may each contain many hot news items. The number of hot news items in each set in this example should not be construed as a limitation on the number of hot news items in the embodiments of the present application.
The above describes a scenario of the system for acquiring hot news. The method for acquiring hot news on the server side is described below with reference to that system.
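The scenario above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `select_target_news`, the popularity values, and the threshold of 5.0 are assumptions made for the example and do not come from the application, and the groups of similar news are given as input rather than computed.

```python
from typing import Dict, List, Set


def select_target_news(
    initial: List[str],
    similar_groups: List[Set[str]],
    heat: Dict[str, float],
    heat_threshold: float,
) -> List[str]:
    """Merge each group of similar initial items into one intermediate item,
    then keep every item whose popularity exceeds the threshold."""
    merged: Dict[str, str] = {}  # initial id -> intermediate id
    for index, group in enumerate(similar_groups, start=1):
        for item in group:
            merged[item] = f"b{index}"

    set_b: List[str] = []  # intermediate items plus non-clustered initial items
    for item in initial:
        label = merged.get(item, item)
        if label not in set_b:
            set_b.append(label)

    return [label for label in set_b if heat[label] > heat_threshold]


set_a = [f"a{i}" for i in range(1, 11)]
groups = [{"a1", "a2", "a3"}, {"a6", "a7", "a8"}]
# Hypothetical popularity values; only a10 falls below the threshold.
heat = {"b1": 9.0, "a4": 7.5, "a5": 6.0, "b2": 8.0, "a9": 5.5, "a10": 2.0}
set_c = select_target_news(set_a, groups, heat, heat_threshold=5.0)
print(set_c)  # ['b1', 'a4', 'a5', 'b2', 'a9']
```

With these assumed values the result matches set C of the scenario: the two intermediate items and three of the four non-clustered items pass the threshold.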
As shown in fig. 2, an embodiment of the method for obtaining hot news provided in the embodiment of the present application includes:
101. Acquire initial hot news from each news release source in a plurality of news release sources, wherein the initial hot news is the hot news of each news release source.
102. And clustering the initial hot news with the similarity larger than the similarity threshold value in the initial hot news to obtain intermediate hot news.
103. And determining the intermediate hot news with the heat degree larger than the heat degree threshold value and the initial hot news which is not clustered and has the heat degree larger than the heat degree threshold value as target hot news.
The method provided by the embodiment of the present application can be understood by referring to the description and examples in the above system scheme, and details are not repeated here.
According to the embodiment of the application, the hot news of each news release source in the whole network can be acquired and processed, and the highest-ranked hot news among the hot news released by all news release sources in the whole network can then be obtained. Therefore, the efficiency of acquiring hot news across the whole network is improved.
Optionally, in another embodiment of the method for acquiring hot news provided in the embodiment of the present application, the acquiring initial hot news from each news distribution source in the multiple news distribution sources may include:
determining a hot news layout corresponding to each news release source;
and acquiring the respective hot news of each news release source from the corresponding position of the hot news layout corresponding to each news release source.
In the embodiment of the present application, the hot news layout corresponding to each news distribution source is usually different. For example, as shown in fig. 3A, news distribution source 1 is station A news, and the hot news layout of station A news places the hot news in the center. As shown in fig. 3B, news distribution source 2 is station B news, and the hot news layout of station B news places the hot news on the left. As shown in fig. 3C, news distribution source 3 is station C news, and the hot news layout of station C news places the hot news at the top. Therefore, when the server acquires the initial hot news from each news distribution source, it acquires the hot news at the corresponding position of the hot news layout of that news distribution source. For news distribution source 1, the initial hot news is obtained from the middle of the layout; for news distribution source 2, the initial hot news is obtained from the left side of the layout; and for news distribution source 3, the initial hot news is obtained from the top of the layout. Because the full page does not need to be searched, the efficiency of acquiring the initial hot news is improved.
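A minimal sketch of this position-based acquisition follows. The table `HOT_NEWS_POSITION`, the source names, and the representation of a page as a mapping from region name to headlines are all assumptions made for illustration; a real implementation would parse the actual pages of each source.

```python
from typing import Dict, List

# Hypothetical per-source layout table: where each site places its hot news.
HOT_NEWS_POSITION: Dict[str, str] = {
    "station_a": "center",
    "station_b": "left",
    "station_c": "top",
}


def extract_hot_news(source: str, page: Dict[str, List[str]]) -> List[str]:
    """Return the headlines in the region that holds hot news for `source`.

    `page` is a simplified stand-in for a parsed web page:
    region name -> headlines found in that region.
    """
    region = HOT_NEWS_POSITION[source]
    return list(page.get(region, []))


page_c = {
    "top": ["Headline 1", "Headline 2"],
    "left": ["Advertisement"],
    "center": ["Feature story"],
}
print(extract_hot_news("station_c", page_c))  # ['Headline 1', 'Headline 2']
```

Only the configured region is read, so the other regions of the page are never scanned.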
Optionally, in another embodiment of the method for acquiring hot news provided in the embodiment of the present application, the acquiring the hot news of each news issue source may include:
and acquiring hot news of each news release source in an acquisition time period, wherein the hot news in the acquisition time period comprises the hot news at the initial moment of the acquisition time period and newly-added hot news after the news release source page is refreshed in the acquisition process.
In the embodiment of the application, the initial hot news of each news release source is updated over time. If the server acquired and updated the initial hot news of each news release source in real time, computing resources would be wasted. The server may therefore acquire the hot news of each news release source at regular intervals by specifying an acquisition time period: the hot news of each news release source is acquired at the start time of the acquisition time period and again at the end time of the acquisition time period. Because real-time acquisition is not needed, the waste of computing resources is reduced while the accuracy of the data is still ensured. Hot news newly added during the acquisition time period can be understood with reference to fig. 4. At the start time of the acquisition time period, as shown in fig. 4 (a), the page includes hot news 1, hot news 2, and hot news 3; at the end time of the acquisition time period, as shown in fig. 4 (b), the page includes hot news 1, hot news 2, and hot news 4. Hot news 4 has thus replaced hot news 3 after the page was refreshed, and the hot news acquired by the server in the acquisition time period is hot news 1, hot news 2, hot news 3, and hot news 4.
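The effect of an acquisition time period can be sketched as an order-preserving union of page snapshots. The function name `collect_period` and the use of plain strings for hot news are assumptions made for this example.

```python
from typing import Iterable, List


def collect_period(snapshots: Iterable[List[str]]) -> List[str]:
    """Union of all snapshots taken in one acquisition period.

    Items keep the order in which they were first seen, so news that was
    replaced after a refresh is retained alongside the newly added news.
    """
    seen = set()
    collected: List[str] = []
    for snapshot in snapshots:
        for item in snapshot:
            if item not in seen:
                seen.add(item)
                collected.append(item)
    return collected


start = ["hot news 1", "hot news 2", "hot news 3"]  # page at the start time
end = ["hot news 1", "hot news 2", "hot news 4"]    # page at the end time
period_news = collect_period([start, end])
print(period_news)
# ['hot news 1', 'hot news 2', 'hot news 3', 'hot news 4']
```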
Optionally, in another embodiment of the method for acquiring hot news provided in the embodiment of the present application, the clustering initial hot news of which the similarity is greater than the similarity threshold in the initial hot news to obtain intermediate hot news may include:
determining the field to which the initial hot news belongs;
and clustering the initial hot news with the similarity larger than the similarity threshold into intermediate hot news aiming at the initial hot news in the same field.
In the process of clustering the initial hot news, in order to reduce the amount of computation spent on word-by-word analysis of the news, the field to which each initial hot news belongs may be determined first. The fields may include society, finance, science, education, and the like. Similarity calculation and clustering are then performed only among initial hot news in the same field.
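The saving can be seen by counting the pairs that need a similarity calculation. The sketch below groups news by field and only pairs items within a field; the function name `candidate_pairs` and the sample field labels are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Tuple


def candidate_pairs(news_fields: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return only the pairs of news items that share the same field."""
    by_field: Dict[str, List[str]] = defaultdict(list)
    for news_id, field in news_fields.items():
        by_field[field].append(news_id)

    pairs: List[Tuple[str, str]] = []
    for items in by_field.values():
        pairs.extend(combinations(items, 2))
    return pairs


fields = {
    "n1": "finance", "n2": "finance", "n3": "society",
    "n4": "society", "n5": "science", "n6": "finance",
}
pairs = candidate_pairs(fields)
all_pairs = list(combinations(fields, 2))
print(len(pairs), len(all_pairs))  # 4 15
```

In this small example, 4 comparisons are needed instead of the 15 required when every item is compared with every other item.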
Optionally, in another embodiment of the method for acquiring hot news provided in the embodiment of the present application, the method may further include:
aiming at initial hot news in the same field, determining an entity and a keyword, wherein the entity is a participant object in the news;
and determining the similarity of each initial hot news in the same field according to the entity and the keyword.
In this embodiment, the entity may be a company, an organization, or a person in the news, and the keyword may be a verb or a noun related to the entity. For example, in the news "iQIYI raises funds to go public", "iQIYI" may be the entity and "raises funds" may be the keyword.
For initial hot news in the same field, the same or very similar hot news may exist, or different or dissimilar hot news may exist, so that the similarity needs to be further calculated according to the entity and the keyword.
Optionally, in another embodiment of the method for acquiring hot news provided in the embodiment of the present application, the determining, according to the entity and the keyword, the similarity of each initial hot news in the same field may include:
determining the occurrence frequency and position of the same entity in each initial hot news in the same field;
determining initial hot news meeting initial clustering conditions in the same field according to the times and the positions;
and for each initial hot news in the initial hot news meeting the initial clustering conditions, carrying out similarity scoring on each initial hot news according to keywords.
In the embodiment of the present application, the more times the same entity appears in different hot news, the higher the correlation between those hot news items. An entity may also appear only a few times but in an important position, for example at the beginning of the news, in which case the correlation is likewise high. Therefore, when determining the initial hot news that meets the initial clustering condition, a weighted calculation may be performed on the number of occurrences and the positions of the entity, and the calculation result determines which news items meet the initial clustering condition. For the initial hot news that meets the initial clustering condition, the similarity is then determined according to the keywords. Suppose there are two initial hot news items, one being "iQIYI raises funds to go public" and the other being "iQIYI acquires another video website". In the entity judgment, "iQIYI" appears many times in both news items and in important positions. In the keyword judgment, however, one keyword is "raises funds" and the other is "acquires", so the similarity of the two hot news items is determined to be low. For example, if the similarity threshold is 8 points and the calculated similarity is 3 points, the similarity is less than 8 points and the final clustering condition is not satisfied.
Now suppose there are two initial hot news items, one being "iQIYI raises funds" and the other being "iQIYI raises funds, and the amount raised is 500 million US dollars". In the entity judgment, "iQIYI" appears many times in both news items and in important positions. In the keyword judgment, one keyword is "raises funds" and the other is also "raises funds", so the similarity of the two hot news items is determined to be high. For example, if the similarity threshold is 8 points and the calculated similarity is 9 points, the similarity is greater than 8 points and the final clustering condition is satisfied. The two hot news items can then be clustered into one intermediate hot news, which can retain all the contents of both items and only associates them as the progress of one event in time order.
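The two-stage check described above can be sketched as follows. The application does not disclose the weighting or scoring formulas, so everything concrete here is an assumption: the position bonus for the first five tokens, the entity threshold of 2.0, and a keyword score defined as Jaccard overlap scaled to 0 to 10 points. The entity name "ExampleCo" is also hypothetical, and the scores produced by this sketch are not the 3 and 9 points quoted in the text.

```python
from typing import List, Set


def entity_weight(tokens: List[str], entity: str,
                  lead_len: int = 5, lead_bonus: float = 2.0) -> float:
    """Weighted occurrence count: an occurrence within the first `lead_len`
    tokens counts `lead_bonus`, any later occurrence counts 1."""
    return sum(lead_bonus if i < lead_len else 1.0
               for i, tok in enumerate(tokens) if tok == entity)


def keyword_score(a: Set[str], b: Set[str]) -> float:
    """Assumed scoring rule: Jaccard overlap of keyword sets, scaled to 0-10."""
    if not a or not b:
        return 0.0
    return 10.0 * len(a & b) / len(a | b)


def should_cluster(tokens_a: List[str], tokens_b: List[str], entity: str,
                   keywords_a: Set[str], keywords_b: Set[str],
                   entity_threshold: float = 2.0,
                   similarity_threshold: float = 8.0) -> bool:
    # Stage 1: initial clustering condition based on entity count and position.
    weakest = min(entity_weight(tokens_a, entity), entity_weight(tokens_b, entity))
    if weakest < entity_threshold:
        return False
    # Stage 2: final clustering condition based on keyword similarity.
    return keyword_score(keywords_a, keywords_b) > similarity_threshold


fund = ["ExampleCo", "raises", "funds", "to", "go", "public"]
fund_amount = ["ExampleCo", "raises", "funds", "of", "500", "million", "dollars"]
acquire = ["ExampleCo", "acquires", "another", "video", "website"]

same_event = should_cluster(fund, fund_amount, "ExampleCo",
                            {"raise funds"}, {"raise funds"})
different_event = should_cluster(fund, acquire, "ExampleCo",
                                 {"raise funds"}, {"acquire"})
print(same_event, different_event)  # True False
```

Both pairs pass the entity stage, because the entity opens each headline, but only the pair with matching keywords passes the keyword stage.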
Optionally, in another embodiment of the method for obtaining hot news provided in the embodiment of the present application, the determining a domain to which the initial hot news belongs may include:
and determining the field of the initial hot news according to the link information of the news release source from which the initial hot news comes and the text keywords of the initial hot news.
In the embodiment of the application, the link information of a news release source usually contains a hint about the field. For sports news, for example, the link information often contains a hint such as "sport". Of course, the link information of a news release source may also contain no field-related hint, so the field to which the initial hot news belongs needs to be further determined according to the text keywords.
Optionally, in another embodiment of the method for acquiring hot news provided in the embodiment of the present application, the determining, according to the link information of the news release source from which the initial hot news comes and the text keyword of the initial hot news, a field to which the initial hot news belongs may include:
analyzing link information of a news release source from which the initial hot news comes, and determining an estimated field to which the initial hot news belongs;
calculating cosine similarity according to the text keywords of the initial hot news and the center vector of the estimated field;
and if the cosine similarity is larger than a preset threshold value, the estimated field is the field to which the initial hot news belongs.
In the embodiment of the application, the text keywords can be extracted from the body text of the initial hot news, for example by the term frequency-inverse document frequency (TF-IDF) technique, a weighting technique commonly used in information retrieval and data mining.
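As a rough illustration of TF-IDF keyword extraction, the sketch below ranks the terms of one document against a small corpus. It assumes the news text has already been tokenized elsewhere; the function name, the `top_k` parameter and the plain `log(N / df)` form of the inverse document frequency are choices made for this example only.

```python
import math
from collections import Counter

def tfidf_keywords(docs, doc_index, top_k=3):
    """Return the top_k terms of docs[doc_index] ranked by TF-IDF.

    docs is a list of token lists (tokenization is assumed to be done elsewhere).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    tf = Counter(docs[doc_index])
    length = len(docs[doc_index])
    scores = {
        term: (count / length) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]
```

A term that occurs in every document gets an inverse document frequency of zero, so common words drop to the bottom of the ranking without a separate stop-word list.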
In the embodiment of the application, the field indicated by the link information of the news release source is taken as the estimated field, and the cosine similarity is then calculated between the text keyword vector of the initial hot news and the center vector of the estimated field, where the initial center vector of the estimated field is the text keyword vector of the first hot news in that field. If the cosine similarity is greater than a preset threshold, the estimated field is judged to be the field to which the initial hot news belongs; if the cosine similarity is less than the preset threshold, the estimated field is judged not to be the field to which the initial hot news belongs. When the estimated field is the field to which the initial hot news belongs, the text keyword vector of the hot news can be added to the field set, and the center vector is updated to the geometric center of the two similar vectors. As more content is added over successive iterations, the center vector converges.
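The check against the center vector and its update can be sketched as below. The sparse dictionary representation, the `FieldCenter` class, the default threshold of 0.5 and the use of a running mean over all accepted vectors as the "geometric center" are assumptions made for this illustration, not details stated in the application.

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

class FieldCenter:
    """Running center vector of one estimated field."""

    def __init__(self):
        self.center = {}
        self.count = 0

    def accept(self, vector, threshold=0.5):
        """Return True and update the center if the vector belongs to the field."""
        if self.count and cosine(vector, self.center) <= threshold:
            return False
        # Incremental mean: the first vector becomes the center, later ones shift it.
        self.count += 1
        for term in set(self.center) | set(vector):
            old = self.center.get(term, 0.0)
            self.center[term] = old + (vector.get(term, 0.0) - old) / self.count
        return True
```

Because each accepted vector moves the mean by a factor of one over the number of vectors seen, the center changes less and less as more news is added, which matches the convergence behavior described above.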
Optionally, in another embodiment of the method for acquiring hot news provided in the embodiment of the present application, the method may further include:
determining the popularity of the intermediate hot news according to the authority level of the news release source to which the initial hot news corresponding to the intermediate hot news belongs, the browsed quantity and the concerned quantity of the corresponding initial hot news;
and determining the popularity of the non-clustered initial hot news according to the authority level, the browsed amount and the concerned amount of the news release source to which the non-clustered initial hot news belongs.
In the embodiment of the application, when the popularity of the intermediate hot news is determined, the authority of the news release source from which the corresponding initial hot news comes, the number of times users have browsed the news, and the numbers of likes and comments from users can all be taken into account. In addition, the weight of the news can be increased according to its Baidu ranking and Weibo hotspots, and the top-ranked news is finally obtained by calculation.
The process of acquiring hot news described above in the embodiment of the present application mainly includes three processes of acquiring initial hot news, clustering the acquired initial hot news, and calculating the hot degree of the hot news, and for the three processes, reference may also be made to the processes in the following drawings for understanding.
As shown in fig. 5, the acquisition process for the initial hot news may include:
201. Acquire a news release source list.
The news release sources may be different news websites.
202. Determine the hot news layout.
Since the areas in which hot news is arranged may differ between news websites, a different hot news acquisition template can be configured for each news website.
203. Monitor page changes.
This step monitors changes to the hot news pages of each news website.
204. Pull the initial hot news.
When the initial hot news is pulled, a JavaScript script can be called to refresh the page automatically, and the incremental news is obtained from the data changes before and after the refresh.
205. Extract the text of the hot news.
The extracted text may be stored in a database.
206. Generate the abstract.
The abstract can be generated by extracting the core content of the text through a deep neural network.
207. Store the increment.
The incremental hot news is also stored in the database.
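The incremental part of this acquisition process (steps 204 and 207) can be sketched as a comparison of the hot news list before and after a page refresh. Page fetching, template-based extraction and the real database are outside this sketch; `pull_increment` and the in-memory `HotNewsStore` are hypothetical names used only for illustration.

```python
def pull_increment(before, after):
    """Return headlines present after the refresh but not before, in page order."""
    seen = set(before)
    increment = []
    for headline in after:
        if headline not in seen:
            seen.add(headline)  # also drops repeats within the refreshed page
            increment.append(headline)
    return increment

class HotNewsStore:
    """In-memory stand-in for the database used in steps 205 and 207."""

    def __init__(self):
        self.rows = []

    def ingest(self, before, after):
        """Store only the newly added hot news and return it."""
        new_rows = pull_increment(before, after)
        self.rows.extend(new_rows)
        return new_rows
```

Comparing by headline is the simplest possible key; a real system would more likely compare by link or by a content hash.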
As shown in fig. 6, the clustering process for the obtained initial hot news may include:
301. Analyze the link information.
Analyze whether the link contains a field-related hint, in order to determine the estimated field.
302. Extract the text keywords.
Keywords are extracted from the text content.
303. Determine the field similarity.
Determine, according to the text keywords, whether the estimated field is the field to which the initial hot news belongs.
304. Determine, according to the field similarity, that the estimated field is the field to which the initial hot news belongs.
Steps 301 to 304 above can be understood as the field classification process.
305. Extract entities from the initial hot news.
306. Extract keywords from the initial hot news.
307. Determine, according to the entities and keywords, which initial hot news items in the same field have a similarity greater than the threshold, that is, which items are the same or similar news.
308. Perform hierarchical clustering.
Hierarchical clustering associates two or more initial hot news items; the association can be established purely according to chronological order, without changing any of the initial hot news items.
Steps 305 to 308 above can be understood as the clustering process.
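The association step can be sketched as follows. The application does not say which linkage or algorithm is used, so this example substitutes a simple transitive grouping with a union-find structure; the `similar` predicate, the `"time"` key and the function name are assumptions made for the illustration.

```python
def cluster_by_similarity(news, similar):
    """Group news items for which `similar(a, b)` is True, transitively.

    Each item is a dict with at least a "time" key; the items are not modified.
    """
    parent = list(range(len(news)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(news)):
        for j in range(i + 1, len(news)):
            if similar(news[i], news[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, item in enumerate(news):
        groups.setdefault(find(i), []).append(item)
    # Inside each group, order the items chronologically to show event progress.
    return [sorted(group, key=lambda n: n["time"]) for group in groups.values()]
```

A group with a single member corresponds to an initial hot news item that was not clustered; a group with several members corresponds to one intermediate hot news item.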
As shown in fig. 7, the hot news popularity calculation process may include:
401. Obtain the authority level of the news release source.
Indexes such as the number of times users have browsed the initial hot news and the numbers of likes and comments can also be obtained.
402. News timeliness.
The timeliness of the news is used as a weight in calculating the popularity.
403. Weight by the number of matches.
If multiple news release sources all publish the same or similar initial hot news, the popularity of that news can be raised according to the number of sources publishing it.
404. Score the news quality.
News quality affects whether users will browse the news, so it also affects the popularity of the news.
Steps 401 to 404 above can be understood as the basic scoring.
405. Baidu ranking weighting.
406. Weibo hotspot weighting.
For initial hot news that has received a basic score, the popularity score is weighted and adjusted according to its ranking position on Baidu and how hot it is on Weibo.
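One possible shape of this scoring is sketched below. The application names the factors but gives no formula, so the multiplicative combination, the logarithmic damping, the 24-hour half-life and the multipliers for the Baidu ranking and Weibo hotspots are all assumptions chosen for the example.

```python
import math

def base_score(authority, engagement, age_hours, source_count, quality,
               half_life_hours=24.0):
    """Steps 401 to 404: combine the basic factors into one score."""
    timeliness = 0.5 ** (age_hours / half_life_hours)  # halves every half-life
    match_boost = 1.0 + math.log(source_count)         # 1.0 when only one source
    return authority * math.log1p(engagement) * timeliness * match_boost * quality

def adjusted_score(base, baidu_rank=None, weibo_trending=False):
    """Steps 405 and 406: weight the basic score by external ranking signals."""
    score = base
    if baidu_rank is not None:
        score *= 1.0 + 1.0 / baidu_rank  # rank 1 doubles the score
    if weibo_trending:
        score *= 1.5
    return score
```

Here `engagement` stands for the combined count of views, likes and comments, and `source_count` is the number of news release sources that published the same or similar item.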
In the above embodiments, all of the processes of the server acquiring the hot news are described, and a process of interactively displaying the acquired target hot news by the server and the terminal in the embodiment of the present application is introduced below with reference to fig. 8.
As shown in fig. 8, another embodiment of the system for acquiring hot news provided by the embodiment of the present application includes a server 10 and a plurality of terminals 40, and the server 10 and the terminals 40 communicate via a network 20. Any terminal 40 on which the news application provided in the embodiments of the present application is installed may request hot news from the server. In this process, the terminal 40 responds to a selection operation on the news application, and the selection operation may be a click operation on the news application by the user. The terminal 40 sends a hotspot request message for the news application to the server 10, and the server 10 determines the current target hot news, which is the latest target hot news at the time the hotspot request is received. The terminal 40 receives the target hot news returned by the server 10 and then displays it.
In the embodiment of the present application, the terminal 40 may assign each target hot news item to a preset display style for display according to the popularity ranking of the multiple target hot news items returned by the server 10.
The terminal 40 may receive the display style indication information returned by the server 10, and then the terminal displays the current target hot news according to the display style indication information.
The display style may be bold display, colored display, or display in a special display frame for the news with the highest popularity, while other news with lower popularity may be displayed in different fonts, different font sizes, or different display frame sizes. Of course, the personalized display manner in the embodiment of the present application is not limited to the cases mentioned above.
As shown in fig. 9, different target hot news items may be displayed on the terminal in different fonts, for example regular script (KaiTi) and Song typeface (SongTi), in different font sizes, and in display frames of different shapes, for example rectangles and ellipses. In an actual product, colors and pictures may of course also be used. This multi-form display of hot news can improve the dissemination effect of the news.
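The assignment of preset display styles by popularity ranking can be sketched as a lookup on the terminal side. The concrete style table, its field names and the rule that all lower-ranked items share the last style are hypothetical and serve only to illustrate the idea.

```python
# Assumed preset styles, ordered from the hottest news to the least hot.
PRESET_STYLES = [
    {"font": "KaiTi", "size": 24, "bold": True, "frame": "ellipse"},
    {"font": "SongTi", "size": 18, "bold": False, "frame": "rectangle"},
    {"font": "SongTi", "size": 14, "bold": False, "frame": "rectangle"},
]

def assign_styles(ranked_titles):
    """Map titles (already ordered by popularity, hottest first) to preset styles."""
    last = len(PRESET_STYLES) - 1
    return [
        {"title": title, "style": PRESET_STYLES[min(rank, last)]}
        for rank, title in enumerate(ranked_titles)
    ]
```

Items ranked beyond the end of the table fall back to the last preset style, so the function works for any number of target hot news items.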
As shown in fig. 10, an embodiment of the method for obtaining hot news through interaction between a server and a terminal according to the embodiment of the present application includes:
501. The terminal responds to the selection operation on the news application.
502. The terminal sends a hotspot request message for the news application to the server.
503. After receiving the hotspot request message, the server determines the target hot news.
The target hot news is obtained by the server acquiring initial hot news from each of a plurality of news release sources and processing the initial hot news, where the initial hot news is the hot news of each news release source.
504. The server sends the target hot news and the display style indication information of the target hot news to the terminal.
505. The terminal displays the current target hot news according to the display style indication information.
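The exchange in steps 502 to 505 can be sketched with JSON messages. The application does not define a message format, so the field names (`type`, `items`, `rank`, `style`), the two style values and both function names are assumptions made for this illustration.

```python
import json

def handle_hotspot_request(request_json, target_hot_news):
    """Server side of steps 503 and 504: reply with the current target hot news."""
    request = json.loads(request_json)
    if request.get("type") != "hotspot_request":
        return json.dumps({"error": "unsupported request"})
    ranked = sorted(target_hot_news, key=lambda n: n["heat"], reverse=True)
    items = [
        {"title": n["title"], "rank": rank,
         "style": "highlight" if rank == 1 else "normal"}
        for rank, n in enumerate(ranked, start=1)
    ]
    return json.dumps({"type": "hotspot_response", "items": items})

def render(response_json):
    """Terminal side of step 505: turn the response into display lines."""
    response = json.loads(response_json)
    return [
        "** " + item["title"] + " **" if item["style"] == "highlight" else item["title"]
        for item in response.get("items", [])
    ]
```

In this sketch the server decides the style, which corresponds to sending display style indication information; sending only the `rank` field and letting the terminal choose the style would correspond to the ranking-based variant.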
The above is a description of a system and a method for acquiring hot news, and a server and a terminal in the embodiment of the present application are introduced below with reference to the accompanying drawings.
As shown in fig. 11, an embodiment of the server 60 provided in the embodiment of the present application includes:
an obtaining unit 601, configured to obtain initial hot news from each news release source in multiple news release sources, where the initial hot news is hot news of each news release source;
a clustering unit 602, configured to cluster the initial hot news with the similarity greater than a similarity threshold value in the initial hot news acquired by the acquiring unit 601 to obtain intermediate hot news;
the determining unit 603 is configured to determine, as target hot news, intermediate hot news obtained by clustering performed by the clustering unit 602 with the heat degree greater than the heat degree threshold, and initial hot news which is not clustered and has the heat degree greater than the heat degree threshold.
According to the embodiment of the application, the hot news of each news release source in the whole network can be acquired and processed, and the hot news with the highest popularity ranking among the hot news published by the news release sources in the whole network can then be obtained. Therefore, the efficiency of acquiring hot news in the whole network is improved.
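The selection performed by the determining unit 603 can be sketched as a threshold filter over both kinds of candidates. The `"heat"` key, the function name and the final sorting by heat are assumptions of this example; the application only requires that the heat exceed the heat threshold.

```python
def determine_target_hot_news(intermediate, unclustered, heat_threshold):
    """Keep intermediate and unclustered items whose heat exceeds the threshold."""
    candidates = list(intermediate) + list(unclustered)
    targets = [item for item in candidates if item["heat"] > heat_threshold]
    # Sorting by heat is an added convenience for ranking, not a stated requirement.
    return sorted(targets, key=lambda item: item["heat"], reverse=True)
```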
Optionally, the obtaining unit 601 is configured to:
determining a hot news layout corresponding to each news release source;
and acquiring the respective hot news of each news release source from the corresponding position of the hot news layout corresponding to each news release source.
Optionally, the obtaining unit 601 is configured to: acquire the hot news of each news release source in an acquisition time period, wherein the hot news in the acquisition time period comprises the hot news at the initial moment of the acquisition time period and newly-added hot news after the news release source page is refreshed in the acquisition process.
Optionally, the clustering unit 602 is configured to:
determining the field to which the initial hot news belongs;
and clustering the initial hot news with the similarity larger than the similarity threshold into intermediate hot news aiming at the initial hot news in the same field.
Optionally, the clustering unit 602 is further configured to:
aiming at initial hot news in the same field, determining an entity and a keyword, wherein the entity is a participant object in the news;
and determining the similarity of each initial hot news in the same field according to the entity and the keyword.
Optionally, the clustering unit 602 is configured to:
determining the occurrence times and positions of the same entity in each initial hot news in the same field;
determining initial hot news meeting initial clustering conditions in the same field according to the times and the positions;
and for each initial hot news in the initial hot news meeting the initial clustering conditions, carrying out similarity scoring on each initial hot news according to keywords.
Optionally, the clustering unit 602 is configured to:
and determining the field of the initial hot news according to the link information of the news release source from which the initial hot news comes and the text keywords of the initial hot news.
Optionally, the clustering unit 602 is configured to:
analyzing link information of a news release source from which the initial hot news comes, and determining an estimated field to which the initial hot news belongs;
calculating cosine similarity according to the text keywords of the initial hot news and the center vector of the estimated field;
and if the cosine similarity is larger than a preset threshold value, the estimated field is the field to which the initial hot news belongs.
Optionally, the determining unit 603 is further configured to:
determining the popularity of the intermediate hot news according to the authority level of the news release source to which the initial hot news corresponding to the intermediate hot news belongs, the browsed quantity and the concerned quantity of the corresponding initial hot news;
and determining the popularity of the non-clustered initial hot news according to the authority level, the browsed amount and the concerned amount of the news release source to which the non-clustered initial hot news belongs.
Optionally, as shown in fig. 12, the server provided in the embodiment of the present application further includes a receiving unit 604 and a sending unit 605,
the receiving unit 604 is configured to receive a hotspot request sent by a terminal;
the determining unit 603 is further configured to determine the current target hot news according to the hotspot request;
the sending unit 605 is configured to send the current target hot news to the terminal.
Optionally, the sending unit 605 is further configured to send display style indication information of the current target hot news to the terminal, where the display style indication information is used to indicate that the terminal displays the current target hot news according to the display style.
Optionally, the sending unit 605 is further configured to send a popularity ranking of each target hot news in the current target hot news to the terminal, where the popularity ranking is used for the terminal to allocate the target hot news to a preset display style for displaying.
The above description of the server 60 can be understood with reference to the corresponding description of the server side in the system and method embodiments of fig. 1 to fig. 10, and is not repeated herein.
As shown in fig. 13, an embodiment of the terminal 70 provided in the embodiment of the present application includes:
an input unit 701 for responding to a selection operation for a news application;
a sending unit 702, configured to send a hot spot request message for the news application to a server according to the selection operation input by the input unit 701;
a receiving unit 703, configured to receive target hot news sent by the server, where the target hot news is obtained by the server acquiring initial hot news from each news release source in multiple news release sources and processing the initial hot news, and the initial hot news is the hot news of each news release source;
a displaying unit 704, configured to display the target hot news received by the receiving unit 703.
According to the embodiment of the application, the hot news of each news release source in the whole network can be acquired and processed, and the hot news with the highest popularity ranking among the hot news published by the news release sources in the whole network can then be obtained. Therefore, the efficiency of acquiring hot news in the whole network is improved.
Optionally, the receiving unit 703 is further configured to receive display style indication information of the target hot news, which is sent by the server;
the display unit 704 is configured to display the target hot news according to the display style indication information.
Optionally, the receiving unit 703 is further configured to receive a popularity rank of each target hot news in the target hot news sent by the server;
the display unit 704 is configured to allocate the target hot news to a preset display style according to the popularity ranking for displaying.
The diversified display forms provided by the embodiment of the application can be beneficial to improving the propagation effect of hot news.
The above description of the terminal 70 can also be understood by referring to the related description of the terminal side in the foregoing embodiments, and will not be repeated herein.
Fig. 14 is a schematic structural diagram of a server 60 according to an embodiment of the present application. The server 60 includes a processor 610, a memory 640, and an input/output (I/O) interface 630. The memory 640 may include a read-only memory and a random access memory, and provides operating instructions and data to the processor 610. A portion of the memory 640 may also include non-volatile random access memory (NVRAM).
In some embodiments, memory 640 stores the following elements, executable modules or data structures, or a subset or expanded set thereof:
in the embodiment of the present application, in the process of acquiring the hot news, by calling the operation instruction stored in the memory 640 (the operation instruction may be stored in the operating system),
the processor 610 is configured to:
acquiring initial hot news from each news release source in a plurality of news release sources, wherein the initial hot news is the hot news of each news release source;
clustering the initial hot news with the similarity larger than a similarity threshold value in the initial hot news to obtain intermediate hot news;
and determining the intermediate hot news with the heat degree larger than the heat degree threshold value and the initial hot news which is not clustered and has the heat degree larger than the heat degree threshold value as target hot news.
According to the embodiment of the application, the hot news with the highest rank in the hot news published by each news publishing source in the whole network can be obtained by obtaining and processing the hot news of each news publishing source in the whole network. Therefore, the efficiency of acquiring hot spot news in the whole network is improved.
Processor 610 controls the operation of server 60, and processor 610 may also be referred to as a CPU (Central Processing Unit). The various components of the server 60 in a particular application are coupled together by a bus system 620, where the bus system 620 may include a power bus, a control bus, a status signal bus, etc., in addition to a data bus. For clarity of illustration, however, the various buses are labeled in the figure as bus system 620.
The method disclosed in the embodiments of the present application may be applied to the processor 610, or may be implemented by the processor 610. The processor 610 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 610. The processor 610 may be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an erasable programmable memory, or a register. The storage medium is located in the memory 640, and the processor 610 reads the information in the memory 640 and performs the steps of the above method in combination with its hardware.
Optionally, the processor 610 is configured to:
determining a hot news layout corresponding to each news release source;
and acquiring the respective hot news of each news release source from the corresponding position of the hot news layout corresponding to each news release source.
Optionally, the processor 610 is configured to:
and acquiring hot news of each news release source in an acquisition time period, wherein the hot news in the acquisition time period comprises the hot news at the initial moment of the acquisition time period and newly-added hot news after the news release source page is refreshed in the acquisition process.
Optionally, the processor 610 is configured to:
determining the field to which the initial hot news belongs;
and clustering the initial hot news with the similarity larger than the similarity threshold into intermediate hot news aiming at the initial hot news in the same field.
Optionally, the processor 610 is configured to:
aiming at initial hot news in the same field, determining an entity and a keyword, wherein the entity is a participant object in the news;
and determining the similarity of each initial hot news in the same field according to the entity and the keyword.
Optionally, the processor 610 is configured to:
determining the occurrence frequency and position of the same entity in each initial hot news in the same field;
determining initial hot news meeting initial clustering conditions in the same field according to the times and the positions;
and for each initial hot news in the initial hot news meeting the initial clustering conditions, carrying out similarity scoring on each initial hot news according to keywords.
Optionally, the processor 610 is configured to:
and determining the field of the initial hot news according to the link information of the news release source from which the initial hot news comes and the text keywords of the initial hot news.
Optionally, the processor 610 is configured to:
analyzing link information of a news release source from which the initial hot news comes, and determining an estimated field to which the initial hot news belongs;
calculating cosine similarity according to the text keywords of the initial hot news and the center vector of the estimated field;
and if the cosine similarity is larger than a preset threshold value, the estimated field is the field to which the initial hot news belongs.
Optionally, the processor 610 is configured to:
determining the popularity of the intermediate hot news according to the authority level of the news release source to which the initial hot news corresponding to the intermediate hot news belongs, the browsed quantity and the concerned quantity of the corresponding initial hot news;
and determining the popularity of the non-clustered initial hot news according to the authority level, the browsed amount and the concerned amount of the news release source to which the non-clustered initial hot news belongs.
Optionally, the I/O interface 630 is configured to receive a hotspot request sent by the terminal;
the processor 610 is configured to determine the current target hot news according to the hotspot request;
the I/O interface 630 is configured to send the current target hot news to the terminal.
Optionally, the I/O interface 630 is further configured to send display style indication information of the current target hot news to the terminal, where the display style indication information is used to indicate the terminal to display the current target hot news according to the display style.
Optionally, the I/O interface 630 is further configured to send a popularity ranking of each target hot news in the current target hot news to the terminal, where the popularity ranking is used for the terminal to allocate the target hot news to a preset display style for displaying.
The above description of the server 60 can be understood with reference to the corresponding description of the server portion of fig. 1-10.
The process of acquiring news hotspots is executed by a terminal device, which may be any terminal device such as a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of Sales) terminal, or a vehicle-mounted computer. The following takes a mobile phone as an example of the terminal:
Fig. 15 is a block diagram showing a partial structure of a mobile phone related to the terminal device provided in an embodiment of the present invention. Referring to fig. 15, the mobile phone includes: a radio frequency (RF) circuit 1110, a memory 1120, an input unit 1130, a display unit 1140, a sensor 1150, an audio circuit 1160, a wireless fidelity (WiFi) module 1170, a processor 1180, and a camera 1190. Those skilled in the art will appreciate that the mobile phone structure shown in fig. 15 does not limit the mobile phone, which may include more or fewer components than shown, combine some components, or arrange the components differently.
The following specifically describes each constituent component of the mobile phone with reference to fig. 15:
The RF circuit 1110, which is also a transceiver, may be used for transmitting and receiving information or signals during a call. Specifically, downlink information received from the base station is passed to the processor 1180 for processing, and uplink data is transmitted to the base station. In general, the RF circuit 1110 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 1110 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.
The memory 1120 may be used to store software programs and modules, and the processor 1180 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 1120. The memory 1120 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the mobile phone, and the like. Further, the memory 1120 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 1130 may be used to receive a user selection operation and generate a key signal input related to user setting and function control of the cellular phone. For example: responding to the selection operation of the user to the news application. Specifically, the input unit 1130 may include a touch panel 1131 and other input devices 1132. Touch panel 1131, also referred to as a touch screen, can collect touch operations of a user on or near the touch panel 1131 (for example, operations of the user on or near touch panel 1131 by using any suitable object or accessory such as a finger or a stylus pen), and drive corresponding connection devices according to a preset program. Alternatively, the touch panel 1131 may include two parts, namely, a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 1180, and can receive and execute commands sent by the processor 1180. In addition, the touch panel 1131 can be implemented by using various types, such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 1130 may include other input devices 1132 in addition to the touch panel 1131. In particular, other input devices 1132 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 1140 may be used to display the target hot news. The display unit 1140 may include a display panel 1141, and optionally, the display panel 1141 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 1131 can cover the display panel 1141; when the touch panel 1131 detects a touch operation on or near it, the operation is transmitted to the processor 1180 to determine the type of the touch event, and the processor 1180 then provides a corresponding visual output on the display panel 1141 according to the type of the touch event. Although in fig. 15 the touch panel 1131 and the display panel 1141 are two independent components that implement the input and output functions of the mobile phone, in some embodiments the touch panel 1131 and the display panel 1141 may be integrated to implement the input and output functions of the mobile phone.
The handset may also include at least one sensor 1150, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 1141 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 1141 and/or the backlight when the mobile phone moves to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing gestures of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer gesture calibration), vibration recognition related functions (such as pedometers and taps), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, the description is omitted here.
The audio circuit 1160, the speaker 1161, and the microphone 1162 may provide an audio interface between the user and the mobile phone. The audio circuit 1160 may transmit the electrical signal converted from received audio data to the speaker 1161, which converts it into a sound signal for output; conversely, the microphone 1162 converts a collected sound signal into an electrical signal, which the audio circuit 1160 receives and converts into audio data. The audio data is then output to the processor 1180 for processing and sent to, for example, another mobile phone via the RF circuit 1110, or output to the memory 1120 for further processing.
WiFi belongs to short-distance wireless transmission technology, and the cell phone can help a user to receive and send e-mails, browse webpages, access streaming media and the like through the WiFi module 1170, and provides wireless broadband internet access for the user. Although fig. 15 shows the WiFi module 1170, it is understood that it does not belong to the essential constitution of the handset, and can be omitted entirely as necessary within the scope not changing the essence of the invention.
The processor 1180 is the control center of the mobile phone. It connects the various parts of the whole mobile phone through various interfaces and lines, and executes the various functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in the memory 1120 and calling the data stored in the memory 1120. Optionally, the processor 1180 may include one or more processing units; preferably, the processor 1180 may integrate an application processor, which mainly handles the operating system, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor may also not be integrated into the processor 1180.
The camera 1190 is used for collecting images.
The mobile phone further includes a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the processor 1180 through a power management system, so that functions of managing charging, discharging, power consumption, and the like are implemented through the power management system.
Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which are not described herein.
In this embodiment of the present invention, the processor 1180 included in the terminal further has the following control functions:
responding to the selection operation of the news application;
sending a hot spot request message aiming at the news application to a server;
receiving target hot news sent by the server, wherein the target hot news is initial hot news acquired by the server from each news release source in a plurality of news release sources and is obtained by processing the initial hot news, and the initial hot news is the hot news of each news release source;
and displaying the target hot news.
Optionally, when receiving the target hot news sent by the server, the method may further include:
receiving display style indication information of the target hot news sent by the server;
the presenting the target hot news may include:
and displaying the target hot news according to the display style indication information.
Optionally, when receiving the target hot news sent by the server, the method may further include:
receiving the popularity ranking of each target hot news in the target hot news sent by the server;
the presenting the target hot news may include:
and distributing the target hot news into a preset display style according to the hot ranking for displaying.
The above description of the functions of the terminal can also be understood by referring to the relevant contents at the terminal side in the embodiments described in fig. 1 to fig. 10, and will not be repeated herein.
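As an illustration only, the ranking-based display described above could be sketched as follows. The style names (`headline`, `card`, `list_item`) and the slot counts are assumptions made for this sketch; the embodiments only state that the target hot news is distributed into a preset display style according to the popularity ranking.

```python
def assign_display_styles(ranked_news, headline_slots=1, card_slots=2):
    """Map each news title to a preset display style by its popularity rank.

    ranked_news: list of (title, rank) pairs, where rank 1 is the hottest.
    The style names and slot counts are hypothetical.
    """
    layout = []
    # Process items from hottest to coolest, whatever order they arrived in.
    for title, rank in sorted(ranked_news, key=lambda pair: pair[1]):
        if rank <= headline_slots:
            style = "headline"
        elif rank <= headline_slots + card_slots:
            style = "card"
        else:
            style = "list_item"
        layout.append((title, style))
    return layout
```

With the default slot counts, the top-ranked item takes the headline position, the next two are shown as cards, and the remainder fall into a plain list.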
The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of the application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via a wired connection (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or a wireless connection (e.g., infrared, radio, microwave, etc.). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: ROM, RAM, magnetic or optical disks, and the like.
The method, the device, and the computer-readable storage medium for acquiring hot news provided by the embodiments of the application have been described in detail above. Specific examples are used in the description to explain the principle and the implementation of the application, and the description of the embodiments is only intended to help in understanding the method and the core idea of the application. Meanwhile, a person skilled in the art may, following the idea of the application, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the application.

Claims (16)

1. A method for obtaining hot news is characterized by comprising the following steps:
determining a hot news layout corresponding to each news release source, wherein the hot news layout corresponding to different news release sources has different hot news arrangement areas, and different hot news acquisition templates are configured for different news release sources;
acquiring respective hot news of each news release source from the corresponding position of the hot news layout corresponding to each news release source to serve as initial hot news;
analyzing link information of a news release source from which the initial hot news comes, and determining an estimation field to which the initial hot news belongs;
calculating cosine similarity according to the text keywords of the initial hot news and the center vector of the estimated field;
if the cosine similarity is larger than a preset threshold value, determining that the pre-estimated field is the field to which the initial hot news belongs, wherein if the pre-estimated field is the field to which the initial hot news belongs, adding a text keyword vector of the initial hot news into a field set of the pre-estimated field, and changing a central vector of the pre-estimated field into a geometric central point of two similar vectors to gradually converge the central vector of the pre-estimated field;
clustering the initial hot news with similarity larger than a similarity threshold into intermediate hot news aiming at the initial hot news in the same belonging field;
and determining the intermediate hot news with the heat degree larger than the heat degree threshold value and the initial hot news which is not clustered and has the heat degree larger than the heat degree threshold value as target hot news.
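The field-confirmation step of claim 1 can be illustrated with a minimal sketch. It assumes that keyword vectors are plain lists of floats, that each field is a dict with hypothetical keys `center` and `members`, and that the "geometric central point of two similar vectors" means their component-wise midpoint; none of these representation choices is specified by the claim.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def confirm_field(keyword_vec, field, threshold=0.5):
    """Confirm the estimated field and, on success, move its center.

    field: dict with "center" (list of floats) and "members" (list of
    vectors); both key names and the default threshold are hypothetical.
    """
    if cosine_similarity(keyword_vec, field["center"]) <= threshold:
        return False
    # Add the news item's keyword vector to the field set.
    field["members"].append(list(keyword_vec))
    # Replace the center with the midpoint of the old center and the new
    # vector, so the center drifts toward the items assigned to the field.
    field["center"] = [(c + k) / 2 for c, k in zip(field["center"], keyword_vec)]
    return True
```

Each accepted item pulls the center halfway toward itself, which is one simple reading of the gradual convergence described in the claim.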
2. The method of claim 1, wherein the obtaining of the respective hot news of each news distribution source comprises:
and acquiring hot news of each news release source in an acquisition time period, wherein the hot news in the acquisition time period comprises the hot news at the initial moment of the acquisition time period and newly-added hot news after the news release source page is refreshed in the acquisition process.
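A minimal sketch of acquisition over a time period, as in claim 2, is given below. The `fetch_page` callable is a hypothetical stand-in for reading the hot news titles currently shown on a source page; the claim does not specify how a page is read or how often it is refreshed.

```python
def collect_hot_news(fetch_page, refreshes):
    """Collect hot news over an acquisition period.

    fetch_page: hypothetical callable returning the list of hot news
    titles currently shown by a news release source.
    refreshes: number of page refreshes after the initial read.
    """
    seen = set()
    collected = []
    # The first iteration reads the page at the initial moment; each later
    # iteration picks up items that appeared after a refresh.
    for _ in range(refreshes + 1):
        for title in fetch_page():
            if title not in seen:
                seen.add(title)
                collected.append(title)
    return collected
```

Items that remain on the page across refreshes are kept only once, so the result is the initial hot news followed by the newly added hot news in order of appearance.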
3. The method of claim 1, further comprising:
aiming at initial hot news in the same field, determining an entity and a keyword, wherein the entity is a participant object in the news;
and determining the similarity of each initial hot news in the same field according to the entity and the keyword.
4. The method of claim 3, wherein the determining the similarity of the initial hot news in the same domain according to the entity and the keyword comprises:
determining the occurrence times and positions of the same entity in each initial hot news in the same field;
determining initial hot news meeting initial clustering conditions in the same field according to the times and the positions;
and for each initial hot news in the initial hot news meeting the initial clustering conditions, performing similarity scoring on each initial hot news according to keywords.
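The two-stage similarity check of claims 3 and 4 might look like the following sketch. The concrete initial clustering condition (a minimum occurrence count and a maximum first position) and the use of Jaccard overlap as the keyword score are assumptions chosen for illustration; the claims only say that occurrence counts, positions, and keywords are used.

```python
def entity_stats(text, entity):
    """Return (number of occurrences, index of first occurrence or -1)."""
    return text.count(entity), text.find(entity)


def meets_initial_condition(text, entity, min_count=2, max_first_pos=50):
    """Hypothetical initial clustering condition: the entity occurs often
    enough and first appears early in the text (e.g. title or lead)."""
    count, first_pos = entity_stats(text, entity)
    return count >= min_count and 0 <= first_pos <= max_first_pos


def keyword_score(keywords_a, keywords_b):
    """Similarity score from keyword overlap (Jaccard index, an assumption)."""
    a, b = set(keywords_a), set(keywords_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Only items that pass `meets_initial_condition` for a shared entity would be scored against each other with `keyword_score`, which keeps the more expensive pairwise comparison limited to plausible candidates.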
5. The method of any of claims 1-2, further comprising:
determining the popularity of the intermediate hot news according to the authority level of the news release source to which the initial hot news corresponding to the intermediate hot news belongs, the browsed quantity and the concerned quantity of the corresponding initial hot news;
and determining the popularity of the non-clustered initial hot news according to the authority level, the browsed amount and the concerned amount of the news release source to which the non-clustered initial hot news belongs.
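One possible way to combine the three inputs named in claim 5 is a weighted sum, sketched below. The weights and the choice to sum the member scores for an intermediate hot news item are assumptions; the claim does not give a formula.

```python
def source_heat(authority_level, views, follows):
    """Heat of a single initial hot news item.

    The weights (10 per authority level, 1 per 1000 views, 1 per 100
    follows) are hypothetical and only illustrate a weighted combination.
    """
    return 10 * authority_level + views / 1000 + follows / 100


def intermediate_heat(members):
    """Heat of an intermediate (clustered) hot news item, taken here as the
    sum over its member items; each member is a tuple
    (authority_level, views, follows)."""
    return sum(source_heat(*member) for member in members)


def above_threshold(heats, heat_threshold):
    """Keep the titles whose heat exceeds the threshold.

    heats: dict mapping a news title to its heat value.
    """
    return [title for title, heat in heats.items() if heat > heat_threshold]
```

Summing over members means that a story reported by several sources tends to outrank a story reported by one, which matches the intent of clustering before ranking.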
6. The method of claims 1-2, further comprising:
receiving a hotspot request sent by a terminal;
determining the current target hot news according to the hot request;
and sending the current target hot news to the terminal.
7. The method of claim 6, wherein when sending the current target hot news to the terminal, the method further comprises:
and sending display style indication information of the current target hot news to the terminal, wherein the display style indication information is used for indicating the terminal to display the current target hot news according to the display style.
8. The method of claim 6, wherein when sending the current target hot news to the terminal, the method further comprises:
and sending the popularity ranking of each target hot news in the current target hot news to the terminal, wherein the popularity ranking is used for the terminal to distribute the target hot news to a preset display style for displaying.
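On the server side, the response to a hotspot request in claims 6 to 8 could be assembled as in this sketch. The response layout (a list of dicts with `title`, `heat`, and `rank` keys) is hypothetical; the claims only require that the current target hot news and its popularity ranking be sent to the terminal.

```python
def build_hotspot_response(target_news):
    """Build the payload returned for a hotspot request.

    target_news: dict mapping a target hot news title to its heat.
    Returns a list ordered from hottest to coolest, each entry carrying
    its popularity rank (1 = hottest). Field names are hypothetical.
    """
    ordered = sorted(target_news.items(), key=lambda item: item[1], reverse=True)
    return [
        {"title": title, "heat": heat, "rank": rank}
        for rank, (title, heat) in enumerate(ordered, start=1)
    ]
```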
9. A method for obtaining hot news is characterized by comprising the following steps:
responding to the selection operation of the news application;
sending a hotspot request message for the news application to a server;
receiving target hot news sent by the server, wherein the target hot news is obtained by acquiring initial hot news from each news release source in a plurality of news release sources by the server and processing the initial hot news, and the initial hot news is the hot news of each news release source;
displaying the target hot news;
the obtaining of the initial hot news from each news distribution source of the plurality of news distribution sources includes: determining a hot news layout corresponding to each news release source, wherein the hot news layout corresponding to different news release sources has different hot news arrangement areas, and different hot news acquisition templates are configured for different news release sources; acquiring respective hot news of each news release source from the corresponding position of the hot news layout corresponding to each news release source to serve as initial hot news;
the target hot news comprises intermediate hot news with the heat degree larger than a heat degree threshold value and non-clustered initial hot news with the heat degree larger than the heat degree threshold value, and the determination process of the intermediate hot news comprises the following steps: analyzing link information of a news release source from which the initial hot news comes, and determining an estimation field to which the initial hot news belongs; calculating cosine similarity according to the text keywords of the initial hot news and the center vector of the estimated field; if the cosine similarity is larger than a preset threshold value, determining that the pre-estimated field is the field to which the initial hot news belongs, wherein if the pre-estimated field is the field to which the initial hot news belongs, adding a text keyword vector of the initial hot news into a field set of the pre-estimated field, and changing a central vector of the pre-estimated field into a geometric central point of two similar vectors to gradually converge the central vector of the pre-estimated field; and clustering the initial hot news with similarity larger than a similarity threshold into intermediate hot news aiming at the initial hot news in the same belonging field.
10. The method of claim 9, wherein when receiving the target hot news sent by the server, the method further comprises:
receiving display style indication information of the target hot news sent by the server;
the displaying the target hot news comprises the following steps:
and displaying the target hot news according to the display style indication information.
11. The method of claim 9, wherein when receiving the target hot news sent by the server, the method further comprises:
receiving the popularity ranking of each target hot news in the target hot news sent by the server;
the displaying the target hot news comprises the following steps:
and distributing the target hot news into a preset display style according to the hot ranking for displaying.
12. A server, comprising:
the acquisition unit is used for determining the hot news layout corresponding to each news release source, wherein the hot news layout corresponding to different news release sources has different hot news arrangement areas, and different hot news acquisition templates are configured for different news release sources; acquiring respective hot news of each news release source from the corresponding position of the hot news layout corresponding to each news release source to serve as initial hot news;
the clustering unit is used for analyzing link information of a news release source from which the initial hot news comes and determining an estimation field to which the initial hot news belongs; calculating cosine similarity according to the text keywords of the initial hot news and the center vector of the estimated field; if the cosine similarity is larger than a preset threshold value, determining that the pre-estimated field is the field to which the initial hot news belongs, wherein if the pre-estimated field is the field to which the initial hot news belongs, adding a text keyword vector of the initial hot news into a field set of the pre-estimated field, and changing a central vector of the pre-estimated field into a geometric central point of two similar vectors to gradually converge the central vector of the pre-estimated field; clustering the initial hot news with similarity larger than a similarity threshold into intermediate hot news aiming at the initial hot news in the same belonging field;
and the determining unit is used for determining the intermediate hot news obtained by clustering the clustering unit with the heat degree greater than the heat degree threshold value and the initial hot news which is not clustered and has the heat degree greater than the heat degree threshold value as the target hot news.
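The final step of the clustering unit, grouping same-field news whose similarity exceeds a threshold, can be sketched with a simple greedy pass. Representing each news item by a set of keywords, comparing against the first member of each cluster, and using Jaccard overlap as the similarity are all assumptions for illustration, not the claimed method.

```python
def jaccard(a, b):
    """Overlap of two keyword sets, in [0, 1]."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def cluster_field_news(keyword_sets, threshold):
    """Greedily cluster the news items of one field.

    keyword_sets: list of keyword sets, one per initial hot news item.
    Returns (intermediate, unclustered): clusters with at least two
    members, and the items that joined no cluster.
    """
    clusters = []
    for keywords in keyword_sets:
        for cluster in clusters:
            # Compare with the cluster's first member as its representative.
            if jaccard(keywords, cluster[0]) > threshold:
                cluster.append(keywords)
                break
        else:
            clusters.append([keywords])
    intermediate = [c for c in clusters if len(c) > 1]
    unclustered = [c[0] for c in clusters if len(c) == 1]
    return intermediate, unclustered
```

The two return values correspond to the intermediate hot news and the non-clustered initial hot news that the determining unit then filters by the heat threshold.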
13. A terminal, comprising:
an input unit for responding to a selection operation for a news application;
the sending unit is used for sending a hot spot request message aiming at the news application to a server according to the selection operation input by the input unit;
the receiving unit is configured to receive target hot news sent by the server, where the target hot news is initial hot news acquired by the server from each of a plurality of news distribution sources, and is obtained by processing the initial hot news, and the initial hot news is hot news of each news distribution source;
the display unit is used for displaying the target hot news received by the receiving unit;
the obtaining of the initial hot news from each news distribution source of the plurality of news distribution sources includes: determining a hot news layout corresponding to each news release source, wherein the hot news layouts corresponding to different news release sources have different arrangement areas of the hot news, and different hot news acquisition templates are configured for the different news release sources; acquiring respective hot news of each news release source from the corresponding position of the hot news layout corresponding to each news release source to serve as initial hot news;
the target hot news comprises intermediate hot news with the popularity greater than the popularity threshold and non-clustered initial hot news with the popularity greater than the popularity threshold, and the determination process of the intermediate hot news comprises the following steps: analyzing link information of a news release source from which the initial hot news comes, and determining an estimation field to which the initial hot news belongs; calculating cosine similarity according to the text keywords of the initial hot news and the center vector of the estimated field; if the cosine similarity is larger than a preset threshold value, determining that the pre-estimated field is the field to which the initial hot news belongs, wherein if the pre-estimated field is the field to which the initial hot news belongs, adding a text keyword vector of the initial hot news into a field set of the pre-estimated field, and changing a central vector of the pre-estimated field into a geometric central point of two similar vectors to gradually converge the central vector of the pre-estimated field; and clustering the initial hot news with similarity larger than a similarity threshold into intermediate hot news aiming at the initial hot news in the same belonging field.
14. A server, characterized in that the server comprises: an input/output (I/O) interface, a processor, and a memory, the memory having stored therein instructions for obtaining hot news according to any one of claims 1-11;
the processor is configured to execute the instructions for obtaining hot news stored in the memory, and to perform the steps of the method for obtaining hot news according to any one of claims 1 to 11.
15. A terminal, characterized in that the terminal comprises: an input device, a transceiver, a display, a processor and a memory, wherein the memory stores the instruction for acquiring hot news according to any one of claims 9-11, and the processor is used for controlling the input device, the transceiver and the display to perform corresponding operations;
the input device is used for responding to the selection operation of any one of claims 9-11;
the transceiver is configured to perform the steps of message transmission and information reception as claimed in any of claims 9-11;
the display is used for showing the target hot news in any one of claims 9-11.
16. A computer-readable storage medium having stored therein instructions for obtaining news hotspots, which when run on a computer, cause the computer to perform the method of any of claims 1-8 above, or perform the method of any of claims 9-11 above.
CN201810552297.0A 2018-05-31 2018-05-31 Method, device and storage medium for acquiring news hotspots Active CN108897774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810552297.0A CN108897774B (en) 2018-05-31 2018-05-31 Method, device and storage medium for acquiring news hotspots

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810552297.0A CN108897774B (en) 2018-05-31 2018-05-31 Method, device and storage medium for acquiring news hotspots

Publications (2)

Publication Number Publication Date
CN108897774A CN108897774A (en) 2018-11-27
CN108897774B true CN108897774B (en) 2023-04-18

Family

ID=64343568

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810552297.0A Active CN108897774B (en) 2018-05-31 2018-05-31 Method, device and storage medium for acquiring news hotspots

Country Status (1)

Country Link
CN (1) CN108897774B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110471963B (en) * 2019-08-14 2022-04-05 北京市商汤科技开发有限公司 Data processing method, device and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066537A (en) * 2017-03-06 2017-08-18 广州神马移动信息科技有限公司 Hot news generation method, equipment, electronic equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3974511B2 (en) * 2002-12-19 2007-09-12 インターナショナル・ビジネス・マシーンズ・コーポレーション Computer system for generating data structure for information retrieval, method therefor, computer-executable program for generating data structure for information retrieval, computer-executable program for generating data structure for information retrieval Stored computer-readable storage medium, information retrieval system, and graphical user interface system
CN105045890A (en) * 2015-07-29 2015-11-11 百度在线网络技术(北京)有限公司 Method and device for determining hot news in target news source
CN105224699B (en) * 2015-11-17 2020-01-03 Tcl集团股份有限公司 News recommendation method and device
CN107784010B (en) * 2016-08-29 2021-12-17 南京尚网网络科技有限公司 Method and equipment for determining popularity information of news theme
CN108090157B (en) * 2017-12-12 2018-11-06 百度在线网络技术(北京)有限公司 A kind of hot news method for digging, device and server

Also Published As

Publication number Publication date
CN108897774A (en) 2018-11-27

Legal Events

Date Code Title Description
PB01 Publication
CB03 Change of inventor or designer information

Inventor after: Gao Rui

Inventor after: Li Hao

Inventor after: Wu Yizhu

Inventor before: Li Hao

Inventor before: Wu Yizhu

SE01 Entry into force of request for substantive examination
GR01 Patent grant