CN111382365A - Method and apparatus for outputting information - Google Patents


Info

Publication number
CN111382365A
Authority
CN
China
Prior art keywords
event
similarity
event name
search
unit configured
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010196580.1A
Other languages
Chinese (zh)
Other versions
CN111382365B (en)
Inventor
潘禄
陈玉光
李法远
韩翠云
刘远圳
黄佳艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010196580.1A
Publication of CN111382365A
Application granted
Publication of CN111382365B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Abstract

Embodiments of the present disclosure provide a method and an apparatus for outputting information, and relate to the technical field of knowledge graphs. One embodiment of the method comprises: acquiring the event name of an event whose popularity (heat) is to be queried; recalling a candidate document set using the event name; calculating a first similarity between the event name and each candidate document; adding the candidate documents whose first similarity is greater than a predetermined first threshold to a related document set; and calculating a weighted sum of the features of the related documents as the popularity of the event, and outputting the popularity. This embodiment can accurately reflect how much attention users pay to the event, help the system understand user needs, and improve the user experience.

Description

Method and apparatus for outputting information
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular to a method and an apparatus for outputting information.
Background
With the rapid spread of the internet, the volume of online information has grown explosively, and everyone must spend considerable effort filtering it. When a user wants to know what has happened recently, or wants to follow a person or an organization, the important information has to be picked out of a large volume of unfiltered, unsorted news. Some progress has been made in event aggregation and extraction: existing search engines and recommendation systems can recommend information according to whether it concerns an event, which helps users filter out irrelevant information and obtain important information more efficiently. However, the degree of user attention over an event's life cycle differs from event to event. Calculating event popularity (heat) and ranking current events by user attention allows user needs to be perceived and understood in advance, which further improves the user's experience of obtaining information.
Existing technical solutions calculate event popularity using information from a single social networking site as the main feature source, which introduces a large deviation into the calculation. The main reason is that the users of one social networking site cannot represent the points of interest of most users: users follow events not only on social networking sites but also in other ways, such as actively clicking recommended articles. Consequently, among search engines, social networking sites, recommendation systems, portal sites, and the like, no one or two sources alone can accurately reflect how much attention users pay to an event.
Disclosure of Invention
Embodiments of the present disclosure propose methods and apparatuses for outputting information.
In a first aspect, an embodiment of the present disclosure provides a method for outputting information, including: acquiring the event name of an event whose popularity is to be queried; recalling a candidate document set using the event name; calculating a first similarity between the event name and each candidate document; adding the candidate documents whose first similarity is greater than a predetermined first threshold to a related document set; and calculating a weighted sum of the features of the related documents as the popularity of the event, and outputting the popularity.
In some embodiments, the method further comprises: recalling a first search term set using the event name; calculating a second similarity between the event name and each first search term; adding the first search terms whose second similarity is greater than a predetermined second threshold to a precise search term set; and correcting the popularity using a weighted sum of the features of the precise search terms.
In some embodiments, the method further comprises: extracting a core entity from the event name; recalling a second search term set using the core entity, and extracting the core entity of each second search term; calculating a third similarity between the core entity of the event name and the core entity of each second search term; adding the second search terms whose third similarity is greater than a predetermined third threshold to a generic search term set; and correcting the popularity using a weighted sum of the features of the generic search terms.
In some embodiments, the features of the related documents include at least one of: the number of related documents, the total number of forwards of the related documents, and the total number of comments on the related documents.
In some embodiments, the features of the precise search terms include the number of historical impressions.
In a second aspect, an embodiment of the present disclosure provides an apparatus for outputting information, including: an acquisition unit configured to acquire the event name of an event whose popularity is to be queried; a first recall unit configured to recall a candidate document set using the event name; a first calculation unit configured to calculate a first similarity between the event name and each candidate document; a first filtering unit configured to add the candidate documents whose first similarity is greater than a predetermined first threshold to a related document set; and an output unit configured to calculate a weighted sum of the features of the related documents as the popularity of the event, and to output the popularity.
In some embodiments, the apparatus further comprises: a second recall unit configured to recall a first search term set using the event name; a second calculation unit configured to calculate a second similarity between the event name and each first search term; a second filtering unit configured to add the first search terms whose second similarity is greater than a predetermined second threshold to a precise search term set; and a first correcting unit configured to correct the popularity using a weighted sum of the features of the precise search terms.
In some embodiments, the apparatus further comprises: an extraction unit configured to extract a core entity from the event name; a third recall unit configured to recall a second search term set using the core entity and to extract the core entity of each second search term; a third calculation unit configured to calculate a third similarity between the core entity of the event name and the core entity of each second search term; a third filtering unit configured to add the second search terms whose third similarity is greater than a predetermined third threshold to a generic search term set; and a second correcting unit configured to correct the popularity using a weighted sum of the features of the generic search terms.
In some embodiments, the features of the related documents include at least one of: the number of related documents, the total number of forwards of the related documents, and the total number of comments on the related documents.
In some embodiments, the features of the precise search terms include the number of historical impressions.
In a third aspect, an embodiment of the present disclosure provides an electronic device for outputting information, including: one or more processors; and a storage device storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer readable medium storing a computer program which, when executed by a processor, implements the method of any embodiment of the first aspect.
To reduce the deviation in the popularity calculation, the method and apparatus for outputting information provided by the embodiments of the present disclosure introduce multi-source data and features, associate those multi-source features with events, and then obtain the final popularity value by weighting each feature. The value can be used to rank events and can be provided to a search engine or recommendation system for ranking and operating on event-related information.
Drawings
Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;
FIG. 2 is a flow diagram for one embodiment of a method for outputting information, according to the present disclosure;
FIG. 3 is a flow diagram of yet another embodiment of a method for outputting information according to the present disclosure;
FIG. 4 is a schematic diagram of one application scenario of a method for outputting information in accordance with the present disclosure;
FIG. 5 is a structural schematic diagram of one embodiment of an apparatus for outputting information according to the present disclosure;
FIG. 6 is a schematic block diagram of a computer system suitable for use with an electronic device implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein merely illustrate the invention and do not limit it. It should also be noted that, for convenience of description, the drawings show only the parts relevant to the invention.
It should be noted that, in the present disclosure, the embodiments and the features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the disclosed method for outputting information or apparatus for outputting information may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 provides the medium for communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various types of connections, such as wired or wireless communication links, or fiber optic cables.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as news-like applications, web browser applications, shopping-like applications, search-like applications, instant messaging tools, mailbox clients, social platform software, and the like.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices that have a display screen and support news browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, and desktop computers. When they are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module. No particular limitation is imposed here.
The server 105 may be a server that provides various services, such as a background analysis server that provides a popularity analysis service for news displayed on the terminal devices 101, 102, 103. The background analysis server may analyze and otherwise process received data, such as an event popularity analysis request, and feed the processing result (e.g., the popularity of the news) back to the terminal device.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module. No particular limitation is imposed here.
It should be noted that the method for outputting information provided by the embodiment of the present disclosure is generally performed by the server 105, and accordingly, the apparatus for outputting information is generally disposed in the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for outputting information in accordance with the present disclosure is shown. The method for outputting information comprises the following steps:
Step 201, acquiring the event name of an event whose popularity is to be queried.
In this embodiment, the execution subject of the method for outputting information (e.g., the server shown in fig. 1) may acquire, from an event library, the event name of an event whose popularity is to be queried. In the event library, multiple resources describing the same event are grouped into one cluster and share one event name.
Some terms are explained below:
Event: a cluster. Event discovery obtains, through clustering, a cluster of resources describing a certain event; in this text, "event", "event cluster", and "cluster" are used interchangeably.
Event name: a short text describing the core content of an event.
Event core entity: an entity involved in the core of the event, such as a person, an institution, or a place.
Precise event query (search term): a query whose main content describes the event itself.
Generic event query: a query that contains no information about the event itself but does contain a key entity of the event. For example, for the event "XX divorce review", the generic query is "XX".
Step 202, recalling a candidate document set using the event name.
In this embodiment, the candidate documents may be recalled with an existing retrieval tool, such as Elasticsearch. A candidate document may be a text segment extracted from a microblog, a website, or the like.
Step 203, calculating a first similarity between the event name and each candidate document.
In this embodiment, the first similarity may be calculated by a common similarity measure, for example cosine similarity. The event name and the text of each candidate document may be tokenized before the similarity is calculated.
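As an illustration of step 203, the following is a minimal sketch of cosine similarity over bag-of-words term counts. The source names cosine similarity only as one example and does not specify a tokenizer or a vector representation, so the regular-expression tokenizer and the function names `tokenize` and `cosine_similarity` are assumptions made for this sketch.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Hypothetical tokenizer: lowercase runs of word characters.
    # Chinese text would need a real word segmenter instead.
    return re.findall(r"\w+", text.lower())


def cosine_similarity(event_name: str, document_text: str) -> float:
    # Represent each text as a vector of term counts.
    a = Counter(tokenize(event_name))
    b = Counter(tokenize(document_text))
    # Dot product over the terms the two texts share.
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

Term counts are the simplest choice; a TF-IDF or embedding representation could be substituted without changing the thresholding in the next step.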
Step 204, adding the candidate documents whose first similarity is greater than a predetermined first threshold to the related document set.
In this embodiment, the candidate documents most similar to the event name are selected, and the original source document of each is located; these source documents are the related documents. A related document may be an original document from a microblog, a website, or the like.
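Step 204 can be sketched as a threshold filter that maps each retained text segment back to its source document. The pair layout `(source_id, score)`, the function name `select_related`, and the default threshold of 0.5 are all hypothetical; the source states only that a predetermined first threshold is used.

```python
def select_related(scored_segments, first_threshold=0.5):
    """Return the source documents of segments scoring above the threshold.

    scored_segments: iterable of (source_id, first_similarity) pairs, where
    source_id identifies the original document a segment was taken from.
    The 0.5 default is a placeholder, not a value from the source.
    """
    related = []
    for source_id, score in scored_segments:
        # Strictly greater than the threshold, and each source kept once.
        if score > first_threshold and source_id not in related:
            related.append(source_id)
    return related
```

Deduplicating by source is a design choice of this sketch, so that two segments from the same post do not count as two related documents.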
Step 205, calculating a weighted sum of the features of the related documents as the popularity of the event, and outputting it.
In this embodiment, the features of the related documents may include at least one of: the number of related documents, the total number of forwards of the related documents, and the total number of comments on the related documents. Other features, such as click volume, may also be included.
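The three document features and their weighted sum in step 205 can be sketched as follows. The `RelatedDocument` type, the feature names, and the weight values in `DOC_WEIGHTS` are assumptions for illustration; the source gives no concrete weights for this embodiment.

```python
from dataclasses import dataclass


@dataclass
class RelatedDocument:
    forwards: int  # number of times the document was forwarded
    comments: int  # number of comments under the document


def document_features(docs):
    # The three features named in the text, aggregated over all related documents.
    return {
        "doc_count": len(docs),
        "forward_sum": sum(d.forwards for d in docs),
        "comment_sum": sum(d.comments for d in docs),
    }


# Hypothetical weights, chosen only to make the example concrete.
DOC_WEIGHTS = {"doc_count": 0.6, "forward_sum": 0.25, "comment_sum": 0.15}


def event_heat(docs, weights=DOC_WEIGHTS):
    features = document_features(docs)
    return sum(weights[name] * value for name, value in features.items())
```

The raw features are on different scales (a document count versus forward totals), and the source does not say whether they are normalized before weighting; a real system would likely need some scaling step.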
With further reference to fig. 3, a flow 300 of yet another embodiment of the method for outputting information is shown. The flow 300 includes the following steps:
Step 301, acquiring the event name of an event whose popularity is to be queried.
Step 302, recalling a candidate document set using the event name.
Step 303, calculating a first similarity between the event name and each candidate document.
Step 304, adding the candidate documents whose first similarity is greater than a predetermined first threshold to the related document set.
Steps 301-304 are substantially the same as steps 201-204 and are not described again.
Step 305, recalling a first search term set using the event name.
In this embodiment, a search term (query) is a keyword or sentence that a user enters into a search engine. The first search terms may be recalled with an existing search tool, or recalled from a query library by a pre-trained neural network model. The model estimates the probability that an event name and a query describe the same event. During training, pairs of an event name and a query that describe the same event serve as positive samples, and pairs that describe different events serve as negative samples; the model is trained by a conventional neural network training method. Each query in the query library is then input to the model in turn, together with the event name, and a query is recalled if its probability is higher than a normalized threshold.
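The model-based recall in step 305 can be sketched as a loop over the query library that keeps every query whose same-event probability exceeds a threshold. The trained neural network is not available here, so `toy_scorer` (token-set overlap) is a stand-in and not the model the source describes; the function names and the 0.5 threshold are likewise hypothetical.

```python
def toy_scorer(event_name: str, query: str) -> float:
    # Stand-in for the pre-trained model: Jaccard overlap of token sets.
    # The real system would return a learned probability instead.
    a = set(event_name.lower().split())
    b = set(query.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recall_queries(event_name, query_library, same_event_probability, threshold=0.5):
    # Score every query in the library against the event name and keep
    # those whose score is higher than the threshold.
    return [
        query
        for query in query_library
        if same_event_probability(event_name, query) > threshold
    ]
```

Passing the scorer as a parameter keeps the recall loop independent of how the probability is produced, so the stand-in can be replaced by a trained model's prediction function.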
Step 306, calculating a second similarity between the event name and each first search term.
In this embodiment, the second similarity may likewise be calculated by cosine similarity or a similar measure.
Step 307, adding the first search terms whose second similarity is greater than a predetermined second threshold to the precise search term set.
In this embodiment, a search term recalled directly by the event name is a precise search term, as distinguished from the search terms recalled later through the core entity. As defined above, a precise event query mainly describes the event itself, whereas a generic event query contains no information about the event itself but does contain a key entity of the event; for "XX divorce review", the generic query is "XX".
Step 308, extracting the core entity from the event name.
In this embodiment, the core entity may be extracted when the event library is created, or when the generic search is performed. A core entity is an entity involved in the core of the event, such as a person, an institution, or a place. For example, for "XX divorce double trial", the core entity is "XX". Entities may be extracted by a model using named entity recognition (NER).
Step 309, recalling a second search term set using the core entity, and extracting the core entity of each second search term.
In this embodiment, the core entity of each second search term is extracted, for example by a model-based labeling method.
Step 310, calculating a third similarity between the core entity of the event name and the core entity of each second search term.
In this embodiment, the third similarity may be calculated by rule matching.
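The source does not say which matching rule is used for the third similarity. One possible rule, shown here purely as an assumption, is the overlap (Jaccard index) between the normalized core-entity sets of the event name and of the search term; the function name `entity_similarity` is hypothetical.

```python
def entity_similarity(event_entities, query_entities) -> float:
    # Normalize entity strings so that trivial differences in case or
    # surrounding whitespace do not prevent a match.
    a = {entity.strip().lower() for entity in event_entities}
    b = {entity.strip().lower() for entity in query_entities}
    if not a or not b:
        return 0.0  # nothing to match against
    # Assumed rule: shared entities divided by all distinct entities.
    return len(a & b) / len(a | b)
```

With a single core entity on each side this reduces to an exact-match test (1.0 or 0.0), which fits the "XX" example in the text.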
Step 311, adding the second search terms whose third similarity is greater than a predetermined third threshold to the generic search term set.
In this embodiment, the second search terms with higher similarity are retained as generic search terms.
Step 312, calculating a weighted sum of the features of the related documents, the features of the precise search terms, and the features of the generic search terms as the popularity of the event, and outputting it.
In this embodiment, the features of both the precise search terms and the generic search terms include the number of historical impressions (pv). Click volume and other features may also be included. The features of the related documents, of the precise search terms, and of the generic search terms are each given different weights, and the weighted sum is then calculated. All weights sum to 1. The number of related documents and the historical impressions of the precise event queries receive larger weights; the forward count and comment count of the related documents and the historical impressions of the generic event queries receive smaller weights.
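The combined weighting in step 312 can be sketched as below. The text fixes only two constraints: the weights sum to 1, and the related-document count and precise-query impressions are weighted more heavily than the other three features. The specific numbers in `WEIGHTS` and the feature names are hypothetical values chosen to satisfy those constraints.

```python
import math

# Hypothetical weights: larger for the document count and precise-query pv,
# smaller for forwards, comments, and generic-query pv. They sum to 1.
WEIGHTS = {
    "related_doc_count": 0.35,
    "precise_query_pv": 0.35,
    "forward_sum": 0.10,
    "comment_sum": 0.10,
    "generic_query_pv": 0.10,
}


def combined_heat(features, weights=WEIGHTS):
    # Enforce the constraint from the text that all weights sum to 1.
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    # A feature missing from the input contributes nothing.
    return sum(weight * features.get(name, 0.0) for name, weight in weights.items())
```

Because the weights sum to 1, the result is a weighted average of the features, which is only meaningful if the features have been brought to comparable scales; the source does not describe that step.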
With continued reference to fig. 4, fig. 4 is a schematic diagram of an application scenario of the method for outputting information according to this embodiment. In the application scenario of fig. 4, the following steps are performed:
1. Acquire an event from the clustered event library; the event includes information such as the event name, the extracted core entity of the event, and the documents of the event cluster.
2. Calculate the features of the multi-source documents: 1) recall the candidate document set using the event name; 2) calculate the similarity between the event name and each candidate document; 3) obtain the event's related document set by applying a threshold; 4) compute the document features, namely the number of related documents, the total number of forwards of the related documents, and the total number of comments on the related documents.
3. Calculate the precise event query features: 1) recall candidate queries from the query library using the event name; 2) calculate the similarity between the event name and each query; 3) obtain the list of precise event queries by applying a similarity threshold; 4) count the total pv of the precise event queries.
4. Calculate the generic event query features: 1) recall candidate queries from the query library using the core entity of the event name; 2) calculate the similarity between the core entity of the event name and the core entity of each query; 3) obtain the list of generic event queries by applying a similarity threshold; 4) count the total pv of the generic event queries.
5. Compute the weighted sum of the features calculated above to obtain the event popularity.
The method is widely applicable in search engines and recommendation systems: it ranks events, helps the system understand user needs, improves the user experience, and lets users learn of changes in the outside world as quickly as possible.
With further reference to fig. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for outputting information, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable in various electronic devices.
As shown in fig. 5, the apparatus 500 for outputting information of this embodiment includes: an acquisition unit 501, a first recall unit 502, a first calculation unit 503, a first filtering unit 504, and an output unit 505. The acquisition unit 501 is configured to acquire the event name of an event whose popularity is to be queried; the first recall unit 502 is configured to recall a candidate document set using the event name; the first calculation unit 503 is configured to calculate a first similarity between the event name and each candidate document; the first filtering unit 504 is configured to add the candidate documents whose first similarity is greater than a predetermined first threshold to a related document set; and the output unit 505 is configured to calculate a weighted sum of the features of the related documents as the popularity of the event, and to output the popularity.
In this embodiment, for the specific processing of the acquisition unit 501, the first recall unit 502, the first calculation unit 503, the first filtering unit 504, and the output unit 505 of the apparatus 500 for outputting information, reference may be made to step 201, step 202, step 203, step 204, and step 205 in the embodiment corresponding to fig. 2.
In some optional implementations of this embodiment, the apparatus 500 further includes: a second recall unit configured to recall a first search term set using the event name; a second calculation unit configured to calculate a second similarity between the event name and each first search term; a second filtering unit configured to add the first search terms whose second similarity is greater than a predetermined second threshold to a precise search term set; and a first correcting unit configured to correct the popularity using a weighted sum of the features of the precise search terms.
In some optional implementations of this embodiment, the apparatus 500 further includes: an extraction unit configured to extract a core entity from the event name; a third recall unit configured to recall a second search term set using the core entity and to extract the core entity of each second search term; a third calculation unit configured to calculate a third similarity between the core entity of the event name and the core entity of each second search term; a third filtering unit configured to add the second search terms whose third similarity is greater than a predetermined third threshold to a generic search term set; and a second correcting unit configured to correct the popularity using a weighted sum of the features of the generic search terms.
In some optional implementations of this embodiment, the features of the related documents include at least one of: the number of related documents, the total number of forwards of the related documents, and the total number of comments on the related documents.
In some optional implementations of this embodiment, the features of the precise search terms include the number of historical impressions.
Referring now to FIG. 6, a schematic diagram of an electronic device 600 (e.g., the server or a terminal device of FIG. 1) suitable for implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), and a vehicle terminal (e.g., a car navigation terminal), and fixed terminals such as a digital TV and a desktop computer. The terminal device/server shown in fig. 6 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in fig. 6, electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate wirelessly or by wire with other devices to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of embodiments of the present disclosure. It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In contrast, in embodiments of the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire an event name of an event whose heat (popularity) is to be queried; recall a candidate document set using the event name; calculate a first similarity between the event name and each candidate document; add candidate documents whose first similarity is greater than a predetermined first threshold to a related document set; and calculate a weighted sum of the features of the related documents as the heat of the event and output the heat.
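The five steps above can be illustrated with a minimal Python sketch. It is not the implementation of the present disclosure: the token-overlap recall, the Jaccard similarity, the threshold of 0.5, and the weights are all assumptions chosen only to make the example runnable, and the names `Document`, `recall_candidates`, `jaccard_similarity`, and `event_heat` are hypothetical. The three features (number of related documents, sum of forwards, sum of comments) follow the features listed in the claims.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Document:
    text: str
    forwards: int = 0   # number of times the document was forwarded
    comments: int = 0   # number of comments on the document


def jaccard_similarity(a: str, b: str) -> float:
    """Assumed stand-in for the first similarity: token-set overlap."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def recall_candidates(event_name: str, corpus: Sequence[Document]) -> List[Document]:
    """Assumed recall step: keep documents sharing at least one token with the event name."""
    query = set(event_name.lower().split())
    return [d for d in corpus if query & set(d.text.lower().split())]


def event_heat(event_name: str,
               corpus: Sequence[Document],
               first_threshold: float = 0.5,
               weights: Sequence[float] = (1.0, 0.1, 0.05)) -> float:
    """Recall, filter by the first similarity, then return the weighted feature sum."""
    candidates = recall_candidates(event_name, corpus)
    related = [d for d in candidates
               if jaccard_similarity(event_name, d.text) > first_threshold]
    w_count, w_forward, w_comment = weights  # illustrative weights, not from the source
    return (w_count * len(related)
            + w_forward * sum(d.forwards for d in related)
            + w_comment * sum(d.comments for d in related))
```

In this sketch a document that is recalled but only loosely matches the event name (for example, one that shares a single word with it) contributes nothing to the heat, which is the purpose of the first threshold.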
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor including an acquisition unit, a first recall unit, a first calculation unit, a first filtering unit, and an output unit. In some cases the names of these units do not limit the units themselves; for example, the acquisition unit may also be described as a "unit that acquires an event name of an event whose heat is to be queried".
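As a software illustration of this unit decomposition, the following Python sketch models each unit as a replaceable callable held by a single processor object. The class name `HeatProcessor`, its field names, and the toy lambdas in the usage lines are hypothetical and serve only to show how the five units could be composed; the disclosure does not prescribe this structure.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class HeatProcessor:
    acquire: Callable[[], str]                  # acquisition unit
    recall: Callable[[str], Sequence[str]]      # first recall unit
    similarity: Callable[[str, str], float]     # first calculation unit
    first_threshold: float                      # threshold used by the first filtering unit
    output: Callable[[Sequence[str]], float]    # output unit

    def run(self) -> float:
        name = self.acquire()
        candidates = self.recall(name)
        # First filtering unit: keep candidates above the first threshold.
        related = [doc for doc in candidates
                   if self.similarity(name, doc) > self.first_threshold]
        return self.output(related)


# Toy wiring with assumed stand-ins for every unit.
processor = HeatProcessor(
    acquire=lambda: "city marathon",
    recall=lambda name: ["city marathon route", "weather report"],
    similarity=lambda name, doc: 1.0 if name in doc else 0.0,
    first_threshold=0.5,
    output=lambda related: float(len(related)),
)
heat = processor.run()
```

Keeping each unit behind a plain callable means a unit can be renamed or swapped without changing the others, which matches the statement that the unit names do not limit the units themselves.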
The foregoing description is only illustrative of the preferred embodiments of the disclosure and of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention referred to in the present disclosure is not limited to embodiments with the specific combination of the above-mentioned features, but also covers other embodiments formed by any combination of the above-mentioned features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) features having similar functions disclosed in the present disclosure.

Claims (12)

1. A method for outputting information, comprising:
acquiring an event name of an event whose heat is to be queried;
recalling a set of candidate documents by the event name;
calculating a first similarity between the event name and each candidate document;
adding candidate documents whose first similarity is greater than a predetermined first threshold to a related document set;
and calculating a weighted sum of features of the related documents as the heat of the event and outputting the heat.
2. The method of claim 1, wherein the method further comprises:
recalling a first set of search terms by the event name;
calculating a second similarity between the event name and each first search term;
adding each first search term whose second similarity is greater than a predetermined second threshold to a precise search term set;
and correcting the heat using a weighted sum of features of the precise search terms.
3. The method according to claim 1 or 2, wherein the method further comprises:
extracting a core entity from the event name;
recalling a second search term set through the core entity, and extracting the core entity of each second search term;
calculating a third similarity between the core entity of the event name and the core entity of each second search term;
adding each second search term whose third similarity is greater than a predetermined third threshold to a generic search term set;
and correcting the heat using a weighted sum of features of the generic search terms.
4. The method of claim 1, wherein the features of the related documents include at least one of:
the number of related documents, the sum of the numbers of times the related documents are forwarded, and the sum of the numbers of comments on the related documents.
5. The method of claim 2, wherein the features of the precise search terms include historical impressions.
6. An apparatus for outputting information, comprising:
an acquisition unit configured to acquire an event name of an event whose heat is to be queried;
a first recall unit configured to recall a set of candidate documents by the event name;
a first calculation unit configured to calculate a first similarity between the event name and each candidate document;
a first filtering unit configured to add candidate documents whose first similarity is greater than a predetermined first threshold to a related document set;
and an output unit configured to calculate a weighted sum of features of the related documents as the heat of the event and output the heat.
7. The apparatus of claim 6, wherein the apparatus further comprises:
a second recall unit configured to recall the first search term set by the event name;
a second calculation unit configured to calculate a second similarity between the event name and each of the first search terms;
a second filtering unit configured to add each first search term whose second similarity is greater than a predetermined second threshold to a precise search term set;
and a first correcting unit configured to correct the heat using a weighted sum of features of the precise search terms.
8. The apparatus of claim 6 or 7, wherein the apparatus further comprises:
an extraction unit configured to extract a core entity from the event name;
a third recall unit configured to recall the second search term set through the core entity and extract the core entity of each second search term;
a third calculation unit configured to calculate a third similarity between the core entity of the event name and the core entity of each second search term;
a third filtering unit configured to add each second search term whose third similarity is greater than a predetermined third threshold to a generic search term set;
and a second correcting unit configured to correct the heat using a weighted sum of features of the generic search terms.
9. The apparatus of claim 6, wherein the features of the related documents include at least one of:
the number of related documents, the sum of the numbers of times the related documents are forwarded, and the sum of the numbers of comments on the related documents.
10. The apparatus of claim 7, wherein the features of the precise search terms include historical impressions.
11. An electronic device for outputting information, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.
12. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-5.
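
The heat correction recited in claims 2 and 3 can be illustrated with a short Python sketch. The claims state only that the heat is corrected using a weighted sum of search term features, so the additive form of the correction, the weights, the thresholds, and the use of historical impressions as the feature of generic search terms are assumptions made for illustration; the function names are hypothetical, and the similarities are taken as already computed.

```python
from typing import Mapping, Sequence

# Each search term is a mapping such as {"similarity": 0.9, "impressions": 1000.0};
# "similarity" stands for the second or third similarity, already computed.
Term = Mapping[str, float]


def filter_terms(terms: Sequence[Term], threshold: float) -> list:
    """Keep the search terms whose similarity exceeds the given threshold."""
    return [t for t in terms if t["similarity"] > threshold]


def correct_heat(heat: float, terms: Sequence[Term],
                 threshold: float, weight: float) -> float:
    """Assumed additive correction: add the weighted impressions of the kept terms."""
    kept = filter_terms(terms, threshold)
    return heat + weight * sum(t["impressions"] for t in kept)


base_heat = 4.5  # heat from the related documents (illustrative value)
precise_terms = [
    {"similarity": 0.9, "impressions": 1000.0},
    {"similarity": 0.3, "impressions": 5000.0},  # below the second threshold, dropped
]
generic_terms = [{"similarity": 0.7, "impressions": 2000.0}]

heat = correct_heat(base_heat, precise_terms, threshold=0.8, weight=0.001)
heat = correct_heat(heat, generic_terms, threshold=0.6, weight=0.0005)
```

Giving generic search terms a smaller weight than precise search terms is a design choice of this sketch, reflecting that a match on the core entity alone is weaker evidence of interest in the specific event.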
CN202010196580.1A 2020-03-19 2020-03-19 Method and device for outputting information Active CN111382365B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010196580.1A CN111382365B (en) 2020-03-19 2020-03-19 Method and device for outputting information

Publications (2)

Publication Number Publication Date
CN111382365A true CN111382365A (en) 2020-07-07
CN111382365B CN111382365B (en) 2023-07-28

Family

ID=71217314

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010196580.1A Active CN111382365B (en) 2020-03-19 2020-03-19 Method and device for outputting information

Country Status (1)

Country Link
CN (1) CN111382365B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102937960A (en) * 2012-09-06 2013-02-20 北京邮电大学 Device and method for identifying and evaluating emergency hot topic
CN104636487A (en) * 2015-02-26 2015-05-20 湖北光谷天下传媒股份有限公司 Advertising information management method
US20160364488A1 (en) * 2015-06-12 2016-12-15 Baidu Online Network Technology (Beijing) Co., Ltd Microblog-based event context acquiring method and system
US20170235820A1 (en) * 2016-01-29 2017-08-17 Jack G. Conrad System and engine for seeded clustering of news events
CN107491547A (en) * 2017-08-28 2017-12-19 北京百度网讯科技有限公司 Searching method and device based on artificial intelligence
CN107491518A (en) * 2017-08-15 2017-12-19 北京百度网讯科技有限公司 Method and apparatus, server, storage medium are recalled in one kind search
CN107885873A (en) * 2017-11-28 2018-04-06 百度在线网络技术(北京)有限公司 Method and apparatus for output information
GB201808875D0 (en) * 2018-05-31 2018-07-18 Uxlabs Ltd Method, apparatus and computer program for information retrieval using query expansion
CN108572990A (en) * 2017-03-14 2018-09-25 百度在线网络技术(北京)有限公司 Information-pushing method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XIANGFEI KONG等: "Deep news event ranker based on user relevant query", 《2018 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS》, pages 363 - 367 *
童旺宇等: "基于信息资源开放整合的图书检索结果排序优化", 《图书馆学研究》, no. 23, pages 65 - 71 *
郝晓波: "面向微信数据的事件发现及其热度计算方法研究", 《万方学位论文库》, pages 1 - 88 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722593A (en) * 2021-08-31 2021-11-30 北京百度网讯科技有限公司 Event data processing method and device, electronic equipment and medium
CN113722593B (en) * 2021-08-31 2024-01-16 北京百度网讯科技有限公司 Event data processing method, device, electronic equipment and medium

Also Published As

Publication number Publication date
CN111382365B (en) 2023-07-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant