CN111382365B - Method and device for outputting information - Google Patents

Method and device for outputting information

Info

Publication number
CN111382365B
Authority
CN
China
Prior art keywords
event
event name
unit configured
similarity
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010196580.1A
Other languages
Chinese (zh)
Other versions
CN111382365A (en)
Inventor
潘禄
陈玉光
李法远
韩翠云
刘远圳
黄佳艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010196580.1A
Publication of CN111382365A
Application granted
Publication of CN111382365B
Legal status: Active
Anticipated expiration

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9538Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present disclosure disclose a method and a device for outputting information, and relate to the technical field of knowledge graphs. One embodiment of the method comprises: acquiring an event name of an event whose heat is to be queried; recalling a candidate document set by the event name; calculating a first similarity between the event name and each candidate document; adding candidate documents whose first similarity is greater than a predetermined first threshold to a set of related documents; and calculating a weighted sum of the features of the related documents as the heat of the event and outputting it. This embodiment can accurately reflect how much attention users pay to the event, helps the system understand user needs, and improves the user experience.

Description

Method and device for outputting information
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular, to a method and apparatus for outputting information.
Background
With the rapid popularization of the Internet, the volume of online information has grown explosively, and everyone must spend considerable effort screening it. When a user wants to know what has happened recently, or to follow a person or an organization, the important information has to be picked out of a large amount of unfiltered, unorganized news. Some progress has been made in event aggregation and extraction, and current search engines and recommendation systems can recommend information based on whether it relates to an event, which helps users screen out irrelevant information and obtain important information more efficiently. However, different events attract different degrees of user attention over their life cycles. To further improve the user experience when events are used in applications, events can be ranked by user attention by calculating an event heat, so that user needs can be perceived and understood in advance and the experience of acquiring information is further improved.
In the prior art, event heat is calculated mainly from features drawn from a single social networking site, which leads to large deviations in the calculated heat. The main reason is that the users of one social networking site cannot represent the interests of most users: besides following an event on a social networking site, users also follow it in other ways, such as actively searching for it or clicking recommended articles. The heat of an event among users is therefore difficult to reflect accurately using only one or two of search engines, social networking sites, recommendation systems, portal sites, and the like.
Disclosure of Invention
Embodiments of the present disclosure propose methods and apparatus for outputting information.
In a first aspect, embodiments of the present disclosure provide a method for outputting information, comprising: acquiring an event name of an event whose heat is to be queried; recalling a candidate document set by the event name; calculating a first similarity between the event name and each candidate document; adding candidate documents whose first similarity is greater than a predetermined first threshold to a set of related documents; and calculating a weighted sum of the features of the related documents as the heat of the event and outputting it.
In some embodiments, the method further comprises: recalling a first set of search terms by the event name; calculating a second similarity between the event name and each first search term; adding first search terms whose second similarity is greater than a predetermined second threshold to a set of exact search terms; and correcting the heat using a weighted sum of the features of the exact search terms.
In some embodiments, the method further comprises: extracting a core entity from the event name; recalling a second set of search terms through the core entity, and extracting the core entity of each second search term; calculating a third similarity between the core entity of the event name and the core entity of each second search term; adding second search terms whose third similarity is greater than a predetermined third threshold to a set of broad search terms; and correcting the heat using a weighted sum of the features of the broad search terms.
In some embodiments, the features of the related documents include at least one of: the number of related documents, the sum of the forward counts of the related documents, and the sum of the comment counts under the related documents.
In some embodiments, the features of the exact search terms include the number of historical impressions.
In a second aspect, embodiments of the present disclosure provide an apparatus for outputting information, comprising: an acquisition unit configured to acquire an event name of an event whose heat is to be queried; a first recall unit configured to recall a candidate document set by the event name; a first calculation unit configured to calculate a first similarity between the event name and each candidate document; a first filtering unit configured to add candidate documents whose first similarity is greater than a predetermined first threshold to a set of related documents; and an output unit configured to calculate a weighted sum of the features of the related documents as the heat of the event and output it.
In some embodiments, the apparatus further comprises: a second recall unit configured to recall a first set of search terms by the event name; a second calculation unit configured to calculate a second similarity between the event name and each first search term; a second filtering unit configured to add first search terms whose second similarity is greater than a predetermined second threshold to a set of exact search terms; and a first correction unit configured to correct the heat using a weighted sum of the features of the exact search terms.
In some embodiments, the apparatus further comprises: an extraction unit configured to extract a core entity from the event name; a third recall unit configured to recall a second set of search terms through the core entity and extract the core entity of each second search term; a third calculation unit configured to calculate a third similarity between the core entity of the event name and the core entity of each second search term; a third filtering unit configured to add second search terms whose third similarity is greater than a predetermined third threshold to a set of broad search terms; and a second correction unit configured to correct the heat using a weighted sum of the features of the broad search terms.
In some embodiments, the features of the related documents include at least one of: the number of related documents, the sum of the forward counts of the related documents, and the sum of the comment counts under the related documents.
In some embodiments, the features of the exact search terms include the number of historical impressions.
In a third aspect, embodiments of the present disclosure provide an electronic device for outputting information, comprising: one or more processors; a storage device having one or more programs stored thereon, which when executed by one or more processors, cause the one or more processors to implement the method as in any of the first aspects.
In a fourth aspect, embodiments of the present disclosure provide a computer readable medium having a computer program stored thereon, wherein the program when executed by a processor implements a method as in any of the first aspects.
To reduce the deviation in heat calculation, the method and device for outputting information introduce multi-source data and features. The multi-source features are associated with the event by a technique developed for associating data and features with events, and the final heat value is obtained by weighting these features. The value can be used to rank events, and can be provided to a search engine or recommendation system as a feature for ranking and operating on event-related information.
Drawings
Other features, objects and advantages of the present disclosure will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings:
FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present disclosure may be applied;
FIG. 2 is a flow chart of one embodiment of a method for outputting information according to the present disclosure;
FIG. 3 is a flow chart of yet another embodiment of a method for outputting information according to the present disclosure;
FIG. 4 is a schematic illustration of one application scenario of a method for outputting information according to the present disclosure;
FIG. 5 is a schematic structural diagram of one embodiment of an apparatus for outputting information according to the present disclosure;
FIG. 6 is a schematic diagram of a computer system suitable for use in implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the methods of the present disclosure for outputting information or apparatuses for outputting information may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as news-like applications, web browser applications, shopping-like applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices that have a display screen and support news browsing, including but not limited to smartphones, tablet computers, electronic book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. The present invention is not particularly limited herein.
The server 105 may be a server providing various services, such as a background analysis server providing a heat analysis service for news displayed on the terminal devices 101, 102, 103. The background analysis server may analyze received data such as an event heat analysis request, and feed the processing result (for example, the heat of the news) back to the terminal device.
The server may be hardware or software. When the server is hardware, the server may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules (e.g., a plurality of software or software modules for providing distributed services), or as a single software or software module. The present invention is not particularly limited herein.
It should be noted that, the method for outputting information provided by the embodiments of the present disclosure is generally performed by the server 105, and accordingly, the apparatus for outputting information is generally provided in the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to fig. 2, a flow 200 of one embodiment of a method for outputting information according to the present disclosure is shown. The method for outputting information comprises the following steps:
Step 201, acquiring an event name of an event whose heat is to be queried.
In this embodiment, an execution subject of the method for outputting information (e.g., the server shown in fig. 1) may acquire, from an event library, the event name of an event whose heat is to be queried. The event library groups multiple resources describing the same event into one cluster, and the resources in a cluster share one event name.
Some terms are described below:
Event: an event is a cluster. Event discovery obtains, through clustering, a cluster of resources describing a certain event; an event is thus equivalent to an event cluster.
Event name: a text fragment describing the core content of the event.
Event core entity: the core entities participating in the event, including people, institutions, places, and the like.
Exact event query (search term): a query whose main content expresses the event itself.
Broad event query: a query that does not describe the event itself but contains a key entity of the event; for example, for an event such as "XX divorce case", the broad query is "XX".
Step 202, recalling a candidate document set by the event name.
In this embodiment, candidate documents may be recalled by an existing recall tool, such as Elasticsearch. The candidate documents may be text segments extracted from microblogs, websites, and the like.
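As an illustration of recall by event name, the following sketch builds the request body of an Elasticsearch full-text `match` query. The field name `content` and the result size are assumptions made for the example; the source names only the tool, not an index layout or query form.

```python
def build_recall_query(event_name: str, field: str = "content", size: int = 100) -> dict:
    """Build an Elasticsearch `match` query body for the event name.

    `field` and `size` are hypothetical: the source does not describe
    how the candidate documents are indexed.
    """
    return {
        "size": size,  # maximum number of candidate documents to recall
        "query": {"match": {field: {"query": event_name}}},
    }
```

The returned dictionary would be sent as the search body to whichever index holds the text segments.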
Step 203, calculating a first similarity between the event name and each candidate document.
In this embodiment, the first similarity may be calculated by a common similarity calculation method, for example, cosine similarity. The event name and the text of each candidate document may first be segmented into words, and the similarity is then calculated on the resulting tokens.
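A minimal sketch of the segment-then-compare idea, assuming term-frequency vectors and a simple regular-expression tokenizer as a stand-in for a real word segmenter (the source does not specify either choice):

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Stand-in segmenter: lowercase word tokens. A production system
    would use a proper word segmenter, especially for Chinese text."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the term-frequency vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

Texts with identical tokens score close to 1, and texts sharing no tokens score 0.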
Step 204, adding candidate documents whose first similarity is greater than a predetermined first threshold to a set of related documents.
In this embodiment, the candidate documents most similar to the event name, i.e., those whose first similarity exceeds the first threshold, are selected, and the original source document of each is located; these source documents are the related documents. A related document may be an original document from a microblog, a website, or the like.
Step 205, calculating a weighted sum of the features of the related documents as the heat of the event and outputting it.
In this embodiment, the features of the related documents may include at least one of: the number of related documents, the sum of the forward counts of the related documents, and the sum of the comment counts under the related documents. Other features, such as click volume, may also be included.
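Steps 204 and 205 can be sketched as a threshold filter followed by a weighted sum over the three document features. The `Document` record, the feature names, and the weights below are hypothetical; the source gives no concrete values.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    similarity: float  # first similarity to the event name
    forwards: int      # number of times the document was forwarded
    comments: int      # number of comments under the document


def related_documents(candidates: list[Document], threshold: float) -> list[Document]:
    """Keep candidates whose first similarity exceeds the first threshold."""
    return [d for d in candidates if d.similarity > threshold]


def document_heat(related: list[Document], weights: dict[str, float]) -> float:
    """Weighted sum of the related-document features (weights are assumed)."""
    features = {
        "doc_count": len(related),
        "forwards": sum(d.forwards for d in related),
        "comments": sum(d.comments for d in related),
    }
    return sum(weights[name] * value for name, value in features.items())
```

For example, with a threshold of 0.5 only documents scoring above 0.5 contribute to the count, forward, and comment totals.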
With further reference to fig. 3, a flow 300 of yet another embodiment of a method for outputting information is shown. The flow 300 of the method for outputting information comprises the steps of:
Step 301, acquiring an event name of an event whose heat is to be queried.
Step 302, recalling a candidate document set by the event name.
Step 303, calculating a first similarity between the event name and each candidate document.
Step 304, adding candidate documents whose first similarity is greater than a predetermined first threshold to a set of related documents.
Steps 301-304 are substantially identical to steps 201-204 and are therefore not described in detail.
Step 305, recalling a first set of search terms by the event name.
In this embodiment, a search term (query) is a keyword or sentence entered by a user into a search engine. The first search terms may be recalled with an existing search tool, or recalled from a query library by a pre-trained neural network model that estimates the probability that an event name and a query describe the same event. For training, pairs of an event name and a query that describe the same event serve as positive samples, and pairs that describe different events serve as negative samples; the model is trained by a conventional neural network training method. At recall time, each query in the query library is fed into the model together with the event name, and the query is recalled if the resulting probability exceeds the normalized threshold.
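The model-based recall loop can be sketched as follows. The trained neural network is represented by a caller-supplied scoring function; `toy_same_event_prob` is only a token-overlap stand-in so the example runs, and is not the model described in the source.

```python
from typing import Callable, Iterable


def recall_queries(
    event_name: str,
    query_library: Iterable[str],
    same_event_prob: Callable[[str, str], float],
    threshold: float = 0.5,
) -> list[str]:
    """Recall every query whose same-event probability exceeds the threshold."""
    return [q for q in query_library if same_event_prob(event_name, q) > threshold]


def toy_same_event_prob(event_name: str, query: str) -> float:
    """Stand-in for the trained model: Jaccard overlap of word sets."""
    a, b = set(event_name.lower().split()), set(query.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

In a real system the scoring function would wrap the trained model's forward pass, and the threshold value would be tuned on held-out pairs.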
Step 306, calculating a second similarity between the event name and each first search term.
In this embodiment, the second similarity may be calculated by cosine similarity or the like.
Step 307, adding first search terms whose second similarity is greater than a predetermined second threshold to a set of exact search terms.
In this embodiment, a search term recalled directly by the event name is an exact search term, as distinguished from the broad search terms recalled later through the core entity. An exact event query (search term) is a query whose main content expresses the event itself. A broad event query does not describe the event itself but contains a key entity of the event; for example, for an event such as "XX divorce case", the broad query is "XX".
Step 308, extracting the core entity from the event name.
In this embodiment, the core entity may be extracted when the event library is created, or extracted when the broad search is performed. A core entity is a core participant in the event, such as a person, an institution, or a place; for example, for an event such as "XX divorce case", the core entity is "XX". Entities may be extracted by a model using NER (named entity recognition) techniques.
Step 309, recall the second set of search terms through the core entity and extract the core entity of each second search term.
In this embodiment, a core entity is extracted from each second search term. The extraction may be performed by a model-based labeling method.
Step 310, calculating a third similarity between the core entity of the event name and the core entity of each second search term.
In this embodiment, the third similarity may be calculated by a rule matching method.
Step 311, adding a second search term having a third similarity greater than a predetermined third threshold to the set of broad search terms.
In this embodiment, the second search word with higher similarity is reserved as the broad search word.
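Steps 310 and 311 can be sketched with a simple matching rule over already-extracted core entities. The rule used here (Jaccard overlap of normalized entity names) is an assumption for illustration; the source says only that rule matching may be used, and entity extraction itself is taken as given.

```python
def entity_similarity(event_entities: set[str], query_entities: set[str]) -> float:
    """Assumed rule: Jaccard overlap of case-normalized core entity names."""
    ev = {e.strip().lower() for e in event_entities}
    qu = {e.strip().lower() for e in query_entities}
    if not ev or not qu:
        return 0.0
    return len(ev & qu) / len(ev | qu)


def broad_search_terms(
    event_entities: set[str],
    queries_with_entities: dict[str, set[str]],
    threshold: float,
) -> list[str]:
    """Keep second search terms whose third similarity exceeds the threshold."""
    return [
        query
        for query, entities in queries_with_entities.items()
        if entity_similarity(event_entities, entities) > threshold
    ]
```

Under this rule a query whose only core entity equals the event's core entity scores 1.0, while a query that adds an unrelated entity scores lower.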
Step 312, calculating a weighted sum of the features of the related documents, the features of the exact search terms, and the features of the broad search terms as the heat of the event and outputting it.
In this embodiment, the features of both the exact search terms and the broad search terms include the number of historical impressions (pv). Features such as click volume may also be included. The features of the related documents, of the exact search terms, and of the broad search terms are each given different weights, and a weighted sum is then calculated. The weights sum to 1. The number of related documents and the historical impression count of exact event queries receive larger weights; the forward count and comment count of related documents and the historical impression count of broad event queries receive smaller weights.
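The combined weighting can be sketched as below. The specific weight values and feature names are illustrative assumptions; the source states only that the weights sum to 1 and which features are weighted more or less heavily.

```python
import math

# Illustrative weights only: larger for the related-document count and the
# exact-query impressions, smaller for the remaining features.
WEIGHTS = {
    "related_doc_count": 0.35,
    "exact_query_pv": 0.35,
    "forward_count": 0.10,
    "comment_count": 0.10,
    "broad_query_pv": 0.10,
}


def event_heat(features: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of multi-source features; missing features count as 0."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(w * features.get(name, 0.0) for name, w in weights.items())
```

The raw counts differ widely in scale, so some normalization would likely be applied before weighting; the source does not describe one, and none is applied here.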
With continued reference to fig. 4, fig. 4 is a schematic diagram of an application scenario of the method for outputting information according to the present embodiment. In the application scenario of fig. 4, the following steps are performed:
1. An event is obtained from the clustered event library; the event already carries information such as the event name, the extracted core entities of the event, and the documents of the event cluster.
2. Calculating the features of the multi-source documents: 1) recall a candidate document set by the event name; 2) calculate the similarity between the event name and each candidate document; 3) obtain the set of related documents of the event by applying a similarity threshold; 4) compute the features of these documents, including the number of related documents, the sum of the forward counts of the related documents, and the sum of the comments under the related documents.
3. Calculating the exact event query features: 1) recall candidate queries from the query library by the event name; 2) calculate the similarity between the event name and each query; 3) obtain the list of exact event queries by applying a similarity threshold; 4) count the total pv of the exact event queries.
4. Calculating the broad event query features: 1) recall candidate queries from the query library through the core entity of the event name; 2) calculate the similarity between the core entity of the event name and the core entity of each query; 3) obtain the list of broad event queries by applying a similarity threshold; 4) count the total pv of the broad event queries.
5. The calculated features are weighted and summed to obtain the event heat.
The invention has broad application value in search engines and recommendation systems. By ranking events, the method helps the system understand user needs, improves the user experience, and lets users learn of changes in the outside world as quickly as possible.
With further reference to fig. 5, as an implementation of the method shown in the foregoing figures, the present disclosure provides an embodiment of an apparatus for outputting information, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable to various electronic devices.
As shown in fig. 5, the apparatus 500 for outputting information of the present embodiment includes: an acquisition unit 501, a first recall unit 502, a first calculation unit 503, a first filtering unit 504, and an output unit 505. The acquisition unit 501 is configured to acquire an event name of an event whose heat is to be queried; the first recall unit 502 is configured to recall a candidate document set by the event name; the first calculation unit 503 is configured to calculate a first similarity between the event name and each candidate document; the first filtering unit 504 is configured to add candidate documents whose first similarity is greater than a predetermined first threshold to a set of related documents; and the output unit 505 is configured to calculate a weighted sum of the features of the related documents as the heat of the event and output it.
In the present embodiment, for the specific processing of the acquisition unit 501, the first recall unit 502, the first calculation unit 503, the first filtering unit 504, and the output unit 505 of the apparatus 500 for outputting information, reference may be made to steps 201, 202, 203, 204, and 205 in the corresponding embodiment of fig. 2.
In some optional implementations of this embodiment, the apparatus 500 further includes: a second recall unit configured to recall a first set of search terms by the event name; a second calculation unit configured to calculate a second similarity between the event name and each first search term; a second filtering unit configured to add first search terms whose second similarity is greater than a predetermined second threshold to a set of exact search terms; and a first correction unit configured to correct the heat using a weighted sum of the features of the exact search terms.
In some optional implementations of this embodiment, the apparatus 500 further includes: an extraction unit configured to extract a core entity from the event name; a third recall unit configured to recall a second set of search terms through the core entity and extract the core entity of each second search term; a third calculation unit configured to calculate a third similarity between the core entity of the event name and the core entity of each second search term; a third filtering unit configured to add second search terms whose third similarity is greater than a predetermined third threshold to a set of broad search terms; and a second correction unit configured to correct the heat using a weighted sum of the features of the broad search terms.
In some optional implementations of this embodiment, the features of the related documents include at least one of: the number of related documents, the sum of the forward counts of the related documents, and the sum of the comment counts under the related documents.
In some optional implementations of this embodiment, the features of the exact search terms include the number of historical impressions.
Referring now to fig. 6, a schematic diagram of an electronic device (e.g., server or terminal device of fig. 1) 600 suitable for use in implementing embodiments of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), car terminals (e.g., car navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The terminal device/server illustrated in fig. 6 is merely an example, and should not impose any limitation on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 6, the electronic device 600 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the electronic apparatus 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
In general, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage means 608 including, for example, magnetic tape, hard disk, and the like; and communication means 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 shows an electronic device 600 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer means may be implemented or provided instead. Each block shown in fig. 6 may represent one device or a plurality of devices as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via communication means 609, or from storage means 608, or from ROM 602. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing means 601. It should be noted that, the computer readable medium according to the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In an embodiment of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In embodiments of the present disclosure, by contrast, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device, or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire an event name of an event whose heat is to be queried; recall a candidate document set through the event name; calculate a first similarity between the event name and each candidate document; add candidate documents whose first similarity is greater than a predetermined first threshold to a related document set; and calculate and output a weighted sum of the features of the related documents as the heat of the event.
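The steps listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the `Document` record, the word-overlap recall, the Jaccard stand-in for the first similarity, the threshold, and the weights are all hypothetical choices made only so the example runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical document record; the field names are illustrative."""
    text: str
    forwards: int = 0
    comments: int = 0


def tokens(text: str) -> set[str]:
    """Lower-cased whitespace tokens of a text."""
    return set(text.lower().split())


def similarity(event_name: str, text: str) -> float:
    """Jaccard word overlap, used here as a stand-in for the first similarity."""
    a, b = tokens(event_name), tokens(text)
    return len(a & b) / len(a | b) if a | b else 0.0


def recall(event_name: str, corpus: list[Document]) -> list[Document]:
    """Recall candidates: documents sharing at least one word with the event name."""
    name = tokens(event_name)
    return [d for d in corpus if name & tokens(d.text)]


def event_heat(event_name: str, corpus: list[Document],
               threshold: float = 0.5,
               weights: tuple[float, float, float] = (1.0, 0.5, 0.2)) -> float:
    """Weighted sum of related-document features (weights are assumed values)."""
    candidates = recall(event_name, corpus)
    # Keep only candidates whose similarity exceeds the first threshold.
    related = [d for d in candidates if similarity(event_name, d.text) > threshold]
    w_count, w_forwards, w_comments = weights
    return (w_count * len(related)
            + w_forwards * sum(d.forwards for d in related)
            + w_comments * sum(d.comments for d in related))
```

Lowering `threshold` admits more loosely matching documents into the related document set and therefore raises the computed heat, so in practice the threshold and the weights would need to be tuned together.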
Computer program code for carrying out operations of embodiments of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments described in the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The described units may also be provided in a processor, for example, described as: a processor includes an acquisition unit, a first recall unit, a first calculation unit, a first filter unit, and an output unit. The names of these units do not constitute a limitation on the unit itself in some cases, and for example, the acquisition unit may also be described as "a unit that acquires an event name of an event of which the heat is to be queried".
The foregoing description covers only the preferred embodiments of the present disclosure and the technical principles employed. It will be appreciated by those skilled in the art that the scope of the invention referred to in this disclosure is not limited to the specific combination of features described above, but also encompasses other embodiments formed by any combination of the features described above or their equivalents without departing from the inventive concept, for example, embodiments formed by mutually substituting the features described above with technical features having similar functions disclosed in (but not limited to) the present disclosure.

Claims (10)

1. A method for outputting information, comprising:
obtaining event names of events with heat to be queried from an event library, wherein a plurality of events expressing the same event are grouped into a cluster in the event library, and share one event name, wherein the event name is a segment describing event core content;
recalling a candidate document set through the event name;
calculating a first similarity between the event name and each candidate document;
adding the original source documents of the candidate documents with the first similarity greater than a predetermined first threshold to a related document set;
calculating and outputting a weighted sum of characteristics of each related document as the heat of the event, wherein the characteristics of the related document comprise at least one of the following:
the number of related documents, the sum of the forwarding counts of the related documents, and the sum of the comment counts under the related documents.
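For illustration only, the event library and the original-source step of claim 1 might be sketched as below. The claim does not say how the shared event name is chosen or how an original source document is identified, so the sketch makes two loudly hypothetical assumptions: the shortest description in a cluster serves as its event name, and each candidate carries an optional `source_id` pointing at the document it was reposted from.

```python
from collections import defaultdict
from typing import Optional


def event_names(events: list[tuple[str, str]]) -> dict[str, str]:
    """Map each cluster id to one shared event name.

    `events` holds (cluster_id, description) pairs. Picking the shortest
    description as the name is an assumption made for this sketch.
    """
    clusters: dict[str, list[str]] = defaultdict(list)
    for cluster_id, description in events:
        clusters[cluster_id].append(description)
    return {cid: min(descs, key=len) for cid, descs in clusters.items()}


def original_sources(candidates: list[tuple[str, Optional[str]]],
                     scores: list[float],
                     threshold: float) -> list[str]:
    """Return original-source ids of candidates scoring above the threshold.

    `candidates` holds (doc_id, source_id) pairs; a `source_id` of None means
    the document is itself an original. Duplicates are dropped so that each
    original is counted once (also an assumption of this sketch).
    """
    related: list[str] = []
    seen: set[str] = set()
    for (doc_id, source_id), score in zip(candidates, scores):
        if score <= threshold:
            continue
        origin = source_id if source_id is not None else doc_id
        if origin not in seen:
            seen.add(origin)
            related.append(origin)
    return related
```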
2. The method of claim 1, wherein the method further comprises:
recalling a first search term set through the event name;
calculating a second similarity between the event name and each first search term;
adding a first search term having a second similarity greater than a predetermined second threshold to an exact search term set;
correcting the heat using a weighted sum of features of each exact search term.
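As a rough sketch of the exact-search-term correction in claim 2, the following assumes a hypothetical `query_log` mapping each search term to its historical impression count (the feature named in claim 4). The word-overlap recall, the Jaccard stand-in for the second similarity, and the additive form of the correction are assumptions; the claim itself only states that the heat is corrected using a weighted sum.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard word overlap, a stand-in for the second similarity."""
    x, y = set(a.lower().split()), set(b.lower().split())
    return len(x & y) / len(x | y) if x | y else 0.0


def exact_terms(event_name: str, query_log: dict[str, int],
                threshold: float = 0.6) -> list[str]:
    """Recall search terms sharing a word with the event name, then filter."""
    name_words = set(event_name.lower().split())
    recalled = [q for q in query_log if name_words & set(q.lower().split())]
    return [q for q in recalled if jaccard(event_name, q) > threshold]


def correct_heat(heat: float, event_name: str, query_log: dict[str, int],
                 weight: float = 0.001, threshold: float = 0.6) -> float:
    """Add the weighted impression total of the exact terms (assumed additive)."""
    terms = exact_terms(event_name, query_log, threshold)
    return heat + weight * sum(query_log[q] for q in terms)
```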
3. The method according to claim 1 or 2, wherein the method further comprises:
extracting a core entity from the event name;
recalling a second search term set through the core entity, and extracting a core entity of each second search term;
calculating a third similarity between the core entity of the event name and the core entity of each second search term;
adding a second search term having a third similarity greater than a predetermined third threshold to a broad search term set;
correcting the heat using a weighted sum of features of each broad search term.
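The broad-search-term path of claim 3 could be sketched along the following lines. Everything concrete here is hypothetical: the `KNOWN_ENTITIES` gazetteer stands in for a real core-entity extractor, the character-level ratio from Python's `difflib.SequenceMatcher` stands in for the third similarity, and impression counts are assumed as the search-term feature because the claim does not name one.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical gazetteer standing in for a real core-entity extractor.
KNOWN_ENTITIES = ("city marathon", "marathon", "stock market")


def core_entity(text: str) -> Optional[str]:
    """Return the longest known entity contained in the text, if any."""
    lowered = text.lower()
    matches = [e for e in KNOWN_ENTITIES if e in lowered]
    return max(matches, key=len) if matches else None


def broad_terms(event_name: str, query_log: dict[str, int],
                threshold: float = 0.8) -> list[str]:
    """Recall search terms via the core entity, then filter by entity similarity."""
    entity = core_entity(event_name)
    if entity is None:
        return []
    entity_words = set(entity.split())
    kept = []
    for query in query_log:
        # Recall step: the query shares at least one word with the core entity.
        if not entity_words & set(query.lower().split()):
            continue
        query_entity = core_entity(query)
        if query_entity is None:
            continue
        # Third similarity: character-level ratio between the two entities.
        if SequenceMatcher(None, entity, query_entity).ratio() > threshold:
            kept.append(query)
    return kept


def correct_heat_broad(heat: float, event_name: str, query_log: dict[str, int],
                       weight: float = 0.0005, threshold: float = 0.8) -> float:
    """Add the weighted impression total of the broad terms (assumed additive)."""
    terms = broad_terms(event_name, query_log, threshold)
    return heat + weight * sum(query_log[q] for q in terms)
```

Comparing core entities rather than full strings lets a query with different wording still contribute, provided it is about the same entity as the event name.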
4. The method of claim 2, wherein the characteristic of the exact search term comprises a historical number of impressions.
5. An apparatus for outputting information, comprising:
the acquisition unit is configured to acquire event names of events of which the heat is to be queried from an event library, wherein a plurality of events expressing the same event are grouped into a cluster in the event library, and share one event name, and the event name is a fragment describing the core content of the event;
a first recall unit configured to recall a set of candidate documents by the event name;
a first calculation unit configured to calculate a first similarity between the event name and each candidate document;
a first filtering unit configured to add the original source documents of the candidate documents having a first similarity greater than a predetermined first threshold to the set of related documents;
an output unit configured to calculate and output a weighted sum of features of each related document as the heat of the event, wherein the features of the related document include at least one of:
the number of related documents, the sum of the forwarding counts of the related documents, and the sum of the comment counts under the related documents.
6. The apparatus of claim 5, wherein the apparatus further comprises:
a second recall unit configured to recall a first search term set through the event name;
a second calculation unit configured to calculate a second similarity between the event name and each first search term;
a second filtering unit configured to add a first search term having a second similarity greater than a predetermined second threshold to an exact search term set;
a first correction unit configured to correct the heat using a weighted sum of features of each exact search term.
7. The apparatus of claim 5 or 6, wherein the apparatus further comprises:
an extraction unit configured to extract a core entity from the event name;
a third recall unit configured to recall a second search term set through the core entity and extract a core entity of each second search term;
a third calculation unit configured to calculate a third similarity between the core entity of the event name and the core entity of each second search term;
a third filtering unit configured to add a second search term having a third similarity greater than a predetermined third threshold to a broad search term set;
and a second correction unit configured to correct the heat using a weighted sum of features of each broad search term.
8. The apparatus of claim 6, wherein the characteristic of the exact search term comprises a historical number of impressions.
9. An electronic device for outputting information, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-4.
10. A computer readable medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-4.
CN202010196580.1A 2020-03-19 2020-03-19 Method and device for outputting information Active CN111382365B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010196580.1A CN111382365B (en) 2020-03-19 2020-03-19 Method and device for outputting information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010196580.1A CN111382365B (en) 2020-03-19 2020-03-19 Method and device for outputting information

Publications (2)

Publication Number Publication Date
CN111382365A CN111382365A (en) 2020-07-07
CN111382365B true CN111382365B (en) 2023-07-28

Family

ID=71217314

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010196580.1A Active CN111382365B (en) 2020-03-19 2020-03-19 Method and device for outputting information

Country Status (1)

Country Link
CN (1) CN111382365B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112699314A (en) * 2020-12-25 2021-04-23 百度在线网络技术(北京)有限公司 Hot event determination method and device, electronic equipment and storage medium
CN113722593B (en) * 2021-08-31 2024-01-16 北京百度网讯科技有限公司 Event data processing method, device, electronic equipment and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102937960A (en) * 2012-09-06 2013-02-20 北京邮电大学 Device and method for identifying and evaluating emergency hot topic
CN104636487A (en) * 2015-02-26 2015-05-20 湖北光谷天下传媒股份有限公司 Advertising information management method
GB201808875D0 (en) * 2018-05-31 2018-07-18 Uxlabs Ltd Method, apparatus and computer program for information retrieval using query expansion

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104933129B (en) * 2015-06-12 2019-04-30 百度在线网络技术(北京)有限公司 Event train of thought acquisition methods and system based on microblogging
US11663254B2 (en) * 2016-01-29 2023-05-30 Thomson Reuters Enterprise Centre Gmbh System and engine for seeded clustering of news events
CN108572990B (en) * 2017-03-14 2021-05-25 上海优扬新媒信息技术有限公司 Information pushing method and device
CN107491518B (en) * 2017-08-15 2020-08-04 北京百度网讯科技有限公司 Search recall method and device, server and storage medium
CN107491547B (en) * 2017-08-28 2020-11-10 北京百度网讯科技有限公司 Search method and device based on artificial intelligence
CN107885873B (en) * 2017-11-28 2021-08-24 百度在线网络技术(北京)有限公司 Method and apparatus for outputting information

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Deep news event ranker based on user relevant query;Xiangfei Kong等;《2018 IEEE 3rd International Conference on Cloud Computing and Big Data Analysis》;363-367 *
基于信息资源开放整合的图书检索结果排序优化;童旺宇等;《图书馆学研究》(第23期);65-71 *
面向微信数据的事件发现及其热度计算方法研究;郝晓波;《万方学位论文库》;1-88 *

Also Published As

Publication number Publication date
CN111382365A (en) 2020-07-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant