CN111104583A - Live broadcast room recommendation method, storage medium, electronic device and system - Google Patents

Info

Publication number
CN111104583A
Authority
CN
China
Prior art keywords
live broadcast
broadcast room
user
sorting
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811178341.2A
Other languages
Chinese (zh)
Other versions
CN111104583B (en)
Inventor
何国宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan Xingyi Network Technology Co ltd
Original Assignee
Wuhan Douyu Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Douyu Network Technology Co Ltd filed Critical Wuhan Douyu Network Technology Co Ltd
Priority to CN201811178341.2A priority Critical patent/CN111104583B/en
Publication of CN111104583A publication Critical patent/CN111104583A/en
Application granted granted Critical
Publication of CN111104583B publication Critical patent/CN111104583B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a live broadcast room recommendation method, a storage medium, an electronic device and a system, and relates to the field of Internet live broadcast. The method comprises: acquiring text information of each live broadcast room; inputting the text information into a distributed search engine for word segmentation to obtain a plurality of phrases and the weight of each phrase; establishing an inverted index between each live broadcast room ID and the phrases corresponding to that ID, and storing the inverted index in a database; monitoring search content in real time, acquiring an input sentence, splitting the input sentence into search words, and storing the search words in a Kafka message queue; and using the real-time stream processing framework Storm to fetch the search words from the Kafka queue, look them up in the inverted index in the database, sort the matching live broadcast rooms by the weight of the search words in the inverted index, and recommend a preset number of the top-ranked live broadcast rooms. The invention can recommend live broadcast rooms that match a user's current interests according to the user's real-time search content.

Description

Live broadcast room recommendation method, storage medium, electronic device and system
Technical Field
The invention relates to the field of Internet live broadcast, in particular to a live broadcast room recommendation method, a storage medium, electronic equipment and a system.
Background
As a mode of broadcasting in which a program is produced and broadcast at the same time on site, live broadcasting has attracted more and more attention over time. A live broadcast platform aggregates many live broadcast rooms; besides a few extremely popular rooms, it hosts many rooms run by new anchors or whose audiences still need to grow. To raise the viewer counts of these newer rooms and to help users find rooms they like, the platform generally makes recommendations to users.
In live broadcast recommendation, however, the common practice is to infer the rooms a user likes from the user's offline behavior data (historical behavior data cached on the platform) and then recommend those rooms to the user. Offline behavior data generally covers the user's past behaviors, such as following, watching, sending gifts, and sending bullet comments; that is, statistics are gathered over all of the user's earlier behavior. This common scheme can make effective use of big data technology to compute each user's historical interests and deliver recommendations personalized to each individual user.
However, this kind of recommendation has significant drawbacks. First, the recommended live broadcast rooms are calculated from the user's historical behavior (behavior before today) and do not reflect the user's real-time interests. For example, after a user follows a new anchor or develops a new idea or interest, the user actually needs recommendations related to that new anchor, idea, or interest, yet the platform's recommendation system still recommends based on the user's historical interests, so the user finds it hard to locate a currently preferred live broadcast room on the platform.
Second, because live broadcast rooms are real-time, many users watch according to what is currently popular, for example moving from one popular game to another. Historical behavior data contains no information about what is popular now, so it is difficult to recommend suitable popular live broadcast rooms, and users feel that the platform cannot provide the most popular live content. Third, the person using a logged-in account may not be the account owner: a potential user may be trying out the platform on that account, and because the platform recommends according to the account's historical information, the potential user concludes that the platform only recommends a single area of interest, and the platform loses that potential user. Finally, a live broadcast platform has a very large number of users, and storing the historical data of each user occupies a large amount of storage space.
Therefore, a method for recommending a live broadcast room is needed to overcome the above drawbacks.
Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a live broadcast room recommendation method, a storage medium, an electronic device and a system that can recommend live broadcast rooms matching a user's current interests according to the user's real-time search content.
In order to achieve the above object, in a first aspect, an embodiment of the present invention provides a live broadcast room recommendation method for recommending to a user, in real time, a live broadcast room in which the user is currently interested, the method including:
acquiring text information of each live broadcast room;
inputting the text information into a distributed search engine for word segmentation to obtain a plurality of phrases and the weight of each phrase;
establishing an inverted index between each live broadcast room ID and the phrases corresponding to that ID, and storing the inverted index in a database;
monitoring search content in real time, acquiring an input sentence, splitting the input sentence into search words, and storing the search words in a Kafka message queue;
and using the real-time stream processing framework Storm to fetch the search words from the Kafka queue, looking them up in the inverted index in the database, sorting the live broadcast rooms by the weight of the search words in the inverted index, and recommending a preset number of the top-ranked live broadcast rooms.
As a preferred embodiment, inputting the text information into a distributed search engine for word segmentation to obtain a plurality of phrases and the weight of each phrase includes:
splitting the text information into a plurality of lexemes and phrases formed by combining lexemes;
and scoring the lexemes and the combined phrases through the distributed search engine to obtain the weight of each lexeme and each combined phrase, wherein the weight is the relevance score of the lexeme or phrase with respect to the text information of the corresponding live broadcast room.
As a preferred embodiment, monitoring search content in real time and acquiring an input sentence specifically includes: storing each input sentence in order of occurrence, and acquiring the most recent several input sentences.
As a preferred embodiment, the search words are stored in a sorted set structure of the Redis database, and the double-type score of the sorted set is used to store the search timestamp.
As a preferred embodiment, the text information includes a live broadcast room title, a live broadcast room ID, an anchor ID, and an anchor nickname.
As a preferred embodiment, the text information is split using the ElasticSearch word segmentation tool, and the inverted index is stored in an ElasticSearch cluster.
As a preferred embodiment, several live broadcast rooms are additionally selected according to the user's historical behavior characteristics, and these historical-behavior recommendations are sent to the client together with the preset number of top-ranked live broadcast rooms.
In a second aspect, an embodiment of the present invention further provides a live broadcast room recommendation system, which includes:
a splitting module, used for acquiring the text information of all live broadcast rooms, splitting the text information from complex to simple according to its composition, and establishing inverted indexes level by level according to complexity and storing them in a database;
a retrieval module, used for monitoring the user's search behavior in real time, acquiring the user's input sentence, splitting the input sentence into search words, and storing the search words in a Kafka message queue;
and a recommendation module, used for fetching the search words from the Kafka queue with the real-time stream processing framework Storm, looking them up in the inverted index in the database, sorting the retrieved live broadcast rooms by the number of matching search words from most to least, and recommending a preset number of the top-ranked live broadcast rooms.
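The recommendation module above ranks rooms by how many search words hit them, which differs slightly from the weight-based ranking of the first aspect. A minimal Python sketch of that counting rule follows; the function name `rank_by_match_count`, the dictionary layout of the index, and the tie-break by room ID are illustrative assumptions, not details given in the source.

```python
from collections import Counter

def rank_by_match_count(index, search_words, top_n):
    """Rank rooms by how many distinct search words hit them, most first."""
    hits = Counter()
    for word in set(search_words):
        for room_id in index.get(word, ()):
            hits[room_id] += 1
    # Sort by descending hit count; ties fall back to room ID (an assumption).
    ranked = sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
    return [room_id for room_id, _ in ranked[:top_n]]

# Hypothetical index: phrase -> set of live broadcast room IDs.
index = {
    "northeast": {"96291", "96200"},
    "artist": {"96291"},
}
print(rank_by_match_count(index, ["northeast", "artist"], top_n=2))
```

Room 96291 is hit by both words and room 96200 by one, so 96291 is listed first.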
In a third aspect, an embodiment of the present invention further provides a storage medium, where a computer program is stored on the storage medium, and when being executed by a processor, the computer program implements the method in the embodiment of the first aspect.
In a fourth aspect, an embodiment of the present invention further provides an electronic device, which includes a memory and a processor, where the memory stores a computer program running on the processor, and the processor executes the computer program to implement the method in the first aspect.
Compared with the prior art, the invention has the advantages that:
(1) The live broadcast room recommendation method, storage medium, electronic device and system of the invention retrieve using the words from the user's most recent several searches and do not consider data left by the user's historical behavior, so recommendations follow the user's real-time interests and the recommended live broadcast rooms better fit the user's current needs. Meanwhile, the text information of each live broadcast room is segmented and stored as an inverted index, so live broadcast rooms can be found faster and more precisely when searching by search word. In addition, because the method targets the user's real-time search behavior, processing must be fast, stable, and free of crashes; by using the Kafka queue and the real-time stream processing framework Storm, the invention can better safeguard the user experience.
(2) During word segmentation, the text is segmented by lexeme, so the text information of each live broadcast room is split more thoroughly and live broadcast rooms that better meet the user's needs can be found when searching the inverted index.
(3) The sentences input by the user are stored one by one in order of occurrence, and the most recent input sentences are acquired, which ensures that sampling follows the user's most recently formed interests and preferences. The sentences are stored in a Redis database whose sorted set structure has a double-type score, which is well suited to storing the search timestamp, so the platform can cope with a large number of clients and recommend currently interesting live broadcast rooms to them in real time.
(4) The live broadcast room title, live broadcast room ID, anchor ID, anchor nickname and the like of each live broadcast room are split as text information, so the phrases searchable in the inverted index are sufficiently comprehensive, which raises the probability of recommending live broadcast rooms that fit the user.
(5) The ElasticSearch word segmentation tool is used to split the text information. The tool assigns weights directly while splitting, which makes the platform's subsequent recommendation more convenient. An ElasticSearch cluster is also set up to store the inverted index, which better meets the need to build and store the inverted index after segmentation, makes recommendation smoother, and improves the user experience.
(6) Besides live broadcast rooms inferred from the user's real-time retrieval behavior, live broadcast rooms recommended from other data, such as historical behavior data, are also sent to the client. A user's interests do not change completely and suddenly: interest in some other area may rise temporarily while the old interests remain, so recommending only the rooms inferred from real-time retrieval behavior could still leave the user dissatisfied. Reasonably combining the two kinds of recommended live broadcast rooms therefore improves the user experience further.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of steps of a live broadcast room recommendation method of the present invention;
FIG. 2 is a schematic structural diagram of a live broadcast room recommendation system according to the present invention.
In the figures: 1-splitting module, 2-retrieval module, 3-recommendation module.
Detailed Description
Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
Referring to FIG. 1, embodiments of the present invention provide a live broadcast room recommendation method, a storage medium, an electronic device, and a system. An inverted index is established for each live broadcast room from phrases split out of that room's information, and the sentences a user types into the platform's search are looked up in the inverted index in real time, so that live broadcast rooms can be recommended to the user in real time according to the user's current interests.
In order to achieve the technical effects, the general idea of the application is as follows:
The embodiment of the invention provides a live broadcast room recommendation method for recommending to a user, in real time, a live broadcast room in which the user is currently interested, the method comprising the following steps:
acquiring text information of each live broadcast room;
inputting the text information into a distributed search engine for word segmentation to obtain a plurality of phrases and the weight of each phrase;
establishing an inverted index between each live broadcast room ID and the phrases corresponding to that ID, and storing the inverted index in a database;
monitoring search content in real time, acquiring an input sentence, splitting the input sentence into search words, and storing the search words in a Kafka message queue;
and using the real-time stream processing framework Storm to fetch the search words from the Kafka queue, looking them up in the inverted index in the database, sorting the live broadcast rooms by the weight of the search words in the inverted index, and recommending a preset number of the top-ranked live broadcast rooms.
In summary, a platform needs to hold users' interest and attract them to watch live broadcasts. To that end, when recommending to a user, a platform generally records each user's historical behavior data and infers recommendations from it. Such recommendations, however, do not follow the user's current interests: when the user develops a new interest and searches for it, the platform can still only recommend live broadcast rooms derived from the user's historical behavior data, which greatly weakens the recommendation and fails to meet the user's needs.
The invention monitors the search content of the user in real time, acquires the input sentences for retrieval, and ensures that the current interest information of the user can be acquired in real time.
To analyze the user's interest information, the user's input sentences are acquired. Input sentences are a direct expression of the user's interest and are the information the live broadcast platform can obtain most directly.
Furthermore, in order to search sensibly on the user's input sentences, a searchable database is needed. It is therefore first necessary to build one from the live broadcast room information on the platform.
An inverted index is established between the phrases obtained by splitting each live broadcast room's text information and the live broadcast room ID. During retrieval, the search words split from the user's input sentence are looked up in this index, which gives relevant results efficiently and quickly. The inverted index arises from the practical need to find records by the value of an attribute. Each entry in such an index includes an attribute value and the addresses of the records that have that value. Because the position of a record is determined from the attribute value, rather than the attribute value being determined from the record, it is called an inverted index.
Furthermore, the invention stores the search words in a Kafka queue and uses the real-time stream processing framework Storm to fetch them from the queue, which meets the live broadcast platform's requirements for large data volumes, massive numbers of users, and real-time processing. Because of the platform's real-time requirement, nearly every action of the user must be recorded and analyzed as it happens, and the user's current interest must be analyzed and acted on in real time. This places stricter demands on the platform's performance and architecture: the system must transmit and analyze data in real time for a large number of users and a large amount of data.
Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all the activity stream data of a consumer-scale website. It offers high throughput and supports clustered partitioning, among other features, and can meet the requirements for storing and processing search words.
The real-time stream processing framework Storm is a distributed real-time computing system that can reliably process large amounts of streaming data. It performs the corresponding computation in real time, is stable and does not crash, and ensures that the live broadcast platform can still process requests from massive numbers of users without stalling during peak live broadcast periods.
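The hand-off between the search front end and the stream processor can be pictured with a small Python sketch. It does not use Kafka or Storm: a standard-library `queue.Queue` stands in for the Kafka topic and a worker thread stands in for the Storm consumer. The names `producer`, `consumer`, and `STOP`, and the whitespace splitting of the sentence, are assumptions made only for illustration.

```python
import queue
import threading

search_words = queue.Queue()   # stand-in for the Kafka message queue
processed = []                 # search words the consumer has handled
STOP = object()                # sentinel that tells the worker to exit

def producer(sentence):
    """Split an input sentence on whitespace and enqueue each search word."""
    for word in sentence.split():
        search_words.put(word)

def consumer():
    """Handle words as they arrive, like a stream-processing worker."""
    while True:
        word = search_words.get()
        if word is STOP:
            break
        processed.append(word)

worker = threading.Thread(target=consumer)
worker.start()
producer("northeast artist")
search_words.put(STOP)
worker.join()
print(processed)
```

The queue decouples the component that captures searches from the component that looks them up, which is the role the source assigns to the Kafka queue.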
In order to better understand the technical solution, the following detailed description is made with reference to specific embodiments.
Example one
The embodiment of the invention provides a live broadcast room recommendation method, which comprises the following steps:
s1: and acquiring text information of each live broadcast room.
The invention needs to search the live broadcast room correspondingly, needs to establish a corresponding database/index for use in searching. Therefore, to establish a database/index of each live broadcast room on the live broadcast platform corresponding to the text related information, text information of each live broadcast room needs to be established.
As a preferred embodiment, the text message includes but is not limited to: a studio title, a studio ID, an anchor nickname. The live broadcasting room title, the live broadcasting room ID, the anchor ID and the anchor nickname of the live broadcasting room can reflect the content of the live broadcasting room in a wide range, so that the attributes of the live broadcasting room can be objectively and reasonably reflected through the database/index established by the characters, and the subsequent retrieval and judgment are convenient.
S2: and inputting the character information into a distributed search engine for word segmentation to obtain a plurality of word groups and weights of the corresponding word groups.
After the text information in the live broadcast room is acquired, if the text information is only associated with the live broadcast room, the input sentence searched by the user usually cannot be completely matched with the text information, so that the text information needs to be processed to form a plurality of phrases. Thus, complete or incomplete matching of phrases can be performed at the time of retrieval.
Meanwhile, if the input sentences are matched with a plurality of live rooms, the platform needs to judge which one fits the interest of the user better, so that phrases need to be scored to obtain the weights of the phrases, and the phrases can be evaluated more easily under different weights, so that the fitted live rooms are recommended for the user.
It should be noted that the weight obtained in the splitting in step S2 is obtained by calculating the relevance of the word with respect to the corresponding text information according to a preset calculation formula during the splitting, and the preset calculation formula is not described in detail herein, but is a technical means commonly used by those skilled in the art during searching and retrieving.
As a preferred embodiment, inputting the text information into a distributed search engine for word segmentation to obtain a plurality of phrases and the weight of each phrase includes:
splitting the text information into a plurality of lexemes and phrases formed by combining lexemes;
and scoring the lexemes and the combined phrases through the distributed search engine to obtain the weight of each lexeme and each combined phrase, wherein the weight is the relevance score of the lexeme or phrase with respect to the text information of the corresponding live broadcast room.
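The idea of keeping both single lexemes and phrases built from them can be sketched in Python as follows. This only enumerates contiguous combinations of already-segmented lexemes; it is not the ElasticSearch tokenizer, the source does not say which combinations the engine keeps, and no weights are computed because the scoring formula is not given. The function name `lexemes_and_phrases` is hypothetical.

```python
def lexemes_and_phrases(lexemes):
    """Return every contiguous run of lexemes, joined with spaces."""
    phrases = []
    n = len(lexemes)
    for length in range(1, n + 1):           # run length: 1 lexeme, 2 lexemes, ...
        for start in range(0, n - length + 1):
            phrases.append(" ".join(lexemes[start:start + length]))
    return phrases

# Assumes the anchor nickname has already been segmented into two lexemes.
print(lexemes_and_phrases(["northeast", "large quail"]))
```

For these two lexemes the sketch yields the two lexemes themselves plus the combined phrase, which is the kind of mixed granularity the embodiment describes.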
Furthermore, the text information is split using the ElasticSearch word segmentation tool. ElasticSearch is a search server based on Lucene that provides a distributed, multi-user full-text search engine with a RESTful web interface. It was developed in Java and released as open source under the Apache license terms, and is a popular enterprise-level search engine. It is designed for use in cloud computing, supports real-time search, and is stable, reliable, fast, and convenient to install and use.
For example, the text information of the live broadcast room with ID 96291 includes the live broadcast room title "northeast large quail crosstalk artist" and the anchor nickname "northeast large quail". Inputting this text information into the ElasticSearch word segmentation tool yields the phrases and corresponding weights (weights in parentheses): 96291 (1.0), northeast large quail (0.95), crosstalk artist (0.5), northeast (0.4), large quail (0.3), crosstalk (0.2), artist (0.1). Similarly, for a live broadcast room with ID 96200 and the title "northeast crosstalk", inputting its text information into the ElasticSearch word segmentation tool yields the phrases and corresponding weights (weights in parentheses): 96200 (1.0), northeast crosstalk (0.95), northeast (0.6), crosstalk (0.4).
S3: and establishing an inverted index for the live broadcast room ID and the phrases corresponding to the live broadcast room ID, and storing the inverted index into a database.
The inverted index results from the need to look up records based on the values of attributes in practical applications. Each entry in such an index table includes an attribute value and the address of the record having the attribute value. Since the attribute value is not determined by the record but the position of the record is determined by the attribute value, it is called an inverted index. During retrieval, the corresponding live broadcast room can be directly retrieved according to the retrieval words obtained by splitting the input sentences, and the efficiency and the speed are higher.
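As a rough illustration of step S3, the Python sketch below turns per-room weighted phrases into a phrase-to-rooms mapping. It uses a plain dictionary rather than an ElasticSearch index; the function name `build_inverted_index` and the data layout are assumptions, and only a subset of the example phrases and weights is included.

```python
from collections import defaultdict

def build_inverted_index(rooms):
    """Map each phrase to a {room_id: weight} posting dictionary."""
    index = defaultdict(dict)
    for room_id, weighted_phrases in rooms.items():
        for phrase, weight in weighted_phrases.items():
            index[phrase][room_id] = weight
    return dict(index)

# Forward view: room ID -> {phrase: weight} (subset of the example values).
rooms = {
    "96291": {"96291": 1.0, "northeast large quail": 0.95,
              "northeast": 0.4, "large quail": 0.3, "artist": 0.1},
    "96200": {"96200": 1.0, "northeast": 0.6},
}
index = build_inverted_index(rooms)
print(index["northeast"])
```

Looking up a phrase now returns every room that contains it along with the weight, without scanning the rooms one by one.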
Further, the inverted index is stored in the ElasticSearch cluster.
Every field in ElasticSearch is indexed so that it can be searched. An index is divided into shards, and each shard can have zero or more replicas. Each data node in the cluster can host one or more shards and coordinates and handles the various operations, so load rebalancing and routing are done automatically in most cases when processing large volumes of data. Second, an ElasticSearch cluster can scale out to hundreds of servers and handle petabytes of structured or unstructured data. Finally, it supports a plug-in mechanism, including word segmentation plug-ins, synchronization plug-ins, Hadoop plug-ins, visualization plug-ins, and the like.
S4: Monitoring search content in real time, acquiring an input sentence, splitting the input sentence into search words, and storing the search words in a Kafka message queue.
When a user develops a new interest and wants to watch live broadcasts related to it, the user generally uses the platform's search function, so the platform monitors search content in real time and acquires the input sentence corresponding to each search. In this way the input sentence, as a direct reflection of the user's current interest, is fully captured by the platform.
The live broadcast platform must serve a large number of viewers, and recommending in real time according to users' current interests requires handling large data volumes in real time with stable processing. The invention uses the Kafka queue so that search words can be written and read quickly, and uses the Storm real-time stream processing framework to perform retrieval stably in real time, so that the platform can meet the data processing demands of a large number of users in real time.
Further, monitoring search content in real time and acquiring an input sentence specifically includes: storing each input sentence in order of occurrence, and acquiring the most recent several input sentences. The sentences a user types while pursuing a current interest are tied to time, so storing each input sentence in order and acquiring the most recent ones captures the user's current interest better.
Specifically, the search words are stored in a sorted set structure of the Redis database, and the double-type score of the sorted set is used to store the search timestamp. Because the score of a sorted set is a double, it holds the search timestamp well, and storing the data in Redis lets the platform cope with a large number of clients and recommend currently interesting live broadcast rooms to them in real time.
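The timestamp-as-score idea can be sketched without a Redis server. The Python class below is an in-memory stand-in that mimics two sorted-set behaviors: re-adding a member overwrites its score (as ZADD does), and reading in reverse score order returns the newest entries first (as ZREVRANGE does). The class name `RecentSearches` and its methods are hypothetical and are not part of any Redis client.

```python
class RecentSearches:
    """In-memory stand-in for one Redis sorted set (member -> score)."""

    def __init__(self):
        self._scores = {}

    def add(self, word, timestamp):
        # Like ZADD: adding an existing member replaces its score.
        self._scores[word] = float(timestamp)

    def latest(self, count):
        # Like ZREVRANGE 0 count-1: highest score (newest search) first.
        ordered = sorted(self._scores, key=self._scores.get, reverse=True)
        return ordered[:count]

recent = RecentSearches()
recent.add("northeast", 1000.0)
recent.add("artist", 1005.0)
recent.add("northeast", 1010.0)  # searched again, so its timestamp moves forward
print(recent.latest(2))
```

Because each word keeps only its latest timestamp, reading the top entries gives the user's most recent distinct searches.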
S5: and acquiring the search words from the kafka queue by using a real-time stream processing framework storm, searching the inverted sorting index in the database, sorting the live broadcast rooms according to the weight of the search words in the inverted sorting index, and recommending the preset number of the live broadcast rooms with optimal sorting.
In order to obtain a live broadcast room which is fit with the current user recommendation, further searching is needed in the inverted index, so that a related live broadcast room is obtained, after a plurality of live broadcast rooms are obtained, the weights of phrases are obtained according to the character splitting information obtained corresponding to the searched words and are sequenced, and then the live broadcast room with the optimal sequencing preset number is recommended to the user.
It should be noted that the optimal sorting is the live broadcast room with a larger weight, regardless of the big-to-small or small-to-big rows, after sorting according to the weight. When the live broadcast room most fitting the current interest is recommended to the user in real time, the matching degree between the live broadcast room recommended to the user and the user interest can be improved by using the live broadcast room with higher weight.
For example, for the live broadcast room with ID 96291, the word-segmentation phrases and corresponding weights (the value in parentheses is the weight) are: 96291(1.0), northeast large quail (0.95), inter-aural artist (0.5), northeast (0.4), large quail (0.3), inter-aural (0.2), artist (0.1); and for the live broadcast room with ID 96200, the word-segmentation phrases and corresponding weights are: 96200(1.0), northeast phase (0.95), northeast (0.6), phase (0.4).
When searching for "northeast phase", the entry northeast phase (0.95) of the live broadcast room with ID 96200 is retrieved directly. When searching for "northeast", the live broadcast rooms with IDs 96291 and 96200 are both retrieved, with entries northeast (0.4) and northeast (0.6) respectively; since 0.6 is greater than 0.4, the live broadcast room with ID 96200 ranks above the live broadcast room with ID 96291.
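The lookup and ranking in this example can be sketched as follows. This is a minimal in-memory illustration, not the database-backed index of the embodiment; the function names `build_inverted_index` and `recommend` are assumptions, and the phrases and weights are copied from the example.

```python
from collections import defaultdict

# Phrase weights per live broadcast room, copied from the example in the text.
room_phrases = {
    "96291": {"96291": 1.0, "northeast large quail": 0.95, "inter-aural artist": 0.5,
              "northeast": 0.4, "large quail": 0.3, "inter-aural": 0.2, "artist": 0.1},
    "96200": {"96200": 1.0, "northeast phase": 0.95, "northeast": 0.6, "phase": 0.4},
}


def build_inverted_index(rooms):
    """Map each phrase to the rooms that contain it, with the phrase weight."""
    index = defaultdict(dict)
    for room_id, phrases in rooms.items():
        for phrase, weight in phrases.items():
            index[phrase][room_id] = weight
    return index


def recommend(index, search_word, top_n):
    """Return up to top_n room IDs for the search word, highest weight first."""
    postings = index.get(search_word, {})
    ranked = sorted(postings.items(), key=lambda kv: kv[1], reverse=True)
    return [room_id for room_id, _ in ranked[:top_n]]


index = build_inverted_index(room_phrases)
by_northeast = recommend(index, "northeast", 2)
by_northeast_phase = recommend(index, "northeast phase", 1)
```

With this data, a search for "northeast" places room 96200 (weight 0.6) ahead of room 96291 (weight 0.4), matching the ordering described in the example.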
As an optional embodiment, some of the recommendation columns on the user interface display the preset number of top-ranked live broadcast rooms.
That is, only some of the live broadcast rooms recommended on the user interface are chosen according to the input sentences the user searches in real time; other recommended live broadcast rooms can be displayed in other recommendation columns. For example, several live broadcast rooms are selected according to the user's historical behavior characteristics, and these are sent to the client together with the preset number of top-ranked live broadcast rooms.
In addition to the live broadcast rooms recommended from the user's real-time search behavior, live broadcast rooms recommended from the user's historical behavior are also sent to the client. A user's interests do not change suddenly and completely; interest in one topic may rise temporarily while older interests remain. Recommending only the rooms inferred from real-time search behavior may therefore leave the user dissatisfied, so reasonably combining the two types of recommended live broadcast rooms improves the user experience.
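One possible way to assemble the two kinds of recommendations into separate columns is sketched below. The function name `build_recommendation_columns`, the column keys, and the choice to drop history rooms that already appear in the real-time column are assumptions made for illustration; the text only states that both kinds of rooms are sent to the client together.

```python
def build_recommendation_columns(realtime_rooms, history_rooms, history_slots):
    """Combine real-time and history-based recommendations into two columns.

    Rooms already shown in the real-time column are skipped in the history
    column (an assumed de-duplication rule, not stated in the source text).
    """
    seen = set(realtime_rooms)
    history_column = []
    for room_id in history_rooms:
        if len(history_column) >= history_slots:
            break
        if room_id not in seen:
            seen.add(room_id)
            history_column.append(room_id)
    return {"realtime": list(realtime_rooms), "history": history_column}


columns = build_recommendation_columns(
    realtime_rooms=["96200", "96291"],
    history_rooms=["96291", "10001", "10002"],  # hypothetical room IDs
    history_slots=2,
)
```

Keeping the columns separate lets the client decide how much screen space each recommendation type receives.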
Based on the same inventive concept, the present application provides the second embodiment, which is as follows.
Example two
As shown in fig. 2, an embodiment of the present invention provides a live broadcast room recommendation system, comprising:
the splitting module (1), used for acquiring the text information of all live broadcast rooms, splitting the text information from complex to simple according to its composition, and establishing inverted indexes step by step according to complexity and storing them in a database;
the retrieval module (2), used for detecting the search behavior of the user in real time, acquiring the input sentence of the user, splitting the input sentence to obtain search words, and storing the search words in a kafka message queue;
and the recommending module (3), used for acquiring the search words from the kafka queue by using the real-time stream processing framework storm, searching the inverted indexes in the database, sorting the retrieved live broadcast rooms according to the number of search words, and recommending the preset number of top-ranked live broadcast rooms.
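The cooperation of the three modules can be sketched in a single process as follows. This is a simplified illustration under several assumptions: `queue.Queue` stands in for the kafka message queue, a plain loop stands in for the storm topology, whitespace splitting stands in for real word segmentation, and the room texts are supplied as already-split phrase lists. All function names are hypothetical.

```python
import queue
from collections import Counter, defaultdict


def splitting_module(room_phrases):
    """Build an inverted index: phrase -> set of room IDs containing it."""
    index = defaultdict(set)
    for room_id, phrases in room_phrases.items():
        for phrase in phrases:
            index[phrase].add(room_id)
    return index


def retrieval_module(sentence, message_queue):
    """Split the input sentence into search words and enqueue them."""
    for word in sentence.split():  # stand-in for real word segmentation
        message_queue.put(word)


def recommending_module(message_queue, index, top_n):
    """Rank rooms by how many queued search words they match, most first."""
    hits = Counter()
    while not message_queue.empty():
        word = message_queue.get()
        for room_id in index.get(word, ()):
            hits[room_id] += 1
    # Ties are broken by room ID so the output is deterministic.
    ranked = sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
    return [room_id for room_id, _ in ranked[:top_n]]


index = splitting_module({
    "96291": ["northeast", "artist"],
    "96200": ["northeast", "phase"],
})
search_queue = queue.Queue()
retrieval_module("northeast artist", search_queue)
top_rooms = recommending_module(search_queue, index, 1)
```

Here room 96291 matches two search words and room 96200 matches one, so 96291 is the single room recommended.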
Various modifications and specific examples of the foregoing embodiments of the method are also applicable to the system of the present embodiment, and the implementation and advantages of the system of the present embodiment will be apparent to those skilled in the art from the foregoing detailed description of the method, and therefore, for the sake of brevity of this description, detailed descriptions thereof will not be provided herein.
Based on the same inventive concept, the present application provides the third embodiment.
Example three
A third embodiment of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a live broadcast room recommendation method as provided in any of the embodiments of the present invention, the method comprising:
acquiring text information of each live broadcast room;
inputting the character information into a distributed search engine for word segmentation to obtain a plurality of word groups and weights of the corresponding word groups;
establishing an inverted index for the live broadcast room ID and the phrases corresponding to the live broadcast room ID, and storing the inverted index into a database;
monitoring search content in real time, acquiring an input sentence, splitting the input sentence to obtain a retrieval word, and storing the retrieval word into a kafka message queue;
and acquiring the search words from the kafka queue by using a real-time stream processing framework storm, searching the inverted sorting index in the database, sorting the live broadcast rooms according to the weight of the search words in the inverted sorting index, and recommending the preset number of the live broadcast rooms with optimal sorting.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer-readable storage medium may be, for example but not limited to: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.
Based on the same inventive concept, the present application provides the fourth embodiment.
Example four
The fourth embodiment of the present invention further provides an electronic device, which includes a memory and a processor, where the memory stores a computer program running on the processor, and the processor executes the computer program to implement all or part of the method steps in the first embodiment.
The Processor may be a Central Processing Unit (CPU), another general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. The general purpose processor may be a microprocessor or any conventional processor or the like; the processor is the control center of the computer device and connects the various parts of the overall computer device using various interfaces and lines.
The memory may be used to store the computer programs and/or modules, and the processor may implement various functions of the computer device by running or executing the computer programs and/or modules stored in the memory and invoking data stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, video data, etc.) created according to the use of the cellular phone, etc. In addition, the memory may include high speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other volatile solid state storage device.
Generally speaking, the live broadcast room recommendation method, the storage medium, the electronic device and the system provided by the embodiment of the invention capture the current interest of the user by acquiring the input words searched by the user in real time, establish the inverted index of each live broadcast room for retrieval, and recommend the live broadcast room fitting the current interest to the user.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A live broadcast room recommending method is used for recommending a live broadcast room in which a user is currently interested to the user in real time, and is characterized by comprising the following steps of:
acquiring text information of each live broadcast room;
inputting the character information into a distributed search engine for word segmentation to obtain a plurality of word groups and weights of the corresponding word groups;
establishing an inverted index for the live broadcast room ID and the phrases corresponding to the live broadcast room ID, and storing the inverted index into a database;
monitoring search content in real time, acquiring an input sentence, splitting the input sentence to obtain a retrieval word, and storing the retrieval word into a kafka message queue;
and acquiring the search words from the kafka queue by using a real-time stream processing framework storm, searching the inverted sorting index in the database, sorting the live broadcast rooms according to the weight of the search words in the inverted sorting index, and recommending the preset number of the live broadcast rooms with optimal sorting.
2. The method of claim 1, wherein:
the step of inputting the text information into a distributed search engine for word segmentation to obtain a plurality of word groups and weights of the corresponding word groups comprises the following steps:
splitting the text information into a plurality of elements and phrases formed by combining the elements;
and scoring the elements and the phrases formed by combining the elements through the distributed search engine to obtain the weights of the corresponding elements and phrases, wherein each weight is a score of the relevance of the element or phrase to the text information of the corresponding live broadcast room.
3. The method of claim 1,
the real-time detection of the search behavior and the acquisition of the input sentence specifically include: storing the input sentences each time according to the number of times, and acquiring the latest input sentences for a plurality of times.
4. The method of claim 3, wherein: the index word storage structure is a sorted set structure of the Redis database, and the double type score of the sorted set structure is used for storing the search timestamp.
5. The method of claim 1, wherein: the text information comprises a live broadcast room title, a live broadcast room ID, an anchor ID and an anchor nickname.
6. The method of claim 1, wherein: the text information is split using an ElasticSearch word segmentation tool.
7. The method of claim 1, wherein: the preset number of top-ranked live broadcast rooms is displayed in some of the recommendation columns on a user interface.
8. A storage medium having a computer program stored thereon, characterized in that: the computer program, when executed by a processor, implements the method of any one of claims 1 to 7.
9. An electronic device comprising a memory and a processor, the memory having stored thereon a computer program that runs on the processor, characterized in that: the processor, when executing the computer program, implements the method of any of claims 1 to 7.
10. A live broadcast room recommendation system, characterized by comprising:
the splitting module, used for acquiring the text information of all live broadcast rooms, splitting the text information from complex to simple according to the composition of the text information, and establishing inverted indexes step by step according to complexity and storing them in a database;
the retrieval module, used for detecting the search behavior of the user in real time, acquiring the input sentence of the user, splitting the input sentence to obtain search words, and storing the search words in the kafka message queue;
and the recommending module, used for acquiring the search words from the kafka queue by using the real-time stream processing framework storm, searching the inverted indexes in the database, sorting the retrieved live broadcast rooms according to the number of search words from most to least, and recommending the preset number of top-ranked live broadcast rooms.
CN201811178341.2A 2018-10-10 2018-10-10 Live broadcast room recommendation method, storage medium, electronic equipment and system Active CN111104583B (en)

Publications (2)

Publication Number Publication Date
CN111104583A (en) 2020-05-05
CN111104583B (en) 2024-01-05

Family

ID=70418169

Country Status (1)

Country Link
CN (1) CN111104583B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20231124

Address after: 450000 Zhengzhou, Henan Province Henan Free Trade Zone Zhengzhou Area (Economic Development Zone) No. 160-11E Trade Warehouse C-202, 8th Street, Zhengzhou Area (Economic Development Zone)

Applicant after: Henan Xingyi Network Technology Co.,Ltd.

Address before: 430000 East Lake Development Zone, Wuhan City, Hubei Province, No. 1 Software Park East Road 4.1 Phase B1 Building 11 Building

Applicant before: WUHAN DOUYU NETWORK TECHNOLOGY Co.,Ltd.

GR01 Patent grant