CN111078935A

CN111078935A - Voice live broadcast anchor value evaluation method based on cross-domain recommendation idea

Info

Publication number: CN111078935A
Application number: CN201911116602.2A
Authority: CN
Inventors: 廉亚红; 丁宁
Original assignee: Guangzhou Lizhi Network Technology Co ltd
Current assignee: Guangzhou Lizhi Network Technology Co ltd
Priority date: 2019-11-15
Filing date: 2019-11-15
Publication date: 2020-04-28

Abstract

The invention relates to a method for evaluating the value of voice live broadcast anchors based on a cross-domain recommendation idea, comprising the following steps: step one, automatically collecting multi-dimensional feature information about anchors to construct an anchor feature system for voice live broadcasting; step two, constructing multi-dimensional information by data mining and data statistics, and supplementing it to the anchor feature system for voice live broadcasting; step three, where an anchor has multiple live sessions, introducing a time decay function that down-weights each session according to the distance between its broadcast time and the current time; and step four, constructing an anchor value evaluation algorithm based on the cross-domain recommendation idea. The method completes the value evaluation of voice live broadcast anchors automatically and intelligently by means of a model.

Description

Voice live broadcast anchor value evaluation method based on cross-domain recommendation idea

Technical Field

The invention belongs to the field of artificial intelligence, and in particular relates to a method for evaluating the value of voice live broadcast anchors based on a cross-domain recommendation idea.

Background

With the spread of 4G networks and the approach of the 5G era, people are gradually shifting from text interaction to audio and video interaction. Voice live broadcasting is an interaction mode that has emerged in recent years: users and anchors interact by voice, and dedicated audiences have built up in categories such as making friends, singing and talk shows.

As a new kind of social media content, voice live broadcasting differs greatly from other internet media such as news and video live broadcasting in its user groups, listening behavior, and ways of interacting with the anchor. To give every user a personalized experience, recommending voice live broadcast content tailored to each user's interests is one of the main ways to improve user experience and retain new and existing users. Evaluating the value of voice live broadcast anchors is an essential link in personalized recommendation.

In most scenarios, anchor value is evaluated by predicting a value score from the anchor's historical performance. Alternatively, operations staff manually maintain the historical behavior of valuable top anchors and track their long-term and short-term value.

Obviously, such an approach suffers from the following drawbacks:

(1) Manually maintaining anchors' historical behavior is time-consuming and labor-intensive for operations staff and introduces a delay. In a real-time recommendation scenario, this approach would hold back the recommendation work.

(2) Evaluating an anchor's value only from the anchor's own historical behavior is one-sided, and it also runs into the cold-start problem for new anchors.

To address these technical problems, the invention provides a voice live broadcast anchor value evaluation method based on a cross-domain recommendation idea.

Disclosure of Invention

The invention aims to provide a voice live broadcast anchor value evaluation method based on a cross-domain recommendation idea, which constructs an artificial intelligence model to evaluate the value of voice live broadcast anchors automatically and intelligently.

To achieve this purpose, the invention provides the following technical scheme: a voice live broadcast anchor value evaluation method based on a cross-domain recommendation idea, comprising the following steps:

step one, automatically collecting multi-dimensional feature information about anchors to construct an anchor feature system for voice live broadcasting;

step two, constructing multi-dimensional information by data mining and data statistics, and supplementing it to the anchor feature system for voice live broadcasting;

step three, where an anchor has multiple live sessions, introducing a time decay function that down-weights each session according to the distance between its broadcast time and the current time;

and step four, constructing an anchor value evaluation algorithm based on the cross-domain recommendation idea, and evaluating anchor value automatically and intelligently from the multi-dimensional feature information.

Further, the multi-dimensional feature information of step one includes anchor information, interaction capability, income, user retention and content quality.

Further, the multi-dimensional information constructed in step two includes: (1) the first-level category labels of the anchor's recorded programs; (2) the second-level category labels of the anchor's recorded programs; (3) all entities in the anchor's recorded programs; (4) the keywords of the anchor's recorded programs; (5) the ratings of the anchor's recorded programs; (6) the review pass rate of the anchor's historical programs; (7) the anchor's review pass rate over the last 7 days.

Further, the decay function down-weights each session as follows:

dt = (T - t_1, T - t_2, T - t_3, …, T - t_n)

where T is the current time and t_1, …, t_n are the time points of each of an anchor's live sessions.

dt is then normalized.

The weight of each session is

w = exp(-nt)/n.
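The decay weighting can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: the source does not reproduce the normalization formula, so dividing each gap by the largest gap is an assumption, as is reading the weight as exp(-nt)/n with nt the normalized gap and n the number of sessions. The function name `decay_weights` and the use of days as the time unit are hypothetical.

```python
import math


def decay_weights(current_time, session_times):
    """Return one weight per session; more recent sessions get larger weights.

    Assumptions (not stated in the source): gaps are normalized by dividing
    by the largest gap, and the weight is exp(-nt) / n, where nt is the
    normalized gap and n is the number of sessions.
    """
    gaps = [current_time - t for t in session_times]  # dt = (T - t_1, ..., T - t_n)
    n = len(gaps)
    if n == 0:
        return []
    largest = max(gaps)
    # If every session happened at the current time, there is nothing to decay.
    normalized = [g / largest if largest > 0 else 0.0 for g in gaps]
    return [math.exp(-nt) / n for nt in normalized]


# Times are in days; the current time is day 30.
weights = decay_weights(30.0, [0.0, 15.0, 29.0])
```

With these inputs the oldest session (day 0) receives the smallest weight and the most recent one (day 29) the largest. A different normalization, such as min-max scaling, would change the values but not that ordering.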

Further, the value evaluation algorithm: the second-level labels of live broadcast anchors are embedded, the text features related to the anchor's recorded programs are embedded as cross-domain knowledge and supplemented to the anchor evaluation system, and a scoring model is trained using xgboost; the platform randomly samples fifty thousand anchors and scores them to serve as training data.

The invention has the beneficial effects that:

(1): the method is automatic and intelligent, completing the value evaluation of voice live broadcast anchors by means of a model;

(2): the key scoring indicators are derived from the objective historical behavior of the platform's voice live broadcast anchors, so they are objectively meaningful;

(3): all types of anchors are covered (new and established anchors, and anchors under every first-level category label), and the evaluation indicators involved are sufficiently objective.

Drawings

Fig. 1 is a flowchart illustrating a method for assessing a value of a live anchor based on cross-domain recommendation according to an exemplary embodiment.

Fig. 2 is an algorithm model diagram illustrating a voice live anchor value assessment method based on cross-domain recommendation concept according to an exemplary embodiment.

Fig. 3 is a flowchart illustrating an algorithm of a voice live anchor value evaluation method based on a cross-domain recommendation concept according to an exemplary embodiment.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

With reference to Fig. 1, a voice live broadcast anchor value evaluation method based on a cross-domain recommendation idea includes the following steps:

Step one:

An anchor feature system for voice live broadcasting is constructed. It includes, but is not limited to, the following feature dimensions, and the relevant information is collected automatically by the system:

Activity: (1) the anchor's historical fan count; (2) the anchor's active fan count; (3) the number of posts published by the anchor; (4) the number of comments received on the anchor's published programs; (5) homepage visits; (6) the cumulative number of effective online users; (7) effective online users per session: the number of users effectively listening in each live session; (8) effective broadcast rate (effective broadcasts / total broadcasts in the last month); (9) returning anchor (an anchor who resumes broadcasting after 30 consecutive days without broadcasting).

Interaction capability: (1) the average number of users connecting by microphone per session; (2) the average 5-minute retention rate per session; (3) the average speaking rate per session; (4) the average number of red packets per session; (5) the proportion of new users among newly added fans.

Income: (1) the total number of gifts the anchor received in the live room in the last month; (2) the total value of gifts the anchor received in the live room in the last month; (3) the number of users who sent gifts to the anchor in the last month; (4) the number of fans who sent gifts to the anchor in the last month.

User retention: (1) the rate at which users who effectively listened on the anchor's broadcast day visit the platform the next day; (2) the rate at which users who effectively listened on the anchor's previous broadcast day visit the anchor on the next broadcast day; (3) the average rate at which fans visit the anchor on the anchor's broadcast days.

Content quality: (1) the user-defined labels of the anchor's live room; (2) the operator-defined labels of the anchor's live room; (3) the description of the anchor's live room.
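Two of the activity features in step one are defined precisely enough to sketch in code: the effective broadcast rate and the returning-anchor flag. The following is an illustrative sketch only. The source does not say what makes a broadcast "effective", so that is passed in as a flag, and the function names, the input layout and the 30-day reading of "the last month" are assumptions.

```python
from datetime import date, timedelta


def effective_broadcast_rate(sessions, today, window_days=30):
    """Share of sessions in the last `window_days` days that were effective.

    `sessions` is a list of (session_date, is_effective) pairs (hypothetical layout).
    """
    start = today - timedelta(days=window_days)
    recent = [ok for day, ok in sessions if start < day <= today]
    if not recent:
        return 0.0
    return sum(recent) / len(recent)


def is_returning_anchor(session_dates, gap_days=30):
    """True if the latest session follows at least `gap_days` full days without a broadcast."""
    ordered = sorted(session_dates)
    if len(ordered) < 2:
        return False
    # Dates more than gap_days apart leave at least gap_days silent days between them.
    return (ordered[-1] - ordered[-2]).days > gap_days


sessions = [
    (date(2019, 9, 1), True),
    (date(2019, 11, 1), True),
    (date(2019, 11, 5), False),
    (date(2019, 11, 10), True),
]
rate = effective_broadcast_rate(sessions, today=date(2019, 11, 15))
returning = is_returning_anchor([date(2019, 9, 1), date(2019, 11, 1)])
```

In this example three sessions fall inside the 30-day window and two of them are effective, and the anchor who broadcast on 1 September and again on 1 November counts as returning.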

Step two: By data mining and data statistics, dimension information including but not limited to the following is constructed and supplemented to the anchor feature system for voice live broadcasting: (1) the first-level category labels of the anchor's recorded programs; (2) the second-level category labels of the anchor's recorded programs; (3) all entities in the anchor's recorded programs; (4) the keywords of the anchor's recorded programs; (5) the ratings of the anchor's recorded programs; (6) the review pass rate of the anchor's historical programs; (7) the anchor's review pass rate over the last 7 days.
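The statistical part of step two can be sketched as a summary over an anchor's recorded programs. This is a hedged illustration: the dictionary keys (`category`, `subcategory`, `keywords`, `rating`, `review_date`, `passed`) and the function name are hypothetical, and entity extraction is left out because it would need named-entity tooling that the source does not specify.

```python
from datetime import date, timedelta


def recorded_program_features(programs, today):
    """Summarize an anchor's recorded programs into step-two style features."""

    def pass_rate(items):
        # Undefined (None) when no program has been reviewed.
        return sum(1 for p in items if p["passed"]) / len(items) if items else None

    week_start = today - timedelta(days=7)
    last_week = [p for p in programs if week_start < p["review_date"] <= today]
    ratings = [p["rating"] for p in programs]
    return {
        "categories": sorted({p["category"] for p in programs}),
        "subcategories": sorted({p["subcategory"] for p in programs}),
        "keywords": sorted({k for p in programs for k in p["keywords"]}),
        "mean_rating": sum(ratings) / len(ratings) if ratings else None,
        "pass_rate_all": pass_rate(programs),
        "pass_rate_7d": pass_rate(last_week),
    }


programs = [
    {"category": "music", "subcategory": "folk", "keywords": ["guitar", "cover"],
     "rating": 4.0, "review_date": date(2019, 11, 14), "passed": True},
    {"category": "music", "subcategory": "pop", "keywords": ["cover"],
     "rating": 3.0, "review_date": date(2019, 11, 12), "passed": False},
    {"category": "talk", "subcategory": "comedy", "keywords": ["stand-up"],
     "rating": 5.0, "review_date": date(2019, 10, 1), "passed": True},
]
features = recorded_program_features(programs, today=date(2019, 11, 15))
```

Because these features come from recorded programs rather than live sessions, they are available even for an anchor with no live broadcast history, which is the kind of gap that cross-domain information is meant to fill.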

Step three: Because each anchor has multiple live sessions, a time decay function is introduced, and each session is down-weighted according to the distance between its broadcast time and the current time:

dt = (T - t_1, T - t_2, T - t_3, …, T - t_n)

dt is then normalized.

The weight of each session is

w = exp(-nt)/n
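The source does not spell out how the per-session weights are applied. One plausible use, sketched here purely as an assumption, is to collapse a per-session metric (for example the 5-minute retention rate) into a single anchor-level feature by a weighted average. The function name and the renormalization of the weights to sum to 1 are illustrative choices.

```python
def weighted_session_average(values, weights):
    """Weighted mean of a per-session metric; weights are renormalized to sum to 1."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Three sessions, oldest first; the most recent session carries the largest weight.
retention_per_session = [0.2, 0.4, 0.6]
session_weights = [0.1, 0.3, 0.6]
score = weighted_session_average(retention_per_session, session_weights)
```

Here the result is about 0.5, above the unweighted mean of 0.4, because the most recent session had the highest retention.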

Step four: An anchor value evaluation algorithm is constructed based on the cross-domain recommendation idea. Starting from the anchor's voice live broadcast dimensions and fusing in the dimensions and features of the anchor's recorded programs, it evaluates the anchor's value automatically and intelligently. The specific algorithm flow is as follows:

(1) Features of the voice live broadcast anchor:

Activity: fan count, number of posts published, homepage visits, cumulative online users, average online users (fans and non-fans can be distinguished);

Interaction capability: average number of microphone connections per session, average 5-minute retention rate per session, average speaking rate per session, and average number of red packets per session;

Income: number of gifts, gift value, number of gifting users (fans and non-fans can be distinguished);

Content quality: the user-defined labels of the live room, the operator-defined labels of the live room, and the description of the live room.

(2) Features of the anchor's recorded programs: first-level category labels, program keywords, program entities and program review pass rate.

Bucketing: voice live broadcast anchors are divided into buckets by first-level label and by new versus established anchor.
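The bucketing by first-level label and by new versus established anchor can be sketched as a simple grouping. The source gives no threshold for what counts as a new anchor, so the 30-day cutoff below is a hypothetical parameter, as are the field names and the function name.

```python
from collections import defaultdict


def bucket_anchors(anchors, new_anchor_max_days=30):
    """Group anchor ids by (first-level label, "new" or "old").

    The cutoff between new and old anchors is an assumed parameter.
    """
    buckets = defaultdict(list)
    for anchor in anchors:
        is_new = anchor["days_since_first_broadcast"] <= new_anchor_max_days
        key = (anchor["first_level_label"], "new" if is_new else "old")
        buckets[key].append(anchor["anchor_id"])
    return dict(buckets)


anchors = [
    {"anchor_id": "a1", "first_level_label": "singing", "days_since_first_broadcast": 5},
    {"anchor_id": "a2", "first_level_label": "singing", "days_since_first_broadcast": 400},
    {"anchor_id": "a3", "first_level_label": "talk show", "days_since_first_broadcast": 12},
    {"anchor_id": "a4", "first_level_label": "singing", "days_since_first_broadcast": 20},
]
buckets = bucket_anchors(anchors)
```

Grouping this way lets anchors be compared with peers in the same category and at a similar stage, rather than against the whole platform.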

Model: the first-level and second-level labels of live broadcast anchors are embedded, the text features related to the anchor's recorded programs are embedded as cross-domain knowledge and supplemented to the anchor evaluation system, and a scoring model is trained using xgboost.
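Claim 5 states that the text can be vectorized with TF-IDF and mapped to a fixed-length vector. The sketch below shows one way to do that in plain Python and to concatenate the result with numeric live broadcast features. It is an illustration under assumptions: the text is taken to be already tokenized (Chinese text would first need word segmentation), the smoothed IDF variant is one common choice rather than the patent's stated formula, and all function names are hypothetical.

```python
import math
from collections import Counter


def fit_tfidf(documents):
    """Learn a vocabulary and IDF values from tokenized documents."""
    vocabulary = sorted({token for doc in documents for token in doc})
    n_docs = len(documents)
    doc_freq = Counter(token for doc in documents for token in set(doc))
    # Smoothed IDF, so that a term present in every document keeps a non-zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocabulary}
    return vocabulary, idf


def tfidf_vector(doc, vocabulary, idf):
    """Map one tokenized document to a vector with one entry per vocabulary term."""
    counts = Counter(doc)
    total = len(doc) or 1
    return [counts[t] / total * idf[t] for t in vocabulary]


def anchor_feature_row(live_features, label_tokens, recorded_tokens,
                       label_model, recorded_model):
    """Concatenate numeric live features with the label and recorded-program vectors."""
    return (list(live_features)
            + tfidf_vector(label_tokens, *label_model)
            + tfidf_vector(recorded_tokens, *recorded_model))


label_docs = [["singing", "folk"], ["talk", "comedy"]]
recorded_docs = [["guitar", "cover", "folk"], ["stand-up", "comedy", "story"]]
label_model = fit_tfidf(label_docs)
recorded_model = fit_tfidf(recorded_docs)
row = anchor_feature_row([1200.0, 0.35], label_docs[0], recorded_docs[0],
                         label_model, recorded_model)
```

Every anchor's row has the same length (here 2 numeric features, 4 label terms and 6 recorded-program terms), which is what a tree-based scoring model expects as input.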

Training: the platform randomly samples 50,000 anchors and manually scores them to serve as training data. By establishing a feature system of historical behavior for voice live broadcast anchors, that is, collecting the overall objective behavior of voice live broadcast anchors, while also collecting and mining the text features and continuous features of the anchors' recorded programs, the method builds an artificial intelligence model that evaluates the value of voice live broadcast anchors automatically and intelligently.
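Assembling the training data can be sketched as follows: draw a random sample of anchors, have them scored by hand, and pair each feature row with its score. The function names, the fixed seed and the toy data are illustrative assumptions. The scores in the example are placeholders standing in for human ratings, not real data.

```python
import random


def sample_anchors(anchor_ids, sample_size=50_000, seed=0):
    """Draw a reproducible random sample of anchors to be scored by human raters."""
    rng = random.Random(seed)
    ids = sorted(anchor_ids)
    return rng.sample(ids, min(sample_size, len(ids)))


def build_training_set(sampled_ids, feature_rows, manual_scores):
    """Pair each sampled anchor's feature row with its manual score."""
    X, y = [], []
    for anchor_id in sampled_ids:
        if anchor_id in manual_scores:  # skip anchors that have not been scored yet
            X.append(feature_rows[anchor_id])
            y.append(manual_scores[anchor_id])
    return X, y


# Toy data: 10 anchors with 2 features each.
feature_rows = {f"a{i}": [float(i), float(i % 3)] for i in range(10)}
sampled = sample_anchors(feature_rows, sample_size=4)
# Placeholder scores standing in for manual ratings.
manual_scores = {anchor_id: 60.0 + 5.0 * feature_rows[anchor_id][0] for anchor_id in sampled}
X, y = build_training_set(sampled, feature_rows, manual_scores)
```

The resulting `X` and `y` could then be passed to a gradient-boosted regressor, for example `xgboost.XGBRegressor().fit(X, y)`. The source names xgboost but does not say which model class or hyperparameters were used.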

Claims

1. A voice live broadcast anchor value evaluation method based on a cross-domain recommendation idea is characterized by comprising the following steps:

2. The method for evaluating the value of the live anchor of the voice based on the cross-domain recommendation idea according to claim 1, characterized in that: the multi-dimensional characteristic information of the step one comprises anchor information, interaction capacity, income condition, user retention and content quality.

3. The method for evaluating the value of voice live broadcast anchors based on the cross-domain recommendation idea according to claim 1, characterized in that: the multi-dimensional information constructed in step two comprises: (1) the first-level category labels of the anchor's recorded programs; (2) the second-level category labels of the anchor's recorded programs; (3) all entities in the anchor's recorded programs; (4) the keywords of the anchor's recorded programs; (5) the ratings of the anchor's recorded programs; (6) the review pass rate of the anchor's historical programs; (7) the anchor's review pass rate over the last 7 days.

4. The method for evaluating the value of voice live broadcast anchors based on the cross-domain recommendation idea according to claim 1, wherein the decay function down-weights each session as follows:

dt = (T - t_1, T - t_2, T - t_3, …, T - t_n)

where T is the current time and t_1, …, t_n are the time points of each of an anchor's live sessions; dt is then normalized.

The weight of each session is

w = exp(-nt)/n.

5. The method for evaluating the value of voice live broadcast anchors based on the cross-domain recommendation idea according to claim 1, characterized in that: the value evaluation algorithm: the second-level labels of live broadcast anchors are embedded, where the text is vectorized mainly using natural language processing (NLP) techniques. Since the amount of text in the second-level labels is not large, the text can be vectorized using TF-IDF word-frequency statistics. The text features related to the anchor's recorded programs are vectorized (specifically, TF-IDF is calculated and the text is mapped to a fixed-length vector) and supplemented to the anchor evaluation system as cross-domain knowledge, and a scoring model is trained using xgboost; the platform randomly samples fifty thousand anchors and scores them to serve as training data.