CN116582694B - Live broadcast data processing method based on character recognition - Google Patents

Live broadcast data processing method based on character recognition

Info

Publication number
CN116582694B
Authority
CN
China
Prior art keywords
broadcasting room
barrages
live broadcasting
historical
live
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310704406.7A
Other languages
Chinese (zh)
Other versions
CN116582694A (en
Inventor
王俊桦
杨锋
蒋侃
蒋军华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Mosquito Club Digital Marketing Co ltd
Original Assignee
Hangzhou Mosquito Club Digital Marketing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Mosquito Club Digital Marketing Co ltd
Priority to CN202310704406.7A
Publication of CN116582694A
Application granted
Publication of CN116582694B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 Live feed
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/635 Overlay text, e.g. embedded captions in a TV program
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788 Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Abstract

The invention provides a live broadcast data processing method based on character recognition, belonging to the technical field of data processing. The method comprises: determining a basic recognition frequency for character recognition of the barrages in a live broadcasting room according to the room's historical audience count, its historical barrage count within a set time period, and its historical barrage word count; correcting the basic recognition frequency with a correction amount to obtain a corrected recognition frequency; performing character recognition on the barrages of the live broadcasting room at the corrected recognition frequency to identify the interested audience and the associated barrages of each product; evaluating an interest degree from the number and proportion of interested audience members and the number of their associated barrages; obtaining the number of associated barrages of the product and the number of audience members who sent them and, in combination with the interest degree, outputting a recommended introduction duration for the product; and outputting recommended introduction content for the product based on the recognition result, so that live products are recommended and introduced in a targeted way.

Description

Live broadcast data processing method based on character recognition
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a live broadcast data processing method based on character recognition.
Background
In live broadcasting over a network, the audience often interacts with the anchor through text. Once many audience members interact at the same time, the anchor can no longer pick out the useful interaction information, so the interaction needs of the audience go unmet.
To address this problem, granted patent CN115174951B, "An unmanned live broadcast online analysis management system based on multidimensional feature capture", analyzes and responds to the text, voice and picture information sent by users, which makes the analysis of and reply to user information more comprehensive. However, the following technical problems remain:
1. For a live broadcasting room hosted by a real person, the prior art does not determine the detection frequency of the automatic character analysis module from factors such as the number of barrages and the number of audience members in the room. If a room with few barrages or few audience members is analyzed at the same detection frequency as a room with many, the load on the server that performs the character analysis increases unnecessarily.
2. The prior art does not determine the introduction duration and the introduction focus of each product from the character analysis result of the live broadcasting room. That result reveals how interested the audience is in each product and which aspects of the product the audience cares about. If it is not used to lengthen the introduction of products that attract more interest and to adjust the introduction focus accordingly, the interaction needs of the audience cannot be met.
To address the above technical problems, the invention provides a live broadcast data processing method based on character recognition.
Disclosure of Invention
To achieve the purpose of the invention, the following technical scheme is adopted:
According to one aspect of the invention, a live broadcast data processing method based on character recognition is provided.
The live broadcast data processing method based on character recognition comprises the following steps:
S11, determining a basic recognition frequency for character recognition of the barrages in the live broadcasting room according to the room's historical audience count, its historical barrage count within a set time period, and its historical barrage word count;
S12, constructing a correction amount from the real-time audience count of the live broadcasting room, its real-time barrage count within the set time period, and its real-time barrage word count, and determining based on the correction amount whether the basic recognition frequency needs to be corrected; if so, correcting the basic recognition frequency with the correction amount to obtain a corrected recognition frequency; if not, taking the basic recognition frequency as the corrected recognition frequency;
S13, performing character recognition on the barrages of the live broadcasting room at the corrected recognition frequency to obtain a recognition result; screening the interested audience and the associated barrages of a product based on the recognition result; evaluating an interest degree based on the number and proportion of interested audience members and the number of their associated barrages; and determining based on the interest degree whether a recommended introduction duration can be output for the product; if so, proceeding to step S14; if not, taking a preset introduction duration as the recommended introduction duration of the product;
S14, obtaining the number of associated barrages of the product and the number of audience members who sent them, outputting the recommended introduction duration of the product in combination with the interest degree, and outputting the recommended introduction content of the product based on the recognition result.
In a further technical scheme, the set time period is shorter than 2 minutes and is determined according to the daily average number of active audience members on the live broadcasting platform that hosts the live broadcasting room: the larger that number, the shorter the set time period.
In a further technical scheme, when the historical audience evaluation value of the live broadcasting room is smaller than an audience evaluation set value, the live broadcasting room is determined to be a low-attention live broadcasting room.
In a further technical scheme, the second set recognition frequency is greater than the set recognition frequency, and both are determined dynamically according to the number of products in the live broadcasting room: the more products there are, the greater both frequencies are.
In a further technical scheme, determining whether the basic recognition frequency needs to be corrected based on the liveness and the real-time liveness of the live broadcasting room specifically comprises:
determining, based on the liveness of the live broadcasting room, whether the basic recognition frequency needs to be corrected; if so, concluding that the basic recognition frequency needs to be corrected; if not, proceeding to the next step;
constructing a comprehensive liveness of the live broadcasting room from its liveness and its real-time liveness, and determining based on the comprehensive liveness whether the basic recognition frequency needs to be corrected.
In a further technical scheme, screening the interested audience and the associated barrages of the product based on the recognition result specifically comprises:
determining the keywords of the barrage text of the live broadcasting room based on the recognition result, and screening the interested audience and the associated barrages of the product according to how the keywords relate to the product, wherein the interested audience of the product consists of the audience members who have sent more than a preset number of associated barrages.
In a further technical scheme, the recommended introduction duration depends on the number of associated barrages of the product, the number of audience members who sent them, and the interest degree, and is produced by a prediction model based on a neural network algorithm: the more associated barrages the product has, the more audience members sent them, and the higher the interest degree, the longer the recommended introduction duration of the product.
In another aspect, the invention provides a computer storage medium on which a computer program is stored; when executed on a computer, the program causes the computer to perform the live broadcast data processing method based on character recognition described above.
The beneficial effects of the invention are as follows:
The basic recognition frequency for character recognition is determined from the historical audience count, the historical barrage count within the set time period, and the historical barrage word count. The basic recognition frequency is thus set dynamically from historical data, which reduces the load on the server while keeping the analysis timely.
The correction amount is constructed from the real-time audience count of the live broadcasting room, its real-time barrage count within the set time period, and its real-time barrage word count. The recognition frequency therefore also reflects real-time data and is adjusted dynamically, which further ensures that the analysis is timely.
The interest degree is evaluated from the number and proportion of interested audience members and the number of their associated barrages. Audience interest in each product is thus derived from the analysis of the barrage text, which lays the foundation for setting a different recommended introduction duration for each product and for outputting recommended introduction content.
The recommended introduction duration of a product is output from the number of its associated barrages and the number of audience members who sent them, in combination with the interest degree. The duration is thus determined from several angles, products that attract more interest receive a longer recommended introduction, and the live broadcast experience of the audience improves.
Additional features and advantages will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
In order to make the above objects, features and advantages of the present invention more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
The above and other features and advantages of the present invention will become more apparent from the following detailed description of exemplary embodiments with reference to the attached drawings:
FIG. 1 is a flowchart of the live broadcast data processing method based on character recognition according to embodiment 1;
FIG. 2 is a flowchart of the specific steps for determining the basic recognition frequency of character recognition according to embodiment 1;
FIG. 3 is a flowchart of the specific steps for determining the correction amount according to embodiment 1;
FIG. 4 is a flowchart of the specific steps for determining the interest degree according to embodiment 1.
Detailed Description
In order to make the technical solutions in the present specification better understood by those skilled in the art, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present disclosure.
The technical problems are summarized as follows:
The prior art does not determine the recognition frequency of barrage character recognition from the number of barrages and the number of audience members in the live broadcasting room, which places unnecessary load on the server that performs the character analysis. It also does not use the character analysis result to determine the introduction duration and the introduction focus of each product, so the interaction needs of the audience cannot be met.
Example 1
To solve the above problems, according to one aspect of the invention and as shown in FIG. 1, a live broadcast data processing method based on character recognition is provided, which specifically comprises:
S11, determining a basic recognition frequency for character recognition of the barrages in the live broadcasting room according to the room's historical audience count, its historical barrage count within a set time period, and its historical barrage word count;
The set time period is shorter than 2 minutes and is determined according to the daily average number of active audience members on the live broadcasting platform that hosts the live broadcasting room: the larger that number, the shorter the set time period.
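As a minimal sketch of the rule above, the set time period could be derived as follows. The text only fixes the upper bound (below 2 minutes) and the direction of the trend; the function name, the 10-second floor, the reference audience of one million and the mapping itself are illustrative assumptions.

```python
def set_time_period_seconds(daily_active_audience: int,
                            max_period: float = 119.0,
                            min_period: float = 10.0,
                            reference_audience: int = 1_000_000) -> float:
    """Return a window length in seconds that stays below 2 minutes
    and shrinks as the platform's daily active audience grows."""
    if daily_active_audience < 0:
        raise ValueError("daily_active_audience must be non-negative")
    # Assumed mapping: scale falls from 1 toward 0 as the audience grows.
    scale = reference_audience / (reference_audience + daily_active_audience)
    return min_period + (max_period - min_period) * scale
```

Any other monotonically decreasing mapping bounded below 120 seconds would satisfy the same description.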
Specifically, as shown in FIG. 2, the basic recognition frequency of character recognition is determined by the following steps:
S21, determining the maximum and the average of the historical audience count of the live broadcasting room; determining a historical audience evaluation value of the live broadcasting room from that maximum and average; and determining based on the historical audience evaluation value whether the live broadcasting room is a low-attention live broadcasting room; if so, taking the set recognition frequency as the basic recognition frequency of the live broadcasting room; if not, proceeding to step S22;
S22, determining the maximum and the average of the historical barrage count of the live broadcasting room within the set time period; determining a historical barrage count evaluation value of the live broadcasting room from that maximum and average; and determining based on the historical barrage count evaluation value whether the live broadcasting room is a high-attention live broadcasting room; if so, proceeding to step S23; if not, proceeding to step S24;
S23, determining the maximum and the average of the historical barrage word count of the live broadcasting room within the set time period; determining a historical barrage word count evaluation value of the live broadcasting room from that maximum and average; and determining based on the historical barrage word count evaluation value whether the live broadcasting room is a high-attention live broadcasting room; if so, taking the second set recognition frequency as the basic recognition frequency of the live broadcasting room; if not, proceeding to step S24;
S24, determining the basic recognition frequency of the live broadcasting room based on its historical audience evaluation value, historical barrage count evaluation value, and historical barrage word count evaluation value.
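Steps S21 to S24 can be sketched as a short decision function. The text does not give the evaluation formulas, the set values, or the way S24 combines the three evaluation values, so the weighted blend of maximum and average, every constant in `Settings`, and the interpolation used for S24 are assumptions chosen only to make the flow concrete.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class Settings:
    """All values are illustrative assumptions, not taken from the source."""
    audience_set_value: float = 100.0   # below this: low-attention room (S21)
    barrage_set_value: float = 50.0     # above this: continue to S23 (S22)
    word_set_value: float = 500.0       # above this: high-attention room (S23)
    set_frequency: float = 0.1          # recognitions per second
    second_set_frequency: float = 1.0   # recognitions per second


def evaluation_value(history: Sequence[float], weight_max: float = 0.5) -> float:
    """Blend the maximum and the average of a history (assumed weighting)."""
    return weight_max * max(history) + (1.0 - weight_max) * mean(history)


def base_recognition_frequency(audience: Sequence[float],
                               barrages: Sequence[float],
                               words: Sequence[float],
                               cfg: Settings = Settings()) -> float:
    a = evaluation_value(audience)
    if a < cfg.audience_set_value:            # S21: low-attention room
        return cfg.set_frequency
    b = evaluation_value(barrages)
    w = evaluation_value(words)
    if b > cfg.barrage_set_value and w > cfg.word_set_value:   # S22 then S23
        return cfg.second_set_frequency
    # S24 (assumed): interpolate between the two set frequencies using the
    # three evaluation values, each capped at its set value.
    ratios = (min(1.0, a / cfg.audience_set_value),
              min(1.0, b / cfg.barrage_set_value),
              min(1.0, w / cfg.word_set_value))
    score = sum(ratios) / 3.0
    return cfg.set_frequency + (cfg.second_set_frequency - cfg.set_frequency) * score
```

Each argument is one series of historical observations, for example the per-period values collected over the past month.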
In this embodiment, combining the historical audience evaluation value, the historical barrage count evaluation value, and the historical barrage word count evaluation value allows the basic recognition frequency to be determined more comprehensively, which helps ensure recognition accuracy.
In addition, the historical audience count, the historical barrage count, and the historical barrage word count are generally taken from the live broadcasting room's data for the past month.
It will be appreciated that when the historical audience evaluation value of the live broadcasting room is smaller than the audience evaluation set value, the live broadcasting room is determined to be a low-attention live broadcasting room.
It should be further noted that the second set recognition frequency is greater than the set recognition frequency, and both are determined dynamically according to the number of products in the live broadcasting room: the more products there are, the greater both frequencies are.
In this embodiment, the basic recognition frequency of character recognition is determined from the historical audience count of the live broadcasting room, its historical barrage count within the set time period, and its historical barrage word count. The basic recognition frequency is thus set dynamically from historical data, which reduces the load on the server while keeping the analysis timely.
S12, constructing a correction amount from the real-time audience count of the live broadcasting room, its real-time barrage count within the set time period, and its real-time barrage word count, and determining based on the correction amount whether the basic recognition frequency needs to be corrected; if so, correcting the basic recognition frequency with the correction amount to obtain a corrected recognition frequency; if not, taking the basic recognition frequency as the corrected recognition frequency;
Specifically, as shown in FIG. 3, the correction amount is determined by the following steps:
S31, obtaining the real-time audience count of the live broadcasting room and determining based on it whether the basic recognition frequency needs to be corrected; if so, proceeding to step S32; if not, determining that the basic recognition frequency does not need to be corrected;
S32, obtaining the real-time barrage count of the live broadcasting room within the set time period and determining based on it whether the basic recognition frequency needs to be corrected; if so, proceeding to step S33; if not, determining that the basic recognition frequency does not need to be corrected;
S33, after at least a preset time, obtaining again the audience count, the barrage count within the set time period, and the barrage word count of the live broadcasting room; evaluating the liveness of the live broadcasting room from the re-obtained audience count and barrage count; evaluating the real-time liveness of the live broadcasting room from the real-time audience count and the real-time barrage count; and determining based on the liveness and the real-time liveness whether the basic recognition frequency needs to be corrected; if so, proceeding to step S34; if not, determining that the basic recognition frequency does not need to be corrected;
S34, constructing the correction amount from the liveness and the real-time liveness of the live broadcasting room, in combination with the real-time barrage word count and the re-obtained barrage word count of the live broadcasting room.
In this embodiment, obtaining the data more than once prevents a single disturbed sample from degrading the correction amount and so improves the accuracy with which the correction amount is constructed.
It can be appreciated that determining whether the basic recognition frequency needs to be corrected based on the liveness and the real-time liveness of the live broadcasting room specifically comprises:
determining, based on the liveness of the live broadcasting room, whether the basic recognition frequency needs to be corrected; if so, concluding that the basic recognition frequency needs to be corrected; if not, proceeding to the next step;
constructing a comprehensive liveness of the live broadcasting room from its liveness and its real-time liveness, and determining based on the comprehensive liveness whether the basic recognition frequency needs to be corrected.
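The correction flow of steps S31 to S34, including the two-stage liveness check just described, might look like the sketch below. The text does not define liveness, the decision thresholds, or the formula for the correction amount, so the following are all assumptions: liveness as barrages per audience member, a "deviates from a historical baseline by more than 20%" test at every decision point, an equal-weight comprehensive liveness, and a multiplicative correction amount.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sample:
    audience: int   # audience count
    barrages: int   # barrage count within the set time period
    words: int      # barrage word count within the set time period


def liveness_of(s: Sample) -> float:
    """Assumed liveness measure: barrages per audience member."""
    return s.barrages / s.audience if s.audience else 0.0


def deviates(value: float, reference: float, tolerance: float) -> bool:
    """True when value differs from reference by more than the relative tolerance."""
    return abs(value - reference) > tolerance * reference


def correction_amount(real_time: Sample, later: Sample, baseline: Sample,
                      tolerance: float = 0.2) -> Optional[float]:
    """Return a multiplicative correction, or None when no correction is needed."""
    if not deviates(real_time.audience, baseline.audience, tolerance):   # S31
        return None
    if not deviates(real_time.barrages, baseline.barrages, tolerance):   # S32
        return None
    base = liveness_of(baseline)
    if base <= 0 or baseline.words <= 0:
        raise ValueError("baseline needs positive liveness and word count")
    live = liveness_of(later)           # S33: from the re-obtained sample
    rt_live = liveness_of(real_time)    # S33: from the real-time sample
    comprehensive = 0.5 * (live + rt_live)
    # Two-stage check: liveness alone first, then the comprehensive liveness.
    if not (deviates(live, base, tolerance)
            or deviates(comprehensive, base, tolerance)):
        return None
    # S34 (assumed): average the liveness ratio and the word-count ratio.
    word_ratio = 0.5 * (real_time.words + later.words) / baseline.words
    return 0.5 * (comprehensive / base) + 0.5 * word_ratio
```

Under these assumptions the corrected recognition frequency would be the basic recognition frequency multiplied by the returned amount, and the basic recognition frequency itself whenever `None` is returned.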
In this embodiment, the correction amount is constructed from the real-time audience count of the live broadcasting room, its real-time barrage count within the set time period, and its real-time barrage word count. The recognition frequency therefore also reflects real-time data and is adjusted dynamically, which further ensures that the analysis is timely.
S13, performing character recognition on the barrages of the live broadcasting room at the corrected recognition frequency to obtain a recognition result; screening the interested audience and the associated barrages of a product based on the recognition result; evaluating an interest degree based on the number and proportion of interested audience members and the number of their associated barrages; and determining based on the interest degree whether a recommended introduction duration can be output for the product; if so, proceeding to step S14; if not, taking a preset introduction duration as the recommended introduction duration of the product;
Specifically, screening the interested audience and the associated barrages of the product based on the recognition result comprises:
determining the keywords of the barrage text of the live broadcasting room based on the recognition result, and screening the interested audience and the associated barrages of the product according to how the keywords relate to the product, wherein the interested audience of the product consists of the audience members who have sent more than a preset number of associated barrages.
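A minimal sketch of this screening step follows. It assumes the recognition result is available as (viewer id, text) pairs and stands in for keyword extraction with a case-insensitive substring match against a hypothetical list of product keywords; the text does not specify how keywords are extracted or matched.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

Barrage = Tuple[str, str]  # (viewer id, recognized barrage text)


def screen_interested(barrages: Iterable[Barrage],
                      product_keywords: Iterable[str],
                      preset_count: int = 1) -> Tuple[List[Barrage], Set[str]]:
    """Return the associated barrages and the interested audience of one product."""
    keywords = [k.lower() for k in product_keywords]
    # A barrage is associated with the product when it mentions any keyword.
    associated = [(viewer, text) for viewer, text in barrages
                  if any(k in text.lower() for k in keywords)]
    # Interested audience: viewers who sent MORE than preset_count associated barrages.
    sent = Counter(viewer for viewer, _ in associated)
    interested = {viewer for viewer, n in sent.items() if n > preset_count}
    return associated, interested
```

For example, with `preset_count=1` a viewer who mentions the product twice is counted as interested, while a viewer who mentions it once contributes an associated barrage only.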
It should be further noted that, as shown in FIG. 4, the interest degree is determined by the following steps:
S41, obtaining the number of associated barrages and determining based on it whether the product is of interest to the audience; if so, proceeding to step S42; if not, evaluating the interest degree of the product from the number of associated barrages and a set interest degree;
S42, taking the number of audience members who sent associated barrages as the sending-audience count and determining based on it whether the product is of interest to the audience; if so, proceeding to step S43; if not, evaluating the interest degree of the product from the sending-audience count and the set interest degree;
S43, taking the audience members who sent more than the preset number of associated barrages as the interested audience, and evaluating the interest degree of the interested audience from the number of interested audience members, the number of their associated barrages, and their proportion of the audience in the live broadcasting room;
S44, evaluating the interest degree of the sending audience from the number of associated barrages, the sending-audience count, and the ratio of the sending-audience count to the audience count of the live broadcasting room, and evaluating the interest degree of the product from the interest degree of the interested audience and the interest degree of the sending audience.
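Steps S41 to S44 can be sketched as follows. The text names the inputs of each step but not the formulas, so the thresholds, the set interest degree of 0.3, the weights, and the way the partial scores are combined are all illustrative assumptions; the input is assumed to be a mapping from viewer id to the number of associated barrages that viewer sent.

```python
from typing import Mapping


def interest_degree(sent_per_viewer: Mapping[str, int], room_audience: int, *,
                    min_barrages: int = 20, min_senders: int = 10,
                    preset_count: int = 2, set_interest: float = 0.3,
                    weight_interested: float = 0.6) -> float:
    """Return an interest degree in [0, 1] for one product."""
    if room_audience <= 0:
        raise ValueError("room_audience must be positive")
    total = sum(sent_per_viewer.values())   # number of associated barrages
    senders = len(sent_per_viewer)          # sending-audience count
    if total < min_barrages:                # S41: too few associated barrages
        return set_interest * total / min_barrages
    if senders < min_senders:               # S42: too few senders
        return set_interest * senders / min_senders
    # S43: interested audience = viewers above the preset count.
    interested = [n for n in sent_per_viewer.values() if n > preset_count]
    interested_score = (0.5 * min(1.0, len(interested) / room_audience)
                        + 0.5 * sum(interested) / total)
    # S44: sending audience, then the combined interest degree.
    sender_score = (0.5 * min(1.0, senders / room_audience)
                    + 0.5 * min(1.0, total / room_audience))
    combined = (weight_interested * interested_score
                + (1.0 - weight_interested) * sender_score)
    return set_interest + (1.0 - set_interest) * combined
```

With this choice, products that fail S41 or S42 always score below the set interest degree and products that pass both score at or above it, which keeps the two branches consistently ordered.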
In this embodiment, the interest degree is evaluated from the number and proportion of interested audience members and the number of their associated barrages. Audience interest in each product is thus derived from the analysis of the barrage text, which lays the foundation for setting a different recommended introduction duration for each product and for outputting recommended introduction content.
S14, obtaining the number of associated barrages of the product and the number of audience members who sent them, outputting the recommended introduction duration of the product in combination with the interest degree, and outputting the recommended introduction content of the product based on the recognition result.
The recommended introduction duration depends on the number of associated barrages of the product, the number of audience members who sent them, and the interest degree, and is produced by a prediction model based on a neural network algorithm: the more associated barrages the product has, the more audience members sent them, and the higher the interest degree, the longer the recommended introduction duration of the product.
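The text specifies a neural-network prediction model but gives no architecture, features, or training data, so it cannot be reproduced here. The sketch below deliberately swaps in a hand-written monotonic function that only reproduces the stated trend (more associated barrages, more senders and a higher interest degree give a longer duration), together with the preset-duration fallback of step S13. All constants and the function name are assumptions.

```python
import math


def recommended_duration_seconds(associated_barrages: int, senders: int,
                                 interest: float, *,
                                 preset_duration: float = 60.0,
                                 max_duration: float = 600.0,
                                 min_interest: float = 0.3) -> float:
    """Stand-in for the prediction model: monotonic in all three inputs."""
    if interest < min_interest:
        # S13 fallback: interest too low, use the preset introduction duration.
        return preset_duration
    # Saturating activity term in [0, 1) that grows with barrages and senders.
    activity = 1.0 - math.exp(-(associated_barrages / 100.0 + senders / 50.0))
    score = 0.5 * activity + 0.5 * min(1.0, interest)
    return preset_duration + (max_duration - preset_duration) * score
```

A trained regression model with the same three inputs could replace the body without changing the call site.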
In this embodiment, the recommended introduction duration of a product is output from the number of its associated barrages and the number of audience members who sent them, in combination with the interest degree. The duration is thus determined from several angles, products that attract more interest receive a longer recommended introduction, and the live broadcast experience of the audience improves.
Example 2
In another aspect, the present invention provides a computer storage medium storing a computer program which, when executed in a computer, causes the computer to perform the live broadcast data processing method based on character recognition described above.
The live broadcast data processing method based on character recognition specifically comprises the following steps:
determining a basic recognition frequency for character recognition of the barrages of the live broadcasting room according to the historical audience number of the live broadcasting room, the number of historical barrages in a set time period, and the number of historical barrage words;
acquiring the real-time audience number of the live broadcasting room, and entering the next step when it is determined, based on the real-time audience number, that the basic recognition frequency needs to be corrected;
acquiring the number of real-time barrages of the live broadcasting room in the set time period, and entering the next step when it is determined, based on the number of real-time barrages, that the basic recognition frequency needs to be corrected;
acquiring again, at least after a preset time, the audience number of the live broadcasting room, the number of barrages in the set time period, and the number of barrage words; evaluating the liveness of the live broadcasting room based on the audience number and the number of barrages; evaluating the real-time liveness of the live broadcasting room based on the real-time audience number and the number of real-time barrages; and entering the next step when it is determined, based on the liveness and the real-time liveness of the live broadcasting room, that the basic recognition frequency needs to be corrected;
constructing the correction amount from the liveness and the real-time liveness of the live broadcasting room, combined with the real-time barrage word number and the barrage word number of the live broadcasting room;
when it is determined based on the correction amount that the basic recognition frequency needs to be corrected, correcting the basic recognition frequency by the correction amount to obtain a corrected recognition frequency;
performing character recognition on the barrages of the live broadcasting room based on the corrected recognition frequency to obtain a recognition result, screening the interested audience members and the associated barrages of the product based on the recognition result, and evaluating the interestingness based on the number and proportion of interested audience members and the number of their associated barrages;
acquiring the number of associated barrages of the product and the number of audience members sending the associated barrages, outputting the recommended introduction duration of the product in combination with the interestingness, and outputting the recommended introduction content of the product based on the recognition result.
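The staged correction of the basic recognition frequency in the steps above can be sketched as a chain of gates, each of which either stops with the uncorrected frequency or passes on to the next check. The thresholds, the liveness measure (barrages per audience member), the form of the correction amount, and the assumption that a busier room calls for a higher frequency are all hypothetical; the names `Snapshot`, `liveness`, and `corrected_frequency` are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class Snapshot:
    audience: int   # audience number
    barrages: int   # barrages in the set time period
    words: int      # barrage words in the set time period

def liveness(s: Snapshot) -> float:
    # Hypothetical activity measure: barrages per audience member.
    return s.barrages / s.audience if s.audience else 0.0

def corrected_frequency(base_freq, realtime, later, *,
                        audience_limit=1000, barrage_limit=100,
                        liveness_limit=0.2, word_ref=10000.0,
                        min_correction=1.1, max_correction=3.0):
    # Gate 1: real-time audience number.
    if realtime.audience <= audience_limit:
        return base_freq
    # Gate 2: real-time barrages in the set time period.
    if realtime.barrages <= barrage_limit:
        return base_freq
    # Gate 3: liveness of the re-acquired snapshot and real-time liveness.
    rt_live, later_live = liveness(realtime), liveness(later)
    if max(rt_live, later_live) <= liveness_limit:
        return base_freq
    # Correction amount from both liveness values and both barrage word counts.
    activity = 0.5 * (rt_live + later_live) / liveness_limit
    word_load = 0.5 * (realtime.words + later.words) / word_ref
    correction = min(activity * word_load, max_correction)
    # Final gate: apply the correction only when it is large enough to matter.
    if correction <= min_correction:
        return base_freq
    return base_freq * correction
```

Ordering the cheap checks first means the second snapshot and the word counts only influence the result for rooms that already pass the audience and barrage gates.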
In this specification, the embodiments are described progressively: identical or similar parts may be cross-referenced between embodiments, and each embodiment focuses on its differences from the others. In particular, the apparatus, device, and non-volatile computer storage medium embodiments are described only briefly because they are substantially similar to the method embodiments; refer to the corresponding parts of the method embodiments for details.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The foregoing is merely one or more embodiments of the present description and is not intended to limit the present description. Various modifications and alterations to one or more embodiments of this description will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, or the like, which is within the spirit and principles of one or more embodiments of the present description, is intended to be included within the scope of the claims of the present description.

Claims (10)

1. A live broadcast data processing method based on character recognition, characterized by comprising the following steps:
determining a basic recognition frequency for character recognition of the barrages of the live broadcasting room according to the historical audience number of the live broadcasting room, the number of historical barrages in a set time period, and the number of historical barrage words;
constructing a correction amount from the real-time audience number of the live broadcasting room, the number of real-time barrages in the set time period, and the number of real-time barrage words; determining, based on the correction amount, whether the basic recognition frequency needs to be corrected; if so, correcting the basic recognition frequency by the correction amount to obtain a corrected recognition frequency; if not, taking the basic recognition frequency as the corrected recognition frequency;
performing character recognition on the barrages of the live broadcasting room based on the corrected recognition frequency to obtain a recognition result; screening the interested audience members and the associated barrages of the product based on the recognition result; evaluating the interestingness based on the number and proportion of interested audience members and the number of their associated barrages; determining, based on the interestingness, whether the recommended introduction duration of the product can be output; if so, entering the next step; if not, determining the recommended introduction duration of the product from a preset introduction duration;
acquiring the number of associated barrages of the product and the number of audience members sending the associated barrages, outputting the recommended introduction duration of the product in combination with the interestingness, and outputting the recommended introduction content of the product based on the recognition result.
2. The live broadcast data processing method based on character recognition according to claim 1, wherein the set time period is less than 2 minutes and is determined according to the daily average number of active audience members on the live broadcast platform where the live broadcasting room is located: the greater the daily average number of active audience members on the live broadcast platform, the shorter the set time period.
3. The live broadcast data processing method based on character recognition according to claim 1, wherein the basic recognition frequency of the character recognition is determined by the following steps:
S21, determining the maximum and the average of the historical audience number of the live broadcasting room from the historical audience number of the live broadcasting room; determining a historical audience evaluation value of the live broadcasting room based on that maximum and average; determining, based on the historical audience evaluation value, whether the live broadcasting room is a low-attention live broadcasting room; if so, taking a set recognition frequency as the basic recognition frequency of the live broadcasting room; if not, entering step S22;
S22, determining the maximum and the average of the number of historical barrages of the live broadcasting room in the set time period from the number of historical barrages of the live broadcasting room in the set time period; determining a historical barrage number evaluation value of the live broadcasting room based on that maximum and average; determining, based on the historical barrage number evaluation value, whether the live broadcasting room is a high-attention live broadcasting room; if so, entering step S23; if not, entering step S24;
S23, determining the maximum and the average of the historical barrage word number of the live broadcasting room in the set time period from the historical barrage word number of the live broadcasting room in the set time period; determining a historical barrage word number evaluation value of the live broadcasting room based on that maximum and average; determining, based on the historical barrage word number evaluation value, whether the live broadcasting room is a high-attention live broadcasting room; if so, taking a second set recognition frequency as the basic recognition frequency of the live broadcasting room; if not, entering step S24;
S24, determining the basic recognition frequency of the live broadcasting room based on the historical audience evaluation value, the historical barrage number evaluation value, and the historical barrage word number evaluation value of the live broadcasting room.
4. The live broadcast data processing method based on character recognition according to claim 3, wherein the live broadcasting room is determined to be a low-attention live broadcasting room when the historical audience evaluation value of the live broadcasting room is smaller than an audience evaluation set value.
5. The live broadcast data processing method based on character recognition according to claim 3, wherein the second set recognition frequency is greater than the set recognition frequency, and both the second set recognition frequency and the set recognition frequency increase with the number of products in the live broadcasting room.
6. The live broadcast data processing method based on character recognition according to claim 1, wherein the correction amount is determined by the following steps:
acquiring the real-time audience number of the live broadcasting room, and determining, based on the real-time audience number, whether the basic recognition frequency needs to be corrected; if so, entering the next step; if not, determining that the basic recognition frequency does not need to be corrected;
acquiring the number of real-time barrages of the live broadcasting room in the set time period, and determining, based on the number of real-time barrages, whether the basic recognition frequency needs to be corrected; if so, entering the next step; if not, determining that the basic recognition frequency does not need to be corrected;
acquiring again, at least after a preset time, the audience number of the live broadcasting room, the number of barrages in the set time period, and the number of barrage words; evaluating the liveness of the live broadcasting room based on the audience number and the number of barrages; evaluating the real-time liveness of the live broadcasting room based on the real-time audience number and the number of real-time barrages; determining, based on the liveness and the real-time liveness of the live broadcasting room, whether the basic recognition frequency needs to be corrected; if so, entering the next step; if not, determining that the basic recognition frequency does not need to be corrected;
constructing the correction amount from the liveness and the real-time liveness of the live broadcasting room, combined with the real-time barrage word number and the barrage word number of the live broadcasting room.
7. The live broadcast data processing method based on character recognition according to claim 1, wherein determining whether the basic recognition frequency needs to be corrected based on the liveness and the real-time liveness of the live broadcasting room comprises:
determining, based on the liveness of the live broadcasting room, whether the basic recognition frequency needs to be corrected; if so, determining that the basic recognition frequency needs to be corrected; if not, entering the next step;
constructing a comprehensive liveness of the live broadcasting room based on the liveness and the real-time liveness of the live broadcasting room, and determining, based on the comprehensive liveness, whether the basic recognition frequency needs to be corrected.
8. The live broadcast data processing method based on character recognition according to claim 1, wherein screening the interested audience members and the associated barrages of the product based on the recognition result specifically comprises:
determining keywords of the characters of the barrages of the live broadcasting room based on the recognition result, and screening the interested audience members and the associated barrages of the product based on the association between the keywords and the product, wherein the interested audience members of the product are the audience members who have sent more than a preset number of associated barrages.
9. The live broadcast data processing method based on character recognition according to claim 1, wherein the recommended introduction duration depends on the number of associated barrages of the product, the number of audience members sending the associated barrages, and the interestingness, and is obtained from a prediction model based on a neural network algorithm, wherein the larger the number of associated barrages of the product, the larger the number of audience members sending the associated barrages, and the higher the interestingness, the longer the recommended introduction duration of the product.
10. A computer storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the live broadcast data processing method based on character recognition according to any one of claims 1-9.
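As an illustration only, the cascade of steps S21 to S24 in claim 3 can be sketched in Python. The claims do not give the evaluation formula, the thresholds, or the S24 combination, so the peak/mean mix in `evaluation`, every numeric default, and the linear interpolation used for S24 are assumptions; the frequency unit is left abstract.

```python
from statistics import mean

def evaluation(history, w_max=0.3):
    # Hypothetical evaluation value: weighted mix of the maximum and the average.
    return w_max * max(history) + (1 - w_max) * mean(history)

def base_frequency(audience_hist, barrage_hist, word_hist, *,
                   low_audience=500, high_audience=5000,
                   high_barrages=300, high_words=3000,
                   set_freq=0.2, second_set_freq=2.0):
    a = evaluation(audience_hist)
    if a < low_audience:                      # S21: low-attention room
        return set_freq
    b = evaluation(barrage_hist)
    w = evaluation(word_hist)
    if b >= high_barrages and w >= high_words:  # S22 then S23: high-attention room
        return second_set_freq
    # S24 (assumed form): interpolate between the two set frequencies
    # using all three evaluation values, each capped at 1.
    level = mean([min(a / high_audience, 1.0),
                  min(b / high_barrages, 1.0),
                  min(w / high_words, 1.0)])
    return set_freq + (second_set_freq - set_freq) * level
```

Keeping `second_set_freq` above `set_freq` matches claim 5, and the interpolation keeps every S24 result between the two set frequencies.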
CN202310704406.7A 2023-06-14 2023-06-14 Live broadcast data processing method based on character recognition Active CN116582694B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310704406.7A CN116582694B (en) 2023-06-14 2023-06-14 Live broadcast data processing method based on character recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310704406.7A CN116582694B (en) 2023-06-14 2023-06-14 Live broadcast data processing method based on character recognition

Publications (2)

Publication Number Publication Date
CN116582694A CN116582694A (en) 2023-08-11
CN116582694B true CN116582694B (en) 2023-10-31

Family

ID=87541502

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310704406.7A Active CN116582694B (en) 2023-06-14 2023-06-14 Live broadcast data processing method based on character recognition

Country Status (1)

Country Link
CN (1) CN116582694B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507870A (en) * 2020-03-20 2020-08-07 威比网络科技(上海)有限公司 Lecture display method, system, equipment and storage medium based on online education
CN112765336A (en) * 2021-01-29 2021-05-07 中国平安人寿保险股份有限公司 Bullet screen management method and device, terminal equipment and storage medium
WO2021137978A1 (en) * 2019-12-31 2021-07-08 Sling Tv Llc Digital advertisement frequency correction
CN114417140A (en) * 2021-12-23 2022-04-29 广西壮族自治区公众信息产业有限公司 Live broadcast marketing method and system
WO2022105269A1 (en) * 2020-11-17 2022-05-27 北京达佳互联信息技术有限公司 Live broadcast video pushing method and device
CN115708354A (en) * 2021-08-18 2023-02-21 武汉斗鱼鱼乐网络科技有限公司 Information recommendation method and device, server and computer readable storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100162285A1 (en) * 2007-09-11 2010-06-24 Yossef Gerard Cohen Presence Detector and Method for Estimating an Audience

Also Published As

Publication number Publication date
CN116582694A (en) 2023-08-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant