CN116582694B - Live broadcast data processing method based on character recognition - Google Patents
Live broadcast data processing method based on character recognition
- Publication number
- CN116582694B
- Authority
- CN
- China
- Prior art keywords
- broadcasting room
- barrages
- live broadcasting
- historical
- live
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/21—Server components or server architectures
- H04N21/218—Source of audio or video content, e.g. local disk arrays
- H04N21/2187—Live feed
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
- G06V20/635—Overlay text, e.g. embedded captions in a TV program
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/478—Supplemental services, e.g. displaying phone caller identification, shopping application
- H04N21/4788—Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
Abstract
The invention provides a live broadcast data processing method based on character recognition, belonging to the technical field of data processing. The method comprises: determining a basic recognition frequency for character recognition of the live broadcasting room barrages according to the historical audience number of the live broadcasting room, the number of historical barrages in a set time period, and the number of historical barrage words; correcting the basic recognition frequency with a correction amount to obtain a corrected recognition frequency; performing character recognition on the barrages of the live broadcasting room at the corrected recognition frequency to obtain the interested audiences and associated barrages of a product; evaluating the interestingness based on the number and proportion of the interested audiences and the number of their associated barrages; and acquiring the number of associated barrages of the product and the number of audiences sending them, outputting the recommended introduction duration of the product in combination with the interestingness, and outputting the recommended introduction content of the product based on the recognition result, so that the recommendation and introduction of live broadcast products are targeted.
Description
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a live broadcast data processing method based on character recognition.
Background
In live webcasts, audiences often interact with the anchor through text. Once many viewers interact at the same time, however, the anchor cannot pick out the useful interaction information, so the audience's interaction needs cannot be met.
To address this problem, granted patent CN115174951B, "An unmanned live broadcast online analysis management system based on multidimensional feature capture", analyzes and responds to the text, voice, and picture information sent by users, which improves the comprehensiveness of information analysis and of replies to users. However, the following technical problems remain:
1. For a live broadcasting room with a real anchor, the prior art does not determine the detection frequency of the automatic text analysis module from the number of barrages, the number of viewers, and similar factors. If a room with few barrages or few viewers is analyzed at the same detection frequency as a room with many, the load on the server performing the text analysis increases unnecessarily.
2. The prior art does not determine the introduction duration and the introduction focus of different products from the text analysis results of the live broadcasting room. These results reveal how interested the audience is in each product and which aspects of the product it cares about. If the introduction duration cannot be extended for products with higher interest, and the introduction focus adjusted accordingly, the audience's interaction needs cannot be met.
Aiming at the technical problems, the invention provides a live broadcast data processing method based on character recognition.
Disclosure of Invention
In order to achieve the purpose of the invention, the invention adopts the following technical scheme:
According to one aspect of the invention, a live broadcast data processing method based on character recognition is provided.
The live broadcast data processing method based on character recognition comprises the following steps:
S11, determining a basic recognition frequency for character recognition of the live broadcasting room barrages according to the historical audience number of the live broadcasting room, the number of historical barrages in a set time period, and the number of historical barrage words;
S12, constructing a correction amount from the real-time audience number of the live broadcasting room, the number of real-time barrages in the set time period, and the number of real-time barrage words; determining, based on the correction amount, whether the basic recognition frequency needs to be corrected; if so, correcting the basic recognition frequency with the correction amount to obtain a corrected recognition frequency, and if not, taking the basic recognition frequency as the corrected recognition frequency;
S13, performing character recognition on the barrages of the live broadcasting room at the corrected recognition frequency to obtain a recognition result; screening the interested audiences and associated barrages of a product based on the recognition result; evaluating the interestingness based on the number and proportion of the interested audiences and the number of their associated barrages; and determining, based on the interestingness, whether a recommended introduction duration of the product can be output; if so, proceeding to step S14, and if not, determining the recommended introduction duration of the product from a preset introduction duration;
S14, acquiring the number of associated barrages of the product and the number of audiences sending the associated barrages, outputting the recommended introduction duration of the product in combination with the interestingness, and outputting the recommended introduction content of the product based on the recognition result.
In a further technical scheme, the set time period is less than 2 minutes and is determined according to the daily average number of active viewers on the live broadcasting platform hosting the live broadcasting room: the more daily active viewers the platform has, the shorter the set time period.
In a further technical scheme, when the historical audience evaluation value of the live broadcasting room is smaller than the audience evaluation set value, the live broadcasting room is determined to be a low-attention live broadcasting room.
In a further technical scheme, the second set recognition frequency is greater than the set recognition frequency, and both are determined dynamically according to the number of products in the live broadcasting room: the more products in the live broadcasting room, the greater the second set recognition frequency and the set recognition frequency.
In a further technical scheme, determining whether the basic recognition frequency needs to be corrected based on the liveness and the real-time liveness of the live broadcasting room specifically comprises:
determining, based on the liveness of the live broadcasting room, whether the basic recognition frequency needs to be corrected; if so, determining that the basic recognition frequency needs to be corrected, and if not, proceeding to the next step;
constructing a comprehensive liveness of the live broadcasting room from the liveness and the real-time liveness, and determining, based on the comprehensive liveness, whether the basic recognition frequency needs to be corrected.
In a further technical scheme, screening the interested audiences and associated barrages of the product based on the recognition result specifically comprises:
determining the keywords of the barrage text of the live broadcasting room based on the recognition result, and screening the interested audiences and associated barrages of the product based on the association between the keywords and the product, wherein the interested audiences of the product are the viewers who have sent more than a preset number of associated barrages.
In a further technical scheme, the recommended introduction duration depends on the number of associated barrages of the product, the number of audiences sending the associated barrages, and the interestingness, and is produced by a prediction model based on a neural network algorithm: the more associated barrages the product has, the more viewers send them, and the higher the interestingness, the longer the recommended introduction duration of the product.
In another aspect, the present invention provides a computer storage medium on which a computer program is stored, which, when executed in a computer, causes the computer to perform the live broadcast data processing method based on character recognition described above.
The invention has the beneficial effects that:
The basic recognition frequency of character recognition is determined from the historical audience number, the number of historical barrages in a set time period, and the number of historical barrage words, so that the basic recognition frequency is determined dynamically from historical data, which greatly reduces the load on the server while ensuring the timeliness of analysis.
The correction amount is constructed from the real-time audience number of the live broadcasting room, the number of real-time barrages in the set time period, and the number of real-time barrage words, so that the recognition frequency is also determined from real-time data; this enables dynamic real-time adjustment of the recognition frequency and further ensures the timeliness of analysis.
The interestingness is evaluated based on the number and proportion of the interested audiences and the number of their associated barrages, so that the users' interest in different products is determined from the analysis of the barrage text; this lays a foundation for differentiated determination of the recommended introduction duration and for the output of the recommended introduction content.
The recommended introduction duration of a product is output from the number of associated barrages of the product and the number of audiences sending them, in combination with the interestingness, so that the duration is determined from multiple angles; products in which users show higher interest receive a longer recommended duration, which improves the audience's live broadcast experience.
Additional features and advantages will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
In order to make the above objects, features and advantages of the present invention more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
The above and other features and advantages of the present invention will become more apparent from the following detailed description of exemplary embodiments with reference to the attached drawings:
FIG. 1 is a flowchart of a live broadcast data processing method based on character recognition according to embodiment 1;
FIG. 2 is a flowchart of the specific steps of determining the basic recognition frequency of character recognition according to embodiment 1;
FIG. 3 is a flowchart of the specific steps of correction amount determination according to embodiment 1;
FIG. 4 is a flowchart of the specific steps of interestingness determination according to embodiment 1.
Detailed Description
In order to make the technical solutions in the present specification better understood by those skilled in the art, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present disclosure.
The technical problems are summarized as follows:
The prior art does not determine the detection frequency of text analysis from the number of barrages and the number of viewers in the live broadcasting room, which increases the load on the server performing the analysis; nor does it determine the introduction duration and introduction focus of different products from the text analysis results of the live broadcasting room, so the audience's interaction needs cannot be met.
Example 1
In order to solve the above problems, according to one aspect of the present invention, as shown in FIG. 1, a live broadcast data processing method based on character recognition is provided, which specifically comprises:
S11, determining a basic recognition frequency for character recognition of the live broadcasting room barrages according to the historical audience number of the live broadcasting room, the number of historical barrages in a set time period, and the number of historical barrage words;
The set time period is less than 2 minutes and is determined according to the daily average number of active viewers on the live broadcasting platform hosting the live broadcasting room: the more daily active viewers the platform has, the shorter the set time period.
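The relationship between platform size and the set time period can be sketched as follows. The patent only states that the period is under 2 minutes and shrinks as the daily active audience grows; the hyperbolic decay, the function name `set_time_period_seconds`, and the parameters `max_period`, `min_period`, and `scale` are illustrative assumptions.

```python
def set_time_period_seconds(daily_active_audience: int,
                            max_period: float = 110.0,
                            min_period: float = 10.0,
                            scale: float = 1_000_000.0) -> float:
    """Return a window length (seconds) that stays under 120 s and shrinks
    as the platform's daily average active audience grows.

    The decay shape and all default values are assumptions for illustration.
    """
    if daily_active_audience < 0:
        raise ValueError("daily_active_audience must be non-negative")
    # Hyperbolic decay: max_period at zero viewers, approaching min_period
    # as the audience becomes very large.
    return min_period + (max_period - min_period) / (1.0 + daily_active_audience / scale)
```

Any strictly decreasing mapping bounded below 120 seconds would satisfy the description equally well.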
Specifically, as shown in FIG. 2, the basic recognition frequency of character recognition is determined by the following steps:
S21, determining the maximum and the average of the historical audience number of the live broadcasting room; determining the historical audience evaluation value of the live broadcasting room from that maximum and average; and determining, based on the historical audience evaluation value, whether the live broadcasting room is a low-attention live broadcasting room; if so, taking the set recognition frequency as the basic recognition frequency of the live broadcasting room, and if not, proceeding to step S22;
S22, determining the maximum and the average of the number of historical barrages of the live broadcasting room in the set time period; determining the historical barrage number evaluation value of the live broadcasting room from that maximum and average; and determining, based on the historical barrage number evaluation value, whether the live broadcasting room is a high-attention live broadcasting room; if so, proceeding to step S23, and if not, proceeding to step S24;
S23, determining the maximum and the average of the number of historical barrage words of the live broadcasting room in the set time period; determining the historical barrage word number evaluation value of the live broadcasting room from that maximum and average; and determining, based on the historical barrage word number evaluation value, whether the live broadcasting room is a high-attention live broadcasting room; if so, taking the second set recognition frequency as the basic recognition frequency of the live broadcasting room, and if not, proceeding to step S24;
S24, determining the basic recognition frequency of the live broadcasting room based on the historical audience evaluation value, the historical barrage number evaluation value, and the historical barrage word number evaluation value of the live broadcasting room.
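Steps S21 to S24 can be sketched as a short decision cascade. The patent does not give the evaluation formula, the thresholds, or the units of the frequency, so the peak/mean blend in `evaluation`, every threshold, and the interpolation used for S24 are assumptions chosen only to make the control flow concrete.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class History:
    audience_counts: list[int]  # historical viewer counts (e.g. one month)
    barrage_counts: list[int]   # barrages per set time period
    barrage_words: list[int]    # barrage word count per set time period


def evaluation(samples: list[int], peak_weight: float = 0.4) -> float:
    """Blend maximum and mean into one evaluation value (assumed weighting)."""
    return peak_weight * max(samples) + (1.0 - peak_weight) * mean(samples)


def base_frequency(h: History, *, set_freq: float = 1.0, second_set_freq: float = 6.0,
                   audience_low: float = 100.0, audience_high: float = 1000.0,
                   barrage_high: float = 200.0, words_high: float = 2000.0) -> float:
    """Return the basic recognition frequency (assumed unit: runs per minute)."""
    a = evaluation(h.audience_counts)
    if a < audience_low:                 # S21: low-attention room
        return set_freq
    b = evaluation(h.barrage_counts)
    w = evaluation(h.barrage_words)
    if b > barrage_high:                 # S22: high attention by barrage count
        if w > words_high:               # S23: confirmed by barrage word count
            return second_set_freq
    # S24: combine the three evaluation values (assumed linear interpolation).
    score = (min(a / audience_high, 1.0)
             + min(b / barrage_high, 1.0)
             + min(w / words_high, 1.0)) / 3.0
    return set_freq + (second_set_freq - set_freq) * score
```

With these assumed defaults, a quiet room falls through at S21 and gets `set_freq`, a room that is busy on both barrage measures gets `second_set_freq`, and everything in between is interpolated.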
In this embodiment, combining the historical audience evaluation value, the historical barrage number evaluation value, and the historical barrage word number evaluation value allows the basic recognition frequency to be determined more comprehensively, which helps ensure recognition accuracy.
In addition, the historical audience number, the number of historical barrages, and the number of historical barrage words are generally taken from the live broadcasting room's historical data within one month.
It will be appreciated that when the historical audience evaluation value of the live broadcasting room is smaller than the audience evaluation set value, the live broadcasting room is determined to be a low-attention live broadcasting room.
It should be further noted that the second set recognition frequency is greater than the set recognition frequency, and that both are determined dynamically according to the number of products in the live broadcasting room: the more products in the live broadcasting room, the greater the second set recognition frequency and the set recognition frequency.
In this embodiment, the basic recognition frequency of character recognition is determined from the historical audience number of the live broadcasting room, the number of historical barrages in the set time period, and the number of historical barrage words, so that the basic recognition frequency is determined dynamically from historical data, which greatly reduces the load on the server while ensuring the timeliness of analysis.
S12, constructing a correction amount from the real-time audience number of the live broadcasting room, the number of real-time barrages in the set time period, and the number of real-time barrage words; determining, based on the correction amount, whether the basic recognition frequency needs to be corrected; if so, correcting the basic recognition frequency with the correction amount to obtain a corrected recognition frequency, and if not, taking the basic recognition frequency as the corrected recognition frequency;
Specifically, as shown in FIG. 3, the correction amount is determined by the following steps:
S31, acquiring the real-time audience number of the live broadcasting room and determining, based on the real-time audience number, whether the basic recognition frequency needs to be corrected; if so, proceeding to step S32, and if not, determining that the basic recognition frequency does not need to be corrected;
S32, acquiring the number of real-time barrages of the live broadcasting room in the set time period and determining, based on the number of real-time barrages, whether the basic recognition frequency needs to be corrected; if so, proceeding to step S33, and if not, determining that the basic recognition frequency does not need to be corrected;
S33, after at least a preset time, acquiring again the audience number of the live broadcasting room, the number of barrages in the set time period, and the number of barrage words in the set time period; evaluating the liveness of the live broadcasting room based on the re-acquired audience number and number of barrages; evaluating the real-time liveness of the live broadcasting room based on the real-time audience number and the number of real-time barrages; and determining, based on the liveness and the real-time liveness, whether the basic recognition frequency needs to be corrected; if so, proceeding to step S34, and if not, determining that the basic recognition frequency does not need to be corrected;
S34, constructing the correction amount from the liveness and the real-time liveness of the live broadcasting room, in combination with the number of real-time barrage words and the re-acquired number of barrage words of the live broadcasting room.
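A minimal sketch of steps S31 to S34 follows. It assumes a historical baseline snapshot to compare against, treats the correction amount as a multiplicative factor, and uses a single relative tolerance for every check; none of these choices, nor the names `Snapshot`, `liveness`, and `correction_amount`, come from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    audience: int  # viewers in the room
    barrages: int  # barrages in the set time period
    words: int     # barrage words in the set time period


def ratio(now: float, ref: float) -> float:
    """Value relative to a reference; 1.0 when the reference is empty."""
    return now / ref if ref > 0 else 1.0


def liveness(s: Snapshot, baseline: Snapshot) -> float:
    """Assumed liveness: mean of audience and barrage ratios to the baseline."""
    return 0.5 * ratio(s.audience, baseline.audience) + 0.5 * ratio(s.barrages, baseline.barrages)


def correction_amount(baseline: Snapshot, first: Snapshot, second: Snapshot,
                      tolerance: float = 0.2) -> Optional[float]:
    """Return a multiplicative correction, or None when no correction is needed."""
    def deviates(r: float) -> bool:
        return abs(r - 1.0) > tolerance

    if not deviates(ratio(first.audience, baseline.audience)):   # S31
        return None
    if not deviates(ratio(first.barrages, baseline.barrages)):   # S32
        return None
    real_time_liveness = liveness(first, baseline)               # S33, first sample
    later_liveness = liveness(second, baseline)                  # S33, re-acquired sample
    if not (deviates(real_time_liveness) and deviates(later_liveness)):
        return None
    # S34: combine both liveness values with both word-count ratios.
    mean_liveness = (real_time_liveness + later_liveness) / 2.0
    mean_words = (ratio(first.words, baseline.words) + ratio(second.words, baseline.words)) / 2.0
    return 0.5 * mean_liveness + 0.5 * mean_words


def corrected_frequency(base: float, correction: Optional[float]) -> float:
    """Apply the correction amount, or keep the basic frequency unchanged."""
    return base if correction is None else base * correction
```

Requiring both samples to deviate means a momentary spike in the first sample alone leaves the basic recognition frequency untouched, which is one way to realize the repeated acquisition described in S33.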
In this embodiment, acquiring the data more than once prevents a single disturbed sample from degrading the correction amount, which improves the accuracy with which the correction amount is constructed.
It can be appreciated that determining whether the basic recognition frequency needs to be corrected based on the liveness and the real-time liveness of the live broadcasting room specifically comprises:
determining, based on the liveness of the live broadcasting room, whether the basic recognition frequency needs to be corrected; if so, determining that the basic recognition frequency needs to be corrected, and if not, proceeding to the next step;
constructing a comprehensive liveness of the live broadcasting room from the liveness and the real-time liveness, and determining, based on the comprehensive liveness, whether the basic recognition frequency needs to be corrected.
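The two-stage check can be sketched as below. The patent does not say how the comprehensive liveness is formed or which limits apply, so the weighted average and both limits are assumptions, and liveness values are taken to be normalized so that 1.0 means "as active as usual".

```python
def needs_correction(liveness: float, real_time_liveness: float, *,
                     liveness_limit: float = 1.5,
                     combined_limit: float = 1.3,
                     weight: float = 0.5) -> bool:
    """Decide whether the basic recognition frequency should be corrected.

    Stage 1 looks at the liveness alone; stage 2 falls back to a
    comprehensive liveness (assumed here to be a weighted average).
    """
    if liveness > liveness_limit:          # stage 1: liveness alone is decisive
        return True
    comprehensive = weight * liveness + (1.0 - weight) * real_time_liveness
    return comprehensive > combined_limit  # stage 2: comprehensive liveness
```

This sketch only detects upward deviations; a symmetric lower limit could be added in the same way if the frequency should also be reduced for unusually quiet rooms.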
In this embodiment, the correction amount is constructed from the real-time audience number of the live broadcasting room, the number of real-time barrages in the set time period, and the number of real-time barrage words, so that the recognition frequency is also determined from real-time data; this enables dynamic real-time adjustment of the recognition frequency and further ensures the timeliness of analysis.
S13, performing character recognition on the barrages of the live broadcasting room at the corrected recognition frequency to obtain a recognition result; screening the interested audiences and associated barrages of a product based on the recognition result; evaluating the interestingness based on the number and proportion of the interested audiences and the number of their associated barrages; and determining, based on the interestingness, whether a recommended introduction duration of the product can be output; if so, proceeding to step S14, and if not, determining the recommended introduction duration of the product from a preset introduction duration;
Specifically, screening the interested audiences and associated barrages of the product based on the recognition result comprises:
determining the keywords of the barrage text of the live broadcasting room based on the recognition result, and screening the interested audiences and associated barrages of the product based on the association between the keywords and the product, wherein the interested audiences of the product are the viewers who have sent more than a preset number of associated barrages.
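The screening step can be illustrated with a simple substring match. The patent does not specify how keywords are extracted or matched, so treating a barrage as associated when it contains any product keyword, the `(viewer_id, text)` input shape, and the name `screen_barrages` are all assumptions.

```python
from collections import Counter


def screen_barrages(barrages: list[tuple[str, str]],
                    product_keywords: set[str],
                    preset_number: int = 1) -> tuple[list[tuple[str, str]], set[str]]:
    """Return (associated barrages, interested viewers) for one product.

    `barrages` holds (viewer_id, recognized_text) pairs; `product_keywords`
    is assumed to be lower-case. A viewer counts as interested after sending
    more than `preset_number` associated barrages.
    """
    associated = [(viewer, text) for viewer, text in barrages
                  if any(keyword in text.lower() for keyword in product_keywords)]
    per_viewer = Counter(viewer for viewer, _ in associated)
    interested = {viewer for viewer, count in per_viewer.items() if count > preset_number}
    return associated, interested
```

For Chinese barrage text a word segmenter or a semantic similarity model would likely replace the substring test, but the counting logic would stay the same.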
It should be further noted that, as shown in FIG. 4, the interestingness is determined by the following steps:
S41, acquiring the number of associated barrages and determining, based on that number, whether the product is one the audience is interested in; if so, proceeding to step S42, and if not, evaluating the interestingness of the product based on the number of associated barrages and a set interestingness;
S42, taking the number of viewers who sent associated barrages as the sending audience number, and determining, based on the sending audience number, whether the product is one the audience is interested in; if so, proceeding to step S43, and if not, evaluating the interestingness of the product based on the sending audience number and the set interestingness;
S43, taking the viewers who sent more than the preset number of associated barrages as the interested audience, and evaluating the interestingness of the interested audience based on the number of interested viewers, the number of associated barrages they sent, and the proportion of interested viewers in the live broadcasting room;
S44, evaluating the interestingness of the sending audience based on the number of associated barrages, the sending audience number, and the ratio of the sending audience number to the audience number of the live broadcasting room, and evaluating the interestingness of the product based on the interestingness of the interested audience and the interestingness of the sending audience.
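Steps S41 to S44 can be sketched as a gated score in the range 0 to 1. The gates `min_assoc` and `min_senders`, the `set_interest` value, and every weight below are assumptions; the patent names the inputs of each step but not the formulas.

```python
def interestingness(assoc_count: int, sender_count: int,
                    interested_count: int, interested_assoc_count: int,
                    room_audience: int, *,
                    min_assoc: int = 20, min_senders: int = 10,
                    set_interest: float = 0.2) -> float:
    """Return an interestingness score in [0, 1] for one product."""
    room = max(room_audience, 1)
    if assoc_count < min_assoc:          # S41: too few associated barrages
        return set_interest * assoc_count / min_assoc
    if sender_count < min_senders:       # S42: too few distinct senders
        return set_interest * sender_count / min_senders
    # S43: interested audience (share of room, share of associated barrages).
    interested_score = (0.5 * min(interested_count / room, 1.0)
                        + 0.5 * min(interested_assoc_count / assoc_count, 1.0))
    # S44: sending audience (share of room, barrage volume relative to room).
    sender_score = (0.5 * min(sender_count / room, 1.0)
                    + 0.5 * min(assoc_count / room, 1.0))
    blended = 0.6 * interested_score + 0.4 * sender_score
    return set_interest + (1.0 - set_interest) * blended
```

Products that fail either gate score below `set_interest`, so a caller can compare the result with `set_interest` to decide between step S14 and the preset introduction duration.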
In this embodiment, the interestingness is evaluated based on the number and proportion of the interested audiences and the number of their associated barrages, so that the users' interest in different products is determined from the analysis of the barrage text; this lays a foundation for differentiated determination of the recommended introduction duration and for the output of the recommended introduction content.
S14, acquiring the number of associated barrages of the product and the number of audiences sending the associated barrages, outputting the recommended introduction duration of the product in combination with the interestingness, and outputting the recommended introduction content of the product based on the recognition result.
The recommended introduction duration depends on the number of associated barrages of the product, the number of audiences sending the associated barrages, and the interestingness, and is produced by a prediction model based on a neural network algorithm: the more associated barrages the product has, the more viewers send them, and the higher the interestingness, the longer the recommended introduction duration of the product.
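The monotonic behaviour described above can be illustrated with a tiny feed-forward network whose weights are all non-negative, which guarantees that the output never decreases when any input increases. This is a stand-in rather than the patent's model: the weights `W1`, `B1`, and `W2` are hand-picked placeholders, not trained values, and the input scaling, `preset`, `max_extra`, and `interest_gate` are assumptions.

```python
import math

# Placeholder weights (assumed, untrained). Non-negative weights combined
# with ReLU and tanh keep the output monotonic in every input.
W1 = [[0.8, 0.3, 0.5],
      [0.2, 0.9, 0.6]]
B1 = [0.0, -0.2]
W2 = [0.7, 0.6]


def recommended_duration(assoc_count: int, sender_count: int, interest: float, *,
                         preset: float = 60.0, max_extra: float = 240.0,
                         interest_gate: float = 0.3) -> float:
    """Return a recommended introduction duration in seconds."""
    if interest < interest_gate:
        # S13 fallback: interest too low, use the preset introduction duration.
        return preset
    # Compress the counts so that very busy rooms do not saturate the model.
    x = [math.log1p(assoc_count) / 10.0, math.log1p(sender_count) / 10.0, interest]
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, B1)]
    z = sum(w * h for w, h in zip(W2, hidden))
    return preset + max_extra * math.tanh(z)
```

In practice the weights would be fitted to past broadcasts; constraining them to be non-negative is one simple way to preserve the "more barrages, more senders, higher interest, longer duration" property after training.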
In this embodiment, the recommended introduction duration of a product is output from the number of associated barrages of the product and the number of audiences sending them, in combination with the interestingness, so that the duration is determined from multiple angles; products in which users show higher interest receive a longer recommended duration, which improves the audience's live broadcast experience.
Example 2
In another aspect, the present invention provides a computer storage medium having a computer program stored thereon which, when executed in a computer, causes the computer to perform the live broadcast data processing method based on character recognition described above.
The live broadcast data processing method based on character recognition specifically comprises the following steps:
determining a basic recognition frequency for character recognition of barrages in the live broadcasting room according to the historical audience count of the live broadcasting room, the number of historical barrages in a set time period, and the number of historical barrage words;
acquiring the real-time audience count of the live broadcasting room, and entering the next step when it is determined, based on the real-time audience count, that the basic recognition frequency needs to be corrected;
acquiring the number of real-time barrages of the live broadcasting room in the set time period, and entering the next step when it is determined, based on the number of real-time barrages, that the basic recognition frequency needs to be corrected;
after at least a preset time, acquiring again the audience count of the live broadcasting room and the number of barrages and the number of barrage words in the set time period; evaluating the liveness of the live broadcasting room based on the re-acquired audience count and number of barrages; evaluating the real-time liveness of the live broadcasting room based on the real-time audience count and the number of real-time barrages; and entering the next step when it is determined, based on the liveness and the real-time liveness of the live broadcasting room, that the basic recognition frequency needs to be corrected;
constructing the correction amount from the liveness and the real-time liveness of the live broadcasting room in combination with the number of real-time barrage words and the number of barrage words of the live broadcasting room;
when it is determined based on the correction amount that the basic recognition frequency needs to be corrected, correcting the basic recognition frequency by the correction amount to obtain a corrected recognition frequency;
performing character recognition on the barrages of the live broadcasting room based on the corrected recognition frequency to obtain a recognition result, screening the interested audience and the associated barrages of a product based on the recognition result, and evaluating the interestingness based on the number and proportion of interested audience members and the number of their associated barrages;
acquiring the number of associated barrages of the product and the number of audience members sending the associated barrages, outputting the recommended introduction duration of the product in combination with the interestingness, and outputting the recommended introduction content of the product based on the recognition result.
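The staged correction of the recognition frequency in the steps above can be sketched as a chain of gates. The text does not say how each gate decides, so this sketch makes several assumptions that should not be read as the method itself: each gate compares a real-time value with a historical baseline using a relative `tolerance`, liveness is taken as barrages per audience member, the correction amount is an equal-weight blend of liveness and word-count ratios, and the result is clamped to between 0.25 and 4 times the basic frequency. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    viewers: int        # audience count
    barrages: int       # barrages in the set time period
    barrage_words: int  # barrage words in the set time period


def liveness(s: Snapshot) -> float:
    """Assumed liveness measure: barrages per audience member."""
    return s.barrages / s.viewers if s.viewers > 0 else 0.0


def deviates(value: float, baseline: float, tolerance: float) -> bool:
    """True when value differs from a positive baseline by more than tolerance."""
    return baseline > 0 and abs(value - baseline) / baseline > tolerance


def corrected_frequency(base_freq, history, realtime, later,
                        tolerance=0.2, min_amount=0.1):
    """Return the corrected recognition frequency, or base_freq if any gate fails.

    history:  baseline snapshot used to derive base_freq (assumption)
    realtime: snapshot taken now
    later:    snapshot re-acquired after the preset time
    """
    # Gate 1: real-time audience count.
    if not deviates(realtime.viewers, history.viewers, tolerance):
        return base_freq
    # Gate 2: real-time barrage count in the set time period.
    if not deviates(realtime.barrages, history.barrages, tolerance):
        return base_freq
    # Gate 3: liveness (re-acquired) together with real-time liveness.
    hist_live = liveness(history)
    mean_live = (liveness(later) + liveness(realtime)) / 2
    if not deviates(mean_live, hist_live, tolerance):
        return base_freq
    if history.barrage_words <= 0:
        return base_freq
    # Correction amount from both liveness values and both word counts.
    mean_words = (later.barrage_words + realtime.barrage_words) / 2
    amount = (0.5 * (mean_live / hist_live - 1)
              + 0.5 * (mean_words / history.barrage_words - 1))
    # Final gate: a negligible correction amount leaves the frequency unchanged.
    if abs(amount) < min_amount:
        return base_freq
    factor = min(max(1 + amount, 0.25), 4.0)
    return base_freq * factor
```

Ordering the gates from cheapest to most expensive means that a quiet room exits after a single comparison, which appears to be the motivation for the staged checks in the text.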
In this specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments may be understood by reference to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus, device, and non-volatile computer storage medium embodiments are described relatively briefly because they are substantially similar to the method embodiments; for the relevant details, refer to the description of the method embodiments.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The foregoing is merely one or more embodiments of the present description and is not intended to limit the present description. Various modifications and alterations to one or more embodiments of this description will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, or the like, which is within the spirit and principles of one or more embodiments of the present description, is intended to be included within the scope of the claims of the present description.
Claims (10)
1. A live broadcast data processing method based on character recognition, characterized by comprising the following steps:
determining a basic recognition frequency for character recognition of barrages in the live broadcasting room according to the historical audience count of the live broadcasting room, the number of historical barrages in a set time period, and the number of historical barrage words;
constructing a correction amount from the real-time audience count of the live broadcasting room, the number of real-time barrages in the set time period, and the number of real-time barrage words; determining whether the basic recognition frequency needs to be corrected based on the correction amount; if so, correcting the basic recognition frequency by the correction amount to obtain a corrected recognition frequency; and if not, taking the basic recognition frequency as the corrected recognition frequency;
performing character recognition on the barrages of the live broadcasting room based on the corrected recognition frequency to obtain a recognition result; screening the interested audience and the associated barrages of a product based on the recognition result; evaluating the interestingness based on the number and proportion of interested audience members and the number of their associated barrages; determining, based on the interestingness, whether the recommended introduction duration of the product can be output; if so, entering the next step; and if not, determining the recommended introduction duration of the product based on a preset introduction duration;
acquiring the number of associated barrages of the product and the number of audience members sending the associated barrages, outputting the recommended introduction duration of the product in combination with the interestingness, and outputting the recommended introduction content of the product based on the recognition result.
2. The live broadcast data processing method based on character recognition according to claim 1, wherein the set time period is less than 2 minutes and is determined according to the daily average number of active audience members on the live broadcast platform where the live broadcasting room is located, the set time period being shorter the greater that daily average number is.
3. The live broadcast data processing method based on character recognition according to claim 1, wherein the basic recognition frequency of the character recognition is determined by the following steps:
S21, determining the maximum and the average of the historical audience count of the live broadcasting room from the historical audience count of the live broadcasting room; determining a historical audience evaluation value of the live broadcasting room based on that maximum and average; determining, based on the historical audience evaluation value, whether the live broadcasting room is a low-attention live broadcasting room; if so, taking a set recognition frequency as the basic recognition frequency of the live broadcasting room; and if not, entering step S22;
S22, determining the maximum and the average of the number of historical barrages of the live broadcasting room in the set time period from the number of historical barrages of the live broadcasting room in the set time period; determining a historical barrage number evaluation value of the live broadcasting room based on that maximum and average; determining, based on the historical barrage number evaluation value, whether the live broadcasting room is a high-attention live broadcasting room; if so, entering step S23; and if not, entering step S24;
S23, determining the maximum and the average of the number of historical barrage words of the live broadcasting room in the set time period from the number of historical barrage words of the live broadcasting room in the set time period; determining a historical barrage word number evaluation value of the live broadcasting room based on that maximum and average; determining, based on the historical barrage word number evaluation value, whether the live broadcasting room is a high-attention live broadcasting room; if so, taking a second set recognition frequency as the basic recognition frequency of the live broadcasting room; and if not, entering step S24;
S24, determining the basic recognition frequency of the live broadcasting room based on the historical audience evaluation value, the historical barrage number evaluation value, and the historical barrage word number evaluation value of the live broadcasting room.
4. The live broadcast data processing method based on character recognition according to claim 3, wherein the live broadcasting room is determined to be a low-attention live broadcasting room when its historical audience evaluation value is smaller than an audience evaluation set value.
5. The live broadcast data processing method based on character recognition according to claim 3, wherein the second set recognition frequency is greater than the set recognition frequency, and both the second set recognition frequency and the set recognition frequency are greater the more products there are in the live broadcasting room.
6. The live broadcast data processing method based on character recognition according to claim 1, wherein the correction amount is determined by the following steps:
acquiring the real-time audience count of the live broadcasting room, and determining whether the basic recognition frequency needs to be corrected based on the real-time audience count; if so, entering the next step; and if not, determining that the basic recognition frequency does not need to be corrected;
acquiring the number of real-time barrages of the live broadcasting room in the set time period, and determining whether the basic recognition frequency needs to be corrected based on the number of real-time barrages; if so, entering the next step; and if not, determining that the basic recognition frequency does not need to be corrected;
after at least a preset time, acquiring again the audience count of the live broadcasting room and the number of barrages and the number of barrage words in the set time period; evaluating the liveness of the live broadcasting room based on the re-acquired audience count and number of barrages; evaluating the real-time liveness of the live broadcasting room based on the real-time audience count and the number of real-time barrages; determining whether the basic recognition frequency needs to be corrected based on the liveness and the real-time liveness of the live broadcasting room; if so, entering the next step; and if not, determining that the basic recognition frequency does not need to be corrected;
constructing the correction amount from the liveness and the real-time liveness of the live broadcasting room in combination with the number of real-time barrage words and the number of barrage words of the live broadcasting room.
7. The live broadcast data processing method based on character recognition according to claim 1, wherein determining whether the basic recognition frequency needs to be corrected based on the liveness and the real-time liveness of the live broadcasting room comprises:
determining whether the basic recognition frequency needs to be corrected based on the liveness of the live broadcasting room; if so, determining that the basic recognition frequency needs to be corrected; and if not, entering the next step;
constructing a comprehensive liveness of the live broadcasting room based on the liveness and the real-time liveness of the live broadcasting room, and determining whether the basic recognition frequency needs to be corrected based on the comprehensive liveness.
8. The live broadcast data processing method based on character recognition according to claim 1, wherein screening the interested audience and the associated barrages of the product based on the recognition result specifically comprises:
determining keywords in the text of the barrages of the live broadcasting room based on the recognition result, and screening the interested audience and the associated barrages of the product based on the association between the keywords and the product, wherein the interested audience of the product consists of audience members who have sent more than a preset number of associated barrages.
9. The live broadcast data processing method based on character recognition according to claim 1, wherein the recommended introduction duration depends on the number of associated barrages of the product, the number of audience members sending the associated barrages, and the interestingness, and is specifically produced by a prediction model based on a neural network algorithm, wherein the larger the number of associated barrages of the product, the larger the number of audience members sending the associated barrages, and the higher the interestingness, the longer the recommended introduction duration of the product.
10. A computer storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the live broadcast data processing method based on character recognition according to any one of claims 1-9.
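For illustration only, the decision cascade of steps S21 to S24 in claim 3 might be sketched as below. The claims do not disclose how an evaluation value is formed from a maximum and an average, what the thresholds are, or how S24 combines the three evaluation values, so the blend weight `max_weight`, every threshold, both preset frequencies, and the S24 interpolation are assumptions, as are all names.

```python
from statistics import mean


def evaluation_value(samples, max_weight=0.4):
    """Assumed blend of the maximum and the average of historical samples."""
    return max_weight * max(samples) + (1 - max_weight) * mean(samples)


def base_recognition_frequency(viewers, barrages, words, *,
                               set_freq=0.2, second_set_freq=1.0,
                               viewer_set_value=500, barrage_high=300, word_high=3000):
    """viewers, barrages, words: historical samples per set time period."""
    viewer_eval = evaluation_value(viewers)
    # S21: a low-attention room gets the set recognition frequency.
    if viewer_eval < viewer_set_value:
        return set_freq
    barrage_eval = evaluation_value(barrages)
    word_eval = evaluation_value(words)
    # S22: high attention by barrage count leads to S23, otherwise to S24.
    if barrage_eval >= barrage_high:
        # S23: high attention by word count too gives the second set frequency.
        if word_eval >= word_high:
            return second_set_freq
    # S24: interpolate between the two preset frequencies (hypothetical rule).
    closeness = mean([
        min(viewer_eval / (2 * viewer_set_value), 1.0),
        min(barrage_eval / barrage_high, 1.0),
        min(word_eval / word_high, 1.0),
    ])
    return set_freq + (second_set_freq - set_freq) * closeness
```

Under these assumptions the S24 result always lies between the two preset frequencies, which is consistent with claim 5 requiring the second set recognition frequency to be the larger of the two.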
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310704406.7A CN116582694B (en) | 2023-06-14 | 2023-06-14 | Live broadcast data processing method based on character recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310704406.7A CN116582694B (en) | 2023-06-14 | 2023-06-14 | Live broadcast data processing method based on character recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116582694A (en) | 2023-08-11 |
CN116582694B (en) | 2023-10-31 |
Family
ID=87541502
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310704406.7A Active CN116582694B (en) | 2023-06-14 | 2023-06-14 | Live broadcast data processing method based on character recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116582694B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111507870A (en) * | 2020-03-20 | 2020-08-07 | 威比网络科技(上海)有限公司 | Lecture display method, system, equipment and storage medium based on online education |
CN112765336A (en) * | 2021-01-29 | 2021-05-07 | 中国平安人寿保险股份有限公司 | Bullet screen management method and device, terminal equipment and storage medium |
WO2021137978A1 (en) * | 2019-12-31 | 2021-07-08 | Sling Tv Llc | Digital advertisement frequency correction |
CN114417140A (en) * | 2021-12-23 | 2022-04-29 | 广西壮族自治区公众信息产业有限公司 | Live broadcast marketing method and system |
WO2022105269A1 (en) * | 2020-11-17 | 2022-05-27 | 北京达佳互联信息技术有限公司 | Live broadcast video pushing method and device |
CN115708354A (en) * | 2021-08-18 | 2023-02-21 | 武汉斗鱼鱼乐网络科技有限公司 | Information recommendation method and device, server and computer readable storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100162285A1 (en) * | 2007-09-11 | 2010-06-24 | Yossef Gerard Cohen | Presence Detector and Method for Estimating an Audience |
Also Published As
Publication number | Publication date |
---|---|
CN116582694A (en) | 2023-08-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||