CN111159437A - Method and system for predicting transmission result and type of film and television works - Google Patents
- Publication number
- CN111159437A CN201911365467.5A
- Authority
- CN
- China
- Prior art keywords
- television
- film
- work
- movie
- determining
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/45—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Computation (AREA)
- Economics (AREA)
- Evolutionary Biology (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Operations Research (AREA)
- Multimedia (AREA)
- Marketing (AREA)
- Quality & Reliability (AREA)
- Entrepreneurship & Innovation (AREA)
- Game Theory and Decision Science (AREA)
- Development Economics (AREA)
- Probability & Statistics with Applications (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The present disclosure relates to a method and a system for predicting the propagation result and the type of a film and television work. The method comprises: determining a first type distribution of a first film and television work to be predicted according to text information of the first film and television work and a plurality of preset film and television types, wherein the text information comprises a script of the first film and television work and/or a literary work corresponding to the first film and television work, and the first type distribution represents the probability distribution of the first film and television work over the plurality of film and television types; determining the similarity between the first type distribution and each of a plurality of preset second type distributions of a plurality of second film and television works; determining, from the plurality of second film and television works, a third film and television work with the highest similarity; and determining the predicted propagation result of the first film and television work according to the propagation information of the third film and television work. According to embodiments of the present disclosure, the propagation result of a film and television work can be predicted from the similarity of type distributions, which improves the accuracy of the prediction.
Description
Technical Field
The present disclosure relates to the field of information processing technologies, and in particular, to a method and a system for predicting a propagation result and a type of a movie and television work.
Background
With the development of media technology, the enrichment of media forms, and the diversification of distribution channels, the number of film and television works has grown explosively. At the same time, the quality of these works is uneven, so their propagation effects differ widely. At present, the propagation effect of a film and television work is evaluated and predicted mainly on the basis of its content quality, an approach that is severely limited and has poor accuracy.
Disclosure of Invention
In view of the above, the present disclosure provides a method and a system for predicting the propagation result and the type of a film and television work.
According to an aspect of the present disclosure, there is provided an information prediction method for a movie work, the method including:
determining first type distribution of a first film and television work according to text information of the first film and television work to be predicted and a plurality of preset film and television types, wherein the text information comprises a script of the first film and television work and/or a literature work corresponding to the first film and television work, and the first type distribution represents probability distribution of the first film and television work belonging to the plurality of film and television types;
determining similarity between the first type distribution and a plurality of preset second type distributions of a plurality of second film and television works;
determining a third film and television work with the highest similarity from the plurality of second film and television works;
and determining the predicted propagation result of the first film and television work according to the propagation information of the third film and television work.
In a possible implementation manner, determining a first type distribution of a first movie work according to text information of the first movie work to be predicted and a plurality of preset movie types includes:
segmenting words of text information of a first film and television work to be predicted, and determining a plurality of first words corresponding to the first film and television work;
and determining the first type distribution of the first film and television works according to the plurality of first words and the plurality of preset film and television types.
In a possible implementation manner, the word segmentation is performed on text information of a first film and television work to be predicted, and a plurality of first words corresponding to the first film and television work are determined, including:
determining a plurality of candidate words of a sentence and a predecessor word of each candidate word according to a preset movie and television work corpus for any sentence in text information of a first movie and television work to be predicted, wherein the movie and television work corpus comprises a plurality of second words and cumulative occurrence probability of each second word;
respectively determining the best predecessor word of each candidate word and the terminal word of the sentence according to the cumulative occurrence probability of each second word in the film and television work corpus, wherein the best predecessor word is the predecessor word with the largest cumulative occurrence probability in the predecessor words of each candidate word;
and determining a plurality of first words included in the sentence according to the terminal words of the sentence and the optimal predecessor words of the candidate words.
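As a non-limiting illustration, the segmentation steps above can be sketched as a maximum-probability dynamic program. The corpus is modeled as a plain dictionary from word to occurrence probability; the toy corpus, the `max_len` limit, and the floor probability for unknown single characters are assumptions made for this sketch and are not taken from the disclosure.

```python
import math

def segment(sentence, word_prob, max_len=4, unknown_prob=1e-8):
    """Split a sentence into the word sequence with the largest cumulative probability."""
    n = len(sentence)
    # best[i] = (cumulative log-probability of sentence[:i], start index of its last word)
    best = [(-math.inf, 0)] * (n + 1)
    best[0] = (0.0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = sentence[start:end]
            if word in word_prob:
                log_p = math.log(word_prob[word])
            elif end - start == 1:
                log_p = math.log(unknown_prob)  # assumed fallback for unknown characters
            else:
                continue  # not a candidate word
            # best[start] plays the role of the best predecessor of this candidate word
            score = best[start][0] + log_p
            if score > best[end][0]:
                best[end] = (score, start)
    # Backtrack from the terminal word through the best predecessors
    words, end = [], n
    while end > 0:
        start = best[end][1]
        words.append(sentence[start:end])
        end = start
    return words[::-1]

# Hypothetical toy corpus; letters stand in for characters of a script
toy_corpus = {"ab": 0.3, "cd": 0.3, "abc": 0.05, "a": 0.1, "b": 0.1, "c": 0.1, "d": 0.1}
first_words = segment("abcd", toy_corpus)
```

Working in log-probabilities avoids numerical underflow on long sentences; multiplying raw probabilities would select the same segmentation.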
In a possible implementation manner, determining a first type distribution of the first movie according to the plurality of first terms and a plurality of preset movie types includes:
determining a first vector corresponding to the first film and television work according to the plurality of first words;
determining a plurality of Euclidean distances between the first vector and a plurality of second vectors, wherein the plurality of second vectors are vectors corresponding to a plurality of preset film and television types;
determining the probability that the first film and television work belongs to each film and television type according to the plurality of Euclidean distances;
and determining the first type distribution of the first film and television works according to the probability that the first film and television works belong to each film and television type.
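A minimal sketch of the distance-to-probability steps above follows. The passage does not state how the first vector is built from the first words or how distances are mapped to probabilities, so the vectors are taken as given and a softmax over negative distances is used purely as an assumed mapping; the type names and vectors are hypothetical.

```python
import math

def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def type_distribution(first_vector, type_vectors):
    """Map distances to probabilities: a smaller distance gives a larger probability.

    Assumption: softmax over negative distances; the disclosure may use another mapping.
    """
    distances = {t: euclidean_distance(first_vector, v) for t, v in type_vectors.items()}
    weights = {t: math.exp(-d) for t, d in distances.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}

# Hypothetical two-dimensional vectors for two preset types
type_vectors = {"comedy": [1.0, 0.0], "action": [0.0, 1.0]}
distribution = type_distribution([0.9, 0.1], type_vectors)
```

Any monotonically decreasing function of distance followed by normalization would produce a valid probability distribution; the softmax is only one such choice.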
In a possible implementation manner, determining a first type distribution of a first movie work according to text information of the first movie work to be predicted and a plurality of preset movie types includes:
determining first type distribution of the first film and television works according to text information, actor information and a plurality of preset film and television types of the first film and television works to be predicted.
In one possible implementation manner, determining a similarity between the first type distribution and a plurality of preset second type distributions of a plurality of second movie and television works includes:
and determining cosine similarity between a third vector corresponding to the first type distribution and a fourth vector corresponding to any second type distribution.
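For reference, the cosine similarity between the third vector and a fourth vector takes the standard form below. The symbols are introduced here for illustration: p denotes the third vector, q the fourth vector, and n the number of preset film and television types.

```latex
\operatorname{sim}(\mathbf{p}, \mathbf{q})
  = \frac{\sum_{i=1}^{n} p_i q_i}
         {\sqrt{\sum_{i=1}^{n} p_i^{2}}\;\sqrt{\sum_{i=1}^{n} q_i^{2}}}
```

Because the components of a type distribution are non-negative probabilities, this value lies between 0 and 1, and a value closer to 1 indicates more similar type distributions.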
In one possible implementation, the method further includes: determining the type distribution of a fourth film and television work matching a target propagation result according to the target propagation result to be predicted, preference information of a user group for a plurality of film and television types, and preset propagation information of the plurality of second film and television works.
In a possible implementation manner, determining the type distribution of the fourth film and television work matching the target propagation result according to the target propagation result to be predicted, the preference information of the user group for the plurality of film and television types, and the preset propagation information of the plurality of second film and television works includes:
determining, from the plurality of second film and television works, a fifth film and television work matching the target propagation result according to the target propagation result to be predicted and the preset propagation information of the plurality of second film and television works;
and determining the type distribution of the fourth film and television work according to the type distribution of the fifth film and television work and the type distribution of the preference information of the user group for the plurality of film and television types.
In one possible implementation, the propagation information of a film and television work comprises at least one of the score, award information, box office, and audience rating of the film and television work.
According to another aspect of the present disclosure, there is provided an information prediction system for a movie work, the system including:
the type distribution determining module is used for determining first type distribution of a first movie and television work according to text information of the first movie and television work to be predicted and a plurality of preset movie and television types, wherein the text information comprises a script of the first movie and television work and/or a literary work corresponding to the first movie and television work, and the first type distribution represents probability distribution of the first movie and television work belonging to the plurality of movie and television types;
the similarity determining module is used for determining the similarity between the first type distribution and a plurality of preset second type distributions of a plurality of second film and television works;
the selecting module is used for determining a third film and television work with the highest similarity from the plurality of second film and television works;
and the first prediction module is used for determining the predicted propagation result of the first film and television work according to the propagation information of the third film and television work.
In one possible implementation, the system may further include: and the second prediction module is used for determining the type distribution of a fourth film and television work matched with the target propagation result according to the target propagation result to be predicted, the preference information of a user group to a plurality of film and television types and the preset propagation information of a plurality of second film and television works.
According to embodiments of the present disclosure, the type distribution of a first film and television work to be predicted can be determined from its text information (such as a script or a literary work). Based on the similarity between this type distribution and the preset type distributions of the second film and television works, the third film and television work most similar to the first film and television work can be selected from the preset second film and television works, and the predicted propagation result of the first film and television work can be determined from the propagation information of the third film and television work. The propagation result of a film and television work can thus be predicted from the similarity of type distributions, which improves the accuracy of the prediction.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 illustrates a flowchart of an information prediction method of a movie work according to an embodiment of the present disclosure.
FIG. 2 illustrates a block diagram of an information prediction system for a film and television work according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
Fig. 1 illustrates a flowchart of an information prediction method of a movie work according to an embodiment of the present disclosure. As shown in fig. 1, the method includes:
step S100, determining first type distribution of a first film and television work according to text information of the first film and television work to be predicted and a plurality of preset film and television types, wherein the text information comprises a script of the first film and television work and/or a literary work corresponding to the first film and television work, and the first type distribution represents probability distribution of the first film and television work belonging to the plurality of film and television types;
step S200, determining the similarity between the first type distribution and a plurality of preset second type distributions of a plurality of second film and television works;
step S300, determining a third film and television work with the highest similarity from the plurality of second film and television works;
step S400, determining the predicted propagation result of the first film and television work according to the propagation information of the third film and television work.
According to embodiments of the present disclosure, the type distribution of a first film and television work to be predicted can be determined from its text information (such as a script or a literary work). Based on the similarity between this type distribution and the preset type distributions of the second film and television works, the third film and television work most similar to the first film and television work can be selected from the preset second film and television works, and the predicted propagation result of the first film and television work can be determined from the propagation information of the third film and television work. The propagation result of a film and television work can thus be predicted from the similarity of type distributions, which improves the accuracy of the prediction.
In one possible implementation, a film and television work is a work that is recorded on a physical medium (e.g., film or another storage medium), consists of a series of frames with or without accompanying sound, and is shown or played by a suitable device (e.g., a player). It may include cinematographic works and works created by a process similar to filmmaking (e.g., television series and video works). The present disclosure does not limit the specific content of the film and television work.
In one possible implementation, the text information of a film and television work refers to information about the work in written form, and may include a script of the work and/or a literary work corresponding to it (such as a novel or a biography).
In one possible implementation, the preset film and television types may be a plurality of predefined types, such as comedy, tragedy, love, action, science fiction, horror, animation, and war. Those skilled in the art can preset the specific film and television types according to the actual situation, and the present disclosure is not limited thereto.
In one possible implementation manner, the type distribution of a film and television work may represent the probability distribution of the work over the plurality of preset film and television types. For example, if the preset film and television types are comedy, tragedy, love, action, science fiction, and horror, and the probabilities that film and television work A belongs to each type are comedy 45%, tragedy 5%, love 35%, action 2%, science fiction 10%, and horror 3%, the type distribution of work A can be determined from these probabilities.
In one possible implementation, the type distribution of a film and television work can be represented in multiple forms, such as numerically (e.g., as a vector) or graphically (e.g., as a histogram or a percentage chart). For example, the type distribution of film and television work A may be represented as the vector (0.45, 0.05, 0.35, 0.02, 0.10, 0.03). It should be understood that those skilled in the art can choose the representation of the type distribution according to the actual situation, and the present disclosure does not limit this.
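With type distributions represented as vectors, steps S200 to S400 can be sketched as a nearest-neighbor lookup. The sketch below uses cosine similarity as the similarity measure; the library entries "work X" and "work Y", their type distributions, and their propagation data are hypothetical values chosen only for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def predict_propagation(first_dist, library):
    """Steps S200-S400: pick the most similar library work and reuse its propagation data."""
    third = max(library, key=lambda work: cosine_similarity(first_dist, work["type_dist"]))
    return third["title"], third["propagation"]

# Type distribution of work A from the text:
# comedy, tragedy, love, action, science fiction, horror
work_a = (0.45, 0.05, 0.35, 0.02, 0.10, 0.03)

# Hypothetical library of second film and television works
library = [
    {"title": "work X", "type_dist": (0.40, 0.10, 0.30, 0.05, 0.10, 0.05),
     "propagation": {"score": 8.1}},
    {"title": "work Y", "type_dist": (0.05, 0.05, 0.35, 0.52, 0.01, 0.02),
     "propagation": {"score": 7.4}},
]

title, predicted = predict_propagation(work_a, library)
```

Here the propagation information of the most similar work is returned unchanged; how the predicted propagation result is derived from that information is left open in this sketch.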
In one possible implementation, the propagation information of a film and television work may include at least one of its score, award information, box office, and audience rating. The box office and the audience rating of a film and television work can be determined from data obtained from various media such as the internet, radio, television, and newspapers.
In one possible implementation, the score of a film and television work can be determined according to its user scores on a plurality of preset video playing platforms (such as Douban Movie, iQIYI, and Tencent Video). For example, the score can be determined as the average of the user scores of the work on each preset video playing platform, or it can be determined from preset platform weights and the user scores of the work on each preset video playing platform. The present disclosure does not limit the manner in which the score of a film and television work is determined.
For example, assume that scores range from 1 to 10 in steps of 0.1 and that there are 3 preset video playing platforms: platform 1, platform 2, and platform 3. If the user score of film and television work A is 8.5 on platform 1, 9.0 on platform 2, and 8.9 on platform 3, the average of the three user scores, (8.5 + 9.0 + 8.9) / 3 = 8.8, can be determined as the score of film and television work A.
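Both scoring options can be sketched in a few lines. The platform weights below are hypothetical, and rounding the average to one decimal place is an assumption based on the 0.1 score interval of the example.

```python
def average_score(user_scores):
    """Plain average of the per-platform user scores, rounded to the 0.1 score interval."""
    return round(sum(user_scores) / len(user_scores), 1)

def weighted_score(user_scores, platform_weights):
    """Weighted average using preset platform weights (normalized by their sum)."""
    total_weight = sum(platform_weights)
    return sum(s * w for s, w in zip(user_scores, platform_weights)) / total_weight

scores_a = [8.5, 9.0, 8.9]                              # platforms 1, 2, 3 from the example
plain = average_score(scores_a)                         # 8.8, as in the example
weighted = weighted_score(scores_a, [0.5, 0.3, 0.2])    # hypothetical weights
```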
In one possible implementation, the award information of a film and television work may include at least one international award and/or domestic award won by the work. International awards include, for example, awards of the Cannes International Film Festival, the Venice International Film Festival, the Berlin International Film Festival, or the Academy Awards (Oscars); domestic awards include, for example, the Golden Rooster Awards, the Hong Kong Film Awards, or the Golden Horse Awards. The present disclosure does not limit the specific awards a film and television work may win.
In a possible implementation manner, before predicting the propagation result of the first movie and television work, a movie and television work library including a plurality of second movie and television works may be established first, and the second type distribution and propagation information of each second movie and television work in the movie and television work library may be determined.
The second film and television works can be selected in various ways. For example, they can be selected according to ranking lists such as a box-office ranking of films, an audience-rating ranking of television series, or a score ranking; they can be selected according to the award information of film and television works; or they can be selected in other ways. Those skilled in the art can determine the selection method according to the actual situation, and the present disclosure does not limit this.
In one possible implementation, the second film and television works may be selected, and the film and television work library established, as follows:
selecting a plurality of award-winning film and television works within a preset time period and determining the propagation information of each, wherein an award-winning film and television work is one that has won an international award and/or a domestic award;
for any award-winning film and television work, determining its total score according to preset rules and its propagation information;
determining the award-winning film and television works whose total score is greater than or equal to a preset total score threshold as second film and television works;
and establishing a film and television work library according to the plurality of second film and television works.
In a possible implementation manner, a plurality of award-winning film and television works within a preset time period may be selected first. The preset time period is, for example, 10, 15, or 20 years; its specific value may be set according to the actual situation, which is not limited by the present disclosure. The propagation information of each award-winning work is then determined, and may include at least one of the score, award information, box office, and audience rating of the work. Those skilled in the art can determine the propagation information of an award-winning work in a variety of ways, and the present disclosure is not limited thereto.
In a possible implementation manner, for any award-winning film and television work, the total score of the work can be determined according to preset rules and the propagation information of the work. The preset rules may include the weights of the propagation information and the correspondence between the propagation information and score values.
The following illustrates one way of determining the total score of an award-winning film and television work:
First, the propagation information of a film and television work can be divided into three aspects, namely box office or audience rating, score, and award information, with weights set to 45%, 35%, and 20% respectively.
Then, the scoring rules for the box office, the audience rating, the score, and the award information are determined respectively. The scoring rule for the box office can be set as follows: for the domestic film box office, a work with a box office above 1 billion is given 10 points; above 600 million, 9 points; above 300 million, 8 points; above 100 million, 7.5 points; above 60 million, 7 points; above 30 million, 6.5 points; above 10 million, 6 points; above 6 million, 4 points; above 1 million, 2 points; above 500 thousand, 1 point; and any other work, 0 points.
The scoring rule for the audience rating can be set as follows: for domestic television series, according to the overall ranking list of mainland television series, the series ranked 1st in audience rating is given 10 points; series ranked 2nd to 10th, 9.5 points; 11th to 20th, 9 points; 21st to 30th, 8.5 points; and 31st to 40th, 8 points. Scores for series ranked lower in the list can be set in a similar way, which is not described again.
The score can be determined as the average of the user scores of the film and television work on each preset video playing platform.
The scoring rule for the award information can be set as follows: awards fall into four categories, namely international awards, domestic awards, other awards, and nominations. A work that has won at least one international award (including awards of the Cannes International Film Festival, the Venice International Film Festival, the Berlin International Film Festival, and the Academy Awards) is given 10 points; a work that has won at least one domestic award (including the Golden Rooster Awards, the Hong Kong Film Awards, and the Golden Horse Awards) is given 8 points; a work that has won at least one other award is given 5 points; and a work that has received at least one nomination is given 3 points. For any film and television work, the score value of each award category it has obtained is multiplied by a preset ratio (e.g., 0.5) to obtain an adjusted score value for that category, and the maximum of the adjusted score values is taken as the final score value of the work's award information. When the awards in a category include an award related to the story (e.g., Best Feature Film), the preset ratio for that category may be set to 1.
Finally, the total score of the film and television work can be determined from the weights and score values of the propagation information: total score = box office (or audience rating) × 45% + score × 35% + award information × 20%.
For example, suppose the propagation information of film and television work A is as follows: the box office is 2.1 billion; the user scores on the 3 video playing platforms are 8.5, 9.0, and 8.9; and the work has won 2 awards at the Cannes International Film Festival, 3 Golden Rooster Awards (including Best Feature Film), and 1 other award. The box-office score value of work A is then 10, and its score is (8.5 + 9.0 + 8.9) ÷ 3 = 8.8. In the award information, the score value for international awards is 10 × 0.5 = 5, the score value for domestic awards is 8 × 1 = 8, and the score value for other awards is 5 × 0.5 = 2.5; the maximum of the three, 8, is taken as the final score value of the award information. The total score of work A is therefore 10 × 45% + 8.8 × 35% + 8 × 20% = 9.18.
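The scoring rules and the worked example for work A can be reproduced with the sketch below. The function and variable names are illustrative. The box-office thresholds follow one reading of the tiers listed above (from 1 billion down to 500 thousand, in the source's unstated currency unit), and ranks beyond 40th are left unscored because the text does not spell them out.

```python
# Box-office tiers as (threshold, points); a work scores the first tier it exceeds.
BOX_OFFICE_TIERS = [
    (1_000_000_000, 10.0), (600_000_000, 9.0), (300_000_000, 8.0),
    (100_000_000, 7.5), (60_000_000, 7.0), (30_000_000, 6.5),
    (10_000_000, 6.0), (6_000_000, 4.0), (1_000_000, 2.0), (500_000, 1.0),
]
AWARD_POINTS = {"international": 10.0, "domestic": 8.0, "other": 5.0, "nomination": 3.0}

def box_office_score(box_office):
    """Score value of a film's box office ("above" is read as strictly greater)."""
    for threshold, points in BOX_OFFICE_TIERS:
        if box_office > threshold:
            return points
    return 0.0

def rank_score(rank):
    """Score value of a series' audience-rating rank; None where the text gives no rule."""
    if rank == 1:
        return 10.0
    if 2 <= rank <= 40:
        return 9.5 - 0.5 * ((rank - 1) // 10)
    return None

def award_score(awards, ratio=0.5):
    """awards maps an award category to True if it includes a story-related award."""
    adjusted = [AWARD_POINTS[c] * (1.0 if story else ratio) for c, story in awards.items()]
    return max(adjusted, default=0.0)

def total_score(reach_score, user_score, award):
    """Weighted total: box office or audience rating 45%, score 35%, awards 20%."""
    return 0.45 * reach_score + 0.35 * user_score + 0.20 * award

# Worked example for work A
user_score_a = (8.5 + 9.0 + 8.9) / 3
awards_a = {"international": False, "domestic": True, "other": False}
total_a = total_score(box_office_score(2_100_000_000), user_score_a, award_score(awards_a))
```

Keeping the tiers and award points in tables rather than in branching code makes it straightforward to substitute other preset rules.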
It should be understood that the preset rules can be set by those skilled in the art according to practical situations, and the disclosure is not limited thereto.
In one possible implementation, after the total score of each award-winning film and television work is determined, each award-winning work whose total score is greater than or equal to a preset total score threshold (for example, 7.0) can be determined as a second film and television work, and its transmission information can be determined as the transmission information of that second film and television work. A film and television work library is then established from the selected second film and television works.
In one possible implementation, after the library of movie and television works is created, a second type distribution of each second movie and television work in the library of movie and television works may be determined. The second type distribution can represent the probability distribution that the second film and television work belongs to a plurality of preset film and television types. For example, the preset plurality of movie types include comedy, tragedy, love, action, science fiction and horror, the probability that the second movie work B belongs to each movie type is respectively comedy 5%, tragedy 5%, love 35%, action 52%, science fiction 1% and horror 2%, and the second type distribution of the second movie work B can be expressed as a vector (0.05, 0.05, 0.35, 0.52, 0.01, 0.02).
In a possible implementation manner, the second type distribution of the second movie work can be determined according to the text information of the second movie work and a plurality of preset movie types.
In a possible implementation manner, the text information of the second movie work can be segmented to determine a plurality of third words corresponding to the second movie work, and then the second type distribution of the second movie work is determined according to the plurality of third words corresponding to the second movie work and a plurality of preset movie types.
In a possible implementation manner, before performing word segmentation on text information of the second movie and television work, a movie and television work corpus can be established through a word segmentation algorithm based on probability maximization according to a preset first training text. The film and television work corpus can comprise a plurality of second words and the accumulated occurrence probability of each second word.
In one possible implementation, the first training text may include the scripts of a plurality of film and television works. To establish the film and television work corpus, the first training text is input into the word segmentation algorithm based on probability maximization, which divides the text into words according to a preset reference dictionary (such as the latest edition of the "thesaurus"). During segmentation, for any word: if the corpus does not yet include the word, the word is added to the corpus, its frequency (i.e., its number of occurrences) is incremented by 1, and the total number of words in the corpus is incremented by 1; if the corpus already includes the word, only its frequency is incremented by 1. Once the whole first training text has been processed, the resulting corpus is taken as the established film and television work corpus. For any second word in the corpus, the cumulative occurrence probability is determined from its frequency and the total number of words in the corpus: cumulative occurrence probability of the second word = frequency of the second word / total number of words in the corpus.
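The counting rule above can be sketched as follows. The sketch assumes the training text has already been split into words and follows the text literally: the corpus total grows only when a new word is added. The function name `build_corpus` is hypothetical.

```python
from collections import Counter


def build_corpus(segmented_words):
    """Return (probabilities, corpus_size) for an iterable of segmented words."""
    frequency = Counter()
    corpus_size = 0  # grows only when a word is added to the corpus
    for word in segmented_words:
        if word not in frequency:
            corpus_size += 1
        frequency[word] += 1
    if corpus_size == 0:
        return {}, 0
    # Ratio of frequency to corpus size, as stated in the text. Under this
    # literal reading the values are relative weights and are not normalized.
    probabilities = {w: n / corpus_size for w, n in frequency.items()}
    return probabilities, corpus_size


probs, size = build_corpus(["he", "propose", "opinion", "propose"])
print(size, probs["propose"])
```

Because every word shares the same denominator, later comparisons between words depend only on their frequencies.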
In a possible implementation, after the film and television work corpus is established, any sentence in the text information of the second film and television work can be treated as a sentence to be segmented, and the third words included in that sentence are determined as follows:
according to the film and television work corpus, a plurality of candidate words of the sentence to be segmented are determined, where the candidate words comprise all words in the sentence that appear in the film and television work corpus. For example, for the sentence to be segmented "he proposes an opinion", the candidate words can be determined in order from left to right as all words of the sentence that appear in the corpus: he / propose / out / opinion / see;
after the candidate words are determined, they may be denoted, in their left-to-right order in the sentence to be segmented, as $w_1, w_2, \dots, w_{i-1}, w_i, \dots, w_N$ (N is a positive integer representing the total number of candidate words in the sentence, i is a positive integer, and $1 \le i \le N$). The candidate words located before $w_i$ whose end is immediately adjacent to the start of $w_i$ are then determined as the predecessor words of $w_i$; the leftmost candidate word has no predecessor word. For example, "he" is the predecessor word of "propose"; "propose" and "out" are the predecessor words of "opinion"; and "he", being the leftmost candidate word, has no predecessor word;
after the predecessor words of each candidate word are determined, for any candidate word, the cumulative occurrence probability of each of its predecessor words is determined according to the cumulative occurrence probabilities of the second words in the film and television work corpus, and the predecessor word with the largest cumulative occurrence probability is determined as the best predecessor word of the candidate word. For example, the candidate word "propose" has only one predecessor word, "he", which is therefore its best predecessor word; the candidate word "opinion" has two predecessor words, "propose" and "out", and since the cumulative occurrence probability of "propose" in the corpus is larger than that of "out", "propose" is determined as the best predecessor word of "opinion";
the terminal word of the sentence to be segmented can then be determined: when only one candidate word contains the end of the sentence, that candidate word is determined as the terminal word; when several candidate words contain the end of the sentence, the cumulative occurrence probability of each of them is determined according to the cumulative occurrence probabilities of the second words in the corpus, and the one with the largest cumulative occurrence probability is determined as the terminal word. For example, "opinion" and "see" both contain the end of the sentence "he proposes an opinion"; since the cumulative occurrence probability of "opinion" in the corpus is larger than that of "see", "opinion" is determined as the terminal word of the sentence;
after the terminal word of the sentence to be segmented and the best predecessor word of each candidate word are determined, the sentence can be traced backward from the terminal word, following the best predecessor word at each step, to obtain the segmentation result of the sentence, that is, the third words included in the sentence. For example, the segmentation result of the sentence "he proposes an opinion" is: he / propose / opinion.
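The steps above (candidate words, predecessor words, best predecessor, terminal word, backward trace) can be sketched as follows. This is a minimal sketch under stated assumptions: predecessors are compared by their corpus probability exactly as the text reads, single characters missing from the corpus are kept as zero-probability candidates so that a complete segmentation always exists, the sentence 他提出意见 is assumed to be the Chinese original behind the translated example "he proposes an opinion", and the probabilities are invented for illustration.

```python
def segment(sentence, corpus_prob):
    """Segment `sentence` with the best-predecessor rule described above.

    corpus_prob: dict mapping word -> occurrence probability in the corpus.
    """
    n = len(sentence)
    if n == 0:
        return []

    # Candidate words as (start, end) spans: every substring found in the corpus.
    # Assumption: unknown single characters are also kept as candidates.
    candidates = []
    for start in range(n):
        for end in range(start + 1, n + 1):
            if sentence[start:end] in corpus_prob or end == start + 1:
                candidates.append((start, end))

    def prob(span):
        return corpus_prob.get(sentence[span[0]:span[1]], 0.0)

    # Best predecessor: among the candidates that end where this one starts,
    # the one with the largest corpus probability (None for leftmost words).
    best_pred = {}
    for span in candidates:
        preds = [p for p in candidates if p[1] == span[0]]
        best_pred[span] = max(preds, key=prob) if preds else None

    # Terminal word: the most probable candidate that contains the sentence end.
    terminal = max((c for c in candidates if c[1] == n), key=prob)

    # Trace back from the terminal word through the best predecessors.
    words, span = [], terminal
    while span is not None:
        words.append(sentence[span[0]:span[1]])
        span = best_pred[span]
    return words[::-1]


# Hypothetical corpus probabilities for the example sentence.
corpus = {"他": 0.05, "提": 0.01, "提出": 0.03, "出": 0.02,
          "意": 0.005, "意见": 0.02, "见": 0.01}
print(segment("他提出意见", corpus))  # ['他', '提出', '意见']
```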
In a possible implementation manner, the word segmentation method may be used to segment each sentence in the text information of the second movie and television work to obtain a plurality of third words corresponding to the second movie and television work. And then, second type distribution of the second film and television work can be determined according to a plurality of third words corresponding to the second film and television work and a plurality of preset film and television types.
In one possible implementation, the second vectors corresponding to the respective movie types may be determined prior to determining the second type distribution of the second movie work. The method comprises the steps of selecting a second training text corresponding to each film and television type, carrying out word segmentation and clustering on the second training text to obtain a plurality of words corresponding to each film and television type, and then determining an M-dimensional second vector corresponding to any film and television type according to the plurality of words corresponding to the film and television type, wherein M is a positive integer.
In one possible implementation, the M-dimensional second vector corresponding to any film and television type may be determined by a neural network comprising an input layer, a hidden layer, and an output layer. The words corresponding to the film and television type are input into the neural network. The input layer one-hot encodes the input words to obtain a plurality of 1 × M word feature vectors, and multiplies each 1 × M word feature vector by an M × V input weight matrix (where V is a positive integer) to obtain a plurality of 1 × V first intermediate vectors. The first intermediate vectors are input into the hidden layer, which averages them to obtain one 1 × V second intermediate vector. The second intermediate vector is input into the output layer, which multiplies it by a V × M output weight matrix to obtain one 1 × M output vector; this output vector is the M-dimensional second vector corresponding to the film and television type.
It should be understood that the specific values of the vector dimensions M and V can be determined by those skilled in the art according to practical situations, and the disclosure is not limited thereto.
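The forward pass through the three layers described above can be sketched in plain Python. The sketch assumes each word is identified by an index in the range 0 to M-1 and that the two weight matrices are already available (how they are trained is not covered here); the helper names and the small example matrices are hypothetical.

```python
def one_hot(index, size):
    """1 x size one-hot row vector with a 1 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def vec_mat(vec, mat):
    """Multiply a row vector of length r by an r x c matrix given as a list of rows."""
    cols = len(mat[0])
    return [sum(vec[i] * mat[i][j] for i in range(len(vec))) for j in range(cols)]


def type_vector(word_indices, w_in, w_out):
    """Map the words of one type to an M-dimensional vector.

    w_in: M x V input weight matrix; w_out: V x M output weight matrix.
    """
    m, v = len(w_in), len(w_in[0])
    # Input layer: one-hot encode each word and project it to 1 x V.
    first = [vec_mat(one_hot(i, m), w_in) for i in word_indices]
    # Hidden layer: average the 1 x V vectors into a single 1 x V vector.
    second = [sum(row[j] for row in first) / len(first) for j in range(v)]
    # Output layer: project back to 1 x M.
    return vec_mat(second, w_out)


# Tiny example with M = 4 and V = 2 (weights chosen by hand for illustration).
w_in = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
w_out = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
print(type_vector([0, 2], w_in, w_out))  # [1.0, 2.5, 3.0, 4.5]
```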
In one possible implementation manner, a fifth vector corresponding to the second movie work may be determined according to a plurality of third words corresponding to the second movie work in a manner similar to that described above, where the fifth vector is an M-dimensional vector.
In one possible implementation, after the fifth vector corresponding to the second film and television work and the second vectors corresponding to the respective film and television types are determined, the Euclidean distance between the fifth vector and each second vector may be determined. The fifth vector may be represented as $(z_1, z_2, \dots, z_j, \dots, z_M)$ and the second vector as $(y_1, y_2, \dots, y_j, \dots, y_M)$, where j is a positive integer and $1 \le j \le M$. The Euclidean distance $h_k$ between the fifth vector and the second vector corresponding to the k-th film and television type may then be determined by the following formula (1):
$$h_k = \sqrt{\sum_{j=1}^{M} (z_j - y_j)^2} \qquad (1)$$
where $(y_1, \dots, y_M)$ is the second vector corresponding to the k-th film and television type, k is a positive integer with $1 \le k \le Q$, and Q is the number of film and television types.
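The Euclidean distance between the vector of a work and the vector of each film and television type can be computed as in the short sketch below. The function names and the example vectors are hypothetical.

```python
import math


def euclidean_distance(z, y):
    """Euclidean distance between two vectors of the same dimension M."""
    if len(z) != len(y):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((zj - yj) ** 2 for zj, yj in zip(z, y)))


def distances_to_types(work_vector, type_vectors):
    """Distances h_1..h_Q from one work vector to the Q type vectors."""
    return [euclidean_distance(work_vector, y) for y in type_vectors]


print(distances_to_types([0.0, 0.0], [[3.0, 4.0], [0.0, 1.0]]))  # [5.0, 1.0]
```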
In one possible implementation, the probability that the second movie belongs to each movie type may be determined according to a plurality of euclidean distances between the fifth vector and the plurality of second vectors.
In one possible implementation, the probability $P_k$ that the second film and television work belongs to the k-th film and television type can be determined by the following formula (2):
After the probability that the second film and television work belongs to each film and television type is determined, the second type distribution of the second film and television work can be determined according to the probability.
In one possible implementation, the second type distribution of each second film and television work in the film and television work library may be determined in a manner similar to that described above.
In a possible implementation manner, after the movie and television work library is established and the second type distribution and the propagation information of each second movie and television work in the movie and television work library are determined, the propagation result of the first movie and television work to be predicted can be predicted according to the second type distribution and the propagation information of a plurality of second movie and television works in the movie and television work library.
In a possible implementation manner, in step S100, a first type distribution of a first movie and television work may be determined according to text information of the first movie and television work to be predicted and a plurality of preset movie and television types. Wherein the first type distribution may represent a probability distribution that the first movie work belongs to the plurality of movie types.
For example, the preset multiple movie types include comedy, tragedy, love, action, science fiction, and horror, and the text information of the first movie work may be analyzed, clustered, and the like, so as to determine the probability that the first movie work belongs to each movie type, for example, the probability that the first movie work belongs to each type of movie type is: comedy 35%, tragedy 2%, love 16%, action 45%, science fiction 1%, horror 1%, according to which probability the first type distribution of the first movie work can be represented as a vector (0.35, 0.02, 0.16, 0.45, 0.01, 0.01).
In one possible implementation, after the first type distribution is determined, the similarity between the first type distribution and each of the second type distributions of the plurality of second film and television works may be determined in step S200. The similarity between the first type distribution and any second type distribution can be determined in various ways, for example by Euclidean distance, the Pearson correlation coefficient, or cosine similarity. Those skilled in the art can select the way of determining the similarity according to the actual situation, and the present disclosure does not limit this.
In one possible implementation manner, after determining the similarity between the first type distribution and the plurality of second type distributions, in step S300, a third movie work with the highest similarity may be determined from the plurality of second movie works. That is, the respective similarities may be compared, and the second movie and television work corresponding to the second type distribution having the highest similarity with the first type distribution may be determined as the third movie and television work. Wherein, the third film and television works can be one or more. The number of the third movie works can be determined by those skilled in the art according to practical situations, and the disclosure does not limit this.
In a possible implementation manner, after the third video work is determined, in step S400, a predicted transmission result of the first video work may be determined according to the first transmission information of the third video work.
In a possible implementation, when there is one third film and television work, its first transmission information can be directly determined as the predicted transmission result of the first film and television work. When there are several third film and television works, their first transmission information may be directly determined as the predicted transmission result of the first film and television work, or it may first be processed, for example by comparison, averaging, or taking the maximum or minimum (for instance, averaging the scores of the third works and taking the minimum of their box offices), with the processing result determined as the predicted transmission result of the first film and television work.
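One of the processing options mentioned above (averaging the scores and taking the minimum box office) can be sketched as follows. The dictionary keys `score` and `box_office`, the function name, and the sample values are assumptions made for illustration; other combinations are equally possible.

```python
def aggregate_transmission(works):
    """Combine the transmission information of one or more third works.

    works: non-empty list of dicts with hypothetical keys 'score' and 'box_office'.
    """
    if not works:
        raise ValueError("at least one third work is required")
    if len(works) == 1:
        # A single third work: use its transmission information directly.
        return dict(works[0])
    return {
        "score": sum(w["score"] for w in works) / len(works),  # mean score
        "box_office": min(w["box_office"] for w in works),     # cautious estimate
    }


predicted = aggregate_transmission([
    {"score": 8.0, "box_office": 1.5e9},
    {"score": 9.0, "box_office": 2.1e9},
])
print(predicted)  # {'score': 8.5, 'box_office': 1500000000.0}
```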
For example, assume that the film and television work library includes 5 second film and television works (work C, work D, work E, work F, and work G), that the preset film and television types include comedy, tragedy, love, action, science fiction, and horror, and that the first film and television work to be predicted is work X. The first type distribution of work X may be determined according to its text information and the film and television types. The similarity between the first type distribution and the second type distribution of each second film and television work is then determined, giving 5 similarity values $R_1$, $R_2$, $R_3$, $R_4$, $R_5$. The 5 similarity values are compared to find the 2 highest values, $R_2$ and $R_3$, and the second film and television works corresponding to $R_2$ and $R_3$, namely work D and work E, are determined as the third film and television works. The transmission information of the third film and television works is then determined as the predicted transmission result of the first film and television work; that is, the scores, award information, and box office or audience ratings of work D and work E are determined as the predicted transmission result of work X.
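The selection step in this example can be sketched as below. The similarity values are invented placeholders standing in for the five similarity values of the example, and the sketch assumes a measure where a larger value means more similar (for a distance measure the ordering would be reversed).

```python
def most_similar_works(similarities, count=1):
    """Return the `count` work identifiers with the highest similarity.

    similarities: dict mapping a second work's identifier to its similarity
    with the first work (larger means more similar).
    """
    ranked = sorted(similarities, key=similarities.get, reverse=True)
    return ranked[:count]


# Hypothetical similarity values for works C, D, E, F, and G.
similarities_to_x = {"C": 0.61, "D": 0.93, "E": 0.88, "F": 0.40, "G": 0.72}
third_works = most_similar_works(similarities_to_x, count=2)
print(third_works)  # ['D', 'E']
```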
According to the embodiment of the disclosure, the type distribution of the first film and television work to be predicted can be determined according to its text information (such as a script or a literary work). According to the similarity between this type distribution and the type distributions of the preset second film and television works, the third film and television work with the highest similarity to the first film and television work can be determined from the second film and television works, and the predicted transmission result of the first film and television work can be determined according to the transmission information of the third film and television work. The transmission result of a film and television work can thus be predicted from the similarity of type distributions, which improves the accuracy of the prediction.
In one possible implementation, step S100 may include: segmenting words of text information of a first film and television work to be predicted, and determining a plurality of first words corresponding to the first film and television work; and determining the first type distribution of the first film and television works according to the plurality of first words and the plurality of preset film and television types.
In one possible implementation, a word segmentation tool (e.g., the jieba Chinese word segmentation component or the pkuseg word segmentation tool) or a word segmentation algorithm may be used to segment the text information of the first film and television work, and the segmentation result is determined as the plurality of first words corresponding to the first film and television work. The present disclosure does not limit the particular manner of word segmentation.
In a possible implementation manner, after a plurality of first words corresponding to the first movie work are determined, the first type distribution of the first movie work can be determined through quantization, clustering and other processing according to the plurality of first words and the preset plurality of movie types.
In this embodiment, by segmenting the text information of the first movie and television work, a plurality of words corresponding to the first movie and television work are determined, and the first type distribution of the first movie and television work is determined according to the plurality of words and the plurality of movie and television types, so that the first type distribution of the first movie and television work can be determined according to the text information, and the accuracy of the first type distribution can be improved.
In a possible implementation manner, the segmenting the text information of the first film and television work to be predicted, and determining a plurality of first words corresponding to the first film and television work may include:
determining a plurality of candidate words of a sentence and a predecessor word of each candidate word according to a preset film and television work corpus for any sentence in text information of a first film and television work to be predicted;
respectively determining the best predecessor word of each candidate word and the terminal word of the sentence according to the cumulative occurrence probability of each second word in the film and television work corpus, wherein the best predecessor word is the predecessor word with the largest cumulative occurrence probability in the predecessor words of each candidate word;
and determining a plurality of first words included in the sentence according to the terminal words of the sentence and the optimal predecessor words of the candidate words.
In a possible implementation, for any sentence in the text information of the first film and television work, a plurality of candidate words of the sentence may be determined according to the film and television work corpus, where the candidate words comprise all words in the sentence that appear in the corpus. The predecessor words of the candidate words are then determined: the candidate words may be denoted, in their left-to-right order in the sentence, as $w_1, w_2, \dots, w_{i-1}, w_i, \dots, w_N$ (N is a positive integer representing the total number of candidate words in the sentence, i is a positive integer, and $1 \le i \le N$), and the candidate words located before $w_i$ whose end is immediately adjacent to the start of $w_i$ are determined as the predecessor words of $w_i$, where the leftmost candidate word has no predecessor word.
In a possible implementation manner, after determining the predecessor words of each candidate word, for any candidate word, the cumulative occurrence probability of each predecessor word of the candidate word may be determined according to the cumulative occurrence probability of the second word in the movie work corpus, and the predecessor word with the largest cumulative occurrence probability is determined as the best predecessor word of the candidate word.
In a possible implementation manner, when only one candidate word containing a sentence end exists, the candidate word is determined as an end word of the sentence; when there are a plurality of candidate words containing the sentence end, the accumulated occurrence probability of each candidate word containing the sentence end can be respectively determined according to the accumulated occurrence probability of the second word in the film and television work corpus, and then the candidate word with the largest accumulated occurrence probability in the candidate words containing the sentence end is determined as the end word of the sentence.
In a possible implementation manner, after the end word of the sentence and the best predecessor word of each candidate word are determined, a plurality of first words included in the sentence can be determined according to the end word and the best predecessor word of each candidate word. The method can trace back from the end word of the sentence forward, and determine the best predecessor word of each candidate word forward according to the guidance of the best predecessor word, thereby determining the word segmentation result of the sentence, namely a plurality of first words included in the sentence.
In a possible implementation manner, the word segmentation method may be used to segment each sentence in the text information of the first movie work to obtain a plurality of first words corresponding to the first movie work.
In this embodiment, the text information of the first movie and television works can be segmented according to the dedicated movie and television work corpus to obtain a plurality of first words, so that the plurality of first words obtained by segmentation can accord with the word style of the movie and television works, and the accuracy of word segmentation processing can be improved.
In a possible implementation manner, determining a first type distribution of the first movie according to the plurality of first terms and a plurality of preset movie types may include:
determining a first vector corresponding to the first film and television work according to the plurality of first words;
determining a plurality of Euclidean distances between the first vector and a plurality of second vectors;
determining the probability that the first film and television work belongs to each film and television type according to the Euclidean distances;
and determining the first type distribution of the first film and television works according to the probability that the first film and television works belong to each film and television type.
In one possible implementation manner, when determining the first type distribution of the first movie work, a first vector corresponding to the first movie work may be determined according to a plurality of first words, where the dimension of the first vector is the same as that of the second vector and is M. The first vector may be determined in a similar manner as the second vector, or may be determined in other manners, as the present disclosure is not limited in this respect.
In a possible implementation, after the first vector corresponding to the first film and television work is determined, the Euclidean distance between the first vector and each second vector can be determined. The first vector may be represented as $(x_1, x_2, \dots, x_j, \dots, x_M)$ and the second vector as $(y_1, y_2, \dots, y_j, \dots, y_M)$, where j is a positive integer and $1 \le j \le M$. The Euclidean distance between the first vector and the second vector corresponding to the k-th film and television type is then determined in a manner similar to formula (1), and the probability that the first film and television work belongs to each film and television type is determined from the Euclidean distances in a manner similar to formula (2).
In a possible implementation manner, after the probability that the first movie work belongs to each movie type is determined, the first type distribution of the first movie work can be determined according to the probability.
In this embodiment, the probability that the first movie and television work belongs to each movie and television type can be determined according to the Euclidean distances between the first vector corresponding to the first movie and television work and the second vectors, and then the first type distribution of the first movie and television work can be determined, so that the Euclidean distances can be used when the first type distribution is determined, the method is simple and fast, and the processing efficiency can be improved.
In one possible implementation, step S100 may include: determining first type distribution of the first film and television works according to text information, actor information and a plurality of preset film and television types of the first film and television works to be predicted.
The actor information may include the leading actors of the first film and television work and the work type distribution of each leading actor.
In one possible implementation, the actor work type model may be established prior to determining the first type distribution for the first film and television work. The actor work type model may be used to represent the movie type to which multiple actors belong, and may include multiple actors and a distribution of work types for each actor.
In one possible implementation, the work type distribution of any actor may include a probability distribution of a preset number of movie types with the largest number of movie works performed by the actor. The preset number can be set according to actual conditions, and the disclosure does not limit the preset number.
For example, the work type distribution of the actor may include probability distributions of three movie types when the preset number is 3. The work type distribution of the actor Liu may be represented as (action-0.45, science fiction-0.35, comedy-0.2).
In one possible implementation, the number of film and television works of each type in which an actor has performed may be determined in a variety of ways. For example, according to the preset film and television types, a preset number (for example, 50) of works may be selected from the ranking list of each type on a plurality of video platforms (such as Douban Movie, iQIYI, and Tencent Video), their leading actors and actresses determined, and the number of works of each type performed by each actor counted. As another example, a plurality of actors may be selected according to actor ranking lists (such as the Baidu Fengyun ranking list, the Weibo best actor ranking list, and the 365 star ranking list), and the number of works of each type performed by each actor counted. The two ways may also be combined, or other means may be used, and the disclosure is not limited thereto.
In a possible implementation, after the number of works of each film and television type performed by each actor is determined, for any actor, the preset number (for example, 3) of film and television types with the most works may be selected and determined as the work types of the actor, and the probability distribution over the selected types is calculated. For example, if the preset number is 3 and the work types of actor Wang are love, science fiction, and comedy, with 40, 35, and 25 corresponding works and therefore probabilities of 40%, 35%, and 25%, the work type distribution of actor Wang can be expressed as (love-0.4, science fiction-0.35, comedy-0.25).
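The work type distribution of an actor can be derived from per-type work counts as sketched below. The function name and the extra "horror" count are hypothetical; the other three counts are the ones used in the example above.

```python
def work_type_distribution(type_counts, preset_number=3):
    """Probability distribution over the actor's most frequent work types.

    type_counts: dict mapping a film and television type to the number of
    works of that type in which the actor has performed.
    """
    top = sorted(type_counts.items(), key=lambda kv: kv[1], reverse=True)
    top = top[:preset_number]
    total = sum(count for _, count in top)
    if total == 0:
        return {}
    return {work_type: count / total for work_type, count in top}


counts = {"love": 40, "science fiction": 35, "comedy": 25, "horror": 5}
print(work_type_distribution(counts))
# {'love': 0.4, 'science fiction': 0.35, 'comedy': 0.25}
```

Types outside the preset number (here "horror") are dropped before normalizing, so the retained probabilities sum to 1.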
In one possible implementation, an actor work type model may be established based on a plurality of actors and the distribution of work types of the individual actors.
In a possible implementation manner, after the actor work type model is established, the initial type distribution of the first film and television work can be determined according to the text information of the first film and television work to be predicted and a plurality of preset film and television types, and the work type distribution of actors of the first film and television work can be determined according to the actor work type model and at least one actor of the first film and television work; and then, adjusting the initial type distribution of the first video work according to the work type distribution of actors of the first video work and preset actor information weight (such as 0.1) to determine the first type distribution of the first video work.
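The text does not specify how the initial type distribution is adjusted. One possible reading, shown here purely as an assumption, is a weighted mixture of the initial distribution and the actor's work type distribution, renormalized so the result sums to 1. The function name and the sample values are hypothetical.

```python
def adjust_type_distribution(initial, actor_distribution, actor_weight=0.1):
    """Blend an initial type distribution with an actor's work type distribution.

    Assumption: a linear mixture with weight `actor_weight` on the actor
    information, followed by renormalization. Types that appear only in the
    actor distribution are ignored.
    """
    mixed = {
        work_type: (1 - actor_weight) * p
        + actor_weight * actor_distribution.get(work_type, 0.0)
        for work_type, p in initial.items()
    }
    total = sum(mixed.values())
    return {work_type: p / total for work_type, p in mixed.items()}


initial = {"comedy": 0.35, "tragedy": 0.02, "love": 0.16,
           "action": 0.45, "science fiction": 0.01, "horror": 0.01}
actor = {"action": 0.45, "science fiction": 0.35, "comedy": 0.2}
adjusted = adjust_type_distribution(initial, actor)
print(adjusted)
```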
In this embodiment, the type distribution of the movie and television works can be determined according to the text information and the actor information of the movie and television works, so that the actor information of the movie and television works can be referred to when the type distribution of the movie and television works is determined, the type distribution of the movie and television works is related to actors of the movie and television works, and the accuracy of the type distribution of the movie and television works can be improved.
In one possible implementation, step S200 may include: determining the cosine similarity between a third vector corresponding to the first type distribution and a fourth vector corresponding to any second type distribution.
Cosine similarity measures the similarity of two vectors by the cosine of the angle between them. The smaller the angle between the two vectors, the closer the cosine is to 1 and the higher the similarity of the two vectors.
In one possible implementation, when determining the similarity between the first type distribution and any one of the second type distributions, the first type distribution may be represented as a third vector, the second type distribution may be represented as a fourth vector, then the cosine similarity between the third vector and the fourth vector is determined, and the cosine similarity is determined as the similarity between the first type distribution and the second type distribution.
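In one possible implementation, the cosine similarity $\cos\theta$ between the third vector $(a_1, a_2, \ldots, a_Q)$ and the fourth vector $(b_1, b_2, \ldots, b_Q)$ may be determined by the following equation (3):

$$\cos\theta = \frac{\sum_{i=1}^{Q} a_i b_i}{\sqrt{\sum_{i=1}^{Q} a_i^{2}}\,\sqrt{\sum_{i=1}^{Q} b_i^{2}}} \tag{3}$$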
In this embodiment, determining the similarity between the first type distribution and the second type distribution by cosine similarity is simple and fast, and can improve processing efficiency.
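As a minimal sketch, the cosine similarity between two type-distribution vectors can be computed in Python as follows. The function name `cosine_similarity` and the convention of returning 0.0 for a zero vector are assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention: no similarity with a zero vector
    return dot / (norm_a * norm_b)


# Hypothetical type distributions over the same ordered list of Q = 3 types.
third_vector = [0.4, 0.35, 0.25]
fourth_vector = [0.5, 0.3, 0.2]
print(cosine_similarity(third_vector, fourth_vector))
```

Both vectors must list the film and television types in the same order, so that component i of each vector refers to the same type.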
In one possible implementation, the method further includes: determining the type distribution of a fourth film and television work matched with the target transmission result according to the target transmission result to be predicted, the preference information of a user group for a plurality of film and television types, and the preset transmission information of a plurality of second film and television works.
The target transmission result is at least one of the score, award information, box office, and audience rating that the creator of the film and television work expects the work to achieve.
In one possible implementation, the preference information of the user group for the plurality of film and television types may include popularity information and emotion information of the user group. The popularity information may include information on film and television works whose attention or popularity is greater than or equal to a preset popularity threshold within a preset time period (e.g., within 1 year). For example, the most discussed film and television works on media such as the Internet, television, and newspapers (e.g., works on a popularity list) may be classified by film and television type, compared, and counted; the information shared by the works of each type may then be extracted and determined as the popularity information.
In a possible implementation manner, the emotion information of the user group may represent the emotion keywords of the user group for the film and television works of each type, and the emotion keywords may be determined from keywords in the user comments on the works of each type within a preset time period (for example, within 1 year). The user comments can be collected with a web crawler from domestic and foreign social platforms, video playing platforms, and the like. After the user comments are collected, word segmentation can be performed on them, keywords can be extracted, and the extracted keywords can be used as the emotion information of the user group. The present disclosure does not limit the specific value of the preset time period.
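The segmentation and keyword extraction steps can be sketched as below. This is only an illustration under strong assumptions: it tokenizes English text with a regular expression instead of a Chinese word segmenter, uses a tiny hand-written stop-word list, and ranks keywords by raw frequency. The names `emotion_keywords` and `STOPWORDS` are hypothetical, and the sample comments are invented.

```python
import re
from collections import Counter

# Assumed stop-word list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "an", "and", "is", "was", "it", "this", "of", "to"}


def emotion_keywords(comments, top_k=3):
    """Return the top_k most frequent non-stop-word tokens in the comments."""
    counts = Counter()
    for comment in comments:
        tokens = re.findall(r"[a-z']+", comment.lower())
        counts.update(token for token in tokens if token not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_k)]


# Invented comments for works of one film and television type.
comments = [
    "The ending was moving and the cast is great",
    "Great cast, moving story",
    "Great soundtrack",
]
print(emotion_keywords(comments))
```

For Chinese comments, the regular-expression tokenizer would be replaced by a proper word segmentation step, and frequency ranking could be replaced by another keyword weighting scheme.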
In a possible implementation manner, the type distribution of the fourth film and television work matched with the target propagation result can be determined according to the target propagation result to be predicted, the preference information of a user group for a plurality of film and television types, and the preset propagation information of a plurality of second film and television works. That is, when predicting the type distribution of the fourth film and television work matched with the target propagation result, both the target propagation result desired by the creator of the film and television work and the preference information of the user group for the plurality of film and television types can be used.
For example, the propagation result corresponding to the preference information of the user group for the plurality of film and television types may be predicted in the manner described above. Then, according to that propagation result and a plurality of similarities between the target propagation result and the propagation information of the plurality of second film and television works, the work with the highest similarity may be determined from the plurality of second film and television works, and the second type distribution of that work may be determined as the type distribution of the fourth film and television work matched with the target propagation result.
In this embodiment, the type distribution of the movie works matched with the target propagation result can be predicted according to the target propagation result, the preference information of the user group on the plurality of movie types and the propagation information of the plurality of second movie works, so that the accuracy of type distribution prediction can be improved, and a reference direction is provided for a creator of the movie works.
In a possible implementation manner, determining type distribution of a fourth film and television work matched with a target propagation result according to the target propagation result to be predicted, preference information of a user group on a plurality of film and television types, and preset propagation information of a plurality of second film and television works, includes:
determining a fifth film and television work matched with the target transmission result from the plurality of second film and television works according to the target transmission result to be predicted and the preset transmission information of the plurality of second film and television works;
and determining the type distribution of the fourth film and television works according to the type distribution of the fifth film and television works and the type distribution of preference information of a user group to a plurality of film and television types.
In a possible implementation manner, a plurality of similarities between the target propagation result and the propagation information of the plurality of second film and television works can be determined according to the target propagation result to be predicted and the preset propagation information of the plurality of second film and television works, a fifth film and television work matched with the target propagation result is determined from the plurality of second film and television works according to the similarities, and the type distribution of the fifth film and television work is determined.
In a possible implementation manner, a method similar to that used to determine the second type distribution of a second film and television work may be used to determine the type distribution of the preference information of the user group for the plurality of film and television types. The type distribution of the fourth film and television work is then determined according to the type distribution of the fifth film and television work, the type distribution of the preference information of the user group for the plurality of film and television types, and preset prediction weights (for example, a weight of 40% for the target propagation result and a weight of 60% for the preference information of the user group for the plurality of film and television types).
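The disclosure gives the example weights (40% and 60%) but not the combining rule. The Python sketch below assumes a weighted sum of the two type distributions followed by renormalization; the function name `blend_distributions` and the input values are hypothetical.

```python
def blend_distributions(fifth_dist, preference_dist, w_target=0.4, w_preference=0.6):
    """Weighted combination of two {type: probability} distributions.

    fifth_dist: type distribution of the fifth work (matched to the target result).
    preference_dist: type distribution derived from user-group preference information.
    Assumption: a linear weighted sum, renormalized to sum to 1.
    """
    types = sorted(set(fifth_dist) | set(preference_dist))
    mixed = {
        t: w_target * fifth_dist.get(t, 0.0) + w_preference * preference_dist.get(t, 0.0)
        for t in types
    }
    total = sum(mixed.values())
    if total == 0:
        return mixed
    return {t: v / total for t, v in mixed.items()}


# Hypothetical distributions for illustration only.
fifth = {"romance": 0.5, "comedy": 0.5}
preference = {"romance": 1.0}
print(blend_distributions(fifth, preference))
```

With these inputs, the preference information carries the larger weight, so the blended distribution leans toward the type the user group prefers.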
In this embodiment, the type distribution of the fourth film and television work matched with the target propagation result can be determined according to the type distribution of the fifth film and television work matched with the target propagation result and the type distribution of the preference information of the user group for the plurality of film and television types, so that the accuracy of the type distribution of the fourth film and television work can be improved.
FIG. 2 illustrates a block diagram of an information prediction system for a film and television work according to an embodiment of the present disclosure. As shown in fig. 2, the system includes:
the type distribution determining module 21 is configured to determine a first type distribution of a first movie and television work according to text information of the first movie and television work to be predicted and a plurality of preset movie and television types, where the text information includes a script of the first movie and television work and/or a literary work corresponding to the first movie and television work, and the first type distribution represents a probability distribution that the first movie and television work belongs to the plurality of movie and television types;
a similarity determining module 22, configured to determine similarities between the first type distribution and a plurality of preset second type distributions of a plurality of second movie and television works;
the selecting module 23 is configured to determine a third movie and television work with the highest similarity from the plurality of second movie and television works;
and the first prediction module 24 is configured to determine a predicted propagation result of the first movie and television work according to the propagation information of the third movie and television work.
In the embodiment, the type distribution of a first film and television work to be predicted can be determined according to text information (such as a script and literary works and the like) of the first film and television work, a third film and television work with the highest similarity to the first film and television work can be determined from a plurality of preset second film and television works according to the similarity between the type distribution and the type distribution of a plurality of preset second film and television works, and the predicted transmission result of the first film and television work can be determined according to the transmission information of the third film and television work, so that the transmission result of the film and television work can be predicted according to the similarity of the type distribution of the film and television works, and the accuracy of the prediction of the transmission result of the film and television work is improved.
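The flow through the similarity determining module, the selecting module, and the first prediction module can be sketched in Python as follows. The sketch assumes the first type distribution has already been computed, represents each second work as a hypothetical `(title, type_vector, propagation_info)` tuple, and simply returns the propagation information of the most similar work as the prediction. All titles and numbers are invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def predict_propagation(first_dist, second_works):
    """Return (title, propagation_info) of the most similar second work.

    first_dist: type-distribution vector of the work to be predicted.
    second_works: list of (title, type_vector, propagation_info) tuples.
    """
    if not second_works:
        raise ValueError("at least one second work is required")
    # Similarity determining module + selecting module.
    title, _, info = max(second_works, key=lambda work: cosine(first_dist, work[1]))
    # First prediction module: reuse the third work's propagation information.
    return title, info


# Invented library of second works; vectors share one ordered list of types.
library = [
    ("Work A", [0.40, 0.35, 0.25], {"score": 8.1}),
    ("Work B", [0.05, 0.05, 0.90], {"score": 6.3}),
]
print(predict_propagation([0.49, 0.305, 0.205], library))
```

A real system might derive the prediction from the third work's propagation information in a more elaborate way; returning it unchanged is the simplest choice consistent with the description.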
In one possible implementation, the system may further include: a second prediction module, configured to determine the type distribution of a fourth film and television work matched with the target propagation result according to the target propagation result to be predicted, the preference information of a user group for a plurality of film and television types, and the preset propagation information of a plurality of second film and television works.
In this embodiment, the type distribution of the movie works matched with the target propagation result can be predicted according to the target propagation result, the preference information of the user group on the plurality of movie types and the propagation information of the plurality of second movie works, so that the accuracy of type distribution prediction can be improved, and a reference direction is provided for a creator of the movie works.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (10)
1. An information prediction method for a movie and television work, the method comprising:
determining first type distribution of a first film and television work according to text information of the first film and television work to be predicted and a plurality of preset film and television types, wherein the text information comprises a script of the first film and television work and/or a literature work corresponding to the first film and television work, and the first type distribution represents probability distribution of the first film and television work belonging to the plurality of film and television types;
determining similarity between the first type distribution and a plurality of preset second type distributions of a plurality of second film and television works;
determining a third film and television work with the highest similarity from the plurality of second film and television works;
and determining the predicted transmission result of the first film and television work according to the transmission information of the third film and television work.
2. The method of claim 1, wherein determining a first type distribution of the first movie work according to text information of the first movie work to be predicted and a plurality of preset movie types comprises:
segmenting words of text information of a first film and television work to be predicted, and determining a plurality of first words corresponding to the first film and television work;
and determining the first type distribution of the first film and television works according to the plurality of first words and the plurality of preset film and television types.
3. The method of claim 2, wherein the segmenting text information of the first video work to be predicted to determine a plurality of first words corresponding to the first video work comprises:
determining a plurality of candidate words of a sentence and a predecessor word of each candidate word according to a preset movie and television work corpus for any sentence in text information of a first movie and television work to be predicted, wherein the movie and television work corpus comprises a plurality of second words and cumulative occurrence probability of each second word;
respectively determining the best predecessor word of each candidate word and the terminal word of the sentence according to the cumulative occurrence probability of each second word in the film and television work corpus, wherein the best predecessor word is the predecessor word with the largest cumulative occurrence probability in the predecessor words of each candidate word;
and determining a plurality of first words included in the sentence according to the terminal words of the sentence and the optimal predecessor words of the candidate words.
4. The method of claim 3, wherein determining a first type distribution of the first film and television work according to the first words and the preset types of films and television comprises:
determining a first vector corresponding to the first film and television work according to the plurality of first words;
determining a plurality of Euclidean distances between the first vector and a plurality of second vectors, wherein the plurality of second vectors are vectors corresponding to a plurality of preset film and television types;
determining the probability that the first film and television work belongs to each film and television type according to the Euclidean distances;
and determining the first type distribution of the first film and television works according to the probability that the first film and television works belong to each film and television type.
5. The method of claim 1, wherein determining a first type distribution of the first movie work according to text information of the first movie work to be predicted and a plurality of preset movie types comprises:
determining first type distribution of the first film and television works according to text information, actor information and a plurality of preset film and television types of the first film and television works to be predicted.
6. The method of claim 1, wherein determining a similarity between the first type distribution and a plurality of second type distributions of a predetermined plurality of second film and television works comprises:
and determining cosine similarity between a third vector corresponding to the first type distribution and a fourth vector corresponding to any second type distribution.
7. The method of claim 1, further comprising:
and determining the type distribution of a fourth film and television work matched with the target transmission result according to the target transmission result to be predicted, the preference information of a user group to a plurality of film and television types and the preset transmission information of a plurality of second film and television works.
8. The method of claim 7, wherein determining the type distribution of the fourth film and television works matched with the target transmission result according to the target transmission result to be predicted, the preference information of the user group to a plurality of film and television types, and the preset transmission information of a plurality of second film and television works comprises:
determining a fifth film and television work matched with the target transmission result from the plurality of second film and television works according to the target transmission result to be predicted and the preset transmission information of the plurality of second film and television works;
and determining the type distribution of the fourth film and television works according to the type distribution of the fifth film and television works and the type distribution of preference information of a user group to a plurality of film and television types.
9. The method of any one of claims 1-8, wherein the transmission information of the film and television work comprises at least one of the score, award information, box office, and audience rating of the film and television work.
10. An information prediction system for a movie or television work, the system comprising:
the type distribution determining module is used for determining first type distribution of a first movie and television work according to text information of the first movie and television work to be predicted and a plurality of preset movie and television types, wherein the text information comprises a script of the first movie and television work and/or a literary work corresponding to the first movie and television work, and the first type distribution represents probability distribution of the first movie and television work belonging to the plurality of movie and television types;
the similarity determining module is used for determining the similarity between the first type distribution and a plurality of preset second type distributions of a plurality of second film and television works;
the selecting module is used for determining a third film and television work with the highest similarity from the plurality of second film and television works;
and the first prediction module is used for determining the predicted transmission result of the first film and television work according to the transmission information of the third film and television work.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911365467.5A CN111159437B (en) | 2019-12-26 | 2019-12-26 | Film and television work propagation result and type prediction method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911365467.5A CN111159437B (en) | 2019-12-26 | 2019-12-26 | Film and television work propagation result and type prediction method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111159437A (en) | 2020-05-15 |
CN111159437B (en) | 2023-08-22 |
Family
ID=70558266
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911365467.5A Active CN111159437B (en) | 2019-12-26 | 2019-12-26 | Film and television work propagation result and type prediction method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111159437B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115983499A (en) * | 2023-03-03 | 2023-04-18 | 北京奇树有鱼文化传媒有限公司 | Box office prediction method and device, electronic equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106991533A (en) * | 2017-04-07 | 2017-07-28 | 杭州火剧科技有限公司 | Predict the method and server of films and television programs investment risk |
US20170323313A1 (en) * | 2015-02-04 | 2017-11-09 | Alibaba Group Holding Limited | Information propagation method and apparatus |
Also Published As
Publication number | Publication date |
---|---|
CN111159437B (en) | 2023-08-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11601703B2 (en) | Video recommendation based on video co-occurrence statistics | |
CN101281540B (en) | Apparatus, method and computer program for processing information | |
CN113055741B (en) | Video abstract generation method, electronic equipment and computer readable storage medium | |
US7680768B2 (en) | Information processing apparatus and method, program, and storage medium | |
CN111757170B (en) | Video segmentation and marking method and device | |
US20170169040A1 (en) | Method and electronic device for recommending video | |
CN104199898B (en) | Acquisition methods and device, the method for pushing and device of a kind of attribute information | |
CN112464100B (en) | Information recommendation model training method, information recommendation method, device and equipment | |
CN111259245A (en) | Work pushing method and device and storage medium | |
CN111159437A (en) | Method and system for predicting transmission result and type of film and television works | |
CN112231579B (en) | Social video recommendation system and method based on implicit community discovery | |
CN117956232A (en) | Video recommendation method and device | |
CN112417845A (en) | Text evaluation method and device, electronic equipment and storage medium | |
CN110769288A (en) | Video cold start recommendation method and system | |
CN110929035B (en) | Information prediction method and system for film and television works | |
CN112804580B (en) | Video dotting method and device | |
CN114637909A (en) | Film recommendation system and method based on improved deep structured semantic model | |
CN115048546A (en) | Video color ring music matching recommendation method, device, equipment and computer storage medium | |
Ando et al. | A robust scene recognition system for baseball broadcast using data-driven approach | |
AU2015200201B2 (en) | Video recommendation based on video co-occurrence statistics | |
CN109063137A (en) | A kind of recommended determines method, apparatus, equipment and readable storage medium storing program for executing | |
CN111104552A (en) | Method for predicting movie scoring category based on movie structural information and brief introduction | |
JP2003167891A (en) | Word significance calculating method, device, program and recording medium | |
JP2009048334A (en) | Video identification processing apparatus, image identification processing apparatus, and computer program | |
Ando et al. | Robust scene recognition using language models for scene contexts |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||