Multimedia video stream management system and method based on artificial intelligence
Technical Field
The invention relates to the technical field of video stream label processing, in particular to a multimedia video stream management system and method based on artificial intelligence.
Background
AI-based tagging of multimedia video streams refers to the process of automatically tagging and classifying video content using artificial intelligence and machine learning algorithms. Specifically, the video content is first analyzed through computer vision and deep learning techniques to identify key elements in the video (such as characters, objects, and scenes), and tags are then generated according to these elements. These tags may cover the content description, emotional tone, topic classification, and other aspects of the video.
AI is widely used in the field of tagging multimedia video streams. First, for personalized recommendation, analyzing video content to generate tags and combining them with a user's interest preferences enables accurate personalized recommendation. Second, in content auditing, AI can help detect illegal content and ensure the safety and compliance of the platform. Finally, in media asset management, automatically generated tags can greatly improve the retrieval efficiency of media assets and help users quickly find the required video data. For example, repeated or similar segments in a video can be identified and retrieved using AI technology, which suits scenarios such as originality identification and video duplicate checking.
It can therefore be seen that the accuracy with which the AI determines the attribute tags of a video stream directly affects how effectively the AI processing results can be used in practice. However, data deviation problems often occur when the AI processes video streams, mainly because the data set used to train the AI model is imbalanced or incomplete in specific respects, so that the model makes prediction errors for certain groups or conditions; such imbalance or incompleteness reflects a lack of diversity in the sample data.
Disclosure of Invention
The invention aims to provide a multimedia video stream management system and method based on artificial intelligence, which are used for solving the problems in the prior art.
In order to achieve the above purpose, the invention provides a multimedia video stream management method based on artificial intelligence, which comprises the following steps:
Step S1, collecting, on a multimedia video stream management cloud platform, the process data generated when each video stream segment received from a multimedia port has a corresponding video tag generated by calling an AI module, and collating the reference element information according to which the AI module classifies the video tag type of any video stream segment;
Step S2, identifying, by comparing the deviation distribution of the reference element information presented between any two different video tags, any two video tags for which the AI module exhibits adjacent judgment during differential classification, and extracting several pairs of video tag groups that are mutually adjacent tags;
Step S3, extracting the distribution of the distinguishing reference element information presented while the AI module judges the video tags of the corresponding video stream segments within the range of the corresponding mutually adjacent tags, and calculating the mutual adjacency degree value of each pair of mutually adjacent video tag groups;
Step S4, monitoring, in the multimedia video stream management cloud platform, the distribution of users' calling rates for the video stream segments stored in each storage area, adjusting the video tags of video stream segments with abnormal calling rates by referring to the corresponding adjacent tags, and judging, according to the calling rate change brought about by the adjusted video tags, whether to send an early warning prompt indicating that the AI module needs to be optimized.
Preferably, step S1 includes:
Step S1-1, before the multimedia video stream management cloud platform calls the AI module to generate a corresponding video tag for each video stream segment received from a multimedia port, acquiring all the feature element information extracted from each video stream segment after image recognition and audio analysis, and collecting it to generate a feature element information set corresponding to each video stream segment;
As can be seen from the above, the feature element information set corresponding to each video stream segment is ultimately the information set that the AI module analyzes and references when generating a certain exact video tag for that video stream segment;
Step S1-2, respectively collecting the feature element information sets of all video stream segments stored in the multimedia video stream management cloud platform that carry the same video tag, so as to obtain the several reference element information sets according to which the AI module classifies the video tag type of any video stream segment, wherein video stream segments with the same video tag are placed in the same storage area of the multimedia video stream management cloud platform for data storage, and one reference element information set corresponds to one video tag.
Preferably, the step S2 comprises the following steps:
Step S2-1, extracting the overlapping feature element information, pair by pair, from the reference element information sets corresponding to any two different video tags, so as to generate the overlapping reference element information set corresponding to each pair of different video tags;
Step S2-2, acquiring the total number of feature element information items contained in each overlapping reference element information set between each video tag and every other type of video tag; if the total number of feature element information items contained in the overlapping reference element information set between any two different video tags is greater than a total threshold, judging that the AI module exhibits adjacent judgment when differentially classifying those two video tags, and judging that the two video tags form a pair of mutually adjacent tags in a video tag group.
From the above it can be seen that, when a video stream segment is judged to be one specific tag among mutually adjacent tags, the AI module determines the final tag division result from relatively tiny distinguishing feature elements. In other words, when a video stream segment containing the corresponding overlapping feature elements is judged to be a specific type among the mutually adjacent tags, the requirement on the AI module's recognition accuracy is higher, and if the AI module has a recognition data deviation, errors are more likely to occur when it differentially classifies between the mutually adjacent video tags.
Preferably, step S3 includes:
Step S3-1, if video tag A and video tag B are mutually adjacent tags, extracting the reference element information set P(A) of video tag A, the reference element information set P(B) of video tag B, and the overlapping reference element information set Ua,b between video tag A and video tag B; extracting the difference reference element information set C1 = P(A) − Ua,b for video tag A, and extracting the difference reference element information set C2 = P(B) − Ua,b for video tag B;
The difference reference element information sets obtained for video tag A and video tag B are the feature element information that the AI module uses, when judging whether a video stream segment whose feature element information set contains the overlapping reference element information of video tag A and video tag B is ultimately video tag A or video tag B, to narrow down the degree of match with one tag or the other. That is to say, the feature element information contained in the overlapping reference element information set Ua,b represents the common element information between video tag A and video tag B, the feature element information contained in the difference set obtained for video tag A represents the individual element information of video tag A, and the feature element information contained in the difference set obtained for video tag B represents the individual element information of video tag B;
Step S3-2, collecting from the multimedia video stream management cloud platform all video stream segments marked with video tag A to obtain a first video stream segment set Y1, and collecting all video stream segments marked with video tag B to obtain a second video stream segment set Y2; if a video stream segment in the first video stream segment set Y1 or the second video stream segment set Y2 has an extracted feature element information set R satisfying R ∩ Ua,b = Q ≠ ∅, where Q represents the intersection between the feature element information set R and the overlapping reference element information set Ua,b and ∅ is the empty set, feature-marking that video stream segment and extracting the target distinguishing element information set Q′ = R − Q from it;
Step S3-3, collecting the target distinguishing element information sets extracted from all feature-marked video stream segments in the first video stream segment set Y1 and accumulating the number of kinds of feature element information as η1, collecting the target distinguishing element information sets extracted from all feature-marked video stream segments in the second video stream segment set Y2 and accumulating the number of kinds of feature element information as η2, and calculating the first adjacency index between video tag A and video tag B as β1 = [η1/card(C1) + η2/card(C2)]/2;
A higher η1/card(C1) or η2/card(C2) indicates that, when a video stream segment containing overlapping reference element information is judged to be video tag A or video tag B, the distribution of the relevant distinguishing reference element information is more uniform. In other words, the AI module has less difficulty making the adjacent-distinction division when excluding, on the basis of the distinguishing reference element information of video tag A or video tag B, the conclusion that the segment belongs to the other tag; the video feature gap between a video stream segment carrying one tag's distinguishing reference element information and the video stream segments of the other tag is sufficiently wide; the actual adjacency between video tag A and video tag B at the level of data judgment and analysis is lower; and errors are less likely to occur when the AI module differentially classifies between video tag A and video tag B;
Step S3-4, obtaining the proportion α1 of feature-marked video stream segments in the first video stream segment set Y1 and the proportion α2 of feature-marked video stream segments in the second video stream segment set Y2, and calculating the second adjacency index between video tag A and video tag B as β2 = (α1 + α2)/2;
The higher the proportion α1 or α2, the higher the share, among the video stream segments judged to be video tag A or video tag B, of video stream segments containing the overlapping reference element information between video tag A and video tag B, and the more frequently the AI module faces adjacent-distinction division between video tag A and video tag B;
Step S3-5, performing similarity calculation one by one between the feature element information sets of the video stream segments in the first video stream segment set Y1 and the feature element information sets of the video stream segments in the second video stream segment set Y2, capturing the highest similarity value δ, and calculating the adjacency degree value between video tag A and video tag B as ζ = (1/β1 + β2) × δ.
Preferably, step S4 includes:
Step S4-1, monitoring the average calling rate of users for all video stream segments stored in each storage area; if a video stream segment whose calling rate is lower than the average calling rate exists in the storage area corresponding to video tag a, capturing the video tag b that has the highest adjacency degree value with respect to video tag a, and adjusting the video tag of that video stream segment from a to b;
Step S4-2, monitoring the calling rate presented by that video stream segment per unit period after its video tag is adjusted from a to b; if the calling rate increases compared with the value presented before the adjustment, generating an early warning signal indicating that the AI module should be optimized; if the calling rate does not increase compared with the value before the adjustment, cancelling the adjustment;
Step S4-3, when the accumulated number of early warning signals indicating that the AI module should be optimized is greater than a count threshold, sending an early warning prompt that the AI module needs to be optimized to the management terminal.
In order to better realize the method, the invention also provides a multimedia video stream management system, which comprises a video stream tag processing data management module, an adjacent tag judgment management module, an adjacency degree value calculation management module, and an AI module optimization prompt management module;
The video stream tag processing data management module is used for collecting, for the multimedia video stream management cloud platform, the process data generated when each video stream segment received from the multimedia port has a corresponding video tag generated by calling the AI module, and for collating the reference element information according to which the AI module classifies the video tag type of any video stream segment;
The adjacent tag judgment management module is used for identifying, by comparing the deviation distribution of the reference element information presented between any two different video tags, any two video tags for which the AI module exhibits adjacent judgment during differential classification, and for extracting several pairs of video tag groups that are mutually adjacent tags;
The adjacency degree value calculation management module is used for extracting the distribution of the distinguishing reference element information presented while the AI module judges the video tags of the corresponding video stream segments within the range of the corresponding mutually adjacent tags, and for calculating the mutual adjacency degree value of each pair of mutually adjacent video tag groups;
The AI module optimization prompt management module is used for monitoring, in the multimedia video stream management cloud platform, the distribution of users' calling rates for the video stream segments stored in each storage area, for adjusting the video tags of video stream segments with abnormal calling rates by referring to the corresponding adjacent tags, and for judging, according to the calling rate change brought about by the adjusted video tags, whether to send an early warning prompt that the AI module needs to be optimized.
Preferably, the adjacent tag judgment management module comprises a feature element information collation unit and an adjacent tag judgment unit;
The feature element information collation unit is used for comparing the deviation distribution of the reference element information presented between any two different video tags;
The adjacent tag judgment unit is used for identifying any two video tags for which the AI module exhibits adjacent judgment during differential classification, and for extracting several pairs of video tag groups that are mutually adjacent tags.
Preferably, the AI module optimization prompt management module comprises a tag adjustment monitoring management unit and an optimization early warning prompt management unit;
The tag adjustment monitoring management unit is used for monitoring, in the multimedia video stream management cloud platform, the distribution of users' calling rates for the video stream segments stored in each storage area, and for adjusting the video tags of video stream segments with abnormal calling rates by referring to the corresponding adjacent tags;
The optimization early warning prompt management unit is used for judging, according to the calling rate change brought about by the adjusted video tags, whether to send an early warning prompt that the AI module needs to be optimized.
Compared with the prior art, the invention has the following beneficial effects: by collating the reference element information according to which the AI module classifies the video tag type of any video stream segment, the invention extracts, through data analysis, several pairs of mutually adjacent tags for which the AI module exhibits adjacent judgment during differential classification; by monitoring whether the classification results output by the AI module within the range of mutually adjacent video tags exhibit data deviation, a judgment basis is provided for whether the AI module needs to be optimized, adaptive monitoring of the AI module's performance in the field of multimedia video stream tag processing can be realized, and the discoverability of AI-tagged multimedia video content and its effect on user experience are effectively improved.
Drawings
Fig. 1 is a schematic flow chart of a multimedia video stream management method based on artificial intelligence.
Fig. 2 is a schematic structural diagram of an artificial intelligence-based multimedia video stream management system according to the present invention.
Detailed Description
All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to Figs. 1-2, the invention provides a technical scheme, namely a multimedia video stream management method based on artificial intelligence, which comprises the following steps:
Step S1, collecting, on a multimedia video stream management cloud platform, the process data generated when each video stream segment received from a multimedia port has a corresponding video tag generated by calling an AI module, and collating the reference element information according to which the AI module classifies the video tag type of any video stream segment;
wherein, step S1 includes:
Step S1-1, before the multimedia video stream management cloud platform calls the AI module to generate a corresponding video tag for each video stream segment received from a multimedia port, acquiring all the feature element information extracted from each video stream segment after image recognition and audio analysis, and collecting it to generate a feature element information set corresponding to each video stream segment;
Step S1-2, respectively collecting the feature element information sets of all video stream segments stored in the multimedia video stream management cloud platform that carry the same video tag, so as to obtain the several reference element information sets according to which the AI module classifies the video tag type of any video stream segment;
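The grouping described in step S1-2 can be sketched as a simple set union per tag. This is a minimal illustration only; the function name, tag names, and element identifiers are hypothetical and not part of the claimed method:

```python
# Illustrative sketch of step S1-2: accumulate one reference element
# information set per video tag by taking the union of the feature
# element information sets of all segments carrying that tag.
def build_reference_sets(segments):
    """segments: list of (video_tag, feature_element_set) pairs."""
    reference_sets = {}
    for tag, features in segments:
        # Union the segment's features into the tag's reference set.
        reference_sets.setdefault(tag, set()).update(features)
    return reference_sets

# Hypothetical stored segments and their extracted feature elements.
segments = [
    ("tag_A", {"r1", "r2", "r4"}),
    ("tag_A", {"r1", "r7"}),
    ("tag_B", {"r1", "r2", "r3"}),
]
ref = build_reference_sets(segments)
print(sorted(ref["tag_A"]))  # ['r1', 'r2', 'r4', 'r7']
```

One reference element information set thus corresponds to one video tag, as the step requires.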
Step S2, identifying, by comparing the deviation distribution of the reference element information presented between any two different video tags, any two video tags for which the AI module exhibits adjacent judgment during differential classification, and extracting several pairs of video tag groups that are mutually adjacent tags;
wherein, step S2 includes:
Step S2-1, extracting the overlapping feature element information, pair by pair, from the reference element information sets corresponding to any two different video tags, so as to generate the overlapping reference element information set corresponding to each pair of different video tags;
For example, suppose the reference element information set corresponding to the first video tag includes {feature element information r1, feature element information r2, feature element information r4, feature element information r7}, and the reference element information set corresponding to the second video tag includes, say, {feature element information r1, feature element information r2, feature element information r3, feature element information r5};
the feature element information overlapping between the first video tag and the second video tag then comprises feature element information r1 and feature element information r2, so the overlapping reference element information set of the first video tag and the second video tag is {feature element information r1, feature element information r2};
Step S2-2, acquiring the total number of feature element information items contained in each overlapping reference element information set between each video tag and every other type of video tag; if the total number of feature element information items contained in the overlapping reference element information set between any two different video tags is greater than a total threshold, judging that the AI module exhibits adjacent judgment when differentially classifying those two video tags, and judging that the two video tags form a pair of mutually adjacent tags in a video tag group;
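Steps S2-1 and S2-2 above amount to a set intersection followed by a threshold test. The following minimal sketch illustrates this; the set contents and the threshold value are assumptions for the example:

```python
# Illustrative sketch of steps S2-1/S2-2: the overlapping reference
# element information set is the intersection of two tags' reference
# sets, and two tags are judged mutually adjacent when that
# intersection holds more elements than a total threshold.
def overlapping_set(ref_a, ref_b):
    return ref_a & ref_b  # set intersection

def are_adjacent(ref_a, ref_b, total_threshold):
    return len(overlapping_set(ref_a, ref_b)) > total_threshold

# Hypothetical reference element information sets for two tags.
first_tag = {"r1", "r2", "r4", "r7"}
second_tag = {"r1", "r2", "r3", "r5"}
print(sorted(overlapping_set(first_tag, second_tag)))  # ['r1', 'r2']
print(are_adjacent(first_tag, second_tag, total_threshold=1))  # True
```

With a threshold of 1, the two-element overlap {r1, r2} makes the pair mutually adjacent tags; raising the threshold to 2 would not.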
Step S3, extracting the distribution of the distinguishing reference element information presented while the AI module judges the video tags of the corresponding video stream segments within the range of the corresponding mutually adjacent tags, and calculating the mutual adjacency degree value of each pair of mutually adjacent video tag groups;
wherein, step S3 includes:
Step S3-1, if video tag A and video tag B are mutually adjacent tags, extracting the reference element information set P(A) of video tag A, the reference element information set P(B) of video tag B, and the overlapping reference element information set Ua,b between video tag A and video tag B; extracting the difference reference element information set C1 = P(A) − Ua,b for video tag A, and extracting the difference reference element information set C2 = P(B) − Ua,b for video tag B;
Step S3-2, collecting from the multimedia video stream management cloud platform all video stream segments marked with video tag A to obtain a first video stream segment set Y1, and collecting all video stream segments marked with video tag B to obtain a second video stream segment set Y2; if a video stream segment in the first video stream segment set Y1 or the second video stream segment set Y2 has an extracted feature element information set R satisfying R ∩ Ua,b = Q ≠ ∅, where Q represents the intersection between the feature element information set R and the overlapping reference element information set Ua,b and ∅ is the empty set, feature-marking that video stream segment and extracting the target distinguishing element information set Q′ = R − Q from it;
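The feature-marking condition of step S3-2 can be sketched as follows. All set contents here are illustrative assumptions:

```python
# Illustrative sketch of step S3-2: a segment whose feature element
# information set R has a non-empty intersection Q with the overlapping
# reference element information set U_ab is feature-marked, and its
# target distinguishing element information set Q' = R - Q is returned.
def feature_mark(R, U_ab):
    Q = R & U_ab          # intersection with the overlapping set
    if Q:                 # R ∩ U_ab = Q ≠ ∅ → segment is feature-marked
        return R - Q      # target distinguishing element set Q'
    return None           # segment is not feature-marked

U_ab = {"r1", "r2"}                      # hypothetical overlapping set
print(feature_mark({"r1", "r2", "r9"}, U_ab))  # {'r9'}
print(feature_mark({"r9"}, U_ab))              # None
```

A segment sharing no elements with Ua,b is left unmarked, matching the condition that Q must be non-empty.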
Step S3-3, collecting the target distinguishing element information sets extracted from all feature-marked video stream segments in the first video stream segment set Y1 and accumulating the number of kinds of feature element information as η1, collecting the target distinguishing element information sets extracted from all feature-marked video stream segments in the second video stream segment set Y2 and accumulating the number of kinds of feature element information as η2, and calculating the first adjacency index between video tag A and video tag B as β1 = [η1/card(C1) + η2/card(C2)]/2;
Step S3-4, obtaining the proportion α1 of feature-marked video stream segments in the first video stream segment set Y1 and the proportion α2 of feature-marked video stream segments in the second video stream segment set Y2, and calculating the second adjacency index between video tag A and video tag B as β2 = (α1 + α2)/2;
The higher the proportion α1 or α2, the higher the share, among the video stream segments judged to be video tag A or video tag B, of video stream segments containing the overlapping reference element information between video tag A and video tag B, and the more frequently the AI module faces adjacent-distinction division between video tag A and video tag B;
Step S3-5, performing similarity calculation one by one between the feature element information sets of the video stream segments in the first video stream segment set Y1 and the feature element information sets of the video stream segments in the second video stream segment set Y2, capturing the highest similarity value δ, and calculating the adjacency degree value between video tag A and video tag B as ζ = (1/β1 + β2) × δ.
For example, the first video stream segment set Y1 includes a video stream segment w1, a video stream segment w2, and a video stream segment w3;
The second video stream segment set Y2 comprises a video stream segment d1 and a video stream segment d2;
the similarity between the feature element information set corresponding to the video stream segment w1 and the feature element information set corresponding to the video stream segment d1 is Q1;
the similarity between the characteristic element information set corresponding to the video stream fragment w1 and the characteristic element information set corresponding to the video stream fragment d2 is Q2;
the similarity between the characteristic element information set corresponding to the video stream fragment w2 and the characteristic element information set corresponding to the video stream fragment d1 is Q3;
the similarity between the characteristic element information set corresponding to the video stream fragment w2 and the characteristic element information set corresponding to the video stream fragment d2 is Q4;
the similarity between the characteristic element information set corresponding to the video stream fragment w3 and the characteristic element information set corresponding to the video stream fragment d1 is Q5;
the similarity between the characteristic element information set corresponding to the video stream fragment w3 and the characteristic element information set corresponding to the video stream fragment d2 is Q6;
if Q6 > Q3 > Q2 > Q4 > Q1 > Q5 is satisfied, the highest captured similarity value is δ = Q6;
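Steps S3-1 through S3-5 can be sketched end to end as follows. This is an illustrative assumption-laden sketch, not the claimed implementation: Jaccard similarity stands in for the unspecified similarity measure, and all sets are hypothetical:

```python
# Illustrative end-to-end sketch of steps S3-1..S3-5.
def jaccard(a, b):
    # Stand-in similarity measure (the method does not fix one).
    return len(a & b) / len(a | b) if a | b else 0.0

def proximity(P_A, P_B, Y1, Y2):
    U = P_A & P_B                 # overlapping reference set U_ab
    C1, C2 = P_A - U, P_B - U     # difference reference sets

    def mark(Y):
        kinds, marked = set(), 0
        for R in Y:
            Q = R & U
            if Q:                 # feature-marked segment (S3-2)
                marked += 1
                kinds |= R - Q    # accumulate Q' element kinds (S3-3)
        return kinds, marked

    k1, m1 = mark(Y1)
    k2, m2 = mark(Y2)
    beta1 = (len(k1) / len(C1) + len(k2) / len(C2)) / 2   # S3-3
    beta2 = (m1 / len(Y1) + m2 / len(Y2)) / 2             # S3-4
    delta = max(jaccard(r1, r2) for r1 in Y1 for r2 in Y2)  # S3-5
    return (1 / beta1 + beta2) * delta                    # zeta

# Hypothetical reference sets and segment feature sets.
P_A = {"r1", "r2", "r4", "r7"}
P_B = {"r1", "r2", "r3", "r5"}
Y1 = [{"r1", "r4"}, {"r4", "r7"}]
Y2 = [{"r2", "r3"}, {"r3", "r4"}]
print(round(proximity(P_A, P_B, Y1, Y2), 3))  # 0.833
```

Here β1 = β2 = 0.5 and δ = 1/3, giving ζ = (1/0.5 + 0.5) × 1/3 ≈ 0.833.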
Step S4, monitoring, in the multimedia video stream management cloud platform, the distribution of users' calling rates for the video stream segments stored in each storage area, adjusting the video tags of video stream segments with abnormal calling rates by referring to the corresponding adjacent tags, and judging, according to the calling rate change brought about by the adjusted video tags, whether to send an early warning prompt indicating that the AI module needs to be optimized;
Wherein, step S4 includes:
Step S4-1, monitoring the average calling rate of users for all video stream segments stored in each storage area; if a video stream segment whose calling rate is lower than the average calling rate exists in the storage area corresponding to video tag a, capturing the video tag b that has the highest adjacency degree value with respect to video tag a, and adjusting the video tag of that video stream segment from a to b;
Step S4-2, monitoring the calling rate presented by that video stream segment per unit period after its video tag is adjusted from a to b; if the calling rate increases compared with the value presented before the adjustment, generating an early warning signal indicating that the AI module should be optimized; if the calling rate does not increase compared with the value before the adjustment, cancelling the adjustment;
Step S4-3, when the accumulated number of early warning signals indicating that the AI module should be optimized is greater than a count threshold, sending an early warning prompt that the AI module needs to be optimized to the management terminal.
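The monitoring loop of steps S4-1 to S4-3 can be sketched as follows. All names, rates, and the adjacency table are illustrative assumptions; `post_adjust_rate` stands in for the observed per-period calling rate after a tag change:

```python
# Illustrative sketch of step S4: a segment whose calling rate falls
# below the storage area's average has its tag switched to the most
# adjacent tag; if the post-adjustment rate rises, an optimization
# warning signal is counted, otherwise the adjustment is cancelled;
# a prompt fires once the count exceeds a threshold.
def monitor(call_rates, adjacency, post_adjust_rate, warn_count, threshold):
    """call_rates: {segment: rate} within one tag's storage area."""
    avg = sum(call_rates.values()) / len(call_rates)
    signals = warn_count
    for seg, rate in call_rates.items():
        if rate < avg:                                    # abnormal rate
            new_tag = max(adjacency, key=adjacency.get)   # most adjacent tag
            if post_adjust_rate(seg, new_tag) > rate:     # rate improved
                signals += 1                              # warning signal
            # else: the tag adjustment is cancelled
    return signals, signals > threshold

# Hypothetical data: segment w1 is called far less than average.
rates = {"w1": 2.0, "w2": 8.0}
adjacency = {"tag_b": 0.83, "tag_c": 0.41}
result = monitor(rates, adjacency, lambda s, t: 5.0, warn_count=2, threshold=2)
print(result)  # (3, True)
```

With two signals already accumulated, the improved calling rate of w1 after re-tagging produces a third signal, exceeding the threshold and triggering the early warning prompt.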
In order to better realize the method, the invention also provides a multimedia video stream management system, which comprises a video stream tag processing data management module, an adjacent tag judgment management module, an adjacency degree value calculation management module, and an AI module optimization prompt management module;
The video stream tag processing data management module is used for collecting, for the multimedia video stream management cloud platform, the process data generated when each video stream segment received from the multimedia port has a corresponding video tag generated by calling the AI module, and for collating the reference element information according to which the AI module classifies the video tag type of any video stream segment;
The adjacent tag judgment management module is used for identifying, by comparing the deviation distribution of the reference element information presented between any two different video tags, any two video tags for which the AI module exhibits adjacent judgment during differential classification, and for extracting several pairs of video tag groups that are mutually adjacent tags;
The adjacent tag judgment management module comprises a feature element information collation unit and an adjacent tag judgment unit;
The feature element information collation unit is used for comparing the deviation distribution of the reference element information presented between any two different video tags;
The adjacent tag judgment unit is used for identifying any two video tags for which the AI module exhibits adjacent judgment during differential classification, and for extracting several pairs of video tag groups that are mutually adjacent tags;
The adjacency degree value calculation management module is used for extracting the distribution of the distinguishing reference element information presented while the AI module judges the video tags of the corresponding video stream segments within the range of the corresponding mutually adjacent tags, and for calculating the mutual adjacency degree value of each pair of mutually adjacent video tag groups;
The AI module optimization prompt management module is used for monitoring, in the multimedia video stream management cloud platform, the distribution of users' calling rates for the video stream segments stored in each storage area, for adjusting the video tags of video stream segments with abnormal calling rates by referring to the corresponding adjacent tags, and for judging, according to the calling rate change brought about by the adjusted video tags, whether to send an early warning prompt that the AI module needs to be optimized;
The AI module optimization prompt management module comprises a tag adjustment monitoring management unit and an optimization early warning prompt management unit;
The tag adjustment monitoring management unit is used for monitoring, in the multimedia video stream management cloud platform, the distribution of users' calling rates for the video stream segments stored in each storage area, and for adjusting the video tags of video stream segments with abnormal calling rates by referring to the corresponding adjacent tags;
The optimization early warning prompt management unit is used for judging, according to the calling rate change brought about by the adjusted video tags, whether to send an early warning prompt that the AI module needs to be optimized.
It should be noted that the above embodiments are merely preferred embodiments of the present invention, and the invention is not limited thereto; those skilled in the art may modify or substitute some of its technical features. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.