CN116156279A - Short video transmission processing method and system based on artificial intelligence - Google Patents

Short video transmission processing method and system based on artificial intelligence

Info

Publication number
CN116156279A
Authority
CN
China
Prior art keywords
video; vector; content; video expression; expression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310222805.XA
Other languages
Chinese (zh)
Inventor
曾海兵
闫国范
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by: Individual
Priority claimed: CN202310222805.XA
Publication: CN116156279A
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7834Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Library & Information Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to the technical fields of artificial intelligence, live broadcast, and short video processing, and provides a short video transmission processing method and system based on artificial intelligence. In the method, when a thread is configured, the video expression vector of the example video expression content is identified in combination with the secondary video expression content associated with it. The identified video expression vector is therefore more accurate and reliable, which improves the accuracy with which video expression content is processed, enables precise control of video data transmission, and improves the efficiency of short video transmission processing.

Description

Short video transmission processing method and system based on artificial intelligence
Technical Field
The application relates to the technical field of artificial intelligence, live broadcast and short video processing, in particular to a short video transmission processing method and system based on artificial intelligence.
Background
In the internet era, short videos have become part of how people record their lives and an important component of live-streaming commerce and online sales. In conventional technology, transmitted videos often suffer from defects such as incomplete playback and stalling, which degrade the user experience; existing approaches do not effectively resolve these problems, and the fluency and integrity of the video cannot be improved.
Disclosure of Invention
In order to address the technical problems in the related art, the application provides a short video transmission processing method and system based on artificial intelligence.
In a first aspect, there is provided a short video transmission processing method based on artificial intelligence, the method at least comprising: obtaining first video interaction data, wherein the first video interaction data comprises an important video segment and at least one secondary video segment, the segment vector of the important video segment represents a video expression vector of target video expression content, the video expression vector is controlled to transmit, the segment vector of the secondary video segment represents a video expression vector of secondary video expression content, and the secondary video expression content is video expression content associated with the target video expression content; and loading the first video interaction data to a vector optimization thread, wherein the vector optimization thread optimizes the segment vector of the important video segment in combination with the segment vector of the secondary video segment in the first video interaction data to obtain the video expression vector of the optimized target video expression content, and controls the video expression vector to transmit.
In an independent embodiment, the vector optimization thread optimizing the segment vector of the important video segment in combination with the segment vector of the secondary video segment in the first video interaction data to obtain the video expression vector of the optimized target video expression content, and controlling the video expression vector to transmit, comprises: determining a confidence level between the important video segment and each secondary video segment in the first video interaction data; fusing the video expression vectors of the secondary video segments according to the confidence levels to obtain a weighting vector of the important video segment; and combining the video expression vector of the important video segment with the weighting vector to obtain the video expression vector of the optimized target video expression content, and controlling the video expression vector to transmit.
In an independently implemented embodiment, before obtaining the first video interaction data, the method further comprises: acquiring, in combination with the target video expression content, the secondary video expression content associated with the target video expression content from a pre-stored video expression content set.
In an independently implemented embodiment, the acquiring, in combination with the target video expression content, the secondary video expression content associated with the target video expression content from the pre-stored video expression content set comprises: obtaining, one by one through a vector identification thread, the video expression vector of the target video expression content and the video expression vector of each audio content in the pre-stored video expression content set, and controlling the video expression vectors to transmit; and determining, based on the vector commonality association degree between the video expression vector of the target video expression content and the video expression vector of each audio content in the pre-stored video expression content set, the secondary video expression content associated with the target video expression content from the pre-stored video expression content set.
In an independent embodiment, the determining, based on the vector commonality association degree between the video expression vector of the target video expression content and the video expression vector of each audio content in the pre-stored video expression content set, the secondary video expression content associated with the target video expression content comprises: ranking the vector commonality association degrees between the target video expression content and each audio content in descending order of association coefficient; and screening the audio contents corresponding to the top-ranked vector commonality association degrees, and regarding them as the secondary video expression content associated with the target video expression content.
In an independently implemented embodiment, the determining, based on the vector commonality association degree between the video expression vector of the target video expression content and the video expression vector of each audio content in the pre-stored video expression content set, the secondary video expression content associated with the target video expression content from the pre-stored video expression content set comprises: obtaining, from each audio content and in combination with the vector commonality association degree between the video expression vector of the target video expression content and the video expression vector of each audio content, a first video expression content associated with the target video expression content; obtaining, from each audio content and in combination with the vector commonality association degree between the video expression vector of the first video expression content and the video expression vector of the audio content, a second video expression content associated with the first video expression content; and regarding the first video expression content and the second video expression content as the secondary video expression content of the target video expression content.
In an independent embodiment, the number of vector optimization threads is one, or several accumulated one by one; when the number of vector optimization threads is several, the input of any given vector optimization thread other than the first is the first video interaction data output by the preceding vector optimization thread.
In an independent embodiment, the fusing the video expression vectors of the secondary video segments in combination with the confidence levels to obtain the weighting vector of the important video segment comprises: clustering the video expression vectors of the secondary video segments in combination with the confidence levels to obtain the weighting vector of the important video segment.
In an independent embodiment, the combining the video expression vector of the important video segment with the weighting vector to obtain the video expression vector of the optimized target video expression content, and controlling the video expression vector to transmit, comprises: combining the video expression vector of the important video segment with the weighting vector; and performing a first projection on the combined vector to obtain the video expression vector of the optimized target video expression content, and controlling the video expression vector to transmit.
In an independently implemented embodiment, the determining a confidence level between the important video segment and each secondary video segment in the first video interaction data comprises: performing a second projection on the important video segment and the secondary video segment; determining an association relation between the important video segment and the secondary video segment after the second projection; and determining the confidence level according to the association relation after the projection processing.
In an independently implemented embodiment, the target video expression content comprises the video expression content to be searched and each audio content in the pre-stored video expression content set; after obtaining the video expression vector of the target video expression content corresponding to the important video segment and controlling the video expression vector to transmit, the method further comprises: obtaining, based on the vector commonality association degree between the video expression vector of the optimized target video expression content and the video expression vector of each audio content, the near video expression content of the target video expression content from the audio contents as a search result.
In a second aspect, an artificial intelligence based short video transmission processing system is provided, comprising a processor and a memory in communication with each other, the processor being adapted to read a computer program from the memory and execute the computer program to implement the method described above.
According to the short video transmission processing method and system based on artificial intelligence, when a thread is configured, the video expression vector of the example video expression content is identified in combination with the secondary video expression content associated with it, so that the identified video expression vector is more accurate and reliable. This improves the accuracy with which video expression content is processed, and video data transmission can therefore be accurately controlled.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered limiting of the scope; a person skilled in the art may obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a flowchart of a short video transmission processing method based on artificial intelligence according to an embodiment of the present application.
Detailed Description
In order to better understand the above technical solutions, the technical solutions of the present application are described in detail below through the accompanying drawings and specific embodiments. It should be understood that the specific features of the embodiments of the present application are detailed descriptions of the technical solutions of the present application and do not limit them, and the technical features of the embodiments may be combined with each other without conflict.
Referring to fig. 1, a short video transmission processing method based on artificial intelligence is shown, which may include the following technical solutions described in step 101 and step 102.
In step 101, first video interaction data is obtained, where the first video interaction data includes an important video segment and at least one secondary video segment; the segment vector of the important video segment represents a video expression vector of the target video expression content, the video expression vector is controlled to transmit, the segment vector of the secondary video segment represents a video expression vector of secondary video expression content, and the secondary video expression content is video expression content associated with the target video expression content.
In this embodiment, the target video expression content is the video expression content whose video expression vector is to be screened. It may be video expression content in different application scenarios; for example, it may be video expression content to be searched in a video expression content search application, in which case the pre-stored video expression content set may be the search pre-stored video expression content set of that application.
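To make the data organization concrete, the following is a minimal sketch of how the first video interaction data might be represented; all names (VideoSegment, VideoInteractionData, expression_vector) are illustrative assumptions rather than structures defined by this application.

```python
from dataclasses import dataclass, field
from typing import List
import torch

@dataclass
class VideoSegment:
    """A video segment; its segment vector is the video expression
    vector of the video expression content the segment represents."""
    content_id: str
    expression_vector: torch.Tensor  # assumed shape: (d,)

@dataclass
class VideoInteractionData:
    """First video interaction data: one important video segment plus
    at least one secondary video segment associated with it."""
    important: VideoSegment
    secondary: List[VideoSegment] = field(default_factory=list)
```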
For example, before obtaining the first video interaction data, the secondary video expression content associated with the target video expression content may be acquired from the pre-stored video expression content set. The secondary video expression content may be determined according to a video expression vector similarity metric: the video expression vector of the target video expression content and the video expression vector of each audio content in the pre-stored video expression content set are obtained one by one through a vector identification thread and controlled to transmit; then, based on the vector commonality association degree between the video expression vector of the target video expression content and the video expression vector of each audio content, the secondary video expression content associated with the target video expression content is determined from the pre-stored video expression content set.
In one embodiment, the vector commonality association degrees between the target video expression content and each audio content may be ranked in descending order of association coefficient, and the audio contents corresponding to the first X vector commonality association degrees are screened out and regarded as the secondary video expression content associated with the target video expression content.
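The application does not fix the "vector commonality association degree" to a particular metric. The sketch below assumes cosine similarity and shows the descending ranking and first-X screening described above; top_x_secondary is a hypothetical helper name.

```python
import torch
import torch.nn.functional as F

def top_x_secondary(target_vec: torch.Tensor,
                    corpus: dict[str, torch.Tensor],
                    x: int) -> list[str]:
    """Rank pre-stored audio contents by an assumed cosine vector
    commonality association degree, descending, and keep the first X."""
    scores = {cid: F.cosine_similarity(target_vec, vec, dim=0).item()
              for cid, vec in corpus.items()}
    return sorted(scores, key=scores.get, reverse=True)[:x]
```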
In one possible implementation, it is also possible to obtain a first video expression content associated with the target video expression content according to the similarity between the video expression vectors, then obtain a second video expression content associated with the first video expression content, and regard both the first and the second video expression content as secondary video expression content of the target video expression content.
In step 102, the first video interaction data is loaded to a vector optimization thread, and the vector optimization thread optimizes the segment vector of the important video segment in combination with the segment vector of the secondary video segment in the first video interaction data to obtain a video expression vector of the optimized target video expression content, and controls the video expression vector to transmit.
Taking a feature extraction unit as an example, the feature extraction unit in this embodiment may optimize the segment vector of the important video segment according to the segment vectors of the secondary video segments. For example, it may determine the confidence level between the important video segment and each secondary video segment in the first video interaction data, fuse the video expression vectors of the secondary video segments according to the confidence levels to obtain the weighting vector of the important video segment, and combine the video expression vector of the important video segment with the weighting vector to obtain the video expression vector of the optimized target video expression content, so as to control the video expression vector to transmit.
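One plausible realization of this confidence-weighted fusion is an attention-style aggregation. The sketch below is an assumption about what a feature extraction unit could look like, not the definitive implementation; a second projection produces the confidence levels, and a first projection is applied after combining the important segment's vector with the weighting vector, matching the steps described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureExtractionUnit(nn.Module):
    """Optimizes the important segment's vector from the secondary
    segments' vectors: confidence -> weighted fusion -> combination."""
    def __init__(self, dim: int):
        super().__init__()
        self.second_projection = nn.Linear(dim, dim)      # used for confidence
        self.first_projection = nn.Linear(2 * dim, dim)   # used after combining

    def forward(self,
                important: torch.Tensor,                   # shape (d,)
                secondary: torch.Tensor) -> torch.Tensor:  # shape (n, d)
        q = self.second_projection(important)
        k = self.second_projection(secondary)
        # confidence level between the important segment and each secondary one
        confidence = F.softmax(k @ q / q.shape[-1] ** 0.5, dim=0)  # (n,)
        weighted = confidence @ secondary                  # weighting vector, (d,)
        combined = torch.cat([important, weighted])        # combine the two vectors
        return self.first_projection(combined)             # optimized expression vector
```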
In this embodiment, the number of feature extraction units may be one, or several accumulated one by one. For example, when there are two feature extraction units, the first video interaction data is loaded to the first feature extraction unit, which optimizes the video expression vector of the important video segment according to the video expression vector of each secondary video segment; in the first video interaction data output by the first feature extraction unit, the video expression vector of the important video segment has already been optimized once. This optimized first video interaction data is then loaded to the second feature extraction unit, which continues to optimize the video expression vector of the important video segment according to the video expression vector of each secondary video segment and outputs the twice-optimized first video interaction data.
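As a sketch of this accumulation (reusing the hypothetical FeatureExtractionUnit above), each unit's output simply becomes the next unit's input:

```python
def stacked_optimize(units, important, secondary):
    """Pass the important segment's vector through each feature
    extraction unit in turn; every unit re-optimizes it against the
    same secondary segment vectors."""
    for unit in units:  # e.g. [FeatureExtractionUnit(128), FeatureExtractionUnit(128)]
        important = unit(important, secondary)
    return important
```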
The first video interaction data in this embodiment includes a plurality of segments (e.g., the important video segment and the secondary video segments), where the segment vector of each segment characterizes the video expression vector of the video expression content represented by that segment. In addition, each segment in the first video interaction data can in turn be regarded as an important video segment, and the video expression vector of the video expression content corresponding to that segment is optimized through this embodiment: when a segment is regarded as the important video segment, first video interaction data in which that segment is the important video segment is obtained and loaded to the vector optimization thread to optimize the segment's video expression vector.
According to the short video transmission processing method based on artificial intelligence, the video expression vector is optimized and screened by the vector optimization thread, which optimizes the video expression vector of the important video segment according to the video expression vectors of its secondary video segments. The identified video expression vector of the target video expression content can thus be controlled to transmit, the target video expression content can be expressed more accurately, and the identification process of the video expression content becomes more accurate and reliable.
The processing flow of the vector optimization thread in this embodiment describes how the vector optimization thread optimizes the video expression vector of the video expression content loaded to the thread. Taking a feature extraction unit as an example, the vector optimization thread may include the following steps.
In step 200, a confidence level between the important video segment and the secondary video segment is determined based on the video expression vectors of the important video segment and the secondary video segment.
In this embodiment, the important video segment may correspond to the target video expression content of the thread application stage, and the secondary video segments may correspond to the secondary video expression contents of the target video expression content.
In step 202, the video expression vectors of the secondary video segments are clustered in combination with the confidence levels to obtain the weighting vector of the important video segment.
In step 204, the video expression vector of the important video segment and the weighting vector are combined to obtain an optimized vector of the optimized target video expression content.
Through the steps 200 to 204, the segment vector of the important video segment in the first video interaction data is optimized, and the video expression vector of the optimized important video segment is obtained.
According to the short video transmission processing method based on artificial intelligence, the feature extraction unit clusters the video expression vectors of the secondary video segments of the important video segment to determine the vector of the important video segment, so that the video expression vector of the example video expression content and the vectors of other related video expression contents can be referred to comprehensively. The identified video expression vector is therefore more accurate and reliable, the accuracy with which video expression content is processed is improved, and video data transmission can be accurately controlled.
This embodiment provides a method for configuring the vector optimization thread, describing the configuration process, which specifically includes the following steps.
In step 300, configuration secondary video expression content associated with the example video expression content used for configuring the vector optimization thread is obtained from a configuration pre-stored video expression content set according to the example video expression content.
For example, in an application scenario in which video expression content is to be processed, the pre-stored video expression content set may be a search pre-stored video expression content set; that is, the search pre-stored video expression content set is searched to obtain video expression content associated with the example video expression content.
In this embodiment, the video expression content obtained that is associated with the example video expression content may be referred to as the "configuration secondary video expression content".
The configuration secondary video expression content may be obtained, for example, by determining video expression content with higher proximity as the configuration secondary video expression content according to the vector commonality association degree between video expression contents.
In step 302, second video interaction data is obtained, where the second video interaction data includes a configuration important video segment and at least one configuration secondary video segment; the segment vector of the configuration important video segment represents the video expression vector of the example video expression content, the segment vector of the configuration secondary video segment represents the video expression vector of the configuration secondary video expression content, and the configuration secondary video expression content is video expression content associated with the example video expression content.
In this embodiment, the second video interaction data may include a plurality of segments.
The segments may comprise one configuration important video segment and not less than one configuration secondary video segment. The configuration important video segment represents the example video expression content, and each configuration secondary video segment represents one configuration secondary video expression content determined in step 300. The segment vector of each segment is a video expression vector; e.g., the segment vector of the configuration important video segment is the video expression vector of the example video expression content, and the segment vector of a configuration secondary video segment is the video expression vector of the corresponding configuration secondary video expression content.
In step 304, the second video interaction data is loaded to the vector optimization thread, which optimizes the segment vector of the configuration important video segment in combination with the segment vectors of the configuration secondary video segments in the second video interaction data.
The number of feature extraction units may be one, or several accumulated one by one. For example, when there are two feature extraction units, the video interaction data is loaded to the first feature extraction unit, which optimizes the video expression vector of the important video segment based on the video expression vector of each secondary video segment, so that the video interaction data it outputs has been optimized once. This optimized video interaction data is then loaded to the second feature extraction unit, which continues to optimize the video expression vector of the important video segment according to the video expression vector of each secondary video segment and outputs the twice-optimized video expression vector of the important video segment.
In step 306, regression analysis data of the example video expression content is obtained according to the video expression vector of the example video expression content screened by the vector optimization thread.
In step 308, the thread coefficients of the vector optimization thread are debugged in combination with the regression analysis data.
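The application leaves the form of the regression analysis data open. The sketch below assumes it is a scalar loss, here a placeholder mean-squared error against a hypothetical supervision vector, and shows coefficient debugging as one gradient update on the unit's parameters.

```python
import torch
import torch.nn.functional as F

def debug_thread_coefficients(unit, optimizer, important, secondary, supervision):
    """One configuration step: screen the example content's vector
    (steps 302-306), compute the regression analysis data, and debug
    (update) the vector optimization thread's coefficients (step 308)."""
    optimized = unit(important, secondary)
    regression_analysis_data = F.mse_loss(optimized, supervision)  # assumed loss form
    optimizer.zero_grad()
    regression_analysis_data.backward()
    optimizer.step()
    return regression_analysis_data.item()

# usage, with the hypothetical unit from earlier:
# unit = FeatureExtractionUnit(dim=128)
# optimizer = torch.optim.Adam(unit.parameters(), lr=1e-3)
```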
According to this configuration method for the vector optimization thread, when the thread is configured, the video expression vector of the example video expression content is identified in combination with the near video expression content of the example video expression content, so that the video expression vector of the example video expression content and the vectors of other related video expression contents can be referred to comprehensively. The identified video expression vector is therefore more accurate and reliable, the accuracy with which video expression content is processed is improved, and video data transmission can be accurately controlled.
In another embodiment, the method for configuring the vector optimization thread screens the video expression vector through a pre-configured thread for screening vectors (which may be referred to as a vector identification thread), and obtains the configuration secondary video expression content associated with the example video expression content from the configuration pre-stored video expression content set according to a similarity measure on the video expression vectors. Specifically, the method comprises the following steps.
In step 400, a thread for filtering vectors is preconfigured using a configuration set.
The video expression content in the configuration set may be referred to as configuration video expression content. The configuration process of the vector identification thread may include: screening the video expression vector of the configuration video expression content through the vector identification thread; obtaining regression analysis data of the configuration video expression content in combination with its video expression vector; and debugging the thread coefficients of the vector identification thread based on the regression analysis data and the identification information of the configuration video expression content.
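A minimal sketch of such a vector identification thread follows, under two assumptions: the thread is an encoder network, and the identification information is a class label used with a cross-entropy loss during configuration. Both are illustrative choices, not details fixed by this application.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorIdentificationThread(nn.Module):
    """Hypothetical encoder that screens a video expression vector from
    raw content features; the classification head is used only while
    configuring the thread."""
    def __init__(self, in_dim: int, dim: int, num_ids: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, dim), nn.ReLU(),
                                     nn.Linear(dim, dim))
        self.head = nn.Linear(dim, num_ids)

    def forward(self, content_feats: torch.Tensor) -> torch.Tensor:
        return self.encoder(content_feats)  # the screened video expression vector

def configure_step(thread, optimizer, feats, id_labels):
    """Debug the thread coefficients from the regression analysis data
    (here: cross-entropy against the identification information)."""
    logits = thread.head(thread(feats))
    loss = F.cross_entropy(logits, id_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```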
It should be understood that the configuration video expression content mentioned above refers to the video expression content used to configure the vector identification thread, while the example video expression content refers to content used in the configuration process of the vector optimization thread after the vector identification thread has been configured. For example, the preconfigured vector identification thread first screens the video expression vector of the example video expression content and of each audio content in the configuration pre-stored video expression content set; after the video interaction data is generated, these vectors are loaded into the vector optimization thread for video expression vector optimization. The video expression content whose vector is loaded into the vector optimization thread during its configuration is the example video expression content. The example video expression content and the configuration video expression content may be the same or different.
In step 402, the video expression vectors of the example video expression content and of each audio content in the configuration pre-stored video expression content set are obtained one by one through the vector identification thread.
In step 404, a first video expression content associated with the example video expression content is obtained from each audio content in combination with the vector commonality association degree between the example video expression content and the video expression vector of each audio content.
In this embodiment, the audio content is video expression content in a search pre-stored video expression content set.
For example, the vector commonality association degree between the video expression vector of the example video expression content and the video expression vector of each audio content may be calculated one by one, and the audio contents may be ranked by similarity, for example in descending order. The audio contents ranked in the first K positions of the ranking result are selected as the first video expression content of the example video expression content.
In step 406, a second video expression content associated with the first video expression content is obtained from the audio contents according to the vector commonality association degree between the video expression vectors of the first video expression content and of the audio contents.
In this embodiment, the vector commonality association degree between the video expression vectors of the first video expression content and of the audio contents may then be calculated, and the audio content associated with the first video expression content is obtained from the audio contents and regarded as the second video expression content.
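Reusing the hypothetical top_x_secondary helper above, the two-hop expansion of steps 404 and 406 could be sketched as follows; the value of k and the exclusion of already-selected contents are illustrative choices.

```python
def two_hop_secondary(example_vec, corpus, k):
    """First hop: the K audio contents nearest the example content.
    Second hop: the contents nearest each first-hop content."""
    first = top_x_secondary(example_vec, corpus, k)
    second = []
    for cid in first:
        remaining = {c: v for c, v in corpus.items()
                     if c not in first and c not in second}
        second += top_x_secondary(corpus[cid], remaining, k)
    return first + second  # together, the (configuration) secondary contents
```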
In this embodiment, the search for secondary video expression content may stop once the first video expression content of the important video segment corresponding to the example video expression content has been found. Alternatively, more layers of secondary video expression content may be found, such as a third or a fourth video expression content; how many layers to search can be determined according to real-time testing in different application scenarios. The first video expression content, the second video expression content, and so on may all be called secondary video expression content; in the configuration stage of the thread they may be called configuration secondary video expression content, and in the thread application stage simply secondary video expression content.
It will also be appreciated that the secondary video expression content may be obtained in ways other than the example of this step. For instance, a similarity target value may be set, and all or part of the audio contents whose vector commonality association degree is higher than the target value may be directly regarded as the secondary video expression content of the example video expression content. As another example, instead of using a vector identification thread to screen video expression vectors, the video expression vectors may be determined by taking values of several dimensions of the video expression content.
In step 408, second video interaction data is generated from the example video expression content and the configuration secondary video expression content. The segments in the second video interaction data comprise a configuration important video segment representing the example video expression content and at least one configuration secondary video segment representing configuration secondary video expression content; the segment vector of each segment is the video expression vector of the example video expression content or of the corresponding secondary video expression content.
In step 410, the second video interaction data is loaded to the vector optimization thread, which optimizes the video expression vector of the configuration important video segment in combination with the video expression vectors of the configuration secondary video segments in the second video interaction data, screens out the video expression vector of the example video expression content, and obtains regression analysis data of the example video expression content according to that vector.
In step 412, the thread coefficients of the vector optimization thread and the thread coefficients of the vector identification thread are debugged based on the regression analysis data of the example video expression content.
In this step, the thread coefficients of the vector identification thread may or may not be debugged together with those of the vector optimization thread, which can be determined according to the real-time configuration conditions.
According to this configuration method for the vector optimization thread, when the thread is configured, the video expression vector of the example video expression content is identified in combination with the near video expression content of the example video expression content, so that the video expression vector of the example video expression content and the vectors of other related video expression contents can be referred to comprehensively. The identified video expression vector is therefore more accurate and reliable, the accuracy with which video expression content is processed is improved, and video data transmission can be accurately controlled. In addition, screening the video expression vectors with the vector identification thread improves the screening efficiency and thus the thread configuration speed, and the thread coefficients of the vector identification thread can be debugged according to the loss value, making the vectors it screens more accurate.
The embodiment of the application also provides a video expression content searching method, which aims to search for video expression content related to the target video expression content in the pre-stored video expression content set. Specifically, the method comprises the following steps.
In step 700, the target video expression content to be processed is obtained.
In step 702, a video expression vector of the target video expression content is obtained through screening, and the video expression vector is controlled to be transmitted.
In this embodiment, the screening may be performed using the artificial-intelligence-based short video transmission processing method of any of the above embodiments.
In step 704, the video expression vector of each audio content in the pre-stored video expression content set is screened.
In this embodiment, the video expression vector of each audio content in the pre-stored video expression content set may likewise be screened using the artificial-intelligence-based short video transmission processing method of any of the above embodiments.
In step 706, based on the video expression vector of the target video expression content, the vector commonality association degree between that vector and the video expression vector of each audio content is evaluated to obtain the near video expression content of the target video expression content as the search result.
In this embodiment, the video expression vector of the target video expression content may be controlled to transmit, and the vector commonality association degree between it and the video expression vector of each audio content is measured, so that the associated audio contents are regarded as the search result.
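Putting the hypothetical pieces above together, the search flow of steps 700 to 706 might look as follows; ident_thread, units, target_feats, and corpus_feats are all assumed names carried over from the earlier sketches.

```python
import torch

with torch.no_grad():
    target_vec = ident_thread(target_feats)                             # step 702
    corpus = {cid: ident_thread(f) for cid, f in corpus_feats.items()}  # step 704
    secondary_ids = top_x_secondary(target_vec, corpus, x=5)
    secondary = torch.stack([corpus[c] for c in secondary_ids])
    optimized = stacked_optimize(units, target_vec, secondary)          # step 102
    search_result = top_x_secondary(optimized, corpus, x=10)            # step 706
```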
According to this video expression content searching method, the screened video expression vector of the example video expression content is more accurate and reliable, so the accuracy of the search result is improved.
Video interaction data may be generated based on the example video expression content and the searched secondary video expression content; it comprises an important video segment and a number of secondary video segments. The important video segment represents the example video expression content, and each secondary video segment represents one secondary video expression content, the secondary video segments including the first video expression content and also the second video expression content. The segment vector of each segment is the video expression vector of the video expression content represented by that segment, namely the vector screened when the secondary video expression content was obtained for the vector commonality association degree comparison; for example, the video expression vector screened by the vector identification thread can be used.
In step 102, the first video interaction data is loaded to a vector optimization thread, and the vector optimization thread optimizes the segment vector of the important video segment in combination with the segment vector of the secondary video segment in the first video interaction data to obtain a video expression vector of the optimized target video expression content, and controls the video expression vector to transmit.
It can be appreciated that when the content described in steps 101 and 102 above is executed, the video expression vector of the example video expression content is identified, during thread configuration, in combination with the secondary video expression content associated with it, so that the identified video expression vector is more accurate and reliable, the accuracy with which video expression content is processed is improved, and video data transmission can thus be accurately controlled.
On the basis of the above, there is provided an artificial intelligence based short video transmission processing apparatus 200 applied to an artificial intelligence based short video transmission processing system, the apparatus comprising:
a data obtaining module 210, configured to obtain first video interaction data, where the first video interaction data includes an important video segment and at least one secondary video segment, a segment vector of the important video segment represents a video expression vector of a target video expression content, the video expression vector is controlled to transmit, and a segment vector of the secondary video segment represents a video expression vector of a secondary video expression content, and the secondary video expression content is a video expression content associated with the target video expression content;
The vector transmission module 220 is configured to load the first video interaction data to a vector optimization thread, where the vector optimization thread optimizes a segment vector of the important video segment in combination with a segment vector of a secondary video segment in the first video interaction data to obtain a video expression vector of the optimized target video expression content, and controls the video expression vector to transmit.
Based on the above, an artificial intelligence based short video transmission processing system 300 is provided, comprising a processor 310 and a memory 320 in communication with each other, the processor 310 being adapted to read a computer program from the memory 320 and execute it to implement the method described above.
On the basis of the above, there is also provided a computer-readable storage medium on which a computer program is stored which, when run, implements the above method.
In summary, based on the above scheme, when a thread is configured, the video expression vector of the example video expression content is identified in combination with the secondary video expression content associated with it, so that the identified video expression vector is more accurate and reliable, the accuracy with which video expression content is processed is improved, and video data transmission can be accurately controlled.
It should be appreciated that the systems and modules thereof shown above may be implemented in a variety of ways. For example, in some embodiments, the system and its modules may be implemented in hardware, software, or a combination of software and hardware. Wherein the hardware portion may be implemented using dedicated logic; the software portions may then be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or special purpose design hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer executable instructions and/or embodied in processor control code, such as provided on a carrier medium such as a magnetic disk, CD or DVD-ROM, a programmable memory such as read only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The system and its modules of the present application may be implemented not only with hardware circuitry, such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, etc., or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., but also with software, such as executed by various types of processors, and with a combination of the above hardware circuitry and software (e.g., firmware).
It should be noted that different embodiments may produce different advantages, and in different embodiments the advantages produced may be any one or a combination of those described above, or any other advantage that may be obtained.
While the basic concepts have been described above, it will be apparent to those skilled in the art that the foregoing detailed disclosure is by way of example only and is not intended to be limiting. Although not explicitly stated herein, various modifications, improvements, and adaptations of the present application may occur to those skilled in the art. Such modifications, improvements, and adaptations are intended to be covered by this application and thus fall within the spirit and scope of its exemplary embodiments.
Meanwhile, the present application uses specific words to describe embodiments of the present application. Reference to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic is associated with at least one embodiment of the present application. Thus, it should be emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various positions in this specification are not necessarily referring to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of the present application may be combined as suitable.
Furthermore, those skilled in the art will appreciate that the various aspects of the invention are illustrated and described in the context of a number of patentable categories or circumstances, including any novel and useful procedures, machines, products, or materials, or any novel and useful modifications thereof. Accordingly, aspects of the present application may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.) or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," module, "" engine, "" unit, "" component, "or" system. Furthermore, aspects of the present application may take the form of a computer product, comprising computer-readable program code, embodied in one or more computer-readable media.
The computer storage medium may contain a propagated data signal with the computer program code embodied therein, for example, on a baseband or as part of a carrier wave. The propagated signal may take on a variety of forms, including electro-magnetic, optical, etc., or any suitable combination thereof. A computer storage medium may be any computer readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated through any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or a combination of any of the foregoing.
The computer program code necessary for operation of portions of the present application may be written in any one or more programming languages, including object-oriented programming languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, and Python; conventional procedural programming languages such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, and ABAP; dynamic programming languages such as Python, Ruby, and Groovy; or other programming languages. The program code may execute entirely on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet), or a cloud-computing service such as software as a service (SaaS) may be used.
Furthermore, the order in which the elements and sequences are presented, the use of numerical letters, or other designations are used in the application and are not intended to limit the order in which the processes and methods of the application are performed unless explicitly recited in the claims. While certain presently useful inventive embodiments have been discussed in the foregoing disclosure, by way of various examples, it is to be understood that such details are merely illustrative and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements included within the spirit and scope of the embodiments of the present application. For example, while the system components described above may be implemented by hardware devices, they may also be implemented solely by software solutions, such as installing the described system on an existing server or mobile device.
Likewise, it should be noted that, in order to simplify the presentation disclosed herein and thereby aid in understanding one or more inventive embodiments, various features are sometimes grouped together in a single embodiment, figure, or description thereof. This method of disclosure, however, is not to be interpreted as implying that the claimed subject matter requires more features than are expressly recited in the claims; indeed, claimed subject matter may lie in less than all features of a single embodiment disclosed above.
In some embodiments, numbers describing quantities of components and attributes are used; it should be understood that such numbers used in the description of embodiments are in some examples modified by the terms "about," "approximately," or "substantially." Unless otherwise indicated, "about," "approximately," or "substantially" indicates that the number allows for adaptive variation. Accordingly, in some embodiments, the numerical parameters set forth in the specification and claims are approximations that may vary depending upon the desired properties of individual embodiments. In some embodiments, numerical parameters should take into account the specified significant digits and employ a general digit-preserving method. Although the numerical ranges and parameters used to confirm the breadth of ranges in some embodiments are approximations, in particular embodiments such numerical values are set as precisely as practicable.
Each patent, patent application, patent application publication, and other material, such as articles, books, specifications, publications, and documents, cited in this application is hereby incorporated by reference in its entirety, excepting application history documents that are inconsistent with or in conflict with the content of this application, and documents that limit the broadest scope of the claims of this application (currently or later appended to this application). It is noted that where the descriptions, definitions, and/or use of terms in the materials accompanying this application are inconsistent or in conflict with the content described herein, the descriptions, definitions, and/or use of terms in this application prevail.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present application. Other variations are also possible within the scope of this application. Thus, by way of example, and not limitation, alternative configurations of embodiments of the present application may be considered in keeping with the teachings of the present application. Accordingly, embodiments of the present application are not limited to only the embodiments explicitly described and depicted herein.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (10)

1. An artificial intelligence-based short video transmission processing method is characterized in that the method at least comprises the following steps:
obtaining first video interaction data, wherein the first video interaction data comprises an important video segment and at least one secondary video segment, the segment vector of the important video segment represents a video expression vector of target video expression content, the video expression vector is controlled to transmit, the segment vector of the secondary video segment represents a video expression vector of secondary video expression content, and the secondary video expression content is video expression content associated with the target video expression content;
and loading the first video interaction data to a vector optimization thread, wherein the vector optimization thread optimizes the segment vector of the important video segment by combining the segment vector of the secondary video segment in the first video interaction data to obtain a video expression vector of the optimized target video expression content, and controls the video expression vector to transmit.
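For orientation only (not part of the claims), the following Python sketch illustrates one possible shape of the first video interaction data and of a vector optimization thread; the names (first_video_interaction_data, vector_optimization_thread) and the additive update are assumptions, not the claimed implementation.

import numpy as np

# Hypothetical layout of the "first video interaction data": one important
# segment vector plus one or more secondary segment vectors.
first_video_interaction_data = {
    "important": np.random.rand(128),                       # segment vector of the important video segment
    "secondary": [np.random.rand(128) for _ in range(3)],   # segment vectors of the secondary video segments
}

def vector_optimization_thread(data):
    """Stand-in for the claimed vector optimization thread: refine the
    important segment's vector using the secondary segment vectors
    (uniform weights here, purely for illustration)."""
    weights = np.ones(len(data["secondary"])) / len(data["secondary"])
    weighted = sum(w * v for w, v in zip(weights, data["secondary"]))
    return data["important"] + weighted  # optimized video expression vector

optimized_vector = vector_optimization_thread(first_video_interaction_data)
print(optimized_vector.shape)  # (128,) -- ready to be controlled for transmission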
2. The method of claim 1, wherein the vector optimization thread optimizing the segment vector of the important video segment in combination with the segment vectors of the secondary video segments in the first video interaction data to obtain the video expression vector of the optimized target video expression content, and controlling the video expression vector for transmission, comprises:
determining a confidence level between the important video segment and each secondary video segment in the first video interaction data;
fusing the video expression vectors of the secondary video segments in combination with the confidence levels to obtain a weighting vector of the important video segment;
and combining the video expression vector of the important video segment with the weighting vector to obtain the video expression vector of the optimized target video expression content, and controlling the video expression vector for transmission.
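A minimal sketch of the three steps of claim 2, assuming cosine similarity as a stand-in for the claimed confidence measure and a softmax-weighted sum as the fusion; the claim itself fixes neither choice.

import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def optimize(important, secondaries):
    # Step 1: confidence level between the important segment and each secondary
    # segment (cosine similarity is an assumed stand-in).
    conf = np.array([cosine(important, s) for s in secondaries])
    conf = np.exp(conf) / np.exp(conf).sum()  # normalize into fusion weights
    # Step 2: fuse the secondary vectors under the confidence weights -> weighting vector.
    weighting_vector = sum(c * s for c, s in zip(conf, secondaries))
    # Step 3: combine with the important segment's vector -> optimized expression vector.
    return important + weighting_vector

rng = np.random.default_rng(0)
important = rng.normal(size=64)
secondaries = [rng.normal(size=64) for _ in range(4)]
print(optimize(important, secondaries).shape)  # (64,)

The softmax normalization is one design choice that keeps the fusion weights positive and summing to one; any other confidence-to-weight mapping would fit the claim equally well.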
3. The method of claim 1, wherein, prior to obtaining the first video interaction data, the method further comprises: obtaining, in combination with the target video expression content, the secondary video expression content associated with the target video expression content from a pre-stored video expression content set.
4. The method of claim 3, wherein the obtaining, in combination with the target video expression content, the secondary video expression content associated with the target video expression content from the pre-stored video expression content set comprises:
obtaining, one by one through a vector identification thread, the video expression vector of the target video expression content and the video expression vector of each audio content in the pre-stored video expression content set, and controlling the video expression vectors for transmission;
and determining, based on the vector commonality association degree between the video expression vector of the target video expression content and the video expression vector of each audio content in the pre-stored video expression content set, the secondary video expression content associated with the target video expression content from the pre-stored video expression content set.
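As a hedged illustration, the vector commonality association degree can be read as any pairwise similarity; the sketch below uses cosine similarity between the target vector and each pre-stored vector, which is an assumption rather than the patent's definition.

import numpy as np

def commonality_association(q, library):
    """Assumed stand-in for the claimed 'vector commonality association degree':
    cosine similarity between the target vector q and each pre-stored vector."""
    library = np.asarray(library)
    q_n = q / (np.linalg.norm(q) + 1e-12)
    lib_n = library / (np.linalg.norm(library, axis=1, keepdims=True) + 1e-12)
    return lib_n @ q_n  # one association coefficient per pre-stored content

rng = np.random.default_rng(1)
target_vec = rng.normal(size=32)       # from the vector identification thread
prestored = rng.normal(size=(10, 32))  # vectors of each audio content in the set
print(commonality_association(target_vec, prestored))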
5. The method of claim 4, wherein the determining, based on the vector commonality association degree between the video expression vector of the target video expression content and the video expression vector of each audio content in the pre-stored video expression content set, the secondary video expression content associated with the target video expression content comprises:
sorting the vector commonality association degrees between the target video expression content and each audio content in descending order of association coefficient;
and screening the audio contents whose vector commonality association degrees fall within a set interval, and regarding the screened audio contents as the secondary video expression content associated with the target video expression content.
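One possible reading of the descending-order screening in claim 5, assuming the "interval" is the leading top-k positions of the sorted association coefficients:

import numpy as np

def screen_secondary(assoc, k=3):
    """Sort association coefficients in descending order and keep the contents
    falling in the leading interval (top-k is an assumed reading of the claim)."""
    order = np.argsort(assoc)[::-1]  # descending by association coefficient
    return order[:k]                 # indices of the secondary video expression contents

assoc = np.array([0.12, 0.87, 0.45, 0.91, 0.33])
print(screen_secondary(assoc, k=2))  # -> [3 1]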
6. The method of claim 4, wherein the determining, based on the vector commonality association degree between the video expression vector of the target video expression content and the video expression vector of each audio content in the pre-stored video expression content set, the secondary video expression content associated with the target video expression content from the pre-stored video expression content set comprises:
obtaining, in combination with the video expression vector of the target video expression content and the vector commonality association degree between it and the video expression vector of each audio content, first video expression content associated with the target video expression content from the audio contents; and obtaining, in combination with the vector commonality association degree between the video expression vector of the first video expression content and the video expression vector of each audio content, second video expression content associated with the first video expression content from the audio contents;
and regarding the first video expression content and the second video expression content as the secondary video expression content of the target video expression content.
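Claim 6 describes a two-hop expansion: first-hop contents associated with the target, then second-hop contents associated with those. A sketch under the same cosine/top-k assumptions as above:

import numpy as np

def two_hop_secondary(q, library, k=2):
    """First hop: contents most associated with the target vector q.
    Second hop: contents most associated with those first-hop contents.
    Cosine similarity and top-k are assumed stand-ins for the claimed measures."""
    lib = np.asarray(library)
    lib_n = lib / (np.linalg.norm(lib, axis=1, keepdims=True) + 1e-12)

    def top_k(vec):
        scores = lib_n @ (vec / (np.linalg.norm(vec) + 1e-12))
        return list(np.argsort(scores)[::-1][:k])

    first_hop = top_k(q)
    second_hop = [j for i in first_hop for j in top_k(lib[i]) if j not in first_hop]
    return sorted(set(first_hop) | set(second_hop))

rng = np.random.default_rng(2)
print(two_hop_secondary(rng.normal(size=16), rng.normal(size=(8, 16))))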
7. The method of claim 2, wherein the number of vector optimization threads is one or several accumulated one by one; when the number of vector optimization threads is several, the vector optimization threads are connected in series: the input of any one vector optimization thread is the first video interaction data output by the preceding vector optimization thread.
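Read as a cascade, claim 7 composes several optimization threads so that each consumes the output of the one before it. A toy sketch, where the stand-in threads are purely hypothetical:

def cascade(threads, data):
    """Run vector optimization threads in series: each thread takes the first
    video interaction data output by the preceding thread."""
    for thread in threads:
        data = thread(data)
    return data

# Toy stand-in threads that each refine the data.
double = lambda d: {k: v * 2 for k, v in d.items()}
print(cascade([double, double], {"x": 1}))  # {'x': 4}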
8. The method of claim 2, wherein the fusing the video expression vectors of the secondary video segments in combination with the confidence levels to obtain the weighting vector of the important video segment comprises: clustering the video expression vectors of the secondary video segments in combination with the confidence levels to obtain the weighting vector of the important video segment.
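Claim 8 leaves the clustering method open; one hypothetical instantiation treats the confidence-weighted centroid of the secondary vectors as a single-cluster weighted k-means step:

import numpy as np

def cluster_fuse(secondaries, conf):
    """Assumed clustering-style fusion: a confidence-weighted centroid of the
    secondary segment vectors serves as the weighting vector."""
    S = np.asarray(secondaries)
    w = np.asarray(conf) / (np.sum(conf) + 1e-12)
    return (w[:, None] * S).sum(axis=0)  # weighting vector of the important segment

rng = np.random.default_rng(3)
print(cluster_fuse(rng.normal(size=(4, 8)), [0.4, 0.3, 0.2, 0.1]).shape)  # (8,)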
9. The method of claim 8, wherein the combining the video expression vector of the important video segment with the weighting vector to obtain the video expression vector of the optimized target video expression content, and controlling the video expression vector for transmission, comprises: combining the video expression vector of the important video segment with the weighting vector; and performing a first projection on the combined vector to obtain the video expression vector of the optimized target video expression content, and controlling the video expression vector for transmission;
wherein the determining a confidence level between the important video segment and each secondary video segment in the first video interaction data comprises:
performing a second projection on the important video segment and the secondary video segment;
determining an association relationship between the important video segment and the secondary video segment after the second projection;
and determining the confidence level according to the association relationship after the first projection processing;
wherein the target video expression content comprises: to-be-processed search video expression content and each audio content in the pre-stored video expression content set;
and after obtaining the video expression vector of the target video expression content corresponding to the important video segment and controlling the video expression vector for transmission, the method further comprises: obtaining, based on the vector commonality association degree between the video expression vector of the optimized target video expression content and the video expression vector of each audio content, near video expression content of the target video expression content from the audio contents as a search result.
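The projections in claim 9 are unspecified; the sketch below assumes random linear maps P1 and P2 and an elementwise association relationship, purely to make the data flow concrete, and ends with the near-content search of the final wherein-clause.

import numpy as np

rng = np.random.default_rng(4)
dim, proj_dim = 32, 16
P1 = rng.normal(size=proj_dim)         # first projection (assumed map to a scalar)
P2 = rng.normal(size=(dim, proj_dim))  # second projection (assumed linear map)

def confidence(important, secondary):
    # Second-project both segment vectors, relate them elementwise, then
    # first-project the relationship into a scalar confidence (one hypothetical
    # reading of the projection steps in claim 9).
    a, b = important @ P2, secondary @ P2
    relation = a * b
    return float(np.tanh(relation @ P1))

def search(optimized, library, k=3):
    # Near video expression contents of the target, returned as the search result.
    lib = np.asarray(library)
    scores = lib @ optimized / (
        np.linalg.norm(lib, axis=1) * np.linalg.norm(optimized) + 1e-12)
    return np.argsort(scores)[::-1][:k]

print(confidence(rng.normal(size=dim), rng.normal(size=dim)))
print(search(rng.normal(size=dim), rng.normal(size=(10, dim))))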
10. An artificial intelligence-based short video transmission processing system, comprising a processor and a memory in communication with each other, wherein the processor is configured to read a computer program from the memory and execute the program to implement the method of any one of claims 1-9.
CN202310222805.XA 2023-03-09 2023-03-09 Short video transmission processing method and system based on artificial intelligence Pending CN116156279A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310222805.XA CN116156279A (en) 2023-03-09 2023-03-09 Short video transmission processing method and system based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310222805.XA CN116156279A (en) 2023-03-09 2023-03-09 Short video transmission processing method and system based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN116156279A (en) 2023-05-23

Family

ID=86358223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310222805.XA Pending CN116156279A (en) 2023-03-09 2023-03-09 Short video transmission processing method and system based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN116156279A (en)

Similar Documents

Publication Publication Date Title
US11113181B2 (en) Debugging a live streaming application
US11288047B2 (en) Heterogenous computer system optimization
CN115732050A (en) Intelligent medical big data information acquisition method and system
CN116112746B (en) Online education live video compression method and system
CN115473822B (en) 5G intelligent gateway data transmission method, system and cloud platform
CN115514570B (en) Network diagnosis processing method, system and cloud platform
CN117037982A (en) Medical big data information intelligent acquisition method and system
CN116156279A (en) Short video transmission processing method and system based on artificial intelligence
CN113626538B (en) Medical information intelligent classification method and system based on big data
CN115756576B (en) Translation method of software development kit and software development system
CN115564476A (en) Advertisement playing progress adjusting method and system and cloud platform
CN115509811B (en) Distributed storage data recovery method, system and cloud platform
CN113626559B (en) Semantic-based intelligent network document retrieval method and system
CN112600939B (en) Monitor control information detection method, system, server and storage medium
CN112685328B (en) Graphical interface testing method and device and storage medium
CN113611425B (en) Method and system for intelligent regional medical integrated database based on software definition
CN114863585B (en) Intelligent vehicle testing and monitoring system and method and cloud platform
CN115409510B (en) Online transaction security system and method
CN113609323B (en) Image dimension reduction method and system based on neural network
CN113643701B (en) Method and system for intelligently recognizing voice to control home
CN115495017B (en) Data storage method and system based on big data
CN113609362B (en) Data management method and system based on 5G
CN113613252B (en) 5G-based network security analysis method and system
CN115511524B (en) Advertisement pushing method, system and cloud platform
CN115858418B (en) Data caching method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination