TWI716033B - Intelligent system for matching audio with video - Google Patents

Intelligent system for matching audio with video

Info

Publication number
TWI716033B
Authority
TW
Taiwan
Prior art keywords
music
image
analysis
module
analysis module
Prior art date
Application number
TW108124933A
Other languages
English (en)
Other versions
TW202105302A (zh)
Inventor
李姿慧
朱沛全
陳玉璇
陳克強
Original Assignee
李姿慧
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 李姿慧
Priority to TW108124933A (TWI716033B)
Priority to US16/749,195 (US20210020149A1)
Priority to CN202010679269.2A (CN112231499A)
Application granted
Publication of TWI716033B
Publication of TW202105302A
Priority to US17/951,133 (US20230015498A1)

Classifications

    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/483Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43Querying
    • G06F16/432Query formulation
    • G06F16/433Query formulation using audio data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43Querying
    • G06F16/435Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/45Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00Details of electrophonic musical instruments
    • G10H1/0008Associated control or indicating means
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/036Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal of musical genre, i.e. analysing the style of musical pieces, usually for selection, filtering or classification
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/056Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction or identification of individual instrumental parts, e.g. melody, chords, bass; Identification or separation of instrumental parts by their characteristic voices or timbres
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/071Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for rhythm pattern analysis or rhythm style recognition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/076Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction of timing, tempo; Beat detection
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2220/00Input/output interfacing specifically adapted for electrophonic musical tools or instruments
    • G10H2220/155User input interfaces for electrophonic musical instruments
    • G10H2220/441Image sensing, i.e. capturing images or optical patterns for musical purposes or musical control purposes
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/075Musical metadata derived from musical analysis or for use in electrophonic musical instruments
    • G10H2240/081Genre classification, i.e. descriptive metadata for classification or selection of musical pieces according to style
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/075Musical metadata derived from musical analysis or for use in electrophonic musical instruments
    • G10H2240/085Mood, i.e. generation, detection or selection of a particular emotional content or atmosphere in a musical piece
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/121Musical libraries, i.e. musical databases indexed by musical parameters, wavetables, indexing schemes using musical parameters, musical rule bases or knowledge bases, e.g. for automatic composing methods
    • G10H2240/131Library retrieval, i.e. searching a database or selecting a specific musical piece, segment, pattern, rule or parameter set
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/171Transmission of musical instrument data, control or status information; Transmission, remote access or control of music data for electrophonic musical instruments
    • G10H2240/181Billing, i.e. purchasing of data contents for use with electrophonic musical instruments; Protocols therefor; Management of transmission or connection time therefor
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/311Neural networks for electrophonic musical instruments or musical processing, e.g. for musical recognition or control, automatic composition or improvisation
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L2015/088Word spotting

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Acoustics & Sound (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Auxiliary Devices For Music (AREA)
  • Electrophonic Musical Instruments (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

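The present invention provides an intelligent system for matching audio with video. It comprises a video analysis module that analyzes color tone, shot rhythm, video dialogue, length and category, and the director's special requirements, and a music analysis module that records musical form, section transitions, style, melody, and emotional tension. An AI matching module then suitably matches the video features from the video analysis module with the music features from the music analysis module, so that music selection for video scoring can be completed quickly.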

Description

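Intelligent system for matching audio with video
The present invention relates to an intelligent system for matching audio with video, and in particular to a music editing system that pairs video with music through AI matching.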
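At present, music information is supplied by singers, musicians, album producers, single producers, record companies, or copyright companies, and when music is selected for a video production, the selection is made either by a musician or by the video unit or music-using unit itself. Those who need music are typically video editors and producers: advertising agencies, film trailer producers, film companies, film students, photographers scoring photo slideshows, theater troupes, dance companies, game companies, and those needing web design music, corporate promotional songs, event background music, live event performances, performance music, exhibition music, interactive design music, music for AR/VR interactive installations, or multimedia video scores. Other music users commission music production houses, scoring studios, recording studios, creators, singers, musicians, album producers, single producers, record companies, or copyright companies to select or compose music for them. Users with such music needs, such as the video production and theater units mentioned above, often run into music licensing problems. Sometimes merely uploading a favorite video to YouTube triggers an infringement warning, and possibly even deletion of the account. For the parties described above, finding music for a video and obtaining a license is also very time-consuming: locating one good piece of video music often takes anywhere from 8 hours to 6 months of selecting, auditioning, and seeking authorization. Specifically, in music selection for video production, a music user selecting music personally spends about 5 hours each time, a commissioned production takes about 5 days each time, and the rights-signing process is very cumbersome. In music trading, each transaction takes about 5 hours and rights signing takes about 6 months; royalties are in many cases not distributed at all, at most 60% of the royalties can be obtained, and the average obtained is about 10 to 20%. How to greatly shorten both the music selection time and the time needed to sign music purchase and licensing agreements, when video makers score a video or theater troupes create a production, is therefore a problem that many music users and video creators wish to solve.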
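In view of this, the inventors, after careful study and repeated refinement, have devised an intelligent system for matching audio with video. It eliminates the various music-related problems that parties seeking music licenses, such as video production units and theater troupes, commonly encounter when selecting music for video production, and thereby offers industrial utility.
In view of the above problems, the main object of the present invention is to provide an intelligent system for matching audio with video that uses an AI matching module, linking a video analysis module and a music analysis module, to suitably match video and music features. The system can recommend several songs for matching and, if the user is not satisfied, can recommend other songs, so that music selection for video production is achieved quickly through intelligent matching.
To achieve the above object, the present invention adopts the following technical means: an intelligent system for matching audio with video, comprising: a video analysis module, which performs analysis based on color tone, shot rhythm, video dialogue, length and category, and the director's special requirements and characteristics; a music analysis module, which performs analysis based on the recorded musical form, section transitions, style, melody, tempo, instruments, chord accompaniment, voice parts, rhythm, volume, and emotional tension, the music analysis and content including musicality analysis, emotion analysis, and music feature information; an AI matching module, which links the video analysis module and the music analysis module and suitably matches video and music features; and a music editing module, which is linked to the AI matching module and, through video editing, music cutting and concatenation, music volume adjustment, and sound field simulation, fully aligns the timelines and hit points of the music file and the video file.
To give the examiners a further understanding of the present invention, it is described in detail below with reference to the drawings: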
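(10): Video analysis module
(20): Music analysis module
(30): AI matching module
(40): Music editing module
(50): API endpoint blockchain smart contract
(100): Intelligent video scoring platform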
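Fig. 1: System architecture diagram of the intelligent system for matching audio with video of the present invention.
Fig. 2: Schematic diagram of color analysis in existing video analysis.
Fig. 3: Cluster structure diagram of color analysis in existing video analysis.
Fig. 4: Schematic diagram of the emotion dictionary used for text analysis in the system of the present invention.
Fig. 5: Schematic diagram of emotion parameters in existing music analysis.
Fig. 6: Schematic diagram of the scoring reference information of the system of the present invention.
Fig. 7: Flowchart of the scoring method of the system of the present invention.
Fig. 7-1: Partial enlarged view of the scoring structure of Fig. 7.
Fig. 8: Another system architecture diagram of the system of the present invention.
Fig. 9: Schematic diagram of the business model of the system of the present invention.
Fig. 10: Schematic diagram of other commercial activities of the system of the present invention.
Fig. 11: Schematic introduction to the intelligent video scoring platform of the system of the present invention.
Fig. 12: Schematic diagram of system screenshots of the system of the present invention.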
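Referring to Fig. 1, the system architecture diagram of the present invention, the system comprises a video analysis module 10, a music analysis module 20, an AI matching module 30, and a music editing module 40.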
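The video analysis module 10 performs analysis based on color tone, shot rhythm, video dialogue (such as narrative content or transition words), length and category, and the director's special requirements and characteristics. The video content analysis in the video analysis module 10 includes color analysis, content analysis, and facial expression analysis. For color analysis, see Fig. 2, a schematic diagram of color analysis in existing video analysis: it analyzes the function of color in film, the color values, and the cluster structure of color analysis in existing video analysis shown in Fig. 3. Content analysis identifies people, events, times, places, and objects (such as era, location, time, and plot) from the scenes, characters, objects, and lighting in the video. Facial expression analysis infers the characters' emotions, the plot, and likely dialogue from their expressions. Combining these video content analyses yields the respective vector values of the video.
The shot-file analysis, by which the video analysis module 10 handles shot rhythm, analyzes the time points of the shot rhythm and then feeds them into the model, which makes it easier to record the time points of shot changes and provides reference insertion points for music and sound effects. The number of seconds of each shot obtained from the shot-file analysis can be used to analyze the content of each shot or to design sync points. The shot list used in the sound effect or score analysis of the video analysis module and the music analysis module 20 can be gathered from frame-by-frame Word shot files and from the video itself.
The text analysis, by which the video analysis module 10 handles video dialogue, analyzes the video dialogue and the script. It processes the dialogue to find narrative content or remove transition words, so that the keywords stand out and are ranked by dependency (or influence), and the corresponding emotion parameters are found by proportional averaging. As shown in Fig. 4, text analysis is carried out with an existing Chinese emotion dictionary. When handling the director's special requirements, the video analysis module 10 weights the ranking of the results according to the special requirements raised by the director (this factor has a comparatively large influence on the result).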
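The shot-file and dialogue analyses described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patent's implementation: it assumes the shot file reduces to a list of per-shot durations in seconds, that the emotion dictionary maps a keyword to a (valence, arousal) pair, and that "proportional averaging" means an influence-weighted mean. The names `shot_boundaries`, `cut_points`, and `dialogue_emotion` are hypothetical.

```python
def shot_boundaries(durations):
    """Turn per-shot durations (seconds) into (start, end) pairs on the video timeline."""
    boundaries = []
    start = 0.0
    for duration in durations:
        end = start + duration
        boundaries.append((start, end))
        start = end
    return boundaries


def cut_points(durations):
    """Times at which the picture cuts to the next shot (the end of the last shot is excluded)."""
    return [end for _, end in shot_boundaries(durations)[:-1]]


def dialogue_emotion(keywords, lexicon):
    """Influence-weighted mean (valence, arousal) of the keywords found in the lexicon.

    keywords: list of (word, influence) pairs.
    lexicon:  hypothetical emotion dictionary, word -> (valence, arousal).
    Words missing from the lexicon (for example transition words) are skipped.
    Returns None when no keyword is found in the lexicon.
    """
    known = [(lexicon[word], weight) for word, weight in keywords if word in lexicon]
    total = sum(weight for _, weight in known)
    if total == 0:
        return None
    valence = sum(emotion[0] * weight for emotion, weight in known) / total
    arousal = sum(emotion[1] * weight for emotion, weight in known) / total
    return (valence, arousal)
```

The cut points are candidate insertion points for music and sound effects, and the averaged pair can be compared directly with the (x, y) emotion parameters recorded for each song.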
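The music analysis module 20 performs analysis based on the recorded musical form, section transitions, style, melody, tempo, instruments, chord accompaniment, voice parts, rhythm, volume, and emotional tension. The music analysis and content in the music analysis module 20 include musicality analysis, emotion analysis, and music feature information. Musicality analysis covers the key, instrumental arrangement structure, rhythm, chords, chord progressions, melodic pitch, scale progressions, style, musical form, sections, phrases, lyric lines, genre, and other music file information. For emotion analysis, see Fig. 5, a schematic diagram of emotion parameters in existing music analysis: based on the music content, and through machine training and intelligent learning, the emotion parameters (x, y) of each song at different time points are recorded. The x-axis (Valence) of the emotion parameters expresses how positive or negative the emotion is (positive numbers denote positive emotion, negative numbers negative emotion), and the y-axis (Arousal) expresses the degree of emotional arousal. Music information is organized according to singer, musician, album producer, single producer, record company, copyright company, OP, SP, regional group, collective management organization, copyright, contractual relationship, and so on, and records the music length, style, file location, public region, streaming link, download link, preview link, MIDI file, WAV file, and MP3 file. In addition, the reference music analysis in the music analysis module 20 takes preferred reference music as input; the program analyzes the input reference music and finds tracks in the database whose analysis results match.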
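One way to hold the per-song emotion parameters recorded at different time points is a time-sorted list of samples. The sketch below is illustrative only: the sample layout, the step-function lookup, and the name `emotion_at` are assumptions, since the patent does not specify a storage format.

```python
from bisect import bisect_right


def emotion_at(samples, t):
    """Return the (valence, arousal) pair in effect at time t (seconds).

    samples: list of (time_seconds, valence, arousal), sorted by time.
    The most recent sample at or before t is used, i.e. the curve is treated
    as a step function (an assumption made for this sketch).
    """
    times = [time for time, _, _ in samples]
    index = bisect_right(times, t) - 1
    if index < 0:
        raise ValueError("t precedes the first emotion sample")
    _, valence, arousal = samples[index]
    return (valence, arousal)
```

With samples taken, say, every 30 seconds, the pair returned for a shot's start time can be compared with the emotion derived from that shot's dialogue.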
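Referring to Fig. 6, a schematic diagram of the scoring reference information of the present invention, the invention obtains corresponding values from the shot-file analysis, text analysis, director's special requirements, reference music analysis, video content analysis, and music analysis, and matches the video values against the music values. Together with Fig. 7, a flowchart of the scoring method of the present invention, the invention derives the final video and music result according to the functions and categories commonly used in scoring. The video type is set and confirmed according to the tone of the story. The primary focus is determined by which part the score is meant to emphasize, such as character (including personality and inner feelings), plot, scene (including place or city), time, or synchronization with action. Special picture requirements are counter or parallel effects that do not follow the video content, such as music working against the picture, parallel underscoring (or metaphorical music), deceiving or hinting to the audience, or using music as a transition between scenes.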
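The main feature of the present invention is the AI matching module 30, which links the video analysis module 10 and the music analysis module 20 and suitably matches video and music features. In practice it can recommend five songs for matching and, if the user is not satisfied, recommend other songs. The music editing module 40 is linked to the AI matching module 30; through video editing, music cutting and concatenation, music volume adjustment, and sound field simulation, the invention fully aligns the timelines and hit points of the music file and the video file. For the sound effect synchronization of the music editing module 40 and the music analysis module 20, the video material used can include a larger share of cartoon sound effects, and analyzing the waveform yields the insertion points for sound effects.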
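The simplest part of hit-point alignment, choosing a single time shift for the music, can be sketched as follows. This is a hypothetical illustration rather than the patent's method: it assumes the video cuts and the music hit points are already available as lists of times in seconds, and it ignores the cutting, concatenation, and volume steps. The names `alignment_error` and `best_offset` are invented for this example.

```python
def alignment_error(cuts, hits, offset):
    """Sum, over all video cuts, of the distance to the nearest music hit point
    once the music has been shifted by `offset` seconds."""
    shifted = [hit + offset for hit in hits]
    return sum(min(abs(cut - s) for s in shifted) for cut in cuts)


def best_offset(cuts, hits):
    """Try every offset that lands some hit point exactly on some cut.

    Returns (offset, error) for the offset with the smallest total error;
    ties go to the smallest absolute shift.
    """
    if not cuts or not hits:
        raise ValueError("need at least one cut and one hit point")
    candidates = {cut - hit for cut in cuts for hit in hits}
    offset = min(candidates, key=lambda o: (alignment_error(cuts, hits, o), abs(o), o))
    return offset, alignment_error(cuts, hits, offset)
```

An error of zero means every cut coincides with a hit point; a nonzero residual indicates where the music would still need to be cut or stretched.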
The video material used to train the AI matching module 30 includes, for example, YouTube-Movie, YouTube-movieclips, Roku Channel, Crackle, Dailymotion, and the iQIYI website.
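Referring to Fig. 8, another system architecture diagram of the present invention, the system comprises the video analysis module 10, the music analysis module 20, the AI matching module 30, and the music editing module 40. The system can also use an API endpoint blockchain smart contract 50 linked to the music editing module 40 to allow licensed music to be used freely. Referring to Fig. 9, through the API endpoint blockchain smart contract 50 signed with musicians, the parties cooperate to sell music to video makers, who complete checkout through the intelligent video scoring platform 100 of the present invention. The music may also be a fragment or a single stem. For example, if a song is arranged for a rock band and contains electric guitar, vocals, drums, or electric bass, the program of the system can take just the drum part of that song, stems from other songs, or other stems such as electric guitar, and merge them for processing within the program. Referring to Fig. 10, the platform 100 can carry out commercial activities with users (e.g. video makers), technology vendors, musicians (or music companies), music fans, and others, such as completing videos (e.g. applied music), media exposure (e.g. advertisers), and referral marketing to music download or streaming platforms. Referring to Fig. 11, on the platform 100 the user can enter information, such as selecting the video, the shot file, and the reference music or script text and dialogue to upload; the next page then presents the video score and suggestions, and the user can view the result directly and purchase the music.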
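Referring to Fig. 12, a schematic diagram of system screenshots, keyword search on the database page covers name, genre, style, tempo, instruments, related keywords, performer, emotion, cover photo, and so on. The exclusive audio signal functions are offered as MP3 preview, WAV download, or MP3 download. For licensing and orders, commercial actions such as estimating the order amount, placing an order, updating an order, and downloading purchased music are based on loops, MIDI, music licenses, and the like.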
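The algorithm of the AI matching module 30 includes a screening method and a scoring method. The screening method sets the pass/fail criterion by the standard deviation range of a normal distribution: values within the 68% confidence level (within an error range of one standard deviation) are admitted. The screened categories include genre and emotion parameters. The scoring method quantifies the content of categories such as rhythm, instrumentation, chords, music emotion (x, y), keyword emotion (x, y), director input, dominant video color tone, and video content, computes a score for each item, and takes the weighted average.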
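The screening and scoring steps can be illustrated with a short Python sketch. It rests on several assumptions that the patent does not spell out: screening is applied to a single feature (valence), the standard deviation is taken over the whole library, the per-item scores are already quantified to comparable scales, and the weights are supplied by the caller. All function and field names are hypothetical.

```python
from statistics import pstdev


def screen(tracks, target_valence):
    """Keep tracks whose valence lies within one standard deviation of the target.

    The standard deviation is computed over the valence of all tracks in the
    library (an assumption made for this sketch).
    """
    if not tracks:
        return []
    sigma = pstdev([track["valence"] for track in tracks])
    return [track for track in tracks if abs(track["valence"] - target_valence) <= sigma]


def weighted_score(item_scores, weights):
    """Weighted average of per-item scores (rhythm, chords, emotion, ...)."""
    total_weight = sum(weights[name] for name in item_scores)
    return sum(weights[name] * score for name, score in item_scores.items()) / total_weight


def recommend(tracks, target_valence, weights, k=5, exclude=()):
    """Screen, score, and return the ids of the top-k tracks.

    `exclude` holds ids the user has already rejected, so a second call
    can recommend other songs.
    """
    candidates = [t for t in screen(tracks, target_valence) if t["id"] not in exclude]
    ranked = sorted(candidates, key=lambda t: weighted_score(t["scores"], weights), reverse=True)
    return [t["id"] for t in ranked[:k]]
```

Giving the director's input a larger weight than the other items would reproduce the stated behavior that the director's requirements influence the ranking more strongly.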
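In summary, the intelligent system for matching audio with video of the present invention is a professional intelligent video scoring platform. It mainly uses the AI matching module to link the video analysis module and the music analysis module and suitably match video and music features. Video companies can log in through multiple channels; after a video is selected and reviewed by the director, musicians, video companies, and copyright companies can quickly complete video scoring on the platform through the API endpoint blockchain smart contract. The invention therefore has industrial applicability; moreover, it has never been made public or appeared in any other publication, and thus complies with the provisions of the Patent Act. An invention patent application is accordingly filed in accordance with the law.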

Claims (5)

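  1. An intelligent system for matching audio with video, the system comprising: a video analysis module, which performs analysis based on color tone, shot rhythm, video dialogue, length and category, and the director's special requirements and characteristics, wherein the shot-file analysis by which the video analysis module handles shot rhythm analyzes the time points of the shot rhythm and then feeds them into the model, to facilitate recording the time points of shot changes and to serve as a reference for music and sound effect insertion points, and wherein the text analysis by which the video analysis module handles video dialogue analyzes the video dialogue and the script, processing the dialogue to find narrative content or remove transition words, so that the keywords stand out and are ranked by dependency (or influence), and the corresponding emotion parameters are found by proportional averaging; a music analysis module, which performs analysis based on the recorded musical form, section transitions, style, melody, tempo, instruments, chord accompaniment, voice parts, rhythm, volume, and emotional tension, the music analysis and content including musicality analysis, emotion analysis, and music feature information, wherein the emotion analysis in the music analysis module, based on the music content and through machine training and intelligent learning, records the emotion parameters (x, y) of each song at different time points, wherein the x-axis (Valence) of the emotion parameters is the value of positive emotion and the y-axis (Arousal) of the emotion parameters is the degree of arousal of negative emotion; an AI matching module, which links the video analysis module and the music analysis module and suitably matches video and music features, wherein the screening method of the AI matching module sets the pass/fail criterion by the standard deviation range of a normal distribution, values within the 68% confidence level (within an error range of one standard deviation) being admitted, and the screened categories including genre or emotion parameters, and wherein the scoring method of the AI matching module quantifies the content of categories such as rhythm, instrumentation, chords, music emotion (x, y), keyword emotion (x, y), director input, dominant video color tone, and video content, and computes a score for each item to take a weighted average; and a music editing module, which is linked to the AI matching module and, through video editing, music cutting and concatenation, music volume adjustment, and sound field simulation, fully aligns the timelines and hit points of the music file and the video file.
  2. The intelligent system for matching audio with video according to claim 1, wherein the video analysis module includes: color analysis, which analyzes the function of color in film, the color values, and the cluster structure of color analysis; content analysis, which identifies people, events, times, places, and objects from the scenes, characters, objects, and lighting in the video; and facial expression analysis, which infers the characters' emotions, the plot, and likely dialogue from their expressions.
  3. The intelligent system for matching audio with video according to claim 1, wherein the musicality analysis in the music analysis module analyzes the key, instrumental arrangement structure, rhythm, chords, chord progressions, melodic pitch, scale progressions, style, musical form, sections, phrases, lyric lines, and other music file information.
  4. The intelligent system for matching audio with video according to claim 1, wherein the music feature information in the music analysis module is organized according to singer, musician, album producer, single producer, record company, copyright company, OP, SP, regional group, collective management organization, copyright, contractual relationship, and so on, and records the music length, style, file location, public region, streaming link, download link, preview link, MIDI file, WAV file, and MP3 file.
  5. The intelligent system for matching audio with video according to claim 1, wherein the algorithm of the AI matching module includes a screening method and a scoring method.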
TW108124933A 2019-07-15 2019-07-15 Intelligent system for matching audio with video TWI716033B (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
TW108124933A TWI716033B (zh) 2019-07-15 2019-07-15 Intelligent system for matching audio with video
US16/749,195 US20210020149A1 (en) 2019-07-15 2020-01-22 Intelligent system for matching audio with video
CN202010679269.2A CN112231499A (zh) 2019-07-15 2020-07-15 Intelligent video scoring system
US17/951,133 US20230015498A1 (en) 2019-07-15 2022-09-23 Intelligent system for matching audio with video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW108124933A TWI716033B (zh) 2019-07-15 2019-07-15 Intelligent system for matching audio with video

Publications (2)

Publication Number Publication Date
TWI716033B (zh) 2021-01-11
TW202105302A (zh) 2021-02-01

Family

ID=74116857

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108124933A TWI716033B (zh) 2019-07-15 2019-07-15 Intelligent system for matching audio with video

Country Status (3)

Country Link
US (1) US20210020149A1 (zh)
CN (1) CN112231499A (zh)
TW (1) TWI716033B (zh)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7147384B2 (ja) * 2018-09-03 2022-10-05 ヤマハ株式会社 Information processing method and information processing device
US11232773B2 (en) * 2019-05-07 2022-01-25 Bellevue Investments Gmbh & Co. Kgaa Method and system for AI controlled loop based song construction
WO2022217438A1 (zh) * 2021-04-12 2022-10-20 苏州思萃人工智能研究所有限公司 Video-music adaptation method and system based on artificial intelligence video understanding
US20220366881A1 (en) * 2021-05-13 2022-11-17 Microsoft Technology Licensing, Llc Artificial intelligence models for composing audio scores
KR20240028353A (ko) * 2021-05-27 2024-03-05 엑스디마인드 인코퍼레이티드 Selection of supplemental audio segments based on video analysis
CN115695899A (zh) * 2021-07-23 2023-02-03 花瓣云科技有限公司 Video generation method, electronic device, and medium thereof
CN113656643B (zh) * 2021-08-20 2024-05-03 珠海九松科技有限公司 Method for analyzing viewing mood using AI
CN113542626B (zh) * 2021-09-17 2022-01-18 腾讯科技(深圳)有限公司 Video scoring method, apparatus, computer device, and storage medium
CN113923517B (zh) * 2021-09-30 2024-05-07 北京搜狗科技发展有限公司 Background music generation method, apparatus, and electronic device
US12009013B2 (en) * 2021-12-09 2024-06-11 Bellevue Investments Gmbh & Co. Kgaa System and method for AI/XI based automatic song finding method for videos
IT202200000080A1 (it) * 2022-01-04 2023-07-04 Sounzone S R L Method and system for real-time audio/video synchronization
US11727618B1 (en) * 2022-08-25 2023-08-15 xNeurals Inc. Artificial intelligence-based system and method for generating animated videos from an audio segment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005001715A1 (en) * 2003-06-30 2005-01-06 Koninklijke Philips Electronics, N.V. System and method for generating a multimedia summary of multimedia streams
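TW201408085A (zh) * 2012-08-01 2014-02-16 Acer Inc Method and system for integrated playback of music and images
CN107016134A (zh) * 2017-05-24 2017-08-04 万业(天津)科技有限公司 Intelligent song retrieval method and system with automatic matching
CN108933970A (zh) * 2017-05-27 2018-12-04 北京搜狗科技发展有限公司 Video generation method and apparatus
CN109472120A (zh) * 2018-09-19 2019-03-15 侯锐 Copyright protection and acquisition method, apparatus, and device for digital sound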

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004090752A1 (en) * 2003-04-14 2004-10-21 Koninklijke Philips Electronics N.V. Method and apparatus for summarizing a music video using content analysis
CN101466010B (zh) * 2009-01-15 2010-10-27 华为终端有限公司 Method for playing video on a mobile terminal, and mobile terminal
WO2014001607A1 (en) * 2012-06-29 2014-01-03 Nokia Corporation Video remixing system
CN103714130B (zh) * 2013-12-12 2017-08-22 深圳先进技术研究院 Video recommendation system and method
CA2905105A1 (en) * 2015-09-28 2017-03-28 Gerard Voon Mental artificial intelligence algorythms
CN107145326B (zh) * 2017-03-28 2020-07-28 浙江大学 Automatic music playback system and method based on target facial expression capture
CN109240488A (zh) * 2018-07-27 2019-01-18 重庆柚瓣家科技有限公司 Implementation method of an AI scene positioning engine
CN109246474B (zh) * 2018-10-16 2021-03-02 维沃移动通信(杭州)有限公司 Video file editing method and mobile terminal
CN109862393B (zh) * 2019-03-20 2022-06-14 深圳前海微众银行股份有限公司 Scoring method, system, device, and storage medium for video files

Also Published As

Publication number Publication date
US20210020149A1 (en) 2021-01-21
CN112231499A (zh) 2021-01-15
TW202105302A (zh) 2021-02-01
