TW202315414A - Infrared remote controlled video and audio device and infrared remote control method of playing video and audio - Google Patents


Publication number
TW202315414A
Authority
TW
Taiwan
Prior art keywords
audio
target
indication signal
processing
infrared remote
Prior art date
Application number
TW110134476A
Other languages
Chinese (zh)
Inventor
丁國基
Original Assignee
英業達股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 英業達股份有限公司
Priority to TW110134476A
Publication of TW202315414A

Abstract

An infrared remote control method of playing video and audio includes: playing display frames of a video by a display interface and outputting an audio signal of the video by an audio output interface; receiving a first indication signal by an infrared sensor; pausing the display interface and the audio output interface according to the first indication signal so that the display interface displays a pause frame; receiving a second indication signal by the infrared sensor; determining a target character pattern in a target block of the pause frame according to the second indication signal; extracting, from the audio signal, a determined audio track corresponding to the target character pattern according to the correspondence between pre-processed character feature groups and pre-processed audio tracks; and controlling the display interface to continue playing and the audio output interface to output the determined audio track.

Description

Infrared remote control audio-visual device and infrared remote control audio-visual playing method

The present invention relates to a method for controlling video and audio playback, and in particular to an infrared remote control audio-visual playing method.

A traditional television must be connected to a digital TV antenna, an HDMI audio/video input, or cable TV to show programs, which are watched according to a fixed broadcast schedule. With the rise of smart home appliances in recent years, appliances, televisions included, are expected to offer functions such as Internet connectivity and links to applications (apps). Like a smartphone, a smart TV can go online and play videos from platforms such as YouTube, Netflix, Apple TV+, and myVideo. However, when a television plays a video, the sounds are usually mixed together, so users often cannot tell which character in the displayed frame a sound belongs to.

In view of the above, the present invention provides an infrared remote control audio-visual device and an infrared remote control audio-visual playing method that can provide the voice of a specified character.

According to an embodiment of the present invention, an infrared remote control audio-visual device includes an infrared sensor, a display interface, an audio output interface, a memory, and a processor, wherein the processor is connected to the infrared sensor, the display interface, the audio output interface, and the memory. The infrared sensor is used to receive a first indication signal and a second indication signal. The display interface is used to play multiple display frames of a video. The audio output interface is used to output the audio of the video. The memory stores multiple pre-processed character feature groups, multiple pre-processed audio tracks, and the correspondence between the pre-processed character feature groups and the pre-processed audio tracks. The processor is used to: pause, according to the first indication signal, the playback of the display interface and the output of the audio output interface so that the display interface displays a pause frame; determine, according to the second indication signal, a target character pattern located in a target block of the pause frame, wherein the second indication signal indicates the target block and the target character pattern matches one of the pre-processed character feature groups; extract, from the audio, a determined audio track corresponding to the target character pattern according to the correspondence between the pre-processed character feature groups and the pre-processed audio tracks; and control the display interface to continue playing and the audio output interface to output the determined audio track.

According to an embodiment of the present invention, an infrared remote control audio-visual playing method includes: playing multiple display frames of a video through a display interface and outputting the audio of the video through an audio output interface; receiving a first indication signal through an infrared sensor; pausing, by a processor and according to the first indication signal, the playback of the display interface and the output of the audio output interface so that the display interface displays a pause frame; receiving a second indication signal through the infrared sensor; determining, by the processor and according to the second indication signal, a target character pattern located in a target block of the pause frame, wherein the second indication signal indicates the target block and the target character pattern matches one of multiple pre-processed character feature groups; extracting, by the processor, a determined audio track corresponding to the target character pattern from the audio according to the correspondence between the pre-processed character feature groups and multiple pre-processed audio tracks; and controlling the display interface to continue playing and the audio output interface to output the determined audio track.

With the above structure, the infrared remote control audio-visual device and the infrared remote control audio-visual playing method disclosed herein determine, based on the correspondence between the pre-processed character feature groups and the pre-processed audio tracks, the audio track corresponding to the character specified by the infrared indication signal, and can thus provide the function of playing the voice of a specified character on its own.

The above description of the disclosure and the following description of the embodiments serve to demonstrate and explain the spirit and principle of the present invention, and to provide further explanation of the claims of the present invention.

The detailed features and advantages of the present invention are described in the embodiments below. The content is sufficient for anyone skilled in the relevant art to understand the technical content of the present invention and implement it accordingly, and based on the content disclosed in this specification, the claims, and the drawings, anyone skilled in the relevant art can readily understand the related objects and advantages of the present invention. The following embodiments describe aspects of the present invention in further detail, but do not limit the scope of the present invention in any way.

Please refer to FIG. 1, which is a functional block diagram of an infrared remote control audio-visual device according to an embodiment of the present invention. As shown in FIG. 1, the infrared remote control audio-visual device 10 includes an infrared sensor 11, a display interface 13, an audio output interface 15, a memory 17, and a processor 19, wherein the processor 19 is connected to the infrared sensor 11, the display interface 13, the audio output interface 15, and the memory 17 by wired or wireless means. In particular, the infrared remote control audio-visual device 10 can be implemented as a smart TV, but is not limited thereto.

The infrared sensor 11 is used to receive an infrared indication signal, such as one sent by a TV remote control. More specifically, the infrared indication signal may carry different codes, or other marks that the processor 19 can identify, depending on which key of the TV remote control is triggered (for example, pressed). The display interface 13 is, for example, a screen, and the audio output interface 15 is, for example, a speaker. The display interface 13 and the audio output interface 15 are used to play a video: the display interface 13 plays the multiple display frames of the video, and the audio output interface 15 outputs the audio of the video.

The memory 17 is, for example, a flash memory, a hard disk drive (HDD), a solid-state drive (SSD), a dynamic random access memory (DRAM), a static random access memory (SRAM), or another non-volatile memory. The memory 17 can be a local storage medium or a remote storage medium, such as a cloud database. The memory 17 stores multiple pre-processed character feature groups, multiple pre-processed audio tracks, and the correspondence between the pre-processed character feature groups and the pre-processed audio tracks, wherein the correspondence is stored, for example, in the form of a lookup table.

The processor 19 is, for example, a central processing unit, a microcontroller, a programmable logic controller, or another processor. The processor 19 processes the video played by the display interface 13 and the audio output interface 15 according to the infrared indication signal received by the infrared sensor 11, so as to play the sound corresponding to a specified character. The detailed steps are described below.

Please refer to FIG. 1 and FIG. 2 together, wherein FIG. 2 is a flow chart of an infrared remote control audio-visual playing method according to an embodiment of the present invention. As shown in FIG. 2, the infrared remote control audio-visual playing method may include steps S201 to S207. The method shown in FIG. 2 can be executed by the infrared remote control audio-visual device 10 shown in FIG. 1, but is not limited thereto. For ease of understanding, the method is described below using the operation of the infrared remote control audio-visual device 10 as an example.

In step S201, the infrared remote control audio-visual device 10 plays multiple display frames of a video through the display interface 13 and outputs the audio of the video through the audio output interface 15. In steps S202 and S203, the infrared remote control audio-visual device 10 receives the first indication signal through the infrared sensor 11, and the processor 19 pauses, according to the first indication signal, the playback of the display interface 13 and the output of the audio output interface 15 so that the display interface 13 displays a pause frame; that is, the playback of the video is paused. More specifically, the first indication signal can originate from an infrared remote controller that has multiple keys. When a specific key or key combination is triggered, the infrared remote controller outputs the first indication signal, which carries a code representing that specific key or key combination (hereinafter the pause code). For example, the specific key may be the number-zero key or the pause key; the present invention does not limit this. The processor 19 pre-stores the correspondence between the pause code and the operation of pausing video playback. When the processor 19 receives the first indication signal, it can control the display interface 13 and the audio output interface 15 to pause the video according to the pause code carried by the first indication signal.
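The pre-stored correspondence between the pause code and the pause operation can be pictured as a small dispatch table. The following is a minimal sketch only: the `Player` class, the `PAUSE_CODE` value, and the function names are hypothetical and do not come from the patent, and real remote controls use protocol-specific code values.

```python
from dataclasses import dataclass

# Hypothetical code value; real remote controls use protocol-specific codes.
PAUSE_CODE = 0x00


@dataclass
class Player:
    """Stand-in for the display interface and the audio output interface."""
    playing: bool = True
    frame_index: int = 0

    def pause(self) -> int:
        """Stop video and audio; return the index of the frozen pause frame."""
        self.playing = False
        return self.frame_index


# Pre-stored correspondence between received codes and operations.
OPERATIONS = {PAUSE_CODE: Player.pause}


def handle_first_signal(player: Player, code: int):
    """Run the operation registered for a code; ignore unknown codes."""
    operation = OPERATIONS.get(code)
    if operation is None:
        return None
    return operation(player)
```

Keeping the mapping in a table rather than in branching logic mirrors the "pre-stored correspondence" wording and makes it easy to bind the pause operation to a different key.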

In step S204, the infrared remote control audio-visual device 10 receives the second indication signal through the infrared sensor 11. Note that step S204 may alternatively be executed between step S202 and step S203. For example, the user can send two indication signals in succession through the infrared remote controller. Based on the order in which the infrared sensor 11 receives the two indication signals, the processor 19 can determine that the signal received first is the first indication signal and the signal received later is the second indication signal.

Next, the infrared remote control audio-visual device 10 executes steps S205 to S207 through the processor 19. In step S205, the processor 19 determines, according to the second indication signal, the target character pattern located in the target block of the pause frame, wherein the second indication signal indicates the target block and the target character pattern matches one of the pre-processed character feature groups. More specifically, the processor 19 may divide the pause frame into multiple divided blocks and assign each divided block a designated code, with the designated code of each block corresponding to a different key or key combination on the infrared remote controller. The second indication signal generated by the infrared remote controller carries the designated code corresponding to the key or key combination that triggered it. When the processor 19 receives the second indication signal, it can determine that the divided block corresponding to the designated code is the target block. In particular, the key or key combination corresponding to a designated code differs from the key or key combination corresponding to the pause code.

The processor 19 determines, according to the pre-processed character feature groups stored in the memory 17, whether a matching target character pattern exists in the target block. More specifically, each pre-processed character feature group contains multiple features, such as features of the eyes, nose, and mouth and features of the face shape. When the processor 19 determines that the target block contains at least a preset number of features identical to those in one pre-processed character feature group, it determines that a target character pattern matching that pre-processed character feature group exists in the target block. In particular, when the processor 19 determines that no pattern in the target block matches any pre-processed character feature group, it either takes no action or outputs a request through the display interface 13 or the audio output interface 15 to ask for another second indication signal, and executes step S205 again after receiving the new second indication signal.
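The matching rule above (a block matches a feature group when it shares at least a preset number of features with that group) can be sketched as follows. This is an illustration under simplifying assumptions: features are modelled as hashable labels, whereas a real system would more likely compare numeric descriptors with a similarity threshold, and the function and parameter names are hypothetical.

```python
def match_character(block_features, feature_groups, min_matches):
    """Return the id of the first feature group that shares at least
    `min_matches` features with the block, or None if no group matches."""
    observed = set(block_features)
    for character_id, group in feature_groups.items():
        if len(observed & set(group)) >= min_matches:
            return character_id
    return None
```

A return value of `None` corresponds to the case in which the processor takes no action or asks for another second indication signal.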

In step S206 of FIG. 2, the processor 19 extracts the determined audio track corresponding to the target character pattern from the audio according to the correspondence between the pre-processed character feature groups and the pre-processed audio tracks. More specifically, the pre-processed character feature groups respectively contain the eye, nose, and mouth features, the face shape features, and so on of the different characters in the video; the pre-processed audio tracks respectively contain the voices of the different characters in the video; and the correspondence indicates which pre-processed character feature group and which pre-processed audio track belong to the same character. The pre-processed character feature groups, the pre-processed audio tracks, and the correspondence between them can be produced with artificial intelligence (AI) techniques by the processor 19 or an external processor (such as a cloud server) before the video is played and then stored in the memory 17, or produced with AI techniques by the processor 19 during playback before the first indication signal is received and then stored in the memory 17. The AI processing method is described later. In one implementation, the pre-processed audio tracks are obtained by processing the audio of only part of the video. In this case the processor 19 determines the pre-processed audio track corresponding to the target character pattern according to the correspondence, and then extracts from the audio a determined audio track that has the same voiceprint as that pre-processed audio track. In another implementation, the pre-processed audio tracks are obtained by processing the audio of the complete video. In this case the processor 19 determines the pre-processed audio track corresponding to the target character pattern according to the correspondence and uses that pre-processed audio track as the determined audio track.
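The two implementations of step S206 can be sketched with a lookup table that maps a character to its pre-processed audio track. The data layout below (tracks as dictionaries with a `voiceprint` key, and voiceprints compared by equality) is an assumption made for illustration; a real voiceprint comparison would normally use a similarity measure rather than exact equality.

```python
def determine_track(character_id, lookup_table, preprocessed_tracks,
                    separated_tracks=None):
    """Return the determined audio track for a character.

    lookup_table: character id -> pre-processed track id.
    preprocessed_tracks: track id -> track dict with a "voiceprint" key.
    separated_tracks: tracks separated from the current audio; pass None
        when the pre-processed tracks already cover the complete video.
    """
    reference = preprocessed_tracks[lookup_table[character_id]]
    if separated_tracks is None:
        # Complete-video case: the pre-processed track is the determined track.
        return reference
    # Partial-video case: find the track with the same voiceprint.
    for track in separated_tracks:
        if track["voiceprint"] == reference["voiceprint"]:
            return track
    return None
```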

In step S207, the processor 19 controls the display interface 13 to continue playing and controls the audio output interface 15 to output the determined audio track. In one implementation, the processor 19 controls the audio output interface 15 to output only the determined audio track and none of the other tracks in the audio. In another implementation, the processor 19 controls the audio output interface 15 to output the determined audio track at a higher volume than the other tracks.
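Both output modes of step S207 can be expressed as one mixing routine with a gain applied to the non-selected tracks. This is a minimal sketch that assumes each track is a plain list of samples; the function name and the `other_gain` parameter are hypothetical.

```python
def mix_tracks(tracks, determined_id, other_gain=0.0):
    """Mix per-character tracks sample by sample.

    other_gain=0.0 outputs only the determined track; a value between
    0 and 1 keeps the other tracks audible at a lower volume.
    """
    length = max(len(samples) for samples in tracks.values())
    mixed = [0.0] * length
    for track_id, samples in tracks.items():
        gain = 1.0 if track_id == determined_id else other_gain
        for i, sample in enumerate(samples):
            mixed[i] += gain * sample
    return mixed
```

For example, with tracks `{"a": [1.0, 2.0], "b": [10.0, 20.0]}` and `"a"` selected, the default call yields `[1.0, 2.0]`, while `other_gain=0.5` yields `[6.0, 12.0]`.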

To illustrate steps S205 to S207 with a practical example, please refer to FIG. 3 and FIG. 4. FIG. 3 shows a usage scenario of the infrared remote control audio-visual device according to an embodiment of the present invention, and FIG. 4 is a schematic diagram of a pause frame of the infrared remote control audio-visual device according to an embodiment of the present invention. As shown in FIG. 3, the pause frame F3 on the TV screen serving as the display interface can be divided into nine divided blocks F31, which can be assigned the designated codes 1 to 9, corresponding to the number keys 1 to 9 on the TV remote control. When number key 1 is triggered and the second indication signal is sent to the TV, the processor of the TV determines that the target block is the divided block F31 with designated code 1, and so on.
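The mapping from a designated code to a divided block can be sketched as a small geometry helper. The numbering order is an assumption: the text does not state how codes 1 to 9 are laid out in FIG. 3, so the sketch numbers the blocks row by row from the top-left corner. The function name and the pixel-coordinate convention are likewise hypothetical.

```python
def block_rect(code, frame_width, frame_height, rows=3, cols=3):
    """Return (left, top, right, bottom) in pixels for a designated code.

    Codes run from 1 to rows * cols, assumed to be numbered row by row
    starting at the top-left block.
    """
    if not 1 <= code <= rows * cols:
        raise ValueError(f"designated code {code} is outside 1..{rows * cols}")
    row, col = divmod(code - 1, cols)
    left = col * frame_width // cols
    top = row * frame_height // rows
    right = (col + 1) * frame_width // cols
    bottom = (row + 1) * frame_height // rows
    return left, top, right, bottom
```

On a 1920 x 1080 frame, code 1 maps to `(0, 0, 640, 360)` and code 5 to the centre block `(640, 360, 1280, 720)`.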

Assume that the pause frame F4 shown in FIG. 4 has the designated code layout shown in FIG. 3. When the second indication signal indicates designated code 4, the processor determines, according to the pre-processed character feature groups in the memory, that a matching target character pattern 41 exists in the target block, extracts the determined audio track corresponding to the target character pattern 41 from the audio according to the correspondence between the pre-processed character feature groups and the pre-processed audio tracks, and controls the display interface to continue playing and the audio output interface to output the determined audio track. When the second indication signal indicates designated code 5, the processor likewise determines the target character pattern 42, extracts the determined audio track corresponding to the target character pattern 42 from the audio, and controls the display interface to continue playing and the audio output interface to output the determined audio track. In particular, the dashed lines that divide the pause frame F4 in FIG. 4 can optionally be shown on the actual display. For example, when the processor controls the display interface to pause playback, it can at the same time control the display interface to show the dashed lines dividing the pause frame F4, so that the user can select the target block more easily.

FIG. 4 illustrates an embodiment in which a single target character pattern 41 or 42 exists in the target block. In an embodiment in which multiple target character patterns exist in the target block, the processor can extract from the audio, according to the correspondence between the pre-processed character feature groups and the pre-processed audio tracks, multiple determined audio tracks respectively corresponding to the multiple target character patterns, and control the display interface to continue playing and the audio output interface to output the multiple determined audio tracks.

For another embodiment in which multiple target character patterns exist in the target block, please refer to FIG. 1, FIG. 2, and FIG. 5 together, wherein FIG. 5 is a flow chart of the infrared remote control audio-visual playing method according to that embodiment. The method shown in FIG. 5 includes steps S201 to S204, S206, and S207 described in the embodiment of FIG. 2; the difference is that step S205 includes steps S501 to S505. Steps S501 to S505 can likewise be executed by the infrared remote control audio-visual device 10 shown in FIG. 1, but are not limited thereto. For ease of understanding, steps S501 to S505 are described below using the operation of the infrared remote control audio-visual device 10 as an example.

In step S501, the processor 19 determines whether the number of preselected patterns that match a pre-processed character feature group in the target block of the pause frame (the block indicated by the second indication signal) is one. When the result is yes, as shown in step S502, the processor 19 takes the preselected pattern as the target character pattern and then executes step S206. When the result is no (that is, there are multiple preselected patterns), as shown in step S503, the processor 19 requests a third indication signal through the display interface 13 and/or the audio output interface 15. For example, the processor 19 may control the display interface 13 to enlarge the target block so that it fills the screen, and/or to display a text message requesting the third indication signal. As another example, the processor 19 may control the audio output interface 15 to output a voice message requesting the third indication signal.

In steps S504 and S505, the processor 19 obtains the third indication signal through the infrared sensor 11 and, according to the third indication signal, determines that the preselected pattern located in a target sub-block of the target block is the target character pattern, wherein the third indication signal indicates the target sub-block. More specifically, the processor 19 may divide the target block into multiple divided sub-blocks and assign each divided sub-block a designated code, with the designated code of each divided sub-block corresponding to a different key or key combination on the infrared remote controller. The third indication signal generated by the infrared remote controller carries the designated code corresponding to the key or key combination that triggered it. When the processor 19 receives the third indication signal, it can determine that the divided sub-block corresponding to the designated code is the target sub-block. The processor 19 takes the preselected pattern located in the target sub-block as the target character pattern and then proceeds to step S206. In particular, when the processor 19 determines that no preselected pattern exists in the target sub-block, it either takes no action or outputs a request through the display interface 13 or the audio output interface 15 to ask for another third indication signal, and executes step S505 again after receiving the new third indication signal.
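The branching in steps S501 to S505 can be sketched as follows. The sketch makes several assumptions that the text does not specify: each preselected pattern is represented by a centre point, a pattern belongs to the sub-block that contains its centre, sub-blocks are numbered row by row from the top-left, and the third indication signal is obtained through a caller-supplied function. All names are hypothetical.

```python
def sub_block_rect(block, code, rows=3, cols=3):
    """Return the rectangle of the sub-block with the given designated code."""
    left, top, right, bottom = block
    row, col = divmod(code - 1, cols)
    width = (right - left) / cols
    height = (bottom - top) / rows
    return (left + col * width, top + row * height,
            left + (col + 1) * width, top + (row + 1) * height)


def select_target(candidates, block, request_third_code):
    """Pick the target character pattern inside a target block.

    candidates: list of (pattern_id, (x, y)) preselected patterns.
    block: (left, top, right, bottom) of the target block.
    request_third_code: callable that asks the user for a third indication
        signal and returns the designated code it carries.
    """
    if not candidates:
        return None
    if len(candidates) == 1:          # S501 "yes" branch, then S502
        return candidates[0][0]
    code = request_third_code()       # S503 and S504
    sub_left, sub_top, sub_right, sub_bottom = sub_block_rect(block, code)
    for pattern_id, (x, y) in candidates:   # S505
        if sub_left <= x < sub_right and sub_top <= y < sub_bottom:
            return pattern_id
    return None
```

A `None` result in the multi-candidate case corresponds to the situation in which no preselected pattern lies in the chosen sub-block, so the caller can ask for another third indication signal.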

To illustrate steps S501 to S505 with a practical example, please refer to FIG. 6A and FIG. 6B, which are schematic diagrams of pause frames of an infrared remote control audio-visual device according to another embodiment of the present invention. Assume that the pause frame F6 shown in FIG. 6A has the designated code layout shown in FIG. 3. When the second indication signal indicates designated code 6, the processor determines that exactly one preselected pattern 61 matching a pre-processed character feature group exists in the target block, and takes the preselected pattern 61 as the target character pattern. When the second indication signal indicates designated code 4, the processor determines that multiple preselected patterns 62 and 63 matching pre-processed character feature groups exist in the target block, and then controls the display interface to enlarge the target block so that it fills the screen, as shown in FIG. 6B. Assume that the target block F61 shown in FIG. 6B also has the designated code layout shown in FIG. 3. When the third indication signal indicates designated code 6, the processor determines that the preselected pattern 63 is the target character pattern. When the third indication signal indicates designated code 5, the processor determines that the preselected pattern 62 is the target character pattern.

In particular, the dashed lines that divide the pause frame F6 in FIG. 6A and the target block F61 in FIG. 6B can optionally be shown on the actual display. For example, when the processor controls the display interface to pause playback, it can at the same time control the display interface to show the dashed lines dividing the pause frame F6, so that the user can select the target block more easily; and/or, when it controls the display interface to enlarge the target block F61, it can at the same time control the display interface to show the dashed lines dividing the target block F61, so that the user can select the target sub-block more easily.

As mentioned above, the pre-processed character feature groups, the pre-processed audio tracks, and the correspondence between them that are stored in the memory of the infrared remote control audio-visual device can be produced by the device's internal processor or by an external processor (such as a cloud server) before the video is played, or produced by the processor during playback before the first indication signal is received. For the processing flow of this data, please refer to FIG. 7, which is a pre-processing flow chart of the infrared remote control audio-visual playing method according to an embodiment of the present invention. As shown in FIG. 7, the pre-processing flow involves AI techniques and may include steps S701 to S704.

In step S701, the processor performs multi-object tracking on multiple display frames of the video to obtain, for each of multiple characters, the feature blocks corresponding to that character in those display frames. The display frames here may be all display frames of the video, or the display frames already played before the processor receives the first indication signal. More specifically, the multi-object tracking performed by the processor may include: resizing the display frames; feeding the resized display frames into a pre-trained object detection model (for example, YOLOv3 or another detection model capable of detecting people) to generate multiple detection boxes; and feeding the detection boxes into a tracker to obtain tracking results for the characters, that is, the feature block of each character in each display frame. The tracker may run a multi-object tracking algorithm on its input, for example SORT (Simple Online and Realtime Tracking).
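To make the tracker stage of step S701 concrete, the following Python sketch links detection boxes across frames by greedy intersection-over-union (IoU) matching. It is a deliberately simplified stand-in and not SORT itself: SORT additionally predicts box motion with a Kalman filter and solves the assignment with the Hungarian algorithm. The detection boxes are assumed to come from an upstream detector, and the names `iou` and `track` and the threshold of 0.3 are illustrative assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def track(frames, iou_threshold=0.3):
    """Link per-frame detection boxes into per-character tracks.

    frames: list (one entry per display frame) of lists of boxes.
    Returns {track_id: {frame_index: box}}, i.e. the feature block of each
    tracked character in each frame where it was matched.
    Simplification: tracks never expire and no motion model is used.
    """
    tracks = {}    # track_id -> {frame_index: box}
    last_box = {}  # track_id -> most recent box of that track
    next_id = 0
    for t, boxes in enumerate(frames):
        # Score every (existing track, new box) pair, best overlap first.
        pairs = sorted(
            ((iou(last_box[tid], box), tid, j)
             for tid in last_box
             for j, box in enumerate(boxes)),
            reverse=True,
        )
        used_tracks, used_boxes = set(), set()
        for score, tid, j in pairs:
            if score < iou_threshold:
                break
            if tid in used_tracks or j in used_boxes:
                continue
            tracks[tid][t] = boxes[j]
            last_box[tid] = boxes[j]
            used_tracks.add(tid)
            used_boxes.add(j)
        # Any unmatched box starts a new track (a newly seen character).
        for j, box in enumerate(boxes):
            if j not in used_boxes:
                tracks[next_id] = {t: box}
                last_box[next_id] = box
                next_id += 1
    return tracks


# Two characters that each shift one pixel between two frames, with the
# detections listed in a different order in the second frame.
demo_frames = [
    [(0, 0, 10, 10), (100, 100, 110, 110)],
    [(101, 100, 111, 110), (1, 0, 11, 10)],
]
demo_tracks = track(demo_frames)
```

In the demo, each character keeps its track identifier across both frames even though the detector reported the boxes in a different order.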

In step S702, the processor obtains multiple appearance feature groups from the feature blocks corresponding to each character and uses them as the pre-processed character feature groups. More specifically, for each character, the processor may extract appearance features such as facial features and face shape from that character's tracking result and combine them into an appearance feature group, which serves as that character's pre-processed character feature group.

In step S703, the processor separates the audio into multiple pre-processed audio tracks with different voiceprints. More specifically, the processor may perform this separation with a pre-trained sound source separation model. The sound source separation model may be a model trained with an AI neural-network voice recognition algorithm, for example SORT. Note that the pre-processing of the video frames and the pre-processing of the video's audio may be performed separately or simultaneously. Besides being performed after step S702 as shown in FIG. 7, step S703 may be performed before step S701, between steps S701 and S702, or at the same time as step S701 or S702.

In step S704, the processor establishes the correspondence between the pre-processed character feature groups and the pre-processed audio tracks according to the display frames, the pre-processed character feature groups, and the pre-processed audio tracks. More specifically, this step may include treating each pre-processed audio track in turn as a target audio track and: performing facial motion detection on the feature blocks in the target frames, that is, the display frames corresponding to the periods in which the target audio track carries a signal; and determining, from the facial motion detection result, that the target audio track corresponds to one of the pre-processed character feature groups. In short, the processor may identify the character whose mouth is opening and closing in the display frames while the target audio track carries a signal, and establish the correspondence between the target audio track and that character's pre-processed character feature group. In particular, the processor first determines the periods in which the target audio track carries a signal and only then identifies the character whose mouth is opening and closing, which avoids mismatches between voice and motion, such as a character's motion appearing delayed relative to the sound.
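The matching logic of step S704 can be sketched as follows. This is a minimal illustration under assumptions, not the method as claimed: it assumes that signal presence per audio track and mouth motion per character have already been reduced to one boolean per display frame, and it picks the character whose mouth moves in the largest fraction of the target frames. The function name `match_tracks_to_characters` and the scoring rule are hypothetical.

```python
def match_tracks_to_characters(audio_activity, mouth_motion):
    """Map each pre-processed audio track to the most likely character.

    audio_activity: {track_name: [bool per frame]}, True where the track
                    carries a signal.
    mouth_motion:   {character_id: [bool per frame]}, True where mouth
                    opening/closing was detected in the character's
                    feature block.
    Returns {track_name: character_id or None}.
    """
    result = {}
    for track_name, activity in audio_activity.items():
        # First find the target frames (the track carries a signal) ...
        target_frames = [i for i, active in enumerate(activity) if active]
        best_character, best_score = None, 0.0
        if target_frames:
            # ... then look at mouth motion only within those frames.
            for character, motion in mouth_motion.items():
                moving = sum(1 for i in target_frames if motion[i])
                score = moving / len(target_frames)
                if score > best_score:
                    best_character, best_score = character, score
        result[track_name] = best_character
    return result


# Made-up per-frame flags for two audio tracks and two characters.
audio_activity = {
    "track_a": [True, True, False, False],
    "track_b": [False, False, True, True],
}
mouth_motion = {
    "character_1": [True, True, False, False],
    "character_2": [False, True, True, True],
}
mapping = match_tracks_to_characters(audio_activity, mouth_motion)
print(mapping)
```

Restricting the motion check to the target frames mirrors the order described above: the signal periods are determined first, and only then is the speaking character identified.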

With the above architecture, the infrared remote controlled video and audio device and the infrared remote control method of playing video and audio disclosed herein determine the audio track corresponding to the character designated by the infrared indication signal, based on the correspondence between multiple pre-processed character feature groups and multiple pre-processed audio tracks, and can thereby provide the function of playing the voice of a designated character alone.

Although the present invention is disclosed above by way of the foregoing embodiments, they are not intended to limit the present invention. Changes and modifications made without departing from the spirit and scope of the present invention fall within the scope of patent protection of the present invention. For the scope of protection defined by the present invention, refer to the appended claims.

10: infrared remote controlled video and audio device
11: infrared sensor
13: display interface
15: audio output interface
17: memory
19: processor
1~9: designation codes
F3, F4, F6: pause frames
F31: divided block
F61: target block
41, 42: target character patterns
61, 62, 63: preselected patterns

FIG. 1 is a functional block diagram of an infrared remote controlled video and audio device according to an embodiment of the present invention.
FIG. 2 is a flowchart of an infrared remote control method of playing video and audio according to an embodiment of the present invention.
FIG. 3 is a usage scenario diagram of an infrared remote controlled video and audio device according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of a pause frame of an infrared remote controlled video and audio device according to an embodiment of the present invention.
FIG. 5 is a flowchart of an infrared remote control method of playing video and audio according to another embodiment of the present invention.
FIGS. 6A and 6B are schematic diagrams of pause frames of an infrared remote controlled video and audio device according to another embodiment of the present invention.
FIG. 7 is a pre-processing flowchart of an infrared remote control method of playing video and audio according to an embodiment of the present invention.

Claims (10)

1. An infrared remote controlled video and audio device, comprising:
an infrared sensor configured to receive a first indication signal and a second indication signal;
a display interface configured to play multiple display frames of a video;
an audio output interface configured to output audio of the video;
a memory storing multiple pre-processed character feature groups, multiple pre-processed audio tracks, and a correspondence between the pre-processed character feature groups and the pre-processed audio tracks; and
a processor connected to the infrared sensor, the display interface, the audio output interface, and the memory, and configured to:
pause the playback of the display interface and the output of the audio output interface according to the first indication signal, so that the display interface displays a pause frame;
determine, according to the second indication signal, a target character pattern located in a target block of the pause frame, wherein the second indication signal indicates the target block, and the target character pattern matches one of the pre-processed character feature groups;
extract, from the audio, a determined audio track corresponding to the target character pattern according to the correspondence between the pre-processed character feature groups and the pre-processed audio tracks; and
control the display interface to resume playback, and control the audio output interface to output the determined audio track.
2. The infrared remote controlled video and audio device of claim 1, wherein the processor is further configured to:
perform multi-object tracking on the display frames to obtain, for each of multiple characters, multiple feature blocks corresponding to that character in the display frames;
obtain multiple appearance feature groups, according to the feature blocks corresponding to each of the characters, as the pre-processed character feature groups;
separate the audio into multiple pre-processed audio tracks with different voiceprints; and
establish the correspondence between the pre-processed character feature groups and the pre-processed audio tracks according to the display frames, the pre-processed character feature groups, and the pre-processed audio tracks.

3. The infrared remote controlled video and audio device of claim 2, wherein establishing, by the processor, the correspondence between the pre-processed character feature groups and the pre-processed audio tracks according to the display frames, the pre-processed character feature groups, and the pre-processed audio tracks comprises:
treating each of the pre-processed audio tracks as a target audio track, and:
performing facial motion detection on the feature blocks in multiple target frames, among the display frames, that correspond to a period in which the target audio track carries a signal; and
determining, according to a result of the facial motion detection, that the target audio track corresponds to one of the pre-processed character feature groups.
4. The infrared remote controlled video and audio device of claim 1, wherein the processor is configured to, upon determining that the target block of the pause frame contains multiple preselected patterns matching more than one of the pre-processed character feature groups, request a third indication signal through the display interface or the audio output interface, obtain the third indication signal through the infrared sensor, and determine, according to the third indication signal, that the preselected pattern located in a target sub-block of the target block is the target character pattern, wherein the third indication signal indicates the target sub-block.

5. The infrared remote controlled video and audio device of claim 1, wherein the processor is further configured to divide the pause frame into multiple divided blocks and assign a designation code to each of the divided blocks, wherein the target block is one of the divided blocks, and the second indication signal indicates the designation code corresponding to the target block.
6. An infrared remote control method of playing video and audio, comprising:
playing multiple display frames of a video by a display interface, and outputting audio of the video by an audio output interface;
receiving a first indication signal by an infrared sensor;
pausing, by a processor according to the first indication signal, the playback of the display interface and the output of the audio output interface, so that the display interface displays a pause frame;
receiving a second indication signal by the infrared sensor;
determining, by the processor according to the second indication signal, a target character pattern located in a target block of the pause frame, wherein the second indication signal indicates the target block, and the target character pattern matches one of multiple pre-processed character feature groups;
extracting, by the processor, a determined audio track corresponding to the target character pattern from the audio according to a correspondence between the pre-processed character feature groups and multiple pre-processed audio tracks; and
controlling the display interface to resume playback, and controlling the audio output interface to output the determined audio track.
7. The infrared remote control method of playing video and audio of claim 6, further comprising:
performing multi-object tracking on the display frames to obtain, for each of multiple characters, multiple feature blocks corresponding to that character in the display frames;
obtaining multiple appearance feature groups, according to the feature blocks corresponding to each of the characters, as the pre-processed character feature groups;
separating the audio into multiple pre-processed audio tracks with different voiceprints; and
establishing the correspondence between the pre-processed character feature groups and the pre-processed audio tracks according to the display frames, the pre-processed character feature groups, and the pre-processed audio tracks.

8. The infrared remote control method of playing video and audio of claim 7, wherein establishing the correspondence between the pre-processed character feature groups and the pre-processed audio tracks according to the display frames, the pre-processed character feature groups, and the pre-processed audio tracks comprises:
treating each of the pre-processed audio tracks as a target audio track, and:
performing facial motion detection on the feature blocks in multiple target frames, among the display frames, that correspond to a period in which the target audio track carries a signal; and
determining, according to a result of the facial motion detection, that the target audio track corresponds to one of the pre-processed character feature groups.
9. The infrared remote control method of playing video and audio of claim 6, wherein determining, according to the second indication signal, the target character pattern located in the target block of the pause frame comprises:
upon determining that the target block of the pause frame contains multiple preselected patterns matching more than one of the pre-processed character feature groups, requesting a third indication signal through the display interface or the audio output interface;
obtaining the third indication signal through the infrared sensor; and
determining, according to the third indication signal, that the preselected pattern located in a target sub-block of the target block is the target character pattern, wherein the third indication signal indicates the target sub-block.

10. The infrared remote control method of playing video and audio of claim 6, further comprising:
dividing the pause frame into multiple divided blocks, and assigning a designation code to each of the divided blocks;
wherein the target block is one of the divided blocks, and the second indication signal indicates the designation code corresponding to the target block.
TW110134476A 2021-09-15 2021-09-15 Infrared remote controlled video and audio device and infrared remote control method of playing video and audio TW202315414A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW110134476A TW202315414A (en) 2021-09-15 2021-09-15 Infrared remote controlled video and audio device and infrared remote control method of playing video and audio

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW110134476A TW202315414A (en) 2021-09-15 2021-09-15 Infrared remote controlled video and audio device and infrared remote control method of playing video and audio

Publications (1)

Publication Number Publication Date
TW202315414A true TW202315414A (en) 2023-04-01

Family

ID=86943048

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110134476A TW202315414A (en) 2021-09-15 2021-09-15 Infrared remote controlled video and audio device and infrared remote control method of playing video and audio

Country Status (1)

Country Link
TW (1) TW202315414A (en)
