WO2020138618A1 - Method and apparatus for music emotion recognition - Google Patents

Method and apparatus for music emotion recognition

Info

Publication number
WO2020138618A1
Authority
WO
WIPO (PCT)
Prior art keywords
music
emotion recognition
words
weight
processor
Prior art date
Application number
PCT/KR2019/008959
Other languages
English (en)
Korean (ko)
Inventor
김은이
Original Assignee
건국대학교 산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 건국대학교 산학협력단
Publication of WO2020138618A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • the embodiments below relate to a method and apparatus for recognizing the emotion of music.
  • The importance of audio or lyrics depends on the style of the music: in dance music the audio is dominant, whereas in poetic music the lyrics are key. Various psychological studies have confirmed the importance of lyrics in conveying semantic information. However, despite this importance, conventional studies on lyrics-based music emotion recognition have limitations.
  • Embodiments may provide a technique for recognizing the emotion of music.
  • the music emotion recognition method includes receiving data related to the lyrics of music, generating a vocabulary dictionary based on the data, extracting a feature vector based on the vocabulary dictionary and on weights corresponding to the words included in the data, and determining the emotion of the music by using the feature vector as an input to an artificial neural network.
  • the generating may include filtering the data based on the type of language, filtering the language-filtered data based on the parts of speech of the words, and generating the vocabulary dictionary by removing meaningless words from the part-of-speech-filtered data.
  • the step of generating the vocabulary dictionary by removing meaningless words from the part-of-speech-filtered data may include generating the vocabulary dictionary by removing at least one of numbers, interjections, single alphabet letters, and relative pronouns from that data.
  • the extracting may include generating a set of words based on the data, generating an occurrence vector based on the vocabulary dictionary and the set of words, and extracting the feature vector by transforming the components of the occurrence vector based on the weights.
  • the generating of the set of words may include dividing the words included in the data and generating the set of words by restoring each word to its original form.
  • extracting the feature vector by transforming the components of the occurrence vector based on the weights may include calculating a first weight based on the number of words included in the set of words, calculating a second weight based on a nonlinear function parameterized by a predetermined constant, and extracting the feature vector by transforming the components of the occurrence vector based on the product of the first weight and the second weight.
  • the calculating of the first weight may include calculating the first weight using TF-IDF (Term Frequency-Inverse Document Frequency).
  • the calculating of the second weight may include calculating the second weight based on a sigmoid function.
  • the determining may include calculating a probability value corresponding to a plurality of emotional groups using the artificial neural network, and determining the emotion of the music based on the probability value.
  • the artificial neural network is a Deep Belief Network (DBN), and the DBN can be trained using transfer learning.
  • the music emotion recognition apparatus includes a receiver for receiving data related to the lyrics of music, and a processor for generating a vocabulary dictionary based on the data, extracting a feature vector based on the vocabulary dictionary and on weights corresponding to the words included in the data, and determining the emotion of the music by using the feature vector as an input to an artificial neural network.
  • the processor may filter the data based on the type of language, filter the language-filtered data based on the parts of speech of the words, and generate the vocabulary dictionary by removing meaningless words from the part-of-speech-filtered data.
  • the processor may generate the vocabulary dictionary by removing at least one of numbers, interjections, single alphabet letters, and relative pronouns from the part-of-speech-filtered data.
  • the processor may generate the set of words based on the data, generate an occurrence vector based on the vocabulary dictionary and the set of words, and extract the feature vector by transforming the components of the occurrence vector based on the weights.
  • the processor may generate the set of words by dividing the words included in the data and restoring each word to its original form.
  • the processor may calculate a first weight based on the number of words included in the set of words, calculate a second weight based on a nonlinear function parameterized by a predetermined constant, and extract the feature vector by transforming the components of the occurrence vector based on the product of the first weight and the second weight.
  • the processor may calculate the first weight using TF-IDF (Term Frequency-Inverse Document Frequency).
  • the processor may calculate the second weight based on a sigmoid function.
  • the processor may calculate a probability value corresponding to a plurality of emotional groups using the artificial neural network, and determine the emotion of the music based on the probability value.
  • the artificial neural network is a Deep Belief Network (DBN), and the DBN can be trained using transfer learning.
  • FIG. 1 is a schematic block diagram of a music emotion recognition apparatus according to an embodiment.
  • FIG. 2 shows the overall operation of the music emotion recognition device shown in FIG. 1.
  • FIG. 3 illustrates an operation in which the music emotion recognition apparatus illustrated in FIG. 1 generates a vocabulary dictionary.
  • FIG. 4A shows the distribution of feature vectors by TF-IDF weights.
  • FIG. 4B shows the distribution of feature vectors by weights according to the apparatus for recognizing music emotion shown in FIG. 1.
  • FIG. 5 shows a comparison result of recognition accuracy between the prior art and the music emotion recognition apparatus shown in FIG. 1.
  • FIG. 6 is a flowchart illustrating an operation of the music emotion recognition device illustrated in FIG. 1.
  • Terms such as first or second may be used to describe various components, but the components should not be limited by these terms. The terms serve only to distinguish one component from another; for example, without departing from the scope of rights according to the concept of the embodiments, a first component may be referred to as a second component, and similarly a second component may be referred to as a first component.
  • FIG. 1 is a schematic block diagram of a music emotion recognition apparatus according to an embodiment.
  • the music emotion recognition device 10 may recognize emotion of music.
  • Music emotion recognition device 10 may recognize the emotion of music based on the lyrics of the music.
  • the music emotion recognition device 10 may recognize the emotion of music by analyzing lyrics of music using an artificial neural network.
  • the emotion of music can include the emotions of a person listening to the music. An emotion can mean a feeling or mood that arises in response to a phenomenon or thing.
  • the emotion of music can mean an emotion included in Russell's emotion groups.
  • the emotion of music can include happiness, tension, sadness, and relaxation.
  • the music emotion recognition device 10 may be implemented as a printed circuit board (PCB) such as a mother board, an integrated circuit (IC), or a system on chip (SoC).
  • the music emotion recognition device 10 may be implemented as an application processor.
  • the music emotion recognition device 10 may be implemented in a personal computer (PC), a data server, or a portable device.
  • Portable devices may be implemented as a laptop computer, mobile phone, smart phone, tablet PC, mobile internet device (MID), personal digital assistant (PDA), enterprise digital assistant (EDA), digital still camera, digital video camera, portable multimedia player (PMP), personal navigation device or portable navigation device (PND), handheld game console, e-book reader, or smart device.
  • the smart device may be implemented as a smart watch, a smart band, or a smart ring.
  • the music emotion recognition device 10 uses a new feature extraction technique that exploits semantics to assign higher weights to emotion-related words, thereby providing a higher recognition rate.
  • the music emotion recognition device 10 includes a receiver 100 and a processor 200.
  • the music emotion recognition device 10 may further include a memory 300.
  • the receiver 100 may receive data related to lyrics of music.
  • the receiver 100 may receive data related to lyrics of music from the outside, or may receive data from the memory 300.
  • the receiver 100 may output data related to lyrics of music to the processor 200.
  • the data related to the lyrics of the music may include the target music to be recognized, the lyrics of the target music to be recognized, the lyrics of the music to train the artificial neural network, and the lyrics dataset to generate the vocabulary dictionary.
  • the processor 200 may generate a vocabulary dictionary based on data related to lyrics of the received music.
  • the processor 200 may generate a vocabulary dictionary by filtering data related to lyrics of music.
  • the processor 200 may filter data related to lyrics of music based on the type of language.
  • the processor 200 may filter the language-filtered data based on the parts of speech of the words.
  • the processor 200 may remove meaningless words from the part-of-speech-filtered data. Specifically, the processor 200 may remove at least one of numbers, interjections, single alphabet letters, and relative pronouns from the part-of-speech-filtered data. The order and type of filtering can be changed as needed. The filtering operation will be described in detail with reference to FIG. 3.
  • the processor 200 may extract the feature vector based on the generated vocabulary dictionary and weights corresponding to words included in the data.
  • the processor 200 may generate a set of words based on data related to lyrics of music.
  • the processor 200 may divide words included in the data.
  • the processor 200 may generate a set of words by restoring the original form of the divided words.
  • the processor 200 may generate an occurrence vector based on a vocabulary dictionary and a set of words.
  • the processor 200 may extract a feature vector by transforming the components of the occurrence vector based on the weights.
  • the processor 200 may calculate the first weight based on the number of words included in the set of words.
  • the processor 200 may calculate a first weight using TF-IDF (Term Frequency-Inverse Document Frequency).
  • the processor 200 may calculate the second weight based on a nonlinear function according to a predetermined constant. For example, the processor 200 may calculate the second weight based on the sigmoid function.
  • the processor 200 may extract the feature vector by transforming the components of the occurrence vector based on the product of the first weight and the second weight.
  • the processor 200 may determine the emotion of music by using the feature vector as an input of an artificial neural network.
  • the processor 200 may calculate a probability value corresponding to a plurality of emotional groups using an artificial neural network. At this time, the processor 200 may determine the emotion of the music based on the probability value.
  • the processor 200 may train an artificial neural network.
  • the artificial neural network may include a Recurrent Neural Network (RNN), a Convolutional Neural Network (CNN), and a Deep Belief Network (DBN).
  • the artificial neural network may include a DBN.
  • the processor 200 may train an artificial neural network using transfer learning.
  • the processor 200 may train the DBN using transfer learning.
  • the memory 300 may store learning parameters of an artificial neural network, probability values according to an emotional model, data about received music, lyrics information, and the like.
  • the memory 300 may be implemented as a volatile memory device or a nonvolatile memory device.
  • the volatile memory device may be implemented with dynamic random access memory (DRAM), static random access memory (SRAM), thyristor RAM (T-RAM), zero capacitor RAM (Z-RAM), or twin transistor RAM (TTRAM).
  • Nonvolatile memory devices may be implemented as Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory, Magnetic RAM (MRAM), Spin-Transfer Torque MRAM (STT-MRAM), Conductive Bridging RAM (CBRAM), Ferroelectric RAM (FeRAM), Phase-change RAM (PRAM), Resistive RAM (RRAM), Nanotube RRAM, Polymer RAM (PoRAM), Nano Floating Gate Memory (NFGM), holographic memory, a Molecular Electronic Memory Device, or Insulator Resistance Change Memory.
  • FIG. 2 illustrates the overall operation of the music emotion recognition device illustrated in FIG. 1.
  • FIG. 3 illustrates an operation in which the music emotion recognition apparatus illustrated in FIG. 1 generates a vocabulary dictionary.
  • the music emotion recognition apparatus 10 may classify and recommend music by emotion by recognizing the human emotion hidden in the lyrics of music.
  • the music emotion recognition device 10 may construct the vocabulary dictionary required for music lyrics analysis.
  • the music emotion recognition apparatus 10 may extract a 1082-dimensional feature vector for the input music lyrics through vector quantization based on the vocabulary dictionary and an enhanced weighting technique for emotion words.
  • the music emotion recognition apparatus 10 may generate probability values over Russell's emotion groups for individual music through a Deep Belief Network (DBN) trained using transfer learning.
  • the music emotion recognition device 10 may generate a vocabulary dictionary based on data related to lyrics of the received music.
  • the music emotion recognition device 10 may construct the vocabulary dictionary from the most frequently used words in the Million Song Dataset (MSD), the most widely used dataset in the field of music emotion recognition (MER).
  • the music emotion recognition device 10 may generate a vocabulary dictionary by filtering data related to lyrics of music.
  • the music emotion recognition apparatus 10 may remove words irrelevant to emotion and words corresponding to noise, and select only the vocabulary to be used for the feature vector.
  • the music emotion recognition device 10 may filter data based on the type of language. For example, the music emotion recognition apparatus 10 may remove words written in languages other than English using an available language-detection application programming interface (API).
  • the music emotion recognition device 10 may filter data based on the parts of speech of the words. For example, the music emotion recognition apparatus 10 may filter the data so that only parts of speech that can affect the emotion of the music (e.g., adjectives, nouns, verbs) remain.
  • the music emotion recognition device 10 may remove words that have no meaning.
  • the music emotion recognition apparatus 10 may filter out tokens such as numbers and simple interjections.
  • the music emotion recognition apparatus 10 may generate a vocabulary dictionary including a plurality of words.
  • the music emotion recognition device 10 may generate a vocabulary dictionary composed of 1082 English words.
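  • As an illustrative sketch only (the disclosure does not name specific libraries), the filtering steps above can be approximated in Python with the langdetect and NLTK packages; the keep-list of part-of-speech tags and the frequency-based selection below are assumptions, not the exact method of the embodiment.

```python
# Sketch of vocabulary-dictionary construction: language filtering, part-of-speech
# filtering, removal of meaningless tokens, then selection of the most frequent words.
# langdetect/NLTK are assumed tools; nltk.download('punkt') and
# nltk.download('averaged_perceptron_tagger') are required once.
from collections import Counter

import nltk
from langdetect import detect

KEEP_TAGS = {"JJ", "JJR", "JJS",                        # adjectives
             "NN", "NNS",                               # nouns
             "VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}   # verbs

def build_vocabulary(lyrics_corpus, vocab_size=1082):
    """Build a vocabulary dictionary (list of words) from a list of lyric strings."""
    counts = Counter()
    for lyrics in lyrics_corpus:
        try:
            if detect(lyrics) != "en":      # filter by type of language (keep English)
                continue
        except Exception:
            continue                        # undetectable lyrics are skipped
        for word, tag in nltk.pos_tag(nltk.word_tokenize(lyrics.lower())):
            if tag in KEEP_TAGS and word.isalpha() and len(word) > 1:
                counts[word] += 1           # drop numbers, single letters, other noise
    return [word for word, _ in counts.most_common(vocab_size)]
```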
  • the music emotion recognition apparatus 10 may extract the feature vector based on the generated vocabulary dictionary and weights corresponding to words included in the data.
  • the music emotion recognition apparatus 10 may represent lyrics using a new Bag-of-Words (BoW) representation that improves on the conventional BoW representation by giving greater weight to emotional vocabulary.
  • the music emotion recognition device 10 may extract a feature vector through three processes. First, the music emotion recognition device 10 may pre-process the lyrics of the received music.
  • the music emotion recognition device 10 may divide words included in the data.
  • the music emotion recognition apparatus 10 may generate a set of words by restoring the original form of the divided words.
  • the music emotion recognition apparatus 10 may perform tokenization, which divides the lyrics of each piece of music into a set of words, and stemming, which restores each divided word to its original form.
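  • A minimal pre-processing sketch, assuming NLTK's tokenizer and Porter stemmer (the disclosure does not specify which tokenizer or stemmer is used):

```python
# Tokenization splits the lyrics into a set of words; stemming approximates
# "restoring the original form" of each word.
import nltk
from nltk.stem import PorterStemmer

def preprocess_lyrics(lyrics: str) -> list[str]:
    stemmer = PorterStemmer()
    tokens = nltk.word_tokenize(lyrics.lower())
    return [stemmer.stem(token) for token in tokens if token.isalpha()]
```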
  • the music emotion recognition apparatus 10 may perform vector quantization.
  • the music emotion recognition apparatus 10 may quantize the lyrics of each piece of music against the generated vocabulary dictionary and express them as a 1082-dimensional occurrence vector.
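  • A sketch of this vector quantization step, reusing the hypothetical preprocess_lyrics helper above; the result is an occurrence (count) vector over the vocabulary dictionary:

```python
import numpy as np

def occurrence_vector(lyrics: str, vocabulary: list[str]) -> np.ndarray:
    """Count how often each word of the vocabulary dictionary occurs in the lyrics."""
    index = {word: i for i, word in enumerate(vocabulary)}
    vec = np.zeros(len(vocabulary), dtype=np.float32)
    for word in preprocess_lyrics(lyrics):
        if word in index:
            vec[index[word]] += 1.0
    return vec
```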
  • the music emotion recognition apparatus 10 may calculate a weight based on the emotional vocabulary.
  • the music emotion recognition apparatus 10 may calculate a first weight based on the number of words included in the set of words.
  • the music emotion recognition apparatus 10 may calculate a first weight using TF-IDF (Term Frequency-Inverse Document Frequency). For example, the music emotion recognition apparatus 10 may calculate the first weight using Equation (1).
  • In Equation (1), N denotes the frequency with which a word appears in the lyrics of all the music in the database, and N_i denotes the frequency with which the word appears in the lyrics of the i-th piece of music.
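  • Equation (1) is not reproduced in this text; the sketch below therefore uses the standard TF-IDF formulation as an assumption, with the corpus-level and per-song counts playing the roles of N and N_i described above:

```python
import numpy as np

def first_weights(occurrence_matrix: np.ndarray) -> np.ndarray:
    """occurrence_matrix: (num_songs, vocab_size) counts; returns per-song TF-IDF weights."""
    tf = occurrence_matrix / np.maximum(occurrence_matrix.sum(axis=1, keepdims=True), 1.0)
    df = np.count_nonzero(occurrence_matrix, axis=0)   # number of songs containing each word
    idf = np.log(occurrence_matrix.shape[0] / np.maximum(df, 1.0))
    return tf * idf
```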
  • the music emotion recognition apparatus 10 may calculate a second weight based on a nonlinear function according to a predetermined constant.
  • the music emotion recognition apparatus 10 may calculate a second weight based on a sigmoid function.
  • the music emotion recognition apparatus 10 may calculate the second weight using Equation (2).
  • the second weight may be defined as a sentiment score.
  • In Equation (2), the slope determination constant determines the steepness of the nonlinear function, and S_i denotes a constant determined by an emotion word dictionary. S_i may be a value provided by the emotion word dictionary, in which emotion-related words are scored from -3 to +3.
  • the emotion word dictionaries used by the music emotion recognition device 10 may include SentiStrength and SentiWordNet, which are used in data mining. Also, the slope determination constant can be determined through various experiments.
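  • Equation (2) is likewise not reproduced here; the sketch below assumes the second weight is a sigmoid of the dictionary score S_i scaled by the slope determination constant, which is one plausible reading of the description above:

```python
import numpy as np

def second_weights(sentiment_scores: np.ndarray, slope: float = 1.0) -> np.ndarray:
    """sentiment_scores: per-word scores in [-3, +3] taken from an emotion word dictionary
    such as SentiStrength or SentiWordNet; slope is the experimentally chosen constant."""
    return 1.0 / (1.0 + np.exp(-slope * sentiment_scores))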
  • the music emotion recognition apparatus 10 may extract the feature vector by transforming the components of the occurrence vector based on the product of the first weight and the second weight. For example, the music emotion recognition apparatus 10 may calculate the weight of each element of the occurrence vector using Equation (3).
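  • Interpreting the transformation as an element-wise product (an assumption; Equation (3) is not reproduced here), the weighted feature vector can be sketched as:

```python
def feature_vector(occ_vec, w_first, w_second):
    """Transform each component of the occurrence vector by the product of the two weights."""
    return occ_vec * w_first * w_second
```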
  • the music emotion recognition device 10 may recognize the emotion of music using an artificial neural network. In addition, the music emotion recognition device 10 may train an artificial neural network.
  • the music emotion recognition apparatus 10 may perform emotion recognition using a DBN.
  • the emotion classes used by the music emotion recognition device 10 may include happiness, relaxation, sadness, and tension, each representing a quadrant of Russell's emotion model composed of arousal and valence axes.
  • the music emotion recognition device 10 may use transfer learning to reduce the cost of training the DBN.
  • the structure of the DBN used by the music emotion recognition device 10 may include an input layer, two hidden layers, and an output layer. The input layer may have 1082 nodes, and the first and second hidden layers may have 1000 and 500 nodes, respectively.
  • the output layer may have four nodes, corresponding to the four quadrant emotions of Russell's model.
  • the music emotion recognition apparatus 10 may learn the parameter values of the DBN through fine-tuning using 3,000 collected music lyrics.
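  • The layer sizes below (1082-1000-500-4) follow the structure described above; PyTorch is an assumption, and a true DBN would first be pretrained layer-wise with RBMs and transferred before fine-tuning, so only the supervised fine-tuning and prediction stages are sketched here:

```python
import torch
import torch.nn as nn

EMOTIONS = ["happiness", "relaxation", "sadness", "tension"]

class LyricsEmotionDBN(nn.Module):
    """Feed-forward view of the DBN: 1082 inputs, hidden layers of 1000 and 500, 4 outputs."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(1082, 1000), nn.Sigmoid(),
            nn.Linear(1000, 500), nn.Sigmoid(),
            nn.Linear(500, 4),                    # happiness, relaxation, sadness, tension
        )

    def forward(self, x):
        return self.layers(x)

def fine_tune(model, features, labels, epochs=20, lr=1e-3):
    """Fine-tune on the collected lyric feature vectors (features: (N, 1082), labels: (N,))."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(features), labels)
        loss.backward()
        optimizer.step()
    return model

def predict_emotion(model, feature_vec):
    """Return the emotion group with the highest probability for one feature vector."""
    probabilities = torch.softmax(model(feature_vec), dim=-1)
    return EMOTIONS[int(probabilities.argmax())]
```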
  • FIG. 4A shows the distribution of feature vectors by TF-IDF weights
  • FIG. 4B shows the distribution of feature vectors by weights according to the music emotion recognition apparatus shown in FIG. 1.
  • the music emotion recognition apparatus 10 may improve the discriminative power of individual music lyrics by considering emotion-word-based weights using emotion scores. Referring to FIG. 4B, individual pieces of music may be expressed in a more distinguishable form through the feature vector extraction method described above.
  • unlike the case where only TF-IDF is used, the feature vectors generated by the music emotion recognition apparatus 10 may show different distributions according to the emotion group. Through this, the music emotion recognition device 10 can further improve the performance of music classification.
  • FIG. 5 shows a comparison result of recognition accuracy between the prior art and the music emotion recognition apparatus shown in FIG. 1.
  • the music emotion recognition device 10 exhibits a higher recognition rate than the prior art.
  • In FIG. 5, music emotion recognition device 1 shows the performance when using the SentiWordNet emotion word dictionary, and music emotion recognition device 2 shows the performance when using the SentiStrength emotion word dictionary.
  • the variant using SentiStrength shows the highest recognition performance.
  • FIG. 6 is a flowchart illustrating an operation of the music emotion recognition device illustrated in FIG. 1.
  • the receiver 100 may receive data related to lyrics of music (610).
  • the receiver 100 may receive data related to lyrics of music from the outside, or may receive data from the memory 300.
  • the receiver 100 may output data related to lyrics of music to the processor 200.
  • the processor 200 may generate a vocabulary dictionary based on data related to lyrics of the received music (630).
  • the processor 200 may generate a vocabulary dictionary by filtering data related to lyrics of music.
  • the processor 200 may filter data related to lyrics of music based on the type of language.
  • the processor 200 may filter the language-filtered data based on the parts of speech of the words.
  • the processor 200 may remove meaningless words from the part-of-speech-filtered data. Specifically, the processor 200 may remove at least one of numbers, interjections, single alphabet letters, and relative pronouns from the part-of-speech-filtered data.
  • the processor 200 may extract a feature vector based on the generated vocabulary dictionary and weights corresponding to words included in the data (650).
  • the processor 200 may generate a set of words based on data related to lyrics of music.
  • the processor 200 may divide words included in the data.
  • the processor 200 may generate a set of words by restoring the original form of the divided words.
  • the processor 200 may generate an occurrence vector based on a vocabulary dictionary and a set of words.
  • the processor 200 may extract a feature vector by transforming the components of the occurrence vector based on the weights.
  • the processor 200 may calculate the first weight based on the number of words included in the set of words.
  • the processor 200 may calculate a first weight using TF-IDF (Term Frequency-Inverse Document Frequency).
  • the processor 200 may calculate the second weight based on a nonlinear function according to a predetermined constant. For example, the processor 200 may calculate the second weight based on the sigmoid function.
  • the processor 200 may extract the feature vector by transforming the components of the occurrence vector based on the product of the first weight and the second weight.
  • the processor 200 may determine the emotion of the music by using the feature vector as an input of the artificial neural network (670).
  • the processor 200 may calculate a probability value corresponding to a plurality of emotional groups using an artificial neural network. At this time, the processor 200 may determine the emotion of the music based on the probability value.
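  • Putting the steps of FIG. 6 together (610 receive, 630 generate dictionary, 650 extract feature vector, 670 determine emotion), a composite sketch using the hypothetical helpers above might look as follows; the per-word idf and sentiment_weights arrays are assumed to be precomputed from the training corpus:

```python
import numpy as np
import torch

def recognize_emotion(lyrics, vocabulary, idf, sentiment_weights, model):
    occ = occurrence_vector(lyrics, vocabulary)    # 650: quantize against the dictionary from 630
    tf = occ / max(float(occ.sum()), 1.0)
    features = (tf * idf * sentiment_weights).astype(np.float32)   # 650: weighted feature vector
    return predict_emotion(model, torch.from_numpy(features))      # 670: DBN output -> emotion
```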
  • the method according to the embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium.
  • Computer-readable media may include program instructions, data files, data structures, or the like alone or in combination.
  • the program instructions recorded on the medium may be specially designed and constructed for the embodiments, or may be known and usable by those skilled in computer software.
  • Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory.
  • Examples of program instructions include not only machine code produced by a compiler but also high-level language code that can be executed by a computer using an interpreter.
  • the hardware device can be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.
  • the software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may command the processing device independently or collectively.
  • Software and/or data may be permanently or temporarily embodied in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or in a transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device.
  • the software may be distributed on networked computer systems and stored or executed in a distributed manner. Software and data may be stored in one or more computer-readable recording media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method and apparatus for music emotion recognition. A music emotion recognition method according to one embodiment comprises the steps of: receiving data related to song lyrics; generating a vocabulary dictionary on the basis of the data; extracting a feature vector on the basis of the data and a weight corresponding to a word included in the vocabulary dictionary; and determining the emotion of the music by using the feature vector as an input of an artificial neural network.
PCT/KR2019/008959 2018-12-28 2019-07-19 Method and apparatus for music emotion recognition WO2020138618A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020180172014A KR101987605B1 (ko) 2018-12-28 2018-12-28 음악 감성 인식 방법 및 장치 (Music emotion recognition method and apparatus)
KR10-2018-0172014 2018-12-28

Publications (1)

Publication Number Publication Date
WO2020138618A1 true WO2020138618A1 (fr) 2020-07-02

Family

ID=66848023

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2019/008959 WO2020138618A1 (fr) 2018-12-28 2019-07-19 Method and apparatus for music emotion recognition

Country Status (2)

Country Link
KR (1) KR101987605B1 (fr)
WO (1) WO2020138618A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656560B (zh) * 2021-10-19 2022-02-22 腾讯科技(深圳)有限公司 情感类别的预测方法和装置、存储介质及电子设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20040071720A (ko) * 2001-12-12 2004-08-12 소니 일렉트로닉스 인코포레이티드 텍스트 메시지내에 감정을 표현하는 방법
KR20100024769A (ko) * 2008-08-26 2010-03-08 금오공과대학교 산학협력단 음악추천 시스템 및 방법
KR20130022075A (ko) * 2011-08-24 2013-03-06 한국전자통신연구원 감성 어휘 정보 구축 방법 및 장치
WO2017058844A1 (fr) * 2015-09-29 2017-04-06 Amper Music, Inc. Machines, systèmes et procédés de composition et de génération automatique de musique au moyen de descripteurs d'expérience musicale basés sur des icônes linguistiques et/ou graphiques
KR20180092733A (ko) * 2017-02-10 2018-08-20 강원대학교산학협력단 관계 추출 학습 데이터 생성 방법

Also Published As

Publication number Publication date
KR101987605B1 (ko) 2019-06-10

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19902840

Country of ref document: EP

Kind code of ref document: A1