WO2021251188A1 - Recommended information providing device - Google Patents
- Publication number
- WO2021251188A1 (application PCT/JP2021/020516)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- user
- pitch
- learning model
- music
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0008—Associated control or indicating means
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/36—Accompaniment arrangements
- G10H1/361—Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K15/00—Acoustics not otherwise provided for
- G10K15/04—Sound-producing devices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G10H2210/066—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for pitch analysis as part of wider processing for musical purposes, e.g. transcription, musical performance evaluation; Pitch recognition, e.g. in polyphonic sounds; Estimation or use of missing fundamental
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G10H2210/091—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for performance evaluation, i.e. judging, grading or scoring the musical qualities or faithfulness of a performance, e.g. with respect to pitch, tempo or other timings of a reference performance
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/311—Neural networks for electrophonic musical instruments or musical processing, e.g. for musical recognition or control, automatic composition or improvisation
Definitions
- One aspect of the present invention relates to a recommended information providing device that provides recommended information.
- conventionally, the pitch setting information of a recommended song is output using the history of scoring results associated with the setting key information of songs for which the user has already set a key. It therefore tends to be difficult to obtain recommended information on the pitch setting recommended to the user for songs that the user has rarely sung. Accordingly, it has been desired to provide recommended information that matches the user's past singing tendencies for a wide variety of songs.
- the recommended information providing device of the present embodiment provides recommended information and includes at least one processor. The at least one processor acquires, for each temporal section of a song, the scoring results of the user's past singing, and acquires pitch information indicating the pitches of the sounds that make up the song, arranged in time series within the section.
- using the scoring results and the pitch information as training data, the processor builds a learning model that predicts the scoring result of the user's singing. It then inputs the pitch information of a target song into the learning model while changing the pitches indicated by that information in multiple ways.
- based on the output, the processor acquires the scoring result for the user's singing of the target song, and outputs recommended information on the pitch setting recommended to the user based on the scoring results for the multiple variants of the pitch information.
- a learning model for predicting the scoring result is built using, as training data, the per-section scoring results of the user's past singing and the pitch information of each section. The pitch information of the target song is then input to the learning model while the pitches indicated by that information are changed in multiple ways, and the scoring result for the user's singing of the target song is acquired based on the output. Further, recommended information on the pitch setting is output based on the scoring results for the multiple variants of the pitch information. As a result, predicted scoring results for various pitch settings of the target song can be obtained from the user's past scoring tendencies for pitch patterns. In addition, by outputting recommended information on the pitch setting using these predicted values, recommended information on settings suited to singing can be provided for a wide variety of songs.
- FIG. 1 is a system configuration diagram showing the configuration of the karaoke system 1 according to the present embodiment.
- the karaoke system 1 is a system having a known function of playing a song designated by a user and a known function of collecting the user's singing voice during playback, evaluating it, and scoring it.
- the karaoke system 1 also has a function of providing the user with recommended information regarding a setting key for the pitch of a song.
- the karaoke system 1 includes a karaoke device 2, a front server 3, a data management device 4, and a recommended information providing device 5.
- the front server 3, the data management device 4, and the recommended information providing device 5 are configured to send and receive data to and from each other via a communication network such as a LAN (Local Area Network), a WAN (Wide Area Network), or a mobile communication network.
- the karaoke device 2 provides a music reproduction function and a sound collection function of a user's singing voice.
- the front server 3 is electrically connected to the karaoke device 2, has a playback function for providing playback data for playing a song designated by the user to the karaoke device 2, and a music search function according to the user's operation. It also has a scoring function for receiving singing voice data collected by the karaoke device 2 according to the reproduction of a musical piece and calculating the scoring result of the singing voice.
- the front server 3 has a function of providing the reproduction data in which the pitch of the music is uniformly changed according to the setting key set in advance by the user.
- the front server 3 also has a function of storing the scoring result of the singing voice by the user as history information in the data management device 4 each time.
- the front server 3 provides a user interface for accepting user operations and displaying information to the user, and includes a terminal device connected to the front server 3 by wire or wirelessly.
- the data management device 4 is a data storage device (database device) that stores data processed by the front server 3 and the recommended information providing device 5.
- the data management device 4 includes a history information storage unit 101 that stores history information recording the scoring results of the user's past singing, and a music information storage unit 102 that stores pitch information about songs that can be played by the karaoke device 2.
- Various information stored in the data management device 4 is updated at any time by the processing of the front server 3 or the data acquired from the outside.
- FIG. 2 shows an example of the data structure of the history information stored in the data management device 4.
- FIGS. 3 and 4 show examples of the data structure of the music information stored in the data management device 4.
- the history information stores, in association with each other, a "user identifier" that identifies a user, a "song identifier" that identifies a song the user has sung in the past using the karaoke system 1, a "singing time" indicating when the song was sung, a "total score" indicating the scoring result calculated by the front server 3 for the singing of the entire song, and the scoring result calculated by the front server 3 for each section of the song.
- each song is divided in time into a predetermined number of sections (for example, 24), a scoring result is calculated for each section, and the scoring result for the entire song, the "total score", is calculated from the scoring results of all the sections.
- the scoring result of each section and the scoring result of the whole section calculated by the front server 3 are recorded for the singing of each user's music.
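The division of a song into a predetermined number of temporal sections can be sketched as follows. This is a minimal illustration only: the source does not say that the sections have equal length, so equal-length sections are an assumption here, and `split_into_sections` is a hypothetical helper name.

```python
from typing import List, Tuple


def split_into_sections(duration_ms: int, n_sections: int = 24) -> List[Tuple[int, int]]:
    """Partition [0, duration_ms) into n_sections contiguous (start_ms, end_ms) pairs.

    Assumption: sections are of (nearly) equal length; the source only
    states that the song is divided into a predetermined number of sections.
    """
    if n_sections <= 0:
        raise ValueError("n_sections must be positive")
    # Integer boundaries so that adjacent sections share an edge exactly.
    bounds = [duration_ms * i // n_sections for i in range(n_sections + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


# A hypothetical 4-minute song split into the example count of 24 sections.
sections = split_into_sections(240_000)
```

Each pair plays the role of the "section start time (ms)" and "section end time (ms)" stored in the section information.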
- FIG. 3 shows an example of the data structure of the pitch information in the music information.
- the pitch information stores, in association with each other, a "musical piece identifier" that identifies a song that can be played using the karaoke system 1, a "note start time (ms)" indicating the start time, within the whole song, of each sound (note) that makes up the song, a "note end time (ms)" indicating the end time of that sound within the whole song, a "pitch" indicating the standard pitch of the sound numerically, and a "strength" indicating the strength of the sound numerically.
- the data management device 4 stores, for each song that can be played by the front server 3, pitch information on all the standard sounds (the sounds before any change by the setting key) that make up the song, arranged in chronological order.
- FIG. 4 shows an example of the data structure of the section information in the music information.
- the section information stores, in association with each other, a "musical piece identifier" that identifies a song that can be played using the karaoke system 1, a "section start time (ms)" indicating the start time of the section within the whole song, and a "section end time (ms)" indicating the end time of the section within the whole song.
- the data management device 4 stores section information regarding all sections constituting each musical piece that can be played by the front server 3.
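The pitch information (FIG. 3) and section information (FIG. 4) could be modeled as below. This is a hedged sketch: the class and field names are hypothetical, and the rule that a note belongs to the section containing its start time is an assumption, since the source does not state how notes are assigned to sections.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Note:
    """One row of pitch information (field names are illustrative)."""
    song_id: str
    start_ms: int
    end_ms: int
    pitch: int      # standard pitch, before any setting-key change
    strength: int


@dataclass(frozen=True)
class Section:
    """One row of section information (field names are illustrative)."""
    song_id: str
    start_ms: int
    end_ms: int


def section_index(note: Note, sections: List[Section]) -> Optional[int]:
    """Return the index of the section containing the note's start time.

    Assumption: membership is decided by the note start time alone.
    """
    for i, sec in enumerate(sections):
        if sec.song_id == note.song_id and sec.start_ms <= note.start_ms < sec.end_ms:
            return i
    return None


song_sections = [Section("M1", 0, 10_000), Section("M1", 10_000, 20_000)]
example_note = Note("M1", 10_500, 11_000, pitch=60, strength=80)
```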
- the recommended information providing device 5 is a device that provides the user with recommended information regarding setting keys, and includes a data acquisition unit 201, a model construction unit 202, a prediction unit 203, and a recommended information generation unit 204 as functional components. The functions of each component are described below.
- the data acquisition unit 201 acquires history information and music information from the data management device 4 prior to the process of constructing a learning model for predicting the scoring result. In addition, the data acquisition unit 201 also acquires music information prior to the scoring result prediction process. The data acquisition unit 201 passes each acquired information to the model construction unit 202 or the prediction unit 203.
- during the learning model construction process, the data acquisition unit 201 combines the information read from the history information storage unit 101 and the music information storage unit 102 of the data management device 4 and generates history information of the scoring results for each sound in each section of the songs sung by the user in the past.
- FIG. 5 shows an example of the data structure of the history information generated by the data acquisition unit 201.
- the history information includes a "user identifier" that identifies the user, a "music identifier" that identifies a song the user has sung in the past, and a "section" that identifies a section of the song, together with the "pitch information" of each sound in that section and the scoring result of the section.
- the data acquisition unit 201 generates history information about all the sounds constituting each song sung by the user in the past. In the "pitch information" included in the history information, if the setting key has been changed from the standard key during the user's past singing, a numerical value corresponding to the changed pitch is recorded.
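A rough sketch of how the per-sound history rows might be assembled is shown below. The row layout, the function name `build_history_rows`, and the treatment of the setting key as an integer added to the standard pitch value are assumptions made for illustration; the source only states that a value corresponding to the changed pitch is recorded.

```python
from typing import Dict, List, Tuple


def build_history_rows(
    user_id: str,
    song_id: str,
    key_shift: int,
    notes: List[Tuple[int, int, int]],
    section_scores: Dict[int, float],
) -> List[dict]:
    """Join notes with section scores into one row per sound.

    notes: (section_index, standard_pitch, strength) per sound, in time order.
    section_scores: section_index -> scoring result of that section.
    Assumption: a setting key of +k raises every pitch value by k.
    """
    rows = []
    for section, pitch, strength in notes:
        rows.append({
            "user": user_id,
            "song": song_id,
            "section": section,
            "pitch": pitch + key_shift,   # pitch as actually sung
            "strength": strength,
            "score": section_scores[section],
        })
    return rows


# Hypothetical data: three notes in two sections, sung with setting key +2.
history_rows = build_history_rows(
    "A1", "M1", 2,
    notes=[(0, 60, 80), (0, 62, 70), (1, 64, 90)],
    section_scores={0: 85.0, 1: 72.5},
)
```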
- the data acquisition unit 201 acquires music information related to the music to be predicted from the data management device 4 at the time of predicting the scoring result.
- the data acquisition unit 201 delivers the acquired music information to the prediction unit 203.
- the model building unit 202 uses the history information generated by the data acquisition unit 201 as training data to build a machine-learning model that predicts the scoring result of the user's singing of a target song from the pitch information of that song. Before building the learning model, the model construction unit 202 preprocesses the history information handed over from the data acquisition unit 201. Specifically, the model building unit 202 converts the history information into a one-dimensional vector (sound vector) in which the pitch and strength information of each sound making up the songs sung by the user in the past is arranged.
- the model building unit 202 also converts the history information into a one-dimensional vector (score vector), corresponding element by element to the sound vector, in which the scoring results of the sections containing each sound are arranged, and into a one-dimensional vector (user identification vector), likewise corresponding to the sound vector, in which the identification information of the singing user is arranged.
- FIG. 6 shows an example of the data structure of the one-dimensional vector generated by the preprocessing of the model construction unit 202.
- the model construction unit 202 converts the history information into the sound vector V1, the score vector V2, and the user identification vector V3.
- the model building unit 202 inputs the sound vector V1 and the user identification vector V3 to the learning model and optimizes the parameters of the learning model so that its output approaches the scores indicated by the score vector V2 (that is, it trains the learning model).
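The preprocessing into the three aligned vectors can be illustrated with the sketch below. The row format and the choice to represent each element of the sound vector as a (pitch, strength) pair are assumptions; the source describes the vectors only at the level of what information they arrange.

```python
from typing import List, Tuple


def to_vectors(rows: List[dict]) -> Tuple[list, list, list]:
    """Convert per-sound history rows into V1, V2, V3 of equal length.

    V1: sound vector, one (pitch, strength) pair per sound (assumed layout).
    V2: score vector, the score of the section containing each sound.
    V3: user identification vector, the singer of each sound.
    """
    v1 = [(r["pitch"], r["strength"]) for r in rows]
    v2 = [r["score"] for r in rows]
    v3 = [r["user"] for r in rows]
    return v1, v2, v3


# Hypothetical rows; the keys are illustrative names, not from the source.
sample_rows = [
    {"user": "A1", "pitch": 62, "strength": 80, "score": 85.0},
    {"user": "A1", "pitch": 64, "strength": 70, "score": 85.0},
    {"user": "B2", "pitch": 59, "strength": 90, "score": 64.0},
]
V1, V2, V3 = to_vectors(sample_rows)
```

Element i of each vector refers to the same sound, which is what lets the model pair an input sound with its target score and its singer.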
- the model building unit 202 uses a learning model for deep learning as a learning model.
- FIG. 7 shows the configuration of the learning model M used by the model construction unit 202.
- the learning model M is composed of a one-hot encoding unit M1, a GRU unit M2, a coupling unit M3, and a dense unit M4.
- the one-hot encoding unit M1 receives the user identification vector V3 and converts the user identification vector V3 into a two-dimensional vector.
- FIG. 8 shows an example of the data structure of the two-dimensional vector converted by the one-hot encoding unit M1.
- in this two-dimensional vector, each row corresponds to the sound indicated by each element of the sound vector V1, and each column corresponds to a user indicated by the elements of the user identification vector V3.
- for example, when the "user identifier" of an element of the user identification vector V3 is "A1", the value in the column corresponding to "user identifier A1" is set to "1" in the row corresponding to that element, and the values in the columns corresponding to the other user identifiers are set to "0".
- the one-hot encoding unit M1 generates a two-dimensional vector for each row corresponding to all the elements included in the user identification vector V3.
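The one-hot conversion described above can be sketched in a few lines. The function name and the explicit `users` list that fixes the column order are assumptions for illustration.

```python
from typing import List


def one_hot_users(v3: List[str], users: List[str]) -> List[List[int]]:
    """Turn the user identification vector into a rows-by-users 0/1 matrix.

    Each row corresponds to one element of V3 (one sound); each column
    corresponds to one known user, in the order given by `users`.
    """
    column = {user: i for i, user in enumerate(users)}
    matrix = []
    for user in v3:
        row = [0] * len(users)
        row[column[user]] = 1   # exactly one "1" per row
        matrix.append(row)
    return matrix


encoded = one_hot_users(["A1", "A1", "B2"], users=["A1", "B2", "C3"])
```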
- the GRU unit M2 is a kind of recurrent neural network (RNN) that outputs a state in addition to its normal output. At each step it receives the sound vector V1 as its normal input, and the state it output immediately before is fed back in as an additional input.
- the GRU unit M2 has a function of storing past input information and can process long-term time-series information.
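To make the state feedback concrete, the following is a minimal scalar GRU step using the standard gated recurrent unit equations. It is not the patent's implementation: the scalar input and state, the parameter names, and the parameter values are all assumptions, and libraries differ in which side of the update gate weights the previous state.

```python
import math
from typing import Dict, List, Tuple


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


def gru_step(x: float, h: float, p: Dict[str, float]) -> float:
    """One GRU step with scalar input x and scalar state h."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])                 # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])                 # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])    # candidate state
    return (1.0 - z) * h + z * h_cand                                # blend old and new


def run_gru(xs: List[float], p: Dict[str, float], h0: float = 0.0) -> Tuple[List[float], float]:
    """Feed a sequence through the cell, re-inputting the previous state."""
    h = h0
    outputs = []
    for x in xs:
        h = gru_step(x, h, p)
        outputs.append(h)
    return outputs, h


# Hypothetical parameters; in the described system they would be learned.
params = {"wz": 0.5, "uz": 0.1, "bz": 0.0,
          "wr": 0.3, "ur": 0.2, "br": 0.0,
          "wh": 0.8, "uh": 0.4, "bh": 0.0}
outs, final_state = run_gru([0.60, 0.62, 0.64], params)
```

Because each step depends on the previous state, the output for a given sound reflects the pitch pattern that preceded it, which is the property the text relies on for time-series input.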
- the coupling unit M3 combines the output of the one-hot encoding unit M1 and the output of the GRU unit M2.
- the dense unit M4 is a fully connected layer in deep learning: it multiplies the numerical sequence of a given number of dimensions output from the coupling unit M3 by a weight (w) and adds a bias (b), converting it into an output (Y) of an arbitrary number of dimensions.
- the dense unit M4 outputs a one-dimensional output vector Y in which the scoring results (scores) of each section of the song are arranged.
- FIG. 9 shows an example of the data structure of the output vector converted by the dense unit M4. As described above, in the output vector (Y), each element shows the predicted value of the scoring result of each section composed of the sounds corresponding to the elements of the input sound vector V1.
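The coupling and dense stages amount to a concatenation followed by an affine map. The sketch below shows this for a single sound with made-up numbers; the feature sizes, weights, and bias are hypothetical, and a trained model would learn w and b rather than use fixed values.

```python
from typing import List


def couple(gru_output: List[float], user_one_hot: List[int]) -> List[float]:
    """Coupling unit: concatenate the GRU output with the one-hot user row."""
    return list(gru_output) + [float(v) for v in user_one_hot]


def dense(x: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """Fully connected layer: y_j = sum_i(w_ji * x_i) + b_j."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


# One GRU feature plus a three-user one-hot row gives four inputs.
features = couple([0.2], [1, 0, 0])
# Hypothetical single-output layer producing a score-like value.
y = dense(features, weights=[[10.0, 50.0, 60.0, 40.0]], bias=[5.0])
```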
- using the learning model M configured as above, the model building unit 202 inputs the user identification vector V3 and the sound vector V1 to the learning model M and trains it so that the resulting output vector (Y) approaches the score of each section indicated by the score vector V2. As a result of the training, for example, the weight (w) and bias (b) parameters in the dense unit M4 of the learning model M are optimized.
- the prediction unit 203 uses the learning model M built by the model construction unit 202 to obtain, from the music information of the target song, predicted values of the scoring result of each section for the user's singing of the target song. Specifically, the prediction unit 203 applies the same preprocessing as the model construction unit 202 to the music information and generates the sound vector V1 and the user identification vector V3 for the target song. It then inputs these into the learning model M and obtains the predicted scoring result of each section of the target song from the output vector (Y).
- the prediction unit 203 changes the pitch values of each section in the music information of the target song in multiple ways and, for each variant, uses the learning model M to obtain predicted values of the scoring result of each section.
- the prediction unit 203 uniformly increases or decreases the pitch information of all sections in the music information of the target song from the standard pitch by a predetermined amount corresponding to the numerical value of a setting key that can be set in the front server 3. For example, the pitch values of all sections are increased by 1 for the setting key "+1" and by 2 for the setting key "+2".
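The uniform shift by setting key can be sketched as follows. The function names and the candidate key range are illustrative assumptions; the one-to-one mapping between a key step and a pitch value step follows the "+1" and "+2" example in the text.

```python
from typing import Dict, Iterable, List


def shift_pitches(pitches: List[int], key: int) -> List[int]:
    """Apply one setting key uniformly to every pitch value of the song."""
    return [p + key for p in pitches]


def key_variants(pitches: List[int], keys: Iterable[int]) -> Dict[int, List[int]]:
    """Build one shifted copy of the song's pitch sequence per candidate key."""
    return {k: shift_pitches(pitches, k) for k in keys}


standard = [60, 62, 64]                       # hypothetical standard pitches
variants = key_variants(standard, keys=[-1, 0, 1, 2])
```

Each entry of `variants` would then be preprocessed and passed to the learning model to obtain per-section predictions for that key.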
- the recommended information generation unit 204 repeatedly acquires from the prediction unit 203 the predicted scoring results of each section for the song whose pitch information has been changed in multiple ways, and calculates the predicted scoring result of the whole song for each variant. For example, the predicted value for the whole song is calculated as the average of the predicted scoring results of all sections. The recommended information generation unit 204 then selects the pitch setting (setting key) recommended to the user based on these predicted values, and outputs recommended information indicating the selected setting key together with the predicted scoring result corresponding to that setting key.
- as the setting key recommended to the user, the recommended information generation unit 204 selects, for example, the setting key of the variant with a relatively high predicted scoring result, or the setting keys of variants whose predicted scoring result exceeds a preset threshold.
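The averaging and selection step might look like the sketch below. The function names, the `top_n` parameter, and the sample predictions are hypothetical; the two selection rules mirror the two examples given in the text (relatively high score, or score above a preset threshold).

```python
from typing import Dict, List, Optional, Tuple


def overall_score(section_scores: List[float]) -> float:
    """Whole-song prediction as the average of the per-section predictions."""
    return sum(section_scores) / len(section_scores)


def recommend_keys(
    predictions: Dict[int, List[float]],
    threshold: Optional[float] = None,
    top_n: int = 1,
) -> List[Tuple[int, float]]:
    """Return (setting key, estimated score) pairs, best first.

    With a threshold, keep every key whose estimate exceeds it;
    otherwise keep the top_n keys by estimated score.
    """
    totals = {key: overall_score(scores) for key, scores in predictions.items()}
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        return [(key, score) for key, score in ranked if score > threshold]
    return ranked[:top_n]


# Hypothetical per-section predictions for three candidate setting keys.
predicted = {-1: [70.0, 80.0], 0: [80.0, 90.0], 1: [60.0, 70.0]}
best = recommend_keys(predicted)
above = recommend_keys(predicted, threshold=70.0)
```

The returned pairs correspond to the "key setting content" and "estimated score" items of the output records.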
- the recommended information and the predicted value information output by the recommended information generation unit 204 are output to the terminal device or the like of the front server 3.
- FIG. 10 is a flowchart showing the procedure of the learning model construction process by the recommended information providing device 5
- FIG. 11 is a flowchart showing the procedure of the recommended process regarding the setting key by the recommended information providing device 5.
- the learning model construction process is started at a preset timing (for example, periodic timing), or at a timing when a certain amount of historical information is accumulated in the data management device 4.
- the recommended processing related to the setting key is started at a preset timing, a timing at which an instruction is received from the user on the front server 3, or the like.
- the data acquisition unit 201 acquires history information regarding the scoring results of the user's past singing from the data management device 4 (step S101). Further, the data acquisition unit 201 acquires music information related to the songs recorded in the history information from the data management device 4 (step S102).
- preprocessing is executed by the model building unit 202, and the sound vector V1, the score vector V2, and the user identification vector V3 are generated based on the history information and the music information (step S103).
- the model building unit 202 then trains the learning model M using the sound vector V1, the score vector V2, and the user identification vector V3, thereby optimizing the parameters of the learning model M (construction of the learning model, step S104), and the learning model construction process is completed.
- the data acquisition unit 201 acquires music information related to the target song from the data management device 4 (step S201). The prediction unit 203 then executes preprocessing: it generates a sound vector V1 from the music information whose pitch information has been changed in multiple ways, and generates a user identification vector V3, identifying the user whose scoring result is to be predicted, with elements corresponding to those of the sound vector V1 (step S202).
- the prediction unit 203 inputs the sound vector V1 and the user identification vector V3 to the learning model M and, based on the output vector of the learning model M, acquires predicted values of the scoring result for each section of the song under each of the multiple setting keys (step S203).
- the recommended information generation unit 204 calculates the predicted value of the overall scoring result of the song for each setting key based on the predicted per-section scoring results for that setting key (step S204).
- the recommended information generation unit 204 selects a setting key recommended to the user based on the predicted overall scoring results for the multiple setting keys, and generates and outputs the recommended information for the user (step S205).
- FIG. 12 shows an example of the data structure of the recommended information output by the recommended information providing device 5.
- a plurality of records in which the item of "key setting content” indicating the type of the setting key and the item of "estimated score” indicating the predicted value of the overall scoring result are associated with each other are output.
- the recommended setting key is indicated by the "key setting content” corresponding to the "estimated score” indicating a relatively high numerical value.
- according to the embodiment described above, a learning model M for predicting the scoring result is built using, as training data, the per-section scoring results of the user's past singing and the pitch information of each section. The pitch information of the target song is then input to the learning model M while the pitches indicated by that information are changed in multiple ways, and the scoring result for the user's singing of the target song is obtained based on the output. Further, recommended information on the pitch setting is output based on the scoring results for the multiple variants of the pitch information.
- the learning model M takes time-series pitch information as input and outputs the scoring result for each section of the song corresponding to that pitch information, and it is built so that its output approaches each scoring result included in the training data.
- this makes it possible to build a learning model M that captures the tendency of the scoring results for the pitch pattern of each section of a song, which improves the prediction accuracy of the scoring result for the user's singing of the target song. As a result, recommended information suited to the user's singing of the target song can be provided.
- the learning model M for further inputting the user's identification information is used.
- the learning model M grasps the tendency of the scoring result for the pitch pattern of each user, and it is possible to surely improve the prediction accuracy of the scoring result for each user.
- the scoring results related to the singing of the target song by the user are obtained by averaging the scoring results for each section of the target song, which is the output of the learning model M. By doing so, it is possible to easily determine the strengths and weaknesses of the user regarding the singing of the target song.
- the pitches indicated by the pitch information in all sections of the target song are uniformly changed by a predetermined numerical value, the changed pitch information is input to the learning model M, and the scoring result for the user's singing of the target song is acquired based on the output of the learning model M.
- each functional block may be realized using one physically or logically coupled device, or using two or more physically or logically separate devices connected directly or indirectly (for example, by wire or wirelessly).
- the functional block may be realized by combining the software with the one device or the plurality of devices.
- Functions include, but are not limited to, judging, deciding, calculating, computing, processing, deriving, investigating, searching, confirming, receiving, transmitting, outputting, accessing, resolving, selecting, establishing, comparing, assuming, expecting, regarding, broadcasting, notifying, communicating, forwarding, configuring, reconfiguring, allocating, mapping, and assigning.
- a functional block (component) that performs transmission is called a transmitting unit or a transmitter.
- the realization method is not particularly limited.
- the data management device 4 and the recommended information providing device 5 in the embodiment of the present disclosure may function as a computer for processing the present disclosure.
- FIG. 13 is a diagram showing an example of the hardware configuration of the data management device 4 and the recommended information providing device 5 according to the embodiment of the present disclosure.
- the above-mentioned data management device 4 and recommended information providing device 5 may be physically configured as a computer device including a processor 1001, a memory 1002, a storage 1003, a communication device 1004, an input device 1005, an output device 1006, a bus 1007, and the like.
- the word “device” can be read as a circuit, device, unit, etc.
- the hardware configuration of the data management device 4 and the recommended information providing device 5 may include one or more of each of the devices shown in the figure, or may omit some of the devices.
- each function of the data management device 4 and the recommended information providing device 5 is realized by loading predetermined software (a program) onto hardware such as the processor 1001 and the memory 1002, so that the processor 1001 performs operations, controls communication by the communication device 1004, and controls at least one of reading and writing of data in the memory 1002 and the storage 1003.
- the processor 1001 operates, for example, an operating system to control the entire computer.
- the processor 1001 may be configured by a central processing unit (CPU: Central Processing Unit) including an interface with peripheral devices, a control device, an arithmetic unit, a register, and the like.
- The above-mentioned data acquisition unit 201, model construction unit 202, prediction unit 203, recommended information generation unit 204, and the like may be realized by the processor 1001.
- The processor 1001 reads a program (program code), software modules, data, and the like from at least one of the storage 1003 and the communication device 1004 into the memory 1002, and executes various processes in accordance with them.
- As the program, a program that causes a computer to execute at least a part of the operations described in the above embodiment is used.
- The data acquisition unit 201, the model construction unit 202, the prediction unit 203, and the recommended information generation unit 204 may be realized by a control program that is stored in the memory 1002 and runs on the processor 1001, and the other functional blocks may be realized in the same way.
- The processor 1001 may be implemented by one or more chips.
- The program may be transmitted from a network via a telecommunication line.
- The memory 1002 is a computer-readable recording medium and may be composed of at least one of, for example, a ROM (Read Only Memory), an EPROM (Erasable Programmable ROM), an EEPROM (Electrically Erasable Programmable ROM), and a RAM (Random Access Memory).
- The memory 1002 may be referred to as a register, a cache, a main memory (main storage device), or the like.
- The memory 1002 can store a program (program code), software modules, and the like that are executable to carry out the construction process and the recommended process according to the embodiment of the present disclosure.
- The storage 1003 is a computer-readable recording medium and may be composed of at least one of, for example, an optical disk such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disk, a magneto-optical disk (for example, a compact disk, a digital versatile disk, or a Blu-ray (registered trademark) disk), a smart card, a flash memory (for example, a card, a stick, or a key drive), a floppy (registered trademark) disk, a magnetic strip, and the like.
- The storage 1003 may be referred to as an auxiliary storage device.
- The storage medium described above may be, for example, a database, a server, or another suitable medium that includes at least one of the memory 1002 and the storage 1003.
- The communication device 1004 is hardware (a transmission / reception device) for communicating between computers via at least one of a wired network and a wireless network, and is also referred to as, for example, a network device, a network controller, a network card, or a communication module.
- The communication device 1004 may include, for example, a high-frequency switch, a duplexer, a filter, a frequency synthesizer, and the like in order to realize at least one of frequency division duplex (FDD: Frequency Division Duplex) and time division duplex (TDD: Time Division Duplex).
- The data acquisition unit 201 may be implemented with its transmission unit and reception unit physically or logically separated from each other.
- The input device 1005 is an input device (for example, a keyboard, a mouse, a microphone, a switch, a button, or a sensor) that accepts input from the outside.
- The output device 1006 is an output device (for example, a display, a speaker, or an LED lamp) that performs output to the outside.
- The above-mentioned recommended information generation unit 204 and the like may be realized by the output device 1006.
- The input device 1005 and the output device 1006 may have an integrated configuration (for example, a touch panel).
- The devices such as the processor 1001 and the memory 1002 are connected by the bus 1007 for communicating information.
- The bus 1007 may be configured using a single bus, or may be configured using a different bus for each pair of devices.
- The data management device 4 and the recommended information providing device 5 may be configured to include hardware such as a microprocessor, a digital signal processor (DSP: Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), and an FPGA (Field Programmable Gate Array), and a part or all of each functional block may be realized by that hardware. For example, the processor 1001 may be implemented using at least one of these pieces of hardware.
- Information notification may be carried out by physical layer signaling (for example, DCI (Downlink Control Information) or UCI (Uplink Control Information)), higher layer signaling (for example, RRC (Radio Resource Control) signaling, MAC (Medium Access Control) signaling, or broadcast information (MIB (Master Information Block), SIB (System Information Block))), other signals, or a combination thereof.
- RRC signaling may be referred to as an RRC message, and may be, for example, an RRC Connection Setup message or an RRC Connection Reconfiguration message.
- Each aspect / embodiment described in the present disclosure may be applied to at least one of LTE (Long Term Evolution), LTE-A (LTE-Advanced), SUPER 3G, IMT-Advanced, 4G (4th generation mobile communication system), 5G (5th generation mobile communication system), FRA (Future Radio Access), NR (New Radio), W-CDMA (registered trademark), GSM (registered trademark), CDMA2000, UMB (Ultra Mobile Broadband), IEEE 802.11 (Wi-Fi (registered trademark)), IEEE 802.16 (WiMAX (registered trademark)), IEEE 802.20, UWB (Ultra-WideBand), Bluetooth (registered trademark), other appropriate systems, and next-generation systems extended on the basis of these. Further, a plurality of systems may be applied in combination (for example, a combination of at least one of LTE and LTE-A with 5G).
- Information and the like can be output from a higher layer (or a lower layer) to a lower layer (or a higher layer). Input and output may be performed via a plurality of network nodes.
- The input / output information and the like may be stored in a specific location (for example, a memory) or may be managed using a management table. Information to be input / output may be overwritten, updated, or added. The output information and the like may be deleted. The input information and the like may be transmitted to another device.
- The determination may be made by a value represented by 1 bit (0 or 1), by a true / false value (Boolean: true or false), or by a comparison of numerical values (for example, a comparison with a predetermined value).
- The notification of predetermined information (for example, a notification of “being X”) is not limited to explicit notification and may be performed implicitly (for example, by not performing the notification of the predetermined information).
- Software, whether called software, firmware, middleware, microcode, hardware description language, or by another name, should be interpreted broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, execution threads, procedures, functions, and the like.
- Software, instructions, information, and the like may be transmitted and received via a transmission medium.
- The software may be transmitted from a website or the like using at least one of wired technology (coaxial cable, optical fiber cable, twisted pair, digital subscriber line (DSL: Digital Subscriber Line), etc.) and wireless technology (infrared, microwave, etc.).
- The information, signals, and the like described in this disclosure may be represented using any of a variety of different techniques.
- The data, instructions, commands, information, signals, bits, symbols, chips, and the like that may be referred to throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination of these.
- A channel and a symbol may each be a signal (signaling).
- The signal may be a message.
- the component carrier (CC: Component Carrier)
- The terms “system” and “network” used in this disclosure are used interchangeably.
- The information, parameters, and the like described in the present disclosure may be expressed using absolute values, relative values from a predetermined value, or other corresponding information.
- The radio resource may be one indicated by an index.
- The terms “judgment” and “decision” used in this disclosure may encompass a wide variety of actions.
- “Judgment” and “decision” may include, for example, regarding judging, calculating, computing, processing, deriving, investigating, looking up (search, inquiry; for example, searching in a table, a database, or another data structure), or ascertaining as having made a “judgment” or “decision”.
- “Judgment” and “decision” may also include regarding receiving (for example, receiving information), transmitting (for example, transmitting information), input, output, or accessing (for example, accessing data in a memory) as having made a “judgment” or “decision”.
- “Judgment” and “decision” may further include regarding resolving, selecting, choosing, establishing, comparing, or the like as having made a “judgment” or “decision”. That is, “judgment” and “decision” may include regarding some action as having made a “judgment” or “decision”. Further, “judgment (decision)” may be read as “assuming”, “expecting”, “considering”, and the like.
- The terms “connected” and “coupled”, and any variations thereof, mean any direct or indirect connection or coupling between two or more elements, and can include the presence of one or more intermediate elements between two elements that are “connected” or “coupled” to each other.
- The connection or coupling between the elements may be physical, logical, or a combination thereof.
- For example, “connection” may be read as “access”.
- Two elements can be considered to be “connected” or “coupled” to each other using at least one of one or more wires, cables, and printed electrical connections and, as some non-limiting and non-exhaustive examples, using electromagnetic energy having wavelengths in the radio frequency region, the microwave region, and the light (both visible and invisible) region.
- The phrase “A and B are different” may mean that A and B are different from each other.
- The phrase may also mean that A and B are each different from C.
- Terms such as “separated” and “coupled” may be interpreted in the same way as “different”.
- One embodiment of the present invention uses a recommended information providing device that provides recommended information, and makes it possible to provide recommended information regarding settings suitable for singing for a wide variety of music.
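
The description above states that a determination may be made by a 1-bit value, by a true / false value, or by comparing numerical values with a predetermined value. The following is a minimal sketch of those three forms, not the implementation of the disclosed devices. The names `determine` and `PREDETERMINED_VALUE`, the threshold of 0.5, and the "greater than or equal to" direction of the comparison are all assumptions made for illustration; the source specifies none of them.

```python
from typing import Union

# Hypothetical threshold: the source only says "a predetermined value".
PREDETERMINED_VALUE = 0.5


def determine(value: Union[bool, int, float],
              threshold: float = PREDETERMINED_VALUE) -> bool:
    """Return the result of a determination given in one of three forms."""
    # True/false value. Checked first because bool is a subclass of int.
    if isinstance(value, bool):
        return value
    # 1-bit value: only 0 or 1 is accepted.
    if isinstance(value, int):
        if value not in (0, 1):
            raise ValueError("a 1-bit value must be 0 or 1")
        return value == 1
    # Numerical comparison with the predetermined value
    # (the >= direction is an assumption, not taken from the source).
    return value >= threshold
```

For example, `determine(True)`, `determine(1)`, and `determine(0.7)` all yield `True`, while `determine(0.2)` yields `False` under the assumed threshold.
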
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/000,964 US20230215406A1 (en) | 2020-06-09 | 2021-05-28 | Recommendation information provision device |
| JP2022530471A JP7714543B2 (ja) | 2020-06-09 | 2021-05-28 | Recommended information providing device |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2020100169 | 2020-06-09 | ||
| JP2020-100169 | 2020-06-09 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021251188A1 (ja) | 2021-12-16 |
Family
ID=78845634
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2021/020516 Ceased WO2021251188A1 (ja) | 2020-06-09 | 2021-05-28 | Recommended information providing device |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US20230215406A1 |
| JP (1) | JP7714543B2 |
| WO (1) | WO2021251188A1 |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116861208A (zh) * | 2023-07-04 | 2023-10-10 | 腾讯音乐娱乐科技(深圳)有限公司 | Training method and apparatus for music coarse-ranking model, computer device, and storage medium |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7682175B2 (ja) * | 2020-06-09 | 2025-05-23 | 株式会社Nttドコモ | Prediction device |
| JP7784420B2 (ja) * | 2021-04-27 | 2025-12-11 | 株式会社Nttドコモ | Feature quantity output model generation system |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2007010922A (ja) * | 2005-06-29 | 2007-01-18 | Daiichikosho Co Ltd | System for recommending a suitable key for each user and each song |
| JP2011203479A (ja) * | 2010-03-25 | 2011-10-13 | Xing Inc | Karaoke system, karaoke system control method, karaoke system control program, and information recording medium therefor |
| JP2016029429A (ja) * | 2014-07-25 | 2016-03-03 | 株式会社第一興商 | Karaoke device |
| JP2018091982A (ja) * | 2016-12-02 | 2018-06-14 | 株式会社第一興商 | Karaoke system |
| JP2019148767A (ja) * | 2018-02-28 | 2019-09-05 | 株式会社第一興商 | Server device and recommendation system |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
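| KR101602194B1 (ko) * | 2009-02-17 | 2016-03-10 | 고쿠리츠 다이가쿠 호진 교토 다이가쿠 | Music acoustic signal generation system |
| WO2020031544A1 (ja) * | 2018-08-10 | 2020-02-13 | ヤマハ株式会社 | Information processing device for musical score data |
| WO2020100671A1 (ja) * | 2018-11-15 | 2020-05-22 | ソニー株式会社 | Information processing device, information processing method, and program |
| WO2021186928A1 (ja) * | 2020-03-17 | 2021-09-23 | ヤマハ株式会社 | Method, system, and program for inferring evaluation of performance information |
| CN115298733A (zh) * | 2020-03-24 | 2022-11-04 | 雅马哈株式会社 | Method for establishing trained model, estimation method, performance agent recommendation method, performance agent adjustment method, system for establishing trained model, estimation system, program for establishing trained model, and estimation program |
| JP7682175B2 (ja) * | 2020-06-09 | 2025-05-23 | 株式会社Nttドコモ | Prediction device |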
-
2021
- 2021-05-28 JP JP2022530471A patent/JP7714543B2/ja active Active
- 2021-05-28 WO PCT/JP2021/020516 patent/WO2021251188A1/ja not_active Ceased
- 2021-05-28 US US18/000,964 patent/US20230215406A1/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| US20230215406A1 (en) | 2023-07-06 |
| JPWO2021251188A1 | 2021-12-16 |
| JP7714543B2 (ja) | 2025-07-29 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
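| JP7714543B2 (ja) | | Recommended information providing device |
| JP7682175B2 (ja) | | Prediction device |
| JP7542540B2 (ja) | | Demand prediction device |
| JP7438191B2 (ja) | | Information processing device |
| US11868734B2 (en) | | Dialogue system |
| WO2021070819A1 (ja) | | Scoring model learning device, scoring model, and determination device |
| US20210034678A1 (en) | | Dialogue server |
| JP7016405B2 (ja) | | Dialogue server |
| US11663420B2 (en) | | Dialogue system |
| WO2021256278A1 (ja) | | Recommended information providing device |
| JP6775055B2 (ja) | | Risk estimation device |
| JP7548912B2 (ja) | | Re-ranking device |
| JP2021113794A (ja) | | Information providing device |
| JPWO2020054244A1 (ja) | | Dialogue information generation device |
| JP7853220B2 (ja) | | Blood information estimation device |
| JP6705038B1 (ja) | | Action support device |
| US20260038166A1 (en) | | Information processing device |
| JP2020095200A (ja) | | Music analysis system |
| WO2026009291A1 (ja) | | Information processing device and information processing method |
| JP7829110B2 (ja) | | Sentence generation model generation device, sentence generation model generation method, and sentence generation device |
| WO2019211967A1 (ja) | | Dialogue device |
| JP7576178B2 (ja) | | Time-series data processing device |
| WO2025253499A1 (ja) | | Information processing device and information processing method |
| WO2026069461A1 (ja) | | Information processing device and information processing method |
| JP2020064177A (ja) | | Information processing device and program |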
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21822983; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022530471; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 21822983; Country of ref document: EP; Kind code of ref document: A1 |