WO2023170757A1 - Playback control method, information processing method, playback control system, and program - Google Patents

Playback control method, information processing method, playback control system, and program

Info

Publication number
WO2023170757A1
WO2023170757A1 (PCT/JP2022/009776)
Authority
WO
WIPO (PCT)
Prior art keywords
playback
performance
control
information
cost
Prior art date
Application number
PCT/JP2022/009776
Other languages
English (en)
Japanese (ja)
Inventor
Akira Maezawa
Original Assignee
Yamaha Corporation
Priority date
Filing date
Publication date
Application filed by Yamaha Corporation
Priority to PCT/JP2022/009776
Publication of WO2023170757A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10G: REPRESENTATION OF MUSIC; RECORDING MUSIC IN NOTATION FORM; ACCESSORIES FOR MUSIC OR MUSICAL INSTRUMENTS NOT OTHERWISE PROVIDED FOR, e.g. SUPPORTS
    • G10G3/00: Recording music in notation form, e.g. recording the mechanical operation of a musical instrument
    • G10G3/04: Recording music in notation form, e.g. recording the mechanical operation of a musical instrument, using electrical means
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00: Details of electrophonic musical instruments

Description

  • the present disclosure relates to technology for controlling audio or video playback.
  • Non-Patent Document 1 discloses a technique for estimating performance positions and performance speeds by integrating information on performances by a plurality of performers, and controlling playback of music according to the estimation results.
  • Non-Patent Document 2 discloses a configuration in which the reproduction of music is synchronized with the performance of a specific performer selected from a plurality of performers.
  • one aspect of the present disclosure aims to appropriately control the reproduction of a reproduction part according to a performance by a performer.
  • A playback control method generates control information for at least one playback part of a song by model predictive control using a predictive model that predicts, for at least one performer, performance information including a performance position in the song, and controls playback of the playback part of the song using the control information generated for the at least one playback part.
  • A playback control system includes a predictive control section that generates control information for at least one playback part of a song by model predictive control using a predictive model that predicts, for at least one performer, performance information including a performance position in the song, and a playback control section that controls playback of the playback part of the song based on the control information generated for the at least one playback part.
  • A program causes a computer system to function as a predictive control unit that generates control information for at least one playback part of a song by model predictive control using a prediction model that predicts, for at least one performer, performance information including a performance position in the song, and as a playback control unit that controls playback of the playback part of the song using the control information generated for the at least one playback part.
  • An information processing method generates control information for at least one playback part of a music piece by model predictive control using a predictive model that predicts, for at least one performer, performance information including a performance position in the piece; controls the movement of the skeleton and joints represented by motion data according to the control information; places a virtual performer in a virtual space in a posture corresponding to the controlled skeleton and joints; and displays, on a display device, an image of the virtual space captured by a virtual camera whose position and direction are controlled according to the movement of the user's head.
  • FIG. 1 is a block diagram illustrating the configuration of a performance system.
  • FIG. 2 is a block diagram illustrating the functional configuration of a playback control system.
  • FIG. 3 is a block diagram illustrating the configuration of a predictive control unit.
  • FIG. 4 is a schematic diagram of the state cost and the control cost.
  • FIG. 5 is a graph showing the relationship between the weight values and the feedback gain.
  • FIG. 6 is a flowchart of the control processing.
  • Schematic diagrams of the setting screen in the second embodiment are also provided.
  • FIG. 7 is an explanatory diagram of state variables and state costs in the third embodiment.
  • An explanatory diagram of control information and control costs in the third embodiment is also provided.
  • FIG. 1 is a block diagram illustrating the configuration of a performance system 100 according to a first embodiment.
  • A single performer plays a specific part (hereinafter referred to as the "performance part") out of a plurality of parts of a specific song (hereinafter referred to as the "target song").
  • the performance parts are, for example, one or more parts that constitute the melody of the target song.
  • the performance system 100 controls the reproduction of parts other than the performance part (hereinafter referred to as "reproduction part") among the plurality of parts of the target music piece.
  • the reproduction parts are, for example, one or more parts that constitute the accompaniment of the target music piece.
  • the performance system 100 includes a playback control system 10 and a keyboard instrument 20.
  • the reproduction control system 10 and the keyboard instrument 20 are interconnected, for example, by wire or wirelessly.
  • the keyboard instrument 20 is an electronic musical instrument equipped with a plurality of keys corresponding to different pitches.
  • the performer plays the performance part by sequentially operating each key of the keyboard instrument 20.
  • the keyboard instrument 20 reproduces musical tones of pitches played by a player.
  • the keyboard instrument 20 supplies performance data E representing the performance to the reproduction control system 10 in parallel with the reproduction of musical tones according to the performance by the player.
  • the performance data E specifies the pitch and key depression intensity corresponding to the key operated by the player. That is, the performance data E is data representing a time series of notes played by the performer.
  • the performance data E is, for example, event data compliant with the MIDI (Musical Instrument Digital Interface) standard. Note that the instrument played by the player is not limited to the keyboard instrument 20.
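  • As a minimal illustrative sketch (the class and values below are assumptions, not part of the disclosure), the performance data E can be modeled as a time series of MIDI-style note events, each specifying a pitch and a key depression intensity:

```python
from dataclasses import dataclass

@dataclass
class NoteEvent:
    """One MIDI-style note-on event supplied by the keyboard instrument 20."""
    time: float    # time of the key press in seconds
    pitch: int     # MIDI note number (0-127)
    velocity: int  # key depression intensity (1-127)

# A short performance of three notes played in sequence.
performance_data_e = [
    NoteEvent(time=0.00, pitch=60, velocity=90),  # C4
    NoteEvent(time=0.52, pitch=62, velocity=85),  # D4
    NoteEvent(time=1.05, pitch=64, velocity=88),  # E4
]
```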
  • the playback control system 10 includes a control device 11, a storage device 12, a display device 13, an operating device 14, and a sound emitting device 15.
  • The playback control system 10 is realized by a portable information device such as a smartphone or a tablet terminal, or by a portable or stationary information device such as a personal computer. Note that the playback control system 10 may be realized not only as a single device but also as a plurality of devices configured separately from each other. Furthermore, the playback control system 10 may be installed in the keyboard instrument 20. The entire performance system 100 including the playback control system 10 and the keyboard instrument 20 may be interpreted as a "playback control system".
  • The control device 11 is one or more processors that control each element of the playback control system 10. Specifically, the control device 11 is composed of one or more types of processors such as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an SPU (Sound Processing Unit), a DSP (Digital Signal Processor), an FPGA (Field Programmable Gate Array), or an ASIC (Application Specific Integrated Circuit).
  • the storage device 12 is one or more memories that store programs executed by the control device 11 and various data used by the control device 11.
  • A known recording medium such as a semiconductor recording medium or a magnetic recording medium, or a combination of multiple types of recording media, is used as the storage device 12.
  • a portable recording medium that can be attached to and detached from the playback control system 10 or a recording medium that can be accessed by the control device 11 via a communication network (for example, cloud storage) is used as the storage device 12.
  • the storage device 12 stores music data D and audio signals Z.
  • the music data D is data that specifies the time series of a plurality of notes constituting the target music. That is, the music data D is data representing the musical score of the target music.
  • the music data D includes first musical score data D1 and second musical score data D2.
  • the first musical score data D1 specifies the note string of the performance part of the target musical piece.
  • the second musical score data D2 specifies the note string of the reproduction part of the target music piece.
  • the music data D (D1, D2) is, for example, a file in a format compliant with the MIDI (Musical Instrument Digital Interface) standard.
  • The acoustic signal Z is a time-domain signal representing the waveform of the musical tone (i.e., the accompaniment tone) of the reproduction part.
  • the display device 13 displays various images.
  • the display device 13 is configured with a display panel such as a liquid crystal panel or an organic EL (Electroluminescence) panel.
  • the operating device 14 accepts operations by the user.
  • the operating device 14 is a plurality of operating elements operated by a user, or a touch panel configured integrally with the display surface of the display device 13.
  • the user who operates the operating device 14 is, for example, a performer of a performance part or an operator other than the performer.
  • the sound emitting device 15 reproduces sound under the control of the control device 11.
  • the sound emitting device 15 reproduces the musical tone of the reproduction part represented by the acoustic signal Z.
  • the sound emitting device 15 is, for example, a speaker or headphones.
  • A sound emitting device 15 that is separate from the playback control system 10 may be connected to the playback control system 10 by wire or wirelessly. Note that illustration of a D/A converter that converts the audio signal Z from digital to analog and of an amplifier that amplifies the audio signal Z is omitted for convenience.
  • FIG. 2 is a block diagram illustrating the functional configuration of the playback control system 10.
  • The control device 11 executes a program stored in the storage device 12 to realize a plurality of functions (a predictive control unit 30 and a playback control unit 40).
  • the predictive control unit 30 generates control information U[t] using the performance data E and the music data D. Control information U[t] is generated at different times t on the time axis. That is, the predictive control unit 30 generates a time series of control information U[t].
  • the control information U[t] is data in an arbitrary format for controlling the reproduction of the reproduction part.
  • the control information U[t] is a two-dimensional vector including a playback position u1[t] and a playback speed u2[t].
  • The playback position u1[t] is the position (point on the time axis) at which the playback part should be played at time t. Specifically, the playback position u1[t] is a relative position with respect to the playback position (hereinafter referred to as the "reference position") at each time t when the playback part is played back at a predetermined speed (hereinafter referred to as the "reference speed"). That is, the playback position u1[t] is expressed as a difference (amount of change) from the reference position.
  • the playback speed u2[t] is the speed at which the playback part should be played back at time t. Specifically, the playback speed u2[t] is a relative speed with respect to the reference speed. That is, the playback speed u2[t] is expressed as a difference (amount of change) from the reference speed.
  • the playback control unit 40 controls the playback of the musical tone of the playback part according to the control information U[t]. Specifically, the playback control unit 40 controls the playback of the musical tone of the playback part by the sound emitting device 15.
  • the reproduction control unit 40 generates reproduction information P[t] from the control information U[t], and causes the sound emitting device 15 to reproduce the reproduction part according to the reproduction information P[t]. Specifically, the reproduction control unit 40 outputs a sample sequence of a portion of the acoustic signal Z corresponding to the reproduction information P[t] to the sound emitting device 15.
  • the reproduction information P[t] is information representing the actual reproduction of the reproduction part by the sound emitting device 15.
  • the playback information P[t] is a two-dimensional vector including a playback position p1[t] and a playback speed p2[t].
  • the reproduction position p1[t] is the position (point on the time axis) of the reproduction part to be reproduced at time t.
  • the playback position p1[t] is a position based on the starting point of the target music piece.
  • The playback speed p2[t] is the speed at which the playback part is played back at time t.
  • the playback speed p2[t] is a speed with the stop of playback as a reference value (zero).
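  • The relationship between the relative control information U[t] and the absolute playback information P[t] can be sketched as follows; the linear reference trajectory and the function name are assumptions for illustration:

```python
def playback_info(t, u1, u2, reference_speed=1.0):
    """Convert relative control information U[t] = (u1, u2) into
    absolute playback information P[t] = (p1, p2).

    The reference position is the position reached at time t when the
    playback part is played from its start at the constant reference speed.
    """
    reference_position = reference_speed * t
    p1 = reference_position + u1  # playback position: reference + difference
    p2 = reference_speed + u2     # playback speed: reference + difference
    return p1, p2

# U[t] = (0.3, -0.05) nudges playback 0.3 s ahead and 5% slower.
p1, p2 = playback_info(10.0, 0.3, -0.05)
```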
  • The predictive control unit 30 generates the control information U[t] using the performance data E corresponding to the performance by the performer, and the playback control unit 40 controls the playback of the playback part by the sound emitting device 15 according to the control information U[t].
  • the predictive control unit 30 of the first embodiment generates control information U[t] so that the reproduction of the reproduction part by the sound emitting device 15 follows the performance of the performance part by the performer.
  • Model Predictive Control (MPC) is used to generate the control information U[t].
  • FIG. 3 is a block diagram illustrating a specific configuration of the predictive control unit 30.
  • The prediction control section 30 includes a performance prediction section 31, an information generation section 32, an arithmetic processing section 33, and a variable setting section 34.
  • the performance prediction unit 31 predicts the performance information S[t] using the prediction model.
  • Performance information S[t] is predicted for each time t on the time axis. That is, the performance prediction unit 31 generates a time series of performance information S[t].
  • the performance information S[t] is information predicted from the performance of the performance part by the performer (that is, the performance data E). Specifically, the performance information S[t] is a two-dimensional vector including a performance position s1[t] and a performance speed s2[t].
  • the prediction model is a mathematical model for predicting performance information S[t].
  • the performance position s1[t] is the position (point on the time axis) where the performer is predicted to perform at time t in the performance part.
  • the performance position s1[t] is a position based on the starting point of the target music piece.
  • the performance speed s2[t] is the predicted performance speed at time t.
  • the performance speed s2[t] is a speed with the stop of the performance as a reference value (zero).
  • the performance prediction section 31 includes an analysis section 311 and a prediction section 312.
  • the analysis unit 311 estimates the performance time t[k] and the performance position s[k] by analyzing the performance data E (k is a natural number). Each time the performer plays each note of the performance part, the performance time t[k] and the performance position s[k] are estimated.
  • the performance time t[k] is the time when the k-th note among the plurality of notes of the performance part is played.
  • the performance position s[k] is the position of the k-th note among the plurality of notes of the performance part.
  • A known performance analysis technique (score alignment) is used to estimate the performance time t[k] and the performance position s[k].
  • The analysis unit 311 may estimate the performance time t[k] and the performance position s[k] using a statistical estimation model such as a deep neural network (DNN) or a hidden Markov model (HMM).
  • the prediction unit 312 generates performance information S[t] for a time t after (that is, in the future) the performance time t[k].
  • the prediction model is used by the prediction unit 312 to predict the performance information S[t].
  • the prediction model is, for example, a state space model that assumes that the performance by the performer progresses at a constant speed. Specifically, it is assumed that the performance progresses at a constant speed during the intervals between successive notes.
  • The state variable of the state space model at each onset k is expressed by the following Equation (1).
  • The noise term in Equation (1) is a noise component (e.g., white noise).
  • The covariance of the noise component is calculated from the performance tendency of the performer.
  • The probability that the performance position s[k] is observed under the condition of the state variable follows a normal distribution with a predetermined variance.
  • the performance information S[t] can be predicted as shown in Equation (2) below.
  • the prediction unit 312 may calculate the performance information S[t] by calculating the following formulas (3a) and (3b).
  • the symbol dt in Equations (3a) and (3b) is a predetermined time length.
  • The speed term in Equations (3a) and (3b) means the performance speed at the performance position s1[t] of the performance information S[t].
  • This performance speed is calculated in advance using, for example, the speed at which the performer played the performance part in the past.
  • For example, the expected value of the speed at which the performance part of the target music piece was played in the past is calculated as the performance speed.
  • A statistical estimation model such as a deep neural network or a hidden Markov model may instead be trained on the relationship between musical scores played by the performer in the past and the performance speed in those performances.
  • In this case, the prediction unit 312 generates the performance speed by processing the performance data E using the statistical estimation model.
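  • The constant-speed prediction of Equations (3a) and (3b) can be sketched as a step-wise extrapolation from the last estimated onset; the function name tempo, standing in for the pre-learned performance speed, is a hypothetical placeholder:

```python
def predict_performance(s_k, t_k, t, tempo, dt=0.01):
    """Predict performance information S[t] = (s1, s2) for a time t after
    the last estimated onset (position s_k at performance time t_k),
    assuming the performance progresses at the position-dependent
    speed tempo(s1) between successive notes.
    """
    n_steps = int(round((t - t_k) / dt))
    s1 = s_k
    for _ in range(n_steps):
        s1 += tempo(s1) * dt  # advance the position at the local speed
    s2 = tempo(s1)            # predicted performance speed at time t
    return s1, s2
```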
  • the information generation unit 32 in FIG. 3 generates control information U[t] from performance information S[t].
  • The control information U[t] is generated so that the reproduction of the reproduction part by the sound emitting device 15 (playback information P[t]) follows the performance of the performance part by the performer (performance information S[t]).
  • An arithmetic expression (control law) based on LQG (Linear-Quadratic-Gaussian) control is used to generate the control information U[t].
  • the state variable X[t] is a variable that represents the error between the performance information S[t] and the playback information P[t]. That is, the state variable X[t] represents the error between the performance of the performance part by the performer and the reproduction part by the sound emitting device 15.
  • the state variable X[t] in the first embodiment is a two-dimensional vector including a position error x1[t] and a speed error x2[t].
  • control information U[t] includes a playback position u1[t] based on the reference position and a playback speed u2[t] based on the reference speed.
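  • The state variable X[t] can be sketched as a componentwise difference; the sign convention (performance minus playback) is an assumption:

```python
def state_variable(s_info, p_info):
    """State variable X[t] = (x1, x2): the error between performance
    information S[t] = (s1, s2) and playback information P[t] = (p1, p2)."""
    x1 = s_info[0] - p_info[0]  # position error
    x2 = s_info[1] - p_info[1]  # speed error
    return (x1, x2)
```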
  • A state transition expressed by the following Equation (5) can be assumed. Note that the matrix B in Equation (5) is a 2×2 identity matrix.
  • the symbol Q[s1] in Equation (6) is the cost (hereinafter referred to as "state cost") regarding the state variable X[t] at each performance position s1[t] of the target music piece.
  • The state variable X[t] means the error between the performance information S[t] and the playback information P[t]. Therefore, the state cost Q[s1] means the cost for the error between the performance information S[t] and the playback information P[t] at the performance position s1[t] of the target music piece. That is, the state cost Q[s1] is a cost for the failure of the reproduction of the reproduction part to follow the performance of the performance part. As understood from Equation (6), the state cost Q[s1] is a 2×2 square matrix.
  • The control cost R[p1] means the cost for the playback position u1[t] and the playback speed u2[t]. The playback position u1[t] means the amount of change in the playback position p1[t] with respect to the reference position, and the playback speed u2[t] means the amount of change in the playback speed p2[t] with respect to the reference speed.
  • The control cost R[p1] is therefore expressed as a cost related to temporal changes in the playback position p1[t] and the playback speed p2[t] represented by the playback information P[t]. That is, the control cost R[p1] is a cost for a change in the playback information P[t]. As understood from Equation (6), the control cost R[p1] is a 2×2 square matrix.
  • cost (objective function) J includes state variable X[t], control information U[t], state cost Q[s1], and control cost R[p1].
  • The symbol O in formula (7d) is a zero matrix. That is, the matrix Y[t] is a matrix that becomes the zero matrix at the final time of the control period.
  • The symbol L[t] in Equation (7a) is a feedback gain for the state variable X[t], and is expressed by a 2×2 square matrix. As understood from Equation (7a), the control information U[t] is linear feedback with respect to the state variable X[t]. Further, the feedback gain L[t] does not depend on either the control information U[t] or the state variable X[t]. On the other hand, the feedback gain L[t] depends on the state cost Q[s1] and the control cost R[p1].
  • The information generation unit 32 in FIG. 3 calculates the control information U[t] of the playback part by the calculations of formulas (7a) to (7d). That is, the control information U[t] is calculated so that the cost J in Equation (6) is reduced.
  • As described above, the model predictive control by the predictive control unit 30 includes an optimization process for generating control information U[t] that is suitable from the viewpoint of reducing the cost J.
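  • Although the exact formulas (7a) to (7d) are not reproduced here, the same class of computation can be sketched as the standard backward recursion of a finite-horizon linear-quadratic regulator; the matrices A and B, the horizon length, and the constant costs are illustrative assumptions:

```python
import numpy as np

def feedback_gains(A, B, Qs, Rs):
    """Backward Riccati recursion for the finite-horizon problem
    X[t+1] = A X[t] + B U[t] with stage costs X'QX + U'RU.
    Returns the time-varying gains L[t] with U[t] = -L[t] X[t];
    note that L[t] depends on Q and R but not on X[t] or U[t].
    """
    T = len(Qs)
    P = np.zeros_like(np.asarray(A, dtype=float))  # terminal condition: zero matrix
    gains = [None] * T
    for t in reversed(range(T)):
        L = np.linalg.solve(Rs[t] + B.T @ P @ B, B.T @ P @ A)
        gains[t] = L
        P = Qs[t] + A.T @ P @ (A - B @ L)  # Riccati update
    return gains

# Constant-velocity error dynamics over a 20-step horizon.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.eye(2)
gains = feedback_gains(A, B, [np.eye(2)] * 20, [np.eye(2)] * 20)
```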
  • the arithmetic processing unit 33 in FIG. 3 generates a state cost Q[s1] and a control cost R[p1] that are applied to the generation of control information U[t].
  • the generation of the state cost Q[s1] and the control cost R[p1] will be described in detail below.
  • FIG. 4 is a schematic diagram of the state cost Q[s1] and the control cost R[p1].
  • FIG. 4 illustrates the state cost Q[s1] and the control cost R[p1] when the target music piece is expressed by the musical score shown in the figure. For convenience, the numerical value of the element in the first row and first column of each of the state cost Q[s1] and the control cost R[p1] is illustrated.
  • the state cost Q[s1] is expressed by the following formula (8).
  • Equation (8) includes a small value for stabilizing each value of the state cost Q[s1]. The symbol I means a 2×2 identity matrix.
  • Equation (8) also represents a time series (hereinafter referred to as a "pulse train") Hq of a plurality of pulses q corresponding to the different sounding positions s' of the performance part.
  • Each pulse q is centered at a point a small time after the sounding position s'. Note that since this offset is a small numerical value, each pulse q is illustrated in FIG. 4 as substantially coinciding with the sounding position s'.
  • Equation (8) further includes a variable that determines the maximum value of the pulse q corresponding to the sounding position s', and a variable that determines the pulse width of each pulse q.
  • the function value of the pulse train Hq corresponding to the performance position s1[t] corresponds to the state cost Q[s1].
  • the symbol Cq[s1] in Equation (8) is a weight value for weighting the state cost Q[s1]. That is, the larger the weight value Cq[s1], the more the influence of the state cost Q[s1] on the feedback gain L[t] increases.
  • The arithmetic processing unit 33 specifies each sounding position s' by analyzing the first musical score data D1, and calculates the state cost Q[s1] by executing the calculation of formula (8).
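  • One possible realization of the pulse train of Equation (8) is a sum of Gaussian pulses at the sounding positions s' plus a small stabilizing constant; the scalar form and all variable values below are illustrative assumptions:

```python
import math

def state_cost(s1, onsets, eps=1e-3, sigma=0.05, cq=1.0):
    """Scalar sketch of the state cost Q[s1]: a small stabilizer eps
    plus Gaussian pulses q of width sigma centered near each sounding
    position s' of the performance part, scaled by the weight cq."""
    pulse_train = sum(
        math.exp(-((s1 - s_p) ** 2) / (2 * sigma ** 2)) for s_p in onsets
    )
    return eps + cq * pulse_train
```

Near an onset the cost is dominated by the pulse height; far from every onset it falls back to the small stabilizer.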
  • In the vicinity of each sounding position s', the feedback gain L[t] is set so that the performance of the performance part by the performer and the reproduction of the reproduction part by the sound emitting device 15 sufficiently approximate each other. At positions sufficiently distant from each sounding position s', a difference between the performance information S[t] and the playback information P[t] is allowed.
  • the control cost R[p1] is expressed by the following formula (9).
  • Equation (9) includes a small value for stabilizing each value of the control cost R[p1]. The symbol I means a 2×2 identity matrix.
  • The symbol Gr in Equation (9) is a set of positions at which the notes of the playback part are to be sounded. That is, the set Gr includes the position (hereinafter referred to as a "sounding position") p' of the starting point of each note specified by the second musical score data D2 of the music data D.
  • Equation (9) also represents a time series (hereinafter referred to as a "pulse train") Hr of a plurality of pulses r corresponding to the different sounding positions p'.
  • Each pulse r is set to have a shape that gradually increases from a point in time before the sound generation position p' and sharply decreases after the sound generation position p' has passed.
  • One pulse r is represented by a window function, which is expressed, for example, by the following Equation (10).
  • the coefficient c1 and the coefficient c2 in Equation (10) are predetermined positive numbers.
  • the function value of the pulse train Hr corresponding to the reproduction position p1[t] corresponds to the control cost R[p1].
  • the symbol Cr[p1] in Equation (9) is a weight value for weighting the control cost R[p1]. That is, the larger the weight value Cr[p1], the greater the influence of the control cost R[p1] on the feedback gain L[t].
  • The arithmetic processing unit 33 specifies each sounding position p' by analyzing the second musical score data D2, and calculates the control cost R[p1] by executing the calculation of formula (9).
  • The feedback gain L[t] is set so that the control information U[t] is sufficiently reduced as the control cost R[p1] at each position p1[t] increases. In the vicinity of each sounding position p' of the playback part, the playback position p1[t] is required to sufficiently approximate the reference position (the playback position u1[t] is sufficiently reduced), and the playback speed p2[t] is required to sufficiently approximate the reference speed (the playback speed u2[t] is sufficiently reduced).
  • the feedback gain L[t] is set so that the reproduction of the reproduction part by the sound emitting device 15 sufficiently approximates the note sequence represented by the second musical score data D2.
  • changes in the reproduction information P[t] are allowed at reproduction positions p1[t] that are sufficiently distant from each sound generation position p'.
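  • The pulse train of Equation (9) can be sketched similarly; since the exact window function of Equation (10) is not reproduced here, the exponential form below is an assumption that merely matches the described shape, a gradual rise before the onset and a sharp fall once it has passed:

```python
import math

def window(p, c1=5.0, c2=50.0):
    """Sketch of one pulse r: p is the playback position relative to a
    sounding position p' of the playback part. The pulse rises gradually
    before the onset (p <= 0) and falls off sharply after it (p > 0)."""
    return math.exp(c1 * p) if p <= 0 else math.exp(-c2 * p)

def control_cost(p1, onsets, eps=1e-3, cr=1.0):
    """Scalar sketch of the control cost R[p1]: a stabilizer eps plus the
    pulse train over the sounding positions p', scaled by the weight cr."""
    return eps + cr * sum(window(p1 - p_p) for p_p in onsets)
```

Because c2 is much larger than c1, the cost collapses quickly once a note has begun, which is what permits changes to the playback information away from each onset.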
  • The variable setting unit 34 in FIG. 3 sets the variables that are applied to the generation of the control information U[t]. Specifically, the variable setting unit 34 sets each variable included in formula (8) (the stabilizing constant, the pulse offset, the pulse height, the pulse width, and the weight value Cq[s1]) and each variable included in formula (9) (the stabilizing constant, the weight value Cr[p1], and the coefficients c1 and c2). For example, the variable setting unit 34 sets each variable included in formula (8) or formula (9) to a numerical value stored in the storage device 12. As described above, the variable setting unit 34 of the first embodiment sets one or more variables included in the cost J of formula (6). The arithmetic processing unit 33 calculates the state cost Q[s1] and the control cost R[p1] using the variables set by the variable setting unit 34.
  • FIG. 5 is a graph showing the relationship between the weight value Cq[s1], the weight value Cr[p1], and the feedback gain L[t]. Note that in FIG. 5, the numerical value of the element in the first row and first column of the feedback gain L[t] is illustrated for convenience. As understood from Equation (7a), the larger the feedback gain L[t], the more strongly the playback information P[t] of the playback part is corrected. For example, the larger the feedback gain L[t], the more the playback of the playback part is corrected so as to approximate the performance information S[t] of the performance part.
  • The target music piece is divided on the time axis into a first interval, a second interval, and a third interval.
  • The first interval is an interval in which the performance part is sounded and the playback part is kept silent.
  • The second interval is an interval in which the playback part is sounded and the performance part is kept silent.
  • The third interval is an interval in which both the performance part and the playback part are sounded.
  • Graph V1 in FIG. 5 shows the feedback gain L[t] when the weight value Cq[s1] and the weight value Cr[p1] are set to equal values (case 1).
  • In case 1, the feedback gain L[t] is set to a large value near each sounding position s' of the performance part within the first interval. That is, the reproduction of the playback part is strongly corrected so that the error between the performance information S[t] and the playback information P[t] is sufficiently reduced.
  • The feedback gain L[t] is maintained at a sufficiently small value within the second interval. That is, the reproduction of the playback part is hardly corrected in the second interval.
  • Within the third interval, the feedback gain L[t] is maintained at a large value near each sounding position s', although it is not as large as within the first interval. That is, in the third interval, in which both the performance part and the playback part are sounded, the playback of the playback part is strongly corrected, although not as strongly as in the first interval.
  • Graph V2 in FIG. 5 is the feedback gain L[t] when the weighted value Cq[s1] is sufficiently smaller than the weighted value Cr[p1] (case 2). Specifically, the weight value Cq[s1] was set to 0.1, and the weight value Cr[p1] was set to 1.0. In case 2, the feedback gain L[t] is maintained at a small value overall. That is, the correction of the error between the performance information S[t] and the playback information P[t] is suppressed compared to Case 1.
  • Graph V3 in FIG. 5 shows the feedback gain L[t] when the weighted value Cq[s1] is sufficiently larger than the weighted value Cr[p1] (case 3). Specifically, the weight value Cq[s1] is set to 1.0, and the weight value Cr[p1] is set to 0.1. In case 3, the feedback gain L[t] is set to a large value in the vicinity of the sounding position s' of the performance part, regardless of whether or not the playback part is sounding. That is, the playback of the playback part is strongly corrected so that the error between the performance information S[t] and the playback information P[t] is sufficiently reduced, regardless of whether or not the playback part is sounded.
  • the reproduction behavior of the reproduction part with respect to the performance part changes according to the weight value Cq[s1] and the weight value Cr[p1].
  • the relationship between the performance part and the playback part changes depending on the magnitude relationship between the weighted value Cq[s1] and the weighted value Cr[p1].
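  • The dependence of the feedback gain on the two weight values can be reproduced with a minimal scalar optimal-control sketch (a hypothetical one-dimensional system; the patent's formulas (6) to (7d) are matrix-valued): iterating a discrete Riccati recursion shows that the gain grows when the state weight dominates and shrinks when the control weight dominates, consistent with cases 1 to 3 above.

```python
def lqr_gain(q, r, a=1.0, b=1.0, iters=500):
    """Iterate the discrete Riccati recursion for the scalar system
    x[t+1] = a*x[t] + b*u[t] with stage cost q*x^2 + r*u^2, and
    return the resulting feedback gain L (with u = -L*x)."""
    p = q
    for _ in range(iters):
        l = (b * p * a) / (r + b * p * b)
        p = q + a * p * (a - b * l)
    return (b * p * a) / (r + b * p * b)

# Case 1: equal weights; case 2: state weight << control weight;
# case 3: state weight >> control weight.
g1 = lqr_gain(1.0, 1.0)
g2 = lqr_gain(0.1, 1.0)
g3 = lqr_gain(1.0, 0.1)
```

Under these hypothetical values the gain ordering is g3 > g1 > g2, mirroring the strong, moderate, and suppressed correction of the three cases.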
  • FIG. 6 is a flowchart of the process (hereinafter referred to as "control process") executed by the control device 11. The control process is repeated at predetermined intervals.
  • control device 11 (analysis unit 311) estimates the performance time t[k] and the performance position s[k] by analyzing the performance data E (Sa1). Further, the control device 11 (prediction unit 312) generates performance information S[t] for time t after performance time t[k] by using the prediction model (Sa2: prediction process).
  • the control device 11 sets variables applied to the generation of control information U[t] (Sa3).
  • the control device 11 (arithmetic processing unit 33) generates a state cost Q[s1] and a control cost R[p1] (Sa4). Specifically, the control device 11 generates the state cost Q[s1] by analyzing the first musical score data D1, and generates the control cost R[p1] by analyzing the second musical score data D2.
  • the variables set in step Sa3 are applied to generate the state cost Q[s1] and the control cost R[p1].
  • the control device 11 generates the control information U[t] by calculating formulas (7a) to (7d), to which the state variable X[t], the state cost Q[s1], and the control cost R[p1] are applied, so that the cost J of formula (6) is reduced (Sa5: optimization process).
  • the control device 11 controls the playback of the playback part by the sound emitting device 15 according to the control information U[t] (Sa6). Specifically, the control device 11 generates reproduction information P[t] from the control information U[t], and reproduces a portion of the acoustic signal Z corresponding to the reproduction information P[t] on the sound emitting device 15.
  • Since model predictive control is used to generate the control information U[t], the reproduction of the reproduction part can be appropriately controlled according to the performance by the performer.
  • The control information U[t] is generated so that the cost J, which includes the state variable X[t] representing the error between the performance information S[t] and the playback information P[t], is reduced. Therefore, the reproduction of the reproduction part can be linked to the performance by the performer.
  • the cost J includes a state cost Q[s1] related to the state variable X[t] and a control cost R[p1] related to temporal changes in the reproduction information P[t].
  • Due to the state cost Q[s1], the error (state variable X[t]) between the performance information S[t] and the reproduction information P[t] is effectively reduced.
  • Due to the control cost R[p1], excessive changes in the reproduction information P[t] are suppressed. Therefore, both the error between the performance information S[t] and the reproduction information P[t] and excessive changes in the reproduction information P[t] can be effectively reduced.
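  • The trade-off between the two cost terms can be checked with a scalar stand-in for the cost J (hypothetical weights and trajectories; the actual cost is defined over the state variable X[t] and control information U[t]): a dominant state cost favors exact tracking of the performance, while a dominant control cost favors a slowly changing playback trajectory.

```python
import numpy as np

def cost_J(perf, play, cq, cr):
    """Scalar stand-in for cost J: a tracking term weighted by the state
    cost (cq) plus a penalty on temporal changes in the playback
    information weighted by the control cost (cr)."""
    perf = np.asarray(perf, dtype=float)
    play = np.asarray(play, dtype=float)
    error = perf - play            # state variable: position error
    change = np.diff(play)         # temporal change in playback info P[t]
    return cq * np.sum(error ** 2) + cr * np.sum(change ** 2)

perf = [0.0, 1.0, 2.0, 3.0]        # predicted performance positions
tight = [0.0, 1.0, 2.0, 3.0]       # playback tracks the performance exactly
smooth = [0.0, 0.5, 1.0, 1.5]      # playback changes more slowly
```

With a dominant state cost, `cost_J(perf, tight, 1.0, 0.1)` is the smaller of the two; with a dominant control cost, `cost_J(perf, smooth, 0.1, 1.0)` wins instead.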
  • the second embodiment differs from the first embodiment in the operation of the variable setting unit 34.
  • the configuration and operation of elements other than the variable setting section 34 are the same as in the first embodiment. Therefore, the second embodiment also achieves the same effects as the first embodiment.
  • the variable setting unit 34 of the first embodiment sets the variable applied to the generation of the control information U[t] to a numerical value stored in advance in the storage device 12.
  • the variable setting unit 34 of the second embodiment sets variables to be applied to the generation of the control information U[t] in response to a user's instruction to the operating device 14 (Sa3). Specifically, the variable setting unit 34 sets each variable ( ⁇ , ⁇ s', ⁇ , ⁇ , Cq[s1]) in equation (8) and each variable ( ⁇ , Cr[p1], c1 , c2) are set variably according to instructions from the user.
  • the calculation processing unit 33 calculates the state cost Q[s1] and the control cost R[p1] by calculation using the variables set by the variable setting unit 34 (Sa4).
  • the variable related to the cost J in Equation (6) is set according to an instruction from the user, so that the user's intention can be reflected in the reproduction of the reproduction part.
  • the variable setting unit 34 of the second embodiment sets the weighted value Cq[s1] of Equation (8) and the weighted value Cr[p1] of Equation (9).
  • the weight value Cq[s1] is an example of a "first weight value”
  • the weight value Cr[p1] is an example of a "second weight value”.
  • FIG. 7 is a schematic diagram of the setting screen 141 for the user to change the weight value Cq[s1] and the weight value Cr[p1].
  • the variable setting unit 34 displays a setting screen 141 on the display device 13.
  • the setting screen 141 includes the musical score 142 of the target song represented by the song data D.
  • the musical score 142 includes a musical score 143 of the performance part represented by the first musical score data D1, and a musical score 144 of the reproduction part represented by the second musical score data D2.
  • the user can specify an arbitrary section (hereinafter referred to as a "set section") 145 within the musical score 142 by operating the operating device 14.
  • the variable setting unit 34 accepts the designation of a setting section 145 by the user. Note that a plurality of setting sections 145 may be specified in the musical score 142.
  • the variable setting unit 34 accepts selections by the user.
  • the variable setting unit 34 displays the changed image 146 in FIG. 7 on the display device 13.
  • the modified image 146 includes the current value (synchrony) of the weight value Cq[s1].
  • the user can instruct increase (Increase synchrony) or decrease (Decrease synchrony) of weight value Cq[s1] by operating on change image 146.
  • the variable setting unit 34 changes the weight value Cq[s1] within the setting section 145 in response to an instruction from the user.
  • the variable setting unit 34 sets a weight value Cq[s1] for each setting section 145 specified by the user. Note that the variable setting unit 34 may set the weight value Cq[s1] to a numerical value directly specified by the user.
  • the variable setting unit 34 displays the changed image 147 in FIG. 8 on the display device 13.
  • the modified image 147 includes the current value (rigidity) of the weight value Cr[p1].
  • the user can instruct increase (Increase rigidity) or decrease (Decrease rigidity) of weight value Cr[p1] by operating on change image 147.
  • the variable setting unit 34 changes the weight value Cr[p1] within the setting section 145 in response to an instruction from the user.
  • the variable setting unit 34 sets a weight value Cr[p1] for each setting section 145 specified by the user. Note that the variable setting unit 34 may set the weight value Cr[p1] to a numerical value directly designated by the user.
  • the arithmetic processing unit 33 of the second embodiment generates the state cost Q[s1] and the control cost R[p1] according to the weighted value Cq[s1] and the weighted value Cr[p1] set by the variable setting unit 34 (Sa4). Specifically, the arithmetic processing unit 33 calculates the state cost Q[s1] for the performance position s1[t] within the set section 145 of the target music piece according to formula (8), to which the weighted value Cq[s1] of the set section 145 is applied.
  • Similarly, the calculation processing unit 33 calculates the control cost R[p1] for the playback position p1[t] within the set section 145 of the target music piece according to formula (9), to which the weighted value Cr[p1] of the set section 145 is applied. For sections other than the set sections 145 of the target song, the weight value Cq[s1] and the weight value Cr[p1] are set to predetermined initial values.
  • the reproduction behavior of the reproduction part with respect to the performance part changes according to the weight value Cq[s1] and the weight value Cr[p1].
  • the relationship between the performance part and the playback part can be changed according to the setting of the weight value Cq[s1] and the weight value Cr[p1] by the variable setting section 34.
  • each of the weight value Cq[s1] and the weight value Cr[p1] is set according to instructions from the user. Therefore, the user can change the relationship between the performance parts and the playback parts.
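  • A minimal sketch of the per-section weight lookup (the storage layout of the setting sections 145 is hypothetical): positions inside a user-specified section use that section's weight value, and positions outside every section fall back to the predetermined initial value.

```python
DEFAULT_CQ = 1.0   # predetermined initial value outside any setting section

def cq_for_position(pos, sections):
    """Return the weight value Cq for a performance position.

    sections: list of (start, end, cq) tuples specified on the
    setting screen 141 (hypothetical representation)."""
    for start, end, cq in sections:
        if start <= pos < end:
            return cq
    return DEFAULT_CQ

# Two setting sections: stronger synchrony early, weaker synchrony later.
sections = [(0.0, 8.0, 2.0), (16.0, 24.0, 0.5)]
```

The same lookup applies unchanged to the weight value Cr[p1] with a playback position.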
  • the music data D of the third embodiment includes N pieces of first score data D1 and M pieces of second score data D2.
  • the N pieces of first musical score data D1 correspond to different performance parts of the target musical piece.
  • the M pieces of second musical score data D2 correspond to different reproduction parts of the target musical piece.
  • the storage device 12 also stores M audio signals Z corresponding to different reproduction parts.
  • the acoustic signal Z of each reproduction part represents the waveform of the musical tone of the reproduction part.
  • the performance prediction unit 31 predicts performance information S[t] for each of the N performance parts using a prediction model. That is, performance information S[t] is predicted for each of the N performers.
  • the process of predicting performance information S[t] is the same as in the first embodiment.
  • Performance information S[t] of each performance part is predicted from the performance of the performance part (ie, performance data E).
  • the performance prediction unit 31 may predict the performance information S[t] of each performance part using a separate prediction model for each performance part, or may use a prediction model common to N performance parts. The performance information S[t] of each performance part may be predicted by doing so.
  • FIG. 9 is an explanatory diagram of the state variable X[t] and state cost Q[s1] in the third embodiment.
  • the state variable X[t] includes state variables Xn,m[t] for all combinations of selecting one of the N performance parts and one of the M reproduction parts.
  • the state variable Xn,m[t] corresponds to the state variable X[t] of the first embodiment.
  • the state variable Xn,m[t] is a two-dimensional vector representing the error between the performance information S[t] of the n-th performance part and the reproduction information P[t] of the m-th playback part. That is, the state variable Xn,m[t] represents the error between the performance of the n-th performance part and the reproduction of the m-th playback part by the sound emitting device 15.
  • the state cost Q[s1] is a block diagonal matrix whose diagonal components are N ⁇ M submatrices Qn,m[s1]. Elements of the state cost Q[s1] other than the submatrix Qn,m[s1] are set to zero. Specifically, the state cost Q[s1] includes submatrices Qn,m[s1] for all combinations of selecting one of the N performance parts and one of the M reproduction parts. The submatrix Qn,m[s1] corresponds to the state cost Q[s1] of the first embodiment.
  • the submatrix Qn,m[s1] is the cost for the error between the performance information S[t] at the performance position s1[t] of the n-th performance part and the reproduction information P[t] of the m-th playback part.
  • the arithmetic processing unit 33 calculates the submatrix Qn,m[s1] using Equation (8) similarly to the state cost Q[s1] in the first embodiment.
  • the set Gq of formula (8), applied to the calculation of the submatrix Qn,m[s1], is the set of sounding positions s' of the notes specified by the first score data D1 of the n-th performance part of the music data D.
  • the variable setting unit 34 of the third embodiment individually sets the weight value Cq[s1] of formula (8) for each submatrix Qn,m[s1].
  • the storage device 12 stores a plurality of different setting data. N ⁇ M weight values Cq[s1] corresponding to different combinations of performance parts and playback parts are registered in each of the plurality of setting data. The numerical value of each weight value Cq[s1] differs for each setting data.
  • the variable setting unit 34 selects any one of the plurality of setting data according to a user's instruction to the operating device 14. Selection of setting data corresponds to setting of weight value Cq[s1] corresponding to each submatrix Qn,m[s1].
  • the calculation processing unit 33 calculates the submatrix Qn,m[s1] by calculating the formula (8) applying each weight value Cq[s1] registered in the setting data. As understood from the above description, the weight value Cq[s1] applied to the generation of each submatrix Qn,m[s1] is changed according to instructions from the user. Note that the variable setting unit 34 may individually set each of the N ⁇ M weight values Cq[s1] according to instructions from the user.
  • FIG. 10 is an explanatory diagram of control information U[t] and control cost R[p1] in the third embodiment.
  • the control information U[t] of the third embodiment includes M pieces of control information U1[t] to UM[t] corresponding to different playback parts of the target song.
  • the control information Um[t] corresponds to the control information U[t] of the first embodiment. Therefore, the control information Um[t] is a two-dimensional vector including the playback position u1[t] and the playback speed u2[t].
  • the playback control unit 40 controls the playback of the m-th playback part by the sound emitting device 15 according to the control information Um[t].
  • the playback control unit 40 generates playback information Pm[t] from the control information Um[t], and causes the sound emitting device 15 to play the m-th playback part according to the playback information Pm[t]. That is, the reproduction control unit 40 causes the sound emitting device 15 to reproduce the portion of the audio signal Z of the m-th playback part that corresponds to the playback information Pm[t]. Therefore, the musical tones of the M playback parts of the target music piece are reproduced in parallel.
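  • A minimal sketch of advancing the M playback parts in parallel (the update rule and values are hypothetical; the specification defines the exact relation between Um[t] and Pm[t]): each part's playback information Pm = (position, speed) advances by its own speed and is corrected by its own control information Um.

```python
def step_playback(parts, controls, dt=0.01):
    """Advance each playback part by one control period.

    parts:    list of Pm = (position, speed) per playback part
    controls: list of Um = (position change, speed change) per part
    (hypothetical update rule for illustration)"""
    updated = []
    for (p1, p2), (u1, u2) in zip(parts, controls):
        p2 = p2 + u2               # apply the speed correction
        p1 = p1 + p2 * dt + u1     # advance and apply the position correction
        updated.append((p1, p2))
    return updated

# M = 2 playback parts; only the second one is corrected this period.
parts = step_playback([(0.0, 1.0), (0.0, 1.0)],
                      [(0.0, 0.0), (0.005, 0.1)])
```

After one step, the corrected part is slightly ahead of and faster than the uncorrected one.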
  • the control cost R[p1] is a block diagonal matrix whose diagonal components are M submatrices R1[p1] to RM[p1]. Elements of the control cost R[p1] other than the submatrix Rm[p1] are set to zero.
  • the submatrix Rm[p1] corresponds to the control cost R[p1] of the first embodiment. Specifically, it is a cost related to a change in the reproduction information Pm[t] at the reproduction position p1[t] of the m-th reproduction part.
  • the arithmetic processing unit 33 calculates the submatrix Rm[p1] using Equation (9) similarly to the control cost R[p1] of the first embodiment.
  • the set Gr of formula (9), applied to the calculation of the submatrix Rm[p1], is the set of sounding positions p' of the notes specified by the second score data D2 of the m-th playback part of the music data D.
  • the variable setting unit 34 of the third embodiment individually sets the weight value Cr[p1] of formula (9) for each submatrix Rm[p1].
  • the storage device 12 stores a plurality of different setting data.
  • M weight values Cr[p1] corresponding to different playback parts are registered in each of the plurality of setting data.
  • the numerical value of each weight value Cr[p1] differs for each setting data.
  • the variable setting unit 34 selects any one of the plurality of setting data according to a user's instruction to the operating device 14. Selection of setting data corresponds to setting of weight value Cr[p1] corresponding to each submatrix Rm[p1].
  • the arithmetic processing unit 33 calculates the submatrix Rm[p1] by calculating formula (9), applying each weight value Cr[p1] registered in the setting data. As understood from the above description, the weight value Cr[p1] applied to the generation of each submatrix Rm[p1] is changed in accordance with instructions from the user. Note that the variable setting unit 34 may individually set each of the M weight values Cr[p1] according to instructions from the user.
  • the information generation unit 32 calculates the control information U[t] of each playback part by calculating equations (7a) to (7d) using the state variable X[t], the state cost Q[s1], and the control cost R[p1] (Sa5). That is, the information generation unit 32 generates control information Um[t] (U1[t] to UM[t]) for each of the M playback parts. Therefore, the third embodiment also achieves the same effects as the first embodiment.
  • the total number N of performance parts and the total number M of reproduction parts are generalized.
  • performance information S[t] is predicted for each of the plurality of performers (performance parts). Therefore, the reproduction of the reproduction parts can be appropriately controlled according to the performances by a plurality of performers.
  • control information Um[t] is generated for each of the plurality of reproduction parts. Therefore, the reproduction of each of the plurality of reproduction parts can be controlled according to the performance by the performer.
  • both the total number N of performance parts and the total number M of reproduction parts are 2 or more, the reproduction of each of the plurality of reproduction parts can be controlled in accordance with the performances by the plurality of performers.
  • N ⁇ M weight values Cq[s1] corresponding to different combinations of performance parts and playback parts are controlled.
  • M weight values Cr[p1] corresponding to different playback parts are controlled.
  • the relationship between the reproduction of the playback part and the performance of the performance part depends on the weight value Cq[s1] and the weight value Cr[p1]. Therefore, the relationship between each of the N performance parts and each of the M playback parts can be controlled in detail according to the weighting value Cq[s1] and the weighting value Cr[p1]. That is, the degree to which the reproduction of the reproduction part is linked to the performance of the performance part can be individually controlled for each combination of each performance part and each reproduction part. For example, various types of control can be realized, such as strongly linking the playback of a specific playback part with the performance of a specific performance part, while hardly linking the playback of other playback parts with the performance of the performance part.
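  • The block-diagonal structure described above can be sketched as follows (a minimal sketch with hypothetical 2×2 sub-blocks and weight values; as stated above, all elements outside the diagonal sub-blocks are zero).

```python
import numpy as np

def block_diag(blocks):
    """Assemble a block-diagonal matrix; everything off the diagonal
    sub-blocks stays zero, as for Q[s1] and R[p1] above."""
    size = sum(b.shape[0] for b in blocks)
    out = np.zeros((size, size))
    i = 0
    for b in blocks:
        n = b.shape[0]
        out[i:i + n, i:i + n] = b
        i += n
    return out

N, M = 2, 3                                   # performance parts x playback parts
weights_cq = np.linspace(0.5, 1.5, N * M)     # hypothetical N*M weight values Cq
Q = block_diag([cq * np.eye(2) for cq in weights_cq])
R = block_diag([0.8 * np.eye(2) for _ in range(M)])   # hypothetical M weights Cr
```

Each of the N×M sub-blocks of Q (and each of the M sub-blocks of R) can carry its own weight, which is what lets each performance-part/playback-part pairing be tuned individually.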
  • As described above, the prediction control unit 30 generates control information U[t] for at least one playback part of the target music piece through model predictive control using a prediction model that predicts the performance information S[t], including the performance position s1[t] in the target music piece, for at least one performer.
  • the acoustic signal Z stored in the storage device 12 is used to reproduce the reproduction part, but the method of reproducing the musical tone of the reproduction part is not limited to the above examples.
  • the audio signal Z may be generated by the reproduction control section 40 sequentially supplying the second musical score data D2 to the sound source section.
  • the second musical score data D2 is supplied to the sound source section in parallel with the performance of the performance part by the performer.
  • the playback control section 40 functions as a sequencer that processes the second musical score data D2.
  • the sound source section is a hardware sound source or a software sound source.
  • the reproduction control unit 40 controls the timing of supplying the second musical score data D2 to the sound source unit according to the control information U[t].
  • In each of the above embodiments, the musical tone of the reproduction part is reproduced by the sound emitting device 15, but the manner of reproducing the reproduction part is not limited to the above examples.
  • the playback control unit 40 may cause an electronic musical instrument capable of automatic performance to play the musical tone of the playback part. That is, the playback control unit 40 causes the electronic musical instrument to automatically perform the playback part by controlling the electronic musical instrument according to the control information U[t].
  • the playback control unit 40 may, for example, control the playback of a video related to the playback part (hereinafter referred to as "target video").
  • the target video is a video that shows a specific performer playing the playback part of the target song.
  • the target video may be a captured video of a real performer playing the playback part on an instrument, or a composite video generated by image processing to depict a virtual performer playing the playback part. Note that it does not matter whether or not there is sound in the target video.
  • Video data representing the target video is stored in the storage device 12.
  • the playback control unit 40 displays the target moving image on the display device 13 by outputting the moving image data.
  • the playback control unit 40 controls playback of the target video according to the control information U[t].
  • the playback control unit 40 generates playback information P[t] from the control information U[t], and displays the portion of the target video that corresponds to the playback information P[t] on the display device 13. That is, the playback position p1[t] and playback speed p2[t] of the target video are controlled in conjunction with the performance of the performance part by the performer.
  • the performer represented by the target video (hereinafter referred to as the "virtual performer") is, for example, an avatar existing in a virtual space.
  • the playback control unit 40 displays on the display device 13 a virtual performer and a background image photographed by a virtual camera in the virtual space.
  • the display device 13 may be installed in an HMD (Head Mounted Display) that is worn on the user's head.
  • the position and direction of the virtual camera in the virtual space are dynamically controlled according to the behavior (eg, position and direction) of the user's head. Therefore, by moving their head appropriately, the user can visually recognize the virtual performer from any position and direction in the virtual space.
  • the video data for displaying the virtual performer in the virtual space includes, for example, motion data representing the movements of the skeleton and joints of the virtual performer.
  • the motion data specifies, for example, changes in relative angle and position over time for each of the skeleton and joints.
  • the reproduction control unit 40 controls movement of the skeleton and joints represented by the motion data according to control information U[t] (or reproduction information P[t]).
  • the playback control unit 40 generates a virtual performer in a posture specified by the motion data as an object in the virtual space.
  • the virtual performer in the virtual space is controlled to have a posture corresponding to the skeleton and joints specified by the portion of the motion data that corresponds to the playback information P[t].
  • the reproduction control unit 40 changes the speed of movement of the skeleton and joints specified by the motion data according to the control information U[t]. Therefore, the performance by the virtual performer in the virtual space progresses in conjunction with the performance by the performer in the real space. For example, image processing such as modeling and texturing is used to generate a three-dimensional virtual performer. Then, the playback control unit 40 generates a planar image (target moving image) of the virtual performer in the virtual space captured by the virtual camera, through image processing such as rendering, for example. As mentioned above, the position and direction of the virtual camera change depending on the behavior of the user's head. The playback control unit 40 displays the target video generated by the above processing on the display device 13.
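  • As an illustration of posture selection from motion data (the keyframe format is hypothetical; the motion data is only described as specifying changes in angle and position over time), the playback position can index an interpolated pose, so correcting the playback position and speed directly changes the virtual performer's movement.

```python
def pose_at(keyframes, p1):
    """Linearly interpolate a joint angle at playback position p1.

    keyframes: list of (time, angle) pairs from the motion data
    (hypothetical single-joint representation)."""
    for (t0, a0), (t1, a1) in zip(keyframes, keyframes[1:]):
        if t0 <= p1 <= t1:
            w = (p1 - t0) / (t1 - t0)
            return a0 + w * (a1 - a0)
    return keyframes[-1][1]   # hold the final posture past the last keyframe

# One joint raised and lowered over two seconds of motion data (degrees).
keys = [(0.0, 0.0), (1.0, 90.0), (2.0, 0.0)]
```

Scaling how fast p1 advances (the playback speed) then stretches or compresses the motion in time without changing the keyframes themselves.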
  • the user can view the virtual performer playing the playback part from any position and direction in the virtual space.
  • a performer wearing an HMD can check from any position and direction in the virtual space how a virtual performer is playing a playback part in conjunction with the performer's performance of the playback part.
  • a virtual performer who plays the playback part is displayed, but for example, a virtual dancer who dances in conjunction with the progress of the playback part may be displayed on the display device 13.
  • Virtual performers and virtual dancers are collectively represented as virtual performers.
  • Regardless of whether or not the display device 13 is attached to the user's head, the virtual performer may be displayed.
  • the playback control unit 40 is comprehensively expressed as an element that controls the playback of the playback part.
  • "Reproduction of the reproduction part” includes reproduction of the musical tone of the reproduction part and reproduction of the moving image (target moving image) of the reproduction part.
  • the display device 13 and the sound emitting device 15 are playback devices that play back the playback part.
  • In the first to third embodiments, it is possible to control the reproduction of the musical tones of the reproduction part according to the performance by the performer.
  • In this modification, it is possible to control the reproduction of the moving image related to the reproduction part in accordance with the performance by the performer.
  • the performance data E representing the performance by the performer is supplied to the playback control system 10, but the input information corresponding to the performance by the performer is not limited to the performance data E.
  • a signal representing the waveform of a musical tone played by a performer (hereinafter referred to as a "performance signal”) may be supplied to the playback control system 10 instead of the performance data E.
  • the performance signal is a signal generated by collecting musical tones produced by a musical instrument during a performance by a performer using a microphone.
  • the performance prediction unit 31 generates performance information S[t] by analyzing the performance signal. For example, the analysis unit 311 estimates the performance time t[k] and the performance position s[k] by analyzing the performance signal.
  • the prediction unit 312 generates performance information S[t] using a prediction model, as in the first embodiment. The above configuration also achieves the same effects as those of the above-described embodiments.
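  • The document does not specify how the performance signal is analyzed; purely as an illustrative assumption, a simple frame-energy onset detector such as the following could supply the performance times t[k] estimated by the analysis unit 311 (the frame size and thresholds are hypothetical).

```python
import numpy as np

def detect_onsets(signal, sr, frame=256, threshold=4.0):
    """Report times where frame energy jumps sharply, as a crude
    stand-in for estimating performance times t[k] from a signal."""
    frames = signal[: len(signal) // frame * frame].reshape(-1, frame)
    energy = (frames ** 2).sum(axis=1)
    onsets = []
    for i in range(1, len(energy)):
        if energy[i] > threshold * (energy[i - 1] + 1e-9) and energy[i] > 1e-3:
            onsets.append(i * frame / sr)
    return onsets

sr = 8000
t = np.arange(sr) / sr
signal = np.where(t >= 0.5, np.sin(2 * np.pi * 440 * t), 0.0)  # note starts at 0.5 s
onsets = detect_onsets(signal, sr)
```

On this synthetic signal the detector reports a single onset close to 0.5 s; a practical analysis unit would of course be considerably more robust.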
  • the state space model is exemplified as the prediction model used to predict the performance information S[t], but the form of the prediction model is not limited to the above examples.
  • a statistical model such as a deep neural network or a hidden Markov model may be used as a predictive model.
  • the performance information S[t] includes the performance position s1[t] and the performance speed s2[t], but the format of the performance information S[t] is not limited to the above examples. For example, the performance speed s2[t] may be omitted. That is, the performance information S[t] is comprehensively expressed as information including the performance position s1[t].
  • the reproduction information P[t] is not limited to information including the reproduction position p1[t] and the reproduction speed p2[t]. For example, the playback speed p2[t] may be omitted. That is, the playback information P[t] is comprehensively expressed as information including the playback position p1[t].
  • the formats of the state variable X[t] and the control information U[t] are not limited to the examples in each of the above-mentioned forms.
  • the speed error x2[t] may be omitted from the state variable X[t].
  • the playback speed u2[t] may be omitted from the control information U[t].
  • the predictive control unit 30 generates the control information U[t] by model predictive control using one prediction model, but a plurality of different prediction models may be selectively used. The prediction control unit 30 generates control information U[t] for one or more playback parts of the target song using any one of the plurality of prediction models.
  • a prediction model is prepared for each performer.
  • Each performer's prediction model is a state space model that reflects the performance tendency of the performer.
  • the predictive control unit 30 generates control information U[t] for one or more playback parts of the target song by using a prediction model corresponding to the performer of the performance part from among the plurality of prediction models.
  • a prediction model may be prepared for each set of a plurality of performers (for example, for each orchestra).
  • a prediction model may be prepared for each attribute of the target song, for example.
  • the attributes of the target song are, for example, the music genre of the target song (for example, rock, pop, jazz, trance, hip-hop, etc.) or the musical impression (for example, "a song with a bright impression", "a song with a dark impression", etc.).
  • the prediction control unit 30 generates control information U[t] for one or more playback parts of the target song by using a prediction model corresponding to the attribute of the target song among the plurality of prediction models.
  • the reproduction of the reproduction part can be controlled in various ways according to the selection conditions of the prediction model (for example, performer or attribute).
  • the playback control system 10 may be realized by a server device that communicates with a terminal device such as a mobile phone or a smartphone.
  • the predictive control unit 30 of the playback control system 10 generates the control information U[t] by processing the performance data E (or performance signal) received from the terminal device.
  • the music data D stored in the storage device 12 of the playback control system 10 or the music data D transmitted from the terminal device is used to generate the control information U[t].
  • the playback control unit 40 transmits a portion of the audio signal Z (or video data of the target video) that corresponds to the control information U[t] to the terminal device. Note that in a configuration in which the playback control unit 40 is installed in a terminal device, the control information U[t] may be transmitted from the playback control system 10 to the terminal device.
  • the functions of the playback control system 10 are realized through cooperation between one or more processors forming the control device 11 and the program stored in the storage device 12.
  • the programs exemplified above may be provided in a form stored in a computer-readable recording medium and installed on a computer.
  • the recording medium is, for example, a non-transitory recording medium; an optical recording medium (optical disc) such as a CD-ROM is a good example, but recording media of any known form, such as semiconductor recording media and magnetic recording media, are also included.
  • the non-transitory recording medium includes any recording medium excluding transitory, propagating signals, and does not exclude volatile recording media.
  • a recording medium that stores a program in the distribution device corresponds to the above-mentioned non-transitory recording medium.
  • a playback control method according to one aspect of the present disclosure generates control information for at least one playback part of a piece of music through model predictive control using a prediction model that predicts performance information, including a performance position in the piece, for at least one performer, and controls playback of the playback part in the piece using the control information generated for the at least one playback part.
  • because model predictive control is used to generate the control information, playback of the playback part can be appropriately controlled according to the performance by the performer.
  • Performance information is data in any format including the performance position.
  • the performance information includes a performance position and a performance speed.
  • the performance position is the position in the song where the performer is playing.
  • the performance speed is the speed (tempo) at which the performer plays the music.
  • control information is data in any format for controlling reproduction of a reproduction part.
  • the control information includes the amount of change in playback position and the amount of change in playback speed.
  • the information and processing used to predict performance information are arbitrary. For example, it is assumed that performance data representing a performance by a performer or a performance signal representing a waveform of a musical tone played by a performer is used for predicting performance information. Further, for example, a video of a user playing a performance may be used for predicting performance information.
  • Various prediction models are used to predict performance information. As the prediction model, for example, a state space model such as a Kalman filter is used.
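As an illustration of the state-space approach mentioned above, the following is a minimal sketch (the dynamics, class name, and parameter values are assumptions for illustration, not the disclosed implementation) of a constant-tempo Kalman filter that estimates a performer's position and tempo from position observations:

```python
import numpy as np

# Hypothetical sketch of a state-space prediction model: a constant-tempo
# Kalman filter tracking a performer's score position and tempo from noisy
# position observations.
class PerformancePredictor:
    def __init__(self, q=1e-4, r=1e-2):
        self.x = np.array([0.0, 1.0])    # state: [position (beats), tempo (beats/s)]
        self.P = np.eye(2)               # state covariance
        self.Q = q * np.eye(2)           # process noise covariance
        self.R = r                       # observation noise (scalar)
        self.H = np.array([[1.0, 0.0]])  # only the position is observed

    def step(self, observed_position, dt):
        # Predict: position advances by tempo * dt; tempo is assumed constant.
        A = np.array([[1.0, dt], [0.0, 1.0]])
        self.x = A @ self.x
        self.P = A @ self.P @ A.T + self.Q
        # Update with the observed performance position.
        y = observed_position - (self.H @ self.x)[0]
        S = (self.H @ self.P @ self.H.T)[0, 0] + self.R
        K = (self.P @ self.H.T)[:, 0] / S
        self.x = self.x + K * y
        self.P = (np.eye(2) - np.outer(K, self.H[0])) @ self.P
        return self.x  # predicted performance information [position, tempo]
```

Feeding such a filter the positions detected from the performance data or performance signal would yield, at each step, predicted performance information (position and speed) of the kind described above.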
  • the “playback part” is a music part that is to be controlled by the control information among the plurality of music parts that make up the song.
  • “Reproduction of a playback part” includes not only playback of audio (for example, automatic performance) related to the playback part, but also playback of video related to the playback part.
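For illustration, a minimal sketch of the playback side (the class and field names are assumptions) shows how control information containing a change in playback position and a change in playback speed could be applied before each playback step:

```python
# Hypothetical sketch: control information carries a change in playback
# position and a change in playback speed, which the playback side applies
# before advancing playback by one step.
class PlaybackState:
    def __init__(self, position=0.0, speed=1.0):
        self.position = position   # playback position within the song (beats)
        self.speed = speed         # playback speed (relative tempo)

    def apply_control(self, d_position, d_speed, dt):
        self.position += d_position        # change in playback position
        self.speed += d_speed              # change in playback speed
        self.position += self.speed * dt   # advance playback by one step
        return self.position
```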
  • the at least one performer is a plurality of performers, and in predicting the performance information, the performance information is predicted for each of the plurality of performers. According to the above aspect, it is possible to appropriately control the reproduction of the reproduction part according to performances by a plurality of performers.
  • the at least one reproduction part is a plurality of reproduction parts, and in generating the control information, the control information is generated for each of the plurality of reproduction parts.
  • the reproduction of each of the plurality of reproduction parts can be controlled according to the performance by the performer.
  • the playback control includes controlling the playback of musical tones related to the at least one playback part of the music piece. According to the above aspect, it is possible to control the reproduction of musical tones related to the reproduction part according to the performance by the performer.
  • the playback control includes controlling the playback of a video related to the at least one playback part of the song.
  • the moving image is, for example, a moving image in which a virtual performer (for example, a performer or a dancer) in a virtual space performs a reproduction part.
  • in any one of aspects 1 to 5 (aspect 6), in the model predictive control, control information is generated for the at least one playback part such that a cost is reduced, the cost including a state variable that represents an error between the performance information predicted for the at least one performer and playback information including a playback position of the at least one playback part.
  • the control information is generated so that the cost including the state variable representing the error between the performance information and the playback information is reduced. Therefore, it is possible to link the reproduction of the reproduction part to the performance by the performer.
  • “Reproduction information” is data in any format including the reproduction position.
  • the playback information includes a playback position and a playback speed.
  • the playback position is the position within the song where the song is being played.
  • the playback speed is the speed at which the song is played.
  • At least one variable included in the cost is further set in accordance with an instruction from the user. According to the above aspect, since the variable related to the cost is set according to the instruction from the user, the user's intention can be reflected in the reproduction of the reproduction part.
  • the variables of the cost are various variables applied to calculations related to the cost. Specifically, in a form in which the cost includes a state cost and a control cost, a first weight value for the state cost and a second weight value for the control cost are set as the "variables" according to instructions from the user.
  • the cost includes a state cost and a control cost: the state cost is a cost related to the state variable, and the control cost is a cost related to temporal changes in the reproduction information.
  • a first weight value and a second weight value are further set; the state cost is a cost weighted by the first weight value, and the control cost is a cost weighted by the second weight value.
  • the state cost is weighted by the first weight value, and the control cost is weighted by the second weight value. Therefore, the relationship between the performance by the performer and the reproduction of the reproduction part can be changed according to the settings of the first weight value and the second weight value.
  • in aspect 10, in setting the first weight value and the second weight value, the first weight value and the second weight value are changed according to instructions from the user.
  • each of the first weight value and the second weight value is set according to an instruction from the user. Therefore, the user can change the relationship between the performance by the performer and the reproduction of the reproduction part.
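As a toy illustration of this weighted cost (a hypothetical one-dimensional sketch; the function name, candidate grid, and dynamics are assumptions, not the disclosed method), the controller below evaluates candidate playback-speed changes over a short horizon and keeps the one that minimizes the state cost weighted by the first weight value plus the control cost weighted by the second weight value:

```python
# Hypothetical model-predictive step: choose a playback-speed change u so that
# w1 * (tracking error between predicted performance positions and playback
# positions) + w2 * (squared speed change) is minimized over the horizon.
def mpc_speed_change(predicted_positions, play_pos, play_speed,
                     dt=0.1, w1=1.0, w2=0.5, candidates=None):
    if candidates is None:
        candidates = [i * 0.05 - 0.5 for i in range(21)]  # u in [-0.5, 0.5]
    best_u, best_cost = 0.0, float("inf")
    for u in candidates:
        pos, speed, cost = play_pos, play_speed + u, 0.0
        for target in predicted_positions:   # roll the playback model forward
            pos += speed * dt
            cost += w1 * (target - pos) ** 2  # state cost (tracking error)
        cost += w2 * u ** 2                   # control cost (smoothness)
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u
```

Raising the second weight relative to the first makes playback follow the performer more sluggishly but more smoothly, which is the trade-off the two weight values expose to the user.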
  • in predicting the performance information, the performance information is predicted by using any one of a plurality of prediction models for predicting performance information.
  • the playback of the playback part can be controlled in various ways according to the selection conditions of the prediction model.
  • a playback control system according to one aspect of the present disclosure includes: a predictive control unit that generates control information for at least one playback part of a song through model predictive control using a prediction model that predicts performance information including a performance position in the song for at least one performer; and a playback control unit that controls playback of the playback part in the song based on the control information generated for the at least one playback part.
  • a program according to one aspect (aspect 13) of the present disclosure causes a computer system to function as: a predictive control unit that generates control information for at least one playback part of a piece of music through model predictive control using a prediction model that predicts performance information including a performance position in the piece for at least one performer; and a playback control unit that controls playback of the playback part in the piece using the control information generated for the at least one playback part.
  • an information processing method according to one aspect of the present disclosure generates control information for a playback part through model predictive control using a prediction model that predicts performance information including a performance position in a song for at least one performer, controls the movement of the skeleton and joints represented by motion data according to the control information, and places a virtual demonstrator, in a posture corresponding to the controlled skeleton and joints, in a virtual space.
  • an image of the virtual space, captured by a virtual camera whose position and direction are controlled according to the movement of the user's head, is displayed on the display device.
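A rough sketch of this rendering step (all names, the frame-indexed motion data, and the camera representation are assumptions for illustration): the control information adjusts the playback state of the motion data, a posture is taken from the corresponding motion frame, and the virtual camera mirrors the user's head pose.

```python
# Hypothetical rendering step: apply control information to motion-data
# playback, pick the skeleton/joint posture for the current position, and
# place the virtual camera at the viewer's head pose.
def render_step(motion_frames, state, d_position, d_speed, head_pose, fps=60):
    # Apply the control information to the motion-data playback state.
    state["speed"] += d_speed
    state["position"] += d_position + state["speed"] / fps
    # Select the motion frame (posture) for the current playback position.
    index = min(int(state["position"] * fps), len(motion_frames) - 1)
    posture = motion_frames[index]
    # The virtual camera follows the position/direction of the user's head.
    camera = {"position": head_pose["position"],
              "direction": head_pose["direction"]}
    return posture, camera
```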
  • Performance system
  • 10... Playback control system
  • 11... Control device
  • 12... Storage device
  • 13... Display device
  • 14... Operating device
  • 15... Sound emitting device
  • 20... Keyboard instrument
  • 30... Prediction control unit
  • 31... Performance prediction section
  • 311... Analysis section
  • 312... Prediction section
  • 32... Information generation section
  • 33... Arithmetic processing section
  • 34... Variable setting section
  • 40... Playback control section

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The invention relates to a playback control system comprising: a prediction control unit that generates control information for at least one playback part of a musical composition through model predictive control that uses a prediction model to predict performance information, including a performance position within the composition, for at least one musician; and a playback control unit that controls playback of the playback part of the composition using the control information generated for that playback part.
PCT/JP2022/009776 2022-03-07 2022-03-07 Playback control method, information processing method, playback control system, and program WO2023170757A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/009776 WO2023170757A1 (fr) Playback control method, information processing method, playback control system, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/009776 WO2023170757A1 (fr) Playback control method, information processing method, playback control system, and program

Publications (1)

Publication Number Publication Date
WO2023170757A1 true WO2023170757A1 (fr) 2023-09-14

Family

ID=87936234

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/009776 WO2023170757A1 (fr) Playback control method, information processing method, playback control system, and program

Country Status (1)

Country Link
WO (1) WO2023170757A1 (fr)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007241181A (ja) * 2006-03-13 2007-09-20 Univ Of Tokyo Automatic accompaniment system and musical score tracking system
JP2018063295A (ja) * 2016-10-11 2018-04-19 Yamaha Corporation Performance control method and performance control device
JP2019139295A (ja) * 2018-02-06 2019-08-22 Yamaha Corporation Information processing method and information processing device
US10643593B1 * 2019-06-04 2020-05-05 Electronic Arts Inc. Prediction-based communication latency elimination in a distributed virtualized orchestra
JP2021043258A (ja) * 2019-09-06 2021-03-18 Yamaha Corporation Control system and control method
EP3869495A1 * 2020-02-20 2021-08-25 Antescofo Improved synchronization of a pre-recorded musical accompaniment to a user's music playing


Similar Documents

Publication Publication Date Title
CN109478399B (zh) Performance analysis method, automatic performance method, and automatic performance system
US11557269B2 (en) Information processing method
CN111052223B (zh) Playback control method, playback control device, and recording medium
US10504498B2 (en) Real-time jamming assistance for groups of musicians
US8887051B2 (en) Positioning a virtual sound capturing device in a three dimensional interface
JP7432124B2 (ja) Information processing method, information processing device, and program
US11609736B2 (en) Audio processing system, audio processing method and recording medium
US7504572B2 (en) Sound generating method
WO2023170757A1 (fr) Playback control method, information processing method, playback control system, and program
JP2018032316A (ja) Video generation device, video generation model learning device, method thereof, and program
JP3233103B2 (ja) Fingering data creation device and fingering display device
JP2018155936A (ja) Sound data editing method
JP6838357B2 (ja) Acoustic analysis method and acoustic analysis device
JP4238237B2 (ja) Musical score display method and musical score display program
WO2024004564A1 (fr) Acoustic analysis system, acoustic analysis method, and program
Lin et al. VocalistMirror: A Singer Support Interface for Avoiding Undesirable Facial Expressions
US20230244646A1 (en) Information processing method and information processing system
WO2023182005A1 (fr) Data output method, program, data output device, and electronic musical instrument
WO2023181571A1 (fr) Data output method, program, data output device, and electronic musical instrument
WO2023181570A1 (fr) Information processing method, information processing system, and program
WO2024085175A1 (fr) Data processing method and program
WO2022074754A1 (fr) Information processing method, information processing system, and program
JP2023154236A (ja) Information processing system, information processing method, and program
JP2023154288A (ja) Control device
Weinberg et al. Robotic musicianship.

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22930743

Country of ref document: EP

Kind code of ref document: A1