WO2023199772A1

WO2023199772A1 - Information processing method, information processing device, and program

Info

Publication number: WO2023199772A1
Application number: PCT/JP2023/013784
Authority: WO
Inventors: 菜津美土岐; 英里香北原; 真生野口
Original assignee: ヤマハ株式会社
Priority date: 2022-04-14
Filing date: 2023-04-03
Publication date: 2023-10-19
Also published as: JP2023157132A

Abstract

An information processing device according to the present invention comprises: a musical instrument identification unit that identifies the type of a musical instrument; and a notification control unit that notifies a user of a sound collecting position appropriate to the identified type as a position where musical performance sounds of the musical instrument should be collected.

Description

Information processing method, information processing device and program

The present disclosure relates to technology that supports sound collection by users.

Various techniques have been proposed in the past for notifying users of the position at which a sound collection device should be installed (hereinafter referred to as "sound collection position"). For example, Patent Document 1 discloses a technique for notifying users of the position of a sound collection device where the echo canceling function can be fully utilized in a remote conference.

Patent Document 1: Japanese Patent Application Publication No. 2012-205201

With the spread of video distribution by general users, musical instrument performances are increasingly recorded using simple sound recording devices. However, the suitable sound collection position varies depending on the type of musical instrument. Without specialized knowledge of how to collect performance sounds, it is therefore difficult in practice to install the sound collection device at an appropriate position relative to the musical instrument. If the position of the sound collection device is inappropriate, problems such as excessive or insufficient recording volume or degraded timbre of the recorded sound may occur. In view of the above circumstances, one aspect of the present disclosure aims to enable a user to grasp a suitable sound collection position for a musical instrument without requiring knowledge of how to collect performance sounds.

In order to solve the above problems, an information processing method according to one aspect of the present disclosure identifies the type of a musical instrument and notifies a user of a sound collection position according to the identified type as the position where the performance sound of the musical instrument should be collected.

An information processing device according to one aspect of the present disclosure includes a musical instrument identification unit that identifies the type of a musical instrument, and a notification control unit that notifies a user of a sound collection position according to the identified type as the position where the performance sound of the musical instrument should be collected.

A program according to one aspect of the present disclosure causes a computer system to function as a musical instrument identification unit that identifies the type of a musical instrument, and a notification control unit that notifies a user of a sound collection position according to the identified type.

FIG. 1 is an explanatory diagram of a situation in which the information processing device of the first embodiment is used. FIG. 2 is a block diagram illustrating the configuration of the information processing device. FIG. 3 is a schematic diagram of a guide image. FIG. 4 is a block diagram illustrating the functional configuration of the information processing device. FIG. 5 is a flowchart of the musical instrument identifying process. FIG. 6 is a flowchart of the control process. FIG. 7 is a block diagram illustrating the functional configuration of the information processing device in the second embodiment. FIG. 8 is a block diagram illustrating the functional configuration of the information processing device in the third embodiment. FIG. 9 is a block diagram illustrating the functional configuration of the information processing device in the fourth embodiment. FIG. 10 is a schematic diagram of a guide image in the fourth embodiment. FIG. 11 is a flowchart of the control process in the fourth embodiment. FIG. 12 is a flowchart of the control process in the fifth embodiment.

A: First Embodiment

FIG. 1 is an explanatory diagram of a situation in which the information processing device 100 of the first embodiment is used. A user U of the information processing device 100 plays a musical instrument 200. In FIG. 1, a trumpet is illustrated as the musical instrument 200. The information processing device 100 is used as a recording system that records a moving image (video and audio) of a scene in which the user U plays the musical instrument 200. Specifically, the information processing device 100 captures video of the user U playing the musical instrument 200 and, in parallel, collects the performance sound emitted from the musical instrument 200. The performance sound is a musical sound emitted from the musical instrument 200 in response to the performance by the user U.

The position at which the performance sound should be collected (hereinafter referred to as "sound collection position") differs depending on the type of musical instrument 200 (hereinafter referred to as "instrument type"). The information processing device 100 of the first embodiment notifies the user U of the optimal sound collection position according to the type of musical instrument 200 that the user U actually plays. Therefore, the user U can collect performance sounds at the optimal sound collection position for the musical instrument 200 that the user U is playing.

FIG. 2 is a block diagram illustrating the configuration of the information processing device 100. The information processing device 100 includes a control device 11, a storage device 12, a communication device 13, an operating device 14, a sound collection device 15, an imaging device 16, a display device 17, and a sound emitting device 18. The information processing device 100 is realized by a portable information device such as a smartphone or a tablet terminal. Note that the information processing device 100 may instead be realized by a portable or stationary information device such as a personal computer. Further, the information processing device 100 may be realized not only as a single device but also as a plurality of devices configured separately from each other.

The control device 11 is one or more processors that control each element of the information processing device 100. Specifically, the control device 11 is composed of one or more types of processors such as a CPU (Central Processing Unit), GPU (Graphics Processing Unit), SPU (Sound Processing Unit), DSP (Digital Signal Processor), FPGA (Field Programmable Gate Array), or ASIC (Application Specific Integrated Circuit).

The sound collection device 15 is a microphone that generates the acoustic signal A by collecting surrounding sounds. Specifically, the performance sound of the musical instrument 200 is collected by the sound collection device 15. That is, the acoustic signal A is a signal representing the waveform of the performance sound. For example, the sound collection device 15 may be a single microphone or a microphone array in which a plurality of microphones are arranged in a straight line or in a matrix. Note that an A/D converter that converts the acoustic signal A from analog to digital and an amplifier that amplifies the acoustic signal A are omitted from the illustration for convenience.

The imaging device 16 is a camera that generates a video signal V by imaging surrounding objects. Specifically, an image of the user U playing the musical instrument 200 is captured. That is, the video signal V represents a moving image of a scene in which the user U plays the musical instrument 200. For example, the imaging device 16 includes an optical system such as a photographic lens, an imaging element that receives incident light from the optical system, and a processing circuit that generates a video signal V according to the amount of light received by the imaging element.

As described above, the sound collecting device 15 and the imaging device 16 record the moving image (video and audio) of the scene in which the user U plays the musical instrument 200. The sound collection device 15 and the imaging device 16 are integrally installed in the information processing device 100. That is, both the sound collection device 15 and the imaging device 16 are housed and supported in the portable housing of the information processing device 100.

As illustrated in FIG. 1, the information processing device 100 is installed at a specific position within the space where the user U is located. The user U moves while playing the musical instrument 200 so that the information processing device 100 (sound collection device 15) is located at the sound collection position notified by the information processing device 100. By adjusting the positional relationship between the information processing device 100 and the musical instrument 200 through the above-described procedure, it becomes possible to collect the performance sound of the musical instrument 200 at an appropriate sound collection position. In the first embodiment, the sound collection device 15 is installed in the information processing device 100, so the sound collection position notified by the information processing device 100 can also be referred to as the location where the information processing device 100 should be installed.

The communication device 13 in FIG. 2 communicates with an external device via a communication network 300 such as the Internet, for example. Note that communication between the communication device 13 and the communication network 300 may be either wired or wireless. Further, a communication device 13 that is separate from the information processing device 100 may be connected to the information processing device 100 by wire or wirelessly.

The communication device 13 of the first embodiment communicates with the distribution system 400 via the communication network 300. Specifically, the communication device 13 transmits video content C representing the performance of the musical instrument 200 by the user U to the distribution system 400. Video content C is content representing a video corresponding to audio signal A and video signal V. Distribution system 400 distributes video content C to other terminal devices (not shown). As described above, the user U can distribute the video content C containing his or her performance on the musical instrument 200 to a large number of terminal devices. Note that the communication device 13 may transmit the video content C to another terminal device via the communication network 300. That is, the distribution system 400 may be omitted. Furthermore, music content that does not include the video signal V may be transmitted from the information processing device 100.

The operating device 14 is an input device that accepts operations by the user U. For example, an operator operated by the user U or a touch panel that detects a touch by the user U is used as the operating device 14. Note that an operating device 14 separate from the information processing device 100 may be connected to the information processing device 100 by wire or wirelessly.

The storage device 12 is one or more memories that store programs executed by the control device 11 and various data used by the control device 11. For example, a known recording medium such as a semiconductor recording medium or a magnetic recording medium, or a combination of multiple types of recording media, is used as the storage device 12. Note that a portable recording medium that can be attached to and detached from the information processing device 100, or a recording medium that the control device 11 can access via the communication network 300 (e.g., cloud storage), may also be used as the storage device 12.

The storage device 12 of the first embodiment stores reference data R and guide data G for each of a plurality of musical instrument types. The reference data R for each musical instrument type is data representing acoustic characteristics regarding the standard performance sound of the musical instrument 200 of the musical instrument type. For example, the reference data R is a feature value representing the characteristics of the timbre (frequency characteristics) of the performance sound, such as an intensity spectrum, MFCC (Mel-Frequency Cepstrum Coefficients), or MSLS (Mel-Scale Log Spectrum).

The guide data G for each musical instrument type is data representing the optimal sound collection position for the musical instrument 200 of the musical instrument type. That is, a plurality of sound collection positions corresponding to different types of musical instruments are stored in the storage device 12. Each guide data G represents an image (hereinafter referred to as a "guide image") and a sound (hereinafter referred to as a "guide voice") for notifying the user U of the sound collection position. The guide image may be either a moving image or a still image. As understood from the above description, the storage device 12 of the first embodiment stores a plurality of guide images corresponding to different types of musical instruments and a plurality of guide voices corresponding to different types of musical instruments.
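As a rough illustration of the stored data described above, the following minimal sketch keeps one reference data R and one guide data G per musical instrument type. The class names, field names, file names, and feature values are hypothetical placeholders, not part of the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class GuideData:
    """Guide data G for one instrument type (field names are illustrative)."""
    image_path: str  # guide image (still or moving image)
    voice_path: str  # guide voice


@dataclass(frozen=True)
class InstrumentEntry:
    """Everything stored for one instrument type."""
    reference: List[float]  # reference data R, e.g. a timbre feature vector
    guide: GuideData        # guide data G


# Placeholder contents: the numbers and file names are made up for the sketch.
STORE: Dict[str, InstrumentEntry] = {
    "trumpet": InstrumentEntry([0.2, 0.9, 0.6], GuideData("trumpet.png", "trumpet.wav")),
    "clarinet": InstrumentEntry([0.8, 0.3, 0.5], GuideData("clarinet.png", "clarinet.wav")),
    "flute": InstrumentEntry([0.9, 0.1, 0.1], GuideData("flute.png", "flute.wav")),
}
```

Keying both kinds of data by instrument type means that once a type has been identified, the matching guide image and guide voice can be found with a single lookup.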

FIG. 3 is a schematic diagram of a guide image for the case where the musical instrument type is a trumpet. If the performance sound is collected directly in front of the bell of a trumpet, the airflow blown from the bell strikes the sound collection device 15 directly, and wind noise may be recorded along with the performance sound. Therefore, as illustrated in FIG. 3, the guide image corresponding to the trumpet indicates, as the sound collection position, a position separated from the tip of the bell by a predetermined distance (1 m to 2 m) on a straight line forming a predetermined angle (30° to 40°) with respect to the central axis of the bell.
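The geometry of the trumpet example can be sketched as follows. This is only an illustration in a 2D plane; the function name, the coordinate convention, and the sample values (35° and 1.5 m, chosen from within the stated ranges) are assumptions.

```python
import math
from typing import Tuple


def sound_collection_position(
    bell_tip: Tuple[float, float],
    axis_angle_deg: float,
    offset_angle_deg: float,
    distance_m: float,
) -> Tuple[float, float]:
    """Return the 2D point `distance_m` away from the bell tip on a line
    rotated `offset_angle_deg` away from the bell's central axis.

    `axis_angle_deg` is the direction of the central axis measured from +x.
    """
    theta = math.radians(axis_angle_deg + offset_angle_deg)
    return (
        bell_tip[0] + distance_m * math.cos(theta),
        bell_tip[1] + distance_m * math.sin(theta),
    )


# Bell tip at the origin, bell pointing along +x, 35 degrees off-axis, 1.5 m away.
x, y = sound_collection_position((0.0, 0.0), 0.0, 35.0, 1.5)
```

The returned point lies off the bell axis, which is the property the guide image relies on to keep the airflow from the bell away from the microphone.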

Further, for a woodwind instrument such as a clarinet or an oboe, for example, a position facing the side of the main body of the musical instrument 200 is indicated as the sound collection position by the guide data G. For a flute, for example, a position diagonally above the main body of the musical instrument 200 is indicated as the sound collection position by the guide data G. As described above, the sound collection position represented by the guide data G differs depending on the type of musical instrument.

The display device 17 in FIG. 2 displays images under the control of the control device 11. For example, the display device 17 of the first embodiment displays the guide image shown in FIG. 3 represented by the guide data G. Note that a display device 17 that is separate from the information processing device 100 may be connected to the information processing device 100 by wire or wirelessly.

The sound emitting device 18 emits sound waves under the control of the control device 11. For example, the sound emitting device 18 of the first embodiment reproduces the guide voice represented by the guide data G. The sound emitting device 18 is, for example, a speaker or headphones. Note that a sound emitting device 18 that is separate from the information processing device 100 may be connected to the information processing device 100 by wire or wirelessly.

FIG. 4 is a block diagram illustrating the functional configuration of the information processing device 100. By executing a program stored in the storage device 12, the control device 11 realizes a plurality of functions (musical instrument identification unit 21, notification control unit 22, recording processing unit 23) for notifying the user U of the sound collection position.

The musical instrument identification unit 21 identifies the type of musical instrument 200 played by the user U. The musical instrument identification unit 21 of the first embodiment identifies the type of musical instrument by analyzing the acoustic signal A generated by the sound collection device 15 by collecting performance sounds.

FIG. 5 is a flowchart of a process in which the musical instrument identification unit 21 identifies the type of musical instrument (hereinafter referred to as "musical instrument identifying process"). When the musical instrument identifying process is started, the musical instrument identification unit 21 generates feature data Q from the acoustic signal A (Sa11). The feature data Q is data representing the acoustic features of the acoustic signal A. Specifically, the same type of feature amount as the reference data R is generated as the feature data Q. That is, the feature data Q is, for example, a feature amount representing the characteristics of the timbre (frequency characteristics) of the performance sound.

The musical instrument identification unit 21 calculates the degree of similarity between the reference data R and the feature data Q for each of the plurality of reference data R stored in the storage device 12 (Sa12). The degree of similarity is an index representing the degree of similarity between the reference data R and the feature data Q. For example, distance or correlation is exemplified as the degree of similarity.

The musical instrument identification unit 21 identifies the musical instrument type corresponding to the reference data R having the maximum degree of similarity to the feature data Q among the plurality of reference data R as the type of musical instrument 200 played by the user U (Sa13). Note that maximum similarity means minimum distance or maximum correlation. As understood from the above description, in the first embodiment, the type of musical instrument is identified by focusing on the timbre (frequency characteristics) of the performance sound.
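Steps Sa12 and Sa13 amount to a nearest-reference search. The sketch below assumes that the feature data Q and each reference data R are plain numeric vectors of equal length and uses Euclidean distance as the similarity index; a correlation measure could be substituted. The function names and the vector values are hypothetical.

```python
import math
from typing import Dict, Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify_instrument(
    feature_q: Sequence[float],
    references: Dict[str, Sequence[float]],
) -> str:
    """Sa12-Sa13: return the type whose reference data R is closest to Q.

    Minimum distance is treated as maximum similarity.
    """
    return min(
        references,
        key=lambda name: euclidean_distance(feature_q, references[name]),
    )


# Made-up reference vectors standing in for stored timbre features.
REFERENCES = {
    "trumpet": [0.2, 0.9, 0.6],
    "clarinet": [0.8, 0.3, 0.5],
    "flute": [0.9, 0.1, 0.1],
}
identified = identify_instrument([0.25, 0.85, 0.55], REFERENCES)
```

With these placeholder values the observed vector lies closest to the trumpet reference, so `identified` is `"trumpet"`.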

The notification control unit 22 in FIG. 4 notifies the user U of the sound collection position according to the type of musical instrument. The notification control unit 22 of the first embodiment notifies the user U of the sound collection position using the guide data G (hereinafter referred to as "specific guide data G") corresponding to the musical instrument type identified by the musical instrument identification unit 21, among the plurality of guide data G corresponding to different musical instrument types. That is, the user U is notified of the sound collection position represented by the specific guide data G. Specifically, the notification control unit 22 displays the guide image represented by the specific guide data G on the display device 17, and causes the sound emitting device 18 to reproduce the guide voice represented by the specific guide data G.

As understood from the above description, the notification control unit 22 displays on the display device 17 the guide image corresponding to the type of musical instrument 200 played by the user U, from among the plurality of guide images corresponding to different types of musical instruments. Therefore, the user U can visually confirm a suitable sound collection position for the musical instrument 200 whose sound is to be collected by the sound collection device 15.

Additionally, the notification control unit 22 causes the sound emitting device 18 to reproduce the guide voice corresponding to the type of musical instrument 200 played by the user U, from among the plurality of guide voices corresponding to different types of musical instruments. Therefore, the user U can aurally confirm a suitable sound collection position for the musical instrument 200 whose sound is to be collected by the sound collection device 15.

The recording processing unit 23 generates video content C by recording the performance of the musical instrument 200 by the user U. The recording processing unit 23 performs various types of acoustic processing on the acoustic signal A generated by the sound collection device 15. The acoustic processing is signal processing that adjusts the acoustic characteristics of the performance sound represented by the acoustic signal A. For example, any known acoustic processing is performed on the acoustic signal A, such as effect adding processing that adds sound effects such as reverb or compression, or equalizing processing that adjusts the signal level for each frequency band. Note that the acoustic processing may be omitted. The recording processing unit 23 generates video content C including the video signal V generated by the imaging device 16 and the acoustic signal A after the acoustic processing. Furthermore, the recording processing unit 23 transmits the video content C from the communication device 13 to the distribution system 400.

FIG. 6 is a flowchart of the process (hereinafter referred to as "control process") executed by the control device 11. For example, the control process is started in response to an instruction from the user U to the operating device 14. When the user U instructs to start the control process, the user U plays the musical instrument 200. Specifically, the user U plays, for example, a specific pitch by operating the musical instrument 200.

When the control process is started, the control device 11 (musical instrument identifying unit 21) identifies the type of musical instrument 200 played by the user U through the musical instrument identifying process illustrated in FIG. 5 (Sa1). Specifically, the control device 11 identifies the type of musical instrument by analyzing the acoustic signal A representing the performance sound of a specific pitch played by the user U.

The notification control unit 22 notifies the user U of the sound collection position according to the type of musical instrument (Sa2, Sa3). Specifically, the notification control unit 22 displays the guide image represented by the specific guide data G on the display device 17 (Sa2), and causes the sound emitting device 18 to reproduce the guide voice represented by the specific guide data G (Sa3). As understood from the above description, the display device 17 and the sound emitting device 18 function as a notification device that notifies the user U of the sound collection position. The user U moves with the musical instrument 200 so that the information processing device 100 (sound collection device 15) is located at the sound collection position notified by the information processing device 100.

After adjusting the positional relationship between the information processing device 100 and the musical instrument 200 through the above procedure, the user U instructs to start recording by operating the operating device 14. When the user U instructs to start recording, the user U starts playing the musical instrument 200.

The control device 11 (recording processing unit 23) waits until the start of recording is instructed by the user U (Sa4: NO). When the start of recording is instructed (Sa4: YES), the control device 11 (recording processing unit 23) generates video content C using the acoustic signal A generated by the sound collection device 15 and the video signal V generated by the imaging device 16 (Sa5). The control device 11 (recording processing unit 23) then transmits the video content C from the communication device 13 to the distribution system 400 (Sa6).
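The control process of FIG. 6 (Sa1 to Sa6) can be outlined as below. This is a sketch only: every callable passed in is a hypothetical stand-in for the corresponding device or unit, and the polling loop is a simplification of waiting for the user's instruction.

```python
from typing import Callable, Dict, List


def control_process(
    identify: Callable[[], str],
    guide_data: Dict[str, Dict[str, str]],
    show_image: Callable[[str], None],
    play_voice: Callable[[str], None],
    recording_requested: Callable[[], bool],
    record: Callable[[], str],
    transmit: Callable[[str], None],
) -> None:
    """Run steps Sa1 to Sa6 once, using injected stand-in callables."""
    instrument_type = identify()            # Sa1: identify the instrument type
    guide = guide_data[instrument_type]     # select the specific guide data G
    show_image(guide["image"])              # Sa2: display the guide image
    play_voice(guide["voice"])              # Sa3: reproduce the guide voice
    while not recording_requested():        # Sa4: NO, keep waiting
        pass                                # a real device would block on an event
    content = record()                      # Sa5: generate video content C
    transmit(content)                       # Sa6: send C to the distribution system


# Stub devices that only log what they were asked to do.
log: List[str] = []
polls = iter([False, False, True])  # the user instructs recording on the third poll
control_process(
    identify=lambda: "trumpet",
    guide_data={"trumpet": {"image": "trumpet.png", "voice": "trumpet.wav"}},
    show_image=lambda path: log.append(f"show {path}"),
    play_voice=lambda path: log.append(f"play {path}"),
    recording_requested=lambda: next(polls),
    record=lambda: "content-C",
    transmit=lambda content: log.append(f"send {content}"),
)
```

Passing the devices in as callables keeps the ordering of the steps visible and lets the flow be exercised without any audio or video hardware.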

As explained above, in the first embodiment, the user U is notified of the sound collection position according to the type of musical instrument. Therefore, the user U can grasp a suitable sound collection position for the musical instrument 200 that he or she is playing without requiring any knowledge regarding the collection of performance sounds. Furthermore, the user U can collect the performance sound of the musical instrument 200 under suitable conditions by installing the information processing device 100 (sound collection device 15) at the notified sound collection position.

Further, in the first embodiment, the sound collection device 15 that collects performance sounds is installed in the portable information processing device 100 together with the musical instrument identification unit 21 and the notification control unit 22. Therefore, the user U can collect the performance sound with the sound collection device 15 simply by adjusting the positional relationship between the information processing device 100 and the musical instrument 200 while checking the sound collection position notified by the notification control unit 22.

B: Second Embodiment

The second embodiment will be described. In each aspect illustrated below, elements whose functions are similar to those in the first embodiment are given the same reference numerals as in the description of the first embodiment, and detailed descriptions of them are omitted as appropriate.

FIG. 7 is a block diagram illustrating the functional configuration of the information processing device 100 in the second embodiment. The control device 11 of the second embodiment realizes the same functions as those of the first embodiment (musical instrument identification section 21, notification control section 22, recording processing section 23). In the second embodiment, the operation of the musical instrument identifying section 21 is different from that in the first embodiment.

The musical instrument identification unit 21 of the second embodiment identifies the type of musical instrument by analyzing the video signal V generated by the imaging device 16 by imaging the musical instrument 200. The musical instrument identification unit 21 uses, for example, object detection with a trained model to identify the type of musical instrument. The trained model is a statistical estimation model that has learned the relationship between the video signal V and the type of musical instrument through machine learning using a large amount of training data. The trained model is composed of, for example, a deep neural network such as a convolutional neural network. Therefore, in the second embodiment, the reference data R illustrated in the first embodiment is omitted. Note that the method of identifying the type of musical instrument using the video signal V is not limited to the above example.

The control processing procedure in the second embodiment is the same as that in the first embodiment (FIG. 6). However, in the musical instrument identifying process (Sa1) of the second embodiment, the control device 11 (musical instrument identifying unit 21) identifies the type of musical instrument by analyzing the video signal V. The operation in which the notification control unit 22 notifies the user U of the sound collection position according to the type of musical instrument, and the operation in which the recording processing unit 23 generates the video content C are the same as in the first embodiment.

The same effects as in the first embodiment are achieved in the second embodiment as well. Furthermore, in the second embodiment, the type of musical instrument is identified by analyzing the video signal V generated by the imaging device 16. Therefore, the type of musical instrument can be identified even in an environment where it cannot be identified with high precision from the acoustic signal A, such as a noisy environment. On the other hand, in the first embodiment, the type of musical instrument is identified by analyzing the acoustic signal A generated by the sound collection device 15. Therefore, the type of musical instrument can be identified even in an environment where it cannot be identified with high precision from the video signal V, such as an environment with insufficient light. Note that the musical instrument identification unit 21 may identify the type of musical instrument by analyzing both the acoustic signal A and the video signal V.

C: Third Embodiment

FIG. 8 is a block diagram illustrating the functional configuration of the information processing device 100 in the third embodiment. The control device 11 of the third embodiment realizes the same functions as those of the first embodiment (musical instrument identification unit 21, notification control unit 22, recording processing unit 23). In the third embodiment, the operation of the musical instrument identification unit 21 differs from that in the first embodiment.

The musical instrument identification unit 21 of the third embodiment identifies the type of musical instrument according to an instruction from the user U. The user U can specify the type of musical instrument to the information processing device 100 by operating the operating device 14. For example, the user U selects the type of musical instrument 200 that he or she plays from among multiple candidates displayed on the display device 17. The musical instrument identification unit 21 identifies the type specified by the user U as the musical instrument type. Therefore, in the third embodiment as well, the reference data R is omitted as in the second embodiment.

The control processing procedure in the third embodiment is the same as that in the first embodiment (FIG. 6). However, in the musical instrument identifying process (Sa1) of the third embodiment, the control device 11 (musical instrument identifying unit 21) identifies the type of musical instrument according to an instruction from the user U. The operation in which the notification control unit 22 notifies the user U of the sound collection position according to the type of musical instrument, and the operation in which the recording processing unit 23 generates the video content C are the same as in the first embodiment.

The same effects as in the first embodiment are achieved in the third embodiment as well. Further, in the third embodiment, the type of musical instrument is identified according to an instruction from the user U. Therefore, the type of musical instrument can be identified even in an environment where it cannot be identified with high precision from the acoustic signal A or the video signal V. On the other hand, according to the first embodiment, in which the type of musical instrument is identified from the acoustic signal A, or the second embodiment, in which it is identified from the video signal V, no instruction from the user U is required to identify the type of musical instrument. Therefore, the effort required of the user U to specify the type of musical instrument can be reduced.

D: Fourth Embodiment

FIG. 9 is a block diagram illustrating the functional configuration of the information processing device 100 in the fourth embodiment. The control device 11 of the fourth embodiment functions as a feature extraction unit 24 in addition to realizing the same functions as those of the first embodiment (musical instrument identification unit 21, notification control unit 22, recording processing unit 23).

The feature extraction unit 24 generates feature data Q from the acoustic signal A generated by the sound collection device 15, similar to the musical instrument identification unit 21 of the first embodiment. The feature data Q is data representing the acoustic features of the acoustic signal A. Specifically, the feature data Q includes an observed value qx of the first feature amount and an observed value qy of the second feature amount. The first feature amount and the second feature amount are different types of feature amounts regarding the performance sound. For example, the first feature amount is a feature amount representing the timbre characteristic of the performance sound, and the second feature amount is a feature amount related to the volume of the performance sound. The observed value qx is the numerical value of the first feature extracted from the acoustic signal A, and the observed value qy is the numerical value of the second feature extracted from the acoustic signal A. The observed value qx and the observed value qy change depending on the positional relationship between the information processing device 100 and the musical instrument 200.
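The source does not name concrete feature amounts beyond "timbre" and "volume". As one possible reading, the sketch below uses the spectral centroid as the first feature amount (observed value qx) and the RMS level as the second (observed value qy); both choices, the function name, and the test signal are assumptions made purely for illustration.

```python
import cmath
import math
from typing import List, Tuple


def extract_features(samples: List[float], sample_rate: float) -> Tuple[float, float]:
    """Return (qx, qy): spectral centroid in Hz and RMS level of one frame."""
    n = len(samples)
    if n == 0:
        raise ValueError("samples must not be empty")
    # Naive DFT magnitudes for bins 0..n/2 (O(n^2), acceptable for a short frame).
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))
        mags.append(abs(acc))
    total = sum(mags)
    # Magnitude-weighted mean frequency; 0.0 for a silent frame.
    qx = sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total if total else 0.0
    qy = math.sqrt(sum(s * s for s in samples) / n)
    return qx, qy


# A 1 kHz sine sampled at 8 kHz for 64 samples (exactly 8 cycles).
frame = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(64)]
qx, qy = extract_features(frame, 8000)
```

For this test tone the centroid comes out at about 1000 Hz and the RMS level at about 0.707. In practice both values would shift as the microphone moves relative to the instrument, which is the dependence on position that the passage above describes.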

Similar to the first embodiment, the storage device 12 of the fourth embodiment stores a plurality of reference data R corresponding to different types of musical instruments. Each reference data R includes a reference value rx of the first feature amount and a reference value ry of the second feature amount. The reference value rx corresponding to each musical instrument type is a numerical value of the first feature amount regarding the standard performance sound of the musical instrument 200 of the musical instrument type. The reference value ry corresponding to each musical instrument type is a numerical value of the second feature amount regarding the standard performance sound of the musical instrument 200 of the musical instrument type. That is, the reference value rx corresponds to a standard or ideal numerical value of the observed value qx, and the reference value ry corresponds to a standard or ideal numerical value of the observed value qy. A suitable sound collection position for a specific type of musical instrument 200 is the point at which the observed value qx approximates (ideally matches) the reference value rx, and the observed value qy approximates (ideally matches) the reference value ry, in the reference data R of the relevant musical instrument type.

The musical instrument identifying unit 21 of the fourth embodiment identifies the musical instrument type through musical instrument identifying processing that compares the feature data Q generated by the feature extracting unit 24 with the reference data R of each musical instrument type. Specifically, the musical instrument identifying unit 21 identifies, as the type of the musical instrument 200 played by the user U, the musical instrument type corresponding to the reference data R having the maximum degree of similarity with the feature data Q among the plurality of reference data R (Sa12, Sa13).
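The selection of the reference data R with the maximum degree of similarity can be sketched as below. The similarity measure is not specified in the description; the inverse-distance form used here, the table `REFERENCE_DATA`, and its numerical values are all hypothetical.

```python
import math

# Hypothetical reference data R: instrument type -> (rx, ry).
# The numbers are illustrative and not taken from the description.
REFERENCE_DATA = {
    "guitar": (0.30, 0.20),
    "piano": (0.50, 0.35),
    "trumpet": (0.80, 0.60),
}


def similarity(q, r):
    """Degree of similarity in (0, 1]; 1.0 means Q and R coincide (assumed measure)."""
    return 1.0 / (1.0 + math.dist(q, r))


def identify_instrument(q, reference_data=REFERENCE_DATA):
    """Return the instrument type whose reference data R is most similar to Q."""
    return max(reference_data, key=lambda kind: similarity(q, reference_data[kind]))
```

For example, `identify_instrument((0.78, 0.55))` selects `"trumpet"` with the illustrative table above, because that entry lies closest to the observed values.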

The notification control unit 22 of the fourth embodiment displays the guide image of FIG. 10 on the display device 17. An X-axis and a Y-axis that are perpendicular to each other are set in the guide image. The X-axis is a number line representing the numerical value of the first feature amount (timbre), and the Y-axis is a number line representing the numerical value of the second feature amount (volume).

The reference point Zr in FIG. 10 is a point corresponding to the reference value rx and the reference value ry of the reference data R (hereinafter referred to as "specific reference data R") that corresponds, among the plurality of reference data R, to the musical instrument type identified by the musical instrument identifying unit 21. That is, the X coordinate of the reference point Zr corresponds to the reference value rx, and the Y coordinate of the reference point Zr corresponds to the reference value ry. The reference point Zr is a point determined depending on the type of musical instrument. On the other hand, the observation point Zq in FIG. 10 is a point corresponding to the feature data Q. That is, the X coordinate of the observation point Zq corresponds to the observed value qx, and the Y coordinate of the observation point Zq corresponds to the observed value qy. Therefore, the observation point Zq moves within the XY plane depending on the positional relationship between the musical instrument 200 played by the user U and the information processing device 100 (sound collection device 15). When the information processing device 100 is at an ideal sound collection position with respect to the musical instrument 200, the observation point Zq sufficiently approximates (ideally coincides with) the reference point Zr. That is, the specific reference data R and the feature data Q are sufficiently approximate. The user U moves with the musical instrument 200 relative to the information processing device 100 while checking the guide image so that the observation point Zq approaches the reference point Zr. As can be understood from the above description, the notification control unit 22 of the fourth embodiment notifies the user U of the sound collection position so that the feature data Q approaches the specific reference data R, that is, the reference data R corresponding, among the plurality of reference data R, to the musical instrument type identified by the musical instrument identifying unit 21.

FIG. 11 is a flowchart of control processing in the fourth embodiment. For example, the control process is started in response to an instruction from the user U to the operating device 14. When the user U instructs to start the control process, the user U plays the musical instrument 200. Specifically, the user U plays, for example, a specific pitch by operating the musical instrument 200.

When the control process is started, the control device 11 (feature extraction unit 24) generates feature data Q from the acoustic signal A generated by the sound collection device 15 (Sb1). The control device 11 (musical instrument identifying unit 21) identifies the type of musical instrument through musical instrument identifying processing using the feature data Q (Sb2). Specifically, the control device 11 specifies, from among the plurality of reference data R, the type of musical instrument corresponding to the reference data R having the maximum degree of similarity to the feature data Q. The degree of similarity may be calculated between a portion of the reference data R and a portion of the feature data Q. For example, the degree of similarity may be calculated between the observed value qx of the feature data Q and the reference value rx of the reference data R. As understood from the above description, in the fourth embodiment as well, the type of musical instrument is identified by analyzing the acoustic signal A, similarly to the first embodiment.

On the other hand, the user U moves with the musical instrument 200 while visually checking the guide image displayed on the display device 17. When the control device 11 specifies the type of musical instrument, the following process is executed to sequentially update the guide images in parallel with the movement by the user U.

The control device 11 (feature extraction unit 24) generates feature data Q from the acoustic signal A (Sb3). Then, the control device 11 (notification control section 22) displays the guide image of FIG. 10 on the display device 17 (Sb4). Specifically, the guide image includes the reference point Zr corresponding to the specific reference data R and the observation point Zq corresponding to the feature data Q, as described above. As described above, the specific reference data R is the reference data R that corresponds to the type of musical instrument 200 played by the user U, out of the plurality of reference data R stored in the storage device 12.

The control device 11 (notification control unit 22) determines whether the observation point Zq has sufficiently approached the reference point Zr (Sb5). Specifically, the control device 11 determines whether the degree of similarity between the specific reference data R and the feature data Q exceeds a predetermined threshold. That is, it is determined whether the observation point Zq is located within a predetermined range that includes the reference point Zr.
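The two formulations of step Sb5 (similarity above a threshold, and observation point Zq inside a range around the reference point Zr) can be related by a small sketch. It assumes, purely for illustration, a similarity of the form 1 / (1 + d), where d is the Euclidean distance between Zq and Zr; under that assumption the threshold test is the same as testing whether Zq lies inside a circle around Zr. The function names and the default threshold are hypothetical.

```python
import math


def has_reached_reference(q, r, threshold=0.9):
    """Step Sb5: True when the similarity between Q and R exceeds the threshold."""
    similarity = 1.0 / (1.0 + math.dist(q, r))  # assumed similarity measure
    return similarity > threshold


def equivalent_radius(threshold):
    """Radius of the circle around Zr that corresponds to the similarity threshold."""
    # 1 / (1 + d) > threshold  is equivalent to  d < 1 / threshold - 1
    return 1.0 / threshold - 1.0
```

A higher threshold gives a smaller circle and therefore a stricter requirement on the sound collection position.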

If the observation point Zq is not close to the reference point Zr (Sb5: NO), the control device 11 moves the process to step Sb3. That is, the generation of the feature data Q (Sb3) and the updating of the guide image (Sb4) are repeated until the observation point Zq approaches the reference point Zr. The user U moves with the musical instrument 200 so that the observation point Zq approaches the reference point Zr while checking the guide image. As the user U moves, the position of the observation point Zq in the guide image is changed at any time.

When the observation point Zq is sufficiently close to the reference point Zr, the information processing device 100 is at a suitable sound collection position with respect to the musical instrument 200. Therefore, when the observation point Zq approaches the reference point Zr sufficiently (Sb5: YES), the control device 11 (notification control unit 22) notifies the user U that the information processing device 100 and the musical instrument 200 are in a suitable positional relationship (Sb6). The user U is notified, for example, by displaying an image on the display device 17 or by reproducing audio using the sound emitting device 18. As understood from the above description, in the fourth embodiment as well, the user U is notified of the sound collection position according to the type of musical instrument 200 played by the user U, similarly to the first embodiment. However, in the fourth embodiment, the guide data G of the first embodiment is unnecessary.
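The loop formed by steps Sb3 to Sb6 might be sketched as follows. The feature data Q is supplied here as an iterable of (qx, qy) pairs rather than computed from live audio, the guide image is reduced to a text message, and the similarity measure and threshold are assumptions; `guide_until_reached` and its `show` parameter are hypothetical names.

```python
import math


def guide_until_reached(feature_stream, reference, threshold=0.9, show=print):
    """Run steps Sb3 to Sb6; return True if Zq came sufficiently close to Zr."""
    for q in feature_stream:                         # Sb3: next feature data Q
        show(f"Zq={q} Zr={reference}")               # Sb4: update the guide (text only)
        similarity = 1.0 / (1.0 + math.dist(q, reference))  # assumed measure
        if similarity > threshold:                   # Sb5: close enough?
            show("Suitable sound collection position reached")  # Sb6
            return True
    return False  # the stream ended before the reference point was reached


if __name__ == "__main__":
    observed = [(0.1, 0.1), (0.4, 0.4), (0.5, 0.5)]
    guide_until_reached(observed, reference=(0.5, 0.5))
```

Passing a different `show` callable makes it possible to route the same messages to a display or to a sound emitting device.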

The control device 11 (recording processing unit 23) waits until the start of recording is instructed by the user U (Sb7: NO). When the start of recording is instructed (Sb7: YES), the control device 11 (recording processing unit 23) generates video content C using the acoustic signal A generated by the sound collection device 15 and the video signal V generated by the imaging device 16 (Sb8). The control device 11 (recording processing unit 23) transmits the video content C from the communication device 13 to the distribution system 400 (Sb9). Note that if the start of recording is not instructed (Sb7: NO), the control device 11 may shift the process to step Sb3.

As explained above, in the fourth embodiment, the user U is notified of the sound collection position according to the type of musical instrument. Therefore, similarly to the first embodiment, the user U can grasp a suitable sound collection position for the musical instrument 200 that he/she plays without needing knowledge regarding the collection of performance sounds. Furthermore, the user U can collect the performance sound of the musical instrument 200 under suitable conditions by installing the information processing device 100 (sound collection device 15) at the notified sound collection position.

In the fourth embodiment, the sound collection position is notified so that the feature data Q representing the characteristics of the performance sound of the musical instrument 200 approaches the reference data R corresponding to the type of the musical instrument 200. Therefore, the user U can be notified not merely of a sound collection position assumed in advance for the musical instrument 200, but of a sound collection position at which the performance sound can be properly collected in the environment in which the user U actually plays the musical instrument 200.

In the fourth embodiment, in particular, the acoustic signal A and the feature extraction unit 24 are shared between the identification of the musical instrument type and the notification of the sound collection position. Therefore, the configuration and processing of the information processing device 100 can be simplified compared to the first embodiment, in which guide data G separate from the acoustic signal A is required to notify the sound collection position.

In the above description, the type of musical instrument is specified by analyzing the acoustic signal A, but the method of the second embodiment or the third embodiment may be applied to the identification of the type of musical instrument (Sb2) in the fourth embodiment. That is, in the fourth embodiment, a configuration in which the type of musical instrument is identified by analyzing the video signal V (second embodiment), or a configuration in which the type of musical instrument is identified in response to an instruction from the user U (third embodiment), may be adopted.

E: Fifth Embodiment The functional configuration of the information processing apparatus 100 in the fifth embodiment is the same as that in the fourth embodiment (FIG. 9). FIG. 12 is a flowchart of control processing in the fifth embodiment. For example, the control process is started in response to an instruction from the user U to the operating device 14. When the user U instructs to start the control process, the user U plays the musical instrument 200.

When the control process is started, the control device 11 (feature extraction unit 24) generates feature data Q from the acoustic signal A (Sc1). The control device 11 (musical instrument identifying unit 21) identifies the type of musical instrument through musical instrument identifying processing using the feature data Q (Sc2). That is, in the fifth embodiment as well, the type of musical instrument is identified by analyzing the acoustic signal A, similarly to the first to fourth embodiments.

The control device 11 (notification control unit 22) executes the first process (Sc3). The first process is a process of notifying the user U of the sound collection position using the specific guidance data G stored in the storage device 12, as in the first embodiment. Specifically, the first process includes a process (Sa2) of displaying the guide image shown in FIG. (Sa3). That is, in the first process, the control device 11 (notification control unit 22) notifies the user U of the sound collection position that corresponds, among the plurality of sound collection positions stored in the storage device 12, to the musical instrument type specified by the musical instrument identification process.

The control device 11 (notification control unit 22) executes the second process after executing the first process (Sc4). The second process is a process of notifying the user U of the sound collection position so that the feature data Q approaches the specific reference data R, as in the fourth embodiment. Specifically, the second process includes a process of repeating the generation of the feature data Q (Sb3) and the updating of the guide image (Sb4) until the observation point Zq approaches the reference point Zr, and a process (Sb5, Sb6) of notifying the user U that the observation point Zq has approached the reference point Zr.

When the positional relationship between the information processing device 100 and the musical instrument 200 has been adjusted through the above procedure, the control device 11 (recording processing unit 23) generates (Sc6) and transmits (Sc7) the video content C in response to a recording instruction from the user U (Sc5), as in the first to fourth embodiments.
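The coarse-then-fine ordering of the first process (Sc3) and the second process (Sc4) can be illustrated with a short sketch. The dictionaries standing in for the guide data G and the reference data R, the message strings, the similarity measure, and the function name `two_stage_guidance` are all hypothetical.

```python
import math


def two_stage_guidance(kind, guide_data, reference_data, feature_stream,
                       notify, threshold=0.9):
    """Notify a stored position first (Sc3), then refine with feature data (Sc4)."""
    # First process (Sc3): approximate position stored in advance for this type.
    notify(guide_data[kind])
    # Second process (Sc4): fine adjustment until Q is close enough to R.
    r = reference_data[kind]
    for q in feature_stream:
        similarity = 1.0 / (1.0 + math.dist(q, r))  # assumed similarity measure
        if similarity > threshold:
            notify("Suitable sound collection position reached")
            return True
        notify(f"Keep adjusting: Q={q}, R={r}")
    return False
```

Because the stored position is announced before any feature data is evaluated, the user starts the fine adjustment from a position that is already roughly suitable.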

The same effects as the first and fourth embodiments are achieved in the fifth embodiment as well. In the fifth embodiment, in particular, the user U is notified in the first process (Sc3) of the approximate sound collection position stored in advance in the storage device 12, and in the second process (Sc4) of a specific sound collection position at which the feature data Q approaches the reference data R. Therefore, compared to the first embodiment, in which only the general sound collection position stored in advance is notified to the user U, the user U can be notified of a sound collection position at which the performance sound can be properly collected in the environment in which the user U actually plays the musical instrument 200. Further, after the positional relationship between the information processing device 100 and the musical instrument 200 is roughly adjusted by the first process, it can be finely adjusted according to the performance environment by the second process. Therefore, the work of adjusting the information processing device 100 and the musical instrument 200 to an appropriate positional relationship is easier than in the fourth embodiment, in which the user U is notified only of the sound collection position at which the feature data Q and the reference data R approach each other.

F: Modifications Specific modifications added to each of the embodiments exemplified above will be exemplified below. A plurality of aspects arbitrarily selected from the above-described embodiment and the modified examples illustrated below may be combined as appropriate to the extent that they do not contradict each other.

(1) The method of notifying the user U of the sound collection position according to the type of musical instrument is not limited to the above example. For example, the notification control unit 22 may calculate the volume of the performance sound (hereinafter referred to as "observed volume") by analyzing the acoustic signal A, and notify the user U of the sound collection position so that the observed volume approaches (ideally matches) a predetermined value. For example, when the observed volume is below the predetermined value, the notification control unit 22 notifies the user U to move the sound collection position closer to the musical instrument 200. On the other hand, when the observed volume exceeds the predetermined value, the notification control unit 22 notifies the user U to move the sound collection position away from the musical instrument 200.
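A minimal sketch of this volume-based notification is given below. The tolerance band around the predetermined value is an added assumption (the text only distinguishes "below" and "exceeds"), and the function name and message strings are hypothetical.

```python
def volume_hint(observed_volume, target_volume, tolerance=0.05):
    """Return a movement hint based on the observed volume.

    The tolerance band is an assumption added so that the hint can settle.
    """
    if observed_volume < target_volume - tolerance:
        return "Move the sound collection position closer to the instrument"
    if observed_volume > target_volume + tolerance:
        return "Move the sound collection position away from the instrument"
    return "Sound collection position is suitable"
```

Setting `tolerance=0` reproduces the strict below/above comparison, at the cost of a hint that rarely reports a suitable position.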

(2) In each of the above embodiments, a scene in which one musical instrument 200 is played is assumed, but each of the above embodiments is also applicable to a scene in which a plurality of musical instruments 200 are played. Each of the plurality of musical instruments 200 is played by a different user U. For example, each of a plurality of users U, each in charge of one performance part of a band, plays the musical instrument 200 of that performance part in one acoustic space. In the above situation, the notification control unit 22 notifies each user U of suitable sound collection positions for the plurality of musical instruments 200. Specifically, the notification control unit 22 notifies each user U of sound collection positions at which the feature data Q of the performance sounds of the musical instruments 200 are similar to each other across the plurality of musical instruments 200. For example, the user U is notified of sound collection positions at which the volume or timbre of the performance sound of each musical instrument 200 is similar across the plurality of musical instruments 200. Therefore, the user U can grasp suitable sound collection positions for the plurality of musical instruments 200 without requiring knowledge regarding the collection of performance sounds.

For example, a plurality of guide data G corresponding to different combinations of musical instrument type and number of musical instruments 200 are stored in the storage device 12. The guide data G corresponding to a combination of a specific musical instrument type and a specific number represents a suitable sound collection position in an environment where that number of musical instruments 200 of that musical instrument type are present. The user U can input the number of musical instruments 200 by operating the operating device 14. The notification control unit 22 acquires from the storage device 12 the guide data G (specific guide data G) corresponding to the combination of the musical instrument type identified by the musical instrument identification unit 21 and the number of musical instruments 200 instructed by the user U, and notifies the user U of the sound collection position represented by that guide data G. According to the above configuration, in a scene where a plurality of musical instruments 200 are played in parallel, the user U can be notified of a sound collection position that is suitable overall for the plurality of musical instruments 200.
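The lookup of guide data G by the combination of musical instrument type and number of instruments can be sketched as a keyed table. The table contents, the name `select_guide_data`, and the choice to return `None` for an unregistered combination are assumptions made for illustration.

```python
# Hypothetical guide data G keyed by (instrument type, number of instruments).
# The guidance texts are placeholders, not positions taken from the description.
GUIDE_DATA = {
    ("violin", 1): "guide for one violin",
    ("violin", 2): "guide for two violins",
    ("trumpet", 1): "guide for one trumpet",
}


def select_guide_data(kind, count, guide_data=GUIDE_DATA):
    """Return the guide data G for the identified type and the user-entered count."""
    # None signals that no guide data is stored for this combination.
    return guide_data.get((kind, count))
```

Here `kind` would come from the musical instrument identification and `count` from the user's operation of the operating device.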

(3) In each of the above embodiments, the performance sound of the musical instrument 200 is collected by one information processing device 100 (sound collection device 15), but the above embodiments are also applicable to the case where the performance sound of the musical instrument 200 is collected by a plurality of sound collection devices 15 installed at different positions. Specifically, the notification control unit 22 sequentially notifies the user U of each of a plurality of sound collection positions having different positional relationships with the musical instrument 200, using the methods of the first to fourth embodiments.

For example, a plurality of guide data G corresponding to different combinations of musical instrument type and positional relationship are stored in the storage device 12. That is, for one type of musical instrument, a plurality of pieces of guide data G corresponding to different positional relationships are stored. For example, when the type of musical instrument is a piano, guide data G is stored for each of a plurality of positional relationships, such as beside, above, and below the musical instrument 200. Additionally, when the type of musical instrument is a drum set consisting of multiple different drums, guide data G is stored for each of a plurality of positional relationships, such as near the bass drum, near the hi-hat, and near the snare drum. For each of the plurality of positional relationships, the notification control unit 22 acquires from the storage device 12 the guide data G corresponding to the combination of the musical instrument type identified by the musical instrument identification unit 21 and that positional relationship, and notifies the user U of the sound collection position represented by the guide data G. According to the above configuration, the user U can be notified of a plurality of different sound collection positions for the musical instrument 200.

(4) In each of the above embodiments, the information processing device 100 is equipped with the sound collection device 15, but a sound collection device 15 separate from the information processing device 100 may be connected to the information processing device 100 by wire or wirelessly. Similarly, an imaging device 16 separate from the information processing device 100 may be connected to the information processing device 100 by wire or wirelessly.

(5) In each of the above embodiments, the case where the user U moves with the musical instrument 200 relative to the information processing device 100 was illustrated, but the method for adjusting the positional relationship between the information processing device 100 and the musical instrument 200 is not limited to the above example. For example, the positional relationship between the information processing device 100 and the musical instrument 200 may be adjusted by moving the information processing device 100 relative to the user U.

(6) In each of the above embodiments, the case where the performance sound of the musical instrument 200 is collected was illustrated, but each of the above embodiments is also applicable when a singer's singing voice is collected by the sound collection device 15. The musical instrument 200 and the singer are collectively expressed as sound sources that radiate musical sounds. Further, the configuration for notifying the user U of the sound collection position in each of the above embodiments is similarly applicable to the case where sound emitted from any sound source, not limited to the musical instrument 200 and the singer, is collected.

(7) For example, the information processing device 100 in each of the above embodiments may be realized by a server device that communicates with a terminal device such as a smartphone or a tablet terminal. The sound collection device 15 and the imaging device 16 are mounted on or connected to a terminal device. The terminal device transmits an acoustic signal A generated by the sound collection device 15 and a video signal V generated by the imaging device 16 to the information processing device 100. The information processing device 100 functions as the musical instrument specifying section 21 and the notification control section 22 illustrated in each of the above-described embodiments, and transmits notification data for notifying a suitable sound collection position for the type of musical instrument to the terminal device. The notification data is, for example, image data for displaying the guide image illustrated in FIG. 3 or FIG. 10 on the terminal device.

(8) As described above, the functions of the information processing device 100 according to each of the above embodiments are realized through cooperation between one or more processors that constitute the control device 11 and the programs stored in the storage device 12. The programs exemplified above may be provided in a form stored in a computer-readable recording medium and installed on a computer. The recording medium is, for example, a non-transitory recording medium, a good example of which is an optical recording medium (optical disc) such as a CD-ROM, but it also includes any known form of recording medium, such as a semiconductor recording medium or a magnetic recording medium. Note that the non-transitory recording medium includes any recording medium excluding transitory, propagating signals, and does not exclude volatile recording media. Furthermore, in a configuration in which a distribution device distributes a program via a communication network, a recording medium that stores the program in the distribution device corresponds to the above-mentioned non-transitory recording medium.

G: Supplementary Note From the forms exemplified above, for example, the following configurations can be understood.

An information processing method according to one aspect (aspect 1) of the present disclosure identifies the type of a musical instrument, and notifies a user of a sound collection position corresponding to the identified type as a position where the performance sound of the musical instrument should be collected. In the above aspect, the user is notified of the sound collection position according to the type of musical instrument. Therefore, the user can grasp suitable sound collection positions for various musical instruments without requiring any knowledge regarding the collection of performance sounds. Furthermore, by installing the sound collection device at the notified sound collection position, the user can collect the performance sound of the musical instrument under suitable conditions.

A "musical instrument" is a natural musical instrument or an electronic musical instrument that emits sound waves into space according to the performance movement. "Types of musical instruments" are groups of musical instruments classified according to, for example, the nature or form of the musical instruments. For example, in addition to classifications according to the sound-producing mechanism, such as wind instruments, string instruments, percussion instruments, or keyboard instruments, "types of musical instruments" include classifications that further subdivide those classifications. For example, for wind instruments, intermediate classifications such as brass instruments and woodwind instruments, specific types of instruments belonging to brass instruments (for example, trumpets, trombones, or horns), and specific types of instruments belonging to woodwind instruments (for example, clarinets, saxophones, or oboes) are also included in "types of musical instruments." For string instruments, intermediate classifications such as bowed string instruments and plucked string instruments, specific types of instruments belonging to bowed string instruments (for example, violins, cellos, or double basses), and specific types of instruments belonging to plucked string instruments (for example, guitars, ukuleles, or mandolins) are also included in "types of musical instruments." For keyboard instruments, types such as pianos (string instruments) and organs are included in "types of musical instruments", for example. For percussion instruments, intermediate classifications such as membranophones and idiophones, specific types of instruments belonging to membranophones (for example, drums or timpani), and specific types of instruments belonging to idiophones (for example, cymbals or triangles) are included in "types of musical instruments."

"Sound collection position" means a suitable position for collecting the sound of a musical instrument. A suitable sound collection position for collecting performance sounds differs depending on the type of musical instrument. That is, the "sound collection position according to the type of musical instrument" is a suitable position for collecting the performance sound of the musical instrument belonging to the relevant type. Therefore, if the type of musical instrument is different, the sound pickup position will also be different. For example, the sound collection position corresponding to the first type of musical instrument is different from the sound collection position corresponding to the second type of musical instrument.

"Notification" means outputting the sound collection position in a manner that the user can recognize. For example, it is possible to notify the user of the sound collection position by displaying a guide image, or to notify the user of the sound collection position by playing a guidance voice. Furthermore, the user may be notified of the sound collection position, for example, by printing a guide image.

In a specific example of aspect 1 (aspect 2), in identifying the type, the type of the musical instrument is identified by analyzing an acoustic signal generated by a sound collection device by collecting the performance sound of the musical instrument. In the above aspect, the type of musical instrument is identified by analyzing the acoustic signal generated by the sound collection device. Therefore, the effort required of the user to specify the type of musical instrument can be reduced. Furthermore, the type of musical instrument can be identified even in an environment where the type of musical instrument cannot be identified with high precision from the video signal, such as an environment where there is not a sufficient amount of light.

Known techniques may be arbitrarily adopted for the process of identifying the type of musical instrument by analyzing the acoustic signal. For example, feature data representing the features of the acoustic signal is compared with each of a plurality of reference data corresponding to different types of musical instruments, and the type of musical instrument corresponding to the reference data similar to the feature data is identified.

In a specific example of aspect 1 or aspect 2 (aspect 3), in identifying the type, the type of the musical instrument is identified by analyzing a video signal generated by an imaging device by imaging the musical instrument. In the above aspect, the type of musical instrument is identified by analyzing the video signal generated by the imaging device. Therefore, the effort required of the user to specify the type of musical instrument can be reduced. Further, the type of musical instrument can be identified even in an environment where the type of musical instrument cannot be identified with high precision from the acoustic signal, such as a noisy environment.

A known technique is arbitrarily adopted for the process of identifying the type of musical instrument by analyzing the video signal. For example, a trained model for object detection that has learned the relationship between a video signal and the type of musical instrument is used to identify the type of musical instrument.

In a specific example of any one of aspects 1 to 3 (aspect 4), in specifying the type, the type of the musical instrument is specified according to an instruction from the user. In the above aspect, since the type of musical instrument is specified according to an instruction from the user, the type of musical instrument can be specified even in an environment where the type of musical instrument cannot be specified with high precision from an audio signal or a video signal.

The "instruction from the user" is, for example, an instruction for the user to select the type of musical instrument that the user envisions as the target of sound collection from a plurality of candidates prepared in advance. For example, an operation on an operating device is accepted as an instruction from a user.

In a specific example of any one of Aspects 1 to 4 (Aspect 5), in notifying the sound collection position, among a plurality of guide images that guide sound collection positions for different types of musical instruments, the guide image corresponding to the identified type is displayed on a display device. In the above aspect, the user can visually confirm a suitable sound collection position for the musical instrument assumed to be the sound collection target.

In a specific example of any one of Aspects 1 to 5 (Aspect 6), in notifying the sound collection position, among a plurality of guidance sounds that guide sound collection positions for different types of musical instruments, the guidance sound corresponding to the identified type is emitted by a sound emitting device. In the above aspect, the user can aurally confirm a suitable sound collection position for the musical instrument assumed to be the sound collection target.
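
Selecting the guide that corresponds to the identified type, as in aspects 5 and 6, amounts to a lookup keyed by instrument type. The sketch below assumes a hypothetical `Guide` record and placeholder file paths; none of these names appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guide:
    image_path: str  # guide image shown on the display device
    sound_path: str  # guidance sound played by the sound emitting device


# Hypothetical guide assets, one entry per instrument type (placeholder paths).
GUIDES = {
    "piano": Guide("guides/piano.png", "guides/piano.wav"),
    "guitar": Guide("guides/guitar.png", "guides/guitar.wav"),
}


def select_guide(instrument_type):
    """Return the guide assets registered for the identified type."""
    try:
        return GUIDES[instrument_type]
    except KeyError:
        raise ValueError(f"no guide registered for {instrument_type!r}") from None
```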

In a specific example of any one of aspects 1 to 6 (aspect 7), feature data representing acoustic characteristics of an acoustic signal generated by a sound collection device by collecting the performance sound of the musical instrument is generated, and, in notifying the sound collection position, the user is notified of the sound collection position so that the feature data approaches reference data corresponding to the identified type among a plurality of reference data corresponding to different types of musical instruments. In the above aspect, the sound collection position is notified so that the feature data representing the characteristics of the performance sound of the musical instrument approaches the reference data corresponding to the type of musical instrument. Therefore, the notification is not limited to a sound collection position assumed in advance for the musical instrument; the user can be notified of a sound collection position where the performance sound can be collected well in the environment in which the user actually plays the instrument.

"Feature data" and "reference data" are information representing acoustic characteristics of performance sounds. For example, feature amounts representing the volume or timbre of performance sounds are used as "feature data" and "reference data." That is, for example, under the condition that the volume represented by the feature data and the volume represented by the reference data are brought close to each other, it is possible to notify the user of a sound collection position where the sound of a musical instrument can be collected at an appropriate volume. Further, for example, according to the condition that the timbre represented by the feature data and the timbre represented by the reference data are brought close to each other, it is possible to inform the user of a sound collection position where the sound of the musical instrument can be collected with an appropriate timbre. Examples of the feature amount representing the timbre include a frequency spectrum envelope or a mel spectrum.

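As one way to picture the volume condition described above, the following minimal sketch computes an RMS level from audio samples and compares it with a reference level. The function names, the 3 dB tolerance, and the hint wording are assumptions for illustration. In particular, mapping "too loud" to "move away" is an assumed convention and is not stated in this passage.

```python
import math


def rms_level_db(samples):
    """Return the RMS level in dB relative to full scale (amplitude 1.0)."""
    if not samples:
        raise ValueError("samples must not be empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)


def volume_hint(observed_db, reference_db, tolerance_db=3.0):
    """Compare the observed level with the reference level for the type."""
    diff = observed_db - reference_db
    if diff > tolerance_db:
        return "move away"    # assumed convention: too loud
    if diff < -tolerance_db:
        return "move closer"  # assumed convention: too quiet
    return "ok"
```

A timbre feature such as a spectral envelope could be compared with its reference in the same way, using a vector distance instead of a scalar difference.
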
In a specific example of Aspect 7 (Aspect 8), in identifying the type, the type of musical instrument is identified by analyzing the acoustic signal. In the above aspect, the acoustic signal is shared between identifying the type of musical instrument and generating the feature data used for notifying the sound collection position. Therefore, the configuration and processing can be simplified compared to a configuration in which data separate from the acoustic signal is required to notify the sound collection position.

Note that, for example, feature data extracted from the acoustic signal is used to identify the type of musical instrument. For example, among a plurality of reference data corresponding to different types of musical instruments, the type corresponding to the reference data most similar to the feature data is specified. In the above aspect, the configuration and processing for generating feature data from an acoustic signal are shared for identifying the type of musical instrument and notifying the sound collection position.

In a specific example of aspect 7 or aspect 8 (aspect 9), the feature data includes a first observed value of a first feature amount and a second observed value of a second feature amount, and the reference data includes a first reference value of the first feature amount and a second reference value of the second feature amount. In notifying the sound collection position, a guide image in which a first axis representing the first feature amount and a second axis representing the second feature amount are set is displayed on a display device, and an observation point corresponding to the first observed value and the second observed value and a reference point corresponding to the first reference value and the second reference value are displayed on the guide image. In the above aspect, the user can reach a suitable sound collection position by moving while playing the musical instrument so that the observation point approaches the reference point.
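
Placing the observation point and the reference point on a two-axis guide image can be sketched as a mapping from feature values to pixel coordinates. The `Axis` class, the axis ranges, and the image size below are hypothetical, and the sketch assumes each axis has `maximum` greater than `minimum`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axis:
    minimum: float
    maximum: float  # assumed to be greater than minimum

    def normalize(self, value):
        """Map value to [0, 1] along the axis, clamping out-of-range values."""
        ratio = (value - self.minimum) / (self.maximum - self.minimum)
        return min(1.0, max(0.0, ratio))


def to_pixel(first_value, second_value, first_axis, second_axis, width, height):
    """Return the (x, y) pixel for a pair of feature values on the guide image."""
    x = round(first_axis.normalize(first_value) * (width - 1))
    # Image rows grow downward, so the second axis is flipped.
    y = round((1.0 - second_axis.normalize(second_value)) * (height - 1))
    return x, y
```

Calling `to_pixel` once with the observed values and once with the reference values gives the two points to draw; the user then moves so that the first point approaches the second.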

In a specific example of any one of aspects 1 to 6 (aspect 10), feature data representing acoustic characteristics of an acoustic signal generated by a sound collection device by collecting the performance sound of the musical instrument is generated, and the notification of the sound collection position includes a first process and a second process. In the first process, the user is notified of the sound collection position corresponding to the identified type among a plurality of sound collection positions stored in a storage device for different types of musical instruments. In the second process, the user is notified of the sound collection position so that the feature data approaches reference data corresponding to the identified type among a plurality of reference data corresponding to different types of musical instruments. In the above aspect, the first process notifies the user of a general sound collection position stored in advance in the storage device, and the second process notifies the user of a specific sound collection position at which the feature data approaches the reference data. Therefore, compared to a configuration in which the user is notified only of the general sound collection position stored in advance, the user can be notified of a sound collection position where the performance sound can be collected well in the environment in which the user actually plays the instrument. Further, after the positional relationship between the musical instrument and the information processing device is roughly adjusted by the first process, it can be finely adjusted according to the performance environment by the second process. Therefore, compared to a configuration in which the user is notified only of the sound collection position at which the feature data and the reference data are close to each other, it is easier to adjust the musical instrument and the information processing device to an appropriate positional relationship.
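
The coarse-then-fine flow of the first and second processes can be sketched as two functions. All table contents, the two-element feature vectors, the Euclidean distance, and the tolerance are placeholder assumptions made for illustration, not values from the disclosure.

```python
import math

# Hypothetical stored data keyed by instrument type (placeholder values).
STORED_POSITIONS = {
    "piano": "general position registered for piano",
    "guitar": "general position registered for guitar",
}
REFERENCE_DATA = {
    "piano": (0.50, 0.40),   # (normalized volume, normalized timbre feature)
    "guitar": (0.40, 0.55),
}


def first_process(instrument_type):
    """Coarse step: return the general position stored for the type."""
    return STORED_POSITIONS[instrument_type]


def second_process(instrument_type, feature_data, tolerance=0.05):
    """Fine step: report whether feature_data is close to the reference."""
    distance = math.dist(feature_data, REFERENCE_DATA[instrument_type])
    if distance <= tolerance:
        return "position ok"
    return "keep adjusting"
```

In use, `first_process` would be called once after the type is identified, and `second_process` would be called repeatedly on fresh feature data while the user adjusts the position.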

In a specific example of any one of aspects 1 to 10 (aspect 11), in notifying the sound collection position, sound collection positions for a plurality of musical instruments including the musical instrument are reported. In the above aspect, the user is informed of the sound collection positions for a plurality of musical instruments. Therefore, the user can grasp suitable sound collection positions for a plurality of musical instruments without requiring any knowledge regarding collection of performance sounds. Therefore, for example, in an environment where a plurality of instruments of the same type belonging to one performance part are played in parallel, the user can grasp the appropriate sound collection position.

In a specific example of aspect 11 (aspect 12), in notifying the sound collection position, the user is notified of a sound collection position at which feature data representing the acoustic characteristics of the acoustic signal generated by the sound collection device by collecting the performance sounds are close to one another among the plurality of musical instruments. According to the above aspect, the user can be notified of a sound collection position that is suitable overall for the plurality of musical instruments.
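
One way to read "feature data close to one another" is to pick, from several candidate positions, the one where the per-instrument feature data differ least. The sketch below assumes that feature data has already been measured per instrument at each candidate position, which this passage does not describe, and uses the largest pairwise Euclidean distance as a hypothetical measure of spread.

```python
import math
from itertools import combinations


def spread(feature_vectors):
    """Return the largest pairwise distance among the feature vectors."""
    return max(
        (math.dist(a, b) for a, b in combinations(feature_vectors, 2)),
        default=0.0,  # zero or one vector has no spread
    )


def best_position(measurements):
    """Return the candidate position with the smallest feature spread.

    measurements maps a position label to a dict that maps each
    instrument to the feature vector observed at that position.
    """
    return min(
        measurements,
        key=lambda position: spread(list(measurements[position].values())),
    )
```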

An information processing device according to one aspect (aspect 13) of the present disclosure includes an instrument identifying unit that identifies the type of a musical instrument, and a notification control unit that notifies a user of a sound collection position corresponding to the identified type as a position where the performance sound of the musical instrument should be collected. In the above aspect, the user is informed of the sound collection position according to the type of musical instrument. Therefore, the user can grasp suitable sound collection positions for various musical instruments without requiring any knowledge regarding collection of performance sounds. Furthermore, by installing the sound collection device at the notified sound collection position, the user can collect the sound of the musical instrument under suitable conditions.

In a specific example of aspect 13 (aspect 14), the information processing device is a portable device that is equipped with a sound collection device that collects the sound of the musical instrument, and that is movable to the sound collection position. In the above aspect, a sound collection device that collects performance sounds is installed in a portable information processing device together with the musical instrument identification section and the notification control section. Therefore, the user can collect performance sounds using the sound collection device simply by moving the information processing device to the sound collection position while checking the sound collection position notified by the notification control unit.

In a specific example of aspect 14 (aspect 15), the information processing device includes an imaging device that generates a video signal by capturing an image of the musical instrument, and the musical instrument identifying section identifies the type of the musical instrument by analyzing the video signal. In the above aspect, an imaging device that captures an image of a musical instrument is installed in the information processing device along with a sound collection device that collects performance sounds, a musical instrument identification section, and a notification control section. Therefore, the user can collect performance sounds simply by taking an image of the musical instrument using the imaging device of the information processing device and then moving the information processing device to the sound collection position.

A program according to one aspect (aspect 16) of the present disclosure causes a computer to function as an instrument identifying section that identifies the type of a musical instrument, and a notification control section that notifies a user of a sound collection position corresponding to the identified type.

100...Information processing device, 200...Musical instrument, 300...Communication network, 400...Distribution system, 11...Control device, 12...Storage device, 13...Communication device, 14...Operating device, 15...Sound collection device, 16...Imaging device, 17...Display device, 18...Sound emitting device, 21...Musical instrument identification section, 22...Notification control section, 23...Recording processing section, 24...Feature extraction section.

Claims

An information processing method implemented by a computer, the method comprising: identifying the type of a musical instrument; and notifying a user of a sound collection position corresponding to the identified type as a position where the performance sound of the musical instrument should be collected.
The information processing method according to claim 1, wherein, in identifying the type, the type of the musical instrument is identified by analyzing an acoustic signal generated by a sound collection device by collecting the performance sound of the musical instrument.
The information processing method according to claim 1, wherein, in identifying the type, the type of the musical instrument is identified by analyzing a video signal generated by an imaging device by imaging the musical instrument.
The information processing method according to claim 1, wherein, in identifying the type, the type of the musical instrument is identified according to an instruction from the user.
The information processing method according to any one of claims 1 to 4, wherein, in notifying the sound collection position, a guide image corresponding to the identified type, among a plurality of guide images that guide sound collection positions for different types of musical instruments, is displayed on a display device.
The information processing method according to any one of claims 1 to 4, wherein, in notifying the sound collection position, a guidance sound corresponding to the identified type, among a plurality of guidance sounds that guide sound collection positions for different types of musical instruments, is emitted by a sound emitting device.
The information processing method according to claim 1, further comprising generating feature data representing acoustic characteristics of an acoustic signal generated by a sound collection device by collecting the performance sound of the musical instrument, wherein, in notifying the sound collection position, the user is notified of the sound collection position so that the feature data approaches reference data corresponding to the identified type among a plurality of reference data corresponding to different types of musical instruments.
The information processing method according to claim 7, wherein, in identifying the type, the type of the musical instrument is identified by analyzing the acoustic signal.
The information processing method according to claim 7 or 8, wherein the feature data includes a first observed value of a first feature amount and a second observed value of a second feature amount, the reference data includes a first reference value of the first feature amount and a second reference value of the second feature amount, and, in notifying the sound collection position, a guide image in which a first axis representing the first feature amount and a second axis representing the second feature amount are set is displayed on a display device, and an observation point corresponding to the first observed value and the second observed value and a reference point corresponding to the first reference value and the second reference value are displayed on the guide image.
The information processing method according to claim 1, further comprising generating feature data representing acoustic characteristics of an acoustic signal generated by a sound collection device by collecting the performance sound of the musical instrument, wherein the notification of the sound collection position includes a first process and a second process, in the first process, the user is notified of a sound collection position corresponding to the identified type among a plurality of sound collection positions stored in a storage device for different types of musical instruments, and, in the second process, the user is notified of the sound collection position so that the feature data approaches reference data corresponding to the identified type among a plurality of reference data corresponding to different types of musical instruments.
The information processing method according to claim 1, wherein, in notifying the sound collection position, sound collection positions for a plurality of musical instruments including the musical instrument are notified.
The information processing method according to claim 11, wherein, in notifying the sound collection position, the user is notified of a sound collection position at which feature data representing the acoustic characteristics of the acoustic signal generated by the sound collection device by collecting performance sounds are close to one another among the plurality of musical instruments.
An information processing device comprising: an instrument identification section that identifies the type of a musical instrument; and a notification control unit that notifies a user of a sound collection position corresponding to the identified type as a position where the performance sound of the musical instrument should be collected.
The information processing device according to claim 13, further comprising a sound collection device that collects the sound of the musical instrument, wherein the information processing device is a portable device that can be moved to the sound collection position.
The information processing device according to claim 14, further comprising an imaging device that generates a video signal by imaging the musical instrument, wherein the instrument identification section identifies the type of the musical instrument by analyzing the video signal.
A program that causes a computer to function as: an instrument identification section that identifies the type of a musical instrument; and a notification control unit that notifies a user of a sound collection position corresponding to the identified type.