WO2023153033A1 - Information processing method, program, and information processing device


Info

Publication number
WO2023153033A1
Authority
WO
WIPO (PCT)
Prior art keywords
performance
musical
musical score
sound
symbol
Prior art date
Application number
PCT/JP2022/040701
Other languages
French (fr)
Japanese (ja)
Inventor
Shuichi Matsumoto
Original Assignee
Yamaha Corporation
Priority date
Filing date
Publication date
Application filed by Yamaha Corporation
Publication of WO2023153033A1 publication Critical patent/WO2023153033A1/en

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10G — REPRESENTATION OF MUSIC; RECORDING MUSIC IN NOTATION FORM; ACCESSORIES FOR MUSIC OR MUSICAL INSTRUMENTS NOT OTHERWISE PROVIDED FOR, e.g. SUPPORTS
    • G10G1/00 — Means for the representation of music
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 — Speech synthesis; Text to speech systems

Definitions

  • an information processing method is implemented by a computer system, and generates, based on musical score data representing a musical score including one or more performance symbols, an acoustic signal representing a sound related to the performance symbols.
  • a program causes a computer system to function as a generation unit that generates an acoustic signal representing a sound related to one or more performance symbols based on musical score data representing a musical score including one or more performance symbols.
  • An information processing apparatus includes a generation unit that generates an acoustic signal representing a sound associated with one or more performance symbols based on musical score data representing a musical score including one or more performance symbols.
  • FIG. 10 is a diagram schematically showing processing of the text generation unit 32. FIG. 11 is a diagram illustrating a musical score. FIGS. 12 and 13 are diagrams schematically showing read-out timings of read-out text. FIG. 14 is a diagram illustrating the display screen during reading of a musical score. Further figures include a flowchart illustrating a specific procedure by which the control device 11 executes the musical score reading application, and diagrams exemplifying instruction reception screens of the instruction reception unit 30.
  • Additional figures illustrate the display screen during execution of the table-of-contents presentation function and, as a block diagram, the functional configuration of a control device 11A in a third embodiment.
  • the control device 11 is composed of one or more processors that control each element of the information processing device 10.
  • the control device 11 is composed of one or more types of processors such as a CPU (Central Processing Unit), an SPU (Sound Processing Unit), a DSP (Digital Signal Processor), an FPGA (Field Programmable Gate Array), or an ASIC (Application Specific Integrated Circuit).
  • the storage device 12 is a single memory or a plurality of memories that store the program PG (see FIG. 2) executed by the control device 11 and various data used by the control device 11.
  • the storage device 12 is composed of a known recording medium such as a magnetic recording medium or a semiconductor recording medium, or a combination of a plurality of types of recording media.
  • a portable recording medium that can be attached to and detached from the information processing device 10, or a recording medium (for example, cloud storage) that the control device 11 can write to or read from via a communication network, may be used as the storage device 12.
  • the sound collection device 13 detects ambient sounds (air vibrations) and outputs them as acoustic signals.
  • the sound collection device 13 is, for example, a microphone. Note that a sound collection device 13 separate from the information processing device 10 may be connected to the information processing device 10 by wire or wirelessly.
  • the sound emitting device 14 reproduces the sound represented by the acoustic signal.
  • the sound emitting device 14 is, for example, a speaker or headphones.
  • a D/A converter that converts an acoustic signal from digital to analog and an amplifier that amplifies the acoustic signal are omitted from the drawing for the sake of convenience.
  • a sound emitting device 14 separate from the information processing device 10 may be connected to the information processing device 10 by wire or wirelessly.
  • the display device 16 displays images under the control of the control device 11 .
  • various display panels such as a liquid crystal display panel or an organic EL (Electroluminescence) panel are used as the display device 16 .
  • a display device 16 separate from the information processing device 10 may be connected to the information processing device 10 by wire or wirelessly.
  • the musical score data SD is distributed to the information processing device 10 from, for example, a distribution device (typically a web server) via a communication network such as the Internet, and is then stored in the storage device 12.
  • a plurality of musical score data SD may be stored in the storage device 12.
  • one musical score data SD is created corresponding to one piece of music.
  • a musical score is a representation of a piece of music using musical symbols, including performance symbols.
  • musical symbols include musical note symbols, clefs, time signatures, key signatures, and performance symbols.
  • musical note symbols include notes, rests and accidentals that are attached to notes in a musical score.
  • the clef is marked on the left end of the staff and specifies the relationship between the position on the staff and the pitch of the sound.
  • the time signature specifies the number of beats in one measure and the type of note that constitutes one beat.
  • a key signature is a set of accidental symbols for designating the key of a piece of music.
  • Performance symbols are written on the musical score as supplements that indicate to the performer nuances that cannot be expressed with notes and rests alone.
  • Performance symbols include tempo symbols such as adagio and andante, expression symbols such as affettuoso and agitato, dynamics symbols such as fortissimo and pianissimo, articulation symbols such as tenuto and staccato, repeat symbols such as da capo and segno, ornament symbols such as trills and turns, abbreviation symbols such as ottava alta and ottava bassa, and playing-style symbols indicating instrument-specific playing styles such as pedal and pizzicato.
  • the performance symbols also include finger numbers that designate fingers used when playing the notes written in the musical score.
  • the musical score data SD includes attribute information B for each musical symbol forming the target musical piece.
  • the attribute information B is information that defines musical attributes of each musical symbol, and includes a beat identifier B1 and a symbol identifier B2.
  • the beat identifier B1 is information specifying the temporal position of the musical symbol in the target musical piece. If the musical symbol is a note symbol or a performance symbol, the number of beats from the beginning of the target musical piece to the symbol (for example, a beat number obtained by counting an eighth note as one beat) is preferably used as the beat identifier B1.
  • the symbol identifier B2 is information for identifying the type of musical symbol.
  • the symbol identifier B2 includes the note name (note number) and note value.
  • the note name represents the pitch of the note
  • the note value represents the duration of the note on the score.
  • a character string indicating the name of the musical symbol is preferably used as the symbol identifier B2.
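As a concrete illustration (not part of the patent; the field names and example values are assumptions), the attribute information B with its beat identifier B1 and symbol identifier B2 could be modeled as a small record:

```python
from dataclasses import dataclass

@dataclass
class AttributeInfo:
    """Hypothetical model of attribute information B; names are illustrative."""
    beat: float   # B1: temporal position, counting an eighth note as one beat
    symbol: str   # B2: symbol type, e.g. a note name with value, or "crescendo"

# A two-symbol fragment: a quarter note at beat 0 and a crescendo at beat 4.
score_attrs = [
    AttributeInfo(beat=0, symbol="C4_quarter"),
    AttributeInfo(beat=4, symbol="crescendo"),
]
assert score_attrs[0].beat < score_attrs[1].beat  # stored in time order
```

Keeping the records sorted by B1 makes the later time-series steps (text generation, timing) straightforward.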
  • the musical score data SD includes musical score image data MD.
  • the musical score image data MD is data representing an image of the musical score of the target musical piece (hereinafter referred to as "score image").
  • an image file (for example, a PDF file) representing the musical score image as a plane image in raster format or vector format is suitable as the musical score image data MD.
  • the symbol text data TD is data containing text corresponding to musical symbols written on the musical score.
  • the symbol text data TD includes symbol identifiers C1, name texts C2, and meaning texts C3.
  • the symbol identifier C1 is information for identifying the type of musical symbol, and has the same format as the symbol identifier B2 of the attribute information B.
  • the symbol identifier C1 corresponding to a note symbol may be only the pitch name.
  • the name text C2 is text indicating the name of the symbol specified by the symbol identifier C1.
  • the symbol identifier C1 and the name text C2 may be the same character string. If the musical symbol is a note symbol, the name text C2 is "do", "re", and so on. If the musical symbol is a performance symbol, the name text C2 is "crescendo", "forte", or the like.
  • the musical score reading application is capable of reading musical scores in multiple languages.
  • the languages that can be selected for reading are Japanese and English. Therefore, the name text C2 includes Japanese text indicating the name of the musical symbol and English text indicating the name of the musical symbol.
  • the meaning text C3 is text indicating the meaning of the symbol specified by the symbol identifier C1. For example, if the name of the musical symbol is "adagio", the meaning text C3 is "slowly". If the musical symbol is a note symbol, the meaning text C3 may be omitted.
  • the meaning text C3 likewise includes Japanese text and English text indicating the meaning of the musical symbol.
  • a word used to indicate the rest when counting beats, such as "un" for a quarter rest or "u" for an eighth rest, may be used.
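One minimal way to model the symbol text data TD is a dictionary keyed by the symbol identifier C1, holding the name text C2 and meaning text C3 per language. This is a sketch only; the entries and layout are illustrative assumptions, not the patent's data format:

```python
# Hypothetical symbol text data TD: symbol identifier C1 -> name text C2 and
# meaning text C3, each held in both reading languages described above.
symbol_text_data = {
    "adagio": {
        "name": {"ja": "アダージョ", "en": "adagio"},
        "meaning": {"ja": "ゆるやかに", "en": "slowly"},
    },
    "crescendo": {
        "name": {"ja": "クレッシェンド", "en": "crescendo"},
        "meaning": {"ja": "だんだん強く", "en": "gradually louder"},
    },
}

def lookup_text(c1: str, lang: str, use_meaning: bool) -> str:
    """Return the meaning text C3 when requested (cf. option NG2),
    else the name text C2 (cf. option NG1), in the selected language."""
    entry = symbol_text_data[c1]
    return entry["meaning" if use_meaning else "name"][lang]
```

For example, `lookup_text("adagio", "en", True)` yields the meaning text "slowly".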
  • FIG. 4 is a block diagram illustrating the functional configuration of the control device 11.
  • the control device 11 generates acoustic signals representing sounds related to the musical symbols forming a musical score (hereinafter referred to as "symbol sounds") based on the musical score data SD representing the musical score.
  • the musical score represented by the musical score data SD includes one or more performance symbols. Therefore, it can also be said that the control device 11 generates an acoustic signal representing a sound related to the performance symbol based on the musical score data SD representing a musical score including one or more performance symbols.
  • the sound related to the musical performance symbol is, for example, a sound indicating the name of the musical performance symbol or a sound indicating a phrase corresponding to the meaning of the musical performance symbol.
  • the names of the performance symbols correspond to the name texts C2 of the performance symbols
  • the words corresponding to the meanings of the performance symbols correspond to the meaning texts C3 of the performance symbols.
  • the musical score includes musical note symbols in addition to performance symbols. Accordingly, the control device 11 generates acoustic signals representing sounds related to performance symbols and sounds related to musical note symbols.
  • a sound related to a musical note symbol is, for example, a sound indicating the pitch name of the note indicated by the musical note symbol.
  • the acoustic signal is a signal for causing the sound emitting device 14 to reproduce the symbol sound.
  • the control device 11 realizes a plurality of functions for generating and reproducing acoustic signals (an instruction reception unit 30, a text generation unit 32, a voice synthesis unit 34, a performance analysis unit 38, and an output control unit 40).
  • FIGS. 5 to 9 are diagrams exemplifying instruction reception screens displayed by the instruction reception unit 30.
  • the instruction receiving unit 30 causes the touch panel T to display a reception screen SC1 for selecting a song to be read aloud, as shown in FIG. 5, for example.
  • the user designates the musical score data SD to be read out by touching the display NA1 to NA5 corresponding to the desired musical score data SD.
  • the musical score data "yyy.xml" corresponding to the display NA2 is designated.
  • the instruction receiving unit 30 causes the touch panel T to display a reception screen SC2 for selecting a staff notation to be read out of the musical score data SD, as shown in FIG.
  • the reception screen SC2 displays options NB1 and NB2 for designating a staff notation to be read aloud out of the grand staff.
  • the option NB1 designates reading out the right-hand staff located on the upper staff of the grand staff.
  • the option NB2 designates reading out the left-hand staff located on the lower staff of the grand staff.
  • the user designates the staff notation to be read out by checking the check box CK of at least one of the options NB1 and NB2.
  • the musical score represented by the musical score data SD includes a plurality of performance symbols, and each of the plurality of performance symbols belongs to one of a plurality of classifications.
  • a plurality of classifications correspond to the types of performance symbols shown in FIG.
  • the instruction receiving unit 30 receives selection of at least one of a plurality of classifications of performance symbols.
  • the control device 11 generates acoustic signals for performance symbols belonging to one or more selected categories among the plurality of performance symbols.
  • the instruction receiving unit 30 causes the touch panel T to display a reception screen SC4 for selecting a setting for reading out the musical score data SD, as shown in FIG. 8, for example.
  • Options ND1 and ND2 for designating information to be output when reading a musical score are displayed in the upper area E1 of the reception screen SC4.
  • the option ND1 designates only the reading of musical scores. That is, the option ND1 designates only audio output.
  • the option ND2 designates displaying a musical score image in addition to reading out the musical score. That is, the option ND2 designates output of audio and images.
  • Either option ND1 or ND2 can be selected using a radio button. By touching the radio button corresponding to the option ND1 or the radio button corresponding to the option ND2, the user specifies the information to be output when reading out the musical score.
  • options NE1 to NE4 for specifying the tempo for reading out the score are displayed in the lower region E2 of the reception screen SC4.
  • the option NE1 designates reading at the tempo designated by the musical score.
  • the option NE2 designates reading out all the symbols designated on the reception screen SC3 regardless of the tempo designated by the musical score.
  • Option NE3 designates reading aloud at a tempo synchronized with the performance of the musical score by the user.
  • the user designates an arbitrary tempo. In the illustrated example, the user specifies the reading tempo by specifying the number of beats per minute.
  • the speaking time for one syllable (in other words, the number of read-out syllables per unit time) may also be designated.
  • the instruction reception unit 30 causes the touch panel T to display a reception screen SC5 for further settings, as shown in FIG. 9, for example.
  • Options NF1 and NF2 for designating the language to be used when reading out the musical score are displayed in the upper area E3 of the reception screen SC5.
  • Option NF1 designates reading in Japanese. Reading in Japanese corresponds to, for example, using "do-re-mi-fa-so-la-si" as the note name, using Japanese-like pronunciation when uttering the names of performance symbols, and the like.
  • Option NF2 designates reading in English.
  • Reading out in English corresponds to, for example, using "C, D, E, F, G, A, B" as note names, using English-like pronunciation when uttering names of performance symbols, and the like.
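The note-name difference between the two reading languages can be reduced to a simple mapping. This toy table is an illustration of the behavior described above, not the patent's implementation:

```python
# Note names per reading language: solfege syllables for Japanese reading,
# letter names for English reading (degree 0..6 counted upward from C).
NOTE_NAMES = {
    "ja": ["do", "re", "mi", "fa", "so", "la", "si"],
    "en": ["C", "D", "E", "F", "G", "A", "B"],
}

def note_name(degree: int, lang: str) -> str:
    """Return the read-out name of a scale degree in the selected language."""
    return NOTE_NAMES[lang][degree]
```

So the same note is read as "do" under option NF1 and as "C" under option NF2.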
  • a language other than Japanese and English may be specified on the reception screen SC5.
  • in that case, the symbol text data TD includes the name text C2 and the meaning text C3 for that language.
  • options NG1 and NG2 for designating the contents of reading out the performance symbols are displayed.
  • the option NG1 designates reading out the name of the performance symbol.
  • the name text C2 of the symbol text data TD is read out when the performance symbols are read out.
  • the option NG2 designates the reading of a phrase indicating the meaning of the performance symbol.
  • the meaning text C3 of the symbol text data TD is read out when reading out the performance symbol.
  • the type of voice for reading musical note symbols and the type of voice for reading performance symbols may be designated as different types. Also, if four or more voice types can be specified, the note symbols of the right-hand staff, the note symbols of the left-hand staff, the performance symbols of the right-hand staff, and the performance symbols of the left-hand staff may each be designated to be read out in different voices.
  • the sound emitting device 14 may be set so that the right speaker reads out the staff notation for the right hand, and the left speaker reads out the staff notation for the left hand. Further, when the sound emitting device 14 is a stereo speaker, it may be possible to designate a speaker for outputting the reading sound of musical note symbols and a speaker for outputting the reading sound of performance symbols separately.
  • when reading out a chord, the user may be able to select whether to read out each note constituting the chord individually or to read out the chord name corresponding to the chord.
  • the name text C2 of the symbol text data TD may store text indicating the pitch name of each note forming the chord
  • the meaning text C3 may store text indicating the chord name of the chord.
  • the control device 11 starts the acoustic signal generation process. Further, the touch panel T displays, for example, a button for instructing the start of reading out a musical score (hereinafter referred to as "performance start button"). The user presses the performance start button at an appropriate timing to start reading out the musical score.
  • the text generation unit 32 shown in FIG. 4 generates text indicating the content of the musical score.
  • FIG. 10 is a diagram schematically showing processing of the text generation unit 32.
  • the text generation unit 32 reads the musical score data SD specified on the reception screen SC1 shown in FIG. 5 (S1).
  • the text generation unit 32 classifies the musical score data SD into right-hand data representing a staff notation for the right hand and left-hand data representing a staff notation for the left hand (S2).
  • the data corresponding to the staff notation to be read aloud specified on the reception screen SC2 shown in FIG. 6 is to be processed thereafter.
  • the staff notation data to be read out includes attribute information B of all types of musical symbols (S3).
  • the text generation unit 32 extracts the attribute information B of the symbols to be read aloud specified on the reception screen SC3 shown in FIG. 7 (S4).
  • the text generation unit 32 collates the symbol identifier B2 of the extracted attribute information B with the symbol identifier C1 of the symbol text data TD, and reads out the name text C2 or the meaning text C3 corresponding to the symbol identifier C1 (S5). Which of the name text C2 and the meaning text C3 is read depends on which of the options NG1 and NG2 on the reception screen SC5 shown in FIG. 9 is selected. Likewise, which of the Japanese text and the English text is read depends on which of the options NF1 and NF2 on the reception screen SC5 is selected. In the drawing, these selections are described as "designation of reading contents". The read texts are arranged in the same order as the attribute information B (time series). Through the above processing, a text indicating the content of the musical score (hereinafter referred to as "read-aloud text") is generated (S6).
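The filter-and-collate steps can be sketched in miniature. Everything below (record layout, text data, names) is an illustrative assumption; it shows only the shape of the processing: select the symbols designated for reading, collate each symbol identifier with the text data, and keep time-series order.

```python
from collections import namedtuple

Attr = namedtuple("Attr", "beat symbol")  # B1 and B2 (illustrative names)

# Toy symbol text data TD with a single performance symbol.
TD = {"crescendo": {"name": {"en": "crescendo"},
                    "meaning": {"en": "gradually louder"}}}

def generate_readout(attrs, td, selected, lang="en", use_meaning=False):
    """Keep only the symbols selected for reading, collate each symbol
    identifier with the text data, and return (beat, text) pairs in
    time-series order."""
    out = []
    for a in sorted(attrs, key=lambda a: a.beat):
        if a.symbol in selected and a.symbol in td:
            key = "meaning" if use_meaning else "name"
            out.append((a.beat, td[a.symbol][key][lang]))
    return out

attrs = [Attr(4, "crescendo"), Attr(0, "C4_quarter")]
assert generate_readout(attrs, TD, {"crescendo"}) == [(4, "crescendo")]
```

Note symbols excluded from the selection simply drop out of the read-aloud text, mirroring the selection made on screen SC3.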
  • FIG. 11 is a diagram illustrating a musical score. FIGS. 12 and 13 are diagrams schematically showing read-out timings of read-out texts.
  • a musical score G shown in FIG. 11 indicates, for example, the first two bars of the musical score data (yyy.xml) specified to be read out on the reception screen SC1 shown in FIG.
  • the musical score G includes a musical score for the right hand and a musical score for the left hand. Based on the tempo information TP, the musical score G is specified as 120 beats per minute, with one quarter note as one beat.
  • FIG. 12 shows the read-out sounds for the musical score for the right hand and the read-out sounds for the musical score for the left hand.
  • a time axis is shown between the reading sound of the musical score for the right hand and the reading sound of the musical score for the left hand.
  • One scale (t1) of the time axis is based on the eighth note, which is the shortest note in the musical score G. Based on the tempo of the musical score G described above, one scale (t1) on the time axis is 0.25 seconds.
  • when a target time point, which advances through the musical piece at a speed corresponding to the tempo designated by the tempo information TP, reaches the time point corresponding to a performance symbol, the control device 11 generates the acoustic signal so that the sound associated with that performance symbol is pronounced.
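This timing rule can be made concrete. Under the units used above (beat identifiers counting eighth notes, tempo in quarter-note beats per minute), a symbol's onset time follows directly; the function below is a sketch with those assumed units, not the patent's code:

```python
def onset_seconds(beat_id: float, bpm: float, beat_unit: float = 0.5) -> float:
    """Time at which a symbol's sound should start.
    beat_id counts eighth notes from the start of the piece (as in the
    beat identifier B1 example above); beat_unit is the eighth note's
    length in quarter-note beats (0.5); bpm is quarter-note beats per
    minute from the tempo information TP."""
    return beat_id * beat_unit * (60.0 / bpm)

# At 120 bpm one eighth note lasts 0.25 s, matching the scale t1 above.
assert onset_seconds(1, 120) == 0.25
```

A symbol on beat 8 (the start of the second bar in 4/4 counted in eighths) would then sound at 2.0 seconds.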
  • the text generation unit 32 may reduce the text to be read out. For example, if the speech time for one syllable is determined during reading, it is possible to determine whether or not reading is possible at the time the read-aloud text is generated. The text generation unit 32 determines whether or not the read-out text can be read out within the time based on the generated words, the tempo of the music, and the utterance time of one syllable.
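The feasibility test described here, whether a read-aloud text fits in its time slot, reduces to comparing syllable count times per-syllable utterance time against the slot length. A minimal sketch (the numbers are illustrative):

```python
def fits_in_slot(n_syllables: int, syllable_sec: float, slot_sec: float) -> bool:
    """True if a text can be fully uttered before the next symbol's onset."""
    return n_syllables * syllable_sec <= slot_sec

assert fits_in_slot(1, 0.2, 0.25)      # one syllable fits an eighth-note slot
assert not fits_in_slot(4, 0.2, 0.25)  # four syllables do not
```

When the test fails, the text generation unit can reduce the read-aloud text as described next.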
  • when a plurality of performance symbols are pronounced at the same time, the text generation unit 32 may read out some of the performance symbols and exclude the remaining performance symbols from reading. Taking the period P1 as an example, the text generation unit 32 may read out one of "mezzo piano" and "staccato" and not read out the other.
  • that is, when "mezzo piano" and "staccato" would be pronounced together, the control device 11 selects either "mezzo piano" or "staccato" and reads out the text corresponding to the selected performance symbol.
  • "mezzo piano" is an example of a sound associated with a first performance symbol
  • "staccato" is an example of a sound associated with a second performance symbol.
  • the instruction receiving unit 30 may allow the user to set the reading priority for each classification of performance symbols.
  • the text generation unit 32 deletes texts from the read-aloud text in order, starting with the performance symbols belonging to the lowest-priority classification.
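The priority-based pruning can be sketched as follows; the priority map, the use of text length as a stand-in for syllable count, and the slot budget are all illustrative assumptions:

```python
def prune_by_priority(symbols, priority, slot_syllables):
    """Drop texts from a group of simultaneous performance symbols, lowest
    priority classification first, until the total fits the slot.
    `symbols` is a list of (classification, text) pairs; `priority` maps a
    classification to a rank (higher = kept longer); text length is used
    here as a crude stand-in for syllable count."""
    kept = sorted(symbols, key=lambda s: priority[s[0]], reverse=True)
    while kept and sum(len(t) for _, t in kept) > slot_syllables:
        kept.pop()  # remove the current lowest-priority text
    return [t for _, t in kept]

group = [("dynamics", "mezzo piano"), ("articulation", "staccato")]
assert prune_by_priority(group, {"dynamics": 2, "articulation": 1}, 12) == ["mezzo piano"]
```

With a larger budget both texts survive; with a smaller one, the lower-priority classification is dropped first, matching the behavior described above.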
  • a non-verbal sound may be included in the read-aloud text instead of the text corresponding to the performance symbol.
  • the user can grasp the rhythm of the musical piece as well as the pitches of the notes indicated in the musical score.
  • FIG. 13 shows only the reading sounds for the right hand and omits the reading sounds for the left hand.
  • the symbol Mti (i is an integer from 1 to 9) indicates a metronome sound. The user can grasp the beat boundaries from the metronome sounds Mti.
  • the control device 11 may generate an acoustic signal in which the sounds related to the performance symbols are pronounced regardless of the tempo of the music.
  • the user can comprehend all the symbols of the specified type, and can comprehend the content of the musical score without omission.
  • the option NE4 is a mode in which the user designates an arbitrary tempo by designating the number of beats per minute.
  • the user may designate the progress of reading using an operator.
  • the operator may be, for example, an operation button displayed on the touch panel T.
  • the operator may be a member of a musical instrument.
  • if the musical instrument is a piano, the pedals can be used as operators. In this case, for example, each time the user steps on the damper pedal, the reading advances by one note symbol or one bar, and each time the user steps on the soft pedal, the reading moves back by one note symbol or one bar.
  • the voice synthesis unit 34 shown in FIG. 4 generates an acoustic signal using the read-aloud text generated by the text generation unit 32 and the voice data VD.
  • the voice synthesis unit 34 is an example of a generation unit.
  • the voice synthesis unit 34 sequentially selects speech segments corresponding to the read-aloud text from among a plurality of speech segments included in the voice data VD, adjusts the pitch of each speech segment, and connects them to each other to generate an acoustic signal.
  • the pitch of the sound related to the musical note symbol in the reading text may be matched with the pitch of the musical note symbol, or may be a predetermined pitch.
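The concatenative scheme above (select a unit per text element, adjust its pitch, join the units) can be caricatured with sine tones standing in for speech segments. Real speech units and pitch adjustment are far more involved; everything here, including the sample rate, is an assumption for illustration:

```python
import numpy as np

SR = 16000  # sample rate in Hz (an assumed value)

def segment(freq_hz: float, dur_sec: float) -> np.ndarray:
    """Stand-in for a speech segment: a sine tone at the target pitch."""
    t = np.arange(int(SR * dur_sec)) / SR
    return np.sin(2 * np.pi * freq_hz * t)

def synthesize(units):
    """Connect pitch-adjusted segments into one acoustic signal.
    `units` is a list of (frequency in Hz, duration in seconds) pairs."""
    return np.concatenate([segment(f, d) for f, d in units])

signal = synthesize([(261.6, 0.25), (293.7, 0.25)])  # "do" then "re"
assert signal.shape[0] == 2 * int(SR * 0.25)
```

Matching each unit's frequency to the note's pitch corresponds to the first option described above; using one fixed frequency corresponds to the predetermined-pitch option.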
  • the performance analysis unit 38 may operate only when the user designates reading out at a tempo synchronized with the performance of the musical score (option NE3) on the reception screen SC4 shown in FIG. 8.
  • the output control unit 40 controls the output of sound based on the acoustic signal and the output of the musical score image based on the musical score image data MD.
  • the output control unit 40 causes the sound emitting device 14 to reproduce the sound represented by the acoustic signal generated by the voice synthesis unit 34, and causes the display device 16 to display the musical score image based on the musical score image data MD during reproduction.
  • FIG. 14 is a diagram illustrating a display screen during reading of musical scores.
  • the display of the touch panel T, which is the display device 16, is switched to the display screen SC6 shown in FIG. 14.
  • A message 601 indicating that the reading sound of the musical score is being reproduced, a musical score image 602, a pause button 604, a fast-forward button 606, a rewind button 608, a repeat button 610, and an end button 612 are displayed.
  • the musical score image 602 is an image displaying the musical score image data MD included in the musical score data SD to be read aloud.
  • a bar 603 indicating the reading position is superimposed on the musical score image 602 and displayed.
  • the output control unit 40 scrolls the musical score image 602 based on the timing label attached to the reading text. At this time, the output control unit 40 adjusts the scrolling speed of the musical score image 602 so that the music symbol being read out and the bar 603 are superimposed. It should be noted that instead of displaying the read-out position with the bar 603, the musical symbols to be read-out may be highlighted.
  • FIG. 14 exemplifies a musical score using staff notation as the musical score image 602, but the musical score image 602 may be displayed as a piano roll, for example. Further, when only the reading of the musical score is specified on the reception screen SC4 shown in FIG. 8 (option ND1), the musical score image 602 is not displayed.
  • the pause button 604, the fast-forward button 606, the rewind button 608, the repeat button 610, and the end button 612 accept operations related to reading out the musical score.
  • when the pause button 604 is operated, the output control unit 40 pauses reading out the musical score.
  • when the fast-forward button 606 is operated, the output control unit 40 fast-forwards the reading of the musical score. For example, when the fast-forward button 606 is touched once, the output control unit 40 moves the read-out position to the beginning of the bar next to the bar containing the current read-out position.
  • when the rewind button 608 is operated, the output control unit 40 rewinds the reading of the musical score.
  • when reading aloud at a tempo synchronized with the user's performance of the musical score is selected (option NE3), the output control unit 40 adjusts the output timing of the reading sound based on the performance.
  • the output control unit 40 adjusts the output speed of the reading sound so that, for example, the position on the musical score a predetermined number of beats ahead of the performance position is read out.
  • the predetermined beat may be specified by the user.
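One simple way to realize this adjustment (a sketch under assumed parameters, not the patent's method) is a proportional rule: speed the reading up when its lead over the detected performance position falls below the target lookahead, and slow it down when the lead grows too large:

```python
def reading_speed(reading_beat: float, performance_beat: float,
                  lookahead: float = 2.0, base: float = 1.0) -> float:
    """Speed factor for the reading voice. The gain (0.25) and the clamp
    range (0.5x to 2.0x) are illustrative assumptions."""
    lead = reading_beat - performance_beat
    return max(0.5, min(2.0, base + 0.25 * (lookahead - lead)))

assert reading_speed(10.0, 8.0) == 1.0  # exactly two beats ahead: base speed
```

If the reading has fallen level with the performance (`reading_speed(8.0, 8.0)`), the factor rises above 1.0 so the voice catches back up to the lookahead position.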
  • the output control unit 40 may read out the musical symbols included in the (N+1)-th measure immediately before the performance of the N-th measure ends.
  • "immediately before the performance of the N-th measure ends" is, for example, after the last note of the N-th measure is played. This imitates, for example, a teaching method in which a choir conductor reads ahead and shows the chorus members the lyrics to be sung next.
  • the generation of the read-out text (S102) and the generation of the acoustic signal (S104) may be performed after the user instructs to read out the score (S106: YES).
  • acoustic signals representing sounds related to performance symbols are generated based on musical score data SD including one or more performance symbols. Therefore, the performance symbols included in the musical score can be comprehended aurally, and even visually impaired people, beginners and small children who are not accustomed to reading musical scores can easily comprehend the musical score.
  • the sound related to the performance symbol is a sound indicating the name of the performance symbol or a sound indicating a phrase corresponding to the meaning of the performance symbol. If it is the sound indicating the name, the description on the musical score can be grasped accurately. If it is a sound indicating a phrase corresponding to the meaning, even a user who lacks knowledge of performance symbols and cannot understand a performance symbol from its name alone can grasp the contents indicated by the musical score.
  • sounds corresponding to performance symbols are pronounced at timings corresponding to the tempo of music. This makes it easier for the user to grasp the positions of the performance symbols in the music, thereby improving convenience.
  • an acoustic signal representing the sound related to the musical note symbol is generated. Therefore, the musical note symbols included in the musical score can be comprehended aurally, and the comprehension of the musical score can be further facilitated.
  • a non-verbal notification sound is pronounced as the sound related to the rest.
  • when a sound related to a rest is pronounced, the user can therefore immediately recognize that it corresponds to a rest.
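The choice between a non-verbal tone for rests and speech for other symbols can be sketched as a simple dispatch. The tuple format and the 0.1-second duration are assumptions for illustration:

```python
def sound_for(symbol: str):
    """Choose how to render a symbol: rests get a short non-verbal tone,
    everything else is spoken. The return format is illustrative."""
    if symbol == "rest":
        return ("beep", 0.1)      # non-verbal notification, 0.1 s (assumed)
    return ("speech", symbol)

print(sound_for("rest"))       # ('beep', 0.1)
print(sound_for("crescendo"))  # ('speech', 'crescendo')
```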
  • the musical score data SD specified by the user is read out.
  • when a list of data names of the musical score data SD, such as the reception screen SC1 shown in FIG. 5, is displayed, the user may not be able to identify which musical score data SD corresponds to the desired song.
  • a part of each of a plurality of musical score data SD is therefore read out in succession so that the user can identify the musical score data SD corresponding to the desired piece of music.
  • this function of continuously reading out a part of each of a plurality of musical score data SD is hereinafter referred to as the "table of contents presentation function".
  • FIG. 16 is a diagram illustrating an instruction reception screen by the instruction reception unit 30.
  • when the score reading application is activated, the instruction receiving unit 30 causes the touch panel T to display a menu selection instruction receiving screen SC7 as shown in FIG. 16, for example. Choices NI1 and NI2 are displayed on the acceptance screen SC7.
  • the option NI1 specifies reading out the musical score data SD selected by the user, as described in the first embodiment.
  • the instruction receiving unit 30 displays the receiving screen SC1 shown in FIG. 5, and receives the designation of the musical score data SD to be read out from the user.
  • a part of the musical score is, for example, a part or all of a specific structural section among the multiple sections (hereinafter referred to as "structural sections") into which a piece of music is divided according to its musical meaning.
  • Structural sections are, for example, sections such as an intro, an A melody, a B melody, a chorus, and an outro.
  • the text generation unit 32 generates, for each of the plurality of musical score data SD, a text for reading out, for example, the structure section of the “chorus” of the music.
  • the text generation unit 32 generates, for each of the plurality of musical score data SD, a text for reading out the structural section of the "intro" (predetermined number of bars at the beginning of the musical score) of the music, for example.
  • the musical score data SD is the first musical score data
  • the storage device 12 also stores second musical score data different from the first musical score data.
  • the control device 11 generates a first acoustic signal representing the sounds related to the performance symbols and musical note symbols included in a portion of the first musical score corresponding to the first musical score data, and a second acoustic signal representing the sounds related to the performance symbols and musical note symbols included in a portion of the second musical score corresponding to the second musical score data.
  • the control device 11 causes the sound emitting device 14 to sequentially reproduce the first acoustic signal and the second acoustic signal.
  • the first musical score data is musical score data "xxx.xml”
  • the second musical score data is musical score data "yyy.xml”.
  • the control device 11 selects a portion from each of a plurality of musical score data, and sequentially reproduces sounds related to performance symbols and musical note symbols included in the selected portion.
  • the user can easily grasp which piece of music each musical score data SD corresponds to, and can quickly select the desired musical score data SD from among the plurality of musical score data SD.
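The table-of-contents flow — take an excerpt from each score file and queue the excerpts for sequential playback — can be sketched with a tiny stand-in MusicXML string. Real files such as "xxx.xml" and "yyy.xml" would be loaded from the storage device instead:

```python
import xml.etree.ElementTree as ET

# Tiny stand-in MusicXML document (three empty measures, illustrative only).
SCORE = """<score-partwise><part id="P1">
  <measure number="1"><note/></measure>
  <measure number="2"><note/></measure>
  <measure number="3"><note/></measure>
</part></score-partwise>"""

def intro_measures(musicxml: str, n: int):
    """Return the first n <measure> elements of the first part."""
    return ET.fromstring(musicxml).find("part").findall("measure")[:n]

# Queue an excerpt from each score in turn; each excerpt would then be
# synthesized and the resulting acoustic signals reproduced sequentially.
playlist = [(name, len(intro_measures(SCORE, 2)))
            for name in ("xxx.xml", "yyy.xml")]
print(playlist)  # [('xxx.xml', 2), ('yyy.xml', 2)]
```

Selecting the "intro" (a predetermined number of opening bars) rather than the chorus is simply a matter of which measures the excerpt function returns.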
  • the information processing device 10 reads out the musical score data SD.
  • the information processing apparatus 10 assists the user in making the sound of the performance closer to the sound shown in the score.
  • FIG. 18 is a block diagram illustrating the functional configuration of the control device 11A in the third embodiment.
  • the control device 11A includes a performance evaluation section 42 in addition to the configuration of the control device 11 (see FIG. 4) according to the first embodiment.
  • the performance evaluation section 42 evaluates the performance of the musical instrument by the user based on the analysis result of the performance analysis section 38 .
  • in the first embodiment, the performance analysis unit 38 analyzed the position of the user's performance on the musical instrument.
  • in this embodiment, the performance analysis unit 38 analyzes the volume of the performance sound of the musical instrument in addition to the performance position.
  • the performance evaluation unit 42 evaluates whether the user's performance follows the musical symbols of the score. More specifically, the performance evaluation unit 42 detects the difference between the performance sound, which is the sound of the musical composition played by the user, and the sound indicated by the musical symbols included in the musical score representing the composition, and determines whether or not the difference is outside a predetermined allowable range.
  • the user sets the allowable range of the difference, for example, based on his or her performance skill level. Generally, the allowable difference is considered to become smaller as the user's skill level increases. If there is a portion where the difference is outside the allowable range, the text generation unit 32 generates text pointing out that portion, for example, "Right hand, second measure: 'fa, re, mi' was played as 'fa, mi, re'." It also generates text that reads out the pitch and duration actually played by the user. Such text is referred to as "support text".
  • when the performance symbol is a dynamic symbol, the performance evaluation unit 42 evaluates the performance by detecting the difference between the volume of the sound played by the user and the volume expected when the performance follows the dynamic symbol. Similarly, when the performance symbol is an articulation symbol, it detects the difference between the duration of the sound played by the user and the duration expected when the performance follows the articulation symbol. The smaller the difference, the more the performance evaluation unit 42 judges that the performance follows the performance symbols of the musical score, that is, that the performance skill is high.
  • the user sets the allowable range of the difference, for example, based on his or her performance skill level. If there is a portion where the difference is outside the allowable range, the text generation unit 32 generates support text pointing out that portion, for example, "Right hand, first measure, 'staccato mi, staccato fa': the staccato is weak," indicating that the performance symbols were not reflected in the performance.
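The core comparison — detect where the performance deviates from the score beyond a user-set tolerance and emit a pointer for each such spot — can be sketched as below. The MIDI note numbers, message wording, and function name are illustrative assumptions:

```python
def support_messages(performed, expected, tolerance=0):
    """Compare performed pitches with the score (MIDI note numbers) and
    return a message for each note whose deviation exceeds `tolerance`."""
    messages = []
    for i, (got, want) in enumerate(zip(performed, expected), start=1):
        if abs(got - want) > tolerance:
            messages.append(f"note {i}: played {got}, score says {want}")
    return messages

score_notes  = [65, 62, 64]   # fa, re, mi (illustrative)
played_notes = [65, 64, 62]   # fa, mi, re - two notes swapped
print(support_messages(played_notes, score_notes))
# ['note 2: played 64, score says 62', 'note 3: played 62, score says 64']
```

Widening `tolerance` for a less experienced player, as the text describes, simply suppresses small deviations so only gross errors are pointed out.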
  • the speech synthesis unit 34 uses the supporting text and the speech data VD to generate an acoustic signal.
  • the text indicating the pitch of a musical note symbol may be read aloud with a voice whose pitch corresponds to that note's pitch.
  • the output control unit 40 causes the sound emitting device 14 to reproduce sound based on the acoustic signal.
  • the performance sound of the user may be recorded, and the part of the recorded performance sound corresponding to the point pointed out in the support text may be reproduced together with the support text read aloud.
  • the performance symbols may always be read aloud regardless of whether a difference is present. In this case, for example, the read-out voice may be made louder as the difference between the performance symbol and the performance grows, so that the user can grasp from the volume how closely the performance follows the performance symbols.
  • the control device 11 acquires the performance sound, which is the sound of the music played by the user, and detects the difference between the sound indicated by the musical symbols included in the musical score representing the music and the performance sound.
  • when the difference is outside a predetermined allowable range, the control device 11 generates an acoustic signal representing a sound related to the musical symbol included in the portion of the musical score corresponding to the location where the difference occurs.
  • the user can grasp the difference between his/her own performance and the content indicated by the musical score, and can efficiently master the performance of the musical piece indicated by the musical score.
  • the control device 11 indicates to the user the position on the musical score of the portion where the difference occurs by reading out the musical symbol of the portion where the difference occurs.
  • the user can intuitively grasp the location on the musical score where the difference occurs, compared to, for example, simply reading out the position (bar number, etc.) in the musical score mechanically.
  • the content of the user's performance is verbalized.
  • the control device 11 reads aloud the pitch and value of the performance performed by the user. This allows the user to objectively grasp the details of his/her own error.
  • the speech synthesis unit 34 performs concatenative (segment-connecting) speech synthesis, but the method of speech synthesis is not limited to the above example.
  • statistical speech synthesis using statistical models such as deep neural networks or HMMs (Hidden Markov Models) may also be used.
  • the information processing device 10 may be implemented by a server device that communicates with an information device such as a smart phone or a tablet terminal.
  • the information processing device 10 receives designation of musical score data SD from the information device, and generates an acoustic signal by speech synthesis processing using the designated musical score data SD.
  • the information processing device 10 transmits an acoustic signal generated by speech synthesis processing to the information device.
  • the information device reproduces the acoustic signal.
  • the functions of the information processing device 10 are realized by cooperation between the one or more processors constituting the control device 11 described above and the program PG stored in the storage device 12.
  • the above description assumed that the user looks at the items displayed on the touch panel T and touches the touch panel T when performing various settings and giving instructions in the musical score reading application.
  • the presentation of information (such as selection items in settings) to the user may be performed by reading aloud.
  • the input from the user to the information processing device 10 may be performed by voice input.
  • in particular, when a visually impaired person uses the musical score reading application, the use of voice input is effective.
  • the above program can be provided in a form stored in a computer-readable recording medium and installed in the computer.
  • the recording medium is, for example, a non-transitory recording medium; an optical recording medium (optical disc) such as a CD-ROM is a typical example.
  • a recording medium for storing the program in the distribution device corresponds to the non-transitory recording medium described above.
  • an information processing method according to one aspect of the present disclosure is realized by a computer system, and generates, based on musical score data representing a musical score including one or more performance symbols, an acoustic signal representing a sound related to the performance symbols. The performance symbols included in the musical score can therefore be grasped aurally, so even visually impaired people, beginners unaccustomed to reading musical scores, and small children can easily grasp the musical score.
  • in a specific example, the sound related to the performance symbol is a sound indicating the name of the performance symbol or a sound indicating a phrase corresponding to the meaning of the performance symbol.
  • if the sound related to the performance symbol is the sound indicating its name, the notation on the musical score can be grasped accurately.
  • if the sound related to the performance symbol is a sound indicating a word or phrase corresponding to its meaning, the user can grasp the content indicated by the musical score even without knowledge of the performance symbol, that is, even when the name alone would not convey its meaning.
  • the musical score data includes tempo information specifying the tempo of the piece of music indicated by the musical score.
  • the acoustic signal is generated such that the sound related to a performance symbol is pronounced when a target time point, which advances through the music at the speed corresponding to the tempo specified by the tempo information, reaches the time point corresponding to that performance symbol.
  • the sounds related to the performance symbols are thus pronounced at timings corresponding to the tempo of the music, which makes it easier for the user to grasp the positions of the performance symbols in the music and improves convenience.
  • the one or more performance symbols include a first performance symbol and a second performance symbol.
  • when the sound related to the first performance symbol and the sound related to the second performance symbol would overlap, either the first performance symbol or the second performance symbol is selected, and the acoustic signal representing the sound related to the selected symbol is generated.
  • the sound related to the first performance symbol and the sound related to the second performance symbol therefore do not overlap, and the audibility of the sounds related to the performance symbols can be improved.
  • the musical score data includes tempo information specifying the tempo of the music indicated by the musical score, and the acoustic signal is generated such that the sounds related to the performance symbols are pronounced regardless of the tempo of the music.
  • because the sounds related to the performance symbols are pronounced regardless of the tempo, they are not pronounced on top of one another, and their audibility can be improved.
  • the one or more performance symbols are a plurality of performance symbols, each belonging to one of a plurality of classifications. A selection of at least one of the plurality of classifications is received, and the acoustic signal is generated for the performance symbols belonging to the one or more selected classifications.
  • acoustic signals are generated for performance symbols belonging to the selected classification. Therefore, it is possible to selectively produce sounds related to the performance symbols required by the user, thereby improving convenience.
  • the musical score includes musical note symbols in addition to the performance symbols, and generating the acoustic signal includes sounds associated with the performance symbols and generating said acoustic signal representing a sound associated with a musical note symbol.
  • acoustic signals representing sounds related to musical note symbols are generated. Therefore, the musical note symbols included in the musical score can be comprehended aurally, and the comprehension of the musical score can be further facilitated.
  • an acoustic signal representing a non-verbal notification sound is generated as the sound related to the rest.
  • non-verbal notification sounds are used as the sounds related to rests. Therefore, when a sound related to a rest is pronounced, the user can immediately recognize that the sound corresponds to the rest.
  • the musical score data is first musical score data.
  • generating the acoustic signal includes generating a first acoustic signal representing the sounds related to the performance symbols and musical note symbols included in a portion of the first musical score corresponding to the first musical score data, and a second acoustic signal representing the sounds related to the performance symbols and musical note symbols included in a portion of a second musical score corresponding to second musical score data different from the first musical score data.
  • the method further comprises causing a sound emitting device to sequentially reproduce the first acoustic signal and the second acoustic signal.
  • a portion is selected from each of a plurality of musical score data, and sounds associated with performance symbols and musical note symbols included in the selected portion are sequentially reproduced. Therefore, the user can easily grasp the musical score of which musical piece each of the plurality of musical score data corresponds to, and can quickly select the desired musical score data from the plurality of musical score data.
  • a program according to one aspect (a tenth aspect) of the present disclosure causes a computer system to function as a generation unit that generates, based on musical score data representing a musical score including one or more performance symbols, an acoustic signal representing a sound related to the performance symbols.
  • An information processing apparatus includes a generation unit that generates an acoustic signal representing a sound associated with one or more performance symbols based on musical score data representing a musical score including one or more performance symbols.
  • Braille sheet music requires about three times as much paper as regular sheet music to notate the same content, and it takes time to read. For this reason, when a user forgets the title of a piece of music and wants to find the desired score from the contents of the scores, for example, the user has to spend time reading through multiple scores, which is inconvenient.
  • an information processing method according to another aspect is implemented by a computer system, and includes: generating a first acoustic signal representing a sound related to a musical symbol included in a portion of a first musical score corresponding to first musical score data including one or more musical symbols; generating a second acoustic signal representing a sound related to a musical symbol included in a portion of a second musical score that includes one or more musical symbols and corresponds to second musical score data different from the first musical score data; and causing a sound emitting device to sequentially reproduce the first acoustic signal and the second acoustic signal.
  • an information processing method according to yet another aspect is realized by a computer system, and includes: acquiring a performance sound, which is the sound of a piece of music played by a user; detecting the difference between the sound indicated by the musical symbols included in the musical score representing the piece and the performance sound; and, when the difference is outside a predetermined allowable range, generating an acoustic signal representing a sound related to the musical symbol included in the portion of the musical score corresponding to the location where the difference occurs.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

An information processing device 10 generates, on the basis of score data SD representing a score including at least one performance mark, an acoustic signal representing a sound relating to the performance mark.

Description

Information processing method, program, and information processing device
The present disclosure relates to technology for assisting users in grasping the content of musical scores.
Conventionally, Braille musical scores have been used to enable visually impaired people to grasp the content of musical scores. For example, Patent Document 1 below discloses an automatic musical score transcription system that uses a computer to transcribe musical score data into Braille and outputs the transcribed score data to a Braille typewriter to generate a Braille musical score.
JP-A-60-119594
Expert knowledge is required to understand Braille musical scores, which makes them difficult for beginners and small children. In addition, compared with an ordinary musical score, a Braille musical score requires about three times as much paper to notate the same content, and reading it takes time. In view of the above circumstances, one aspect of the present disclosure aims at presenting the content of a musical score by sound.
To solve the above problems, an information processing method according to one aspect of the present disclosure is implemented by a computer system, and generates, based on musical score data representing a musical score including one or more performance symbols, an acoustic signal representing a sound related to the performance symbols.
A program according to one aspect of the present disclosure causes a computer system to function as a generation unit that generates, based on musical score data representing a musical score including one or more performance symbols, an acoustic signal representing a sound related to the performance symbols.
An information processing apparatus according to one aspect of the present disclosure includes a generation unit that generates, based on musical score data representing a musical score including one or more performance symbols, an acoustic signal representing a sound related to the performance symbols.
FIG. 1 is a block diagram illustrating the configuration of an information processing apparatus 10 according to the first embodiment.
FIG. 2 is a diagram illustrating the configuration of data stored in a storage device 12.
FIG. 3 is a diagram showing types of musical symbols.
FIG. 4 is a block diagram illustrating the functional configuration of a control device 11.
FIGS. 5 to 9 are diagrams each illustrating an instruction reception screen displayed by an instruction reception unit 30.
FIG. 10 is a diagram schematically showing processing of a text generation unit 32.
FIG. 11 is a diagram illustrating a musical score.
FIGS. 12 and 13 are diagrams schematically showing read-out timings of read-out text.
FIG. 14 is a diagram illustrating a display screen during reading of a musical score.
FIG. 15 is a flowchart illustrating a specific procedure by which the control device 11 executes the musical score reading application.
FIG. 16 is a diagram illustrating an instruction reception screen displayed by the instruction reception unit 30.
FIG. 17 is a diagram illustrating a display screen during execution of the table-of-contents presentation function.
FIG. 18 is a block diagram illustrating the functional configuration of a control device 11A in the third embodiment.
A: First Embodiment
FIG. 1 is a block diagram illustrating the configuration of an information processing apparatus 10 according to the first embodiment. The information processing device 10 is a computer system that includes a control device 11, a storage device 12, a sound collection device 13, a sound emission device 14, an operation device 15, and a display device 16. The information processing device 10 is realized by an information terminal such as a smartphone, a tablet terminal, or a personal computer; in this embodiment, it is assumed to be a smartphone. The information processing apparatus 10 may be implemented as a single device, or as a plurality of mutually separate devices (for example, a client-server system).
The control device 11 is composed of one or more processors that control each element of the information processing device 10. For example, the control device 11 is composed of one or more types of processors such as a CPU (Central Processing Unit), SPU (Sound Processing Unit), DSP (Digital Signal Processor), FPGA (Field Programmable Gate Array), or ASIC (Application Specific Integrated Circuit).
The storage device 12 is one or more memories that store the program PG (see FIG. 2) executed by the control device 11 and various data used by the control device 11. The storage device 12 is composed of a known recording medium such as a magnetic recording medium or a semiconductor recording medium, or a combination of multiple types of recording media. A portable recording medium that can be attached to and detached from the information processing device 10, or a recording medium (for example, cloud storage) to which the control device 11 can write and from which it can read via a communication network, may also be used as the storage device 12.
The sound collection device 13 detects ambient sound (air vibration) and outputs it as an acoustic signal. The sound collection device 13 is, for example, a microphone. A sound collection device 13 separate from the information processing device 10 may be connected to the information processing device 10 by wire or wirelessly.
The sound emitting device 14 reproduces the sound represented by an acoustic signal. The sound emitting device 14 is, for example, a speaker or headphones. A D/A converter that converts the acoustic signal from digital to analog and an amplifier that amplifies the acoustic signal are omitted from the drawing for convenience. A sound emitting device 14 separate from the information processing device 10 may be connected to the information processing device 10 by wire or wirelessly.
The operation device 15 is an input device that receives instructions from the user. The operation device 15 is, for example, a set of operators operated by the user, or a touch panel T that detects contact by the user. In this embodiment, the touch panel T is used as the operation device 15; in this case, the touch panel T serves as both the operation device 15 and the display device 16 described later. An operation device 15 separate from the information processing device 10 (for example, a mouse or a keyboard) may be connected to the information processing device 10 by wire or wirelessly.
The display device 16 displays images under the control of the control device 11. Various display panels, such as a liquid crystal display panel or an organic EL (Electroluminescence) panel, are used as the display device 16. A display device 16 separate from the information processing device 10 may be connected to the information processing device 10 by wire or wirelessly.
FIG. 2 is a diagram illustrating the configuration of data stored in the storage device 12, and FIG. 3 is a diagram showing types of musical symbols. As shown in FIG. 2, the storage device 12 stores the program PG executed by the control device 11, voice data VD, musical score data SD, and symbol text data TD. In this embodiment, the program PG is a program for executing a musical score reading application. The musical score reading application generates an acoustic signal representing sounds related to the various pieces of information written in the musical score corresponding to the musical score data SD, and reproduces that acoustic signal. More specifically, the application reads aloud the text corresponding to the musical symbols, allowing the user to grasp the content of the musical score aurally. Hereinafter, in this embodiment, reading out a musical score means generating an acoustic signal representing sounds related to the musical symbols included in the musical score and reproducing that acoustic signal.
The voice data VD is data for generating the synthesized voice that reads out the musical score. The voice data VD is a speech synthesis library containing a plurality of speech segments. Each speech segment is either a single phoneme (for example, a vowel or a consonant), the smallest unit of linguistic meaning, or a phoneme chain connecting multiple phonemes. In this embodiment, the voice data VD includes male voice data representing a male voice and female voice data representing a female voice, as well as Japanese voice data for pronouncing Japanese and English voice data for pronouncing English. That is, the voice data VD includes at least four combinations of two genders and two languages. By using multiple types of voice data VD, the type of reading voice can be changed, for example, for each part of the musical score or according to the type of symbol in the score, which improves convenience.
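The selection among the four voice libraries can be sketched as a simple registry keyed by gender and language. The file names are placeholders, not actual product data:

```python
# Hypothetical registry of the four voice libraries
# (two genders x two languages) described above.
VOICE_DATA = {
    ("male", "ja"):   "male_japanese.lib",
    ("female", "ja"): "female_japanese.lib",
    ("male", "en"):   "male_english.lib",
    ("female", "en"): "female_english.lib",
}

def select_voice(gender: str, language: str) -> str:
    """Pick the voice library used to read a given part or symbol type."""
    return VOICE_DATA[(gender, language)]

print(select_voice("female", "en"))  # female_english.lib
```

A caller could, for example, assign one voice to the right-hand part and another to the left-hand part so the listener can tell them apart.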
 The musical score data SD is data representing the musical score of a piece of music. The musical score data SD is distributed, for example, via a network from a distribution device (not shown) such as a web server, or by selling recording media on which the musical score data SD is recorded at stores. The musical score data SD is general-purpose data that can be obtained regardless of whether the user is able-bodied or hearing-impaired. In this embodiment, the musical score data SD describes the content of the musical score of a piece of music in a specific data description language. Specifically, the musical score data SD is a score-representation file (for example, a MusicXML file) in which the elements of the musical score, such as musical symbols, are expressed as logical information.
 The musical score data SD is, for example, distributed from a distribution device (typically a web server) to the information processing device 10 via a communication network such as the Internet, and then stored in the storage device 12. A plurality of pieces of musical score data SD may be stored in the storage device 12. Generally, one piece of musical score data SD is created for one piece of music.
 A musical score is a representation of a piece of music using musical symbols, which include performance symbols. As shown in FIG. 3, the musical symbols include note symbols, clefs, time signatures, key signatures, and performance symbols. The note symbols include notes, rests, and accidentals attached to notes in the score. A clef is written at the left end of the staff and specifies the relationship between positions on the staff and pitches. A time signature specifies the number of beats in one measure and the type of note counted as one beat. A key signature is a set of accidentals for specifying the key of the piece.
 Performance symbols are written on the score as a supplement to indicate to the performer nuances that cannot be expressed by notes and rests alone when performing a piece. The performance symbols include tempo symbols such as adagio and andante; expression symbols such as affettuoso and agitato; dynamics symbols such as fortissimo and pianissimo; symbols indicating articulation, such as tenuto and staccato (hereinafter "articulation symbols"); repeat symbols such as da capo and segno; ornament symbols such as trills and turns; abbreviation symbols such as ottava alta and ottava bassa; and playing-style symbols indicating instrument-specific techniques such as pedal and pizzicato. In this embodiment, the performance symbols also include finger numbers that designate the fingers to be used when playing the notes written on the score.
 As shown in FIG. 2, the musical score data SD contains attribute information B for each musical symbol constituting the target piece. The attribute information B is information defining the musical attributes of each musical symbol, and includes a beat identifier B1 and a symbol identifier B2. The beat identifier B1 is information specifying the temporal position of the musical symbol within the target piece. When the musical symbol is a note symbol or a performance symbol, the number of beats from the beginning of the target piece to that symbol (for example, a beat number obtained by counting an eighth note as one beat) is suitably used as the beat identifier B1. The symbol identifier B2 is information for identifying the type of the musical symbol. For example, when the musical symbol is a note symbol, the symbol identifier B2 includes a note name (note number) and a note value. The note name represents the pitch of the note, and the note value represents the duration of the note on the score. When the musical symbol is not a note symbol, a character string indicating the name of the musical symbol is suitably used as the symbol identifier B2.
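As a minimal illustration of the structure just described (the field names and the string encoding of B2 are hypothetical; the patent does not prescribe a concrete representation), the attribute information B can be modeled as a pair of a beat identifier and a symbol identifier:

```python
from dataclasses import dataclass

@dataclass
class AttributeInfo:
    # B1: beats from the start of the piece, counting an eighth note as one beat
    beat_id: float
    # B2: note name plus note value for note symbols, or the symbol's name otherwise
    symbol_id: str

# Hypothetical entries for the opening of a score: a dynamics symbol and two notes
symbols = [
    AttributeInfo(beat_id=0, symbol_id="mezzo piano"),
    AttributeInfo(beat_id=0, symbol_id="E4:eighth"),
    AttributeInfo(beat_id=1, symbol_id="F4:eighth"),
]
```

Two symbols sharing the same beat identifier (here the dynamics symbol and the first note) occupy the same temporal position in the piece.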
 The musical score data SD also includes tempo information TP specifying the tempo of the piece represented by the score. The tempo information TP includes, for example, the number of beats per minute (unit time) and the type of note counted as one beat.
 The musical score data SD also includes musical score image data MD. The musical score image data MD is data representing an image of the score of the target piece (hereinafter "score image"). Specifically, an image file (for example, a PDF file) representing the score image as a two-dimensional image in raster or vector format is suitable as the musical score image data MD.
 The symbol text data TD is data containing texts corresponding to the musical symbols written on the score. The symbol text data TD includes a symbol identifier C1, a name text C2, and a meaning text C3. The symbol identifier C1 is information for identifying the type of musical symbol, in the same format as the symbol identifier B2 of the attribute information B. The symbol identifier C1 corresponding to a note symbol may be the note name alone.
 The name text C2 is text indicating the name of the symbol specified by the symbol identifier C1. The symbol identifier C1 and the name text C2 may be the same character string. When the musical symbol is a note symbol, the name text C2 is "do", "re", and so on. When the musical symbol is a performance symbol, the name text C2 is "crescendo", "forte", or the like. In this embodiment, the score read-aloud application is assumed to be capable of reading scores aloud in multiple languages. As an example, the languages selectable for reading are Japanese and English. Accordingly, the name text C2 includes a Japanese text indicating the name of the musical symbol and an English text indicating the name of the musical symbol.
 The meaning text C3 is text indicating the meaning of the symbol specified by the symbol identifier C1. For example, when the name of the musical symbol is "adagio", the meaning text C3 is "slowly". When the musical symbol is a note symbol, the meaning text C3 need not be provided. The meaning text C3 likewise includes a Japanese text indicating the meaning of the musical symbol and an English text indicating the meaning of the musical symbol.
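A minimal sketch of the symbol text data TD, assuming a hypothetical dictionary layout: each symbol identifier C1 maps to a name text C2 and a meaning text C3, each held in both supported languages:

```python
# Hypothetical layout: symbol identifier C1 -> per-language name text C2 / meaning text C3.
symbol_text_data = {
    "adagio": {
        "ja": {"name": "アダージョ", "meaning": "ゆるやかに"},
        "en": {"name": "adagio", "meaning": "slowly"},
    },
    "forte": {
        "ja": {"name": "フォルテ", "meaning": "強く"},
        "en": {"name": "forte", "meaning": "loudly"},
    },
}

# Looking up the English meaning text C3 for "adagio":
meaning = symbol_text_data["adagio"]["en"]["meaning"]  # "slowly"
```

Adding a further language then only requires adding one more per-language entry to each symbol, matching the note at the end of the description of reception screen SC5.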
 A non-verbal notification sound may be used as the name text C2 corresponding to a rest. In this case, when generating the acoustic signal, an acoustic signal representing a non-verbal notification sound is generated as the sound related to the rest. A non-verbal notification sound is a sound with no linguistic meaning, and includes, for example, a metronome sound, a click sound, and a beep sound. By using a non-verbal notification sound as the sound related to a rest, the user can immediately recognize, when the notification sound is produced, that it corresponds to a rest. In this embodiment, a click sound is used as the name text C2 corresponding to a rest. The click sound corresponding to a quarter rest is "kachi", and the click sound corresponding to an eighth rest is "ka". The click sounds corresponding to rests are not limited to "kachi" and "ka"; for example, "un" may be used for a quarter rest and "u" for an eighth rest.
 Alternatively, as the name text C2 corresponding to a rest, a word customarily used to indicate a rest when counting beats may be used, such as "un" for a quarter rest or "u" for an eighth rest.
 FIG. 4 is a block diagram illustrating the functional configuration of the control device 11. Based on musical score data SD representing a musical score, the control device 11 generates acoustic signals representing sounds related to the musical symbols constituting the score (hereinafter "symbol sounds"). The score represented by the musical score data SD includes one or more performance symbols. It can therefore also be said that the control device 11 generates acoustic signals representing sounds related to performance symbols based on musical score data SD representing a score including one or more performance symbols. A sound related to a performance symbol is, for example, a sound indicating the name of the performance symbol or a sound indicating a phrase corresponding to the meaning of the performance symbol. In this embodiment, the name of a performance symbol corresponds to the name text C2 of the performance symbol, and the phrase corresponding to the meaning of a performance symbol corresponds to the meaning text C3 of the performance symbol. In this embodiment, the score also includes note symbols in addition to performance symbols. The control device 11 therefore generates acoustic signals representing sounds related to performance symbols and sounds related to note symbols. A sound related to a note symbol is, for example, a sound indicating the note name of the note indicated by the note symbol.
 The acoustic signal is a signal for causing the sound emitting device 14 to reproduce the symbol sounds. By executing the program PG stored in the storage device 12, the control device 11 realizes a plurality of functions for generating and reproducing the acoustic signals (an instruction receiving unit 30, a text generating unit 32, a voice synthesizing unit 34, a performance analyzing unit 38, and an output control unit 40).
 The instruction receiving unit 30 receives instructions from the user via the operation device 15. The instruction receiving unit 30 displays, for example, a screen for receiving instructions from the user on the touch panel T. The user inputs instructions by performing touch operations on the reception screen displayed on the touch panel T.
 FIGS. 5 to 9 are diagrams illustrating instruction reception screens displayed by the instruction receiving unit 30. When the score read-aloud application is started, the instruction receiving unit 30 causes the touch panel T to display a reception screen SC1 for selecting the piece to be read aloud, as shown in FIG. 5, for example. On the reception screen SC1, indications NA1 to NA5 showing, for example, the data names of the pieces of musical score data SD stored in the storage device 12 are displayed. The user designates the musical score data SD to be read aloud by touching the indication NA1 to NA5 corresponding to the desired musical score data SD. In the example of FIG. 5, the musical score data "yyy.xml" corresponding to the indication NA2 is designated. When the user touches the OK button BT in this state, the designation of the musical score data "yyy.xml" is confirmed. On the subsequent reception screens as well, the user's selection is confirmed by touching the OK button BT. Instead of the indications NA1 to NA5 showing the data names, the titles of the pieces corresponding to the musical score data SD may be displayed.
 When the musical score data SD is designated, the instruction receiving unit 30 causes the touch panel T to display a reception screen SC2 for selecting which staff of the musical score data SD is to be read aloud, as shown in FIG. 6, for example. On the reception screen SC2, options NB1 and NB2 for designating the staff to be read aloud from the grand staff are displayed. The option NB1 designates reading aloud of the right-hand staff located in the upper row of the grand staff. The option NB2 designates reading aloud of the left-hand staff located in the lower row of the grand staff. The user designates the staff to be read aloud by checking the check box CK of at least one of the options NB1 and NB2.
 When the staff to be read aloud is designated, the instruction receiving unit 30 causes the touch panel T to display a reception screen SC3 for selecting the types of symbols to be read aloud, as shown in FIG. 7, for example. On the reception screen SC3, options NC1 to NC11 for designating the types of symbols to be read aloud are displayed. The option NC1 designates reading aloud of note symbols. The option NC2 designates reading aloud of performance symbols. Of the musical symbols shown in FIG. 3, the clef, time signature, and key signature rarely change within one score, and are therefore not targets of continuous reading in this embodiment. However, the user may also be allowed to specify whether the clef, time signature, and key signature are to be read aloud, in the same way as note symbols and performance symbols.
 For performance symbols, the types of symbols to be read aloud can be designated in more detail. The option NC3 designates reading aloud of tempo symbols. The option NC4 designates reading aloud of expression symbols. The option NC5 designates reading aloud of dynamics symbols. The option NC6 designates reading aloud of articulation symbols. The option NC7 designates reading aloud of repeat symbols. The option NC8 designates reading aloud of ornament symbols. The option NC9 designates reading aloud of abbreviation symbols. The option NC10 designates reading aloud of playing-style symbols. The option NC11 designates reading aloud of finger numbers.
 That is, the score represented by the musical score data SD includes a plurality of performance symbols, and each of the plurality of performance symbols belongs to one of a plurality of classifications. The plurality of classifications correspond to the types of performance symbols shown in FIG. 3. The instruction receiving unit 30 receives a selection of at least one of the plurality of classifications of performance symbols. The control device 11 generates acoustic signals for those performance symbols, among the plurality of performance symbols, that belong to the one or more selected classifications.
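The selection-based filtering described above can be sketched as follows. The classification table and function names are hypothetical illustrations, not part of the claimed method; the categories mirror those of FIG. 3:

```python
# Hypothetical mapping from a performance symbol to its classification (cf. FIG. 3).
SYMBOL_CATEGORY = {
    "adagio": "tempo", "andante": "tempo",
    "fortissimo": "dynamics", "pianissimo": "dynamics",
    "staccato": "articulation", "tenuto": "articulation",
}

def filter_by_selected_categories(symbols, selected_categories):
    """Keep only the performance symbols whose classification the user selected
    on reception screen SC3; all others are excluded from signal generation."""
    return [s for s in symbols
            if SYMBOL_CATEGORY.get(s) in selected_categories]

# With only the "dynamics" classification selected, only "fortissimo" remains:
filter_by_selected_categories(["adagio", "staccato", "fortissimo"], {"dynamics"})
```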
 When the items to be read aloud are designated, the instruction receiving unit 30 causes the touch panel T to display a reception screen SC4 for selecting settings for reading the musical score data SD aloud, as shown in FIG. 8, for example. In the upper area E1 of the reception screen SC4, options ND1 and ND2 for designating the information to be output when the score is read aloud are displayed. The option ND1 designates reading the score aloud only; that is, it designates audio output only. The option ND2 designates displaying the score image in addition to reading the score aloud; that is, it designates output of both audio and image. One of the options ND1 and ND2 can be selected using radio buttons. The user designates the information to be output when the score is read aloud by touching the radio button corresponding to the option ND1 or the option ND2.
 In the lower area E2 of the reception screen SC4, options NE1 to NE4 for designating the read-aloud tempo of the score are displayed. The option NE1 designates reading aloud at the tempo specified in the score. When the option NE1 is selected, it may not be possible to read aloud all of the symbols designated on the reception screen SC3, depending on the relationship between the read-aloud tempo and the number of syllables to be read. In that case, the symbols to be read aloud are reduced as appropriate. The option NE2 designates reading aloud all of the symbols designated on the reception screen SC3, regardless of the tempo specified in the score. The option NE3 designates reading aloud at a tempo synchronized with the user's performance of the score. The option NE4 allows the user to designate an arbitrary tempo. In the illustrated example, the user designates the read-aloud tempo by specifying the number of beats per minute.
 In connection with the designation of the read-aloud tempo, it may also be possible to set, for example, the utterance time of one syllable during reading (in other words, the number of syllables read per unit time). For example, when the option NE1 is selected, the shorter the utterance time of one syllable, the more of the symbols designated on the reception screen SC3 can be read aloud. When the option NE2 is selected, the shorter the utterance time of one syllable, the sooner the reading can be completed.
 When the OK button BT on the reception screen SC4 is touched, the instruction receiving unit 30 causes the touch panel T to display a reception screen SC5 for selecting further settings, as shown in FIG. 9, for example. In the upper area E3 of the reception screen SC5, options NF1 and NF2 for designating the language used when reading the score aloud are displayed. The option NF1 designates reading in Japanese. Reading in Japanese corresponds, for example, to using "do, re, mi, fa, so, la, si" as note names and to using Japanese-style pronunciation when uttering the names of performance symbols. The option NF2 designates reading in English. Reading in English corresponds, for example, to using "C, D, E, F, G, A, B" as note names and to using English-style pronunciation when uttering the names of performance symbols. A language other than Japanese and English may also be selectable on the reception screen SC5. In that case, the symbol text data TD includes the name text C2 and the meaning text C3 in that language.
 In the central area E4 of the reception screen SC5, options NG1 and NG2 for designating what is read aloud for performance symbols are displayed. The option NG1 designates reading aloud the names of the performance symbols. When the option NG1 is designated, the name text C2 of the symbol text data TD is read aloud when a performance symbol is read. The option NG2 designates reading aloud phrases indicating the meanings of the performance symbols. When the option NG2 is designated, the meaning text C3 of the symbol text data TD is read aloud when a performance symbol is read.
 In the lower area E5 of the reception screen SC5, options NH1 and NH2 for designating the types of read-aloud voices are displayed. For example, when both the right-hand staff and the left-hand staff are designated as staffs to be read aloud on the reception screen SC2, texts indicating a plurality of note symbols may be read aloud simultaneously. In this embodiment, to make the read-aloud texts easier for the user to distinguish, different types of voices can be designated for reading the right-hand staff and for reading the left-hand staff. That is, the instruction receiving unit 30 can set the voice type individually for each of a plurality of parts of the piece. In this embodiment, a male voice or a female voice can be designated as the voice type. The option NH1 designates either a male voice or a female voice as the voice for reading the right-hand staff. The option NH2 designates either a male voice or a female voice as the voice for reading the left-hand staff.
 For example, the voice type for reading note symbols and the voice type for reading performance symbols may also be designated as different types. When four or more voice types can be designated, it may be possible to designate, for example, that the note symbols on the right-hand staff, the note symbols on the left-hand staff, the performance symbols on the right-hand staff, and the performance symbols on the left-hand staff are each read aloud in a different voice.
 When the sound emitting device 14 is a stereo speaker, it may be possible to set the voice reading the right-hand staff to be output from the right speaker and the voice reading the left-hand staff to be output from the left speaker. When the sound emitting device 14 is a stereo speaker, it may also be possible to designate that the speaker outputting the read-aloud sounds of note symbols and the speaker outputting the read-aloud sounds of performance symbols are separate.
 In addition, for example, when a chord is read aloud, the user may be able to select whether each note constituting the chord is read aloud individually or the chord name corresponding to the chord is read aloud. In that case, the name text C2 of the symbol text data TD may store text indicating the note names of the notes constituting the chord, and the meaning text C3 may store text indicating the chord name.
 When these settings are completed and the OK button BT in FIG. 9 is pressed, the control device 11 starts the acoustic signal generation process. The touch panel T also displays, for example, a button for instructing the start of reading the score aloud (hereinafter "performance start button"). The user presses the performance start button at an appropriate time to start the reading of the score.
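The choices collected on the reception screens SC1 to SC5 can be thought of as one configuration object handed to the generation process. The following sketch is purely illustrative; the keys and values are hypothetical and not prescribed by the embodiment:

```python
# Hypothetical consolidation of the selections made on screens SC1-SC5.
reading_settings = {
    "score_data": "yyy.xml",                  # SC1: score to read aloud
    "staffs": ["right"],                      # SC2: NB1 (right hand) and/or NB2 (left hand)
    "symbol_types": ["note", "dynamics"],     # SC3: NC1-NC11 symbol classifications
    "output": "audio_only",                   # SC4 area E1: ND1 (audio) or ND2 (audio+image)
    "tempo_mode": "score_tempo",              # SC4 area E2: NE1-NE4
    "language": "en",                         # SC5 area E3: NF1 (ja) or NF2 (en)
    "read_content": "name",                   # SC5 area E4: NG1 (name) or NG2 (meaning)
    "voice": {"right": "female", "left": "male"},  # SC5 area E5: NH1/NH2
}
```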
 The text generating unit 32 shown in FIG. 4 generates text indicating the content of the score. FIG. 10 is a diagram schematically showing the processing of the text generating unit 32. The text generating unit 32 reads the musical score data SD designated on the reception screen SC1 shown in FIG. 5 (S1). The text generating unit 32 classifies the musical score data SD into right-hand data representing the right-hand staff and left-hand data representing the left-hand staff (S2). Of the right-hand data and the left-hand data, the data corresponding to the staff designated for reading on the reception screen SC2 shown in FIG. 6 becomes the target of the subsequent processing.
 The data of the staff to be read aloud contains the attribute information B of all types of musical symbols (S3). The text generating unit 32 extracts the attribute information B of the symbols designated for reading on the reception screen SC3 shown in FIG. 7 from the data of the staff to be read aloud, and arranges it in time series based on the beat identifiers B1 (S4).
 The text generating unit 32 collates the symbol identifier B2 of the extracted attribute information B with the symbol identifier C1 of the symbol text data TD, and reads out the name text C2 or the meaning text C3 corresponding to the symbol identifier C1 (S5). Whether the name text C2 or the meaning text C3 is read out is determined by which of the options NG1 and NG2 on the reception screen SC5 shown in FIG. 9 has been selected. Whether the Japanese text or the English text is read out is determined by which of the options NF1 and NF2 on the reception screen SC5 has been selected. In the figure, these selections are denoted as "designation of reading content". The read-out texts are arranged in the same order (time series) as the attribute information B. Through the above processing, text indicating the content of the score (hereinafter "reading text") is generated (S6).
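Steps S4 to S6 amount to sorting the extracted attribute information by beat identifier and looking up the matching text. The following sketch assumes hypothetical dictionary layouts for the attribute information and the symbol text data; it is an illustration, not the claimed implementation:

```python
def build_reading_text(attributes, text_data, lang="en", use_meaning=False):
    """S4-S6: arrange symbols in time series (beat identifier B1) and, for each
    symbol identifier (B2 matched against C1), look up the name text C2 or the
    meaning text C3 in the designated language."""
    key = "meaning" if use_meaning else "name"
    texts = []
    for attr in sorted(attributes, key=lambda a: a["beat"]):
        entry = text_data.get(attr["symbol"])
        if entry is not None:
            texts.append(entry[lang][key])
    return texts

# Hypothetical inputs: attributes out of order, a two-symbol text table
attrs = [{"beat": 2, "symbol": "forte"}, {"beat": 0, "symbol": "adagio"}]
td = {
    "adagio": {"en": {"name": "adagio", "meaning": "slowly"}},
    "forte": {"en": {"name": "forte", "meaning": "loudly"}},
}
build_reading_text(attrs, td)                    # name texts, in beat order
build_reading_text(attrs, td, use_meaning=True)  # meaning texts instead
```

The `use_meaning` flag plays the role of the NG1/NG2 choice, and `lang` the role of the NF1/NF2 choice, on reception screen SC5.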
 The text generating unit 32 also adds timing labels to the reading text (S7). A timing label is information specifying the timing at which the reading text is read aloud. Even for reading text with the same content, the reading speed differs according to the read-aloud tempo designated on the reception screen SC4 shown in FIG. 8. The text generating unit 32 therefore assigns the reading text timing labels corresponding to the read-aloud tempo setting.
 FIG. 11 is a diagram illustrating a musical score. FIGS. 12 and 13 are diagrams schematically showing the read-aloud timings of the reading text. The musical score G shown in FIG. 11 represents, for example, the first two measures of the musical score data (yyy.xml) designated as the reading target on the reception screen SC1 shown in FIG. 5. The musical score G includes a right-hand score and a left-hand score. Based on the tempo information TP, the musical score G is specified as 120 beats per minute, with one quarter note counted as one beat.
 例えば、図8に示す受付画面SC4において、楽曲のテンポでの読み上げ(選択肢NE1)が指定された場合、図12に示すタイミングでの読み上げを行うようにタイミングラベルが付加される。図12には、右手用の楽譜の読み上げ音を示す右手用読み上げ音と、左手用の楽譜の読み上げ音を示す左手用読み上げ音とが示されている。右手用読み上げ音と左手用の楽譜の読み上げ音との間に、時間軸を示す。時間軸の1目盛り(t1)は、楽譜G内で最も短い音符である8分音符を基準としている。上述した楽譜Gのテンポに基づくと、時間軸の1目盛り(t1)は0.25秒となる。 For example, in the reception screen SC4 shown in FIG. 8, when reading aloud at the tempo of the music (option NE1) is specified, a timing label is added so as to read aloud at the timing shown in FIG. FIG. 12 shows right-hand read-out sounds representing right-hand read-out sounds and left-hand read-out sounds representing left-hand read-out sounds. A time axis is shown between the reading sound for the right hand and the reading sound of the musical score for the left hand. One scale (t1) of the time axis is based on the eighth note, which is the shortest note in the musical score G. Based on the tempo of the musical score G described above, one scale (t1) on the time axis is 0.25 seconds.
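As a worked check of this timing arithmetic, the duration of one time-axis division can be derived from the tempo information. A minimal Python sketch; the function name and parameters are illustrative and not part of the embodiment:

```python
def grid_duration_seconds(bpm: float, beat_note: int, shortest_note: int) -> float:
    """Duration of one time-axis division, based on the shortest note in the score.

    bpm: beats per minute given by the tempo information (e.g. 120)
    beat_note: note value counted as one beat (4 = quarter note)
    shortest_note: shortest note value appearing in the score (8 = eighth note)
    """
    beat_seconds = 60.0 / bpm                        # one beat in seconds
    return beat_seconds * beat_note / shortest_note  # rescale to the shortest note

# Musical score G: quarter note = 1 beat at 120 BPM, shortest note is an eighth note.
print(grid_duration_seconds(120, 4, 8))  # 0.25
```

At 120 BPM a beat lasts 0.5 seconds, so the eighth-note division is half of that, matching the 0.25 seconds stated above.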
 The right-hand read-aloud sounds are as follows. When reading starts, "mezzo piano", "staccato", and "mi" are read out within the first 0.25 seconds (period P1); their order is arbitrary. Next, "staccato" and "fa" are read out within 0.25 seconds (period P2). Comparing periods P1 and P2, period P1 contains more syllables per unit time, so the reading speed in period P1 must be faster than in period P2. Next, "so" is read out over 0.5 seconds (period P3). Comparing periods P2 and P3, period P3 contains fewer syllables per unit time, so the reading speed in period P3 is slower than in period P2. Next, "mi" is read out over 0.5 seconds (period P4). Periods P3 and P4 have the same number of syllables per unit time, so their reading speeds are substantially the same. Finally, over 0.5 seconds (period P5), a click sound indicating a quarter rest is produced.
 That is, when reading at the tempo of the piece is designated, the control device 11 generates the acoustic signal such that, when a target time point advancing through the piece at the speed corresponding to the tempo designated by the tempo information TP reaches the time point corresponding to a performance symbol, the sound related to that performance symbol is produced.
 Note that when the number of syllables per unit time is large and may be difficult for the user to follow, as in period P1, the text generation unit 32 may reduce the text to be read out. For example, if the utterance time per syllable during reading is fixed, whether the reading is feasible can be judged as soon as the read-aloud text is generated. After generating the read-aloud text, the text generation unit 32 determines whether the text can be read out in time, based on the tempo of the piece and the utterance time per syllable.
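This feasibility judgment amounts to a simple budget check: total syllables times the per-syllable utterance time against the length of the period. A hedged sketch with an assumed per-syllable time of 0.1 s, approximating each kana character as one syllable; the names are hypothetical:

```python
def fits_in_period(texts, period_seconds, seconds_per_syllable=0.1):
    """Return True if all read-aloud texts fit within the period at a fixed,
    assumed per-syllable utterance time."""
    syllables = sum(len(t) for t in texts)  # rough: one syllable per kana character
    return syllables * seconds_per_syllable <= period_seconds

# Period P1 (0.25 s) contains "メゾピアノ", "スタッカート", "ミ" -- 12 syllables, too many.
print(fits_in_period(["メゾピアノ", "スタッカート", "ミ"], 0.25))  # False
# Period P3 (0.5 s) contains only "ソ".
print(fits_in_period(["ソ"], 0.5))  # True
```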
 If the text generation unit 32 determines that the reading cannot be completed in time, it may delete the text corresponding to the performance symbols from the read-aloud text and read out only the note symbols. Alternatively, it may sound the readings of the individual symbols overlapped; taking period P1 as an example, "mezzo piano", "staccato", and "mi" may be sounded on top of one another.
 Further, when the reading cannot be completed in time within a period containing multiple performance symbols (for example, period P1 in FIG. 12), the text generation unit 32 may read out only some of the performance symbols and exclude the rest. Taking period P1 as an example, the text generation unit 32 may read out one of "mezzo piano" and "staccato" and omit the other. That is, when "mezzo piano" and "staccato" would be sounded overlapped, the control device 11 selects either "mezzo piano" or "staccato" and includes only the text corresponding to the selected performance symbol in the read-aloud text. "Mezzo piano" is an example of a sound related to a first performance symbol, and "staccato" is an example of a sound related to a second performance symbol.
 Note that the instruction reception unit 30 may let the user set a read-aloud priority for each category of performance symbols. In that case, the text generation unit 32 deletes performance-symbol texts from the read-aloud text in order, starting with those belonging to the lowest-priority category.
 Alternatively, a non-verbal sound may be predetermined for each performance symbol, and when the reading cannot be completed in time, the non-verbal sound may be placed in the read-aloud text instead of the text corresponding to the performance symbol.
 The left-hand read-aloud sounds are as follows. The left-hand score contains a triad consisting of three pitch classes. When reading starts, the chord of the first bar is read out, and this reading continues for two seconds. In this embodiment, the chord is not uttered as "do-mi-so" in a single phrase; rather, "do", "mi", and "so" are each read out as independent sounds. If the readings of "do", "mi", and "so" all started simultaneously, the user might be unable to distinguish the sounds. Accordingly, as shown in FIG. 12, the start timings of "do", "mi", and "so" may be slightly staggered. "Slightly" may mean, for example, a time shorter than the duration of the shortest note in the score; in the case of the musical score G, a time shorter than the 0.25 seconds corresponding to the eighth note.
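The slight stagger of the chord-tone onsets can be sketched as offsets that are a fraction of the shortest note's duration; the particular fraction below is an assumed choice, not something the embodiment specifies:

```python
def chord_onsets(start, num_notes, shortest_note_seconds, fraction=0.25):
    """Start times for the notes of a chord, staggered by a fraction of the
    shortest note's duration so each offset stays below that duration."""
    step = shortest_note_seconds * fraction
    return [start + i * step for i in range(num_notes)]

# Three chord tones of musical score G (shortest note = 0.25 s):
print(chord_onsets(0.0, 3, 0.25))  # [0.0, 0.0625, 0.125]
```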
 By reading the musical score aloud in step with the tempo of the piece in this way, the user can grasp the rhythm of the piece as well as the pitches of the notes shown in the score.
 Also, for example, when "read all items" (option NE2) is designated on the reception screen SC4 shown in FIG. 8, timing labels are added so that reading is performed at the timing shown in FIG. 13. In the read-all-items mode, the number of syllables read per unit time is kept constant. FIG. 13 shows only the right-hand read-aloud sounds; the left-hand read-aloud sounds are omitted. In FIG. 13, the symbol Mti (i is an integer from 1 to 9) denotes a metronome sound. The metronome sounds Mti let the user recognize the beat boundaries.
 When reading starts, the phrase "first bar" is first read out to indicate that the first bar is about to be read. Next, the metronome sound Mt1 is produced, followed by the readings of "mezzo piano", "staccato", "mi", "staccato", and "fa". Then the metronome sound Mt2 is produced, and "so" is read out. Thereafter, in the same way, the texts indicating the pitches of the note symbols are read out between the metronome sounds Mti. Note that the last note of the second bar is a half note, two beats long. In this case, after the metronome sound Mt7, the reading of "mi" is sustained as "mi—", the metronome sound Mt8 is produced over the sustained "mi—", and the metronome sound Mt9 is produced after the reading of "mi—" ends.
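The read-all-items layout can be sketched as a flat schedule: a bar announcement, then alternating metronome ticks and the texts for each beat, at a constant per-syllable rate. The data shape, event names, and the 0.2 s rate are illustrative assumptions:

```python
def all_items_schedule(bars, seconds_per_syllable=0.2):
    """bars: list of bars; each bar is a list of beats; each beat is the
    list of texts to read after that beat's metronome tick.
    Returns (time, event) pairs laid out at a constant per-syllable rate."""
    t, events = 0.0, []
    for i, bar in enumerate(bars, start=1):
        label = f"第{i}小節"                       # bar announcement, e.g. "first bar"
        events.append((round(t, 2), label))
        t += len(label) * seconds_per_syllable
        for beat in bar:
            events.append((round(t, 2), "click"))  # metronome sound Mti
            t += seconds_per_syllable
            for text in beat:
                events.append((round(t, 2), text))
                t += len(text) * seconds_per_syllable
    return events

for event in all_items_schedule([[["メゾピアノ", "スタッカート", "ミ"], ["ソ"]]]):
    print(event)
```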
 The left-hand read-aloud sounds are preferably linked with those for the right hand. For example, after the metronome sound Mt1 is produced, the readings of "do—", "mi—", and "so—" continue until just before the phrase "second bar" is read out; their start timings may be slightly staggered as described above. Then, after the metronome sound Mt5 is produced, the readings of "si—", "re—", and "so—" continue until just before the metronome sound Mt9 is produced.
 In this way, the control device 11 may generate an acoustic signal in which the sounds related to the performance symbols are produced irrespective of the tempo of the piece. The user can thereby grasp all symbols of the designated types, and thus the content of the musical score, without omission.
 Note that when reading at a tempo synchronized with the user's performance of the score (option NE3) is designated on the reception screen SC4 shown in FIG. 8, the read-aloud timing cannot be predicted in advance, so the text generation unit 32 need not add timing labels. When the user designates an arbitrary tempo (option NE4), the same processing as for reading at the tempo of the piece (option NE1) described above may be performed, with the tempo of the piece replaced by the user-designated tempo.
 On the reception screen SC4 shown in FIG. 8, option NE4 lets the user designate an arbitrary tempo by specifying the number of beats per minute. This is not limiting; for example, the user may control the progress of the reading with an operator. The operator may be, for example, an operation button displayed on the touch panel T. Alternatively, when the information processing device 10 is connected to a musical instrument played by the user, the operator may be a part of the instrument. For example, if the instrument is a piano, a pedal can be used as the operator. In that case, for example, each press of the damper pedal may advance the reading by one note symbol or one bar, and each press of the soft pedal may move the reading back by one note symbol or one bar.
 The voice synthesis unit 34 shown in FIG. 4 generates an acoustic signal using the read-aloud text generated by the text generation unit 32 and the voice data VD. The voice synthesis unit 34 is an example of a generation unit. It generates the acoustic signal by sequentially selecting, from the plurality of speech units included in the voice data VD, the speech units corresponding to the read-aloud text, adjusting the pitch of each speech unit, and concatenating them. The pitch of a sound related to a note symbol in the read-aloud text may match the pitch of that note symbol, or may be a predetermined fixed pitch. The acoustic signal generated by the voice synthesis unit 34 is supplied to the sound emitting device 14, so that the sound representing the musical score is reproduced from the sound emitting device 14.
 The performance analysis unit 38 analyzes the user's performance on the instrument. For example, the performance analysis unit 38 analyzes the position in the piece that the user is currently playing (the performance position). The performance analysis unit 38, for example, picks up the performance sound of the instrument with the sound pickup device 13 and analyzes its pitch and duration. It then collates the analyzed pitches of the performance sounds against the pitches of the notes in the musical score data SD, and sequentially determines the performance position within the piece at each of a plurality of points on the time axis.
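The pitch-matching step of this analysis can be sketched, in a deliberately simplified monophonic form, as advancing a cursor whenever the detected pitch equals the next expected note; MIDI note numbers serve here as an illustrative pitch encoding, not one prescribed by the embodiment:

```python
def advance_position(score_pitches, position, detected_pitch):
    """Advance the performance position when the detected pitch matches the
    next expected note; otherwise hold the current position."""
    if position < len(score_pitches) and detected_pitch == score_pitches[position]:
        return position + 1
    return position

score = [64, 65, 67, 64]  # E4, F4, G4, E4 -- an assumed right-hand melody
pos = 0
for pitch in [64, 60, 65]:  # detected pitches; the middle one is a wrong note
    pos = advance_position(score, pos, pitch)
print(pos)  # 2 -- the next expected note is score[2]
```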
 Further, when the instrument is an electronic instrument, the performance analysis unit 38 may acquire from it performance information indicating the operating state of the instrument. When the electronic instrument is an electronic piano, for example, the operating state includes the identifiers of the pressed keys and the pressing force. In this case, the performance analysis unit 38 uses the performance information to map the performance position at each point in time onto the score.
 Note that in the first embodiment, the performance analysis unit 38 need operate only when reading at a tempo synchronized with the user's performance of the score (option NE3) is designated on the reception screen SC4 shown in FIG. 8.
 The output control unit 40 controls the output of sound based on the acoustic signal and the output of the score image based on the score image data MD. For example, when only reading the score aloud is designated on the reception screen SC4 shown in FIG. 8 (option ND1), the output control unit 40 causes the sound emitting device 14 to reproduce the sound represented by the acoustic signal generated by the voice synthesis unit 34. When both reading the score aloud and displaying the score image are designated (option ND2), the output control unit 40 causes the sound emitting device 14 to reproduce that sound and also causes the display device 16 to display the score image data MD.
 FIG. 14 is a diagram illustrating a display screen during the reading of a musical score. For example, when the user touches a performance start button displayed on the touch panel T after finishing the selection input on the reception screen SC5 shown in FIG. 9, the read-aloud sound of the score, i.e., the sound represented by the acoustic signal, is output from the sound emitting device 14, and the display on the touch panel T serving as the display device 16 switches to the display screen SC6 shown in FIG. 14. The display screen SC6 shows a message 601 indicating that the read-aloud sound of the score is being reproduced, a score image 602, a pause button 604, a fast-forward button 606, a rewind button 608, a listen-again button 610, and an end button 612.
 The score image 602 displays the score image data MD included in the musical score data SD being read. A bar 603 indicating the current read-aloud position is superimposed on the score image 602. The output control unit 40 scrolls the score image 602 based on the timing labels attached to the read-aloud text, adjusting the scroll speed so that the bar 603 stays over the musical symbol being read out. Instead of indicating the read-aloud position with the bar 603, the musical symbol being read out may be highlighted.
 Although FIG. 14 illustrates a score image 602 in staff notation, this is not limiting; for example, a piano roll may be displayed as the score image 602. When only reading the score aloud is designated on the reception screen SC4 shown in FIG. 8 (option ND1), the score image 602 is not displayed.
 The pause button 604, fast-forward button 606, rewind button 608, listen-again button 610, and end button 612 accept operations related to the reading of the score. When the pause button 604 is operated, the output control unit 40 pauses the reading. When the fast-forward button 606 is operated, the output control unit 40 advances the reading; for example, one touch of the fast-forward button 606 moves the read-aloud position to the beginning of the bar following the bar that contains the current position. When the rewind button 608 is operated, the output control unit 40 moves the reading back; for example, one touch of the rewind button 608 moves the read-aloud position to the beginning of the bar that contains the current position. When the listen-again button 610 is operated, the output control unit 40 restarts the reading of the current score from the beginning; in other words, it moves the read-aloud position to the head of the first bar of the score being read. When the end button 612 is operated, the output control unit 40 ends the reading of the score.
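These button behaviors amount to snapping a (bar, offset) read-aloud position to bar boundaries. A minimal sketch; the position representation itself is an assumption of this illustration:

```python
def fast_forward(bar, total_bars):
    """One touch: jump to the head of the bar after the current one."""
    return (min(bar + 1, total_bars), 0.0)

def rewind(bar):
    """One touch: jump back to the head of the current bar."""
    return (bar, 0.0)

def listen_again():
    """Restart from the head of the first bar."""
    return (1, 0.0)

print(fast_forward(3, 8))  # (4, 0.0)
print(rewind(3))           # (3, 0.0)
print(listen_again())      # (1, 0.0)
```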
 The user may also be allowed to designate the position at which reading of the score starts. For example, while waiting for an instruction to start reading, the output control unit 40 displays a performance start button and the score image on the touch panel T. With reference to FIG. 14, a reading start button is displayed in place of the message 601 on the display screen SC6. The user scrolls the score image 602 so that the bar 603 overlaps the position on the score image 602 at which reading should start. When the reading start button is touched in this state, reading starts from the position on the score image 602 that overlaps the bar 603. Alternatively, the start position may be designated by specifying the number of the bar at which reading should begin.
 When reading at a tempo synchronized with the user's performance of the score (option NE3) is designated on the reception screen SC4 shown in FIG. 8, the output control unit 40 adjusts the output timing of the read-aloud sound based on the performance position analyzed by the performance analysis unit 38. For example, the output control unit 40 adjusts the output speed of the read-aloud sound so that the position read out on the score is a predetermined number of beats ahead of the performance position. The user may be allowed to specify this predetermined number of beats.
 As another example, when the performance position is in the N-th bar (N is an integer of 1 or more), the output control unit 40 may read out the musical symbols contained in the (N+1)-th bar immediately before the performance of the N-th bar ends, for example after the last note of the N-th bar has been played. This imitates, for example, the teaching method in which a choir conductor cues the members by reciting the next lyrics in advance.
 FIG. 15 is a flowchart illustrating a specific procedure by which the control device 11 executes the score read-aloud application. The application is started, for example, in response to a user instruction on the operation device 15.
 When the score read-aloud application is started, the control device 11 (instruction reception unit 30) displays the reception screens SC1 to SC5 shown in FIGS. 5 to 9 and accepts various designations related to reading the score from the user (S100). These designations include, for example, the musical score data SD to be read, the types of symbols to be read, and the reading language.
 The control device 11 (text generation unit 32) generates the read-aloud text based on the designations accepted in S100, the musical score data SD, and the symbol text data TD (S102). The control device 11 (voice synthesis unit 34) then uses the read-aloud text and the voice data VD to generate, by voice synthesis, an acoustic signal in which the read-aloud text is spoken (S104). The control device 11 (output control unit 40) waits until the user instructs it to read the score aloud (S106: NO). When the user gives the instruction (S106: YES), the control device 11 (output control unit 40) reproduces the acoustic signal from the sound emitting device 14 (S108) and ends the processing of this flowchart.
 Note that the generation of the read-aloud text (S102) and the generation of the acoustic signal (S104) may instead be performed after the user's instruction to read the score aloud (S106: YES).
 As described above, in the first embodiment, an acoustic signal representing sounds related to performance symbols is generated based on musical score data SD that includes one or more performance symbols. The performance symbols in the score can therefore be grasped aurally, which makes the score easier to understand, for example, for visually impaired users, beginners unaccustomed to reading music, and small children.
 In the first embodiment, a sound related to a performance symbol is a sound indicating the name of the performance symbol or a sound indicating a phrase corresponding to its meaning. When it is the name, the notation on the score can be grasped accurately. When it is a phrase corresponding to the meaning, even a user with little knowledge of performance symbols, who could not understand the meaning from the name alone, can grasp what the score indicates.
 In the first embodiment, the sound corresponding to a performance symbol is produced at a timing corresponding to the tempo of the piece. This makes it easier for the user to locate the performance symbol within the piece, improving convenience.
 In the first embodiment, when a sound related to a first performance symbol and a sound related to a second performance symbol would be produced overlapped, either the first or the second performance symbol is selected for generating the acoustic signal. The two sounds are thus not produced on top of each other, which makes the sounds related to the performance symbols easier to hear.
 In the first embodiment, the sounds corresponding to the performance symbols may also be produced irrespective of the tempo of the piece. This likewise prevents the sounds related to the performance symbols from overlapping and makes them easier to hear.
 In the first embodiment, acoustic signals are generated for performance symbols belonging to a category selected from a plurality of categories. The user can thus selectively have the sounds of the performance symbols they need produced, improving convenience.
 In the first embodiment, in addition to the sounds related to the performance symbols, an acoustic signal representing sounds related to the note symbols is generated. The note symbols in the score can therefore also be grasped aurally, making the score still easier to understand.
 In the first embodiment, a non-verbal notification sound is produced as the sound related to a rest. When this sound is produced, the user can immediately recognize that it corresponds to a rest.
B: Second Embodiment
 A second embodiment will be described. In each of the aspects illustrated below, elements whose functions are the same as in the first embodiment are given the reference numerals used in the description of the first embodiment, and their detailed descriptions are omitted as appropriate.
 第1実施形態では、情報処理装置10に記憶された複数の楽譜データSDのうち、利用者が指定した楽譜データSDの読み上げを行った。一方で、例えば図5に示す受付画面SC1のような楽譜データSDのデータ名の一覧が表示されても、利用者が所望の楽曲に対応する楽譜データSDを識別できない場合がある。第2実施形態では、複数の楽譜データSDの一部分を連続的に読み上げ、利用者が所望の楽曲に対応する楽譜データSDを特定できるようにする。複数の楽譜データSDの一部分を連続的に読み上げる機能を、以下「目次提示機能」と称する。 In the first embodiment, among the plurality of musical score data SD stored in the information processing device 10, the musical score data SD specified by the user is read out. On the other hand, even if a list of data names of the musical score data SD such as the reception screen SC1 shown in FIG. 5 is displayed, the user may not be able to identify the musical score data SD corresponding to the desired song. In the second embodiment, a part of a plurality of musical score data SD is continuously read out so that the user can specify the musical score data SD corresponding to the desired music piece. The function of continuously reading a part of a plurality of musical score data SD is hereinafter referred to as a "table of contents presentation function".
FIG. 16 is a diagram illustrating a screen on which the instruction receiving unit 30 accepts instructions. In the second embodiment, when the score read-aloud application is started, the instruction receiving unit 30 causes the touch panel T to display a menu-selection acceptance screen SC7 such as that shown in FIG. 16. Options NI1 and NI2 are displayed on the acceptance screen SC7. Option NI1 designates reading aloud of musical score data SD selected by the user, as described in the first embodiment. When option NI1 is touched, the instruction receiving unit 30 displays the acceptance screen SC1 shown in FIG. 5 and accepts the user's designation of the musical score data SD to be read aloud.
Option NI2 designates execution of the table-of-contents presentation function. In option NI2, the table-of-contents presentation function is labeled "melody table of contents." When option NI2 is selected, the text generation unit 32 generates, for each of the plural pieces of musical score data SD stored in the storage device 12, read-aloud text for reading aloud an excerpt of the musical score represented by that musical score data SD. The excerpt includes, for example, performance symbols and musical note symbols.
The excerpt of the musical score is, for example, a part or all of a particular section among plural sections into which the piece is divided according to its musical meaning (hereinafter "structural sections"). Structural sections are, for example, the intro, the A section (verse), the B section (bridge), the chorus, and the outro. Specifically, the text generation unit 32 generates, for each of the plural pieces of musical score data SD, text for reading aloud, for example, the "chorus" structural section of the piece. Alternatively, the text generation unit 32 generates, for each of the plural pieces of musical score data SD, text for reading aloud, for example, the "intro" structural section (a predetermined number of bars at the beginning of the score).
Note that the musical score data SD may include information indicating the correspondence between position information in the score (for example, bar numbers) and the structural sections. Also, while the table-of-contents presentation function is executed, the various instructions accepted by the instruction receiving unit 30 in the first embodiment (see FIGS. 6 to 9) may likewise be accepted. The speech synthesis unit 34 generates an acoustic signal using the read-aloud text generated from each piece of musical score data SD and the voice data VD. The output control unit 40 causes the sound emitting device 14 to reproduce sound based on the acoustic signal.
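The bar-number-to-section correspondence described above can be sketched as follows. This is an illustrative sketch only, not part of the disclosed implementation; the data layout (a list of bars plus a list of (start_bar, end_bar, name) tuples held alongside the score data) is an assumption made for the example.

```python
# Illustrative sketch: selecting the bars of one structural section using a
# hypothetical bar-number-to-section mapping stored with the score data SD.

def extract_section(bars, section_map, wanted="chorus"):
    """Return the bars belonging to one structural section.

    bars:        list of bar records, ordered by bar number starting at 1
    section_map: list of (start_bar, end_bar, section_name) tuples
    """
    for start, end, name in section_map:
        if name == wanted:
            return bars[start - 1:end]  # bar numbers are 1-based
    # Fall back to an "intro"-style excerpt: a fixed number of opening bars.
    return bars[:8]

bars = [{"no": i} for i in range(1, 33)]
section_map = [(1, 8, "intro"), (9, 16, "A"), (17, 24, "chorus"), (25, 32, "outro")]
chorus = extract_section(bars, section_map)
print(len(chorus), chorus[0]["no"])  # 8 17
```

The read-aloud text for the table-of-contents function would then be generated only from the bars returned here.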
FIG. 17 is a diagram illustrating the display screen during execution of the table-of-contents presentation function. Displays NA1 to NA5, indicating the data names of the musical score data SD stored in the storage device 12, appear on the display screen SC8. The displays NA1 to NA5 are arranged vertically, and excerpts of the scores are read aloud in order, starting with the musical score data "xxx.xml" indicated by display NA1. When the reading of the musical score data "xxx.xml" ends, reading of the musical score data "yyy.xml" indicated by display NA2 begins. The display corresponding to the musical score data currently being read (display NA2 in FIG. 17) may be shown with a background color different from that of the other displays.
Note that the display screen SC8 of FIG. 17 may allow the user to select, from among the plural pieces of musical score data SD stored in the storage device 12, the musical score data SD to be read aloud by the table-of-contents presentation function. The user may also be allowed to specify the order in which the musical score data SD are read aloud by the table-of-contents presentation function.
The display screen SC8 also shows a pause button 604, a fast-forward button 606, a rewind button 608, a re-listen button 610, and an end button 612, the same as on the display screen SC6 shown in FIG. 14. Even while the table-of-contents presentation function is being executed, the user can use these buttons to perform operations related to the reading aloud of the score.
That is, in the second embodiment, the musical score data SD is first musical score data, and the storage device 12 also stores second musical score data different from the first musical score data. The control device 11 generates a first acoustic signal representing sounds related to the performance symbols and sounds related to the musical note symbols included in an excerpt of the first score corresponding to the first musical score data, and a second acoustic signal representing sounds related to the performance symbols and sounds related to the musical note symbols included in an excerpt of the second score corresponding to the second musical score data. The control device 11 then causes the sound emitting device 14 to reproduce the first acoustic signal and the second acoustic signal in sequence. For example, the first musical score data is the musical score data "xxx.xml" and the second musical score data is the musical score data "yyy.xml".
According to the second embodiment, the control device 11 selects an excerpt from each of plural pieces of musical score data and sequentially reproduces the sounds related to the performance symbols and musical note symbols included in the selected excerpts. The user can thereby easily grasp which piece of music each piece of musical score data SD corresponds to, and can quickly select the desired musical score data SD from among them.
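The sequential read-out flow of the second embodiment can be sketched roughly as below. All function names (make_text, synthesize, play) are placeholders standing in for the text generation unit 32, the speech synthesis unit 34, and playback via the sound emitting device 14; this is a sketch under those assumptions, not the disclosed implementation.

```python
# Hypothetical sketch of the table-of-contents presentation flow: for each
# stored score, generate read-out text for an excerpt, synthesize it, and
# reproduce the resulting acoustic signals one after another.

def present_table_of_contents(score_files, make_text, synthesize, play):
    for name in score_files:          # e.g. ["xxx.xml", "yyy.xml", ...]
        text = make_text(name)        # read-aloud text for the excerpt
        signal = synthesize(text)     # first, second, ... acoustic signal
        play(name, signal)            # sequential playback, one score at a time

log = []
present_table_of_contents(
    ["xxx.xml", "yyy.xml"],
    make_text=lambda n: f"excerpt of {n}",
    synthesize=lambda t: t.upper(),   # stand-in for actual speech synthesis
    play=lambda n, s: log.append((n, s)),
)
print(log[0])  # ('xxx.xml', 'EXCERPT OF XXX.XML')
```

Because the loop only moves to the next score after play() returns, the first and second acoustic signals are reproduced in sequence as described above.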
C: Third Embodiment
A third embodiment will now be described. In each of the aspects illustrated below, elements whose functions are the same as in the first embodiment are denoted by the same reference numerals as in the description of the first embodiment, and their detailed descriptions are omitted as appropriate.
In the first embodiment, the information processing device 10 read aloud the musical score data SD. In the third embodiment, in addition to reading aloud the musical score data SD, the information processing device 10 assists the user so that the user's performance sound approaches the sound indicated by the score.
FIG. 18 is a block diagram illustrating the functional configuration of a control device 11A in the third embodiment. The control device 11A includes a performance evaluation unit 42 in addition to the configuration of the control device 11 according to the first embodiment (see FIG. 4). The performance evaluation unit 42 evaluates the user's performance of the instrument based on the analysis result of the performance analysis unit 38. In the first embodiment, the performance analysis unit 38 analyzed the position of the user's performance of the instrument. In the third embodiment, in addition to analyzing the performance position, the performance analysis unit 38 analyzes the loudness of the instrument's performance sound.
The performance evaluation unit 42 evaluates whether the user's performance conforms to the musical symbols of the score. More specifically, the performance evaluation unit 42 detects the difference between the performance sound, which is the sound of the piece as played by the user, and the sound indicated by the musical symbols included in the score representing the piece, and determines whether the difference falls outside a predetermined allowable range.
For example, whether the performance conforms to the musical note symbols is evaluated by detecting the difference between the pitch of the sound played by the user and the pitch of the note on the score, and the difference between the duration of the played sound and the note value on the score. The performance evaluation unit 42 evaluates that the smaller these differences are, the more closely the performance follows the note symbols of the score, that is, the higher the performance skill.
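A minimal sketch of this note-level comparison follows, assuming notes are represented as (MIDI pitch, duration-in-beats) pairs and that the tolerances are user-set values as described above. The representation and threshold values are assumptions for illustration, not the disclosed implementation.

```python
# Illustrative sketch: compare played pitch and duration against the score
# and flag notes whose difference exceeds a user-set allowable range.

def evaluate_notes(score_notes, played_notes, pitch_tol=0, dur_tol=0.1):
    """Return indices of notes whose difference falls outside the range.

    Each note is a (midi_pitch, duration_in_beats) pair.
    """
    flagged = []
    for i, ((sp, sd), (pp, pd)) in enumerate(zip(score_notes, played_notes)):
        pitch_diff = abs(sp - pp)     # pitch difference in semitones
        dur_diff = abs(sd - pd)       # duration difference in beats
        if pitch_diff > pitch_tol or dur_diff > dur_tol:
            flagged.append(i)
    return flagged

score = [(65, 1.0), (62, 1.0), (64, 2.0)]   # fa, re, mi as written
played = [(65, 1.0), (64, 1.0), (62, 2.0)]  # fa, mi, re as played
print(evaluate_notes(score, played))  # [1, 2]
```

The flagged indices are the locations for which support text would be generated.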
The user sets the allowable range of the difference in advance, for example based on his or her own performance proficiency. In general, the higher the user's proficiency, the smaller the allowable difference is considered to be. If there is a location where the difference falls outside the allowable range, the text generation unit 32 generates text pointing out that location. Specifically, it generates text that reads aloud the correct pitches and note values written in the score together with the pitches and note values of the user's actual performance, for example, "Right hand, second measure: 'fa, re, mi' was played as 'fa, mi, re'." Such text is referred to as "support text."
Whether the performance conforms to a performance symbol is determined for each performance symbol. For example, when the performance symbol is a dynamics symbol, the performance evaluation unit 42 evaluates the performance by detecting the difference between the volume of the sound played by the user and the volume that would result from a performance following the dynamics symbol. When the performance symbol is an articulation symbol, the performance evaluation unit 42 does so by detecting the difference between the duration of the sound played by the user and the duration that would result from a performance following the articulation symbol. The performance evaluation unit 42 evaluates that the smaller the difference, the more closely the performance follows the performance symbols of the score, that is, the higher the performance skill.
The user sets the allowable range of the difference in advance, for example based on his or her own performance proficiency. If there is a location where the difference falls outside the allowable range, the text generation unit 32 generates support text pointing out that location. Specifically, it reads aloud the performance symbols written in the score and generates support text indicating that the user's performance did not reflect them, for example, "Right hand, first measure: 'staccato mi, staccato fa' — the staccato bounce is weak."
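Support-text generation for a flagged location might be sketched as follows. The solfege mapping and the message format are assumptions made for illustration; the disclosure does not fix a particular wording beyond the examples quoted above.

```python
# Hypothetical sketch: build a support-text string that reads out the correct
# notation together with what was actually played at a flagged location.

SOLFEGE = {62: "re", 64: "mi", 65: "fa"}  # partial mapping, for illustration

def support_text(hand, bar, score_pitches, played_pitches):
    correct = ", ".join(SOLFEGE[p] for p in score_pitches)
    actual = ", ".join(SOLFEGE[p] for p in played_pitches)
    return (f"{hand} hand, measure {bar}: "
            f"'{correct}' was played as '{actual}'.")

print(support_text("Right", 2, [65, 62, 64], [65, 64, 62]))
# Right hand, measure 2: 'fa, re, mi' was played as 'fa, mi, re'.
```

The resulting string would then be passed to the speech synthesis unit 34 together with the voice data VD.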
The speech synthesis unit 34 generates an acoustic signal using the support text and the voice data VD. Within the support text, the text indicating the pitch of a note symbol may be read aloud with a voice at the pitch corresponding to that note. The output control unit 40 causes the sound emitting device 14 to reproduce sound based on the acoustic signal.
The acoustic signal may be reproduced after the user has finished playing the piece, or during the performance. When reproduction is performed during the performance, the output control unit 40 may, for example, reproduce the support text immediately when a difference outside the allowable range occurs. In this case, after reproducing the support text, the output control unit 40 may reproduce a voice prompting the user to play again the location where the out-of-range difference occurred (the location pointed out by the support text). When the user plays the location pointed out by the support text, the performance evaluation unit 42 evaluates whether the performance conforms to the musical symbols of the score, and the above processing is repeated. This encourages repeated practice of the locations the user finds difficult (locations hard to play in accordance with the musical symbols of the score), so the user can efficiently master the performance of the piece represented by the score.
Note that, for example, the user's performance sound may be recorded, and the part of the recorded performance corresponding to the location pointed out in the support text may be reproduced together with the reading aloud of the support text.
Also, when generating the support text, the performance symbols may always be read aloud regardless of whether a difference exists. In this case, for example, when the difference between a performance symbol and the performance is large, the read-aloud voice may be made louder (the larger the difference, the louder the read-aloud voice) so that the user can tell whether the performance follows the performance symbols.
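The difference-dependent loudness described here could be realized by a simple gain mapping such as the following sketch; the specific mapping, its range, and its constants are assumptions, not part of the disclosure.

```python
# Hypothetical sketch: scale the read-aloud gain with the magnitude of the
# difference, so a larger deviation from the performance symbol is louder.

def readout_gain(diff, max_diff=1.0, min_gain=0.2):
    """Map a normalized difference to a playback gain in [min_gain, 1.0]."""
    d = min(abs(diff) / max_diff, 1.0)          # clamp to [0, 1]
    return min_gain + (1.0 - min_gain) * d      # linear mapping, by assumption

print(readout_gain(0.0))  # 0.2  (quiet when the performance matches)
print(readout_gain(1.0))  # 1.0  (loud when the deviation is large)
```

Any monotonically increasing mapping would serve the same purpose of letting the user hear the size of the deviation.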
That is, in the third embodiment, the control device 11 acquires the performance sound, which is the sound of the piece as played by the user, and detects the difference between the sound indicated by the musical symbols included in the score representing the piece and the performance sound. When the difference falls outside a predetermined allowable range, the control device 11 generates an acoustic signal representing sounds related to the musical symbols included in the portion of the score corresponding to the location where the difference occurred.
According to the third embodiment, the user can grasp the difference between his or her own performance and the content indicated by the score, and can efficiently master the performance of the piece indicated by the score. Specifically, the control device 11 indicates to the user the position on the score where the difference occurred by reading aloud the musical symbols at that location. The user can thereby intuitively grasp the location in the score where the difference occurred, compared with, for example, simply having a position in the score (such as a bar number) read aloud mechanically. Furthermore, in the third embodiment, the content of the user's performance is verbalized. For example, the control device 11 reads aloud the pitches and note values of the performance the user actually gave. The user can thereby objectively grasp the nature of his or her own errors.
D: Modifications
Specific modifications that can be added to each of the aspects illustrated above are exemplified below. Two or more aspects arbitrarily selected from the following examples may be combined as appropriate to the extent that they do not contradict one another.
(1) In each of the above embodiments, the speech synthesis unit 34 performed concatenative (unit-selection) speech synthesis, but the speech synthesis method is not limited to this example. For example, statistical-model speech synthesis using a statistical model such as a deep neural network or an HMM (Hidden Markov Model) may be used.
(2) In each of the above embodiments, the musical score data SD was used to read aloud sounds related to the symbols included in the score. Use of the musical score data SD is not limited to this; performance sounds may also be reproduced using the musical score data SD. Specifically, for example, the information processing device 10 may reproduce the performance sound of the left-hand staff of the score represented by the musical score data SD while reading aloud the musical symbols in the right-hand staff. The user practices playing with the right hand while listening to the read-aloud musical symbols. Because the performance sound of the left-hand staff is reproduced, the user can efficiently learn the timing of the right-hand performance and the harmony of the piece.
(3) The information processing device 10 may be realized by a server device that communicates with an information device such as a smartphone or a tablet terminal. For example, the information processing device 10 accepts the designation of musical score data SD from the information device and generates an acoustic signal by speech synthesis processing using the designated musical score data SD. The information processing device 10 transmits the acoustic signal generated by the speech synthesis processing to the information device, and the information device reproduces the acoustic signal.
(4) The functions of the information processing device 10 (the instruction receiving unit 30, the text generation unit 32, the speech synthesis unit 34, the performance analysis unit 38, the output control unit 40, and the performance evaluation unit 42) are realized, as described above, through cooperation between the single or plural processors constituting the control device 11 and the program PG stored in the storage device 12.
(5) In each of the above embodiments, when making various settings and giving instructions in the score read-aloud application, the user visually checks the items displayed on the touch panel T and performs touch operations on the touch panel T. The present disclosure is not limited to this; for example, information for the user (such as selectable items in settings) may be presented by voice read-aloud. Input from the user to the information processing device 10 may also be performed by voice input. Voice-based operation is particularly effective when a visually impaired person uses the score read-aloud application.
The above program can be provided in a form stored in a computer-readable recording medium and installed in a computer. The recording medium is, for example, a non-transitory recording medium; an optical recording medium (optical disc) such as a CD-ROM is a good example, but any known type of recording medium such as a semiconductor recording medium or a magnetic recording medium is also included. The non-transitory recording medium includes any recording medium other than a transitory, propagating signal, and volatile recording media are not excluded. In a configuration in which a distribution device distributes the program via a communication network, the recording medium that stores the program in the distribution device corresponds to the non-transitory recording medium described above.
E: Supplementary Notes
From the forms exemplified above, the following configurations, for example, can be derived.
An information processing method according to one aspect (Aspect 1) of the present disclosure is implemented by a computer system and, based on musical score data representing a score that includes one or more performance symbols, generates an acoustic signal representing sounds related to the performance symbols. The performance symbols included in the score can therefore be grasped aurally, making the score easy to comprehend even for, for example, visually impaired people, beginners unaccustomed to reading scores, and small children.
In a specific example of Aspect 1 (Aspect 2), the sound related to the performance symbol is a sound indicating the name of the performance symbol or a sound indicating a word or phrase corresponding to the meaning of the performance symbol. In the above aspect, when the sound related to the performance symbol indicates the name of the performance symbol, the user can accurately grasp what is written in the score. When the sound related to the performance symbol indicates a word or phrase corresponding to its meaning, the user can grasp what the score indicates even when the user has little knowledge of performance symbols and cannot understand the meaning from the name alone.
In a specific example of Aspect 1 or Aspect 2 (Aspect 3), the musical score data includes tempo information specifying the tempo of the piece indicated by the score, and the acoustic signal is generated such that the sound related to a performance symbol is produced when a target time point, advancing through the piece at a speed corresponding to the tempo specified by the tempo information, reaches the time point corresponding to that performance symbol. In the above aspect, the sound corresponding to a performance symbol is produced at a timing corresponding to the tempo of the piece. The user can therefore more easily grasp the position of the performance symbol within the piece, improving convenience.
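As a worked illustration of this timing (an example, not part of the claim): with the tempo taken from the tempo information, the time at which the target point reaches a symbol's position follows directly from the symbol's beat position and the BPM value.

```python
# Illustrative sketch of Aspect 3: compute when the target time point,
# advancing at the specified tempo, reaches a performance symbol's position.

def symbol_onset_seconds(symbol_beat, tempo_bpm):
    """Seconds from the start of the piece until the symbol's beat position."""
    return symbol_beat * 60.0 / tempo_bpm  # one beat lasts 60/BPM seconds

# A symbol on beat 8 at 120 BPM is reached after 4 seconds.
print(symbol_onset_seconds(8, 120))  # 4.0
```

Scheduling the read-aloud of each symbol at this onset time yields the tempo-synchronized pronunciation described above.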
In a specific example of any one of Aspects 1 to 3 (Aspect 4), the one or more performance symbols include a first performance symbol and a second performance symbol, and in generating the acoustic signal, when the sound related to the first performance symbol and the sound related to the second performance symbol would be produced overlapping each other, either the first performance symbol or the second performance symbol is selected, and the acoustic information representing the sound related to the selected performance symbol is generated. In the above aspect, when the sound related to the first performance symbol and the sound related to the second performance symbol would overlap, either the first performance symbol or the second performance symbol is selected and the acoustic information is generated. The sounds related to the two performance symbols are therefore never produced overlapping each other, making the sounds related to performance symbols easier to hear.
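The selection between two overlapping symbols could, for illustration, be driven by a priority order over symbol categories; the categories and priority values below are assumptions, since the aspect leaves the selection criterion open.

```python
# Illustrative sketch of Aspect 4: when two symbols would sound at the same
# time, pick one by a hypothetical category priority instead of overlapping.

PRIORITY = {"dynamics": 0, "articulation": 1, "tempo_change": 2}  # assumed order

def select_symbol(first, second):
    """Return the symbol whose category has the higher (lower-valued) priority."""
    return min(first, second, key=lambda s: PRIORITY[s["category"]])

a = {"name": "forte", "category": "dynamics"}
b = {"name": "staccato", "category": "articulation"}
print(select_symbol(a, b)["name"])  # forte
```

Only the selected symbol's sound is then included in the generated acoustic information, so the two read-outs never overlap.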
In a specific example of Aspect 1 (Aspect 5), the musical score data includes tempo information specifying the tempo of the piece indicated by the score, and in generating the acoustic signal, an acoustic signal representing the sounds related to the performance symbols is generated regardless of the tempo of the piece. In the above aspect, the sounds corresponding to the performance symbols are produced regardless of the tempo of the piece. The sounds related to performance symbols are therefore not produced overlapping one another, making them easier to hear.
In a specific example of any one of Aspects 1 to 5 (Aspect 6), the one or more performance symbols are plural performance symbols, each of the plural performance symbols belongs to one of plural classifications, a selection of at least one of the plural classifications is accepted, and in generating the acoustic signal, the acoustic signal is generated for those of the plural performance symbols that belong to the one or more selected classifications. In the above aspect, acoustic signals are generated for the performance symbols belonging to the selected classifications. Sounds related to the performance symbols the user needs can therefore be produced selectively, improving convenience.
In a specific example of any one of Aspects 1 to 6 (Aspect 7), the score includes musical note symbols in addition to the performance symbols, and generating the acoustic signal includes generating the acoustic signal representing the sounds related to the performance symbols and the sounds related to the musical note symbols. In the above aspect, in addition to the sounds related to the performance symbols, an acoustic signal representing the sounds related to the musical note symbols is generated. The musical note symbols included in the score can therefore be grasped aurally, making the score even easier to comprehend.
In a specific example of Aspect 7 (Aspect 8), in generating the acoustic signal, an acoustic signal representing a non-verbal notification sound is generated as the sound related to a rest. In the above aspect, a non-verbal notification sound is used as the sound related to a rest. The user can therefore immediately recognize, when the sound is produced, that it corresponds to a rest.
In a specific example of Aspect 7 or Aspect 8 (Aspect 9), the musical score data is first musical score data, and generating the acoustic signal includes generating a first acoustic signal representing sounds related to the performance symbols and sounds related to the musical note symbols included in an excerpt of a first score corresponding to the first musical score data, and generating a second acoustic signal representing sounds related to the performance symbols and sounds related to the musical note symbols included in an excerpt of a second score corresponding to second musical score data different from the first musical score data; the method further includes causing a sound emitting device to reproduce the first acoustic signal and the second acoustic signal in sequence. In the above aspect, an excerpt is selected from each of plural pieces of musical score data, and the sounds related to the performance symbols and musical note symbols included in the selected excerpts are reproduced sequentially. The user can therefore easily grasp which piece of music each piece of musical score data corresponds to, and can quickly select the desired musical score data from among them.
 A program according to one aspect (aspect 10) of the present disclosure causes a computer system to function as a generation unit that generates, based on musical score data representing a musical score including one or more performance symbols, an acoustic signal representing a sound related to the performance symbols.
 An information processing apparatus according to one aspect (aspect 11) of the present disclosure includes a generation unit that generates, based on musical score data representing a musical score including one or more performance symbols, an acoustic signal representing a sound related to the performance symbols.
 Braille sheet music requires roughly three times as much page area as ordinary sheet music to notate the same content, and it takes longer to read. Consequently, when a user has, for example, forgotten the title of a piece and wants to find the desired score from its contents, the user must spend time reading through multiple scores, which is inconvenient.
 An information processing apparatus according to one aspect of the present disclosure is implemented by a computer system and generates a first acoustic signal indicating a sound related to a musical symbol included in a portion of a first musical score corresponding to first musical score data containing one or more musical symbols, generates a second acoustic signal indicating a sound related to the musical symbol included in a portion of a second musical score corresponding to second musical score data that contains one or more of the musical symbols and differs from the first musical score data, and causes a sound emitting device to sequentially reproduce the first acoustic signal and the second acoustic signal.
 Musical instrument training aims at correctly reading the musical symbols written in a score and accurately playing the sounds those symbols indicate. However, a player cannot always judge on their own whether the symbols have been read correctly or whether the indicated sounds are being played accurately, so players generally receive training by requesting guidance from an instructor. Keeping an instructor constantly at hand while playing is not realistic, though, and opportunities to receive feedback on a performance are limited.
 An information processing apparatus according to one aspect of the present disclosure is implemented by a computer system, acquires a performance sound produced when a user plays a musical piece, detects a difference between that performance sound and the sound indicated by musical symbols included in the score of the piece, and, when the difference falls outside a predetermined allowable range, generates an acoustic signal indicating a sound related to a musical symbol included in the portion of the score corresponding to the location where the difference occurred.
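The feedback loop described above — comparing a performance against the score and announcing the symbol at the location where the deviation exceeds a tolerance — can be sketched as follows. The function name, the note encoding (MIDI note numbers), and the event strings are illustrative assumptions, not the publication's implementation.

```python
def find_feedback(score_notes, performed_notes, tolerance=0):
    # score_notes / performed_notes: parallel lists of MIDI note numbers,
    # one entry per score position. Where the performed pitch deviates
    # from the score by more than the allowable tolerance, emit a spoken
    # announcement event for the symbol at that position.
    feedback = []
    for i, (want, got) in enumerate(zip(score_notes, performed_notes)):
        if abs(want - got) > tolerance:
            feedback.append(f"speak:note {i} should be {want}")
    return feedback

# Performance deviates from the score at the second note.
print(find_feedback([60, 62, 64], [60, 63, 64]))
```

A real system would align audio to the score before comparing; this sketch assumes the alignment has already been done.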
10: information processing apparatus; 11, 11A: control device; 12: storage device; 13: sound collection device; 14: sound emitting device; 15: operation device; 16: display device; 30: instruction reception unit; 32: text generation unit; 34: speech synthesis unit; 38: performance analysis unit; 40: output control unit; 42: performance evaluation unit; PG: program; SD: musical score data; T: touch panel; TD: symbol text data; VD: voice data.

Claims (11)

  1.  An information processing method implemented by a computer system, the method comprising generating, based on musical score data representing a musical score including one or more performance symbols, an acoustic signal representing a sound related to the performance symbols.
  2.  The information processing method according to claim 1, wherein the sound related to the performance symbol is a sound indicating a name of the performance symbol or a sound indicating a phrase corresponding to a meaning of the performance symbol.
  3.  The information processing method according to claim 1 or 2, wherein the musical score data includes tempo information specifying a tempo of a musical piece indicated by the musical score, and in the generation of the acoustic signal, the acoustic signal is generated such that, when a target point that advances through the musical piece at a speed corresponding to the tempo specified by the tempo information reaches a point corresponding to the performance symbol, the sound related to that performance symbol is produced.
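One way to read claim 3's timing rule is as a mapping from each symbol's beat position to a wall-clock announcement time at the specified tempo. The sketch below is a minimal illustration under that reading; the function name and data layout are assumptions, not taken from the claims.

```python
def announcement_times(symbol_beats, tempo_bpm):
    # Convert each symbol's beat offset into seconds at the given tempo:
    # the "target point" advances at tempo_bpm beats per minute, and a
    # symbol is announced when it reaches the symbol's beat position.
    seconds_per_beat = 60.0 / tempo_bpm
    return [round(beat * seconds_per_beat, 3) for beat in symbol_beats]

# At 120 BPM a beat lasts 0.5 s, so symbols at beats 0, 2, and 4 are
# announced 1.0 s apart.
print(announcement_times([0, 2, 4], 120))
```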
  4.  The information processing method according to any one of claims 1 to 3, wherein the one or more performance symbols include a first performance symbol and a second performance symbol, and in the generation of the acoustic signal, when a sound related to the first performance symbol and a sound related to the second performance symbol would be produced at the same time, either the first performance symbol or the second performance symbol is selected, and the acoustic signal representing the sound related to the selected performance symbol is generated.
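Claim 4 leaves the selection criterion open; one plausible strategy is a fixed priority table, sketched below. The priority values, symbol names, and function name are assumptions for illustration only.

```python
# Hypothetical priority ranking: higher value wins when two symbols
# would be announced at the same instant.
PRIORITY = {"fortissimo": 2, "crescendo": 1, "staccato": 0}

def resolve_overlaps(events):
    # events: list of (time, symbol) pairs; keep one symbol per time slot,
    # preferring the higher-priority symbol when two collide.
    chosen = {}
    for time, symbol in events:
        if time not in chosen or PRIORITY.get(symbol, 0) > PRIORITY.get(chosen[time], 0):
            chosen[time] = symbol
    return sorted(chosen.items())

print(resolve_overlaps([(1.0, "crescendo"), (1.0, "fortissimo"), (2.0, "staccato")]))
```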
  5.  The information processing method according to claim 1, wherein the musical score data includes tempo information specifying a tempo of a musical piece indicated by the musical score, and in the generation of the acoustic signal, the acoustic signal representing the sound related to the performance symbol is generated irrespective of the tempo of the musical piece.
  6.  The information processing method according to any one of claims 1 to 5, wherein the one or more performance symbols are a plurality of performance symbols, each of the plurality of performance symbols belongs to one of a plurality of classifications, a selection of at least one of the plurality of classifications is received, and in the generation of the acoustic signal, the acoustic signal is generated for those performance symbols, among the plurality of performance symbols, that belong to the one or more selected classifications.
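Claim 6's classification filter can be illustrated with a simple category lookup: only symbols whose category was selected by the user are announced. The category map, symbol names, and function name below are assumptions, not taken from the publication.

```python
# Hypothetical classification of performance symbols.
CATEGORY = {"forte": "dynamics", "piano": "dynamics",
            "staccato": "articulation", "allegro": "tempo"}

def filter_symbols(symbols, selected):
    # Keep only the symbols whose classification the user selected;
    # only these would then be rendered as sound.
    return [s for s in symbols if CATEGORY.get(s) in selected]

print(filter_symbols(["forte", "staccato", "allegro"], {"dynamics", "tempo"}))
```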
  7.  The information processing method according to any one of claims 1 to 6, wherein the musical score includes musical note symbols in addition to the performance symbols, and generating the acoustic signal includes generating the acoustic signal representing the sound related to the performance symbol and a sound related to the musical note symbol.
  8.  The information processing method according to claim 7, wherein in the generation of the acoustic signal, an acoustic signal representing a non-verbal notification sound is generated as a sound related to a rest.
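Claim 8's distinction — a non-verbal tone for rests, speech for other symbols — can be sketched as a one-line mapping. The event encoding and function name are illustrative assumptions.

```python
def to_event(symbol):
    # Rests map to a non-verbal notification tone; all other symbols are
    # spoken (by name or by a phrase giving their meaning, per claim 2).
    if symbol == "rest":
        return "beep"
    return f"speak:{symbol}"

print([to_event(s) for s in ["C4", "rest", "forte"]])
```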
  9.  The information processing method according to claim 7 or 8, wherein the musical score data is first musical score data, the generation of the acoustic signal includes generating a first acoustic signal indicating the sound related to the performance symbol and the sound related to the musical note symbol included in a portion of a first musical score corresponding to the first musical score data, and generating a second acoustic signal indicating the sound related to the performance symbol and the sound related to the musical note symbol included in a portion of a second musical score corresponding to second musical score data different from the first musical score data, and the method further comprises causing a sound emitting device to sequentially reproduce the first acoustic signal and the second acoustic signal.
  10.  A program causing a computer system to function as a generation unit that generates, based on musical score data representing a musical score including one or more performance symbols, an acoustic signal representing a sound related to the performance symbols.
  11.  An information processing apparatus comprising a generation unit that generates, based on musical score data representing a musical score including one or more performance symbols, an acoustic signal representing a sound related to the performance symbols.
PCT/JP2022/040701 2022-02-10 2022-10-31 Information processing method, program, and information processing device WO2023153033A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-019211 2022-02-10
JP2022019211A JP2023116866A (en) 2022-02-10 2022-02-10 Information processing method, program, and information processing device

Publications (1)

Publication Number Publication Date
WO2023153033A1 true WO2023153033A1 (en) 2023-08-17

Family

ID=87564095

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/040701 WO2023153033A1 (en) 2022-02-10 2022-10-31 Information processing method, program, and information processing device

Country Status (2)

Country Link
JP (1) JP2023116866A (en)
WO (1) WO2023153033A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000214848A (en) * 1999-01-21 2000-08-04 Yamaha Corp Performance support device, performance support method, and recording medium with performance support program recorded therein

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHINYA YAMADA, TOSHIYUKI GOTOH, NAOYOSHI TAMURA: "Automated Generation System of DAISY Content with Braill and Oral from Digital Music Scores", IEICE TECHNICAL REPORT, WIT, IEICE, JP, vol. 109, no. 358 (WIT2009-71), 1 January 2010 (2010-01-01), JP, pages 19 - 24, XP009548320 *

Also Published As

Publication number Publication date
JP2023116866A (en) 2023-08-23


Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22926049

Country of ref document: EP

Kind code of ref document: A1